|
|
@@ -132,7 +132,7 @@
|
|
|
"\n",
|
|
|
"- **Step 1:** compute unnormalized attention scores $\\omega$.\n",
|
|
|
"- Suppose we use the second input token as the query, that is, $q^{(2)} = x^{(2)}$, we compute the unnormalized attention scores via dot products:\n",
|
|
|
- " - $\\omega_{12} = x^{(1)} q^{(2)\\top}$\n",
|
|
|
+ " - $\\omega_{21} = x^{(1)} q^{(2)\\top}$\n",
|
|
|
" - $\\omega_{22} = x^{(2)} q^{(2)\\top}$\n",
|
|
|
" - $\\omega_{23} = x^{(3)} q^{(2)\\top}$\n",
|
|
|
" - ...\n",
|
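The indexing in the corrected line can be checked with a small sketch: the first subscript of $\omega_{2i}$ is the query position and the second is the input position. The code below is only an illustration in plain Python. The three embedding vectors, the `dot` helper, and the variable names are assumptions made up for this example and are not taken from the notebook.

```python
# Illustrative 3-dimensional token embeddings (made-up values, not from the notebook).
inputs = [
    [1.0, 0.0, 2.0],  # x^(1)
    [1.0, 2.0, 0.0],  # x^(2)
    [2.0, 1.0, 0.0],  # x^(3)
]


def dot(a, b):
    """Dot product of two equal-length vectors (hypothetical helper)."""
    return sum(a_k * b_k for a_k, b_k in zip(a, b))


# Use the second input token as the query: q^(2) = x^(2) (index 1 in Python).
query = inputs[1]

# Unnormalized attention scores omega_{2i} = x^(i) . q^(2) for i = 1, 2, 3.
scores = [dot(x_i, query) for x_i in inputs]

for i, score in enumerate(scores, start=1):
    print(f"omega_2{i} = {score}")
# omega_21 = 1.0
# omega_22 = 5.0
# omega_23 = 4.0
```

Because the query index stays fixed at 2 while the input index runs over the tokens, the first score is $\omega_{21}$, which matches the added line in the diff.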