
fix swiglu acronym

rasbt 1 year ago
parent
commit
c735c21e87
1 file changed, 1 insertion(+), 1 deletion(-)

+1 −1
ch04/01_main-chapter-code/ch04.ipynb

@@ -581,7 +581,7 @@
     "- In this section, we implement a small neural network submodule that is used as part of the transformer block in LLMs\n",
     "- We start with the activation function\n",
     "- In deep learning, ReLU (Rectified Linear Unit) activation functions are commonly used due to their simplicity and effectiveness in various neural network architectures\n",
-    "- In LLMs, various other types of activation functions are used beyond the traditional ReLU; two notable examples are GELU (Gaussian Error Linear Unit) and SwiGLU (Sigmoid-Weighted Linear Unit)\n",
+    "- In LLMs, various other types of activation functions are used beyond the traditional ReLU; two notable examples are GELU (Gaussian Error Linear Unit) and SwiGLU (Swish-Gated Linear Unit)\n",
     "- GELU and SwiGLU are more complex, smooth activation functions incorporating Gaussian and sigmoid-gated linear units, respectively, offering better performance for deep learning models, unlike the simpler, piecewise linear function of ReLU"
    ]
   },
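For readers skimming the diff: the corrected cell distinguishes GELU from SwiGLU, whose gate is the Swish (SiLU) function rather than a plain sigmoid weighting. Below is a minimal sketch of both, not taken from the notebook; the SwiGLU layer names and hidden size are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GELU(nn.Module):
    # Tanh-based GELU approximation (the form commonly used in GPT-2-style models)
    def forward(self, x):
        return 0.5 * x * (1.0 + torch.tanh(
            torch.sqrt(torch.tensor(2.0 / torch.pi)) *
            (x + 0.044715 * torch.pow(x, 3))
        ))

class SwiGLU(nn.Module):
    # Swish-Gated Linear Unit: a Swish (SiLU) gate multiplied elementwise with a
    # linear projection; layer names and sizes here are illustrative, not from the repo
    def __init__(self, d_in, d_hidden):
        super().__init__()
        self.w_gate = nn.Linear(d_in, d_hidden, bias=False)
        self.w_up = nn.Linear(d_in, d_hidden, bias=False)
        self.w_down = nn.Linear(d_hidden, d_in, bias=False)

    def forward(self, x):
        return self.w_down(nn.functional.silu(self.w_gate(x)) * self.w_up(x))

x = torch.randn(2, 4, 8)           # (batch, tokens, embedding dim)
print(GELU()(x).shape)             # torch.Size([2, 4, 8])
print(SwiGLU(8, 16)(x).shape)      # torch.Size([2, 4, 8])
```

Unlike GELU, which is a pointwise activation, SwiGLU is typically used to replace the whole feed-forward expansion, which is why it is sketched here as a small module with its own linear projections.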