@@ -260,7 +260,7 @@
"id": "0f3d7ea2-637f-4490-bc76-e361fc81ae98"
},
"source": [
- "### 5.1.2 Calculating the text generation loss: cross entropy, and perplexity"
+ "### 5.1.2 Calculating the text generation loss: cross-entropy and perplexity"
]
},
{
@@ -558,7 +558,7 @@
"metadata": {},
"source": [
"- In deep learning, instead of maximizing the average log-probability, it's a standard convention to minimize the *negative* average log-probability value; in our case, instead of maximizing -10.7722 so that it approaches 0, in deep learning, we would minimize 10.7722 so that it approaches 0\n",
- "- The value negative of -10.7722, i.e., 10.7722, is also called cross entropy loss in deep learning"
+ "- The negative of -10.7722, i.e., 10.7722, is also called the cross-entropy loss in deep learning"
]
},
{
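For readers skimming the diff without the notebook open, a minimal sketch of the relationship this cell describes; the probability values below are made up for illustration (the notebook derives -10.7722 from its own example batch):

```python
import torch

# Hypothetical probabilities the model assigned to the correct target tokens
# (illustrative values only, not the ones computed in the notebook)
target_probas = torch.tensor([1e-5, 2e-5, 5e-6])

avg_log_proba = torch.mean(torch.log(target_probas))   # a negative number
neg_avg_log_proba = -avg_log_proba                      # the cross-entropy loss
print(neg_avg_log_proba)
```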
@@ -601,7 +601,7 @@
"id": "e8aaf9dd-3ee6-42bf-a63f-6e93dbfb989d",
"metadata": {},
"source": [
- "- Before we apply the cross entropy function, let's check the shape of the logits and targets"
+ "- Before we apply the `cross_entropy` function, let's check the shape of the logits and targets"
]
},
{
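A rough sketch of the shape check this bullet refers to; the batch size, sequence length, and vocabulary size here are assumptions, not the notebook's actual values:

```python
import torch

# Assumed dimensions: batch of 2 sequences, 3 tokens each, vocabulary of 50,257
logits = torch.randn(2, 3, 50257)            # (batch_size, num_tokens, vocab_size)
targets = torch.randint(0, 50257, (2, 3))    # (batch_size, num_tokens)

print("Logits shape:", logits.shape)    # torch.Size([2, 3, 50257])
print("Targets shape:", targets.shape)  # torch.Size([2, 3])
```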
@@ -638,7 +638,7 @@
"id": "1d3d65f0-6566-4865-93e4-0c0bcb10cd06",
"metadata": {},
"source": [
- "- For the cross `entropy_loss` function in PyTorch, we want to flatten these tensors by combining them over the batch dimension:"
+ "- For the `cross_entropy` function in PyTorch, we want to flatten these tensors by combining them over the batch dimension:"
]
},
{
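A minimal sketch of the flattening step, assuming logits of shape (batch_size, num_tokens, vocab_size) and targets of shape (batch_size, num_tokens); `torch.nn.functional.cross_entropy` expects 2D logits and 1D targets:

```python
import torch

# Assumed shapes; the notebook works with its own batch
logits = torch.randn(2, 3, 50257)            # (batch_size, num_tokens, vocab_size)
targets = torch.randint(0, 50257, (2, 3))    # (batch_size, num_tokens)

logits_flat = logits.flatten(0, 1)           # (batch_size * num_tokens, vocab_size)
targets_flat = targets.flatten()             # (batch_size * num_tokens,)

loss = torch.nn.functional.cross_entropy(logits_flat, targets_flat)
print(loss)
```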
@@ -709,8 +709,8 @@
"id": "0f15ce17-fd7b-4d8e-99da-b237523a7a80",
"metadata": {},
"source": [
- "- A concept related to the cross entropy loss is the perplexity of an LLM\n",
- "- The perplexity is simply the exponential of the cross entropy loss"
+ "- A concept related to the cross-entropy loss is the perplexity of an LLM\n",
+ "- The perplexity is simply the exponential of the cross-entropy loss"
]
},
{
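The relationship between the two quantities in a one-liner, using the loss value quoted earlier in the diff:

```python
import torch

loss = torch.tensor(10.7722)     # cross-entropy loss from the example above
perplexity = torch.exp(loss)     # roughly 47,700
print(perplexity)
```

Intuitively, this perplexity says the model is about as uncertain as if it had to pick uniformly among that many vocabulary entries at each step.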
@@ -1077,7 +1077,7 @@
"id": "5c3085e8-665e-48eb-bb41-cdde61537e06",
"metadata": {},
"source": [
- "- Next, we implement a utility function to calculate the cross entropy loss of a given batch\n",
+ "- Next, we implement a utility function to calculate the cross-entropy loss of a given batch\n",
"- In addition, we implement a second utility function to compute the loss for a user-specified number of batches in a data loader"
]
},
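A sketch of what these two utilities might look like, assuming a model that maps token-ID inputs of shape (batch_size, num_tokens) to logits of shape (batch_size, num_tokens, vocab_size); the names `calc_loss_batch` and `calc_loss_loader` and the exact signatures are illustrative rather than a verbatim copy of the notebook's code:

```python
import torch

def calc_loss_batch(input_batch, target_batch, model, device):
    # Cross-entropy loss for a single (input, target) batch
    input_batch = input_batch.to(device)
    target_batch = target_batch.to(device)
    logits = model(input_batch)
    loss = torch.nn.functional.cross_entropy(
        logits.flatten(0, 1), target_batch.flatten()
    )
    return loss

def calc_loss_loader(data_loader, model, device, num_batches=None):
    # Average loss over a user-specified number of batches in a data loader
    total_loss = 0.0
    if len(data_loader) == 0:
        return float("nan")
    if num_batches is None:
        num_batches = len(data_loader)
    else:
        num_batches = min(num_batches, len(data_loader))
    for i, (input_batch, target_batch) in enumerate(data_loader):
        if i < num_batches:
            loss = calc_loss_batch(input_batch, target_batch, model, device)
            total_loss += loss.item()
        else:
            break
    return total_loss / num_batches
```

The `num_batches` argument is what makes the second utility cheap to call on a subset of a large data loader, e.g. for periodic evaluation during training.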