rasbt 1 year ago
parent
commit
cd7ea15e8d

+ 4 - 2
ch05/01_main-chapter-code/README.md

@@ -1,7 +1,9 @@
 # Chapter 5: Pretraining on Unlabeled Data
 
 - [ch05.ipynb](ch05.ipynb) contains all the code as it appears in the chapter
-- [previous_chapters.py](previous_chapters.py) is a Python module that contains the `MultiHeadAttention` module from the previous chapter, which we import in [ch05.ipynb](ch05.ipynb) to pretrain the GPT model
-- [gpt_train.py](gpt_train.py) is a standalone Python script file with the code that we implemented in [ch05.ipynb](ch05.ipynb) to train the GPT model
+- [previous_chapters.py](previous_chapters.py) is a Python module that contains the `MultiHeadAttention` module and `GPTModel` class from the previous chapters, which we import in [ch05.ipynb](ch05.ipynb) to pretrain the GPT model
+- [gpt_train.py](gpt_train.py) is a standalone Python script file with the code that we implemented in [ch05.ipynb](ch05.ipynb) to train the GPT model (you can think of it as a code file summarizing this chapter)
 - [gpt_generate.py](gpt_generate.py) is a standalone Python script file with the code that we implemented in [ch05.ipynb](ch05.ipynb) to load and use the pretrained model weights from OpenAI
+- [gpt_download.py](gpt_download.py) contains the utility functions for downloading the pretrained GPT model weights
+- [exercise-solutions.ipynb](exercise-solutions.ipynb) contains the exercise solutions for this chapter
 

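The files listed above are meant to be used together: `previous_chapters.py` provides the model code that `ch05.ipynb` and `gpt_train.py` import for pretraining. Below is a minimal usage sketch, assuming `previous_chapters.py` exposes a `GPTModel` class that accepts a configuration dictionary; the exact keys shown are illustrative, not necessarily the ones used in the chapter.

```python
# Hedged sketch: instantiate the GPT model from previous_chapters.py and run a
# single forward pass on dummy token IDs. Config keys are illustrative and may
# differ from the ones defined in ch05.ipynb.
import torch

from previous_chapters import GPTModel  # assumed to expose GPTModel, per the README entry above

GPT_CONFIG_124M = {
    "vocab_size": 50257,    # GPT-2 BPE vocabulary size
    "context_length": 256,  # shortened context for cheap pretraining runs
    "emb_dim": 768,
    "n_heads": 12,
    "n_layers": 12,
    "drop_rate": 0.1,
    "qkv_bias": False,
}

torch.manual_seed(123)
model = GPTModel(GPT_CONFIG_124M)
model.eval()

dummy_batch = torch.randint(0, GPT_CONFIG_124M["vocab_size"], (2, 8))  # (batch, seq_len)
with torch.no_grad():
    logits = model(dummy_batch)
print(logits.shape)  # expected: (2, 8, vocab_size) for a language-modeling head
```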
+ 3 - 2
ch05/README.md

@@ -3,5 +3,6 @@
 - [01_main-chapter-code](01_main-chapter-code) contains the main chapter code
 - [02_alternative_weight_loading](02_alternative_weight_loading) contains code to load the GPT model weights from alternative places in case the model weights become unavailable from OpenAI
 - [03_bonus_pretraining_on_gutenberg](03_bonus_pretraining_on_gutenberg) contains code to pretrain the LLM longer on the whole corpus of books from Project Gutenberg
-- [04_learning_rate_schedulers] contains code implementing a more sophisticated training function including learning rate schedulers and gradient clipping
-- [05_bonus_hparam_tuning](05_bonus_hparam_tuning) contains an optional hyperparameter tuning script
+- [04_learning_rate_schedulers](04_learning_rate_schedulers) contains code implementing a more sophisticated training function including learning rate schedulers and gradient clipping
+- [05_bonus_hparam_tuning](05_bonus_hparam_tuning) contains an optional hyperparameter tuning script
+
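The `04_learning_rate_schedulers` folder mentioned above adds a learning-rate schedule and gradient clipping to the training function. Here is a generic PyTorch sketch of that pattern, not the repository's exact implementation.

```python
# Generic PyTorch training step with a cosine learning-rate schedule and
# gradient clipping, in the spirit of 04_learning_rate_schedulers.
# The placeholder model and loss are stand-ins, not the chapter's code.
import torch

model = torch.nn.Linear(10, 10)  # placeholder model
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4, weight_decay=0.1)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=1_000)

def train_step(inputs, targets):
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(inputs), targets)
    loss.backward()
    # Clip the global gradient norm before the update to stabilize training
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    scheduler.step()  # advance the learning-rate schedule once per step
    return loss.item()

print(train_step(torch.randn(4, 10), torch.randn(4, 10)))
```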

+ 8 - 1
ch06/01_main-chapter-code/README.md

@@ -1 +1,8 @@
-In progress.
+# Chapter 6: Finetuning for Classification
+
+- [ch06.ipynb](ch06.ipynb) contains all the code as it appears in the chapter
+- [previous_chapters.py](previous_chapters.py) is a Python module that contains the GPT model we coded and trained in previous chapters, alongside many utility functions, which we reuse in this chapter
+- [gpt-class-finetune.py](gpt-class-finetune.py) is a standalone Python script file with the code that we implemented in [ch06.ipynb](ch06.ipynb) to finetune the GPT model (you can think of it as a chapter summary)
+- [gpt_download.py](gpt_download.py) contains the utility functions for downloading the pretrained GPT model weights
+- [exercise-solutions.ipynb](exercise-solutions.ipynb) contains the exercise solutions for this chapter
+
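For context on the classification finetuning that `gpt-class-finetune.py` performs, here is a hedged sketch of the general setup: freeze the pretrained backbone and replace the language-modeling output head with a small classification head. The `out_head` attribute name and the config keys are assumptions for illustration, not confirmed by this diff.

```python
# Hedged sketch of classification finetuning: freeze the pretrained GPT weights
# and swap the language-modeling head for a 2-class classification head.
# GPTModel, the config keys, and the out_head attribute are assumptions.
import torch

from previous_chapters import GPTModel  # assumed, per the README entry above

cfg = {"vocab_size": 50257, "context_length": 256, "emb_dim": 768,
       "n_heads": 12, "n_layers": 12, "drop_rate": 0.0, "qkv_bias": True}
model = GPTModel(cfg)
# (In the chapter, pretrained weights would be loaded into the model here.)

for param in model.parameters():
    param.requires_grad = False  # keep the pretrained backbone frozen

num_classes = 2  # e.g., "spam" vs. "not spam"
model.out_head = torch.nn.Linear(cfg["emb_dim"], num_classes)  # new, trainable head

token_ids = torch.randint(0, cfg["vocab_size"], (1, 8))
logits = model(token_ids)            # (1, 8, num_classes) if the head is applied per token
pred = logits[:, -1, :].argmax(-1)   # classify from the last token's output
```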

+ 5 - 0
ch06/README.md

@@ -0,0 +1,5 @@
+# Chapter 6: Finetuning for Classification
+
+- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code
+- [02_bonus_additional-experiments](02_bonus_additional-experiments) includes additional experiments (e.g., training the last vs first token, extending the input length, etc.)
+- [03_bonus_imdb-classification](03_bonus_imdb-classification) compares the LLM from chapter 6 with other models on a 50k IMDB movie review sentiment classification dataset
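
One of the additional experiments mentioned above compares classifying from the last token versus the first token. A tiny illustration of the difference, using a stand-in tensor of per-token classification logits from a causal (decoder-only) model:

```python
# With causal attention, only the last token attends to the entire input,
# so its logits are the usual choice for sequence classification; the first
# token only ever sees itself. The tensor below is a stand-in, not model output.
import torch

batch_size, seq_len, num_classes = 4, 16, 2
logits = torch.randn(batch_size, seq_len, num_classes)  # stand-in for model output

last_token_logits = logits[:, -1, :]    # last position: sees the whole sequence
first_token_logits = logits[:, 0, :]    # first position: sees only itself

print(last_token_logits.shape, first_token_logits.shape)  # torch.Size([4, 2]) each
```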