Sebastian Raschka a08d7aaa84 Uv workflow improvements (#531)		9 mesiacov pred
..
scores	f5a4f9dee3 add spearman and kendall-tau analysis	1 rok pred
README.md	4ac480c9ae add instruction dataset	1 rok pred
config.json	87c3e78dcb Revert "Revert "newline""	1 rok pred
eval-example-data.json	771992c486 Add openai model eval utility code	1 rok pred
llm-instruction-eval-ollama.ipynb	a7869ad2bf Fix 8-billion-parameter spelling	1 rok pred
llm-instruction-eval-openai.ipynb	a08d7aaa84 Uv workflow improvements (#531)	9 mesiacov pred
requirements-extra.txt	6290dade88 remove redundant dependency	1 rok pred

Chapter 7: Finetuning to Follow Instructions

This folder contains utility code that can be used for model evaluation.

Evaluating Instruction Responses Using the OpenAI API

The llm-instruction-eval-openai.ipynb notebook uses OpenAI's GPT-4 to evaluate responses generated by instruction finetuned models. It works with a JSON file in the following format:

{
"instruction": "What is the atomic number of helium?",
"input": "",
"output": "The atomic number of helium is 2.",               # <-- The target given in the test set
"model 1 response": "\nThe atomic number of helium is 2.0.", # <-- Response by an LLM
"model 2 response": "\nThe atomic number of helium is 3."    # <-- Response by a 2nd LLM
},

Evaluating Instruction Responses Locally Using Ollama

The llm-instruction-eval-ollama.ipynb notebook offers an alternative to the one above, utilizing a locally downloaded Llama 3 model via Ollama.

README.md

Chapter 7: Finetuning to Follow Instructions

Evaluating Instruction Responses Using the OpenAI API

Evaluating Instruction Responses Locally Using Ollama