Sebastian Raschka a08d7aaa84 Uv workflow improvements (#531) 9 months ago
scores f5a4f9dee3 add spearman and kendall-tau analysis 1 year ago
README.md 4ac480c9ae add instruction dataset 1 year ago
config.json 87c3e78dcb Revert "Revert "newline"" 1 year ago
eval-example-data.json 771992c486 Add openai model eval utility code 1 year ago
llm-instruction-eval-ollama.ipynb a7869ad2bf Fix 8-billion-parameter spelling 1 year ago
llm-instruction-eval-openai.ipynb a08d7aaa84 Uv workflow improvements (#531) 9 months ago
requirements-extra.txt 6290dade88 remove redundant dependency 1 year ago

README.md

Chapter 7: Finetuning to Follow Instructions

This folder contains utility code that can be used for model evaluation.

 

Evaluating Instruction Responses Using the OpenAI API

  • The llm-instruction-eval-openai.ipynb notebook uses OpenAI's GPT-4 to evaluate responses generated by instruction-finetuned models. It expects a JSON file whose entries have the following format (the # comments are annotations for the reader and are not part of the file):

    {
    "instruction": "What is the atomic number of helium?",
    "input": "",
    "output": "The atomic number of helium is 2.",               # <-- The target given in the test set
    "model 1 response": "\nThe atomic number of helium is 2.0.", # <-- Response by an LLM
    "model 2 response": "\nThe atomic number of helium is 3."    # <-- Response by a 2nd LLM
    },
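To make the entry format above concrete, here is a minimal sketch of how such entries could be loaded and turned into a scoring prompt for a judge model. The helper names `format_input` and `build_scoring_prompt`, the prompt wording, and the 0 to 100 scale are assumptions made for illustration and may differ from what the notebook does. The sketch assumes the file holds a list of entries and stops before any API call.

```python
import json

# Stand-in for the contents of a file such as eval-example-data.json.
# Assumption: the file is a JSON list of entries in the format shown above.
raw = r"""
[
  {
    "instruction": "What is the atomic number of helium?",
    "input": "",
    "output": "The atomic number of helium is 2.",
    "model 1 response": "\nThe atomic number of helium is 2.0.",
    "model 2 response": "\nThe atomic number of helium is 3."
  }
]
"""


def format_input(entry):
    """Combine the instruction and the optional input field (hypothetical helper)."""
    text = f"Instruction: {entry['instruction']}"
    if entry["input"]:
        # Only add the input section when the field is non-empty.
        text += f"\nInput: {entry['input']}"
    return text


def build_scoring_prompt(entry, response_key):
    """Build a prompt asking a judge model to score one response.

    The wording and the 0-100 scale are illustrative assumptions.
    """
    return (
        f"Given the input `{format_input(entry)}` "
        f"and the correct output `{entry['output']}`, "
        f"score the model response `{entry[response_key].strip()}` "
        "on a scale from 0 to 100, where 100 is the best score. "
        "Respond with the integer number only."
    )


entries = json.loads(raw)
for key in ("model 1 response", "model 2 response"):
    print(build_scoring_prompt(entries[0], key))
    print()
```

Each printed prompt could then be sent to the judge model, and the returned integers averaged per model to compare the two sets of responses.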
    

 

Evaluating Instruction Responses Locally Using Ollama