Quickstart

Installation

`raga-llm-hub` is provided as an open-source Python package. To install it, run the following command:

pip install raga_llm_hub

You can also install the package inside a virtual environment:

with pip

  • python -m venv venv - Create a new Python environment.

  • source venv/bin/activate - Activate the environment.

  • pip install raga-llm-hub - Install the package.

with conda

  • conda create --name myenv - Create a new Python environment.

  • conda activate myenv - Activate the environment.

  • python -m pip install raga-llm-hub - Install the package.

Setting Up

Setting up raga-llm-hub is straightforward. Let's get your environment ready for testing. First things first, set up your API keys or any other necessary credentials. Many of the metrics are scored using an LLM as a judge, and we currently support OpenAI models as the judge for the evaluation tests.

from raga_llm_hub import RagaLLMEval

evaluator = RagaLLMEval(
    api_keys={"OPENAI_API_KEY": "Your Key"}
)
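
If you prefer not to hard-code the key in your script, you can read it from an environment variable and pass it through the same api_keys argument. A minimal sketch using Python's built-in os module (it assumes you have exported OPENAI_API_KEY in your shell):

import os

from raga_llm_hub import RagaLLMEval

# Pass the key from the OPENAI_API_KEY environment variable instead of a literal string
evaluator = RagaLLMEval(
    api_keys={"OPENAI_API_KEY": os.environ["OPENAI_API_KEY"]}
)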

Loading Data

raga_llm_hub offers versatile data loading options to suit your needs.

1. Using user-provided content

Directly input your own texts or data snippets.

# Add tests with custom data
evaluator.add_test(
    test_names=["relevancy_test"], # an example evaluation test name
    data={
        "prompt": ["How are you?"],
        "context": ["You are a student, answering your teacher."],
        "response": ["I am fine. Thank you"],
    },
    arguments={"model": "gpt-4", "threshold": 0.6}, # judge model and score threshold
).run() # executes the test

evaluator.print_results() # print test results in a pretty format
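
Because each data field takes a list, you can batch several prompt/context/response triples in a single add_test call. The sketch below assumes the entries of each list are paired up by index:

# Batch multiple examples in one test run (assumes the lists are aligned by index)
evaluator.add_test(
    test_names=["relevancy_test"],
    data={
        "prompt": ["How are you?", "What is the capital of France?"],
        "context": ["You are a student, answering your teacher.", "You are a geography tutor."],
        "response": ["I am fine. Thank you", "The capital of France is Paris."],
    },
    arguments={"model": "gpt-4", "threshold": 0.6},
).run()

evaluator.print_results()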

2. Loading from JSON file

Easily load your data from a JSON file, making bulk evaluations seamless.

# Add tests with data loaded from a JSON file
evaluator.add_test(
    test_names=["relevancy_test"],
    data="test_data.json",
    arguments={"model": "gpt-4", "threshold": 0.6},
).run()

evaluator.print_results()
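
The exact schema of test_data.json is not spelled out here, but a reasonable assumption is that it mirrors the keys used in the in-memory example above. The sketch below writes such a hypothetical file with Python's json module:

import json

# Hypothetical test_data.json -- assumed to mirror the prompt/context/response keys above
records = {
    "prompt": ["How are you?", "What is the capital of France?"],
    "context": ["You are a student, answering your teacher.", "You are a geography tutor."],
    "response": ["I am fine. Thank you", "The capital of France is Paris."],
}

with open("test_data.json", "w") as f:
    json.dump(records, f, indent=2)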

Result Management

Efficiently manage your test results with raga-llm-hub's robust result management features.

1. Printing Results

Instantly view your results in the notebook for a quick analysis.

evaluator.print_results()

2. Saving Results

Save your results to a file for in-depth analysis later or to share with your team.

evaluator.save_results("test_results.json")
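
Since the results are saved as JSON, you can reload the file with the standard library for your own analysis; the exact structure of the file depends on the tests you ran:

import json

# Reload the saved results for custom analysis or sharing
with open("test_results.json") as f:
    saved_results = json.load(f)

print(saved_results)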

3. Accessing Results

Access detailed results and metrics through the package's API for further processing or visualization.

results = evaluator.get_results() # keep the returned results for further processing or visualization

4. Re-using Previous Results

Leverage previous results for comparative analysis or to track your LLM's performance over time.

# Eval ID from a previous run
PREV_EVAL_ID = "433eba04-dfb8-4634-ae9c-5457d92b344d"

# Load the previous evaluation into the evaluator
evaluator.load_eval(
    eval_name=PREV_EVAL_ID
)

evaluator.print_results()
evaluator.save_results("test_results.json")
