`raga-llm-hub` is provided as an open-source Python package. To install it, run the following command:

pip install raga_llm_hub

You can also install the package inside a virtual environment:

with pip

  • python -m venv venv - Create a new Python environment.

  • source venv/bin/activate - Activate the environment.

  • pip install raga-llm-hub - Install the package.

with conda

  • conda create --name myenv - Create a new Python environment.

  • conda activate myenv - Activate the environment.

  • python -m pip install raga-llm-hub - Install the package.

Setting Up

Setting up raga-llm-hub is straightforward. Let's get your environment ready for testing. First things first, set up your API keys and any other necessary credentials. Many of the metrics are evaluated using an LLM as a judge, and currently OpenAI models are supported as the judge for scoring the evaluation tests.

from raga_llm_hub import RagaLLMEval

evaluator = RagaLLMEval(
    api_keys={"OPENAI_API_KEY": "Your Key"}
)

Loading Data

raga_llm_hub offers versatile data loading options to suit your needs.

1. Using user-provided content

Directly input your own texts or data snippets.

# Add tests with custom data
evaluator.add_test(
    test_names=["relevancy_test"],  # an example evaluation test name
    data={
        "prompt": ["How are you?"],
        "context": ["You are a student, answering your teacher."],
        "response": ["I am fine. Thank you"],
    },
    arguments={"model": "gpt-4", "threshold": 0.6},  # test arguments
).run()  # executes the test

evaluator.print_results() # print test results in a pretty format
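The `threshold` argument above acts as a pass/fail cut-off on the judge's score. A minimal sketch of that idea in plain Python (illustrative only; this is not the package's internal implementation):

```python
# The LLM judge assigns a score in [0, 1]; a test passes when the
# score meets the configured threshold. (Sketch, not raga-llm-hub code.)
def passes(score: float, threshold: float = 0.6) -> bool:
    return score >= threshold

print(passes(0.85, 0.6))  # a relevant response clears the threshold
print(passes(0.40, 0.6))  # a weak response does not
```

Raising the threshold makes a test stricter; lowering it makes the test more permissive.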

2. Loading from JSON file

Easily load your data from a JSON file, making bulk evaluations seamless.

# Add data to be tested from a JSON file
import json

with open("data.json") as f:  # filename is illustrative
    data = json.load(f)

evaluator.add_test(
    test_names=["relevancy_test"],
    data=data,
    arguments={"model": "gpt-4", "threshold": 0.6},
).run()
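A quick sketch of preparing such a file (assumption: the JSON mirrors the keys used in the inline-data example, i.e. `prompt`, `context`, and `response` lists):

```python
import json

# Write a small data file in the same shape as the inline-data example.
# (Illustrative; confirm the expected schema against the package docs.)
records = {
    "prompt": ["How are you?"],
    "context": ["You are a student, answering your teacher."],
    "response": ["I am fine. Thank you"],
}

with open("data.json", "w") as f:
    json.dump(records, f, indent=2)

# Reload to confirm the round trip
with open("data.json") as f:
    loaded = json.load(f)

print(loaded == records)  # -> True
```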


Result Management

Efficiently manage your test results with raga-llm-hub's robust result management features.

1. Printing Results

Instantly view your results in the notebook for a quick analysis.

evaluator.print_results()

2. Saving Results

Save your results to a file for in-depth analysis later, or to share with your team.

# Save the results to a file (filename is illustrative)
evaluator.save_results("results.json")

3. Accessing Results

Access detailed results and metrics through the package's API for further processing or visualization.

# Get the results as a Python object for further processing
results = evaluator.get_results()

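If the returned results behave like a list of records, a quick pass-rate computation might look like this (a sketch under assumed field names; inspect the real return value first):

```python
# Assumption: each result is a dict with "score" and "threshold" fields.
# The exact shape may differ between package versions.
results = [
    {"test_name": "relevancy_test", "score": 0.82, "threshold": 0.6},
    {"test_name": "relevancy_test", "score": 0.41, "threshold": 0.6},
]

pass_rate = sum(r["score"] >= r["threshold"] for r in results) / len(results)
print(f"pass rate: {pass_rate:.0%}")  # -> pass rate: 50%
```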
4. Re-using Previous Results

Leverage previous results for comparative analysis or to track your LLM's performance over time.

PREV_EVAL_ID = "433eba04-dfb8-4634-ae9c-5457d92b344d"

# Load a previous evaluation by its ID (method name illustrative;
# check the package docs for the exact loading API)
evaluator.load_eval(PREV_EVAL_ID)

