Faithfulness

Objective: The Faithfulness metric evaluates the accuracy of the RAG pipeline by comparing the LLM's response with the context retrieved from the Knowledge Base. It helps us ensure that the LLM grounds its answer in the provided context, rather than hallucinating new information or focusing on the wrong context when answering the prompt.

Required Parameters: response, context

Interpretation:

  • A lower faithfulness score can indicate that the model is not focusing on the correct context document.

  • A lower faithfulness score can indicate that the model is hallucinating, generating information that is not present in the context documents.

  • A lower faithfulness score can indicate that the Knowledge Base contains contradictory information about the topic referred to in the prompt.
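
To make the comparison between response and context more concrete, here is a minimal sketch of how a faithfulness-style score could be computed. It is not the implementation behind this metric: the actual metric relies on an LLM (see the "model" config below), whereas this toy version swaps in a simple word-overlap heuristic. The function names (split_claims, is_supported, toy_faithfulness) and the 0.8 threshold are hypothetical, chosen only for illustration.

```python
import re
from typing import List, Set


def split_claims(response: str) -> List[str]:
    """Naively treat each sentence of the response as one claim."""
    parts = re.split(r"(?<=[.!?])\s+", response.strip())
    return [p for p in parts if p]


def _tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric word set of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def is_supported(claim: str, context: str, threshold: float = 0.8) -> bool:
    """Assumption: a claim counts as supported when at least `threshold`
    of its words also appear in the context. A real evaluator would ask
    an LLM to judge this instead of counting word overlap."""
    claim_tokens = _tokens(claim)
    if not claim_tokens:
        return True
    overlap = len(claim_tokens & _tokens(context)) / len(claim_tokens)
    return overlap >= threshold


def toy_faithfulness(response: str, context: str) -> float:
    """Fraction of response claims that are supported by the context."""
    claims = split_claims(response)
    if not claims:
        return 0.0
    supported = sum(is_supported(c, context) for c in claims)
    return supported / len(claims)


context = "The Eiffel Tower is in Paris. It was completed in 1889."
faithful = "The Eiffel Tower is in Paris."
hallucinated = "The Eiffel Tower is in Paris. It is painted bright green every year."

print(toy_faithfulness(faithful, context))
print(toy_faithfulness(hallucinated, context))
```

In this sketch, the second response scores lower because one of its two sentences has almost no support in the context, which mirrors the hallucination case in the interpretation above.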

Code Execution:

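# Assumption: Experiment is provided by the ragaai_catalyst package;
# adjust this import to match your installation.
from ragaai_catalyst import Experiment

# Create the experiment for a given project and dataset
experiment_manager = Experiment(
    project_name="project_name",
    experiment_name="experiment_name",
    dataset_name="dataset_name",
)

# Add the faithfulness metric once per model configuration
response = experiment_manager.add_metrics(
    metrics=[
        {"name": "faithfulness", "config": {"model": "gpt-4o"}},
        {"name": "faithfulness", "config": {"model": "gpt-4"}},
        {"name": "faithfulness", "config": {"model": "gpt-3.5-turbo"}},
    ]
)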
