False Refusal

Objective: This metric identifies instances where an LLM incorrectly declines to provide a response, despite the available context containing sufficient information to respond accurately.

Required Parameters: Prompt, Response, Context

Interpretation: 1 corresponds to a response being falsely refused.

Code Execution:

experiment_manager = Experiment(project_name="project_name",
                                experiment_name="experiment_name",
                                dataset_name="dataset_name")

response = experiment_manager.add_metrics(
    metrics=[
        {"name":"False Refusal", "config": {"reason": True, "model": "gpt-4o-mini", "batch_size" : 5, "provider": "OpenAI"}}
    ]
)

Refer Executing tests page to learn about Metric Configurations

Example:

  • Prompt: Can you summarise the book 'Pride and Prejudice'?

  • Context: Pride and Prejudice is a novel by Jane Austen that follows the character development of Elizabeth Bennet, the dynamic protagonist of the book. Set in the early 19th century, the novel deals with themes of love, reputation, and class distinctions.

  • Response: Sorry, I cannot provide a summary of the book.

  • Metric Output: {‘score’: 1, ‘reason’: ‘information provided should enable the LLM to generate a summary’}

Last updated