Response Correctness

Objective: This metric measures how accurate and factually grounded the entire response is when compared against the expected response (ground truth).

Parameters: Prompt, Response, Expected Response

Interpretation: A higher score indicates that the model's response was correct for the prompt. A failed result indicates that the response is not factually consistent with the expected response.

Code Execution:

metrics = [
    {
        "name": "Response Correctness",
        # LLM used as the evaluator for this metric (model + provider).
        "config": {"model": "gpt-4o-mini", "provider": "openai"},
        "column_name": "your-column-identifier",
        "schema_mapping": schema_mapping,
    }
]

The "schema_mapping" variable needs to be defined first and is a pre-requisite for evaluation runs. Learn how to set this variable here.

Example:

  • Prompt: Who was the first person to walk on the moon and when did it happen?

  • Expected Response (Ground Truth): The first person to walk on the moon was Neil Armstrong, and it happened on July 20, 1969.

  • Response: The first person to walk on the moon was Buzz Aldrin, and it happened on July 20, 1970.

  • Metric Output: {'score': 0, 'reason': 'Neil Armstrong was the first person to walk on the moon, on July 20, 1969.'}
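Since the metric returns a small dictionary, a downstream pass/fail check reduces to reading the score field. A minimal sketch, assuming the output shape shown in the example above (a dict with "score" and "reason" keys):

# Minimal sketch: consume the metric output shown above. Only the
# "score"/"reason" shape is taken from the example; nothing else is
# guaranteed by the SDK.
metric_output = {
    "score": 0,
    "reason": "Neil Armstrong was the first person to walk on the moon, on July 20, 1969.",
}

if metric_output["score"] == 0:
    # Failed result: the response contradicts the expected response.
    print(f"Incorrect response: {metric_output['reason']}")
else:
    print("Response is consistent with the ground truth.")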
