Cosine Similarity
Cosine Similarity is a span-level metric that measures the similarity between the response and the ground truth in an Agentic application. It is well suited for evaluating how closely a model's response aligns with the expected output.
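For reference, cosine similarity between two vectors a and b (for example, vector representations of the ground truth and the response) is generally defined as:

$$\text{similarity}(\mathbf{a}, \mathbf{b}) = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}$$

Scores close to 1 indicate strong alignment between the two texts, while scores near 0 indicate little similarity.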
Schema
The schema for the Cosine Similarity metric includes the following fields (a computational sketch follows the list):
Ground Truth: The expected output for the given input.
Response: The model's actual output.
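Because the metric only compares these two fields, the underlying computation is straightforward. The sketch below shows one way such a score could be produced outside the platform; the embedding model ("all-MiniLM-L6-v2" via sentence-transformers) and the helper function are illustrative assumptions, not the platform's actual implementation.

```python
# Illustrative sketch: score a response against a ground truth with cosine similarity.
# The embedding model below is an assumption for demonstration purposes only.
import numpy as np
from sentence_transformers import SentenceTransformer

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

model = SentenceTransformer("all-MiniLM-L6-v2")

ground_truth = "The capital of France is Paris."
response = "Paris is the capital city of France."

# Embed both texts and compare their directions in embedding space.
gt_vec, resp_vec = model.encode([ground_truth, response])
print(f"cosine similarity: {cosine_similarity(gt_vec, resp_vec):.3f}")
```

A high score means the response points in nearly the same direction as the ground truth in embedding space, which is why the metric handles paraphrased but semantically equivalent answers well.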
How to Run the Metric
Access the Dataset
Navigate to the dataset where you want to evaluate responses.
Click on the Evaluate button.
Select the Metric
Choose Cosine Similarity-Alteryx from the list of metrics.
You can rename the metric for clarity, if needed.
Choose the Evaluation Type
Select the component type you wish to evaluate:
LLM: Evaluate spans related to language model outputs.
Agent: Evaluate agent-level interactions.
Tool: Evaluate tool-related outputs.
Define the Schema
Specify the span name you want to evaluate.
Choose the parameter for evaluation, such as:
input
output
ground truth
Configure the Model
Select the model configuration to be used for evaluation.
Set Passing Criteria
Define the pass/fail threshold that the similarity score must meet (a sketch of this check follows the steps below).
Run the Metric
Click on Run to start the evaluation process.
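Once the run completes, each evaluated span receives a similarity score that is compared against the passing criteria defined earlier. A minimal sketch of how such a threshold check might work is shown below; the 0.8 threshold and the result fields are assumptions for illustration, not platform defaults.

```python
# Illustrative sketch: apply a pass/fail threshold to a cosine-similarity score.
# The 0.8 threshold and result fields are assumptions, not platform defaults.
def evaluate_span(score: float, threshold: float = 0.8) -> dict:
    """Return the similarity score together with a pass/fail verdict."""
    return {"score": round(score, 3), "passed": score >= threshold}

print(evaluate_span(0.91))  # {'score': 0.91, 'passed': True}
print(evaluate_span(0.62))  # {'score': 0.62, 'passed': False}
```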
When to Run the Cosine Similarity Metric?
Response Accuracy: Use this metric when you want to determine how closely the system's response matches the expected ground truth.
Span-Level Analysis: Cosine Similarity is particularly useful for evaluating specific spans or segments of a trace where text similarity is critical.
Model Comparison: Employ this metric to compare outputs from different models or configurations and identify which aligns better with the ground truth.