Cosine Similarity
Cosine Similarity is a span-level metric that measures the similarity between the response and the ground truth in an Agentic application. It is well suited for evaluating how closely a model's response aligns with the expected output.
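For reference, cosine similarity between two vectors a and b (for example, vector representations of the ground truth and the response) is generally defined as:

$$\text{similarity}(\mathbf{a}, \mathbf{b}) = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}$$

Scores close to 1 indicate strong alignment between the two texts, while scores near 0 indicate little similarity.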
Schema
The schema for the Cosine Similarity metric includes the following fields (a computational sketch follows the list):
Ground Truth: The expected output for the given input.
Response: The model's actual output.
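Because the metric only compares these two fields, the underlying computation is straightforward. The sketch below shows one way such a score could be produced outside the platform; the embedding model ("all-MiniLM-L6-v2" via sentence-transformers) and the helper function are illustrative assumptions, not the platform's actual implementation.

```python
# Illustrative sketch: score a response against a ground truth with cosine similarity.
# The embedding model below is an assumption for demonstration purposes only.
import numpy as np
from sentence_transformers import SentenceTransformer

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

model = SentenceTransformer("all-MiniLM-L6-v2")

ground_truth = "The capital of France is Paris."
response = "Paris is the capital city of France."

# Embed both texts and compare their directions in embedding space.
gt_vec, resp_vec = model.encode([ground_truth, response])
print(f"cosine similarity: {cosine_similarity(gt_vec, resp_vec):.3f}")
```

A high score means the response points in nearly the same direction as the ground truth in embedding space, which is why the metric handles paraphrased but semantically equivalent answers well.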
How to Run the Metric
Access the Dataset
Navigate to the dataset where you want to evaluate responses.
Click on the Evaluate button.
Select the Metric
Choose Cosine Similarity-Alteryx from the list of metrics.
You can rename the metric for clarity, if needed.
Choose the Evaluation Type
Select the component type you wish to evaluate:
LLM: Evaluate spans related to language model outputs.
Agent: Evaluate agent-level interactions.
Tool: Evaluate tool-related outputs.
Define the Schema
Specify the span name you want to evaluate.
Choose the parameter for evaluation, such as:
input
output
ground truth
Configure the Model
Select the model configuration to be used for evaluation.
Set Passing Criteria
Define the pass/fail threshold that the similarity score must meet (a sketch of this check follows the steps below).
Run the Metric
Click on Run to start the evaluation process.
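Once the run completes, each evaluated span receives a similarity score that is compared against the passing criteria defined earlier. A minimal sketch of how such a threshold check might work is shown below; the 0.8 threshold and the result fields are assumptions for illustration, not platform defaults.

```python
# Illustrative sketch: apply a pass/fail threshold to a cosine-similarity score.
# The 0.8 threshold and result fields are assumptions, not platform defaults.
def evaluate_span(score: float, threshold: float = 0.8) -> dict:
    """Return the similarity score together with a pass/fail verdict."""
    return {"score": round(score, 3), "passed": score >= threshold}

print(evaluate_span(0.91))  # {'score': 0.91, 'passed': True}
print(evaluate_span(0.62))  # {'score': 0.62, 'passed': False}
```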
When to Run the Cosine Similarity Metric?
Response Accuracy: Use this metric when you want to determine how closely the system's response matches the expected ground truth.
Span-Level Analysis: Cosine Similarity is particularly useful for evaluating specific spans or segments of a trace where text similarity is critical.
Model Comparison: Employ this metric to compare outputs from different models or configurations and identify which aligns better with the ground truth.