Cosine Similarity

Cosine Similarity is a span-level metric that measures how similar a response is to the ground truth in an Agentic application. It is well suited for evaluating how closely a model's response aligns with the expected output.
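
Conceptually, cosine similarity compares two text embeddings by the angle between them: a score near 1 means the texts are semantically close, while a score near 0 means they are unrelated. The sketch below is a minimal illustration of that computation only; it does not reflect the platform's internal implementation, and the example vectors are stand-ins for real embeddings.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors (1.0 = same direction)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom else 0.0

# Hypothetical embedding vectors for a response and its ground truth.
response_vec = np.array([0.12, 0.87, 0.43])
ground_truth_vec = np.array([0.10, 0.90, 0.40])

print(round(cosine_similarity(response_vec, ground_truth_vec), 3))  # close to 1.0
```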


Schema

The schema for the Cosine Similarity metric includes:

  • Ground Truth: The expected output for the given input.

  • Response: The model's actual output.
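
For illustration, a dataset row that satisfies this schema might look like the following; the field names are assumptions for the sketch, not the platform's exact column names.

```python
# Hypothetical dataset row; keys are illustrative, not the exact schema fields.
row = {
    "ground_truth": "The invoice total is $1,250.",     # expected output for the input
    "response": "The total on the invoice is $1,250.",  # model's actual output
}
```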


How to Run the Metric

  1. Access the Dataset

    • Navigate to the dataset where you want to evaluate responses.

    • Click on the Evaluate button.

  2. Select the Metric

    • Choose Cosine Similarity-Alteryx from the list of metrics.

    • You can rename the metric for clarity, if needed.

  3. Choose the Evaluation Type

    • Select the component type you wish to evaluate:

      • LLM: Evaluate spans related to language model outputs.

      • Agent: Evaluate agent-level interactions.

      • Tool: Evaluate tool-related outputs.

  4. Define the Schema

    • Specify the span name you want to evaluate.

    • Choose the parameter for evaluation, such as:

      • input

      • output

      • ground truth

  5. Configure the Model

    • Select the model configuration to be used for evaluation.

  6. Set Passing Criteria

    • Define the pass/fail threshold for the metric (see the sketch after these steps).

  7. Run the Metric

    • Click on Run to start the evaluation process.
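
As a rough illustration of how a passing criterion (step 6) is applied, the sketch below scores one evaluated span and compares it against a threshold. The 0.8 cutoff and the example vectors are assumptions, not the platform's defaults.

```python
import numpy as np

# Hypothetical embeddings for one evaluated span.
response_vec = np.array([0.20, 0.90, 0.40])
ground_truth_vec = np.array([0.25, 0.85, 0.45])

# Cosine similarity, computed as in the sketch at the top of this page.
score = float(np.dot(response_vec, ground_truth_vec)
              / (np.linalg.norm(response_vec) * np.linalg.norm(ground_truth_vec)))

PASS_THRESHOLD = 0.8  # assumed cutoff; set this to your own quality bar
print("pass" if score >= PASS_THRESHOLD else "fail", round(score, 3))
```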


When to Run the Cosine Similarity Metric?

  • Response Accuracy: Use this metric to determine how closely the system's response matches the expected ground truth.

  • Span-Level Analysis: Cosine Similarity is particularly useful for evaluating specific spans or segments of a trace where text similarity is critical.

  • Model Comparison: Use this metric to compare outputs from different models or configurations and identify which aligns more closely with the ground truth (see the sketch after this list).
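
To make the model-comparison use case concrete, the sketch below scores two candidate outputs against the same ground truth and picks the higher one. It reuses the same helper as the first sketch; the embeddings are precomputed, hypothetical values rather than real model outputs.

```python
import numpy as np

def cosine_similarity(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom else 0.0

# Precomputed (hypothetical) embeddings: one ground truth, two candidate model outputs.
ground_truth = np.array([0.10, 0.80, 0.50])
candidates = {
    "model_a": np.array([0.12, 0.79, 0.52]),
    "model_b": np.array([0.60, 0.20, 0.10]),
}

scores = {name: cosine_similarity(vec, ground_truth) for name, vec in candidates.items()}
best = max(scores, key=scores.get)
print(best, {name: round(s, 3) for name, s in scores.items()})  # model_a aligns more closely
```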
