Compare Experiments
The Compare Experiments feature in RagaAI Catalyst enables users to analyze and contrast up to 3 experiments side by side, helping identify differences and patterns between runs. This feature is ideal for evaluating the performance and impact of changes across different experimental setups.
Benefits of Experiment Comparison
Comprehensive Analysis: Identify patterns and dependencies by comparing performance metrics and behaviors across experiments.
Cost Optimization: Gain insights into cost breakdowns and opportunities for efficiency improvements.
Versioning Insights: Understand how different code versions and configurations impact overall outcomes.
Steps to Compare Experiments
Navigate to the Dataset Page
Inside the dataset view, locate the section displaying the selected dataset name.
Select Compare Dataset
Click on the Compare Dataset button to initiate the experiment comparison process.
Choose Experiments
Select up to 2 additional experiments to compare against the current one (up to 3 experiments in total).
Start Comparison
Click the Compare Experiments button to generate the Diff View.
Baseline Experiment
You can change the baseline experiment to analyze the comparison relative to a different experiment.
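The selection rules behind these steps can be sketched as a small validation helper. This is illustrative pseudologic only, not the Catalyst SDK or API; the function name, limit constant, and experiment IDs are assumptions for the example.

```python
# Illustrative sketch (NOT the Catalyst SDK): the selection rules the UI
# enforces, expressed as a small validation helper.
MAX_EXPERIMENTS = 3  # total experiments in one comparison, per the docs


def build_comparison(baseline: str, others: list[str]) -> dict:
    """Validate a comparison request: one baseline plus up to two others."""
    selected = [baseline, *others]
    if len(selected) > MAX_EXPERIMENTS:
        raise ValueError(f"Select at most {MAX_EXPERIMENTS} experiments in total")
    if len(set(selected)) != len(selected):
        raise ValueError("Each experiment may appear only once")
    # The baseline can later be swapped for any other selected experiment.
    return {"baseline": baseline, "experiments": selected}
```

Changing the baseline would then amount to replacing the `baseline` entry while keeping the same selected set.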
Features of the Diff View
Code Diff
Displays a side-by-side comparison of the code versions or configurations used in the experiments.
Highlights differences, additions, and deletions.
Allows toggling between multiple code versions for in-depth comparison.
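The kind of line-level difference the Code Diff highlights can be reproduced with Python's standard `difflib` module. The configuration lines below are invented for illustration and do not come from Catalyst.

```python
import difflib

# Hypothetical config lines from two experiments (illustration only).
exp_1 = ["model = gpt-3.5-turbo", "temperature = 0.7"]
exp_2 = ["model = gpt-4", "temperature = 0.2", "max_tokens = 512"]

# unified_diff marks deletions with "-" and additions with "+",
# analogous to what the Code Diff view highlights.
diff = list(difflib.unified_diff(exp_1, exp_2,
                                 fromfile="exp-1", tofile="exp-2",
                                 lineterm=""))
print("\n".join(diff))
```

A side-by-side rendering, as in the Diff View, presents the same additions and deletions in two columns rather than interleaved.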
Analysis Section
Tool Usage Count Bar Chart:
Visualizes how often each tool is used across experiments.
Reveals patterns and dependencies in tool utilization.
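The aggregation behind such a bar chart is a per-experiment tally of tool calls. A minimal sketch using `collections.Counter`, with made-up experiment names and tool-call traces:

```python
from collections import Counter

# Hypothetical tool-call traces per experiment (illustration only).
traces = {
    "exp-1": ["search", "search", "calculator", "search"],
    "exp-2": ["search", "retriever", "retriever"],
}

# Tool usage count per experiment, as plotted on the bar chart.
usage = {exp: Counter(calls) for exp, calls in traces.items()}
```

Here `usage["exp-1"]` shows exp-1 leaning heavily on `search`, while exp-2 depends on `retriever`, the kind of dependency pattern the chart surfaces.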
Time vs. Tool Calls Chart:
Shows how tool execution times vary between experiments.
Helps identify bottlenecks or performance discrepancies.
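Underlying such a chart is an average execution time per tool per experiment. A minimal sketch with invented timing data:

```python
from statistics import mean

# Hypothetical (experiment, tool, seconds) tool-call records.
calls = [
    ("exp-1", "search", 0.42),
    ("exp-1", "search", 0.58),
    ("exp-2", "search", 1.10),
    ("exp-2", "retriever", 0.30),
]

# Group durations by (experiment, tool), then average them.
durations: dict[tuple[str, str], list[float]] = {}
for exp, tool, secs in calls:
    durations.setdefault((exp, tool), []).append(secs)
avg_seconds = {key: mean(vals) for key, vals in durations.items()}
```

A jump like `search` averaging ~0.5 s in exp-1 but 1.1 s in exp-2 is exactly the kind of bottleneck the chart makes visible.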
Token Consumption Analysis:
Compares model interactions across experiments in terms of token usage.
Enables users to evaluate efficiency and resource allocation.
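A token-consumption comparison reduces to totaling prompt and completion tokens per experiment and taking the delta against the baseline. The token counts below are invented for illustration:

```python
# Hypothetical token usage per experiment (illustration only).
tokens = {
    "exp-1": {"prompt": 12_000, "completion": 4_000},  # baseline
    "exp-2": {"prompt": 9_500, "completion": 3_200},
}

# Total tokens per experiment.
totals = {exp: t["prompt"] + t["completion"] for exp, t in tokens.items()}

# Delta vs. baseline: negative means fewer tokens consumed.
delta_vs_baseline = totals["exp-2"] - totals["exp-1"]
```

In this sketch exp-2 consumes 3,300 fewer tokens than the baseline, the kind of efficiency signal this analysis exposes.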
Cost Analysis:
Breaks down the cost incurred in each experiment.
Helps guide optimization decisions by identifying high-cost components.
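A typical token-based cost breakdown multiplies prompt and completion token counts by per-token rates. The rates below are hypothetical placeholders, not actual provider or Catalyst pricing:

```python
# Hypothetical USD rates per 1K tokens (illustration only, not real pricing).
RATES = {"prompt": 0.01, "completion": 0.03}


def experiment_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate cost as (tokens / 1000) * per-1K-token rate, summed."""
    return (prompt_tokens / 1000 * RATES["prompt"]
            + completion_tokens / 1000 * RATES["completion"])
```

Comparing `experiment_cost(...)` per experiment makes high-cost components stand out, which is what guides the optimization decisions mentioned above.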
Why Compare Experiments?
Data-Driven Insights: Make informed decisions about model configurations, tool selection, and resource allocation.
Performance Benchmarking: Establish baselines and measure improvements or regressions.
Experiment Optimization: Refine workflows and configurations based on comparative data.