Quickstart

1. Create a Project

  • Log in to the RagaAI Catalyst platform.

  • Navigate to the Projects section and click Create New Project.

  • Select the use case as "Agentic Application".

  • Provide a name for your project and click Create.


2. Create a Dataset Using Tracing

  • Once your project is created, you'll be redirected to the Dataset page.

  • Instrument your application with tracing so that its interactions are recorded. Refer to the tracing documentation for setup instructions.

  • Commit the traces to create a dataset. A dataset represents an experiment and contains all the recorded interactions for evaluation.

  • View the dataset columns, which include essential information such as TraceID, Timestamp, Trace URL, Feedback, Response, and Metrics.
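
To make the idea of tracing concrete, here is a minimal, self-contained sketch of what "recording an interaction" can look like. It does not use the RagaAI Catalyst SDK; the `TraceRecord` class, the `traced` decorator, and the in-memory `dataset` list are hypothetical names invented for illustration, and only some of the columns listed above are modeled.

```python
import functools
import time
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class TraceRecord:
    """One dataset row; fields loosely mirror some of the columns listed above."""
    trace_id: str
    timestamp: str
    response: str
    feedback: str = ""
    metrics: dict = field(default_factory=dict)


# Hypothetical in-memory stand-in for a committed dataset.
dataset = []


def traced(func):
    """Record each call to `func` as a TraceRecord (illustrative only)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        dataset.append(
            TraceRecord(
                trace_id=uuid.uuid4().hex,
                timestamp=datetime.now(timezone.utc).isoformat(),
                response=str(result),
                metrics={"latency_s": elapsed},
            )
        )
        return result
    return wrapper


@traced
def answer(question: str) -> str:
    # Placeholder for a real agent or LLM call.
    return f"Echo: {question}"


answer("What is tracing?")
print(dataset[0].response)
```

In the platform, the equivalent of appending to `dataset` is committing the traces, after which they appear as rows on the Dataset page.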


3. Run Evaluations and Metrics

  • Navigate to your dataset and click the Evaluate button.

  • Choose from pre-configured metrics such as Hallucination, Cosine Similarity, Honesty, or Toxicity.

  • Configure the metric:

    • Select the evaluation type (e.g., LLM, Agent, or Tool).

    • Define the schema by choosing the span name and parameters to evaluate.

    • Configure the model and set pass/fail thresholds.

  • Run the metric and review the results to see which traces pass or fail the thresholds you set.
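
As an illustration of how a metric score and a pass/fail threshold relate, the sketch below computes a plain cosine similarity between two vectors and applies a threshold. It is a generic example, not the platform's implementation; the function names, the sample vectors, and the 0.8 threshold are assumptions chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Similarity is undefined for a zero vector; treat it as 0 here.
        return 0.0
    return dot / (norm_a * norm_b)


def verdict(score, threshold):
    """Return "pass" when the score meets the threshold, otherwise "fail"."""
    return "pass" if score >= threshold else "fail"


# Made-up embedding vectors for an expected and an actual response.
expected = [1.0, 0.0, 1.0]
actual = [1.0, 1.0, 1.0]

score = cosine_similarity(expected, actual)
print(round(score, 4), verdict(score, threshold=0.8))
```

Raising the threshold makes the check stricter: the same score of about 0.82 would fail against a threshold of 0.9.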


4. Compare Traces

  • Within the dataset, click the Compare button.

  • Select up to 3 datapoints (traces) to compare.

  • Open the diff view, which highlights differences in code and attributes between the selected traces.

  • Use this feature to analyze performance variations across different spans of the application.
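
The kind of attribute comparison a diff view performs can be sketched in a few lines. This is a generic illustration, not the platform's diff logic; the `diff_attributes` function and the trace attributes shown are hypothetical.

```python
def diff_attributes(base, other):
    """Return {key: (base_value, other_value)} for every key whose value differs.

    A key present in only one trace is reported with None on the missing side.
    """
    missing = object()  # Sentinel so that a stored None is not mistaken for "absent".
    changes = {}
    for key in sorted(base.keys() | other.keys()):
        left = base.get(key, missing)
        right = other.get(key, missing)
        if left != right:
            changes[key] = (
                None if left is missing else left,
                None if right is missing else right,
            )
    return changes


# Made-up attributes for two traces of the same application.
trace_a = {"model": "model-a", "temperature": 0.2, "tool_calls": 3}
trace_b = {"model": "model-a", "temperature": 0.7, "tool_calls": 5, "retries": 1}

for key, (before, after) in diff_attributes(trace_a, trace_b).items():
    print(f"{key}: {before} -> {after}")
```

Unchanged attributes (here, `model`) are omitted, so the output contains only what differs between the two traces.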


5. Compare Experiments

  • In the Dataset view, select Compare Datasets.

  • Choose up to 3 experiments for comparison.

  • Click Compare Experiments to open a new diff view:

    • Compare code versions and toggle between different configurations.

    • Change the baseline experiment for a more detailed comparison.

    • Analyze graphs such as:

      • Tool Usage Count Bar Chart: Understand patterns in tool usage.

      • Time vs. Tool Calls Chart: Compare execution times across experiments.

      • Cost Analysis: Evaluate cost efficiency between experiments.
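
The comparisons behind these charts can be approximated offline. The sketch below counts tool usage per experiment and reports count and cost differences relative to a chosen baseline. The data layout, the `compare_to_baseline` function, and all values are made up for illustration and do not reflect the platform's export format.

```python
from collections import Counter

# Made-up experiment summaries: tool calls in order, plus total cost.
experiments = {
    "baseline": {
        "tool_calls": ["search", "search", "calculator"],
        "cost_usd": 0.012,
    },
    "candidate": {
        "tool_calls": ["search", "calculator", "calculator", "browser"],
        "cost_usd": 0.009,
    },
}


def compare_to_baseline(experiments, baseline_name):
    """Report per-tool count deltas and cost deltas against the baseline."""
    base = experiments[baseline_name]
    base_counts = Counter(base["tool_calls"])
    report = {}
    for name, exp in experiments.items():
        if name == baseline_name:
            continue
        counts = Counter(exp["tool_calls"])
        tools = sorted(base_counts.keys() | counts.keys())
        report[name] = {
            # Counter returns 0 for tools an experiment never called.
            "tool_count_delta": {t: counts[t] - base_counts[t] for t in tools},
            "cost_delta_usd": round(exp["cost_usd"] - base["cost_usd"], 6),
        }
    return report


report = compare_to_baseline(experiments, baseline_name="baseline")
print(report)
```

Passing a different `baseline_name` corresponds to changing the baseline experiment in the diff view: every delta is recomputed relative to the new reference.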

