User Quickstart
This guide will help you get started with RagaAI Catalyst using any sample dataset, straight from your Python environment.
The above video demonstrates all the steps required to perform your first evaluation on the Catalyst software. All commands shown in the video have been documented below.
Click here to open the companion notebook in Google Colab
1. Sign Up and Authentication
1.1 Sign Up
Start by signing up for an account at https://catalyst.raga.ai/
1.2 Install Python Package
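The SDK is distributed on PyPI; a minimal install (package name assumed from the library's public repository):

```shell
pip install ragaai-catalyst
```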
1.3 Authenticate Using Keys
Authenticate using your access key and secret key:
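A minimal sketch, assuming the `RagaAICatalyst` entry point exposed by the `ragaai_catalyst` package; the placeholder key strings are yours to replace:

```python
from ragaai_catalyst import RagaAICatalyst

# Initialise the client with the key pair generated under Settings -> Authenticate.
catalyst = RagaAICatalyst(
    access_key="YOUR_ACCESS_KEY",
    secret_key="YOUR_SECRET_KEY",
)
```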
You can find your Catalyst Access and Secret Keys in the UI by navigating to Settings -> Authenticate and clicking the "Generate New Key" button to create a fresh pair:
A valid API key is required to run evaluations on the public sandbox. You can add your API keys in the UI by navigating to Settings -> API Keys as follows:
2. Create New Project
Create a new project for testing your LLM Application:
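A sketch of project creation, assuming an authenticated `catalyst` client and the SDK's `create_project` method; the project name and use-case label here are placeholders:

```python
# Assumes `catalyst` is an authenticated RagaAICatalyst client.
project = catalyst.create_project(
    project_name="Test-RAG-App-1",
    usecase="Chatbot",  # placeholder use-case label; pick one offered by the platform
)
```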
You can also create the project via the UI by clicking the "Create New Project" button on the platform homepage:
List your projects:
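A sketch of listing projects, assuming the client's `list_projects` method:

```python
# Assumes `catalyst` is an authenticated RagaAICatalyst client.
# List the two most recently created projects.
catalyst.list_projects(num_projects=2)
```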
Note: 'num_projects' is optional; it limits the listing to the latest 'n' projects (here, 2). Calling list_projects() without any arguments lists all available projects.
3. Upload Dataset
RagaAI Catalyst enables users to ingest data in two broad ways: real-time tracing and static uploads of data (CSV, Pandas, etc.).
3.1 Upload Dataset via CSV
Once your project is created, you can upload datasets to it. Here's how you can upload a dataset via a CSV file:
Select your newly created project from the project list.
Navigate to the Dataset tab within the project.
Click on the Create New Dataset button.
In the upload window, select the Upload CSV tab.
Click on the upload area and browse/drag and drop your local CSV file. Ensure the file size does not exceed 1GB.
Enter a suitable name for your dataset.
Click Next to proceed.
Next, you will be directed to map your dataset's columns to Catalyst's inbuilt schema, so that your column headings don't need to be edited:
Here is a list of Catalyst's inbuilt schema elements (definitions are for reference purposes and may vary slightly based on your use case):
3.2 Upload Dataset Via Tracing
Set up a tracer to monitor your application:
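A minimal sketch, assuming the SDK's `Tracer` class; the project and dataset names are placeholders:

```python
from ragaai_catalyst import Tracer

# Traces logged after start() will be uploaded to the named
# project/dataset on Catalyst.
tracer = Tracer(
    project_name="Test-RAG-App-1",
    dataset_name="rag_trace_dataset",
)
tracer.start()
```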
Once the tracer has successfully started, execute your RAG Langchain code. You can find a sample RAG code in this Colab project. Your OpenAI API key will be required to run this example.
Once traces have been logged using your code, stop the tracer and check its status:
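Stopping and checking the tracer might look like the following; the status helper's name is assumed, so consult the SDK reference for the exact call:

```python
# Stop tracing once your RAG code has finished executing,
# then poll the upload status of the logged traces.
tracer.stop()
tracer.get_upload_status()  # method name assumed; see the SDK reference
```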
If successful, you can view the logged dataset by navigating to Project_Name -> Logs.
In order to run metrics on traced logs, Catalyst requires users to create a sub-dataset as follows:
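An illustrative sketch of creating a sub-dataset from traced logs, assuming the SDK's `Dataset` manager; the method name and filter format below are hypothetical and should be checked against the SDK reference:

```python
from ragaai_catalyst import Dataset

dataset_manager = Dataset(project_name="Test-RAG-App-1")

# `create_from_trace` and its filter format are illustrative only --
# consult the SDK reference for the exact method and arguments.
dataset_manager.create_from_trace(
    dataset_name="rag_sub_dataset",
    filter=[{"name": "prompt_length", "values": {"gte": 100}}],
)
```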
You can also do this on the Catalyst UI by navigating to Project_Name -> Logs and clicking on the "Create Dataset" button.
In the dialog box that pops up, users can create a sub-dataset from the traced logs by filtering on metadata, human feedback, prompt length, and time as shown:
4. Create Experiments and Run Metrics
Set up an experiment and add metrics for evaluation:
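A sketch of experiment setup, assuming the SDK's `Experiment` class; names, descriptions, and the metric configuration are placeholders:

```python
from ragaai_catalyst import Experiment

experiment = Experiment(
    project_name="Test-RAG-App-1",
    experiment_name="Exp-01",
    experiment_description="First evaluation run",
    dataset_name="rag_sub_dataset",
)

# Metric names and config keys are placeholders; see the Supported
# Metrics list for valid names and configurations.
experiment.add_metrics(
    metrics=[{"name": "hallucination", "config": {"model": "gpt-4o"}}]
)
```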
Check out the list of Supported metrics.
Check out the Test Execution documentation to learn about metric configuration.
Once the experiment is executed, get the status of your experiment execution by navigating to Job Status from the navigation bar on the user interface. Alternatively, run the following command to print the status in your notebook:
You can view the experiment on the platform. Alternatively, run the following command to print the results in your notebook:
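Fetching status and results might look like the following, assuming the `experiment` object created in the previous step and the `Experiment` class's status/results accessors (method names assumed):

```python
# Assumes `experiment` is the Experiment object created earlier.
experiment.get_status()   # job status, as shown under Job Status in the UI
experiment.get_results()  # evaluation results for the experiment
```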
Experiment View:
Refer to the documentation to learn more about analysing new results.
Schema Element | Definition
---|---
traceId | Unique ID associated with a trace
metadata | Any additional data not falling into a defined bucket
cost | Expense associated with generating a particular inference
expected_context | Context documents expected to be retrieved for a query
latency | Time taken for an inference to be returned
system_prompt | Predefined instruction provided to an LLM to shape its behaviour during interactions
traceUri | Unique identifier used to trace and log the sequence of operations during an LLM inference process
pipeline | Sequence of processes or stages that an input passes through before producing an output in LLM systems
response | Output generated by an LLM after processing a given prompt or query
context | Surrounding information or history provided to an LLM to inform and influence its responses
sanitized_response | A version of the LLM's output that has been modified to remove sensitive, inappropriate, or undesirable content
prompt | Input or query provided to an LLM that triggers the generation of a response
expected_response | Anticipated or ideal output that an LLM should produce in response to a given prompt
timestamp | Specific date and time at which an LLM action, such as an inference or a response, occurs