Add Dataset

The Add Dataset feature in RagaAI Catalyst lets you extend an existing dataset in two ways: by adding new rows or by adding new columns. You can append additional data or introduce new response columns generated from prompts. The workflow depends on how the dataset was originally uploaded (CSV or traces).


Adding Rows

1. How to Add Rows

  • Click the "+" icon and choose the "Add Rows" option.

2. Adding Rows via CSV

  • If your dataset was originally uploaded via CSV, you can append new rows by uploading another CSV file.

    • The new CSV must match the existing dataset schema or a valid subset of it.

3. Adding Rows via Logging Traces

  • If your dataset was originally uploaded via logging traces, you can append new rows only via tracing.

    • This maintains consistency with the original dataset format.

Important Note: Ensure that the format of the new data (CSV or trace) aligns with the schema of the existing dataset to avoid any upload errors.
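The schema requirement for CSV uploads can be checked locally before uploading. The following is a minimal sketch, not part of the RagaAI Catalyst API: it uses only the Python standard library, the column names are hypothetical, and it assumes that "a valid subset" means every column in the new file already exists in the dataset.

```python
import csv
import io


def check_csv_schema(existing_columns, new_csv_text):
    """Return the columns of the new CSV that are absent from the existing schema.

    An empty list means the new file's header is a subset of the existing
    columns (our assumed reading of "a valid subset").
    """
    reader = csv.reader(io.StringIO(new_csv_text))
    header = next(reader, [])  # first row is treated as the header
    known = set(existing_columns)
    return [name.strip() for name in header if name.strip() not in known]


# Hypothetical schema of the dataset already in the platform.
existing_columns = ["prompt", "context", "response"]

# Hypothetical contents of the CSV you plan to append.
new_csv_text = "prompt,response\nWhat is RAG?,Retrieval-augmented generation\n"

unexpected = check_csv_schema(existing_columns, new_csv_text)
if unexpected:
    print(f"Unexpected columns: {unexpected}")
else:
    print("Header matches the existing schema or a subset of it")
```

A check like this only compares column names; it does not validate data types or trace formats, which the platform may also enforce.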


Adding Columns

1. How to Add Columns

  • Click the "+" icon and choose the "Add Columns" option.

2. Creating a New Response Column

  • Enter a unique name for the new response column.

3. Selecting Column Type

  • Set the column type to "Run Prompt". With this type, the column's values are generated by running a prompt against the dataset.

4. Importing a Prompt Slug

  • Click "Import Slug" to pull in an existing prompt from the playground.

    • You’ll need to select the prompt name and version.

5. Mapping Variables

  • Map the variables in the prompt slug to the relevant columns in your dataset. This mapping allows the prompt to generate new responses based on the existing data.

6. Optional: Applying Filters

  • You can apply filters to specify which data points you want to generate responses for.

    • This is useful if you want to run the prompt on a subset of the dataset instead of the entire dataset.
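Conceptually, variable mapping and filtering work together as sketched below. This is an illustration only, not how RagaAI Catalyst implements the feature: the `{{variable}}` placeholder syntax, the function names, and the sample columns are all assumptions made for the example.

```python
import re


def render_prompt(template, variable_to_column, row):
    """Fill each {{variable}} in the template with the value of its mapped column."""
    def substitute(match):
        variable = match.group(1)
        column = variable_to_column[variable]  # mapping chosen in "Mapping Variables"
        return str(row[column])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


def build_prompts(template, variable_to_column, rows, row_filter=None):
    """Render the prompt for every row that passes the optional filter."""
    selected = [row for row in rows if row_filter is None or row_filter(row)]
    return [render_prompt(template, variable_to_column, row) for row in selected]


# Hypothetical dataset rows and prompt template.
rows = [
    {"question": "What is RAG?", "lang": "en"},
    {"question": "Qu'est-ce que RAG ?", "lang": "fr"},
]
template = "Answer concisely: {{query}}"
mapping = {"query": "question"}  # prompt variable -> dataset column

# Only generate responses for English rows (the optional filter step).
prompts = build_prompts(template, mapping, rows, row_filter=lambda r: r["lang"] == "en")
print(prompts)
```

Each rendered prompt would then be sent to the model, and the response stored in the new column for that row; rows excluded by the filter would have no generated response.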
