Logging traces

Another way to upload a dataset is to log traces from your application. Follow these steps:

  • Set up a tracer to monitor your application:

from ragaai_catalyst import Tracer

tracer = Tracer(
    project_name="your_project_name",
    tracer_type="langchain",
    pipeline={
        "llm_model": "gpt-4o-mini",
        "vector_store": "faiss",
        "embed_model": "text-embedding-ada-002",
    },
    # Add your metadata as "key": "value" pairs
    metadata={"use-case": "YourUseCase", "stage": "testing-stage"}
)

tracer.start()
  • If tracer.start() fails with an error related to tenacity v9.0.0, run the following command and then call tracer.start() again (skip this step if you see no error):

pip install tenacity==8.3.0
  • Once the tracer has started successfully, run your LangChain RAG code. Sample RAG code is available in this Colab project; running it requires your OpenAI API key.

  • Once traces have been logged using your code, stop the tracer and check its status:

tracer.stop()
tracer.get_upload_status()
  • If the upload succeeded, you can view the logged dataset by navigating to Project_Name -> Logs.

  • To run metrics on traced logs, Catalyst requires you to create a sub-dataset as follows:

from ragaai_catalyst import Dataset

dataset = Dataset(project_name="your_project_name")

# Sample filter to create a sub-dataset from the full list of traced logs; change values as needed
filter_list = [
    {
        "name": "llm_model",
        "values": ["gpt-4o-mini"],
    },
    {
        "name": "prompt_length",
        "lte": 10000,
        "gte": 0,
    },
]

dataset_name = "your_dataset_name"
dataset.create_dataset(dataset_name, filter_list)

# List the datasets in the project
dataset.list_datasets()
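
To make the filter format more concrete, here is a local sketch of one plausible reading of it: "values" as a membership test, "lte" and "gte" as an inclusive range, and all filters combined with AND. These semantics are an assumption inferred from the key names, not confirmed Catalyst behavior, and the matches function and sample records are hypothetical.

```python
def matches(record, filter_list):
    """Return True if `record` satisfies every filter (assumed AND semantics)."""
    for f in filter_list:
        value = record.get(f["name"])
        if value is None:
            return False  # assumption: a missing field fails the filter
        if "values" in f and value not in f["values"]:
            return False
        if "lte" in f and not value <= f["lte"]:
            return False
        if "gte" in f and not value >= f["gte"]:
            return False
    return True


filter_list = [
    {"name": "llm_model", "values": ["gpt-4o-mini"]},
    {"name": "prompt_length", "lte": 10000, "gte": 0},
]

# Hypothetical trace records, for illustration only
records = [
    {"llm_model": "gpt-4o-mini", "prompt_length": 512},
    {"llm_model": "gpt-4o-mini", "prompt_length": 20000},
    {"llm_model": "gpt-4", "prompt_length": 512},
]

selected = [r for r in records if matches(r, filter_list)]
print(len(selected))  # 1
```

Under this reading, only the first record would end up in the sub-dataset: the second exceeds the prompt-length bound and the third uses a different model.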
  • You can also create the sub-dataset in the Catalyst UI by navigating to Project_Name -> Logs and clicking the "Create Dataset" button.

  • In the dialog box that opens, you can create a sub-dataset from the traced logs by filtering on metadata, human feedback, prompt length, and time.

Your sub-dataset is now created and can be used for further experiments and analysis within RagaAI Catalyst.
