Quickstart

Begin your journey into AI testing with the RagaAI sandbox in a few easy steps.




1. Create a Project

  • Create a new project using the button. Each project represents a unique AI application or use case. Select a project to perform a deep dive.

  • Explore a Run: Every Run represents the execution of the test suite on a set of datasets and models. Select a Run to view all tests performed.


2. Explore Test Results

Access Test Results: On the summary page, click 'view iterations' and 'view results' to explore test results for a specific dataset and model.


3. Execute your own Test!

Step 1: Start the Test Run

  • Navigate to the RagaAI dashboard and open your project.

  • Click on the 'Initiate Run' button located towards the right side of the dashboard.

Step 2: Access Keys Popup

  • A popup will appear displaying the Access Key and Secret Key.

  • These keys are crucial as they link the Google Colab file to your RagaAI account and the specific project you are working on.

Step 3: Launch the Colab Notebook

  • Click on the 'Try It' button within the popup.

  • This action will open a new tab with a Google Colab notebook preloaded with the necessary code cells for executing the test.

Step 4: Enter Your Access Keys

  • In the Colab notebook, locate the cell titled 'Put your ACCESS KEY & SECRET KEY'.

  • Replace the placeholder text with your actual Access Key and Secret Key provided in the popup.
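In practice this cell is just variable assignments. A minimal sketch with placeholder values (the variable names ACCESS_KEY and SECRET_KEY match those used later when creating the TestSession):

```python
# Placeholder values — replace with the keys shown in the 'Initiate Run' popup.
ACCESS_KEY = "your-access-key"
SECRET_KEY = "your-secret-key"
```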

Step 5: Install the dependencies

  • The first cell typically contains the command to install the RagaAI testing platform library using pip. Run this cell to install the required library in your Colab environment.

  • After setting your Access Key and Secret Key, proceed to the cell labeled 'Import All raga lib from raga module' and run it to import the necessary modules.

Step 6: Upload a Dataset

Onboard your dataset by running the dataset cell. For a COCO-style object detection dataset:

test_ds = CocoDataset(
    test_session=test_session,
    name="test-data",
    type="object_detection",  # Type of task (e.g., object detection, classification)
    json_file_path="path/to/annotation_file.json",  # Path to the JSON annotation file
    image_folder_path="path/to/image_folder",  # Folder containing the images
    s3_bucket_name="product-raga",  # S3 bucket where the dataset is stored
    s3_key="coco-images",  # S3 key pointing to the dataset images
    aws_raga_access_key=AWS_ACCESS_KEY_ID,  # Your AWS access key
    aws_raga_secret_key=AWS_SECRET_ACCESS_KEY,  # Your AWS secret key
    bucket_region=bucket_region,  # Region of your S3 bucket
)
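The file passed as json_file_path is expected to follow the standard COCO annotation layout. As a general illustration (not RagaAI-specific), a minimal COCO file can be built like this:

```python
import json

# Minimal COCO-style annotation file: images, annotations, and categories.
coco = {
    "images": [
        {"id": 1, "file_name": "img_001.jpg", "width": 640, "height": 480}
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [100, 120, 50, 80],  # [x, y, width, height] in pixels
         "area": 4000, "iscrowd": 0}
    ],
    "categories": [
        {"id": 1, "name": "person"}
    ],
}

with open("annotation_file.json", "w") as f:
    json.dump(coco, f)
```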

Step 7: Generate Embeddings

In some cases, the tests you want to run may require the generation of embeddings for your dataset. Embeddings are numerical representations of your data that are crucial for certain types of AI models and evaluations:

dataset = Dataset(name=dataset_name, test_session=test_session, init=False)
dataset.generate_embeddings(model_name="dinov2-model",  # Model used for generating embeddings
                            model_version="v1",  # Version of the model
                            embed_col_name="dinov2-model:v1",  # Column name for storing embeddings
                            output_type="object_detection",  # Type of task
                            input_func=input_function,  # Custom function for input processing
                            output_func=output_function)  # Custom function for output processing

You can also add your existing embeddings by using:

test_ds.add_embeddings(userDataFrame=df, model="dna", col_name="embedding_column_name")
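The userDataFrame is expected to hold one row per data point, with your embedding vectors in the named column. A hypothetical sketch (all column names and values here are illustrative):

```python
import pandas as pd

# Hypothetical DataFrame: one precomputed embedding vector per image.
df = pd.DataFrame({
    "image_id": ["img_001.jpg", "img_002.jpg"],
    "embedding_column_name": [
        [0.12, 0.48, 0.33],  # embedding for img_001.jpg
        [0.91, 0.05, 0.27],  # embedding for img_002.jpg
    ],
})
```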

Step 8: Generate Inferences

If your dataset does not have model inferences, you can use RagaAI to generate them (custom models are also supported):

dataset = Dataset(name=dataset_name, test_session=test_session, init=False)
dataset.generate_inferences(model_name="rcnn-inference-model",
                            model_version="v1",
                            inference_col_name="rcnn-inference-model:v1",
                            output_type="object_detection",
                            input_func=input_function,
                            output_func=output_function)

You can also add your own inferences by using:

test_ds.add_inference(inferences_file_path="/content/updated_dataset_with_embeddings.csv",
                      format="YOLOv5",
                      model_name="anything",
                      customer_column_name="annotations")
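The inference file is a CSV in which customer_column_name names the column holding your model's predictions. A hypothetical sketch of writing such a file — the exact per-format schema of the annotations column (here YOLOv5-style strings: class, normalised x_center, y_center, width, height, confidence) is an assumption, so consult the format documentation:

```python
import csv

# Hypothetical CSV with one row per image; the 'annotations' column holds
# YOLOv5-style predictions: class x_center y_center width height confidence.
rows = [
    {"image_id": "img_001.jpg", "annotations": "0 0.51 0.47 0.20 0.35 0.92"},
    {"image_id": "img_002.jpg", "annotations": "1 0.30 0.60 0.15 0.25 0.81"},
]
with open("inferences.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["image_id", "annotations"])
    writer.writeheader()
    writer.writerows(rows)
```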

Step 9: Set up a run

  • Set up a Run: This cell contains the code to initialise a test session with parameters like project name, run name, and your keys. Running this cell will start the test run using the onboarded dataset.

run_name = "your_run_name"
test_session = TestSession(project_name=project_name, run_name=run_name, access_key=ACCESS_KEY, secret_key=SECRET_KEY, host=HOST)

Step 10: Execute tests

  • Run Tests: These cells contain the code to execute the different available tests. Alter parameters such as the threshold and metric, then execute the cell.

Visualising Results

Once the test is complete, you can easily visualise the results. Navigate back to the platform's user interface (UI). Inside the Runs tab, you'll find a new run containing the results of the tests you executed.


Note: An in-depth companion piece to every project and every test run is available in the Sandbox Guide. Explore the Test Inventory for the full list of supported tests.