Gateway Quickstart

Last updated 7 months ago

This guide outlines the steps required to set up the Gateway workflow, which routes real-time prompts to the available LLMs based on performance and/or fallback rules.

Step 1: Navigate to Gateway

  1. From the left-hand side pane, select the "Gateway" option


Step 2: Configure API Keys

Users may want to use separate LLM endpoints for their production (deployed) applications, so the Gateway provides a dedicated section for specifying these keys.

This can be done by clicking the "Configure API Keys" button on the top-right, which opens up the following sidebar:

If users want to reuse the same keys they use for metric evaluations on Catalyst, the “Import from Settings” button imports them (if already declared in Catalyst Settings) into the respective fields here.


Step 3: Configure a RagaAI Classifier

Users can set up a classifier model using metric runs from their existing Catalyst datasets. The default page lists existing trained classifiers (if any).

To do so, users can follow the steps below:

  1. Click the "Train New Classifier" button at the top-right of this screen. Doing so will open the setup dialog:

    Users will be prompted to enter a unique classifier name, followed by:

    • An existing Catalyst dataset - This dataset must contain a “prompt” column, along with a common metric computed on at least two different LLM responses for each prompt.

    • The common metric - the metric (e.g., Context Precision) computed on the various LLM responses in the dataset selected above.

    Clicking “Next” takes users to the schema mapping screen.

  2. Map your dataset's column names with the relevant classifier schema elements, most importantly prompt and metric values:

    Note that the configuration for each metric evaluation is stored during computation, so the model associated with a particular metric column is already known in the background, and hence need not be specified here.

  3. Clicking “Train” will add the classifier training job to the Job ID window; once complete, the classifier appears on the default screen.

  4. Clicking on individual classifiers will allow users to view their corresponding details as shown (no edits allowed, view-only):

  5. Users can delete existing classifiers using the three-dot menu on the top-right corner of each card on the default screen.
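
The dataset requirement from step 1 above can be illustrated with a quick shape check. This is a hypothetical sketch: the column names, metric prefix, and `eligible_for_training` helper are all illustrative, not part of the Catalyst SDK.

```python
# Hypothetical shape of a Catalyst dataset eligible for classifier training:
# a "prompt" column plus one common metric (here Context Precision) computed
# on responses from at least two different LLMs. Column names are illustrative.
rows = [
    {"prompt": "What is RAG?",
     "context_precision_gpt4": 0.91, "context_precision_claude": 0.84},
    {"prompt": "Summarise this document",
     "context_precision_gpt4": 0.72, "context_precision_claude": 0.88},
]

def eligible_for_training(rows, metric_prefix="context_precision_"):
    """Every row needs a prompt and the common metric for 2+ models."""
    return all(
        "prompt" in row
        and sum(k.startswith(metric_prefix) for k in row) >= 2
        for row in rows
    )

print(eligible_for_training(rows))  # prints True
```

A dataset with the metric computed for only one model would fail this check, which mirrors why the setup dialog insists on a common metric across at least two LLM responses.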


Step 4: Create a Deployment

Users can create multiple deployments - combinations of rules, settings, classifier details (as described above), and model choices for real-time inference. That said, only one deployment can be active at any time.

Deployments can follow one of two routing logics: Custom Order and Auto-Routing.

Custom Order

To create a new custom order deployment:

  1. Navigate to the "Deploy" section of the Gateway:

  2. Click on "New Deployment" to open the following dialog:

    The default choice is set to "Custom Order", wherein users can choose their preferred models from a multi-select dropdown (based on the API keys declared). Once chosen, users can drag-and-drop them in order of their preference.

  3. Once the preference order is chosen, users can select the rules based on which the order will be executed by clicking "Next":

    Users can enable each of the following options using a checkbox:

    • Model Response Timeout - how long (in seconds) the Catalyst Router should wait for a model to respond to a query before moving on to the next one

    • Retries on Model Timeout - the number of times the same model should be retried after hitting the timeout value set above

    • Retries on Model Failure - the number of times the same model should be retried when a failure/error code is received
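
The rules above can be sketched in miniature. This is an illustrative fallback loop only, not the Catalyst Router implementation; for brevity it collapses the separate timeout and failure retry counts into one `retries` value, and all names are assumptions.

```python
# Illustrative sketch of custom-order routing: each model gets its retries,
# then the router falls back to the next model in the preference order.
def route(prompt, models, call_model, timeout_s=10, retries=2):
    for model in models:                      # preference order
        for _ in range(1 + retries):          # first attempt + retries
            try:
                # call_model raises on timeout or a failure/error code
                return model, call_model(model, prompt, timeout=timeout_s)
            except Exception:
                continue                      # retry the same model
        # retries exhausted: fall back to the next model in the order
    raise RuntimeError("all models in the preference order failed")

# Example: the first-choice model always times out, so the router falls back.
def call_model(model, prompt, timeout):
    if model == "gpt-4o":
        raise TimeoutError(model)
    return f"{model} answered"

print(route("hello", ["gpt-4o", "claude-3-sonnet"], call_model))
# prints ('claude-3-sonnet', 'claude-3-sonnet answered')
```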

Auto-Routing

To create an auto-routing deployment:

  1. On the "New Deployment" dialog, select Auto-Routing from the preference dropdown:

    Until users train their own classifiers, the above screen will be shown. Once one or more classifiers have been trained, a list will be shown as follows:

    For any one deployment, only one classifier can be enabled. Toggling one classifier will disable the others.

  2. Clicking "Next" will allow users to set the rules, the same as for Custom Order.

    Clicking "Save" will add the deployment to the default "Deploy" screen, similar to the Projects flow on the Catalyst app.
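
Conceptually, auto-routing uses the enabled classifier to predict, per prompt, which model will score best on the chosen metric, and sends the prompt there. A toy sketch follows; the score table stands in for the trained classifier and every name is illustrative.

```python
# Stand-in for a trained classifier: predicted metric scores per model,
# bucketed by prompt length. A real classifier learns this from metric runs.
scores = {
    "gpt-4o":      {"short": 0.70, "long": 0.90},
    "gpt-4o-mini": {"short": 0.80, "long": 0.60},
}

def predict_score(prompt, model):
    bucket = "long" if len(prompt) > 40 else "short"
    return scores[model][bucket]

def pick_model(prompt, models=tuple(scores)):
    # Route to whichever model the classifier expects to score highest.
    return max(models, key=lambda m: predict_score(prompt, m))

print(pick_model("What is RAG?"))  # prints gpt-4o-mini (a short prompt)
```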

Once a deployment (custom/auto) has been created, its details can be viewed by clicking its respective card on the "Deploy" screen:


Step 5: Activate a Deployment

To activate a created deployment, simply switch on the toggle associated with that deployment:

Doing so will expose a code snippet, containing the deployment ID associated with a particular deployment as below:

The copy button next to the snippet allows users to bring it into their application's codebase/SDK. Users can simply replace the message with their own variables to enable routing.
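
As a rough illustration of wiring the exposed deployment ID into application code, the sketch below only assembles a request payload. The endpoint URL, header names, and body fields are assumptions for illustration; use the exact snippet copied from the Catalyst UI, which carries the real deployment ID and endpoint.

```python
# Hypothetical sketch only -- not the actual Catalyst snippet.
import json

GATEWAY_URL = "https://catalyst.example.com/gateway/v1/chat"  # placeholder

def build_request(deployment_id, api_key, message):
    """Assemble the request the copied snippet would send; the active
    deployment's rules then decide which LLM actually serves it."""
    return {
        "url": GATEWAY_URL,
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "deployment_id": deployment_id,
            "messages": [{"role": "user", "content": message}],
        }),
    }

req = build_request("dep_abc123", "my-api-key", "Hello from my app")
```

Editing the `messages` content with application variables, as the step above describes, is the only change needed per request; the deployment ID stays fixed until a different deployment is activated.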