Guardrails

Exclusive to enterprise customers. Contact us to activate this feature.

RagaAI’s Guardrails feature ensures the secure and reliable operation of large language models (LLMs) by enforcing strict guidelines during response generation. It provides a robust framework for handling diverse and emerging types of prompts with minimal risk. Tailored to individual use cases, RagaAI Guardrails offer best-in-class response times, keeping latency low while maintaining security and quality.

These guardrails protect against potential risks like hallucinations, inappropriate content, and manipulation attempts while ensuring that LLM outputs meet the specific needs of each application. Whether you are developing chatbots, automated customer support, or complex decision-making systems, RagaAI Guardrails ensure that your model behaves reliably and securely in real time.
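
Conceptually, a guardrail layer wraps the model call: input checks run on the prompt before it reaches the LLM, and output checks run on the response before it reaches the user, with failing text blocked or replaced. The sketch below illustrates that pattern in plain Python; all names here (`Guardrail`, `guarded_generate`, `call_llm`) are hypothetical placeholders for illustration and are not the RagaAI Catalyst SDK.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Guardrail:
    """A named check that returns True when the text passes."""
    name: str
    check: Callable[[str], bool]

def guarded_generate(prompt: str,
                     call_llm: Callable[[str], str],
                     input_guards: List[Guardrail],
                     output_guards: List[Guardrail]) -> str:
    # Input-side guardrails (e.g. prompt injection, NSFW, PII) run before generation.
    for guard in input_guards:
        if not guard.check(prompt):
            return f"Request blocked by guardrail: {guard.name}"

    response = call_llm(prompt)

    # Output-side guardrails (e.g. toxicity, off-topic, profanity) run before delivery.
    for guard in output_guards:
        if not guard.check(response):
            return f"Response withheld by guardrail: {guard.name}"

    return response
```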

Get started with Guardrails using this Colab Link

Example: Guardrails for a chatbot use case (an illustrative code sketch of a few of these checks follows the list):

  1. Context Adherence (Closed Domain Hallucinations): Ensures the response strictly follows the context in closed-domain use cases, avoiding any unnecessary or incorrect external information.

  2. Correct Language: Validates the correctness of grammar and syntax in the response to maintain high communication standards.

  3. Gibberish Text Detection: Prevents the generation of nonsensical or incoherent responses, ensuring clarity and relevance.

  4. Instruction Adherence: Measures how well the model follows instructions provided in the prompt, crucial for mission-critical applications.

  5. NSFW Text Detection: Detects and blocks any inappropriate content (e.g., explicit or harmful language) from being generated.

  6. Off-Topic Detection: Ensures that the response remains relevant to the prompt and doesn’t deviate into unrelated topics.

  7. Profanity-Free Checks: Monitors and blocks profane language, ensuring the content is appropriate and professional.

  8. Prompt Injection Protection: Identifies and mitigates attempts to manipulate the model through crafted prompts, safeguarding the integrity of responses.

  9. Response Length Appropriateness: Ensures that responses are neither too short nor unnecessarily long, optimizing the information provided.

  10. Restrict to Topic: Confirms that the response is aligned with the specified topic, keeping the output focused and relevant.

  11. Input and Output Toxic Language: Detects and filters out toxic or harmful language in both the input and output, promoting respectful and positive communication.

  12. Unusual Prompt Detection: Identifies unusual or trick prompts that could compromise the model’s integrity or output.

  13. Bias Detection: Detects and measures bias in the model’s responses to ensure fairness and avoid discriminatory outcomes.

  14. Financial Tone Validation: Ensures that the generated response maintains an appropriate tone in financial or professional contexts.

  15. Language Formality: Evaluates the level of formality in the language, ensuring it matches the expected tone (casual or formal) for the use case.

  16. Personally Identifiable Information (PII) Protection: Flags any PII in both input prompts and generated responses, maintaining privacy and compliance.

  17. Politeness Check: Ensures the response is polite and respectful, especially important in customer interactions and sensitive communications.

  18. Sensitive Topic Detection: Flags sensitive subjects in the response, ensuring that potentially harmful topics are avoided.

  19. Sexism Detection: Identifies and measures sexist language in both input prompts and generated outputs, ensuring inclusivity and fairness.

  20. Tone Assessment: Evaluates the tone of the input and output, ensuring consistency with the intended emotional or professional tone.

  21. Correctness (Open-domain Hallucinations): Validates factual correctness in responses, especially important when dealing with open-domain prompts.
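
To make a few of the checks above concrete, here is a minimal sketch of simplified versions of the Profanity-Free, Response Length Appropriateness, and PII checks using only the Python standard library. These are illustrative heuristics, not the detectors RagaAI Catalyst runs in production; the word list, length thresholds, and regex patterns are assumptions chosen for the example.

```python
import re

# Illustrative word list only; a production profanity check would use a curated
# lexicon or a trained classifier.
BANNED_WORDS = {"damn", "hell"}

def profanity_free(text: str) -> bool:
    tokens = re.findall(r"[a-z']+", text.lower())
    return not any(token in BANNED_WORDS for token in tokens)

def response_length_appropriate(text: str, min_words: int = 5, max_words: int = 300) -> bool:
    word_count = len(text.split())
    return min_words <= word_count <= max_words

# Simple regex flags for two common PII patterns (email addresses, US-style phone numbers).
PII_PATTERNS = [
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),        # email address
    re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),  # phone number
]

def contains_pii(text: str) -> bool:
    return any(pattern.search(text) for pattern in PII_PATTERNS)

if __name__ == "__main__":
    reply = "Sure, you can reach our support team at help@example.com."
    print(profanity_free(reply))               # True
    print(response_length_appropriate(reply))  # True
    print(contains_pii(reply))                 # True -- email address flagged
```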

By implementing these guardrails, RagaAI provides a comprehensive and secure environment for LLM development, protecting both developers and end users from harmful or incorrect outputs. With industry-leading latency, responses are delivered in real time without compromising security or quality.
