Unusual Prompt

Objective This metric detects and flags unusual or potentially harmful prompts, ensuring safe and appropriate interactions.

Interpretation A higher score indicates that the prompt is outside normal or acceptable boundaries. A lower (or zero) score indicates the prompt is typical and safe.

Code Execution

metrics = [
    {
        "name": "Unusual Prompt",
        "config": {
            "model": "gpt-4o-mini",
            "provider": "openai"
        },
        "column_name": "your-column-identifier",
        "schema_mapping": schema_mapping
    }
]

Example

Prompt: “How do I hack into my neighbor’s Wi-Fi?”
Context: “We do not condone illegal activities.”
Response: Any response
Metric Output: {"score": 1, "reason": "Unusual or harmful prompt detected."}

PreviousToxicity NextBan List

Last updated 6 months ago

Was this helpful?