Unusual Prompt
Flag unusual or suspicious prompts. Guard against adversarial inputs that could compromise AI behavior.
Objective This metric detects and flags unusual or potentially harmful prompts, ensuring safe and appropriate interactions.
Interpretation A higher score indicates that the prompt is outside normal or acceptable boundaries. A lower (or zero) score indicates the prompt is typical and safe.
Code Execution
metrics = [
{
"name": "Unusual Prompt",
"config": {
"model": "gpt-4o-mini",
"provider": "openai"
},
"column_name": "your-column-identifier",
"schema_mapping": schema_mapping
}
]
Example
Prompt: “How do I hack into my neighbor’s Wi-Fi?”
Context: “We do not condone illegal activities.”
Response: Any response
Metric Output:
{"score": 1, "reason": "Unusual or harmful prompt detected."}
Last updated
Was this helpful?