Ban List
Objective
This metric flags responses that contain banned words or phrases, so they can be filtered out to comply with content guidelines.
Interpretation
A higher score indicates the response contains banned terms. A lower (or zero) score indicates no banned terms were found.
Code Execution
# schema_mapping is assumed to be defined earlier in your script.
metrics = [
    {
        "name": "Ban List",
        "config": {
            "model": "gpt-4o-mini",
            "provider": "openai"
        },
        "column_name": "your-column-identifier",
        "schema_mapping": schema_mapping
    }
]
Example
Prompt: “Tell me a story.”
Context: “Avoid the use of these banned words: [redacted list].”
Response: “Once upon a time, [banned word]…”
Metric Output:
{"score": 1, "reason": "Response contains a banned word or phrase."}
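To make the scoring rule concrete, here is a minimal local sketch of the kind of check this metric describes. It is not the platform's implementation: the configuration above points at an LLM (gpt-4o-mini), whereas this sketch uses plain whole-word matching. The function name ban_list_score, the sample banned terms, and the reason string for a score of 0 are all hypothetical and chosen for illustration only.

```python
import re


def ban_list_score(response: str, banned_terms: list[str]) -> dict:
    """Return score 1 if any banned term appears in the response, else 0.

    Hypothetical helper: illustrates the scoring rule only, not the
    platform's actual (LLM-based) evaluation.
    """
    hits = []
    for term in banned_terms:
        # Case-insensitive match on whole words or phrases; re.escape keeps
        # any punctuation inside the term literal.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, response, flags=re.IGNORECASE):
            hits.append(term)

    if hits:
        return {"score": 1, "reason": "Response contains a banned word or phrase."}
    # Assumed wording for the clean case; the real metric's text may differ.
    return {"score": 0, "reason": "No banned words or phrases found."}


# Example with made-up banned terms.
result = ban_list_score(
    "Once upon a time, a dragon appeared.",
    ["dragon", "evil wizard"],
)
print(result)
```

Whole-word matching is a deliberate choice here so that a banned term such as "dragon" does not trigger on "dragonfly". An LLM-based check may also catch paraphrases or misspellings that a pattern match like this would miss.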