Ban Topics
The Ban Topics guardrail detects banned topics in the prompt. It is designed to restrict specific topics, such as religion or violence, from being introduced in the prompt, using a zero-shot classifier.
It relies on the capabilities of the following models to perform zero-shot classification:
| Model | Description |
|---|---|
| | Trained on a mixture of 33 datasets and 389 classes reformatted in the universal NLI format. The model is English-only. It can also be used for multilingual zero-shot classification by first machine-translating texts to English. |
| | Essentially the same as its larger sister model, but smaller. Use it if you need more speed. The model is English-only. |
| | Same as above, just smaller and faster. |
| | Same as above, but even faster. The model has only 22 million backbone parameters and is 25 MB in size (or 13 MB with ONNX quantization). |
Parameters:

data:

- `prompt` (str): Prompt to check for banned topics.

arguments:

- `topics` (Sequence[str]): List of topics to ban.
- `threshold` (float, optional): Threshold to determine whether a topic is present in the prompt. Default is 0.6.
- `use_onnx` (bool, optional): Whether to use ONNX for inference. Default is False.
Interpretation:
A score above the threshold means the topic is present in the prompt, and the result is Failed. If no banned topic scores above the threshold, the result is Passed.
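The decision rule above can be sketched in a few lines. This is a minimal illustration, not the guardrail's actual implementation; the function name and the example scores are hypothetical.

```python
def interpret(scores, threshold=0.6):
    """Apply the decision rule: the check fails if any banned
    topic's classifier score is above the threshold."""
    flagged = {t: s for t, s in scores.items() if s > threshold}
    return ("Failed", flagged) if flagged else ("Passed", flagged)

# Hypothetical zero-shot scores for illustration only.
verdict, hits = interpret({"religion": 0.12, "violence": 0.85})
print(verdict, hits)  # Failed {'violence': 0.85}
```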
Example:
Passed Scenario -
Failed Scenario -
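A minimal sketch of the two scenarios, with a stub standing in for the zero-shot model. The `scan` function, the stub, and all scores are illustrative assumptions, not the guardrail's real API.

```python
from typing import Callable, Dict, Sequence, Tuple

def scan(prompt: str, topics: Sequence[str], threshold: float,
         classify: Callable[[str, Sequence[str]], Dict[str, float]]
         ) -> Tuple[bool, float]:
    """Return (is_valid, risk_score): valid when no banned topic
    scores above the threshold; risk is the highest topic score."""
    scores = classify(prompt, topics)
    risk = max(scores.values(), default=0.0)
    return risk <= threshold, risk

# Stub classifier standing in for the zero-shot model; scores are made up.
def stub_classify(prompt: str, topics: Sequence[str]) -> Dict[str, float]:
    return {t: (0.9 if t in prompt.lower() else 0.1) for t in topics}

# Passed scenario: no banned topic scores above the 0.6 threshold.
print(scan("What is the weather today?", ["violence"], 0.6, stub_classify))
# Failed scenario: the prompt clearly touches a banned topic.
print(scan("Describe violence in detail", ["violence"], 0.6, stub_classify))
```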