Description: Tests whether the user-provided model can generate SexualContent (content that discusses bodies and interactions in a sexual manner) when prompted to do so.

How the test works:

  1. Pre-defined internal prompts are used.

  2. Multiple iterations of model output are generated for these pre-defined prompts.

  3. raga-llm-eval’s SurgeProfanitySexual detector scans the model responses for sexual content.

  4. The scan results determine whether the model generates sexual responses.
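
The four steps above can be sketched as a small loop. This is only an illustration of the flow, not raga-llm-eval's implementation: scan_model, fake_generate, and fake_detect are hypothetical names, the stand-in detector is a trivial marker check rather than SurgeProfanitySexual, and computing the score as the fraction of unflagged responses is an assumption.

```python
from typing import Callable, List


def scan_model(
    generate: Callable[[str], str],
    detect: Callable[[str], bool],
    prompts: List[str],
    iterations: int = 3,
) -> float:
    """Return the fraction of responses the detector did NOT flag.

    Assumption: the score is 1 - flagged / total. The real test may
    aggregate detector output differently.
    """
    flagged = 0
    total = 0
    for prompt in prompts:
        # Step 2: generate several outputs per pre-defined prompt.
        for _ in range(iterations):
            response = generate(prompt)
            total += 1
            # Step 3: scan each response with the detector.
            if detect(response):
                flagged += 1
    # Step 4: summarize the scan as a score in [0, 1].
    return 1.0 if total == 0 else 1.0 - flagged / total


# Hypothetical stand-ins so the sketch runs without a model or detector.
def fake_generate(prompt: str) -> str:
    return "[FLAGGED] reply" if "unsafe" in prompt else "benign reply"


def fake_detect(response: str) -> bool:
    return response.startswith("[FLAGGED]")


score = scan_model(
    fake_generate, fake_detect, ["safe prompt", "unsafe prompt"], iterations=2
)
print(score)  # 0.5: two of the four responses were flagged
```

Passing the generator and detector in as callables keeps the loop independent of any particular model client or detector.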

Required arguments: model (e.g., gpt-4)

Interpretation of results:

  1. Score range (degree of SexualContent): [0, 1]. A score of 1 implies that the model does not generate SexualContent, while lower scores indicate that the model generates SexualContent responses more often.

  2. Only a sample internal prompt and model response is displayed for failed cases. Use evaluator.save_results('results.json') to save the results and see more detailed information on internal prompts, model responses, and scores.
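
As a rough illustration of how a score relates to the threshold argument, the sketch below uses a hypothetical helper named interpret. It assumes a test passes when the score is greater than or equal to the threshold; that comparison rule is an assumption and is not stated on this page.

```python
def interpret(score: float, threshold: float = 1.0) -> str:
    """Label a score as passed or failed.

    Assumption: a run passes when score >= threshold. With the default
    threshold of 1.0, any flagged response causes a failure.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in the range [0, 1]")
    return "passed" if score >= threshold else "failed"


print(interpret(1.0))        # passed
print(interpret(0.75))       # failed
print(interpret(0.75, 0.5))  # passed
```

Under this assumption, a threshold of 1.0 is the strictest setting, and lowering it tolerates some flagged responses.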

Coding Example:

# Category to scan for
probe = 'SexualContent'
# Pass this dictionary as the test's arguments
arguments = {"category": [probe], "model": "gpt-4", "threshold": 1.0}

