Sexualisation

Description: Tests whether the model provided by the user sexualises its responses when prompted to do so, i.e. produces responses containing casual or overt sexual objectification, reducing a person or group of people to sexual objects.

How does the Test work:

  1. Pre-defined internal prompts are used.

  2. Multiple iterations of model output are generated for these pre-defined prompts.

  3. raga-llm-eval’s OfcomOffensiveSexual detector scans the model responses for sexualisation.

  4. Based on the detector results, the test concludes whether the model sexualises (a conceptual sketch of this flow follows the list).
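
The sketch below illustrates this flow under stated assumptions: generate_response and detector_flags_sexualisation are hypothetical placeholders for the model call and the OfcomOffensiveSexual detector (not raga-llm-eval internals), and the final aggregation into a [0, 1] score is assumed to match the interpretation given further down.

def generate_response(model_name, prompt):
    # Hypothetical placeholder: the real test queries the model under evaluation here.
    return f"sample response from {model_name} to: {prompt}"

def detector_flags_sexualisation(response):
    # Hypothetical placeholder for raga-llm-eval's OfcomOffensiveSexual detector.
    return False

def run_sexualisation_probe(model_name, internal_prompts, n_iterations=3):
    # Steps 1-3: iterate over the pre-defined prompts, sample several outputs
    # per prompt, and scan each output with the detector.
    flagged, total = 0, 0
    for prompt in internal_prompts:
        for _ in range(n_iterations):
            if detector_flags_sexualisation(generate_response(model_name, prompt)):
                flagged += 1
            total += 1
    # Step 4: aggregate into a [0, 1] score; 1.0 means no response was flagged
    # (assumed aggregation, consistent with the score range described below).
    return 1.0 - flagged / total

print(run_sexualisation_probe("gpt-4", ["<pre-defined prompt 1>", "<pre-defined prompt 2>"]))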

Required arguments: model (e.g., gpt-4)

Interpretation of results:

  1. Score range (degree of sexualisation): [0, 1]. A score of 1 implies that the model does not sexualise; lower scores indicate increasing degrees of sexualisation (a small interpretation sketch follows this list).

  2. Only a sample internal prompt and model response are displayed for failed cases. Use evaluator.save_results('results.json') to save and review more detailed information on internal prompts, model responses, and scores.
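
As a small illustration of how to read the score, the helper below (not part of raga-llm-eval; the name and logic are hypothetical) maps a score to a pass/fail verdict against the threshold used in the coding example.

def interpret_sexualisation_score(score, threshold=1.0):
    # A score of 1.0 means no sexualising responses were detected; anything
    # below the chosen threshold is treated as a failed case.
    if score >= threshold:
        return "PASS: no sexualisation detected in the sampled responses"
    return f"FAIL: score {score:.2f} is below threshold {threshold:.2f}"

print(interpret_sexualisation_score(1.0))  # PASS
print(interpret_sexualisation_score(0.6))  # FAIL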

Coding Example:


# evaluator setup (typical example; exact constructor arguments may differ by raga-llm-eval version)
from raga_llm_eval import RagaLLMEval

evaluator = RagaLLMEval(api_keys={"OPENAI_API_KEY": "<your-openai-key>"})

# Run the LMRC vulnerability scanner restricted to the Sexualisation probe.
probe = "Sexualisation"
evaluator.add_test(
    test_names=["lmrc_vulnerability_scanner"],
    data={},
    arguments={"category": [probe], "model": "gpt-4", "threshold": 1.0},
).run()

# Print a summary; failed cases include a sample internal prompt and response.
evaluator.print_results()
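
To go beyond the printed summary, the results can be written to disk with evaluator.save_results (as noted under interpretation above) and inspected offline. The JSON layout is whatever raga-llm-eval emits, so the sketch below browses it generically rather than assuming specific keys.

import json

# Persist internal prompts, model responses, and scores for later review.
evaluator.save_results("results.json")

with open("results.json") as f:
    results = json.load(f)

# Preview the saved report without assuming its exact schema.
print(json.dumps(results, indent=2)[:2000])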
