Secrets

Safeguard against LLMs leaking secrets. Use this guardrail to prevent exposure of sensitive information.

The Secrets Guardrail detects whether any secrets are present in the output of the language model.

Types of secrets

  • API Tokens (e.g., AWS, Azure, GitHub, Slack)

  • Private Keys

  • High Entropy Strings (both Base64 and Hex) ... and many more
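
How these categories are detected is implementation-specific. Purely as an illustration (not the guardrail's actual logic; the patterns, entropy threshold, and helper names below are assumptions), known token formats can be matched with regular expressions, while opaque Base64/Hex material can be flagged with an entropy heuristic:

import math
import re

# Illustrative patterns for a few well-known token formats
# (assumptions, not the guardrail's real rule set).
TOKEN_PATTERNS = {
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character; high values suggest random secret material."""
    if not s:
        return 0.0
    freq = {c: s.count(c) / len(s) for c in set(s)}
    return -sum(p * math.log2(p) for p in freq.values())

def looks_like_secret(text: str, entropy_threshold: float = 4.0) -> bool:
    """True if text matches a known token format or contains a long, high-entropy word."""
    if any(p.search(text) for p in TOKEN_PATTERNS.values()):
        return True
    return any(
        len(word) >= 20 and shannon_entropy(word) > entropy_threshold
        for word in text.split()
    )

print(looks_like_secret("my secret is OPEN_API_KEY"))                             # False
print(looks_like_secret("my secret is ghp_IDzaaL42S7UbE9ayyhSmuX7qnZloVT0jkdzf"))  # True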

Parameters:

data:

  • prompt (str): The prompt to scan for secrets.

arguments:

  • redact_mode (str): Redaction mode applied to detected secrets. Defaults to all. Choices are partial, all, and hash (illustrated below).
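
The exact masking each mode applies is determined by the guardrail; the sketch below only illustrates what partial, all, and hash redaction of a detected token might produce (the masking rules and the redact helper are assumptions, not the library's API):

import hashlib

def redact(secret: str, redact_mode: str = "all") -> str:
    """Illustrative redaction of a detected secret (assumed behavior)."""
    if redact_mode == "all":
        # Replace the entire secret with a fixed mask.
        return "******"
    if redact_mode == "partial":
        # Keep a short prefix and suffix so the token type stays recognizable.
        return f"{secret[:4]}...{secret[-4:]}"
    if redact_mode == "hash":
        # Replace the secret with a truncated one-way hash of its value.
        return hashlib.sha256(secret.encode()).hexdigest()[:16]
    raise ValueError(f"unknown redact_mode: {redact_mode}")

token = "ghp_IDzaaL42S7UbE9ayyhSmuX7qnZloVT0jkdzf"
for mode in ("partial", "all", "hash"):
    print(mode, "->", redact(token, mode))

The fixed mask used here for all mirrors the ****** shown in the failed scenario below.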

Interpretation:

Values such as secret keys, API keys, and similar credentials are detected and redacted from the output if present.

The test is Passed if no secrets are detected, and Failed if secrets are detected and redacted.

Example:

Passed Scenario: no secrets detected.

evaluator.add_test(
    test_names=["secrets_guardrail"],
    data={"prompt": "my secret is OPEN_API_KEY"},
    arguments={"redact_mode": "all"},
).run()

Result = "my secret is OPEN_API_KEY"

Failed Scenario: secrets detected.

# passing a GitHub personal access token
evaluator.add_test(
    test_names=["secrets_guardrail"],
    data={"prompt": "my secret is ghp_IDzaaL42S7UbE9ayyhSmuX7qnZloVT0jkdzf"},
    arguments={"redact_mode": "all"},
).run()

Result = "my secret is ******"

Code Example:

evaluator.add_test(
    test_names=["secrets_guardrail"],
    data={"prompt": "my secret is ghp_IDzaaL42S7UbE9ayyhSmuX7qnZloVT0jkdzf"},
    arguments={"redact_mode": "all"},
).run()
evaluator.print_results()
