# Anonymize

Anonymizes Personally Identifiable Information (PII) in text using NLP (English only) and predefined regex patterns. Detected entities are replaced with placeholders such as \[REDACTED\_PERSON\_1], and the original values are stored in a Vault.
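The placeholder-and-vault idea can be illustrated with a minimal sketch. This is not the guardrail's implementation: the `redact_emails` helper, the single email regex, and the vault shape (a list of placeholder/original pairs) are all assumptions chosen for illustration.

```python
import re
from typing import Dict, List, Tuple

# Illustrative pattern only; the real guardrail combines NLP with many regexes.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Replace each distinct email with a numbered placeholder.

    Returns the sanitized text and a vault of (placeholder, original) pairs.
    """
    vault: List[Tuple[str, str]] = []
    seen: Dict[str, str] = {}  # original value -> placeholder already assigned

    def _replace(match: re.Match) -> str:
        value = match.group(0)
        if value not in seen:
            placeholder = f"[REDACTED_EMAIL_ADDRESS_{len(seen) + 1}]"
            seen[value] = placeholder
            vault.append((placeholder, value))
        return seen[value]

    return EMAIL_RE.sub(_replace, text), vault


sanitized, vault = redact_emails(
    "Contact john.doe@example.com or jane@example.org; john.doe@example.com is preferred."
)
print(sanitized)
print(vault)
```

In this sketch a repeated value reuses its placeholder, so the vault holds one entry per distinct value and the sanitized text could later be mapped back to the originals.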

**PII entities**

* **Credit Cards**: Card number formats described in [Wikipedia](https://en.wikipedia.org/wiki/Payment_card_number).
  * `4111111111111111`
  * `378282246310005` (American Express)
  * `30569309025904` (Diners Club)
* **Person**: A full person name, which can include first names, middle names or initials, and last names.
  * `John Doe`
* **PHONE\_NUMBER**:
  * `5555551234`
* **URL**: A Uniform Resource Locator (URL), the address used to locate a resource on the Internet.
  * `https://example.com/`
* **E-mail Addresses**: Standard email formats, plus obfuscated variants that use `[AT]` and `[DOT]`.
  * `john.doe@example.com`
  * `john.doe[AT]example[DOT]com`
  * `john.doe[AT]example.com`
  * `john.doe@example[DOT]com`
* **IPs**: An Internet Protocol (IP) address (either IPv4 or IPv6).
  * `192.168.1.1` (IPv4)
  * `2001:db8:3333:4444:5555:6666:7777:8888` (IPv6)
* **UUID**:
  * `550e8400-e29b-41d4-a716-446655440000`
* **US Social Security Number (SSN)**:
  * `111-22-3333`
* **Crypto wallet number**: Currently, only Bitcoin addresses are supported.
  * `1Lbcfr7sAHTD9CgdQo3HTMTkV8LK4ZnX71`
* **IBAN Code**: The International Bank Account Number (IBAN) is an internationally agreed system for identifying bank accounts across national borders. It facilitates cross-border transactions and reduces the risk of transcription errors.
  * `DE89370400440532013000`
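To give a feel for how regex-based detection of some of these entities might work, here is a simplified sketch. The patterns, the `PATTERNS` names, and the `detect` and `luhn_valid` helpers are assumptions for illustration and are not the patterns the guardrail ships with. The Luhn checksum is the standard check digit scheme for payment card numbers.

```python
import re
from typing import Dict, List

# Simplified, illustrative patterns; NOT the guardrail's actual patterns.
PATTERNS: Dict[str, "re.Pattern[str]"] = {
    # Accepts "@" or "[AT]" and "." or "[DOT]" as separators.
    "EMAIL_ADDRESS": re.compile(
        r"[A-Za-z0-9._%+-]+(?:@|\[AT\])[A-Za-z0-9-]+(?:(?:\.|\[DOT\])[A-Za-z0-9-]+)+"
    ),
    "UUID": re.compile(
        r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
    ),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "IPV4": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
}


def luhn_valid(number: str) -> bool:
    """Return True if the digits pass the Luhn checksum used by payment cards."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 12:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def detect(text: str) -> Dict[str, List[str]]:
    """Return every match per entity type found in the text."""
    return {name: pattern.findall(text) for name, pattern in PATTERNS.items()}


sample = (
    "Mail john.doe[AT]example[DOT]com from 192.168.1.1, "
    "SSN 111-22-3333, id 550e8400-e29b-41d4-a716-446655440000"
)
found = detect(sample)
print(found)
print(luhn_valid("4111111111111111"), luhn_valid("4111111111111112"))
```

Regexes alone cannot find free-form entities such as person names, which is why the guardrail also relies on NLP for those.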

**Parameters**:

`data:`

* `prompt` (str): The text to be anonymized.

`arguments:`

* `hidden_names` (Optional\[Sequence\[str]]): List of names that are always anonymized, replaced with placeholders such as \[REDACTED\_CUSTOM\_1].
* `allowed_names` (Optional\[Sequence\[str]]): List of names allowed in the text without anonymizing.
* `entity_types` (Optional\[Sequence\[str]]): List of entity types to be detected. If not provided, defaults to all.
* `preamble` (str): Text to prepend to sanitized prompt. If not provided, defaults to an empty string.
* `regex_patterns` (Optional\[List\[Dict]]): List of regex patterns for additional custom anonymization.
* `use_faker` (bool): Whether to replace entities with faker-generated values instead of placeholders, where applicable. Defaults to False, in which case placeholders such as \[REDACTED\_PERSON\_1] are used.
* `threshold` (float): Acceptance threshold. Default is 0.
* `use_onnx` (bool): Whether to use ONNX runtime for inference. Default is False.
* `language` (str): Language used for entity detection. Default is "en".
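As a rough illustration of the defaults listed above, the sketch below assembles an `arguments` dictionary. The `build_arguments` helper and `DEFAULT_ARGUMENTS` constant are hypothetical and not part of the library. Using `None` to mean "not provided" for the optional lists is also an assumption, as are the sample names.

```python
from typing import Any, Dict

# Defaults as documented in the parameter list; None stands for "not provided".
DEFAULT_ARGUMENTS: Dict[str, Any] = {
    "hidden_names": None,
    "allowed_names": None,
    "entity_types": None,   # not provided means all entity types
    "preamble": "",
    "regex_patterns": None,
    "use_faker": False,
    "threshold": 0,
    "use_onnx": False,
    "language": "en",
}


def build_arguments(**overrides: Any) -> Dict[str, Any]:
    """Merge overrides into the documented defaults, rejecting unknown keys."""
    unknown = set(overrides) - set(DEFAULT_ARGUMENTS)
    if unknown:
        raise KeyError(f"Unknown anonymize argument(s): {sorted(unknown)}")
    return {**DEFAULT_ARGUMENTS, **overrides}


arguments = build_arguments(
    hidden_names=["Project Falcon"],   # hypothetical name to always redact
    allowed_names=["Google LLC"],      # hypothetical name to leave untouched
    threshold=0.5,
)
print(arguments)
```

Rejecting unknown keys up front catches typos such as `hiden_names` before a test run silently ignores them.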

**Interpretation**:

Personally Identifiable Information (PII) such as names, email addresses, and phone numbers present in the prompt should be removed from the output.

**Example:**

```
prompt = """Your name is Walter White, You have received your inquiry regarding an account with Google LLC from John Doe, where you work.
Your registered email address is walter.white@google.com and the phone number is +91-9967854332 associated with the account. Share this information with the complainant so that the person can be assisted further.
Write an email to help the person ahead
"""

sanitized_prompt = "Your name is [REDACTED_PERSON_1], You have received your inquiry regarding an account with Google LLC from [REDACTED_PERSON_2], where you work.\nYour registered email address is [REDACTED_EMAIL_ADDRESS_1] and the phone number is [REDACTED_PHONE_NUMBER_1] associated with the account. Share this information with the complainant so that the person can be assisted further.\nWrite an email to help the person ahead\n"
```
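One way to check the interpretation rule against an example like this is to confirm that none of the known sensitive values survive in the sanitized text. The `find_leaks` helper below is a hypothetical sketch, not part of the library. It assumes you already know which values should have been removed, and it uses a shortened version of the sanitized text.

```python
from typing import Iterable, List


def find_leaks(sanitized: str, sensitive_values: Iterable[str]) -> List[str]:
    """Return the sensitive values that still appear in the sanitized text."""
    lowered = sanitized.lower()
    return [value for value in sensitive_values if value.lower() in lowered]


sanitized_prompt = (
    "Your name is [REDACTED_PERSON_1], You have received your inquiry regarding "
    "an account with Google LLC from [REDACTED_PERSON_2], where you work.\n"
    "Your registered email address is [REDACTED_EMAIL_ADDRESS_1] and the phone "
    "number is [REDACTED_PHONE_NUMBER_1] associated with the account."
)
known_pii = ["Walter White", "John Doe", "walter.white@google.com", "+91-9967854332"]

leaks = find_leaks(sanitized_prompt, known_pii)
print("Leaked values:", leaks)
```

The comparison is case-insensitive, so a value that survives with different capitalization is still reported.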

#### Code Example:

```python
# Assumes `evaluator` has already been initialized earlier in your script.
evaluator.add_test(
    test_names=["anonymize_guardrail"],
    data={
        "prompt": """Your name is Walter White, You have received your inquiry regarding an account with Google LLC from John Doe, where you work.
Your registered email address is walter.white@google.com and the phone number is +91-9967854332 associated with the account. Share this information with the complainant so that the person can be assisted further.
Write an email to help the person ahead
""",
    },
).run()

evaluator.print_results()
```
