Some organizations process Personally Identifiable Information (PII) such as names, phone numbers, and email addresses in their Large Language Model (LLM) workflows. Storing this data in Weights & Biases (W&B) Weave poses compliance and security risks. Keeping it out of logged traces can help your agent stay compliant with regulations such as GDPR and HIPAA.
The Sensitive Data Protection feature lets you automatically redact PII from a trace before it is sent to Weave servers. The feature integrates Microsoft Presidio into the Weave Python SDK, so you control redaction settings at the SDK level.
The Sensitive Data Protection feature introduces the following functionality to the Python SDK:
- A redact_pii setting, which can be toggled on or off in the weave.init() call to enable PII redaction.
- Automatic redaction of common entities when redact_pii = True.
- Customizable redaction fields using the configurable redact_pii_fields setting.
- The ability to exclude specific entities from redaction using the redact_pii_exclude_fields setting.
Enable PII redaction
To get started with the Sensitive Data Protection feature in Weave, complete the following steps:
- Install the required dependencies.
- Modify your weave.init() call to enable redaction. When redact_pii=True, common entities are redacted by default.
- (Optional) Customize redaction fields using the redact_pii_fields parameter. For a full list of the entities that can be detected and redacted, see PII entities supported by Presidio.
- (Optional) Exclude specific entities from redaction using the redact_pii_exclude_fields parameter. This is useful when you want to keep the default redaction but preserve certain entity types. For example, excluding EMAIL_ADDRESS and PERSON redacts all default entities except those two.
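The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the definitive setup: it assumes the redaction options are passed to weave.init() through a settings dictionary, and that the required dependencies are Presidio's PyPI packages (presidio-analyzer and presidio-anonymizer). The helper names build_redaction_settings and init_with_redaction, and the project name used in the comments, are hypothetical and not part of Weave.

```python
# Assumed install step (package names are Presidio's PyPI packages;
# whether Weave requires exactly these is an assumption):
#   pip install weave presidio-analyzer presidio-anonymizer
from typing import Any, Dict, Iterable, Optional


def build_redaction_settings(
    fields: Optional[Iterable[str]] = None,
    exclude_fields: Optional[Iterable[str]] = None,
) -> Dict[str, Any]:
    """Build a settings dict that enables PII redaction (hypothetical helper)."""
    settings: Dict[str, Any] = {"redact_pii": True}
    if fields is not None:
        # Redact only these Presidio entity types.
        settings["redact_pii_fields"] = list(fields)
    if exclude_fields is not None:
        # Keep the defaults, but leave these entity types unredacted.
        settings["redact_pii_exclude_fields"] = list(exclude_fields)
    return settings


def init_with_redaction(project: str, **kwargs: Any):
    """Call weave.init() with redaction enabled (hypothetical helper).

    Assumes weave.init() accepts a `settings` dict. The import is lazy so
    the settings helper above can be used without Weave installed.
    """
    import weave

    return weave.init(project, settings=build_redaction_settings(**kwargs))


# Step 2: enable redaction with the default entities.
default_settings = build_redaction_settings()

# Step 3 (optional): redact only selected entities.
custom_settings = build_redaction_settings(fields=["CREDIT_CARD", "US_SSN"])

# Step 4 (optional): redact all defaults except EMAIL_ADDRESS and PERSON.
exclude_settings = build_redaction_settings(
    exclude_fields=["EMAIL_ADDRESS", "PERSON"]
)

# With Weave installed and configured, a call might look like:
#   init_with_redaction("my-project", exclude_fields=["EMAIL_ADDRESS", "PERSON"])
```

Building the dictionary separately from the weave.init() call keeps the redaction configuration easy to inspect and reuse across projects.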
Entities redacted by default
The following entities are automatically redacted when PII redaction is enabled:
- CREDIT_CARD
- CRYPTO
- EMAIL_ADDRESS
- ES_NIF
- FI_PERSONAL_IDENTITY_CODE
- IBAN_CODE
- IN_AADHAAR
- IN_PAN
- IP_ADDRESS
- LOCATION
- PERSON
- PHONE_NUMBER
- UK_NHS
- UK_NINO
- US_BANK_NUMBER
- US_DRIVER_LICENSE
- US_PASSPORT
- US_SSN
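To reason about which entities end up redacted for a given configuration, the default list can be modeled as plain data. The sketch below is illustrative only and does not call Weave or Presidio. The function effective_entities is hypothetical, and it assumes that redact_pii_fields replaces the default list while redact_pii_exclude_fields removes entries from it; the first of those assumptions is not confirmed by this page.

```python
from typing import Iterable, List, Optional

# Entities redacted by default, copied from the list above.
DEFAULT_REDACTED_ENTITIES: List[str] = [
    "CREDIT_CARD", "CRYPTO", "EMAIL_ADDRESS", "ES_NIF",
    "FI_PERSONAL_IDENTITY_CODE", "IBAN_CODE", "IN_AADHAAR", "IN_PAN",
    "IP_ADDRESS", "LOCATION", "PERSON", "PHONE_NUMBER", "UK_NHS",
    "UK_NINO", "US_BANK_NUMBER", "US_DRIVER_LICENSE", "US_PASSPORT",
    "US_SSN",
]


def effective_entities(
    fields: Optional[Iterable[str]] = None,
    exclude_fields: Optional[Iterable[str]] = None,
) -> List[str]:
    """Return the entity types that would be redacted (hypothetical model).

    Assumption: `fields` replaces the defaults; `exclude_fields` removes
    entries from whichever list is in effect.
    """
    base = list(fields) if fields is not None else list(DEFAULT_REDACTED_ENTITIES)
    excluded = set(exclude_fields or [])
    return [entity for entity in base if entity not in excluded]


# Keep the defaults but preserve email addresses and person names.
kept_defaults = effective_entities(exclude_fields=["EMAIL_ADDRESS", "PERSON"])

# Redact only two entity types.
only_two = effective_entities(fields=["CREDIT_CARD", "US_SSN"])
```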
Redacting sensitive keys with REDACT_KEYS
In addition to PII redaction, the Weave SDK also supports redaction of custom keys using REDACT_KEYS. This is useful when you want to protect additional sensitive data that might not fall under the PII category but needs to be kept private. Examples include:
- API keys
- Authentication headers
- Tokens
- Internal IDs
- Config values
Pre-defined REDACT_KEYS
Weave automatically redacts the following sensitive keys by default:
Adding your own keys
You can extend this list with your own custom keys that you want to redact from traces. For example, after you add client_id and token to the list, their values appear as "REDACTED" in the trace.
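The effect of key-based redaction can be illustrated with a small, self-contained sketch. It is not Weave's implementation: the REDACT_KEYS set below is a local stand-in (this page does not show where the SDK exposes its own set), and redact_by_key is a hypothetical function. It shows the behavior described above, where any value stored under a listed key is replaced with "REDACTED".

```python
from typing import Any, Set

REDACTED_VALUE = "REDACTED"

# Local stand-in for the SDK's key set; starts empty for this illustration.
REDACT_KEYS: Set[str] = set()

# Add custom keys, mirroring the example keys named above.
REDACT_KEYS.add("client_id")
REDACT_KEYS.add("token")


def redact_by_key(value: Any, keys: Set[str]) -> Any:
    """Recursively replace values stored under any key in `keys`."""
    if isinstance(value, dict):
        return {
            k: REDACTED_VALUE if k in keys else redact_by_key(v, keys)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact_by_key(item, keys) for item in value]
    # Scalars and other types are returned unchanged.
    return value


# Hypothetical trace inputs containing sensitive values.
trace_inputs = {
    "client_id": "123",
    "request": {"token": "abc", "prompt": "Hello"},
    "history": [{"token": "def"}, {"note": "ok"}],
}

redacted = redact_by_key(trace_inputs, REDACT_KEYS)
```

Matching here is by exact key name; whether Weave matches keys case-insensitively or by substring is not stated on this page.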
Usage information
- This feature is only available in the Python SDK.
- Enabling redaction increases processing time due to the Presidio dependency.