Technique 2: Hashing
Hashing is a one-way transformation, well suited to fields where you need to check for uniqueness or track activity without knowing the original value. A common use case is pseudonymizing IP addresses to support GDPR compliance.
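To make the idea concrete, here is a minimal Python sketch of keyed hashing outside any pipeline. The function name `pseudonymize_ip`, the example salt, and the documentation-range IP addresses are illustrative assumptions, not part of this tutorial's pipeline. HMAC-SHA256 is used here as one reasonable way to combine a secret salt with a value.

```python
import hashlib
import hmac


def pseudonymize_ip(ip_address: str, salt: str) -> str:
    """Return a hex-encoded HMAC-SHA256 digest of an IP address.

    Hypothetical helper for illustration; the salt acts as the secret key.
    """
    return hmac.new(
        salt.encode("utf-8"), ip_address.encode("utf-8"), hashlib.sha256
    ).hexdigest()


# Made-up salt and addresses (203.0.113.0/24 is a documentation range).
salt = "example-salt-not-for-production"
visits = ["203.0.113.7", "203.0.113.9", "203.0.113.7"]

hashes = [pseudonymize_ip(ip, salt) for ip in visits]

# Same input and same salt give the same digest, so repeat visitors
# can still be counted without keeping the address itself.
unique_visitors = len(set(hashes))
print(unique_visitors)   # 2
print(len(hashes[0]))    # 64 hex characters for a SHA-256 digest
```

Changing the salt changes every digest, which is why the salt has to stay stable (and secret) for as long as you want hashes to remain comparable.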
The Goal
You will use the .hash() function to replace the plaintext ip_address field with a salted ip_hash, and then delete the original.
Implementation
-
Start with the Previous Pipeline: Copy delete-payment.yaml from Step 1 to a new file named hash-ip.yaml.

    cp delete-payment.yaml hash-ip.yaml

Note: Remember to set the IP_SALT environment variable as described in the setup guide.
-
Add the Hashing Logic: Open hash-ip.yaml and add the hashing logic to the bottom of the existing mapping processor.

Add this to the 'mapping' processor in hash-ip.yaml:

    # --- Logic from Step 1 (Deletion) ---
    root.payment_method = this.payment_method.without("full_number", "expiry")
    # --- START: New additions for Hashing ---
    # Hash the IP address with a secret salt for security.
    # The key argument applies to the hmac_* algorithms; encode("hex") yields a 64-character string.
    root.ip_hash = this.ip_address.hash("hmac_sha256", env("IP_SALT")).encode("hex")
    # Delete the original IP address field.
    # Assigning deleted() keeps the fields set above; reassigning root would discard them.
    root.ip_address = deleted()
    # --- END: New additions ---
-
Deploy and Test:

    # Send the sample event data
    # ($HOME is used because the shell does not expand ~ after the @)
    curl -X POST http://localhost:8080/events/ingest \
      -H "Content-Type: application/json" \
      -d @"$HOME/expanso-remove-pii/sample-event.json"
-
Verify: Check your logs. The ip_address field will be gone, replaced by a 64-character ip_hash. Sending the same event again will produce the exact same hash as long as IP_SALT is unchanged, allowing you to count unique visitors or track user sessions without storing the raw IP address.
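The checks in the Verify step can be sketched as a small Python helper run against a record copied from your logs. The function name `check_hashed_event` and the sample records below are hypothetical; the record shape is assumed only from the field names used in this step.

```python
import re

# A SHA-256 digest rendered as hex is 64 characters long.
HEX_64 = re.compile(r"[0-9a-fA-F]{64}")


def check_hashed_event(event: dict) -> list:
    """Return a list of problems found in a processed event (empty if none)."""
    problems = []
    if "ip_address" in event:
        problems.append("ip_address is still present")
    ip_hash = event.get("ip_hash")
    if not isinstance(ip_hash, str) or not HEX_64.fullmatch(ip_hash):
        problems.append("ip_hash is missing or not 64 hex characters")
    return problems


# Hypothetical processed record; the hash is a placeholder, not real output.
processed = {"ip_hash": "ab" * 32}
print(check_hashed_event(processed))  # []

# An unprocessed record fails both checks.
unprocessed = {"ip_address": "203.0.113.7"}
print(len(check_hashed_event(unprocessed)))  # 2
```

Running a check like this on two copies of the same event, and comparing their ip_hash values, is a quick way to confirm that the hash is stable across requests.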
You have now applied the hashing pattern: the raw IP address no longer leaves the pipeline, which supports GDPR compliance, while the stable hash preserves the data's value for abuse detection and session tracking.