Data Privacy & PII Leakage
Learn how LLMs memorize and leak PII, and how to protect sensitive data with scrubbing, self-hosting, and differential privacy
The Problem: Your LLM application processes customer data — names, emails, medical records, financial information. How do you prevent this data from leaking through the model, API logs, or prompt injection attacks?
The Solution: Build a Privacy Protection Pipeline
LLMs pose unique privacy risks: they can memorize training data (including personal information), leak user inputs through prompt injection attacks, and send sensitive data to third-party APIs with every call. Unlike traditional software where data flows are explicit, LLMs create implicit data flows that are hard to audit and control.
Think of it like a hotel safe — you trust the hotel (API provider) with your valuables, but a determined thief (attacker) might still find a way in. Self-hosting is like keeping valuables in your own home safe. Build the pipeline in four steps:
1. Classify your data sensitivity: map the PII your app handles: names, emails, SSNs, medical records, financial data. Each type carries different regulatory requirements (GDPR, HIPAA, CCPA)
2. Scrub PII before sending to the LLM: use tools like Microsoft Presidio or regex patterns to detect and mask PII in prompts. Replace "John Smith, john@email.com" with "[NAME_1], [EMAIL_1]" before the API call
3. Filter model outputs: scan LLM responses for PII before returning them to users. The model might hallucinate real phone numbers or reproduce memorized training data; post-processing catches these leaks
4. Choose API vs. self-hosting: for high-sensitivity data (healthcare, finance), self-host open-source models so data never leaves your servers. For low-sensitivity data, an API with PII scrubbing is acceptable; check the provider's data retention policy
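The scrubbing and output-filtering steps above can be sketched in pure Python with regex patterns. This is a minimal sketch: the patterns, placeholder format, and function names are illustrative assumptions, and production systems should prefer NER-based detection such as Microsoft Presidio alongside regexes.

```python
import re

# Illustrative regex patterns only; regexes miss names, addresses, and
# context-dependent PII that NER-based detectors (e.g. Presidio) catch.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def scrub(text):
    """Replace detected PII with numbered placeholders like [EMAIL_1]."""
    mapping = {}   # placeholder -> original value, so replies can be restored
    counters = {}
    for label, pattern in PII_PATTERNS.items():
        def repl(match, label=label):
            counters[label] = counters.get(label, 0) + 1
            placeholder = f"[{label}_{counters[label]}]"
            mapping[placeholder] = match.group(0)
            return placeholder
        text = pattern.sub(repl, text)
    return text, mapping

def filter_output(text):
    """Step 3: mask any PII the model emits before showing it to users."""
    masked, _ = scrub(text)
    return masked

safe_prompt, pii_map = scrub("Contact John at john@email.com, SSN 123-45-6789.")
print(safe_prompt)  # PII replaced with [EMAIL_1] and [SSN_1] placeholders
```

The `mapping` dict stays on your server, so placeholders in the model's reply can be swapped back to real values without the PII ever reaching the API.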
Privacy Risk Categories
- Training Data Memorization: LLMs can memorize and reproduce PII from training data — names, emails, phone numbers, addresses. GPT-2 was shown to output real phone numbers and email addresses when prompted correctly. Larger models memorize more
- Prompt Data Exfiltration: Indirect prompt injection can trick models into sending user data to attacker-controlled URLs. Example: hidden instructions in a webpage tell the model to encode user data into a URL and "summarize" it
- PII in API Logs: Every API call sends your data to the provider. Prompts containing customer data, medical records, or financial info are stored in logs. Check provider data retention policies — some keep logs for 30 days
- Regulatory Compliance: GDPR (EU): right to erasure — but you cannot delete data from a trained model. HIPAA (healthcare): PHI must stay on-premise. CCPA (California): users can opt out of data collection. Non-compliance fines reach 4% of annual revenue
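One common mitigation for the prompt-exfiltration vector above is to strip URLs from model output unless their host is on an allowlist, since injected instructions typically smuggle data out through attacker-controlled links (often a markdown image the client auto-fetches). A minimal sketch, where the allowlist domains and the simple URL regex are illustrative assumptions:

```python
import re
from urllib.parse import urlparse

# Hosts this application is allowed to link to -- an illustrative allowlist.
ALLOWED_DOMAINS = {"docs.example.com", "support.example.com"}

# Naive URL matcher; trailing punctuation handling is omitted in this sketch.
URL_RE = re.compile(r"https?://\S+")

def block_exfiltration(model_output):
    """Remove URLs whose host is not on the allowlist.

    If an injected instruction encoded user data into the query string of
    an attacker URL, this closes the channel before the client renders it.
    """
    def check(match):
        host = urlparse(match.group(0)).hostname or ""
        return match.group(0) if host in ALLOWED_DOMAINS else "[URL REMOVED]"
    return URL_RE.sub(check, model_output)

print(block_exfiltration(
    "See https://docs.example.com/faq and https://evil.test/?d=SECRET"
))
```

An allowlist is safer than a blocklist here: you cannot enumerate attacker domains in advance, but you can enumerate the handful of hosts your app legitimately links to.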
Fun Fact: Researchers extracted over 600 real memorized training examples from GPT-2 (1.5B params) using simple prompting techniques. With GPT-3 (175B), the extraction rate was even higher. The larger the model, the more it memorizes — and potentially leaks.
Try It Yourself!
Explore the interactive PII detection pipeline below to see how data protection works in practice.
LLMs memorize PII from training data — names, emails, phones, addresses. GPT-2 was shown to reproduce real contact details. Larger models memorize more.
Attack vectors:
- Indirect prompt injection: hidden instructions make the model leak data
- API logs: every call sends your data to the provider
- Context attacks: the model can include PII in a seemingly harmless response

Defenses:
- PII scrubbing: mask before sending (Presidio, regex)
- Self-hosting: data never leaves your servers (Llama, DeepSeek)
- Output filtering: scan responses for PII
- Differential privacy: mathematical noise added during training

Regulations:
- GDPR (EU): right to erasure, but a trained model cannot "forget" data
- HIPAA (healthcare): PHI must stay on-premise
- CCPA (California): right to opt out of data collection
- Fines: up to 4% of annual revenue for non-compliance
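The differential-privacy defense listed above can be illustrated with a toy DP-SGD-style step: clip each example's gradient contribution, then add noise before averaging, so no single record dominates the result. This scalar sketch is purely illustrative; real implementations operate on gradient tensors and track a formal (epsilon, delta) privacy budget, and the parameter values here are assumptions.

```python
import random

def dp_average(gradients, clip_norm=1.0, noise_std=0.5):
    """Differentially-private average of per-example gradients (toy sketch).

    Clipping bounds each example's influence; Gaussian noise on the sum
    masks whether any individual example was present -- the core idea
    behind DP-SGD training.
    """
    clipped = [max(-clip_norm, min(clip_norm, g)) for g in gradients]
    noisy_sum = sum(clipped) + random.gauss(0.0, noise_std * clip_norm)
    return noisy_sum / len(gradients)

random.seed(0)
# The outlier gradient 5.0 is clipped to 1.0, so one unusual record
# (e.g. a memorized rare name) cannot dominate the update.
print(dp_average([0.3, -2.0, 5.0, 0.1]))
```

The privacy/utility trade-off lives in `noise_std`: more noise means stronger privacy guarantees but noisier training.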
Example input:
"Dear support, my name is John Smith (john.smith@company.com, +1-555-0123). My SSN is 123-45-6789. I was diagnosed with diabetes and live at 123 Main St, NYC."

Detected PII:
- Name: John Smith
- Email: john.smith@company.com
- Phone: +1-555-0123
- SSN: 123-45-6789
- Health data: diagnosed with diabetes
- Address: 123 Main St, NYC