🎯 Designing Systems under GDPR Constraints
1️⃣ Core Framework
When discussing GDPR-compliant System Design, I frame it as:
- What GDPR is
- Why GDPR matters
- Personal data identification
- Data minimization
- Consent management
- Right to access
- Right to deletion
- Trade-offs: compliance vs scalability vs business needs
2️⃣ What Is GDPR?
GDPR stands for:
General Data Protection Regulation
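It is the EU's primary privacy regulation. It covers personal data of individuals in the EU, including when the organization processing it is based elsewhere but offers them goods or services or monitors their behavior.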
Key Principle
GDPR is not merely a legal document.
It fundamentally changes how systems must be designed.
Applies To
- Websites
- SaaS platforms
- Mobile apps
- AI systems
- E-commerce
- Internal systems
- Data platforms
Examples of Personal Data
- Name
- Phone number
- IP address
- Cookie identifiers
- Device identifiers
- Location data
- User profiles
👉 Interview Memorization
GDPR is a privacy regulation that governs how organizations collect, store, process, transfer, and delete personal data belonging to individuals in the EU.
Modern systems must often be designed around GDPR requirements from the beginning rather than retrofitted later.
3️⃣ Why GDPR Matters in System Design
Traditional Design Goal
Scale
Availability
Performance
GDPR Adds
Privacy
Compliance
User Rights
New Design Constraint
A technically optimal architecture may become legally unacceptable.
Example
Store everything forever
Technically easy.
GDPR:
Not allowed without a defined purpose and retention period (storage limitation).
👉 Interview Memorization
GDPR introduces privacy and compliance requirements that influence data models, storage strategies, retention policies, and operational workflows.
4️⃣ GDPR Core Principles
Key Principles
Lawfulness, Fairness, and Transparency
Must have a legal basis and be open about processing
Purpose Limitation
Use data only for the intended purpose
Data Minimization
Collect only necessary data
Accuracy
Keep data correct
Storage Limitation
Don't keep data forever
Integrity and Confidentiality
Protect data
Accountability
Be able to demonstrate compliance
👉 Interview Memorization
GDPR is built around principles such as lawfulness, purpose limitation, data minimization, storage limitation, and confidentiality.
These principles directly influence system architecture decisions.
5️⃣ Personal Data Identification
First Question
What data is regulated?
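Common Examples
Email
Personal data (PII)
Phone Number
Personal data (PII)
User ID
Personal data when it can be linked to an individual
IP Address
Generally treated as personal data, since it is an online identifier.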
Data Inventory
Organizations must know:
What data exists
Where it exists
Who can access it
👉 Interview Memorization
GDPR compliance starts with identifying personal data and maintaining an inventory of where that data is stored and processed.
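The three inventory questions above (what data exists, where, and who can access it) can be sketched as a small in-memory catalog. This is only an illustration: `DataLocation`, `DataInventory`, and the system and team names are hypothetical, and a real inventory would normally live in a data catalog tool rather than in application code.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DataLocation:
    system: str               # where the data lives, e.g. "users_db"
    field_name: str           # what the data is, e.g. "email"
    category: str             # classification, e.g. "contact"
    owners: tuple[str, ...]   # who can access it (teams or roles)


class DataInventory:
    """Hypothetical in-memory inventory of personal data locations."""

    def __init__(self) -> None:
        self._locations: list[DataLocation] = []

    def register(self, location: DataLocation) -> None:
        self._locations.append(location)

    def systems_holding(self, category: str) -> set[str]:
        # "Where does it exist?"
        return {loc.system for loc in self._locations if loc.category == category}

    def who_can_access(self, system: str) -> set[str]:
        # "Who can access it?"
        return {
            owner
            for loc in self._locations
            if loc.system == system
            for owner in loc.owners
        }


inventory = DataInventory()
inventory.register(DataLocation("users_db", "email", "contact", ("account-team",)))
inventory.register(DataLocation("search_index", "email", "contact", ("search-team",)))
inventory.register(DataLocation("access_logs", "ip_address", "online_identifier", ("sre",)))
```

Asking `inventory.systems_holding("contact")` then answers which systems an access or deletion request for contact data has to reach.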
6️⃣ Data Minimization
Bad Design
Collect everything
GDPR Design
Collect only what is necessary
Example: Newsletter Signup
Bad (all required at signup):
Birth Date
Gender
Location
Phone
Address
Good:
Email Only
Benefits
- Lower compliance risk
- Smaller attack surface
- Lower storage costs
👉 Interview Memorization
Data minimization means collecting only the information required for a specific business purpose.
Excessive data collection increases compliance and security risks.
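One way to enforce minimization at the API boundary is an allow-list of fields per declared purpose. The sketch below assumes a hypothetical `ALLOWED_FIELDS` mapping and `minimize` helper; the purposes and field names are illustrative only.

```python
# Hypothetical mapping: each declared purpose lists the only fields it may collect.
ALLOWED_FIELDS = {
    "newsletter_signup": {"email"},
    "order_checkout": {"email", "name", "address"},
}


def minimize(purpose: str, payload: dict) -> dict:
    """Drop every field that the declared purpose does not need."""
    allowed = ALLOWED_FIELDS.get(purpose)
    if allowed is None:
        # Refuse to collect data for a purpose nobody has declared.
        raise ValueError(f"no declared purpose: {purpose!r}")
    return {key: value for key, value in payload.items() if key in allowed}


signup = minimize(
    "newsletter_signup",
    {"email": "a@example.com", "birth_date": "1990-01-01", "phone": "555-0100"},
)
```

Here `signup` keeps only the email, so the extra fields never reach storage.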
7️⃣ Consent Management
Requirement
Consent is one of several lawful bases for processing. When a system relies on it, users must consent before data collection.
Example
Accept Cookies
System Requirements
Store:
Who
When
What consent
Version
Example Schema
UserID
ConsentType
Timestamp
PolicyVersion
Challenge
Consent may later be withdrawn, so the system must record the withdrawal and stop the processing that depended on it.
👉 Interview Memorization
Consent management systems must track what users agreed to, when they agreed, and which policy version was accepted.
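The example schema (UserID, ConsentType, Timestamp, PolicyVersion) can be sketched as an append-only ledger, where a withdrawal is just another event. The `granted` flag and the class names `ConsentEvent` and `ConsentLedger` are assumptions added for illustration; they are not part of the schema above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class ConsentEvent:
    user_id: str
    consent_type: str
    granted: bool            # assumption: False records a withdrawal
    policy_version: str
    timestamp: datetime


class ConsentLedger:
    """Hypothetical append-only ledger; events are never edited or removed."""

    def __init__(self) -> None:
        self._events: List[ConsentEvent] = []

    def record(self, user_id: str, consent_type: str, granted: bool,
               policy_version: str, now: Optional[datetime] = None) -> ConsentEvent:
        event = ConsentEvent(user_id, consent_type, granted, policy_version,
                             now or datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def has_consent(self, user_id: str, consent_type: str) -> bool:
        # The most recent event for this user and consent type wins.
        for event in reversed(self._events):
            if event.user_id == user_id and event.consent_type == consent_type:
                return event.granted
        return False  # no record means no consent

    def history(self, user_id: str) -> List[ConsentEvent]:
        return [e for e in self._events if e.user_id == user_id]


ledger = ConsentLedger()
ledger.record("u1", "marketing_cookies", True, "policy-v3")
ledger.record("u1", "marketing_cookies", False, "policy-v3")  # withdrawal
```

Keeping the full history rather than a single current flag is a design choice: it lets the system show what the user had agreed to at any earlier point in time.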
8️⃣ Right to Access
GDPR Requirement
Users may request:
Show me all my data
Challenge
Data may exist in:
- Databases
- Logs
- Search indexes
- Data lakes
- Analytics systems
- Backups
Architecture Requirement
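Need unified retrieval across every system that holds the user's data. Responses are generally due within one month of the request.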
Example
Data Access Service
↓
Aggregates User Data
👉 Interview Memorization
GDPR requires organizations to provide users with access to their personal data, often requiring data aggregation across many systems.
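A Data Access Service of this kind can be sketched as a registry of per-system fetchers that are called in turn and merged into one report. `DataAccessService`, `register_source`, and the source names below are hypothetical; the point is that a failing source is reported rather than silently dropped.

```python
from typing import Any, Callable, Dict

Fetcher = Callable[[str], Any]  # takes a user ID, returns that system's data


class DataAccessService:
    """Hypothetical aggregator for subject access requests."""

    def __init__(self) -> None:
        self._sources: Dict[str, Fetcher] = {}

    def register_source(self, name: str, fetcher: Fetcher) -> None:
        self._sources[name] = fetcher

    def collect(self, user_id: str) -> Dict[str, Any]:
        report: Dict[str, Any] = {"user_id": user_id, "sources": {}, "errors": {}}
        for name, fetcher in self._sources.items():
            try:
                report["sources"][name] = fetcher(user_id)
            except Exception as exc:
                # Record the gap so the request can be retried or escalated.
                report["errors"][name] = str(exc)
        return report


def broken_analytics(user_id: str) -> Any:
    raise RuntimeError("analytics store unavailable")


access = DataAccessService()
access.register_source("primary_db", lambda uid: {"email": "a@example.com"})
access.register_source("search_index", lambda uid: ["profile-doc-1"])
access.register_source("analytics", broken_analytics)
report = access.collect("u1")
```

The `errors` section matters as much as the data: an access response that quietly omits a system is incomplete.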
9️⃣ Right to Rectification
Requirement
Users may request:
Correct my data
Example
Old Email
↓
New Email
Challenge
Update everywhere.
Systems Affected
- Primary database
- Search indexes
- Caches
- Analytics stores
👉 Interview Memorization
Data correction workflows must propagate updates across all systems that contain personal information.
🔟 Right to Erasure (“Right to be Forgotten”)
Requirement
User requests deletion. The right is not absolute: data that must be kept to meet a legal obligation can be retained.
Example
Delete Account
Challenge
Data exists everywhere.
Systems
- Databases
- Search indexes
- Caches
- Data warehouses
- Object storage
Workflow
Delete Request
↓
Deletion Service
↓
Propagation
👉 Interview Memorization
The Right to Erasure is one of the most challenging GDPR requirements because personal data may exist across many distributed systems.
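The Delete Request → Deletion Service → Propagation workflow can be sketched as a fan-out that tracks the outcome per system, so partial failures can be retried. `DeletionService`, `Status`, and the store names are hypothetical, and the sketch assumes each deleter is idempotent (safe to call twice).

```python
from enum import Enum
from typing import Callable, Dict, List


class Status(Enum):
    DONE = "done"
    FAILED = "failed"


class DeletionService:
    """Hypothetical fan-out: one deleter per system that holds user data."""

    def __init__(self, deleters: Dict[str, Callable[[str], object]]) -> None:
        self._deleters = deleters
        self._status: Dict[str, Dict[str, Status]] = {}

    def request(self, user_id: str) -> Dict[str, Status]:
        """Run every deleter once and record the outcome per system."""
        return self._run(user_id, list(self._deleters))

    def retry_failed(self, user_id: str) -> Dict[str, Status]:
        failed = [name for name, st in self._status.get(user_id, {}).items()
                  if st is Status.FAILED]
        return self._run(user_id, failed)

    def is_complete(self, user_id: str) -> bool:
        status = self._status.get(user_id)
        return bool(status) and all(st is Status.DONE for st in status.values())

    def _run(self, user_id: str, systems: List[str]) -> Dict[str, Status]:
        status = self._status.setdefault(user_id, {})
        for system in systems:
            try:
                self._deleters[system](user_id)
                status[system] = Status.DONE
            except Exception:
                status[system] = Status.FAILED
        return dict(status)


primary_db = {"u1": {"email": "a@example.com"}}
search_index = {"u1": ["profile-doc-1"]}
deletion = DeletionService({
    "primary_db": lambda uid: primary_db.pop(uid, None),
    "search_index": lambda uid: search_index.pop(uid, None),
})
result = deletion.request("u1")
```

In a production system the status map would be persisted and the fan-out would usually run through a queue, but the per-system status tracking is the part that prevents incomplete deletions.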
1️⃣1️⃣ Data Retention Policies
Problem
Many systems keep data forever.
GDPR Approach
Keep data only as long as necessary
Example
Inactive User
↓
Delete after 3 years
Technical Requirements
- TTL
- Scheduled deletion
- Retention policies
👉 Interview Memorization
GDPR requires organizations to define retention policies and automatically remove data that is no longer needed.
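A scheduled retention job reduces to a cutoff comparison. The sketch below reuses the three-year inactivity window from the example purely as an illustration; GDPR does not prescribe a fixed period, and `RETENTION` and `expired_users` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

# Illustrative policy only: the right period depends on the purpose of the data.
RETENTION = timedelta(days=3 * 365)


def expired_users(last_active: Dict[str, datetime], now: datetime,
                  retention: timedelta = RETENTION) -> List[str]:
    """Return IDs of users whose last activity is older than the retention window."""
    cutoff = now - retention
    return sorted(uid for uid, seen in last_active.items() if seen < cutoff)


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
last_active = {
    "u1": datetime(2022, 6, 1, tzinfo=timezone.utc),   # inactive for over 3 years
    "u2": datetime(2025, 3, 1, tzinfo=timezone.utc),   # recently active
}
to_delete = expired_users(last_active, now)
```

The resulting IDs would then be handed to the same deletion workflow used for user-initiated erasure, so retention and erasure share one code path.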
1️⃣2️⃣ Data Portability
Requirement
User requests:
Export my data
Example
Download Account Data
Formats
- JSON
- CSV
- XML
Architecture Requirement
Structured, commonly used, machine-readable format.
👉 Interview Memorization
Data portability requires systems to export user data in a structured, machine-readable format.
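A minimal export sketch using only the Python standard library, covering two of the formats listed above. The helper names `export_json` and `export_csv` and the sample records are hypothetical.

```python
import csv
import io
import json
from typing import Dict, List


def export_json(record: Dict) -> str:
    """Serialize a user's data as stable, human-inspectable JSON."""
    return json.dumps(record, indent=2, sort_keys=True, ensure_ascii=False)


def export_csv(rows: List[Dict]) -> str:
    """Serialize tabular data (for example order history) as CSV text."""
    if not rows:
        return ""
    # Use the union of keys so rows with missing fields still export.
    fieldnames = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


profile = {"user_id": "u1", "email": "a@example.com"}
orders = [{"order_id": 1, "item": "book"}, {"order_id": 2, "item": "pen"}]
profile_json = export_json(profile)
orders_csv = export_csv(orders)
```

Sorting keys and field names keeps repeated exports comparable, which helps when a user or auditor diffs two downloads.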
1️⃣3️⃣ Encryption Requirements
Data At Rest
Database Encryption
Data In Transit
TLS
Additional Controls
- Key rotation
- Secret management
- Access auditing
Goal
Reduce breach impact.
👉 Interview Memorization
Encryption is one of the most effective controls for protecting personal data and reducing risk in GDPR-regulated systems.
1️⃣4️⃣ Audit Logging
Requirement
Track:
Who accessed data
When
Why
Example
Admin
Viewed User Profile
2026-05-24
Benefits
- Compliance
- Investigations
- Security monitoring
👉 Interview Memorization
Audit logs provide accountability and are critical for demonstrating compliance during regulatory reviews.
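The who, when, and why of an access can be captured as one structured record per event. The sketch below assumes a hypothetical `AuditLog` class that rejects entries without a stated reason; a real deployment would write to tamper-resistant storage rather than an in-memory list.

```python
import json
from datetime import datetime, timezone
from typing import Dict, List, Optional


class AuditLog:
    """Hypothetical append-only audit log; entries are stored as JSON lines."""

    def __init__(self) -> None:
        self._lines: List[str] = []

    def record(self, actor: str, action: str, subject_id: str, reason: str,
               now: Optional[datetime] = None) -> Dict[str, str]:
        if not reason.strip():
            # "Why" is mandatory: access without a stated purpose is refused.
            raise ValueError("a reason is required for every access")
        entry = {
            "actor": actor,                                          # who
            "action": action,
            "subject_id": subject_id,
            "reason": reason,                                        # why
            "timestamp": (now or datetime.now(timezone.utc)).isoformat(),  # when
        }
        self._lines.append(json.dumps(entry, sort_keys=True))
        return entry

    def entries_for(self, subject_id: str) -> List[Dict[str, str]]:
        return [entry for entry in map(json.loads, self._lines)
                if entry["subject_id"] == subject_id]


audit = AuditLog()
audit.record("admin-7", "viewed_user_profile", "u1", "support ticket",
             now=datetime(2026, 5, 24, tzinfo=timezone.utc))
```

Note that the log stores the subject's ID, not their profile contents, so the audit trail does not become another copy of the personal data.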
1️⃣5️⃣ Data Residency and Transfers
Example
EU Data
↓
US Region
May require:
- Legal agreements
- Additional controls
- Transfer mechanisms
Architecture Impact
Multi-region systems become more complicated.
👉 Interview Memorization
GDPR imposes restrictions on international data transfers, requiring architects to carefully evaluate replication and storage strategies.
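A replication guard can make the transfer check explicit before data leaves the EU. Everything in this sketch is an assumption for illustration: the region names, the `EU_REGIONS` set, and the idea that approved transfers are recorded as (origin, target) pairs once the legal work is done.

```python
from typing import Set, Tuple

# Hypothetical region identifiers, not tied to any cloud provider.
EU_REGIONS = {"eu-1", "eu-2"}


def can_replicate(data_origin: str, target_region: str,
                  approved_transfers: Set[Tuple[str, str]]) -> bool:
    """Allow replication unless EU data would leave the EU without approval."""
    if data_origin != "EU":
        return True   # this sketch only models restrictions on EU data
    if target_region in EU_REGIONS:
        return True   # stays inside the EU
    # Outside the EU: require a recorded transfer mechanism for this route.
    return ("EU", target_region) in approved_transfers


approved = {("EU", "us-1")}  # assumption: legal agreement recorded for this route
```

With this in place, `can_replicate("EU", "ap-1", approved)` is refused while `can_replicate("EU", "us-1", approved)` is allowed, so the legal decision lives in data rather than in scattered deployment configs.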
1️⃣6️⃣ GDPR and AI Systems
New Challenge
AI models may train on user data.
Questions
Was consent given?
Can training data be removed?
Can outputs leak PII?
Challenges
- Model retraining
- Embedding stores
- Vector databases
- RAG systems
👉 Interview Memorization
AI systems introduce new GDPR challenges because personal data may appear in training datasets, embeddings, vector stores, and model outputs.
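For embedding stores and RAG systems, one practical step is to record which vectors were derived from which user's data so they can be removed on request. The sketch below assumes a hypothetical `VectorOwnershipIndex` and a plain dict standing in for the vector database; it does not address data already absorbed into model weights, which typically needs retraining.

```python
from collections import defaultdict
from typing import Callable, Dict, Set


class VectorOwnershipIndex:
    """Hypothetical side index: user ID -> IDs of vectors built from their data."""

    def __init__(self) -> None:
        self._by_user: Dict[str, Set[str]] = defaultdict(set)

    def track(self, user_id: str, vector_id: str) -> None:
        self._by_user[user_id].add(vector_id)

    def vectors_for(self, user_id: str) -> Set[str]:
        return set(self._by_user.get(user_id, set()))

    def forget(self, user_id: str, delete_vector: Callable[[str], object]) -> int:
        """Delete every tracked vector for the user; return how many were removed."""
        vector_ids = self._by_user.pop(user_id, set())
        for vector_id in vector_ids:
            delete_vector(vector_id)
        return len(vector_ids)


# A dict stands in for the vector database in this sketch.
vector_store = {"v1": [0.1, 0.2], "v2": [0.3, 0.4], "v3": [0.5, 0.6]}
index = VectorOwnershipIndex()
index.track("u1", "v1")
index.track("u1", "v2")
index.track("u2", "v3")
removed = index.forget("u1", lambda vid: vector_store.pop(vid, None))
```

Tracking ownership at write time is much cheaper than trying to discover a user's embeddings after the fact.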
1️⃣7️⃣ Common Architecture Patterns
Pattern 1
Data Access Service
User Data
↓
Central Access Layer
Pattern 2
Deletion Service
Delete Request
↓
Global Deletion Workflow
Pattern 3
Consent Service
Consent State
↓
Shared Service
Pattern 4
Data Catalog
Track Data Locations
👉 Interview Memorization
Large organizations often build dedicated services for consent management, data access, deletion workflows, and data cataloging.
1️⃣8️⃣ Common Failure Modes
Examples
- Forgotten backups
- Logs containing PII
- Search indexes not deleted
- Analytics copies
- Missing consent tracking
- Incomplete deletion workflows
- Cross-border replication issues
Lesson
Data spreads faster than compliance controls.
👉 Interview Memorization
GDPR violations often occur because personal data is copied into systems that are not included in compliance workflows.
1️⃣9️⃣ Best Practices
Practical Rules
- Identify all personal data
- Minimize collection
- Encrypt everything
- Build deletion workflows early
- Track consent
- Automate retention policies
- Audit access
- Classify data
- Continuously review compliance
Design Principle
Privacy must be designed in, not added later.
👉 Interview Memorization
GDPR compliance is easiest when privacy requirements are incorporated into system design from the beginning.
🧠 Final Staff-Level Answer
👉 Full Interview Answer
Designing systems under GDPR constraints requires balancing traditional system design goals such as scalability, availability, and performance with privacy and regulatory requirements.
Key requirements include identifying personal data, minimizing collection, managing consent, supporting user access requests, enabling data correction and deletion, enforcing retention policies, and controlling international data transfers.
GDPR also introduces architectural requirements around encryption, auditing, data catalogs, and compliance workflows.
Modern AI systems face additional challenges because personal data may appear in training datasets, vector databases, and generated outputs.
Successful GDPR-compliant architectures typically include dedicated services for consent management, deletion workflows, access requests, and data governance.
Ultimately, GDPR-compliant system design is about making privacy a first-class architectural concern rather than treating it as an afterthought.
⭐ Final Insight
The core of designing systems under GDPR constraints is not:
“How to store data”
but rather:
- Privacy
- User Rights
- Consent
- Data Lifecycle
- Compliance
- Governance
- Security
The single most important takeaway:
Privacy must be designed in, not added later.
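Recap
🎯 System Design under GDPR Constraints
Core Understanding
GDPR requires systems to satisfy not only:
- Scalability
- Availability
- Performance
but also:
- Privacy
- Compliance
- User Rights
Core Requirements
Data Minimization
Collect only necessary data
User Consent
Record when the user consented
What they consented to
Which policy version they consented to
Right of Access
Users can view all of their personal data
Right to Rectification
Users can correct their personal information
Right to Be Forgotten
Users can request deletion of their data
Right to Data Portability
Users can export their data
Architecture Impact
Common components:
- Consent Service
- Deletion Service
- Data Access Service
- Data Catalog
- Audit Logging System
AI System Challenges
Key areas to watch:
- Training Data
- Embeddings
- Vector Database
- RAG Storage
- Model Outputs
Interview Memorization Version
GDPR's core requirements include data minimization, user consent management, the right of access, the right to erasure, and the right to data portability.
System design must support full data lifecycle management and ensure that users can control their own personal data.
⭐ Final Summary
The core of GDPR system design is not:
“How to satisfy the law”
but rather:
How to give users real control over their own data.
The single most important takeaway:
Privacy must be designed in, not added later.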
Implement