🎯 Problem Background
In large-scale distributed systems, databases are often unable to handle extremely high read traffic directly.
For example:
- Product pages receiving millions of reads per minute
- Social feeds accessed by large user populations
- Recommendation or configuration data accessed repeatedly
To reduce database load and improve latency, systems introduce caching layers such as:
- Redis
- Memcached
- CDN caches
- Application-level caches
However, caching introduces a new challenge:
Cache consistency — ensuring cached data does not diverge significantly from the source of truth.
Several caching strategies are commonly used in system design, including:
- Cache Aside
- Write Through
- Write Back
- Read Through
Each strategy offers different trade-offs in consistency, latency, and complexity.
1️⃣ Cache Aside (Lazy Loading)
Core Idea
The application manages the cache explicitly.
Workflow:
read request
│
▼
check cache
│ (on cache miss)
▼
query database
│
▼
store result in cache
Write Flow
When updating data:
- Update the database first
- Then invalidate the cache entry
Example
1. User profile requested
2. Cache miss
3. Query database
4. Store result in Redis
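The read and write flows above can be sketched in a few lines. This is a minimal illustration, not production code: plain dicts stand in for Redis and the database, and all names (`read`, `write`, `cache`, `db`) are illustrative.

```python
# Cache-aside sketch: dicts stand in for Redis and the database.
cache = {}
db = {"user:1": {"name": "Alice"}}

def read(key):
    # 1. Check the cache first.
    if key in cache:
        return cache[key]
    # 2. On a miss, fall back to the database...
    value = db.get(key)
    # 3. ...and populate the cache for subsequent reads.
    if value is not None:
        cache[key] = value
    return value

def write(key, value):
    # Update the source of truth first, then invalidate the cache
    # entry so the next read repopulates it with fresh data.
    db[key] = value
    cache.pop(key, None)
```

Note that `write` invalidates rather than updates the cache; repopulating lazily on the next read avoids writing values that may never be read again.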
Benefits
- Simple and flexible
- Works with most existing architectures
- Cache only stores hot data
Trade-offs
- Stale reads are possible in the window between the database update and the cache invalidation
- Cache miss causes extra database latency
Best Fit
- Read-heavy workloads
- Product catalog
- User profile services
- Social feeds
Interview Answer (Memorization Version)
Cache-aside is the most commonly used caching strategy in distributed systems. The application first checks the cache, and if the data is missing, it loads the data from the database and populates the cache. For writes, the application updates the database and then invalidates the cache. This strategy works well for read-heavy workloads because it keeps the cache small and only stores hot data.
2️⃣ Write Through Cache
Core Idea
All writes go through the cache layer first.
Workflow:
write request
│
▼
write to cache
│
▼
cache writes to database
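The synchronous write path can be sketched as follows. Again the dicts are stand-ins for the cache and database layers; the function names are illustrative, not a real client API.

```python
# Write-through sketch: the cache layer owns the write path and
# synchronously persists each write to the database.
cache = {}
db = {}

def write_through(key, value):
    # Update the cache and the database in one synchronous step,
    # so reads served from the cache are never stale.
    cache[key] = value
    db[key] = value

def read(key):
    # Reads only hit the cache; write_through keeps it consistent.
    return cache.get(key)
```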
Benefits
- Cache always consistent with database
- Read operations are always fast
Trade-offs
- Higher write latency
- Increased load on cache layer
Example Use Cases
- User sessions
- Configuration systems
- Authentication tokens
Interview Answer (Memorization Version)
In write-through caching, every write goes through the cache first, and the cache synchronously updates the database. This guarantees that the cache is always consistent with the database. The downside is increased write latency, since each write must update both layers. It is typically used for session data or configuration systems where consistency is critical.
3️⃣ Write Back (Write Behind)
Core Idea
Writes are first stored in cache and asynchronously flushed to the database later.
Workflow:
write request
│
▼
update cache
│
▼
async write to database
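The asynchronous flush in the workflow above can be sketched with a queue. In this illustration `flush()` is called manually rather than by a background thread, to keep the example deterministic; dicts again stand in for the stores.

```python
import queue

# Write-back sketch: writes land in the cache and a pending queue;
# a separate flush step drains the queue to the database later.
cache = {}
db = {}
pending = queue.Queue()

def write_back(key, value):
    # Fast path: touch only the cache, enqueue the write for later.
    cache[key] = value
    pending.put((key, value))

def flush():
    # Asynchronous persistence step. If the cache crashed before
    # this ran, the queued writes would be lost -- the durability risk.
    while not pending.empty():
        key, value = pending.get()
        db[key] = value
```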
Benefits
- Very fast write performance
- Database write load reduced
Trade-offs
- Risk of data loss if cache crashes
- More complex failure handling
Best Fit
- High write throughput systems
- Analytics pipelines
- Logging systems
- Telemetry ingestion
Interview Answer (Memorization Version)
Write-back caching stores writes in the cache first and asynchronously flushes them to the database later. This significantly improves write throughput because the database is not on the critical path. However, it introduces durability risks if the cache fails before the data is persisted. Therefore it is commonly used in high-write systems like logging, analytics, or telemetry pipelines.
4️⃣ Read Through Cache
Core Idea
The cache layer automatically loads data from the database on a miss; the application only ever talks to the cache.
Workflow:
application → cache
│ (cache miss)
▼
cache fetches from database
│
▼
cache returns result
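The pattern above can be sketched by giving the cache a loader callback, so the cache (not the application) owns the database access. The class and parameter names here are illustrative assumptions.

```python
# Read-through sketch: the application calls only the cache, and the
# cache owns the database-loading logic via a loader callback.
class ReadThroughCache:
    def __init__(self, loader):
        self._store = {}
        self._loader = loader  # e.g. a database query function

    def get(self, key):
        if key not in self._store:
            # The cache, not the application, fetches on a miss.
            self._store[key] = self._loader(key)
        return self._store[key]

db = {"config:theme": "dark"}
cache = ReadThroughCache(lambda k: db.get(k))
```

From the application's point of view, `cache.get(key)` is the only call it ever makes; where the data came from is the cache layer's concern.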
Benefits
- Cleaner application logic
- Centralized cache management
- Easier to standardize
Trade-offs
- More complex cache infrastructure
- Harder to debug
Best Fit
- Large platforms
- Infrastructure-managed caching
- Shared caching services
Interview Answer (Memorization Version)
In read-through caching, the application always reads from the cache, and the cache automatically loads data from the database on a miss. This removes cache logic from the application and centralizes it in the cache layer. While this simplifies application code, it increases infrastructure complexity. It is often used in platform-level caching services or managed caching infrastructure.
🎤 30-Second Interview Summary
When discussing cache strategies, I usually consider the read-write ratio and consistency requirements.
Cache-aside is the most common pattern for read-heavy systems because it keeps cache logic simple and flexible. Write-through ensures strong consistency by updating cache and database together but increases write latency. Write-back optimizes write throughput by asynchronously persisting data but introduces durability risks. Read-through centralizes cache management in the cache layer and simplifies application logic.
In practice, cache-aside is the default choice for most distributed systems.
⭐ Staff-Level Insight (Bonus)
In large-scale systems, cache consistency is rarely strict. Instead, the goal is to minimize the inconsistency window while maximizing system performance and scalability.
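One common way to bound the inconsistency window is a TTL: even if an invalidation is missed, an entry can be stale for at most `ttl_seconds`. Below is a minimal sketch of that idea; the class is illustrative and a dict stands in for Redis.

```python
import time

# TTL sketch: each entry expires after ttl_seconds, bounding how long
# a stale value can survive a missed invalidation.
class TTLCache:
    def __init__(self, ttl_seconds):
        self._store = {}  # key -> (value, expires_at)
        self._ttl = ttl_seconds

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self._ttl)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Expired: treat as a miss so the caller re-reads the database.
            del self._store[key]
            return None
        return value
```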
Chinese Section
🎯 Problem Background
In large systems, the database usually cannot directly absorb extremely high read traffic, for example:
- Product detail pages
- Social feeds
- Recommendation systems
- User configuration data
Systems therefore typically introduce a caching layer (e.g. Redis, Memcached).
But caching brings a new problem:
Cache consistency
i.e., whether the data in the cache stays consistent with the database.
Common caching strategies include:
- Cache Aside
- Write Through
- Write Back
- Read Through
1️⃣ Cache Aside (Lazy Loading)
The application manages the cache itself.
Read flow:
read request
↓
check cache
↓
miss
↓
query database
↓
write to cache
Update flow:
Update the database
Delete the cache entry
Interview Answer
Cache Aside is the most common caching strategy. The application queries the cache first; on a miss it reads the database and writes the result into the cache. For updates, it typically updates the database first and then deletes the cache entry. This approach suits read-heavy, write-light systems well, because the cache only holds hot data.
2️⃣ Write Through
Writes go to the cache first, then synchronously to the database.
Pros:
- Cache always consistent with the database
Cons:
- Higher write latency
Typical use cases:
- Sessions
- User configuration
- Authentication data
Interview Answer
The defining feature of Write Through is that every write goes to the cache first and is then synchronously written to the database. This keeps the cache and database consistent at all times, at the cost of higher write latency. It is typically used for session or configuration data.
3️⃣ Write Back (Write Behind)
Writes go to the cache, then asynchronously to the database.
Pros:
- Very high write performance
Cons:
- Data may be lost if the cache crashes
Typical use cases:
- Logging systems
- Analytics systems
- Telemetry
Interview Answer
Write Back writes to the cache first and persists to the database asynchronously. Since the database is off the write path, write performance is very high; but if the cache crashes, data may be lost. It is therefore typically used for high-write systems such as logging, analytics, and monitoring.
4️⃣ Read Through
The application only talks to the cache.
On a cache miss, the cache system automatically reads from the database.
Pros:
- Simpler application code
- Centralized cache logic
Cons:
- More complex cache system
Interview Answer
The defining feature of Read Through is that the application only talks to the cache. On a miss, the cache system automatically loads the data from the database. This pulls cache logic out of the application, but makes the cache system itself more complex.
🎤 30-Second Interview Summary
When choosing a caching strategy in system design, I mainly look at the read/write ratio and consistency requirements.
Cache Aside is the most common approach and suits read-heavy, write-light systems. Write Through keeps the cache and database consistent but increases write latency. Write Back has the best write performance but risks data loss. Read Through centralizes cache logic in the cache layer.
In practice, Cache Aside is the most common default choice.