🎯 Design Inventory System
1️⃣ Core Framework
When discussing Inventory System design, I frame it as:
- Inventory data model: SKU, location, quantity
- Core flows: stock in, reserve, commit, release
- Reservation and oversell prevention
- Order integration and payment integration
- Multi-warehouse / multi-store inventory
- Event-driven updates and audit trail
- Reconciliation and correction
- Trade-offs: consistency vs availability vs latency
2️⃣ Core Requirements
Functional Requirements
- Track available inventory by SKU
- Track inventory by warehouse / store / region
- Support stock increase and decrease
- Support reservation during checkout
- Support commit after order/payment success
- Support release after cancellation or timeout
- Prevent overselling
- Support inventory audit history
- Support inventory reconciliation
- Support low-stock alerts
Non-functional Requirements
- Strong correctness for stock deduction
- High availability for product browsing
- Low-latency checkout validation
- Scalable read traffic
- Auditable inventory changes
- Eventually consistent read models are acceptable
- Stronger consistency needed for reservation and commit
👉 Interview Answer
An inventory system tracks how many units of each SKU are available at each location.
The most important challenge is preventing overselling, especially during checkout and high-demand events.
I would separate read-heavy inventory display from write-critical reservation and commit flows.
3️⃣ Core Concepts
SKU
A SKU represents a sellable item variant.
Example:
product = T-shirt
SKU = red, size M
Location
Inventory can exist at:
- Warehouse
- Store
- Fulfillment center
- Region
- Seller location
Inventory States
Common quantity fields:
on_hand
reserved
available
sold
damaged
returned
Formula (unavailable covers on-hand units that cannot be sold, such as damaged stock):
available = on_hand - reserved - unavailable
👉 Interview Answer
I would model inventory at the SKU-location level.
The same product may have multiple SKUs, and each SKU may have inventory in multiple warehouses or stores.
Available inventory is usually derived from on-hand quantity minus reserved or unavailable quantity.
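A minimal sketch of the SKU-location model described above. The class name `InventoryBalance` and the sample numbers are assumptions for illustration; the field names follow these notes.

```python
from dataclasses import dataclass


@dataclass
class InventoryBalance:
    """Quantities for one SKU at one location (hypothetical class name)."""
    sku_id: str
    location_id: str
    on_hand: int
    reserved: int = 0
    unavailable: int = 0  # assumption: damaged or otherwise unsellable units

    @property
    def available(self) -> int:
        # Derived on read rather than stored: on_hand - reserved - unavailable.
        return self.on_hand - self.reserved - self.unavailable


balance = InventoryBalance("sku123", "wh1", on_hand=100, reserved=8, unavailable=2)
print(balance.available)  # 90
```

Deriving `available` instead of storing it on the write side avoids a second field that could drift out of sync with the other three.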
4️⃣ Main APIs
Get Inventory
GET /api/inventory?skuId=sku123&locationId=wh1
Reserve Inventory
POST /api/inventory/reservations
Request:
{
"orderId": "o123",
"skuId": "sku123",
"locationId": "wh1",
"quantity": 2
}
Commit Inventory
POST /api/inventory/reservations/{reservationId}/commit
Release Inventory
POST /api/inventory/reservations/{reservationId}/release
Adjust Inventory
POST /api/inventory/adjustments
Request:
{
"skuId": "sku123",
"locationId": "wh1",
"delta": 10,
"reason": "stock_received"
}
👉 Interview Answer
The core APIs are get inventory, reserve inventory, commit inventory, release inventory, and adjust inventory.
Reservation, commit, and release APIs must be idempotent, because order and payment systems may retry calls.
5️⃣ Data Model
Inventory Balance Table
inventory_balance (
sku_id VARCHAR,
location_id VARCHAR,
on_hand INT,
reserved INT,
unavailable INT,
version BIGINT,
updated_at TIMESTAMP,
PRIMARY KEY (sku_id, location_id)
)
Inventory Reservation Table
inventory_reservation (
reservation_id VARCHAR PRIMARY KEY,
order_id VARCHAR,
sku_id VARCHAR,
location_id VARCHAR,
quantity INT,
status VARCHAR, -- reserved, committed, released, expired
expires_at TIMESTAMP,
idempotency_key VARCHAR,
created_at TIMESTAMP,
updated_at TIMESTAMP
)
Inventory Event Table
inventory_event (
event_id VARCHAR PRIMARY KEY,
sku_id VARCHAR,
location_id VARCHAR,
event_type VARCHAR,
quantity_delta INT,
reason VARCHAR,
reference_id VARCHAR,
created_at TIMESTAMP,
metadata JSON
)
Inventory Snapshot Table
inventory_snapshot (
sku_id VARCHAR,
location_id VARCHAR,
available INT,
updated_at TIMESTAMP,
PRIMARY KEY (sku_id, location_id)
)
👉 Interview Answer
I would maintain an inventory balance table for current quantities, a reservation table for checkout holds, and an event table for audit history.
The event table is important because inventory changes must be explainable, debuggable, and reconcilable.
6️⃣ Reservation Flow
Why Reservation?
During checkout, we do not want to immediately mark inventory as sold.
Instead:
reserve now
commit after payment/order success
release if cancelled or expired
Reservation Flow
User checks out
→ Order service requests reservation
→ Inventory service checks available quantity
→ If enough stock, increase reserved
→ Create reservation record with TTL
→ Return success
Atomic Condition
Use conditional update:
UPDATE inventory_balance
SET reserved = reserved + 2,
version = version + 1
WHERE sku_id = 'sku123'
AND location_id = 'wh1'
AND on_hand - reserved - unavailable >= 2;
👉 Interview Answer
I would use a reservation model.
During checkout, the inventory service atomically checks whether enough stock is available and increases the reserved quantity.
This prevents overselling while payment and order confirmation are still in progress.
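The conditional update above can be exercised end to end with a small sketch. SQLite stands in for the authoritative store here, and the helper name `try_reserve` is an assumption; the point is that the stock check and the increment happen in one statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE inventory_balance (
        sku_id TEXT, location_id TEXT,
        on_hand INTEGER, reserved INTEGER, unavailable INTEGER,
        version INTEGER,
        PRIMARY KEY (sku_id, location_id)
    )
""")
conn.execute("INSERT INTO inventory_balance VALUES ('sku123', 'wh1', 3, 0, 0, 0)")
conn.commit()


def try_reserve(conn: sqlite3.Connection, sku_id: str, location_id: str, qty: int) -> bool:
    """Reserve qty units only if enough stock is available; True on success."""
    cur = conn.execute(
        """
        UPDATE inventory_balance
        SET reserved = reserved + ?, version = version + 1
        WHERE sku_id = ? AND location_id = ?
          AND on_hand - reserved - unavailable >= ?
        """,
        (qty, sku_id, location_id, qty),
    )
    conn.commit()
    return cur.rowcount == 1  # zero rows updated means insufficient stock


print(try_reserve(conn, "sku123", "wh1", 2))  # True
print(try_reserve(conn, "sku123", "wh1", 2))  # False, only 1 unit is left
```

The caller learns the outcome from the affected-row count, so no separate read is needed before the write.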
7️⃣ Commit and Release Flow
Commit Flow
After order/payment success:
Reservation reserved
→ Commit reservation
→ Decrease on_hand
→ Decrease reserved
→ Mark reservation committed
→ Emit inventory committed event
Example:
on_hand = on_hand - quantity
reserved = reserved - quantity
Release Flow
If order is cancelled or payment fails:
Reservation reserved
→ Release reservation
→ Decrease reserved
→ Mark reservation released
→ Emit inventory released event
Expiration Flow
If reservation expires:
Reservation expires
→ Background worker releases it
→ Reserved stock becomes available again
👉 Interview Answer
After payment succeeds, the reservation should be committed, which reduces both on-hand and reserved inventory.
If payment fails or the order is cancelled, the reservation should be released, which decreases reserved inventory and makes the stock available again.
Reservations should have expiration times so abandoned checkouts do not hold inventory forever.
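The commit, release, and expiration flows can be read as a small state machine. The sketch below keeps everything in memory; the class and function names are assumptions, and a real implementation would apply the same transitions inside database transactions.

```python
from dataclasses import dataclass


@dataclass
class Balance:
    on_hand: int
    reserved: int = 0


@dataclass
class Reservation:
    quantity: int
    expires_at: float          # epoch seconds
    status: str = "reserved"   # reserved -> committed | released | expired


def commit(balance: Balance, r: Reservation) -> None:
    if r.status == "committed":
        return  # idempotent: a retried commit must not deduct twice
    if r.status != "reserved":
        raise ValueError(f"cannot commit a {r.status} reservation")
    balance.on_hand -= r.quantity
    balance.reserved -= r.quantity
    r.status = "committed"


def release(balance: Balance, r: Reservation, final_status: str = "released") -> None:
    if r.status in ("released", "expired"):
        return  # idempotent: the hold was already returned
    if r.status != "reserved":
        raise ValueError(f"cannot release a {r.status} reservation")
    balance.reserved -= r.quantity
    r.status = final_status


def expire_overdue(balance: Balance, reservations: list[Reservation], now: float) -> int:
    """What a background worker might run periodically; returns the count expired."""
    overdue = [r for r in reservations if r.status == "reserved" and r.expires_at <= now]
    for r in overdue:
        release(balance, r, final_status="expired")
    return len(overdue)


balance = Balance(on_hand=10, reserved=5)
paid = Reservation(quantity=2, expires_at=100.0)
abandoned = Reservation(quantity=3, expires_at=50.0)
commit(balance, paid)
commit(balance, paid)  # retried call, no second deduction
expired_count = expire_overdue(balance, [paid, abandoned], now=60.0)
print(balance.on_hand, balance.reserved, expired_count)  # 8 0 1
```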
8️⃣ Oversell Prevention
Main Risk
Multiple customers try to buy the same SKU at the same time.
Techniques
1. Conditional Update
WHERE on_hand - reserved - unavailable >= requested_quantity
2. Optimistic Locking
Use version column:
read version
update where version = old_version
3. Row-level Locking
Lock SKU-location row during update.
4. Single-writer per SKU Partition
Route all writes for one SKU to one partition/actor.
5. Reservation Queue for Flash Sales
Serialize requests for extremely hot SKUs.
👉 Interview Answer
To prevent overselling, the critical operation is the atomic reservation.
I would use conditional updates, optimistic locking, or single-writer partitioning to ensure the reserved quantity never exceeds the sellable on-hand stock.
For flash sales, a queue-based reservation system can protect hot SKUs.
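A rough sketch of the optimistic-locking technique using the version column. SQLite is a stand-in for the real store, and `reserve_optimistic` and `max_attempts` are hypothetical names; the write succeeds only if the version read earlier is still current.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE inventory_balance (
    sku_id TEXT, location_id TEXT, on_hand INTEGER, reserved INTEGER,
    unavailable INTEGER, version INTEGER, PRIMARY KEY (sku_id, location_id))""")
conn.execute("INSERT INTO inventory_balance VALUES ('sku123', 'wh1', 5, 0, 0, 0)")
conn.commit()


def reserve_optimistic(conn, sku_id, location_id, qty, max_attempts=3):
    """Read the row, then write only if the version is unchanged; retry on conflict."""
    for _ in range(max_attempts):
        row = conn.execute(
            "SELECT on_hand, reserved, unavailable, version FROM inventory_balance "
            "WHERE sku_id = ? AND location_id = ?", (sku_id, location_id)).fetchone()
        if row is None:
            return False  # unknown SKU-location
        on_hand, reserved, unavailable, version = row
        if on_hand - reserved - unavailable < qty:
            return False  # not enough stock: fail fast, no retry
        cur = conn.execute(
            "UPDATE inventory_balance SET reserved = ?, version = ? "
            "WHERE sku_id = ? AND location_id = ? AND version = ?",
            (reserved + qty, version + 1, sku_id, location_id, version))
        conn.commit()
        if cur.rowcount == 1:
            return True  # nobody changed the row between the read and the write
    return False  # gave up after repeated version conflicts


print(reserve_optimistic(conn, "sku123", "wh1", 4))  # True
print(reserve_optimistic(conn, "sku123", "wh1", 2))  # False, only 1 unit is left
```

Under heavy contention on one row most attempts conflict and retry, which is why the notes reach for single-writer or queue-based approaches on hot SKUs.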
9️⃣ Multi-location Inventory
Why Multi-location Matters
The same SKU may exist in many places:
SKU123:
- warehouse A: 100
- warehouse B: 50
- store C: 5
Location Selection
Choose fulfillment location based on:
- Available inventory
- Distance to customer
- Shipping cost
- Delivery speed
- Warehouse capacity
- Business rules
Flow
Order request
→ Find eligible locations
→ Check available inventory
→ Select best fulfillment location
→ Reserve inventory at that location
👉 Interview Answer
In a real system, inventory is tracked by SKU and location.
When an order is placed, the system should choose the best fulfillment location based on stock availability, distance, shipping cost, and delivery promise.
The reservation should happen at the selected location.
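Location selection could be sketched as a filter followed by a ranking. The stock levels below come from the example in this section; the distances, shipping costs, and the cost-then-distance ranking rule are assumptions chosen only to make the example concrete.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LocationOption:
    location_id: str
    available: int
    distance_km: float     # hypothetical value for illustration
    shipping_cost: float   # hypothetical value for illustration


def choose_location(options: list[LocationOption], quantity: int) -> Optional[LocationOption]:
    """Pick the cheapest eligible location, breaking ties by distance."""
    eligible = [o for o in options if o.available >= quantity]
    if not eligible:
        return None
    return min(eligible, key=lambda o: (o.shipping_cost, o.distance_km))


options = [
    LocationOption("warehouse_a", 100, distance_km=900.0, shipping_cost=8.0),
    LocationOption("warehouse_b", 50, distance_km=300.0, shipping_cost=5.5),
    LocationOption("store_c", 5, distance_km=10.0, shipping_cost=3.0),
]
best = choose_location(options, quantity=10)
print(best.location_id)  # warehouse_b (store_c lacks stock for 10 units)
```

The availability used for ranking may come from a slightly stale read model; the reservation at the chosen location is still what decides whether the order can proceed.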
🔟 Read Model and Caching
Read-heavy Use Cases
- Product detail page
- Search results
- Store availability
- Low-stock badge
- Estimated delivery promise
Why Not Query the Source of Truth Every Time?
Because browsing traffic is much higher than checkout traffic.
Strategy
Use a read-optimized inventory snapshot:
Inventory write model
→ events
→ read model / cache
→ product pages
Cache Rules
- Product page inventory can be slightly stale
- Checkout must revalidate against source of truth
- Use short TTL for hot products
- Use event-driven cache invalidation
👉 Interview Answer
I would separate inventory reads from inventory writes.
Product browsing can use cached or eventually consistent inventory snapshots, because slight staleness is acceptable.
But checkout must call the inventory service to perform an atomic reservation against the authoritative inventory balance.
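A small sketch of the read-side cache rules: a short TTL plus event-driven invalidation. The class name `InventorySnapshotCache`, its methods, and the injected `load` callback are assumptions, not part of any particular caching library.

```python
import time
from typing import Callable


class InventorySnapshotCache:
    """Hypothetical read-side cache: short TTL plus event-driven invalidation."""

    def __init__(self, load: Callable[[str, str], int], ttl_seconds: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load          # reads the snapshot or authoritative store
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[tuple[str, str], tuple[int, float]] = {}

    def get_available(self, sku_id: str, location_id: str) -> int:
        key = (sku_id, location_id)
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]  # possibly stale, acceptable for browsing
        value = self._load(sku_id, location_id)
        self._entries[key] = (value, self._clock() + self._ttl)
        return value

    def on_inventory_event(self, sku_id: str, location_id: str) -> None:
        # Called by an event consumer; the next read reloads fresh data.
        self._entries.pop((sku_id, location_id), None)


source = {("sku123", "wh1"): 10}
cache = InventorySnapshotCache(lambda s, l: source[(s, l)], ttl_seconds=60.0)
cache.get_available("sku123", "wh1")          # loads 10 from the source
source[("sku123", "wh1")] = 8                 # a write the cache has not seen
stale = cache.get_available("sku123", "wh1")  # still 10 until invalidated
cache.on_inventory_event("sku123", "wh1")
fresh = cache.get_available("sku123", "wh1")  # 8
print(stale, fresh)  # 10 8
```

Nothing in this cache is consulted at checkout; the reservation path goes to the authoritative balance.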
1️⃣1️⃣ Event-driven Inventory Updates
Events
Examples:
inventory_reserved
inventory_committed
inventory_released
inventory_adjusted
inventory_received
inventory_damaged
inventory_returned
Event Consumers
- Product page cache
- Search index
- Analytics
- Low-stock alerting
- Warehouse management system
- Order system
- Recommendation system
👉 Interview Answer
Inventory changes should emit events.
These events update read models, product availability cache, search index, analytics, low-stock alerts, and warehouse systems.
This keeps the write path focused on correctness while downstream systems update asynchronously.
1️⃣2️⃣ Returns, Damaged Goods, and Adjustments
Return Flow
Customer returns item
→ Warehouse receives item
→ Inspect condition
→ If sellable, increase on_hand
→ If damaged, increase unavailable
→ Emit inventory_returned event
Adjustment Reasons
- Stock received
- Manual correction
- Damaged item
- Lost item
- Return processed
- Cycle count correction
- Supplier shipment correction
👉 Interview Answer
Not all inventory changes come from orders.
Returns, damaged goods, warehouse receiving, manual corrections, and cycle counts also change inventory.
Every adjustment should include a reason, reference ID, and audit event.
1️⃣3️⃣ Reconciliation
Why Needed?
System inventory may differ from physical inventory.
Causes:
- Warehouse scanning errors
- Lost items
- Damaged items
- Event processing failures
- Manual operations
- Supplier shipment mismatch
Reconciliation Flow
Physical count / warehouse report
→ Compare with system inventory
→ Find discrepancy
→ Create adjustment event
→ Update inventory balance
→ Generate audit report
Important Principle
Never silently overwrite inventory.
Always create adjustment events.
👉 Interview Answer
Reconciliation is necessary because physical inventory can diverge from system inventory.
I would compare warehouse counts with system balances, create adjustment events for discrepancies, and keep a full audit trail.
Inventory corrections should never be silent overwrites.
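The reconciliation flow could be sketched as a function that turns each discrepancy into an adjustment event. The event class, the `cycle_count_correction` reason string, and the reference ID format are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdjustmentEvent:
    sku_id: str
    location_id: str
    quantity_delta: int
    reason: str
    reference_id: str  # e.g. the ID of the physical count report


def reconcile(system_on_hand: dict[tuple[str, str], int],
              counted_on_hand: dict[tuple[str, str], int],
              reference_id: str) -> list[AdjustmentEvent]:
    """Emit one adjustment event per discrepancy and apply its delta to the balance."""
    events = []
    for key, counted in counted_on_hand.items():
        delta = counted - system_on_hand.get(key, 0)
        if delta == 0:
            continue
        events.append(AdjustmentEvent(key[0], key[1], delta,
                                      "cycle_count_correction", reference_id))
        # The balance changes only through the recorded delta, never by overwrite.
        system_on_hand[key] = system_on_hand.get(key, 0) + delta
    return events


system = {("sku123", "wh1"): 100, ("sku456", "wh1"): 20}
counted = {("sku123", "wh1"): 97, ("sku456", "wh1"): 20}
events = reconcile(system, counted, reference_id="count-report-001")
print(events[0].quantity_delta, system[("sku123", "wh1")])  # -3 97
```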
1️⃣4️⃣ Flash Sale / High-demand SKU Handling
Problem
A very popular item receives huge concurrent demand.
Risks:
- Oversell
- Database hot row
- Checkout latency spike
- Bad user experience
Strategies
- Pre-allocate limited tokens
- Queue purchase requests
- Use single-writer partition
- Use Redis atomic counters with DB confirmation
- Rate limit per user
- Use waitlist or lottery
- Degrade inventory display to “limited stock”
👉 Interview Answer
Flash sales create hot SKU problems.
For extremely high-demand SKUs, I would avoid letting every request directly hit the inventory database.
Instead, I would use a queue, pre-allocated purchase tokens, or a single-writer partition to serialize reservations and protect consistency.
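A toy sketch of the queue plus single-writer idea, using threads inside one process. The names `purchase_requests` and `single_writer` are assumptions; in a real deployment the queue would more likely be a message broker or a partitioned log, but the shape is the same: many producers, one consumer that owns the stock counter.

```python
import queue
import threading

purchase_requests = queue.Queue()
winners: list[str] = []


def single_writer(stock: int) -> None:
    """The only code path that decrements stock for this hot SKU."""
    while True:
        user_id = purchase_requests.get()
        if user_id is None:        # sentinel: stop the worker
            return
        if stock > 0:
            stock -= 1
            winners.append(user_id)  # reservation granted
        # Otherwise the request is rejected or waitlisted without a database hit.


worker = threading.Thread(target=single_writer, args=(3,))
worker.start()
buyers = [threading.Thread(target=purchase_requests.put, args=(f"user{i}",))
          for i in range(10)]
for t in buyers:
    t.start()
for t in buyers:
    t.join()
purchase_requests.put(None)
worker.join()
print(len(winners))  # 3, no matter how the ten requests interleave
```

Which three buyers win depends on arrival order, but the count can never exceed the stock because only one thread touches it.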
1️⃣5️⃣ Integration With Order and Payment
Normal Flow
User checks out
→ Reserve inventory
→ Authorize payment
→ Create order
→ Commit inventory after order confirmed
Alternative Flow
Authorize payment
→ Reserve inventory
→ Create order
→ Capture payment
Failure Cases
- Inventory reserved but payment fails → release reservation
- Payment authorized but inventory fails → void authorization
- Order created but commit fails → retry / manual reconciliation
Saga Pattern
Use saga to coordinate:
reserve inventory
authorize payment
create order
commit inventory
capture payment
Each step has a compensation action.
👉 Interview Answer
Inventory, order, and payment should be coordinated carefully.
I would use a saga pattern, where each step has a compensating action.
For example, if payment fails after inventory reservation, the system releases the reservation.
If inventory reservation fails, the system should not proceed with payment capture.
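The saga idea can be sketched as a list of steps, each paired with its compensation. The `run_saga` helper and the lambdas standing in for service calls are assumptions; a production saga would also persist its progress and retry compensations that fail.

```python
from typing import Callable

# (name, action, compensation)
Step = tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: list[Step]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: list[Step] = []
    for step in steps:
        _, action, _ = step
        try:
            action()
        except Exception:
            for _, _, compensate in reversed(completed):
                compensate()
            return False
        completed.append(step)
    return True


log: list[str] = []


def fail_payment() -> None:
    raise RuntimeError("card declined")


ok = run_saga([
    ("reserve inventory", lambda: log.append("reserved"), lambda: log.append("released")),
    ("authorize payment", fail_payment, lambda: log.append("voided")),
    ("create order", lambda: log.append("order created"), lambda: log.append("order cancelled")),
])
print(ok, log)  # False ['reserved', 'released']
```

The failed step itself is not compensated, since it never completed; only the reservation made before it is released.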
1️⃣6️⃣ Consistency Model
Stronger Consistency Needed For
- Reservation
- Commit
- Release
- Inventory adjustment
- Order checkout
- Flash sale deduction
Eventual Consistency Acceptable For
- Product page stock display
- Search result availability
- Low-stock badges
- Analytics
- Recommendation features
- Inventory reporting dashboards
👉 Interview Answer
Inventory requires mixed consistency.
Checkout reservation and commit need strong correctness to prevent overselling.
Product pages and search results can use eventually consistent snapshots, but they must revalidate inventory during checkout.
1️⃣7️⃣ Scaling Patterns
Pattern 1: Separate Write Model and Read Model
- Write model = authoritative inventory balance
- Read model = cached snapshot for browsing
Pattern 2: Shard by SKU or SKU-location
hash(sku_id + location_id)
Pattern 3: Event-driven Propagation
Inventory changes publish events to downstream consumers.
Pattern 4: Single-writer for Hot SKU
Serialize writes for high-demand items.
Pattern 5: Reservation Expiration Worker
Automatically releases expired reservations.
👉 Interview Answer
To scale inventory, I would shard by SKU-location, separate authoritative writes from cached read models, and use events to update downstream systems.
For hot SKUs, a single-writer or queue-based reservation model can prevent contention and overselling.
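Shard routing by SKU-location might look like the sketch below. The function name `shard_for` and the `|` separator are assumptions; SHA-256 is used because Python's built-in `hash()` for strings is randomized per process and so is unsuitable for routing.

```python
import hashlib


def shard_for(sku_id: str, location_id: str, num_shards: int) -> int:
    """Map a SKU-location pair to a shard with a hash that is stable across processes."""
    # The separator avoids collisions such as ("ab", "c") versus ("a", "bc").
    key = f"{sku_id}|{location_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


shard = shard_for("sku123", "wh1", num_shards=16)
print(0 <= shard < 16)  # True
```

Simple modulo routing reshuffles most keys when `num_shards` changes; consistent hashing is a common alternative when shards are added often.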
1️⃣8️⃣ Failure Handling
Common Failures
- Reservation request timeout
- Duplicate reservation request
- Payment fails after reservation
- Reservation expires but release worker delayed
- Commit retry creates duplicate deduction
- Event delivery failure
- Warehouse adjustment mismatch
Strategies
- Idempotency keys
- Reservation state machine
- Conditional updates
- Expiration worker
- Retry with backoff
- Outbox pattern for events
- Reconciliation jobs
- Audit events for every change
👉 Interview Answer
Inventory systems must handle retries and partial failures.
Reservation, commit, and release should be idempotent.
Inventory updates should use conditional writes, and all state changes should emit audit events.
Reconciliation jobs are needed to correct mismatches over time.
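The outbox pattern from the strategies list could be sketched as follows. SQLite stands in for the inventory database, and the table layout, function names, and event payload shape are assumptions. The balance change and the event row are written in one transaction, and a separate relay publishes the pending rows.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory_balance (sku_id TEXT, location_id TEXT, on_hand INTEGER,
        PRIMARY KEY (sku_id, location_id));
    CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT,
        published INTEGER DEFAULT 0);
    INSERT INTO inventory_balance VALUES ('sku123', 'wh1', 10);
""")


def adjust_with_outbox(conn, sku_id, location_id, delta, reason):
    """Change the balance and record the event in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute("UPDATE inventory_balance SET on_hand = on_hand + ? "
                     "WHERE sku_id = ? AND location_id = ?", (delta, sku_id, location_id))
        conn.execute("INSERT INTO outbox (payload) VALUES (?)",
                     (json.dumps({"type": "inventory_adjusted", "skuId": sku_id,
                                  "locationId": location_id, "delta": delta,
                                  "reason": reason}),))


def publish_pending(conn, publish):
    """Relay step: publish unsent rows in order, then mark each as published."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id").fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))  # may repeat after a crash, so consumers dedupe
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


adjust_with_outbox(conn, "sku123", "wh1", 10, "stock_received")
sent: list[dict] = []
print(publish_pending(conn, sent.append))  # 1
print(publish_pending(conn, sent.append))  # 0, nothing left to publish
```

Delivery is at least once in this sketch, so downstream consumers need to be idempotent as well.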
1️⃣9️⃣ Observability
Key Metrics
- Reservation success rate
- Reservation failure rate
- Oversell count
- Expired reservation count
- Commit failure rate
- Release failure rate
- Inventory adjustment count
- Hot SKU contention
- Checkout inventory latency
- Reconciliation mismatch count
- Low-stock alert count
👉 Interview Answer
I would monitor reservation success rate, oversell count, checkout inventory latency, expired reservations, commit and release failures, hot SKU contention, and reconciliation mismatches.
These metrics directly show whether inventory correctness and checkout reliability are healthy.
2️⃣0️⃣ End-to-End Flow
Checkout Flow
User checks out
→ Inventory service reserves SKU-location quantity
→ Payment service authorizes payment
→ Order service creates order
→ Inventory service commits reservation
→ Order confirmed
Cancellation Flow
User cancels order
→ Order state updated
→ Inventory reservation released
→ Payment authorization voided or refunded
→ Events emitted
Reconciliation Flow
Warehouse physical count
→ Compare with system balance
→ Create adjustment event
→ Update inventory balance
→ Audit report generated
Key Insight
Inventory System is not just a quantity table — it is a correctness-critical reservation and reconciliation system.
🧠 Staff-Level Answer (Final)
👉 Interview Answer (Full Version)
When designing an inventory system, I think of it as a correctness-critical system that tracks available stock by SKU and location.
The most important goal is to prevent overselling, especially during checkout and high-demand events.
I would model inventory using an authoritative inventory balance table, a reservation table, and an append-only inventory event table.
During checkout, the system should create a reservation instead of immediately marking stock as sold. The reservation atomically checks available quantity and increases reserved inventory.
After payment and order confirmation, the reservation is committed, which decreases both on-hand and reserved inventory.
If payment fails, the order is cancelled, or the reservation expires, the reservation is released and the stock becomes available again.
To prevent overselling, I would use conditional updates, optimistic locking, row-level locking, or single-writer partitioning for hot SKUs.
For browsing, I would use eventually consistent inventory snapshots or caches, because product pages and search results are read-heavy and can tolerate slight staleness.
But checkout must always revalidate and reserve against the authoritative inventory store.
Inventory changes should emit events so downstream systems like search, product pages, analytics, low-stock alerts, and warehouse systems can update asynchronously.
Reconciliation is essential because physical inventory can diverge from system inventory. Corrections should be made through adjustment events, never silent overwrites.
The main trade-offs are consistency, availability, checkout latency, contention on hot SKUs, and operational complexity.
Ultimately, the goal is to provide fast inventory visibility for users while maintaining strong correctness for reservation, commit, release, and reconciliation.
⭐ Final Insight
At its core, an Inventory System is not a simple stock-quantity table; it is a strongly correct system that prevents oversell and supports reservation, commit, release, and reconciliation.
Implement