🎯 Design Ride Sharing (Uber)
1️⃣ Core Framework
When discussing Ride Sharing design, I frame it as:
- Core flows: request ride, match driver, track trip
- Real-time location system
- Driver discovery and geo indexing
- Matching and dispatch algorithm
- Trip state machine
- ETA, routing, and pricing
- Payment and notifications
- Trade-offs: latency vs accuracy vs consistency
2️⃣ Core Requirements
Functional Requirements
- Rider can request a ride
- Driver can go online / offline
- System can find nearby drivers
- System can match rider with driver
- Rider and driver can track each other in real time
- Support trip lifecycle: requested, accepted, picked up, completed
- Support pricing and ETA
- Support payment
- Support cancellation
- Support notifications
Non-functional Requirements
- Low-latency matching
- Real-time location updates
- High availability
- Scalable geo search
- Reliable trip state management
- Strong payment correctness
- Eventually consistent location updates are acceptable
👉 Interview Answer
A ride-sharing system has three critical real-time flows: driver location updates, rider trip requests, and rider-driver matching.
The main challenge is finding the best nearby driver quickly, while handling real-time location changes, trip state transitions, pricing, payment, and system failures.
3️⃣ Main APIs
Driver Location Update
POST /api/drivers/{driverId}/location
Request:
{
  "lat": 40.7128,
  "lng": -74.0060,
  "status": "AVAILABLE",
  "timestamp": "2026-05-02T10:00:00Z"
}
Request Ride
POST /api/rides
Request:
{
  "riderId": "r123",
  "pickup": {
    "lat": 40.7128,
    "lng": -74.0060
  },
  "dropoff": {
    "lat": 40.7580,
    "lng": -73.9855
  },
  "rideType": "standard"
}
Accept Ride
POST /api/rides/{rideId}/accept
Request:
{
  "driverId": "d456"
}
Update Trip State
POST /api/rides/{rideId}/state
Request:
{
  "state": "PICKED_UP"
}
Get Ride Status
GET /api/rides/{rideId}
👉 Interview Answer
I would separate APIs for driver location updates, ride creation, ride acceptance, trip state updates, and ride status retrieval.
Location updates are high-volume and eventually consistent, while ride acceptance and payment require stronger correctness.
4️⃣ Data Model
Rider Table
rider (
  rider_id VARCHAR PRIMARY KEY,
  name VARCHAR,
  phone VARCHAR,
  payment_profile_id VARCHAR,
  created_at TIMESTAMP
)
Driver Table
driver (
  driver_id VARCHAR PRIMARY KEY,
  name VARCHAR,
  phone VARCHAR,
  vehicle_id VARCHAR,
  status VARCHAR,
  created_at TIMESTAMP
)
Driver Location Store
driver_location (
  driver_id VARCHAR PRIMARY KEY,
  lat DOUBLE,
  lng DOUBLE,
  geohash VARCHAR,
  status VARCHAR,
  updated_at TIMESTAMP
)
Ride Table
ride (
  ride_id VARCHAR PRIMARY KEY,
  rider_id VARCHAR,
  driver_id VARCHAR,
  pickup_lat DOUBLE,
  pickup_lng DOUBLE,
  dropoff_lat DOUBLE,
  dropoff_lng DOUBLE,
  state VARCHAR,
  price_estimate DECIMAL,
  final_price DECIMAL,
  created_at TIMESTAMP,
  updated_at TIMESTAMP
)
Ride Event Table
ride_event (
  event_id VARCHAR PRIMARY KEY,
  ride_id VARCHAR,
  actor_id VARCHAR,
  event_type VARCHAR,
  created_at TIMESTAMP,
  metadata JSON
)
👉 Interview Answer
I would store riders, drivers, trips, driver locations, and ride events separately.
Driver location is high-volume and frequently updated, so it should be stored in a fast geo-indexed store.
Ride state should be stored durably, and ride events can be used for auditing and debugging.
5️⃣ Real-time Location System
Driver Location Updates
Drivers send location every few seconds.
Driver App
→ Location Service
→ Geo Index
→ Location Stream
Storage Choice
Use fast in-memory or geo-enabled storage:
Redis Geo
Geohash index
S2 cells
Elastic geo index
Custom in-memory geo grid
Why Not Store Every Location in Main DB?
Because location updates are:
- High frequency
- Short-lived
- Mostly used for nearby search
- Eventually consistent
👉 Interview Answer
Driver location updates are high-volume and short-lived.
I would store latest driver locations in a fast geo-indexed store, such as Redis Geo, S2 cells, or a geohash-based index.
The main relational database should store durable trip state, not every real-time location update.
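A minimal Python sketch of the "latest location only" idea, assuming an in-process dictionary stands in for the fast store. The class name `LatestLocationStore`, the 30-second TTL, and the rule of dropping out-of-order updates are illustrative assumptions, not part of the design above.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class DriverLocation:
    lat: float
    lng: float
    status: str
    updated_at: float  # seconds since epoch, taken from the update payload


class LatestLocationStore:
    """Keeps only the most recent location per driver (hypothetical helper)."""

    def __init__(self, ttl_seconds: float = 30.0):
        self.ttl_seconds = ttl_seconds  # assumed freshness window
        self._latest = {}               # driver_id -> DriverLocation

    def update(self, driver_id: str, loc: DriverLocation) -> bool:
        current = self._latest.get(driver_id)
        # Ignore delayed packets so they cannot overwrite newer data.
        if current is not None and loc.updated_at <= current.updated_at:
            return False
        self._latest[driver_id] = loc
        return True

    def get_fresh(self, driver_id: str, now: Optional[float] = None) -> Optional[DriverLocation]:
        now = time.time() if now is None else now
        loc = self._latest.get(driver_id)
        if loc is None or now - loc.updated_at > self.ttl_seconds:
            return None  # unknown driver or stale location
        return loc
```

Because only the newest point per driver is kept, write volume to this store does not grow with trip length; a separate stream can carry the full history if analytics needs it.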
6️⃣ Geo Indexing and Nearby Driver Search
Geohash / Grid-based Index
Convert location into a cell.
lat/lng → geohash / S2 cell
Then search:
pickup cell + neighboring cells
Search Flow
Rider requests ride
→ Convert pickup to geo cell
→ Find available drivers in nearby cells
→ Filter by distance and driver status
→ Rank drivers
→ Dispatch request
Candidate Filters
- Driver availability
- Distance to pickup
- ETA to pickup
- Vehicle type
- Driver rating
- Current assignment status
- Region constraints
👉 Interview Answer
To find nearby drivers, I would use a geo index such as geohash or S2 cells.
The system first searches the pickup cell and neighboring cells, then filters available drivers by status, vehicle type, distance, and ETA.
This avoids scanning all drivers.
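The cell-plus-neighbors search can be sketched with a plain latitude/longitude grid. This is a simplified stand-in for geohash or S2, not either library: the cell size `CELL_DEG`, the class `GridIndex`, and its method names are assumptions made for illustration.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # assumed cell size in degrees (about 1.1 km of latitude)


def cell_of(lat, lng):
    """Map a coordinate to an integer grid cell."""
    return (math.floor(lat / CELL_DEG), math.floor(lng / CELL_DEG))


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


class GridIndex:
    def __init__(self):
        self._cells = defaultdict(dict)  # cell -> {driver_id: (lat, lng)}
        self._driver_cell = {}           # driver_id -> current cell

    def upsert(self, driver_id, lat, lng):
        new_cell = cell_of(lat, lng)
        old_cell = self._driver_cell.get(driver_id)
        if old_cell is not None and old_cell != new_cell:
            del self._cells[old_cell][driver_id]  # driver moved to another cell
        self._cells[new_cell][driver_id] = (lat, lng)
        self._driver_cell[driver_id] = new_cell

    def nearby(self, lat, lng, radius_km):
        """Scan the pickup cell and its 8 neighbors, then filter by true distance."""
        cx, cy = cell_of(lat, lng)
        found = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                bucket = self._cells.get((cx + dx, cy + dy), {})
                for driver_id, (dlat, dlng) in bucket.items():
                    d = haversine_km(lat, lng, dlat, dlng)
                    if d <= radius_km:
                        found.append((d, driver_id))
        return [driver_id for _, driver_id in sorted(found)]
```

The 3x3 scan only guarantees coverage for radii up to roughly one cell width, and longitude cells narrow away from the equator, so a real index would widen the ring of cells or pick the cell level from the search radius.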
7️⃣ Matching and Dispatch
Simple Matching
Choose the available driver with the lowest ETA to pickup.
min(driver ETA to pickup)
Better Matching
Consider:
- ETA to pickup
- Driver acceptance probability
- Driver rating
- Rider preference
- Vehicle type
- Driver fairness
- Surge region
- Ongoing supply-demand balance
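One way to combine several of these signals is a single weighted score. The sketch below uses only ETA, acceptance probability, and rating; the `Candidate` fields and the weights are made-up illustrative values, not tuned numbers or a known production formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    driver_id: str
    eta_seconds: float
    acceptance_probability: float  # 0.0 to 1.0
    rating: float                  # 1.0 to 5.0


def score(c, w_accept=120.0, w_rating=10.0):
    """Lower is better: ETA in seconds, discounted for likely acceptance and rating.

    The weights are assumptions; they express how many seconds of ETA
    one unit of each signal is worth.
    """
    return c.eta_seconds - w_accept * c.acceptance_probability - w_rating * c.rating


def rank(candidates):
    """Return candidates ordered from best to worst."""
    return sorted(candidates, key=score)
```

With these weights, a driver who is 30 seconds farther away but much more likely to accept can outrank the closest driver, which tends to reduce re-dispatch delay.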
Dispatch Flow
Find candidate drivers
→ Rank candidates
→ Send request to top driver
→ Wait for response
→ If timeout/reject, try next driver
→ Confirm match
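The dispatch flow above can be sketched as a sequential offer loop. The `offer` callback and its return convention (True for accept, False for reject, None for timeout) are assumptions for illustration; a real system would send the offer over push or WebSocket and wait with a deadline.

```python
from typing import Callable, Iterable, Optional


def dispatch(
    ranked_driver_ids: Iterable[str],
    offer: Callable[[str], Optional[bool]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Offer the ride to drivers in rank order until one accepts.

    Returns the accepting driver's id, or None if nobody accepted
    within max_attempts offers.
    """
    for attempt, driver_id in enumerate(ranked_driver_ids):
        if attempt >= max_attempts:
            break
        # Reject (False) and timeout (None) are handled the same way: move on.
        if offer(driver_id) is True:
            return driver_id
    return None
```

Capping attempts bounds how long a rider waits in the matching state before the request is widened, retried, or cancelled.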
Avoid Double Assignment
Driver should not accept two rides at once.
Use:
conditional update / compare-and-swap
Example:
UPDATE driver
SET status = 'ASSIGNED'
WHERE driver_id = 'd456'
AND status = 'AVAILABLE';
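A small runnable sketch of that conditional update, using Python's built-in `sqlite3` as a stand-in for the real database. The helper name `claim_driver` is an assumption; the key point is that the caller checks the affected row count to learn whether it won the race.

```python
import sqlite3


def claim_driver(conn: sqlite3.Connection, driver_id: str) -> bool:
    """Atomically move a driver from AVAILABLE to ASSIGNED.

    Returns True only for the caller whose UPDATE actually changed the row.
    """
    cur = conn.execute(
        "UPDATE driver SET status = 'ASSIGNED' "
        "WHERE driver_id = ? AND status = 'AVAILABLE'",
        (driver_id,),
    )
    conn.commit()
    return cur.rowcount == 1


# Demo with an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE driver (driver_id TEXT PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO driver VALUES ('d456', 'AVAILABLE')")
conn.commit()

first = claim_driver(conn, "d456")   # wins the claim
second = claim_driver(conn, "d456")  # loses: the row is no longer AVAILABLE
```

The losing caller sees zero affected rows and should fall through to the next candidate driver rather than retrying the same one.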
👉 Interview Answer
Matching starts by finding nearby available drivers, then ranking them based on ETA, availability, vehicle type, and business constraints.
To avoid double assignment, accepting a ride should use an atomic conditional update so only one ride can claim a driver.
8️⃣ Trip State Machine
Common States
REQUESTED
MATCHING
DRIVER_ASSIGNED
DRIVER_ARRIVING
PICKED_UP
IN_PROGRESS
COMPLETED
CANCELLED
State Transition Rules
Examples:
REQUESTED → MATCHING
MATCHING → DRIVER_ASSIGNED
DRIVER_ASSIGNED → DRIVER_ARRIVING
DRIVER_ARRIVING → PICKED_UP
PICKED_UP → IN_PROGRESS
IN_PROGRESS → COMPLETED
Why State Machine Matters
- Prevent invalid transitions
- Support retries
- Support audit trail
- Handle cancellation correctly
- Coordinate payment and notifications
👉 Interview Answer
I would model the trip lifecycle as a state machine.
This prevents invalid transitions, makes retries safer, and gives us a clear audit trail.
Payment capture should only be triggered after the trip reaches the COMPLETED state; pre-authorization happens earlier, before the trip is confirmed.
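A minimal sketch of the state machine as a transition table. The forward transitions come from the rules listed above; the CANCELLED edges (allowed until pickup) are an assumption, since the notes do not specify when cancellation is permitted.

```python
ALLOWED = {
    "REQUESTED": {"MATCHING", "CANCELLED"},
    "MATCHING": {"DRIVER_ASSIGNED", "CANCELLED"},
    "DRIVER_ASSIGNED": {"DRIVER_ARRIVING", "CANCELLED"},
    "DRIVER_ARRIVING": {"PICKED_UP", "CANCELLED"},
    "PICKED_UP": {"IN_PROGRESS"},
    "IN_PROGRESS": {"COMPLETED"},
    "COMPLETED": set(),   # terminal
    "CANCELLED": set(),   # terminal
}


class InvalidTransition(Exception):
    pass


def can_transition(current: str, target: str) -> bool:
    """True if target is reachable from current, or is a no-op retry."""
    return target == current or target in ALLOWED[current]


def transition(current: str, target: str) -> str:
    """Return the new state, or raise if the move is not allowed.

    Re-applying the current state is treated as a safe retry.
    """
    if not can_transition(current, target):
        raise InvalidTransition(f"{current} -> {target}")
    return target
```

Treating a repeated request for the current state as a no-op is what makes client retries safe: a duplicate PICKED_UP call does not fail and does not advance the trip twice.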
9️⃣ ETA and Routing
ETA Use Cases
- Driver ETA to pickup
- Trip ETA to destination
- Estimated fare
- Matching ranking
- Rider UI updates
Inputs
- Road network
- Traffic conditions
- Driver location
- Pickup/dropoff
- Historical travel time
- Real-time speed data
Routing Service
pickup + driver location → ETA to pickup
pickup + dropoff → trip ETA
👉 Interview Answer
ETA is used for both user experience and matching.
The matching system should rank drivers by estimated time to pickup, not just straight-line distance.
ETA can be computed by a routing service using traffic, road network data, and historical travel time.
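When the routing service is unavailable, a rough ETA can still be derived from straight-line distance. The sketch below uses an equirectangular distance approximation; the average speed of 25 km/h and the detour factor of 1.4 are illustrative assumptions that would need per-city calibration.

```python
import math

EARTH_RADIUS_KM = 6371.0


def approx_distance_km(lat1, lng1, lat2, lng2):
    """Equirectangular approximation; adequate over city-scale distances."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    x = math.radians(lng2 - lng1) * math.cos(mean_lat)
    y = math.radians(lat2 - lat1)
    return EARTH_RADIUS_KM * math.hypot(x, y)


def fallback_eta_seconds(lat1, lng1, lat2, lng2, avg_speed_kmh=25.0, detour_factor=1.4):
    """Estimate travel time without a road network.

    detour_factor inflates straight-line distance to approximate road distance.
    """
    road_km = approx_distance_km(lat1, lng1, lat2, lng2) * detour_factor
    return road_km / avg_speed_kmh * 3600.0
```

For the pickup and dropoff in the Request Ride example, this gives a straight-line distance of a little over 5 km. The estimate is coarse, so it is better suited to ranking candidates than to showing a precise arrival time.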
🔟 Pricing and Surge
Base Price Components
- Base fare
- Distance
- Time
- Ride type
- Taxes / fees
- Surge multiplier
Surge Pricing
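Surge is based on the ratio of demand to available supply within a region. In its simplest form:
surge = demand / available_supply
In practice the multiplier is typically floored at 1 and capped, so that excess supply does not discount fares and scarce supply does not produce unbounded prices.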
Surge Region
Use geo cells or regions:
city → zone → geohash / S2 cell
Pricing Flow
Estimate route distance and time
→ Apply base fare formula
→ Apply surge multiplier
→ Return estimate
→ Final price calculated after completion
👉 Interview Answer
Pricing uses estimated distance, duration, ride type, and regional supply-demand conditions.
Surge pricing is calculated per region based on demand and available driver supply.
The system returns an estimated fare before the ride, and computes the final fare after trip completion.
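A worked sketch of the pricing flow. All rates, the floor of 1.0, the cap of 3.0, and the choice to apply surge to the metered portion but not to fees are assumptions for illustration, not actual pricing rules.

```python
from decimal import Decimal, ROUND_HALF_UP


def surge_multiplier(demand: int, available_supply: int, cap: Decimal = Decimal("3.0")) -> Decimal:
    """Demand/supply ratio, floored at 1.0 and capped."""
    if available_supply <= 0:
        return cap  # no drivers available: use the maximum multiplier
    ratio = Decimal(demand) / Decimal(available_supply)
    return min(max(ratio, Decimal("1.0")), cap)


def estimate_fare(
    distance_km: Decimal,
    duration_min: Decimal,
    surge: Decimal,
    base: Decimal = Decimal("2.50"),
    per_km: Decimal = Decimal("1.20"),
    per_min: Decimal = Decimal("0.30"),
    fees: Decimal = Decimal("1.00"),
) -> Decimal:
    """Base + distance + time, scaled by surge, plus flat fees."""
    metered = base + per_km * distance_km + per_min * duration_min
    total = metered * surge + fees
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

For a 5 km, 18 minute trip with 30 requests and 20 available drivers: the metered part is 2.50 + 6.00 + 5.40 = 13.90, surge is 1.5, and the estimate is 13.90 × 1.5 + 1.00 = 21.85. `Decimal` is used instead of floats so money amounts round predictably.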
1️⃣1️⃣ Payment Flow
Payment Timing
Typical flow:
Pre-authorize payment before ride
→ Capture payment after completion
Why Pre-authorization?
- Validate payment method
- Reduce payment failure risk
- Improve driver payout reliability
Payment Flow
Ride requested
→ Payment service pre-authorizes amount
→ Ride completed
→ Final price calculated
→ Payment captured
→ Receipt sent
→ Driver payout recorded
Correctness
Payment needs stronger consistency than location.
Use:
- Idempotency keys
- Transaction records
- Payment state machine
- Retry with provider-safe semantics
👉 Interview Answer
Payment requires stronger correctness.
I would pre-authorize the rider’s payment method before confirming the trip, and capture the final amount after the trip completes.
Payment operations should be idempotent to avoid duplicate charges.
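A minimal sketch of idempotent capture, assuming an in-memory dictionary stands in for a durable transaction table. `PaymentLedger` and its fields are hypothetical names, and the counter stands in for a real call to the payment provider.

```python
class PaymentLedger:
    """Records capture results by idempotency key (hypothetical helper)."""

    def __init__(self):
        self._by_key = {}        # idempotency_key -> transaction record
        self.provider_calls = 0  # stands in for real provider requests

    def capture(self, idempotency_key: str, ride_id: str, amount_cents: int) -> dict:
        existing = self._by_key.get(idempotency_key)
        if existing is not None:
            # Same key with different parameters is a client bug, not a retry.
            if existing["ride_id"] != ride_id or existing["amount_cents"] != amount_cents:
                raise ValueError("idempotency key reused with different parameters")
            return existing  # replay the stored result; do not charge again
        self.provider_calls += 1
        record = {"ride_id": ride_id, "amount_cents": amount_cents, "status": "CAPTURED"}
        self._by_key[idempotency_key] = record
        return record
```

A natural key is one derived from the ride, for example the ride id plus the operation name, so that every retry of the same capture maps to the same record.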
1️⃣2️⃣ Notifications
Notification Types
- Ride request sent to driver
- Driver accepted
- Driver arriving
- Driver arrived
- Trip started
- Trip completed
- Payment receipt
- Cancellation
Channels
- Push notification
- SMS fallback
- In-app real-time updates
- Email receipt
👉 Interview Answer
Notifications are critical for ride coordination.
I would use push and in-app updates for real-time ride status, SMS as fallback for important events, and email for receipts.
Notification delivery should be asynchronous and retryable.
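A simplified sketch of retry with channel fallback. It is synchronous for brevity; a real implementation would run from a queue with backoff between attempts. The `deliver` function and the `(name, send)` channel shape are assumptions for illustration.

```python
def deliver(message, channels, max_attempts_per_channel=2):
    """Try each channel in order, retrying before falling back to the next.

    channels: ordered list of (name, send) pairs, where send(message)
    returns True on success and may raise on transport errors.
    Returns the name of the channel that succeeded, or None.
    """
    for name, send in channels:
        for _ in range(max_attempts_per_channel):
            try:
                if send(message):
                    return name
            except Exception:
                pass  # treat transport errors like a failed attempt
    return None
```

Ordering the channels as push first and SMS second matches the fallback policy described above, and returning the channel name makes delivery outcomes easy to record as metrics.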
1️⃣3️⃣ Real-time Tracking
Rider Tracking Driver
Driver app sends location
→ Location service updates geo store
→ Tracking service pushes updates
→ Rider app updates map
Driver Tracking Rider Pickup
Rider location can also be shared temporarily.
Protocol
Use:
- WebSocket
- Server-sent events
- Mobile push for background updates
👉 Interview Answer
For real-time tracking, the driver app sends location updates every few seconds.
The tracking service pushes relevant updates to the rider app, usually through WebSocket or another real-time channel.
These updates are eventually consistent, because slight location delay is acceptable.
1️⃣4️⃣ Scaling Patterns
Pattern 1: Separate Location and Trip State
- Location = high-volume, ephemeral
- Trip state = durable, strongly consistent
Pattern 2: Geo-sharded Location Store
Shard by:
geohash / S2 cell / region
Pattern 3: Async Dispatch Queue
Use queues for matching attempts.
ride request → dispatch queue → matching workers
Pattern 4: Regional Architecture
Ride matching is local.
city / region based deployment
Benefits:
- Lower latency
- Smaller search space
- Better fault isolation
Pattern 5: Event-driven Trip Updates
ride state change → event bus → notification/payment/analytics
👉 Interview Answer
To scale the system, I would separate real-time location from durable trip state, shard the location store by region or geo cell, and run matching regionally.
Ride matching is naturally local, so regional architecture reduces latency and improves fault isolation.
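A minimal sketch of routing a geo cell to a shard with a stable hash. The function name and the use of SHA-256 are assumptions; the cell id can be any string key such as a geohash prefix.

```python
import hashlib


def shard_for_cell(cell_id: str, num_shards: int) -> int:
    """Map a geo cell id to a shard index using a process-independent hash."""
    digest = hashlib.sha256(cell_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Python's built-in `hash()` is randomized per process for strings, so a cryptographic digest is used to keep routing stable across servers. Plain modulo sharding moves most keys when the shard count changes, and dense downtown cells can become hot spots, so consistent hashing or explicit cell-to-shard assignment may be preferable at scale.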
1️⃣5️⃣ Failure Handling
Common Failures
- Driver location update delayed
- Driver accepts but request times out
- Double assignment race
- Payment pre-auth fails
- Driver app disconnects
- Rider cancels during matching
- Dispatch queue backlog
- Routing service unavailable
Strategies
- Use driver location TTL
- Atomic driver assignment
- Retry dispatch to next driver
- Idempotent ride state transitions
- Fallback ETA approximation
- Cancel stale matching requests
- Payment idempotency
- Reconcile trip state with events
👉 Interview Answer
The system should handle failures through idempotency, timeouts, retries, and clear state transitions.
Driver locations should have TTLs, so stale drivers are not matched.
Driver assignment must be atomic to prevent one driver from being assigned to multiple trips.
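Reconciling trip state with events can be sketched as replaying the ride_event rows and comparing the result with the stored state. The event type names in `EVENT_TO_STATE` are hypothetical; the notes do not define an event vocabulary.

```python
# Hypothetical mapping from event types to the state they imply.
EVENT_TO_STATE = {
    "RIDE_REQUESTED": "REQUESTED",
    "MATCHING_STARTED": "MATCHING",
    "DRIVER_ASSIGNED": "DRIVER_ASSIGNED",
    "RIDER_PICKED_UP": "PICKED_UP",
    "TRIP_COMPLETED": "COMPLETED",
    "RIDE_CANCELLED": "CANCELLED",
}


def state_from_events(events):
    """Replay events in time order and return the state they imply."""
    state = None
    for event in sorted(events, key=lambda e: e["created_at"]):
        # Events that do not change state (for example, notes) are skipped.
        state = EVENT_TO_STATE.get(event["event_type"], state)
    return state


def needs_repair(stored_state, events):
    """True if the ride row disagrees with its own event history."""
    return state_from_events(events) != stored_state
```

A periodic job could run this check over recently updated rides and flag mismatches for repair, which catches cases where a state write succeeded but a follow-up step was lost.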
1️⃣6️⃣ Consistency Model
Stronger Consistency Needed For
- Ride state transitions
- Driver assignment
- Payment
- Cancellation
- Driver payout
- Access to trip details
Eventual Consistency Acceptable For
- Driver location
- ETA updates
- Map tracking
- Surge estimate
- Analytics
- Notifications
👉 Interview Answer
Not all parts of the system need the same consistency.
Driver location and ETA can be eventually consistent, because small delays are acceptable.
Ride assignment, trip state, payment, and cancellation require stronger correctness.
1️⃣7️⃣ Observability
Key Metrics
- Ride request rate
- Match success rate
- Match latency
- Driver acceptance rate
- ETA accuracy
- Location update lag
- Dispatch timeout rate
- Cancellation rate
- Payment failure rate
- Surge region imbalance
- Trip state transition errors
👉 Interview Answer
I would monitor match latency, match success rate, driver acceptance rate, location update lag, ETA accuracy, cancellation rate, and payment failure rate.
These metrics directly reflect marketplace health and rider/driver experience.
1️⃣8️⃣ End-to-End Flow
Ride Request Flow
Rider requests ride
→ Validate rider and payment method
→ Estimate price and ETA
→ Find nearby available drivers
→ Dispatch to best driver
→ Driver accepts
→ Create confirmed trip
→ Notify rider and driver
Trip Flow
Driver goes to pickup
→ Rider tracks driver
→ Driver arrives
→ Rider picked up
→ Trip starts
→ Driver navigates to destination
→ Trip completed
→ Final fare calculated
→ Payment captured
→ Receipt sent
Location Flow
Driver sends location every few seconds
→ Location service updates geo index
→ Tracking service streams update to rider
→ Matching service uses latest available drivers
Key Insight
Ride Sharing is a real-time marketplace system, not just a map application.
🧠 Staff-Level Answer (Final)
👉 Interview Answer (Full Version)
When designing a ride-sharing system like Uber, I think of it as a real-time marketplace connecting riders and drivers.
The system has three critical flows: driver location updates, rider trip requests, and rider-driver matching.
Driver locations are high-volume and short-lived, so I would store them in a fast geo-indexed store using geohash, S2 cells, or Redis Geo.
Ride state, payment, and trip history should be stored durably because they require stronger correctness.
For matching, the system finds nearby available drivers by searching the pickup geo cell and neighboring cells. Then it ranks candidates by ETA to pickup, vehicle type, driver availability, acceptance probability, and business constraints.
To avoid double assignment, driver acceptance must use an atomic conditional update.
I would model each ride as a state machine, with states such as requested, matching, driver assigned, picked up, in progress, completed, and cancelled.
ETA and pricing are computed using routing, traffic data, distance, time, ride type, and surge multiplier.
Payment should be pre-authorized before the ride and captured after completion, with idempotency to prevent duplicate charges.
Real-time tracking can use WebSocket or streaming updates, but location data can be eventually consistent.
The main trade-offs are matching latency, ETA accuracy, location freshness, consistency, and marketplace efficiency.
Ultimately, the goal is to quickly and reliably match riders with nearby drivers, manage trip state correctly, and provide a smooth real-time experience.
⭐ Final Insight
The core of ride sharing is not the map; it is a marketplace platform that combines real-time supply-demand matching, geo indexing, a state machine, and a payment system.
Implement