🎯 Design Ride Sharing (Uber): System Design Deep Dive

1️⃣ Core Framework

When discussing Ride Sharing design, I frame it as:

Core flows: request ride, match driver, track trip
Real-time location system
Driver discovery and geo indexing
Matching and dispatch algorithm
Trip state machine
ETA, routing, and pricing
Payment and notifications
Trade-offs: latency vs accuracy vs consistency

2️⃣ Core Requirements

Functional Requirements

Rider can request a ride
Driver can go online / offline
System can find nearby drivers
System can match rider with driver
Rider and driver can track each other in real time
Support trip lifecycle: requested, accepted, picked up, completed
Support pricing and ETA
Support payment
Support cancellation
Support notifications

Non-functional Requirements

Low-latency matching
Real-time location updates
High availability
Scalable geo search
Reliable trip state management
Strong payment correctness
Eventually consistent location updates are acceptable

👉 Interview Answer

A ride-sharing system has three critical real-time flows: driver location updates, rider trip requests, and rider-driver matching.

The main challenge is finding the best nearby driver quickly, while handling real-time location changes, trip state transitions, pricing, payment, and system failures.

3️⃣ Main APIs

Driver Location Update

POST /api/drivers/{driverId}/location

Request:

{
  "lat": 40.7128,
  "lng": -74.0060,
  "status": "AVAILABLE",
  "timestamp": "2026-05-02T10:00:00Z"
}

Request Ride

POST /api/rides

Request:

{
  "riderId": "r123",
  "pickup": {
    "lat": 40.7128,
    "lng": -74.0060
  },
  "dropoff": {
    "lat": 40.7580,
    "lng": -73.9855
  },
  "rideType": "standard"
}

Accept Ride

POST /api/rides/{rideId}/accept

Request:

{
  "driverId": "d456"
}

Update Trip State

POST /api/rides/{rideId}/state

Request:

{
  "state": "PICKED_UP"
}

Get Ride Status

GET /api/rides/{rideId}

👉 Interview Answer

I would separate APIs for driver location updates, ride creation, ride acceptance, trip state updates, and ride status retrieval.

Location updates are high-volume and eventually consistent, while ride acceptance and payment require stronger correctness.
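As a minimal sketch of the server side of the location update API, the request body shown earlier could be validated like this. The set of status values and the `LocationUpdate` / `parse_location_update` names are assumptions for illustration; the notes only show "AVAILABLE" in the payload.

```python
import json
from dataclasses import dataclass
from datetime import datetime

# Assumed status values; the notes only show "AVAILABLE" in the request body.
VALID_STATUSES = {"AVAILABLE", "ASSIGNED", "OFFLINE"}


@dataclass(frozen=True)
class LocationUpdate:
    lat: float
    lng: float
    status: str
    timestamp: datetime


def parse_location_update(body: str) -> LocationUpdate:
    """Validate the JSON body of POST /api/drivers/{driverId}/location."""
    data = json.loads(body)
    lat, lng = float(data["lat"]), float(data["lng"])
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lng <= 180.0):
        raise ValueError("coordinates out of range")
    if data["status"] not in VALID_STATUSES:
        raise ValueError("unknown driver status")
    # fromisoformat() only accepts a trailing "Z" on Python 3.11+, so normalize it.
    ts = datetime.fromisoformat(data["timestamp"].replace("Z", "+00:00"))
    return LocationUpdate(lat, lng, data["status"], ts)
```

Because location updates are high-volume, validation is kept cheap: range checks and one timestamp parse, with no database lookup on this path.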

4️⃣ Data Model

Rider Table

rider (
  rider_id VARCHAR PRIMARY KEY,
  name VARCHAR,
  phone VARCHAR,
  payment_profile_id VARCHAR,
  created_at TIMESTAMP
)

Driver Table

driver (
  driver_id VARCHAR PRIMARY KEY,
  name VARCHAR,
  phone VARCHAR,
  vehicle_id VARCHAR,
  status VARCHAR,
  created_at TIMESTAMP
)

Driver Location Store

driver_location (
  driver_id VARCHAR PRIMARY KEY,
  lat DOUBLE,
  lng DOUBLE,
  geohash VARCHAR,
  status VARCHAR,
  updated_at TIMESTAMP
)

Ride Table

ride (
  ride_id VARCHAR PRIMARY KEY,
  rider_id VARCHAR,
  driver_id VARCHAR,
  pickup_lat DOUBLE,
  pickup_lng DOUBLE,
  dropoff_lat DOUBLE,
  dropoff_lng DOUBLE,
  state VARCHAR,
  price_estimate DECIMAL,
  final_price DECIMAL,
  created_at TIMESTAMP,
  updated_at TIMESTAMP
)

Ride Event Table

ride_event (
  event_id VARCHAR PRIMARY KEY,
  ride_id VARCHAR,
  actor_id VARCHAR,
  event_type VARCHAR,
  created_at TIMESTAMP,
  metadata JSON
)

👉 Interview Answer

I would store riders, drivers, trips, driver locations, and ride events separately.

Driver location is high-volume and frequently updated, so it should be stored in a fast geo-indexed store.

Ride state should be stored durably, and ride events can be used for auditing and debugging.

5️⃣ Real-time Location System

Driver Location Updates

Drivers send location every few seconds.

Driver App
→ Location Service
→ Geo Index
→ Location Stream

Storage Choice

Use fast in-memory or geo-enabled storage:

Redis Geo
Geohash index
S2 cells
Elastic geo index
Custom in-memory geo grid

Why Not Store Every Location in Main DB?

Because location updates are:

High frequency
Short-lived
Mostly used for nearby search
Eventually consistent

👉 Interview Answer

Driver location updates are high-volume and short-lived.

I would store latest driver locations in a fast geo-indexed store, such as Redis Geo, S2 cells, or a geohash-based index.

The main relational database should store durable trip state, not every real-time location update.
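A minimal in-memory sketch of "store only the latest location per driver" follows. The 30-second freshness window and the class names are assumptions; a production system would more likely use Redis or another shared store rather than a Python dict.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class DriverLocation:
    driver_id: str
    lat: float
    lng: float
    status: str
    updated_at: float  # seconds since epoch


class LatestLocationStore:
    """Keeps only the most recent location per driver and hides stale entries."""

    def __init__(self, ttl_seconds: float = 30.0) -> None:
        self.ttl_seconds = ttl_seconds  # assumed freshness window
        self._by_driver: Dict[str, DriverLocation] = {}

    def update(self, loc: DriverLocation) -> None:
        current = self._by_driver.get(loc.driver_id)
        # Ignore out-of-order updates that are older than what we already hold.
        if current is None or loc.updated_at >= current.updated_at:
            self._by_driver[loc.driver_id] = loc

    def get(self, driver_id: str, now: Optional[float] = None) -> Optional[DriverLocation]:
        now = time.time() if now is None else now
        loc = self._by_driver.get(driver_id)
        if loc is None or now - loc.updated_at > self.ttl_seconds:
            return None  # unknown driver, or location too old to trust
        return loc
```

Overwriting in place keeps the store size proportional to the number of online drivers, not to the number of updates received.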

6️⃣ Geo Indexing and Nearby Driver Search

Geohash / Grid-based Index

Convert location into a cell.

lat/lng → geohash / S2 cell

Then search:

pickup cell + neighboring cells

Search Flow

Rider requests ride
→ Convert pickup to geo cell
→ Find available drivers in nearby cells
→ Filter by distance and driver status
→ Rank drivers
→ Dispatch request

Candidate Filters

Driver availability
Distance to pickup
ETA to pickup
Vehicle type
Driver rating
Current assignment status
Region constraints

👉 Interview Answer

To find nearby drivers, I would use a geo index such as geohash or S2 cells.

The system first searches the pickup cell and neighboring cells, then filters available drivers by status, vehicle type, distance, and ETA.

This avoids scanning all drivers.
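The cell-plus-neighbors search can be sketched with a plain fixed-size lat/lng grid. This is a simplification standing in for geohash or S2, not an implementation of either; the 0.01-degree cell size and all names are assumptions.

```python
import math
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]
CELL_DEG = 0.01  # assumed cell size, roughly 1 km of latitude


def cell_of(lat: float, lng: float) -> Cell:
    return (math.floor(lat / CELL_DEG), math.floor(lng / CELL_DEG))


def haversine_km(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance between two points in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


class GridIndex:
    def __init__(self) -> None:
        self._cells: Dict[Cell, Set[str]] = defaultdict(set)
        self._pos: Dict[str, Tuple[float, float]] = {}

    def upsert(self, driver_id: str, lat: float, lng: float) -> None:
        old = self._pos.get(driver_id)
        if old is not None:
            # Remove the driver from the cell they were in before moving.
            self._cells[cell_of(*old)].discard(driver_id)
        self._pos[driver_id] = (lat, lng)
        self._cells[cell_of(lat, lng)].add(driver_id)

    def nearby(self, lat: float, lng: float, radius_km: float) -> List[Tuple[str, float]]:
        """Scan the pickup cell and its 8 neighbors, then filter by true distance.

        Only valid for radii up to about one cell width; larger radii would
        need more rings of neighbors.
        """
        cx, cy = cell_of(lat, lng)
        found: List[Tuple[str, float]] = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for driver_id in self._cells.get((cx + dx, cy + dy), ()):
                    d = haversine_km(lat, lng, *self._pos[driver_id])
                    if d <= radius_km:
                        found.append((driver_id, d))
        return sorted(found, key=lambda pair: pair[1])
```

The work per query depends on driver density in nine cells, not on the total number of drivers. Status and vehicle-type filters would be applied to the returned candidates.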

7️⃣ Matching and Dispatch

Simple Matching

Choose the available driver with the lowest ETA to pickup.

min(driver ETA to pickup)

Better Matching

Consider:

ETA to pickup
Driver acceptance probability
Driver rating
Rider preference
Vehicle type
Driver fairness
Surge region
Ongoing supply-demand balance

Dispatch Flow

Find candidate drivers
→ Rank candidates
→ Send request to top driver
→ Wait for response
→ If timeout/reject, try next driver
→ Confirm match
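The dispatch flow above can be sketched as a sequential loop over ranked candidates. The `offer` callback is hypothetical: it stands for "send the request to the driver and wait for a response or a timeout". The 15-second timeout and the attempt cap are also assumptions.

```python
from typing import Callable, Iterable, Optional

# Hypothetical callback: sends the offer and blocks until the driver answers
# or the timeout elapses. Returns True on accept, False on reject, None on timeout.
OfferFn = Callable[[str, str, float], Optional[bool]]


def dispatch(
    ride_id: str,
    ranked_driver_ids: Iterable[str],
    offer: OfferFn,
    timeout_s: float = 15.0,
    max_attempts: int = 5,
) -> Optional[str]:
    """Offer the ride to one driver at a time; return the first who accepts."""
    for attempt, driver_id in enumerate(ranked_driver_ids):
        if attempt >= max_attempts:
            break  # give up; the caller can re-run the search or fail the request
        if offer(ride_id, driver_id, timeout_s) is True:
            return driver_id
        # Reject and timeout are treated the same: move on to the next candidate.
    return None
```

Offering to one driver at a time keeps the logic simple but adds latency on each timeout. An accepted offer would still have to pass an atomic driver-assignment check before the match is confirmed.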

Avoid Double Assignment

A driver should not be able to accept two rides at once.

Use:

conditional update / compare-and-swap

Example:

UPDATE driver
SET status = 'ASSIGNED'
WHERE driver_id = 'd456'
AND status = 'AVAILABLE';
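The conditional update only protects against double assignment if the caller checks how many rows were changed. A minimal sketch using Python's built-in sqlite3 module, assuming a `driver` table with `driver_id` and `status` columns as in the data model (the function name is illustrative):

```python
import sqlite3


def try_assign_driver(conn: sqlite3.Connection, driver_id: str) -> bool:
    """Atomically flip AVAILABLE -> ASSIGNED; True only for the request that wins."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE driver SET status = 'ASSIGNED' "
            "WHERE driver_id = ? AND status = 'AVAILABLE'",
            (driver_id,),
        )
    # rowcount is 1 if this request claimed the driver, 0 if the driver was
    # already assigned (or does not exist).
    return cur.rowcount == 1
```

If two accept requests race, the database serializes the two updates: the first one matches the `status = 'AVAILABLE'` predicate and the second one updates zero rows, so it should be answered with a conflict.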

👉 Interview Answer

Matching starts by finding nearby available drivers, then ranking them based on ETA, availability, vehicle type, and business constraints.

To avoid double assignment, accepting a ride should use an atomic conditional update so only one ride can claim a driver.

8️⃣ Trip State Machine

Common States

REQUESTED
MATCHING
DRIVER_ASSIGNED
DRIVER_ARRIVING
PICKED_UP
IN_PROGRESS
COMPLETED
CANCELLED

State Transition Rules

Examples:

REQUESTED → MATCHING
MATCHING → DRIVER_ASSIGNED
DRIVER_ASSIGNED → DRIVER_ARRIVING
DRIVER_ARRIVING → PICKED_UP
PICKED_UP → IN_PROGRESS
IN_PROGRESS → COMPLETED

Why State Machine Matters

Prevent invalid transitions
Support retries
Support audit trail
Handle cancellation correctly
Coordinate payment and notifications

👉 Interview Answer

I would model the trip lifecycle as a state machine.

This prevents invalid transitions, makes retries safer, and gives us a clear audit trail.

Payment capture should only be triggered after the trip reaches the COMPLETED state; pre-authorization happens earlier, when the ride is requested.
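A minimal sketch of the trip state machine with an explicit transition table. The forward edges come from the transition rules listed in this section. The cancellation edges are an assumption (cancellable any time before pickup), since the notes do not spell them out.

```python
from enum import Enum
from typing import Dict, FrozenSet


class RideState(Enum):
    REQUESTED = "REQUESTED"
    MATCHING = "MATCHING"
    DRIVER_ASSIGNED = "DRIVER_ASSIGNED"
    DRIVER_ARRIVING = "DRIVER_ARRIVING"
    PICKED_UP = "PICKED_UP"
    IN_PROGRESS = "IN_PROGRESS"
    COMPLETED = "COMPLETED"
    CANCELLED = "CANCELLED"


S = RideState

# Forward edges, taken from the transition rules in the notes.
_FORWARD: Dict[RideState, RideState] = {
    S.REQUESTED: S.MATCHING,
    S.MATCHING: S.DRIVER_ASSIGNED,
    S.DRIVER_ASSIGNED: S.DRIVER_ARRIVING,
    S.DRIVER_ARRIVING: S.PICKED_UP,
    S.PICKED_UP: S.IN_PROGRESS,
    S.IN_PROGRESS: S.COMPLETED,
}

# Assumption: a ride can be cancelled at any point before pickup.
_CANCELLABLE = {S.REQUESTED, S.MATCHING, S.DRIVER_ASSIGNED, S.DRIVER_ARRIVING}


def allowed_targets(state: RideState) -> FrozenSet[RideState]:
    targets = set()
    if state in _FORWARD:
        targets.add(_FORWARD[state])
    if state in _CANCELLABLE:
        targets.add(S.CANCELLED)
    return frozenset(targets)


def can_transition(current: RideState, target: RideState) -> bool:
    return target in allowed_targets(current)


def apply_transition(current: RideState, target: RideState) -> RideState:
    if current is target:
        return current  # a retried request is a no-op, which keeps retries safe
    if not can_transition(current, target):
        raise ValueError(f"invalid transition {current.name} -> {target.name}")
    return target
```

COMPLETED and CANCELLED have no outgoing edges, so a late or duplicated event cannot move a finished ride.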

9️⃣ ETA and Routing

ETA Use Cases

Driver ETA to pickup
Trip ETA to destination
Estimated fare
Matching ranking
Rider UI updates

Inputs

Road network
Traffic conditions
Driver location
Pickup/dropoff
Historical travel time
Real-time speed data

Routing Service

pickup + driver location → ETA to pickup
pickup + dropoff → trip ETA

👉 Interview Answer

ETA is used for both user experience and matching.

The matching system should rank drivers by estimated time to pickup, not just straight-line distance.

ETA can be computed by a routing service using traffic, road network data, and historical travel time.

🔟 Pricing and Surge

Base Price Components

Base fare
Distance
Time
Ride type
Taxes / fees
Surge multiplier

Surge Pricing

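Surge is based on the ratio of demand to available supply within a region.

surge = demand / available_supply

The raw ratio is only a signal: a usable multiplier is typically floored at 1.0 when supply exceeds demand, capped at some maximum, and guarded against zero available supply.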

Surge Region

Use geo cells or regions:

city → zone → geohash / S2 cell

Pricing Flow

Estimate route distance and time
→ Apply base fare formula
→ Apply surge multiplier
→ Return estimate
→ Final price calculated after completion

👉 Interview Answer

Pricing uses estimated distance, duration, ride type, and regional supply-demand conditions.

Surge pricing is calculated per region based on demand and available driver supply.

The system returns an estimated fare before the ride, and computes the final fare after trip completion.
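The pricing flow can be sketched as two small functions. Every rate here (base fare, per-km, per-minute, fees, the 3.0 cap) is a made-up placeholder, and applying surge to the metered part but not to fees is an assumption, not something the notes specify.

```python
from decimal import Decimal, ROUND_HALF_UP


def surge_multiplier(demand: int, available_supply: int,
                     cap: Decimal = Decimal("3.0")) -> Decimal:
    """Demand/supply ratio, floored at 1.0 and capped (cap value is assumed)."""
    if available_supply <= 0:
        return cap  # no drivers available: avoid dividing by zero
    ratio = Decimal(demand) / Decimal(available_supply)
    return min(max(ratio, Decimal("1.0")), cap)


def estimate_fare(
    distance_km: Decimal,
    duration_min: Decimal,
    surge: Decimal,
    base_fare: Decimal = Decimal("2.50"),  # hypothetical rates
    per_km: Decimal = Decimal("1.20"),
    per_min: Decimal = Decimal("0.30"),
    fees: Decimal = Decimal("1.00"),
) -> Decimal:
    metered = base_fare + per_km * distance_km + per_min * duration_min
    total = metered * surge + fees  # assumption: fees are not surged
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Decimal is used instead of float so that money amounts round predictably. The same function can compute the final fare after completion by passing the actual distance and duration.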

1️⃣1️⃣ Payment Flow

Payment Timing

Typical flow:

Pre-authorize payment before ride
→ Capture payment after completion

Why Pre-authorization?

Validate payment method
Reduce payment failure risk
Improve driver payout reliability

Payment Flow

Ride requested
→ Payment service pre-authorizes amount
→ Ride completed
→ Final price calculated
→ Payment captured
→ Receipt sent
→ Driver payout recorded

Correctness

Payment needs stronger consistency than location.

Use:

Idempotency keys
Transaction records
Payment state machine
Retry with provider-safe semantics

👉 Interview Answer

Payment requires stronger correctness.

I would pre-authorize the rider’s payment method before confirming the trip, and capture the final amount after the trip completes.

Payment operations should be idempotent to avoid duplicate charges.
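A minimal sketch of idempotent capture. `PaymentService` and `_charge_provider` are hypothetical stand-ins for a real payment provider client, and the in-memory dict stands in for a durable table with a unique constraint on the idempotency key.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict


@dataclass(frozen=True)
class CaptureResult:
    ride_id: str
    amount: Decimal
    provider_charge_id: str


class PaymentService:
    def __init__(self) -> None:
        self._results: Dict[str, CaptureResult] = {}
        self.provider_calls = 0  # exposed only to make the behavior observable

    def _charge_provider(self, ride_id: str, amount: Decimal) -> str:
        # Placeholder for the real provider call.
        self.provider_calls += 1
        return f"ch_{self.provider_calls}"

    def capture(self, idempotency_key: str, ride_id: str, amount: Decimal) -> CaptureResult:
        previous = self._results.get(idempotency_key)
        if previous is not None:
            if (previous.ride_id, previous.amount) != (ride_id, amount):
                raise ValueError("idempotency key reused with different parameters")
            return previous  # retry: return the stored result, do not charge again
        result = CaptureResult(ride_id, amount, self._charge_provider(ride_id, amount))
        self._results[idempotency_key] = result
        return result
```

A key derived from the ride and the operation (for example one capture key per ride) means a retried request after a timeout maps to the same record. This sketch is single-process only; concurrent callers would need the check and insert to be atomic in the database.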

1️⃣2️⃣ Notifications

Notification Types

Ride request sent to driver
Driver accepted
Driver arriving
Driver arrived
Trip started
Trip completed
Payment receipt
Cancellation

Channels

Push notification
SMS fallback
In-app real-time updates
Email receipt

👉 Interview Answer

Notifications are critical for ride coordination.

I would use push and in-app updates for real-time ride status, SMS as fallback for important events, and email for receipts.

Notification delivery should be asynchronous and retryable.

1️⃣3️⃣ Real-time Tracking

Rider Tracking Driver

Driver app sends location
→ Location service updates geo store
→ Tracking service pushes updates
→ Rider app updates map

Driver Tracking Rider Pickup

The rider's location can also be shared with the driver temporarily, to help with pickup.

Protocol

Use:

WebSocket
Server-sent events
Mobile push for background updates

👉 Interview Answer

For real-time tracking, the driver app sends location updates every few seconds.

The tracking service pushes relevant updates to the rider app, usually through WebSocket or another real-time channel.

These updates can be eventually consistent, because a slight delay in the displayed location is acceptable.

1️⃣4️⃣ Scaling Patterns

Pattern 1: Separate Location and Trip State

Location = high-volume, ephemeral
Trip state = durable, strongly consistent

Pattern 2: Geo-sharded Location Store

Shard by:

geohash / S2 cell / region

Pattern 3: Async Dispatch Queue

Use queues for matching attempts.

ride request → dispatch queue → matching workers

Pattern 4: Regional Architecture

Ride matching is local.

city / region based deployment

Benefits:

Lower latency
Smaller search space
Better fault isolation

Pattern 5: Event-driven Trip Updates

ride state change → event bus → notification/payment/analytics

👉 Interview Answer

To scale the system, I would separate real-time location from durable trip state, shard the location store by region or geo cell, and run matching regionally.

Ride matching is naturally local, so regional architecture reduces latency and improves fault isolation.
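The event-driven pattern (ride state change → event bus → consumers) can be illustrated with a synchronous in-process bus. This is only a stand-in: a real deployment would use a durable message broker, and the event type name and handler names below are assumptions.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Handler = Callable[[Dict[str, str]], None]


class EventBus:
    """Synchronous in-process stand-in for a real event bus."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)
        self.failed: List[str] = []  # names of handlers that raised

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, event: Dict[str, str]) -> None:
        for handler in list(self._handlers.get(event_type, [])):
            try:
                handler(event)
            except Exception:
                # One failing consumer must not block the others; a real bus
                # would retry or dead-letter instead of just recording it.
                self.failed.append(handler.__name__)
```

The trip service only publishes the state change. Notification, payment, and analytics consumers subscribe independently, so adding a consumer does not touch the trip write path.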

1️⃣5️⃣ Failure Handling

Common Failures

Driver location update delayed
Driver accepts but request times out
Double assignment race
Payment pre-auth fails
Driver app disconnects
Rider cancels during matching
Dispatch queue backlog
Routing service unavailable

Strategies

Use driver location TTL
Atomic driver assignment
Retry dispatch to next driver
Idempotent ride state transitions
Fallback ETA approximation
Cancel stale matching requests
Payment idempotency
Reconcile trip state with events

👉 Interview Answer

The system should handle failures through idempotency, timeouts, retries, and clear state transitions.

Driver locations should have TTLs, so stale drivers are not matched.

Driver assignment must be atomic to prevent one driver from being assigned to multiple trips.
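For the "fallback ETA approximation" strategy, one possible sketch is straight-line distance scaled by a detour factor and divided by an average speed. The 25 km/h speed and 1.4 detour factor are assumed placeholders that would need tuning per city; this is a degraded-mode estimate for when the routing service is unavailable, not a replacement for it.

```python
import math
from typing import Tuple

EARTH_RADIUS_KM = 6371.0
Point = Tuple[float, float]  # (lat, lng) in degrees


def straight_line_km(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Equirectangular approximation; adequate over a few kilometres."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    x = math.radians(lng2 - lng1) * math.cos(mean_lat)
    y = math.radians(lat2 - lat1)
    return EARTH_RADIUS_KM * math.hypot(x, y)


def fallback_eta_seconds(
    driver: Point,
    pickup: Point,
    avg_speed_kmh: float = 25.0,  # assumed average urban speed
    detour_factor: float = 1.4,   # assumed ratio of road distance to straight line
) -> float:
    road_km = straight_line_km(*driver, *pickup) * detour_factor
    return road_km / avg_speed_kmh * 3600.0
```

The estimate ignores traffic and one-way streets, so it is best used only to keep matching alive during an outage, with results flagged as approximate.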

1️⃣6️⃣ Consistency Model

Stronger Consistency Needed For

Ride state transitions
Driver assignment
Payment
Cancellation
Driver payout
Access to trip details

Eventual Consistency Acceptable For

Driver location
ETA updates
Map tracking
Surge estimate
Analytics
Notifications

👉 Interview Answer

Not all parts of the system need the same consistency.

Driver location and ETA can be eventually consistent, because small delays are acceptable.

Ride assignment, trip state, payment, and cancellation require stronger correctness.

1️⃣7️⃣ Observability

Key Metrics

Ride request rate
Match success rate
Match latency
Driver acceptance rate
ETA accuracy
Location update lag
Dispatch timeout rate
Cancellation rate
Payment failure rate
Surge region imbalance
Trip state transition errors

👉 Interview Answer

I would monitor match latency, match success rate, driver acceptance rate, location update lag, ETA accuracy, cancellation rate, and payment failure rate.

These metrics directly reflect marketplace health and rider/driver experience.

1️⃣8️⃣ End-to-End Flow

Ride Request Flow

Rider requests ride
→ Validate rider and payment method
→ Estimate price and ETA
→ Find nearby available drivers
→ Dispatch to best driver
→ Driver accepts
→ Create confirmed trip
→ Notify rider and driver

Trip Flow

Driver goes to pickup
→ Rider tracks driver
→ Driver arrives
→ Rider picked up
→ Trip starts
→ Driver navigates to destination
→ Trip completed
→ Final fare calculated
→ Payment captured
→ Receipt sent

Location Flow

Driver sends location every few seconds
→ Location service updates geo index
→ Tracking service streams update to rider
→ Matching service uses latest available drivers

Key Insight

Ride Sharing is a real-time marketplace system, not just a map application.

🧠 Staff-Level Answer (Final)

👉 Interview Answer (Full Version)

When designing a ride-sharing system like Uber, I think of it as a real-time marketplace connecting riders and drivers.

The system has three critical flows: driver location updates, rider trip requests, and rider-driver matching.

Driver locations are high-volume and short-lived, so I would store them in a fast geo-indexed store using geohash, S2 cells, or Redis Geo.

Ride state, payment, and trip history should be stored durably because they require stronger correctness.

For matching, the system finds nearby available drivers by searching the pickup geo cell and neighboring cells. Then it ranks candidates by ETA to pickup, vehicle type, driver availability, acceptance probability, and business constraints.

To avoid double assignment, driver acceptance must use an atomic conditional update.

I would model each ride as a state machine, with states such as requested, matching, driver assigned, picked up, in progress, completed, and cancelled.

ETA and pricing are computed using routing, traffic data, distance, time, ride type, and surge multiplier.

Payment should be pre-authorized before the ride and captured after completion, with idempotency to prevent duplicate charges.

Real-time tracking can use WebSocket or streaming updates, but location data can be eventually consistent.

The main trade-offs are matching latency, ETA accuracy, location freshness, consistency, and marketplace efficiency.

Ultimately, the goal is to quickly and reliably match riders with nearby drivers, manage trip state correctly, and provide a smooth real-time experience.

⭐ Final Insight

The core of ride sharing is not the map; it is a marketplace platform that combines real-time supply-demand matching, geo indexing, a state machine, and a payment system.
