System Design Deep Dive

🎯 Design Ad System

1️⃣ Core Framework

When discussing Ad System design, I frame it as:

Core flow: ad request → candidate selection → ranking → serving → tracking
Advertiser campaign and targeting model
Real-time bidding / auction mechanism
Ad ranking and relevance
Budget pacing and delivery control
Tracking, attribution, and analytics
Fraud prevention and quality control
Trade-offs: latency vs revenue vs relevance

2️⃣ Core Requirements

Functional Requirements

Advertisers can create campaigns
Advertisers define targeting rules
System selects ads for each request
Support multiple ad formats (display, search, video)
Support bidding strategies (CPC, CPM, CPA)
Support budget limits
Track impressions, clicks, conversions
Provide reporting and analytics

Non-functional Requirements

Extremely low latency (e.g., <100ms)
High QPS (millions of requests/sec)
High availability
Scalable targeting system
Accurate billing and tracking
Strong fraud detection
Near real-time reporting

👉 Interview Answer

An ad system is a real-time decision engine.

For each user request, it must quickly select the best ads based on targeting, relevance, and bid, while respecting budget constraints and maximizing revenue.

3️⃣ Core Entities

Advertiser

Entity that creates campaigns

Campaign

Budget
Targeting
Bid strategy
Duration

Ad Creative

Image / video / text
Format
Destination URL

Targeting

Location
Device
Demographics
Interests
Keywords
Context

Event Types

Impression
Click
Conversion

👉 Interview Answer

The core entities are advertisers, campaigns, creatives, and targeting rules.

The system must match user requests with campaigns that satisfy targeting constraints.

4️⃣ Main APIs

Create Campaign

POST /api/campaigns

Get Ads (Ad Request)

POST /api/ads/serve

Request:

{
  "userId": "u123",
  "context": {
    "page": "search",
    "query": "running shoes",
    "device": "mobile",
    "location": "NYC"
  }
}

Track Impression

POST /api/ads/impression

Track Click

POST /api/ads/click

Track Conversion

POST /api/ads/conversion

👉 Interview Answer

The most critical API is the ad serving API, which must respond within tens of milliseconds.

Tracking APIs for impressions, clicks, and conversions can be asynchronous and eventually consistent.

5️⃣ High-Level Architecture

Client
→ Ad Request Service
→ Candidate Retrieval Service
→ Targeting Filter
→ Ranking Service
→ Auction Engine
→ Budget Service
→ Ad Response

→ Tracking Pipeline
→ Analytics System

Key Components

Candidate Retrieval

Fetch eligible campaigns quickly

Targeting Filter

Apply targeting constraints

Ranking Engine

Rank ads based on relevance + bid

Auction Engine

Select winning ad(s)

Budget Service

Ensure budget constraints

Tracking Pipeline

Record impressions, clicks, conversions

👉 Interview Answer

The ad system has two main paths: serving path and tracking path.

The serving path must be extremely low latency, while the tracking path can be asynchronous and optimized for throughput.

6️⃣ Ad Serving Flow

Flow

User request comes in
→ Fetch candidate ads
→ Filter by targeting
→ Rank candidates
→ Run auction
→ Select top ads
→ Check budget
→ Return ads
→ Log impression asynchronously

Key Constraints

Must respond within ~50–100ms
Must avoid over-serving campaigns
Must maximize revenue and relevance

👉 Interview Answer

Ad serving must be extremely fast.

The system retrieves candidate ads, filters them by targeting, ranks them based on relevance and bid, runs an auction, and returns the best ads, all within tens of milliseconds.

7️⃣ Candidate Retrieval

Problem

Millions of campaigns exist, but only a small subset is relevant.

Solutions

Inverted Index

Example:

keyword → campaigns
location → campaigns
interest → campaigns

Pre-computed Candidate Sets

Cache popular targeting combinations

Hybrid

broad retrieval → fine filtering

👉 Interview Answer

Candidate retrieval is a search problem.

I would use inverted indexes to map targeting features to campaigns, then retrieve a small candidate set before applying detailed filtering and ranking.
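To make the inverted-index idea concrete, here is a minimal in-memory sketch. The class name `TargetingIndex`, the feature names, and the campaign IDs are hypothetical. A production index would likely be sharded and would treat unspecified targeting dimensions as "match all". The broad retrieval step takes the union of the posting lists and leaves precise matching to the targeting filter that follows.

```python
from collections import defaultdict


class TargetingIndex:
    """Maps (feature, value) pairs to the campaign IDs that target them."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add_campaign(self, campaign_id, targeting):
        # targeting is a dict such as {"keyword": ["running shoes"], "location": ["NYC"]}
        for feature, values in targeting.items():
            for value in values:
                self._postings[(feature, value)].add(campaign_id)

    def retrieve(self, request_features):
        """Broad retrieval: union of campaigns matching any request feature."""
        candidates = set()
        for feature, value in request_features.items():
            candidates |= self._postings.get((feature, value), set())
        return candidates


index = TargetingIndex()
index.add_campaign("c1", {"keyword": ["running shoes"], "location": ["NYC"]})
index.add_campaign("c2", {"keyword": ["laptops"], "location": ["NYC"]})
index.add_campaign("c3", {"keyword": ["running shoes"], "location": ["SF"]})

by_keyword = index.retrieve({"keyword": "running shoes"})
broad = index.retrieve({"keyword": "running shoes", "location": "NYC"})
```

The union keeps recall high at the cost of a larger candidate set. Here `broad` still contains `c2` and `c3`, which the fine filtering stage is expected to drop.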

8️⃣ Targeting System

Filters

Location
Device
Time
Demographics
Interests
Keywords
Frequency capping
Budget availability

Optimization

Apply cheap filters first
Cache frequently used filters
Precompute eligibility

👉 Interview Answer

Targeting filters eliminate irrelevant ads early.

I would apply lightweight filters first, such as location and device, then apply more expensive filters like user interests.
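The cheap-filters-first ordering can be sketched as a list of predicates evaluated with short-circuiting. The field names (`locations`, `devices`, `interests`, `user_interests`) are assumptions for illustration, and a missing field is assumed to mean "no restriction on this dimension".

```python
def matches_location(campaign, request):
    allowed = campaign.get("locations")
    return allowed is None or request["location"] in allowed


def matches_device(campaign, request):
    allowed = campaign.get("devices")
    return allowed is None or request["device"] in allowed


def matches_interests(campaign, request):
    # Assumed to be the expensive check (in practice it may need a user-profile lookup).
    wanted = campaign.get("interests")
    return wanted is None or bool(set(wanted) & set(request["user_interests"]))


# Ordered cheapest first; all() stops at the first filter that fails.
FILTERS = [matches_location, matches_device, matches_interests]


def is_eligible(campaign, request):
    return all(check(campaign, request) for check in FILTERS)


campaigns = [
    {"id": "c1", "locations": {"NYC"}, "devices": {"mobile"}, "interests": {"running"}},
    {"id": "c2", "locations": {"SF"}},
    {"id": "c3", "devices": {"mobile"}},
]
request = {
    "location": "NYC",
    "device": "mobile",
    "user_interests": {"running", "travel"},
}
eligible = [c["id"] for c in campaigns if is_eligible(c, request)]
```

Because `c2` fails the location check, the interest filter never runs for it.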

9️⃣ Ranking and Auction

Ranking Score

Typical formula:

score = bid × relevance × quality_score

Here relevance is typically the predicted click-through rate (pCTR).

Auction Types

First-price auction

Winner pays their bid

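Second-price auction (traditional in search ads)

Winner pays the second-highest bid, or, when ads are ranked by score, the minimum bid needed to outrank the runner-up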

Multi-objective Optimization

Balance:

Revenue
User experience
Advertiser fairness

👉 Interview Answer

Ads are ranked using a combination of bid, predicted click-through rate, and quality score.

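Many systems use a second-price auction, where the winner pays just enough to beat the next-highest bid, although several large display exchanges have moved to first-price auctions.

In a single-slot auction, second-price pricing encourages truthful bidding.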
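As a sketch of how ranking and second-price pricing fit together, the code below ranks candidates by bid × predicted CTR × quality score. It charges the winner the smallest bid that would still keep it ahead of the runner-up. The `Candidate` type, its field names, and the sample numbers are hypothetical. Reserve prices and multi-slot auctions are left out.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    ad_id: str
    bid: float      # advertiser's maximum bid per click
    pctr: float     # predicted click-through rate
    quality: float  # quality score multiplier

    @property
    def weight(self):
        return self.pctr * self.quality

    @property
    def score(self):
        return self.bid * self.weight


def run_auction(candidates):
    """Return (winner_ad_id, price) for a single slot, or None if nothing can win."""
    ranked = sorted(
        (c for c in candidates if c.weight > 0),
        key=lambda c: c.score,
        reverse=True,
    )
    if not ranked:
        return None
    winner = ranked[0]
    runner_up_score = ranked[1].score if len(ranked) > 1 else 0.0
    # Smallest bid that keeps the winner's score at or above the runner-up's.
    price = runner_up_score / winner.weight
    return winner.ad_id, price


result = run_auction([
    Candidate("A", bid=2.0, pctr=0.05, quality=1.0),
    Candidate("B", bid=3.0, pctr=0.02, quality=1.0),
    Candidate("C", bid=1.0, pctr=0.03, quality=1.0),
])
```

Note that `B` has the highest bid but loses to `A` on score, and `A` pays less than its own bid.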

🔟 Budget and Pacing

Problem

Advertisers have limited budgets.

Goals

Avoid overspending
Deliver budget smoothly over time
Ensure fair distribution

Pacing Strategy

expected_spend_per_hour = total_budget / campaign_duration_in_hours

Techniques

Token bucket
Rate limiting
Budget partitioning per time window
Dynamic pacing adjustment

👉 Interview Answer

Budget pacing ensures campaigns spend evenly over time.

Without pacing, a campaign could exhaust its budget early.

I would use rate limiting or token bucket mechanisms to control how often a campaign can participate in auctions.
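A token bucket for pacing could look like the following sketch. Tokens represent spendable budget and refill at the even-delivery rate of total budget divided by duration. The class name `PacingBucket`, the `burst_hours` parameter, and the explicit `now` argument (passed in so the example is deterministic) are assumptions. The bucket smooths delivery but is not a hard spending cap, so a separate budget check is still needed.

```python
class PacingBucket:
    """Token bucket where one token equals one unit of budget."""

    def __init__(self, total_budget, duration_hours, burst_hours=1.0, start_time=0.0):
        self.rate_per_sec = total_budget / (duration_hours * 3600)
        # Capacity bounds how much unspent budget can accumulate for a burst.
        self.capacity = self.rate_per_sec * burst_hours * 3600
        self.tokens = self.capacity
        self.last_refill = start_time

    def try_spend(self, cost, now):
        """Refill for the elapsed time, then spend `cost` if enough tokens exist."""
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_sec)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# 240 budget units over 24 hours refills at 10 units per hour.
bucket = PacingBucket(total_budget=240.0, duration_hours=24.0)
first = bucket.try_spend(6.0, now=0.0)      # bucket starts full
second = bucket.try_spend(6.0, now=0.0)     # not enough tokens left
third = bucket.try_spend(6.0, now=1800.0)   # half an hour of refill later
```

When `try_spend` returns False, the campaign simply sits out that auction and becomes eligible again once the bucket refills.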

1️⃣1️⃣ Tracking Pipeline

Events

Impression
Click
Conversion

Flow

Ad served
→ Impression event logged
→ Click event logged
→ Conversion tracked
→ Events sent to stream (Kafka)
→ Aggregation system
→ Analytics and billing

Requirements

High throughput
Fault tolerant
Exactly-once or idempotent processing

👉 Interview Answer

Tracking should be asynchronous.

Events are written to a queue or log system, then processed by stream processors for aggregation, billing, and analytics.
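Since streams such as Kafka can redeliver messages, one common way to approximate exactly-once results is to make the consumer idempotent. The sketch below deduplicates on an `event_id` field. That field name and the in-memory `set` are assumptions, and a real pipeline would keep dedup state in a durable store with a retention window.

```python
from collections import Counter


class EventAggregator:
    """Counts events per (campaign, type), ignoring redelivered duplicates."""

    def __init__(self):
        self._seen_ids = set()
        self.counts = Counter()

    def process(self, event):
        """Return True if the event was counted, False if it was a duplicate."""
        if event["event_id"] in self._seen_ids:
            return False
        self._seen_ids.add(event["event_id"])
        self.counts[(event["campaign_id"], event["type"])] += 1
        return True


aggregator = EventAggregator()
stream = [
    {"event_id": "e1", "campaign_id": "c1", "type": "impression"},
    {"event_id": "e2", "campaign_id": "c1", "type": "click"},
    {"event_id": "e2", "campaign_id": "c1", "type": "click"},  # redelivery
    {"event_id": "e3", "campaign_id": "c1", "type": "impression"},
]
accepted = [aggregator.process(e) for e in stream]
```

The duplicate click is dropped, so billing derived from `aggregator.counts` does not double-charge.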

1️⃣2️⃣ Attribution

Problem

Which ad caused a conversion?

Models

Last-click attribution
First-click attribution
Multi-touch attribution

Time Window

Example:

conversion within 7 days of click

👉 Interview Answer

Attribution determines which ad gets credit for a conversion.

The simplest model is last-click attribution, but more advanced systems use multi-touch attribution.
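Last-click attribution with a 7-day window can be sketched as follows. The function name, the `(ad_id, click_time)` tuple shape, and the sample dates are hypothetical. Clicks that happen after the conversion or outside the window are ignored.

```python
from datetime import datetime, timedelta

ATTRIBUTION_WINDOW = timedelta(days=7)


def attribute_last_click(conversion_time, clicks):
    """Return the ad_id of the most recent eligible click, or None.

    clicks is a list of (ad_id, click_time) tuples for one user.
    """
    eligible = [
        (click_time, ad_id)
        for ad_id, click_time in clicks
        if timedelta(0) <= conversion_time - click_time <= ATTRIBUTION_WINDOW
    ]
    if not eligible:
        return None
    # max() compares click_time first, so it picks the latest click.
    return max(eligible)[1]


clicks = [
    ("ad_a", datetime(2024, 1, 1, 12, 0)),
    ("ad_b", datetime(2024, 1, 5, 9, 30)),
    ("ad_c", datetime(2024, 1, 10, 8, 0)),
]
credited = attribute_last_click(datetime(2024, 1, 7, 18, 0), clicks)
too_late = attribute_last_click(datetime(2024, 1, 20, 0, 0), clicks)
```

First-click attribution would use `min` instead of `max`. A multi-touch model would split credit across all eligible clicks.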

1️⃣3️⃣ Fraud Detection

Fraud Types

Click fraud
Impression fraud
Bot traffic
Fake conversions

Detection Signals

Abnormal click rate
IP patterns
Device fingerprint
Behavior anomalies
Conversion mismatch

Techniques

Rule-based filtering
ML models
Real-time blocking
Offline analysis

👉 Interview Answer

Fraud detection is critical because ad systems deal with money.

I would use a combination of rules and machine learning to detect abnormal behavior, and filter or block fraudulent traffic.
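As one example of a rule-based signal, the sketch below flags a source (for instance an IP address or device fingerprint) whose click count in a sliding time window exceeds a threshold. The class name and the threshold values are illustrative assumptions, not recommended settings. An ML model would typically consume signals like this rather than be replaced by them.

```python
from collections import defaultdict, deque


class ClickRateRule:
    """Flags sources that click more than max_clicks times within window_seconds."""

    def __init__(self, max_clicks=5, window_seconds=60):
        self.max_clicks = max_clicks
        self.window_seconds = window_seconds
        self._clicks = defaultdict(deque)

    def is_suspicious(self, source_key, timestamp):
        window = self._clicks[source_key]
        window.append(timestamp)
        # Evict clicks that have fallen out of the sliding window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) > self.max_clicks


rule = ClickRateRule(max_clicks=3, window_seconds=10)
burst = [rule.is_suspicious("ip:203.0.113.7", t) for t in (0, 1, 2, 3)]
later = rule.is_suspicious("ip:203.0.113.7", 100)
```

The fourth click inside the window trips the rule. A click long after the burst is treated as normal again.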

1️⃣4️⃣ Storage and Data Systems

Storage Types

Campaign metadata (DB)
Targeting index (search system)
Logs (Kafka / log system)
Aggregations (OLAP / data warehouse)
Real-time counters (Redis)

Example Stack

Metadata → SQL / NoSQL
Index → Elasticsearch / custom index
Logs → Kafka
Aggregation → Spark / Flink
Serving cache → Redis

👉 Interview Answer

The system uses different storage systems for different workloads.

Metadata is stored in a database, targeting uses an index, logs are stored in a streaming system, and analytics uses a data warehouse.

1️⃣5️⃣ Scaling Patterns

Pattern 1: Precompute Everything Possible

Precompute targeting eligibility
Precompute user features
Precompute campaign features

Pattern 2: Cache Aggressively

Candidate sets
User features
Campaign budgets

Pattern 3: Separate Serving and Analytics

Serving path → low latency
Analytics path → high throughput

Pattern 4: Shard by Region / User

Reduce latency
Improve locality

👉 Interview Answer

To scale ad systems, I would precompute as much as possible, cache frequently used data, and separate the serving path from analytics processing.

1️⃣6️⃣ Failure Handling

Failures

Budget service unavailable
Ranking service timeout
Candidate retrieval slow
Tracking pipeline lag
Index inconsistency

Strategies

Fallback to default ads
Use cached candidates
Degrade targeting precision
Retry tracking asynchronously
Graceful degradation

👉 Interview Answer

The system should degrade gracefully.

If ranking fails, we can serve default or cached ads.

Tracking failures should not affect ad serving.
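Graceful degradation on the serving path can be sketched as a ranking call with a deadline and a fallback. The helper `rank_with_fallback`, the timeout value, and the stand-in ranking functions are hypothetical. A thread pool is used here only to keep the example self-contained, whereas a real service would more likely rely on RPC deadlines.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

_executor = ThreadPoolExecutor(max_workers=4)


def rank_with_fallback(rank_fn, candidates, timeout_s, fallback_ads):
    """Return ranked ads, or fallback_ads if ranking is too slow or fails."""
    future = _executor.submit(rank_fn, candidates)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        future.cancel()  # best effort: a call that is already running is not interrupted
        return fallback_ads
    except Exception:
        return fallback_ads


def slow_rank(candidates):
    time.sleep(0.3)  # simulates a ranking service that misses its deadline
    return sorted(candidates)


def fast_rank(candidates):
    return sorted(candidates)


degraded = rank_with_fallback(slow_rank, ["b", "a"], 0.05, ["default_ad"])
normal = rank_with_fallback(fast_rank, ["b", "a"], 0.05, ["default_ad"])
```

The fallback list could equally be a cached candidate set, as mentioned in the strategies above.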

1️⃣7️⃣ Consistency Model

Strong Consistency Needed For

Billing
Budget tracking
Payment
Campaign updates

Eventual Consistency Acceptable For

Analytics
Reporting
Targeting updates
Index updates

👉 Interview Answer

Ad systems use mixed consistency models.

Billing and budget tracking need strong correctness, while analytics and reporting can be eventually consistent.
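For budget tracking, strong correctness mostly comes down to making the check and the decrement a single atomic step. The sketch below uses a process-local lock and integer budget units to avoid floating-point drift. `BudgetLedger` is a hypothetical name. A distributed deployment would need an equivalent atomic operation in its datastore.

```python
import threading


class BudgetLedger:
    """Tracks remaining budget per campaign in integer units (for example, micro-dollars)."""

    def __init__(self, budgets):
        self._remaining = dict(budgets)
        self._lock = threading.Lock()

    def try_charge(self, campaign_id, amount):
        """Atomically deduct amount if the campaign can afford it."""
        with self._lock:
            remaining = self._remaining.get(campaign_id, 0)
            if amount > remaining:
                return False
            self._remaining[campaign_id] = remaining - amount
            return True

    def remaining(self, campaign_id):
        with self._lock:
            return self._remaining.get(campaign_id, 0)


ledger = BudgetLedger({"c1": 100})
charges = [ledger.try_charge("c1", 60), ledger.try_charge("c1", 60), ledger.try_charge("c1", 40)]
```

Because the check and the deduction happen under the same lock, two concurrent charges cannot both pass the check and overspend the budget.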

1️⃣8️⃣ Observability

Key Metrics

Ad request latency
QPS
Fill rate (ads served / requests)
CTR (click-through rate)
Conversion rate
Revenue per request
Budget utilization
Auction latency
Error rate

👉 Interview Answer

I would monitor latency, fill rate, CTR, conversion rate, revenue, and budget utilization.

These metrics reflect both system performance and business success.
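The ratio metrics above can be derived from raw counters, as in the sketch below. The function name and the sample numbers are made up for illustration. Conversion rate is assumed here to mean conversions per click, though some teams define it per impression.

```python
def serving_metrics(requests, filled, impressions, clicks, conversions, revenue):
    """Derive ratio metrics from raw counters, guarding against division by zero."""

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    return {
        "fill_rate": ratio(filled, requests),            # ads served / requests
        "ctr": ratio(clicks, impressions),               # clicks / impressions
        "conversion_rate": ratio(conversions, clicks),   # assumed: conversions / clicks
        "revenue_per_request": ratio(revenue, requests),
    }


metrics = serving_metrics(
    requests=1000, filled=800, impressions=800, clicks=40, conversions=4, revenue=50.0
)
```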

1️⃣9️⃣ End-to-End Flow

Ad Serving Flow

User opens page
→ Ad request sent
→ Retrieve candidates
→ Apply targeting filters
→ Rank ads
→ Run auction
→ Select winner
→ Return ad
→ Log impression

Tracking Flow

User sees ad → impression logged
User clicks ad → click logged
User converts → conversion logged
→ Events processed asynchronously
→ Aggregation and billing

Key Insight

An ad system is a real-time decision system that must optimize revenue within milliseconds.

🧠 Staff-Level Answer (Final)

👉 Interview Answer (Full Version)

When designing an ad system, I think of it as a real-time decision engine that selects the best ad for each request.

The system retrieves candidate campaigns, filters them based on targeting, ranks them using bid and relevance, runs an auction, and returns the winning ads, all within tens of milliseconds.

I would use inverted indexes for candidate retrieval, apply targeting filters, and rank ads using a score like bid × predicted CTR × quality score.

Budget pacing is critical, so I would control campaign participation using rate limiting or token bucket mechanisms.

Tracking is asynchronous, with impression, click, and conversion events sent to a streaming system for processing.

Billing and budget require strong correctness, while analytics and reporting can be eventually consistent.

Fraud detection is essential, so I would use both rule-based systems and machine learning to detect abnormal traffic.

The main trade-offs are latency, revenue, and relevance.

The goal is to maximize revenue and advertiser value while maintaining a good user experience.

⭐ Final Insight

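At its core, an ad system is not about displaying ads. It is a real-time bidding and ranking system that makes optimal decisions within milliseconds.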

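Chinese Section (translated to English)

🎯 Design Ad System

1️⃣ Core Framework

When designing an Ad System, I usually analyze it from the following angles:

Core flow: ad request → candidate selection → ranking → serving → tracking
Advertiser campaign and targeting model
Real-time bidding / auction mechanism
Ranking and relevance
Budget control and pacing
Tracking, attribution, and analytics
Fraud prevention
Core trade-offs: latency vs revenue vs relevance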

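2️⃣ Core Requirements

Functional Requirements

Advertisers create campaigns
Advertisers define targeting rules
System returns ads
Support multiple ad formats
Support different bidding modes
Support budget control
Track impressions, clicks, conversions
Provide reports

Non-functional Requirements

Extremely low latency (<100ms)
Very high QPS
High availability
Scalable targeting
Accurate billing
Fraud prevention
Near real-time analytics

👉 Interview Answer

An ad system is essentially a real-time decision system that selects the best ad for each request while satisfying targeting rules, respecting budgets, and maximizing revenue.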

(The remaining sections mirror the English version above and are not repeated here.)

⭐ Final Insight

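The core of an ad system is not "displaying content". It is a high-performance decision engine that combines real-time bidding, ranking, and budget control.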