System Design Deep Dive - 01 Design URL Shortener

Posted by ailswan, April 25, 2026


🎯 Design URL Shortener


1️⃣ Core Framework

When discussing URL Shortener design, I frame it as:

  1. API design and core user flows
  2. Short code generation strategy
  3. Storage and data modeling
  4. Redirect path optimization
  5. Trade-offs: uniqueness vs latency vs availability
  6. Scaling, caching, analytics, and abuse prevention

2️⃣ Core Requirements


Functional Requirements


Non-functional Requirements


👉 Interview Answer

A URL shortener has two main flows: creating a short URL and redirecting users to the original URL.

Redirects vastly outnumber creations, so I would optimize the redirect path for low latency and high availability.

I would also consider uniqueness, expiration, analytics, and abuse prevention as important production-level requirements.


3️⃣ API Design


Create Short URL

POST /api/urls

Request:

{
  "longUrl": "https://example.com/some/very/long/path",
  "customAlias": "my-link",
  "expiresAt": "2026-12-31T00:00:00Z"
}

Response:

{
  "shortUrl": "https://short.ly/abc123",
  "shortCode": "abc123"
}
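
The request body above could be validated roughly as follows before any ID is generated. This is only a sketch: the function name `validate_create_request`, the alias pattern, and the error messages are assumptions made for illustration, not part of the API described here.

```python
import re
from datetime import datetime, timezone
from urllib.parse import urlparse

# Assumed alias rule: 3-32 characters of letters, digits, '-' or '_'.
ALIAS_PATTERN = re.compile(r"[A-Za-z0-9_-]{3,32}")


def validate_create_request(body: dict, now: datetime) -> list[str]:
    """Return a list of validation errors; an empty list means the body is acceptable."""
    errors = []

    parsed = urlparse(body.get("longUrl", ""))
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        errors.append("longUrl must be an absolute http(s) URL")

    alias = body.get("customAlias")
    if alias is not None and not ALIAS_PATTERN.fullmatch(alias):
        errors.append("customAlias has invalid characters or length")

    expires_at = body.get("expiresAt")
    if expires_at is not None:
        try:
            # Accept the trailing 'Z' used in the example request.
            expiry = datetime.fromisoformat(expires_at.replace("Z", "+00:00"))
            if expiry.tzinfo is None:
                expiry = expiry.replace(tzinfo=timezone.utc)  # treat naive values as UTC
            if expiry <= now:
                errors.append("expiresAt must be in the future")
        except ValueError:
            errors.append("expiresAt must be an ISO 8601 timestamp")

    return errors
```

`now` is passed in as a timezone-aware datetime so the check stays deterministic in tests.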

Redirect

GET /{shortCode}

Behavior:


Analytics

GET /api/urls/{shortCode}/stats

👉 Interview Answer

I would expose a write API for creating short URLs and a read API for redirecting users.

The redirect API should be extremely lightweight, because it is the critical path and usually has much higher traffic.


4️⃣ Short Code Generation


Option 1: Hash Long URL

Example:

hash(longUrl) → shortCode
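
A minimal sketch of the hashing option, assuming SHA-256 as the hash and a 7-character code (both are illustrative choices; `hash_short_code` is a hypothetical helper name):

```python
import hashlib

BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def hash_short_code(long_url: str, length: int = 7) -> str:
    """Derive a deterministic short code from the SHA-256 digest of the URL."""
    digest = hashlib.sha256(long_url.encode("utf-8")).digest()
    number = int.from_bytes(digest[:8], "big")  # use the first 64 bits of the digest
    chars = []
    for _ in range(length):
        number, remainder = divmod(number, 62)
        chars.append(BASE62[remainder])
    return "".join(chars)
```

The same URL always maps to the same code. Because the digest is truncated to a few characters, two different URLs can still collide, so a lookup before insert is still needed.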

Pros:

Cons:


Option 2: Random Code

Example:

random base62 string → abc123
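
A sketch of the random option, with the collision check made explicit. The `existing` set stands in for the database, and the retry limit is an assumed parameter:

```python
import secrets

BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def random_short_code(length: int = 6) -> str:
    """Return a random Base62 string drawn from a cryptographically strong source."""
    return "".join(secrets.choice(BASE62) for _ in range(length))


def allocate_random_code(existing: set[str], length: int = 6, max_attempts: int = 5) -> str:
    """Draw random codes until one is unused, then reserve it in `existing`."""
    for _ in range(max_attempts):
        code = random_short_code(length)
        if code not in existing:
            existing.add(code)
            return code
    raise RuntimeError("could not find a free short code")
```

In a real system the membership test and the insert would have to be one atomic operation, for example an insert guarded by a unique constraint.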

Pros:

Cons:


Option 3: Auto-increment ID + Base62

Example:

ID = 125000
Base62(ID) = ww8 (using the alphabet 0-9, a-z, A-Z)
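
Base62 encoding itself is a short loop. The sketch below assumes the alphabet order 0-9, a-z, A-Z; the exact output depends on that order, and with this one the ID 125000 encodes to `ww8`.

```python
BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def base62_encode(number: int) -> str:
    """Encode a non-negative integer as a Base62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return BASE62[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(BASE62[remainder])
    return "".join(reversed(digits))  # most significant digit first


def base62_decode(code: str) -> int:
    """Decode a Base62 string back to the integer it represents."""
    number = 0
    for ch in code:
        number = number * 62 + BASE62.index(ch)
    return number
```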

Pros:

Cons:


Recommended Approach

Use:

Distributed ID Generator → Base62 Encode → shortCode

Examples:


👉 Interview Answer

I would use a distributed ID generator and encode the generated ID using Base62.

This gives us uniqueness, compact short codes, and avoids repeated collision checks.

If custom aliases are supported, I would enforce uniqueness through a database constraint.
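
One common shape for a distributed ID generator is a Snowflake-style layout. The sketch below assumes a 41-bit millisecond timestamp, a 10-bit worker ID, and a 12-bit per-millisecond sequence; the class name, bit widths, and epoch are illustrative choices, not something this design prescribes.

```python
import threading
import time


class SnowflakeLikeGenerator:
    """Assumed layout: timestamp (ms since epoch) | 10-bit worker id | 12-bit sequence."""

    def __init__(self, worker_id: int, epoch_ms: int = 1_700_000_000_000, clock=None):
        if not 0 <= worker_id < 1024:
            raise ValueError("worker_id must fit in 10 bits")
        self.worker_id = worker_id
        self.epoch_ms = epoch_ms
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & 0xFFF
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: spin until the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return ((now - self.epoch_ms) << 22) | (self.worker_id << 12) | self._sequence
```

Each worker produces unique, increasing IDs without coordinating with the others, as long as worker IDs are distinct. IDs of this size encode to roughly 10 or 11 Base62 characters, so shorter codes would need a smaller ID space, such as counter ranges handed out to each server.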


Core Insight

The hardest part of short code generation is not encoding — it is guaranteeing uniqueness at scale.


5️⃣ Data Model


URL Mapping Table

url_mapping (
  short_code VARCHAR PRIMARY KEY,
  long_url TEXT NOT NULL,
  user_id VARCHAR,
  created_at TIMESTAMP,
  expires_at TIMESTAMP,
  status VARCHAR
)

Analytics Table

url_click_event (
  event_id VARCHAR PRIMARY KEY,
  short_code VARCHAR,
  clicked_at TIMESTAMP,
  user_agent TEXT,
  ip_hash VARCHAR,
  country VARCHAR,
  referrer TEXT
)
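
To make the mapping table concrete, here is a small sketch using SQLite from the Python standard library. SQLite and the TEXT column types are stand-ins chosen so the example runs anywhere; `insert_mapping` is a hypothetical helper. It also shows how the primary key rejects a duplicate short code or custom alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE url_mapping (
        short_code TEXT PRIMARY KEY,
        long_url   TEXT NOT NULL,
        user_id    TEXT,
        created_at TEXT,
        expires_at TEXT,
        status     TEXT
    )
    """
)


def insert_mapping(conn: sqlite3.Connection, short_code: str, long_url: str) -> bool:
    """Insert a mapping; return False if the short code (or custom alias) is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO url_mapping (short_code, long_url, status) VALUES (?, ?, 'active')",
                (short_code, long_url),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Relying on the constraint rather than a separate "check, then insert" step avoids a race between two requests asking for the same alias.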

Why Separate Mapping and Analytics?


👉 Interview Answer

I would separate the URL mapping table from analytics events.

The mapping table serves the redirect path and must be optimized for low latency.

Analytics events can be written asynchronously so they do not affect user-facing redirect performance.


6️⃣ Redirect Flow


Basic Flow

  1. User visits short URL
  2. Load balancer routes request
  3. Redirect service extracts short code
  4. Check cache
  5. If cache miss, query database
  6. Validate status and expiration
  7. Return HTTP redirect
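
Steps 3 to 7 above can be sketched as a single function. Plain dicts stand in for the cache and the database, and the `Mapping` dataclass, the `"active"` status value, and the choice of 404 versus 410 are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Mapping:
    long_url: str
    status: str = "active"
    expires_at: Optional[datetime] = None


def resolve(short_code: str, cache: dict, db: dict, now: datetime):
    """Return (http_status, location) for a redirect request."""
    mapping = cache.get(short_code)
    if mapping is None:
        mapping = db.get(short_code)  # cache miss: fall back to the database
        if mapping is None:
            return 404, None          # unknown short code
        cache[short_code] = mapping   # populate the cache for later requests
    if mapping.status != "active":
        return 410, None              # disabled or blocked link
    if mapping.expires_at is not None and mapping.expires_at <= now:
        return 410, None              # expired link
    return 302, mapping.long_url
```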

301 vs 302

Redirect Type | Meaning            | Use Case
301           | Permanent redirect | Better for static links
302           | Temporary redirect | Better for analytics/control

Use 302 by default.

Why?


👉 Interview Answer

I would use 302 redirects by default, because they give us more control over analytics, expiration, and destination changes.

If we use 301, browsers and clients may cache the redirect, making future changes harder.


7️⃣ Caching Strategy


Cache What?

shortCode → longUrl

Cache Layers


Cache Challenges



👉 Interview Answer

Since redirects are read-heavy, caching is one of the most important optimizations.

I would cache shortCode-to-longUrl mappings in Redis and optionally at the edge for very hot links.

However, I need careful TTL and invalidation logic to handle expiration, updates, and abuse blocking.
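
The TTL and invalidation concerns can be illustrated with a small in-process cache. This is a sketch only: `TTLCache` is a hypothetical class standing in for Redis, and the 300-second default TTL is an arbitrary assumption. The key idea is that an entry's TTL is capped by the link's own remaining lifetime, and that updates or abuse blocks invalidate the entry explicitly.

```python
import time


class TTLCache:
    """Maps short_code -> long_url, with each entry expiring after a TTL."""

    def __init__(self, default_ttl: float = 300.0, clock=time.monotonic):
        self._default_ttl = default_ttl
        self._clock = clock
        self._entries = {}  # short_code -> (long_url, cache_deadline)

    def put(self, short_code: str, long_url: str, seconds_until_expiry=None) -> None:
        ttl = self._default_ttl
        if seconds_until_expiry is not None:
            ttl = min(ttl, seconds_until_expiry)  # never cache past the link's expiry
        self._entries[short_code] = (long_url, self._clock() + ttl)

    def get(self, short_code: str):
        entry = self._entries.get(short_code)
        if entry is None:
            return None
        long_url, deadline = entry
        if self._clock() >= deadline:
            del self._entries[short_code]  # entry is stale: drop it and report a miss
            return None
        return long_url

    def invalidate(self, short_code: str) -> None:
        """Call on destination update, deletion, or abuse block."""
        self._entries.pop(short_code, None)
```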


8️⃣ Trade-offs


Uniqueness vs Simplicity


Latency vs Analytics Accuracy


Availability vs Consistency


Custom Alias vs Collision Risk


👉 Interview Answer

The main trade-offs are around uniqueness, latency, and availability.

For URL creation, I need strong uniqueness guarantees. For redirects, I prioritize low latency and high availability.

Analytics should usually be asynchronous because it should not slow down the redirect path.


9️⃣ Scaling Patterns


Pattern 1: Read-heavy Optimization

Redirect traffic is much higher than creation traffic.

Use:


Pattern 2: Distributed ID Generation

Avoid making a single database the bottleneck for ID allocation.

Use:


Pattern 3: Async Analytics

Redirect path:

redirect request → return redirect → publish click event async

Analytics pipeline:

Kafka / Queue → Stream Processing → Analytics DB
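
A minimal in-process sketch of this pattern, using a bounded queue and a background thread in place of Kafka and a stream processor. The names `record_click` and `analytics_worker`, the buffer size, and the drop-on-full policy are assumptions for illustration.

```python
import queue
import threading

click_events: "queue.Queue" = queue.Queue(maxsize=10_000)
dropped_events = 0


def record_click(event: dict) -> None:
    """Called on the redirect path: never blocks, drops the event if the buffer is full."""
    global dropped_events
    try:
        click_events.put_nowait(event)
    except queue.Full:
        dropped_events += 1  # accept losing analytics rather than slowing redirects


def analytics_worker(sink: list) -> None:
    """Background consumer; `sink` stands in for the analytics pipeline."""
    while True:
        event = click_events.get()
        if event is None:  # sentinel used to stop the worker
            break
        sink.append(event)
```

Dropping events under pressure is the latency versus analytics accuracy trade-off in its simplest form: redirects stay fast, and counts may be slightly low.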

Pattern 4: Database Sharding

Shard by:


Pattern 5: Multi-region Deployment

For global scale:


👉 Interview Answer

At scale, I would optimize the read-heavy redirect path first.

I would use caching, distributed ID generation, asynchronous analytics, and database sharding.

For global traffic, I would deploy redirect services in multiple regions and replicate URL mappings close to users.


🔟 Failure Handling & Edge Cases


Common Failures


Strategies


👉 Interview Answer

The redirect path should degrade gracefully.

If analytics is down, redirects should still work. If the database is temporarily unavailable, we may still serve hot links from cache.

For missing or expired URLs, we return clear error responses like 404 or 410.
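
The degradation behavior described here might look like the following sketch. The `DatabaseUnavailable` exception, the `query_db` callable, and the use of 503 for "cannot answer right now" are assumptions added for illustration.

```python
class DatabaseUnavailable(Exception):
    """Raised by the (hypothetical) database client when it cannot be reached."""


def lookup_with_fallback(short_code: str, cache: dict, query_db):
    """Return (http_status, location), degrading gracefully when the database is down."""
    if short_code in cache:
        return 302, cache[short_code]  # hot link: served without touching the database
    try:
        long_url = query_db(short_code)
    except DatabaseUnavailable:
        return 503, None               # cannot distinguish "missing" from "unknown"
    if long_url is None:
        return 404, None               # the database answered: no such short code
    cache[short_code] = long_url
    return 302, long_url
```

Returning 503 rather than 404 during an outage matters because a 404 would wrongly tell clients that a valid link does not exist.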


1️⃣1️⃣ Security & Abuse Prevention


Risks


Protection


👉 Interview Answer

URL shorteners are often abused for phishing and spam, so security is a core part of the design.

I would add rate limiting, malicious URL detection, domain blocklists, and monitoring for suspicious traffic patterns.
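
Two of these protections are easy to sketch: a token-bucket rate limiter for the create API and a domain blocklist check. The class and function names, the rate parameters, and the blocklist entry are hypothetical.

```python
import time
from urllib.parse import urlparse


class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `rate_per_sec` tokens per second."""

    def __init__(self, rate_per_sec: float, capacity: int, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Refill in proportion to elapsed time, without exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


BLOCKED_DOMAINS = {"malware.example"}  # hypothetical blocklist entry


def is_blocked(long_url: str) -> bool:
    """True if the destination host is a blocked domain or one of its subdomains."""
    host = (urlparse(long_url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)
```

In practice there would be one bucket per user or per IP address, kept in a shared store so that all API servers enforce the same limit.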


1️⃣2️⃣ End-to-End Flow


Create Flow

  1. User submits long URL
  2. Validate URL
  3. Check abuse rules
  4. Generate unique ID
  5. Encode ID to Base62 short code
  6. Save mapping
  7. Return short URL

Redirect Flow

  1. User opens short URL
  2. Extract short code
  3. Check cache
  4. Query DB on cache miss
  5. Validate expiration/status
  6. Emit analytics event async
  7. Return 302 redirect

Key Insight

A URL shortener is simple on the surface, but the real design challenge is building a low-latency, highly available redirect system.


🧠 Staff-Level Answer (Final)


👉 Interview Answer (Full Version)

When designing a URL shortener, I think of it as two main flows: URL creation and URL redirection.

The creation flow needs to generate globally unique short codes, usually by using a distributed ID generator and encoding the ID with Base62.

The redirect flow is read-heavy, so I would optimize it with caching, read replicas, and potentially edge deployment.

I would store the core shortCode-to-longUrl mapping separately from analytics data, because redirects must stay low-latency while analytics can be processed asynchronously.

The main trade-offs are uniqueness, latency, availability, freshness, and analytics accuracy.

At scale, I would use distributed ID generation, cache hot links, shard the database, and send click events through an async pipeline.

Ultimately, the goal is to provide fast and reliable redirects while maintaining uniqueness, security, and observability.


⭐ Final Insight

A URL shortener is not just about making URLs shorter — it is about building a highly available redirect system with globally unique identifiers.




Implement