System Design Deep Dive - 30 Design API Gateway

Post by ailswan May. 23, 2026


🎯 Design API Gateway

1️⃣ Core Framework

When discussing API Gateway design, I frame it as:

  1. Request routing and service discovery
  2. Authentication and authorization
  3. Rate limiting and throttling
  4. Request / response transformation
  5. Load balancing and resilience
  6. Observability and logging
  7. Security and policy enforcement
  8. Trade-offs: latency vs control vs reliability

2️⃣ Core Requirements


Functional Requirements


Non-functional Requirements


👉 Interview Answer

An API Gateway is the entry point for client traffic.

It handles routing, authentication, authorization, rate limiting, TLS termination, request validation, observability, and resilience policies.

The main challenge is enforcing cross-cutting concerns without adding too much latency or becoming a single point of failure.


3️⃣ Core Concepts


API Gateway

A centralized entry layer between clients and backend services.

Client → API Gateway → Backend Services

Route

A route maps an incoming request to a backend service.

Example:

GET /api/orders/{id} → order-service
POST /api/payments → payment-service

Policy

A policy defines behavior applied at the gateway.

Examples:


Upstream Service

The backend service that receives the request.


👉 Interview Answer

I would treat the API Gateway as a policy enforcement and routing layer.

It should centralize cross-cutting concerns like auth, rate limiting, TLS, observability, and traffic control, while keeping business logic inside backend services.


4️⃣ Main APIs / Config


Route Config

{
  "routeId": "orders-get",
  "method": "GET",
  "path": "/api/orders/{orderId}",
  "upstreamService": "order-service",
  "authRequired": true,
  "rateLimitPolicy": "standard-user"
}

Rate Limit Policy

{
  "policyId": "standard-user",
  "limit": 1000,
  "window": "1m",
  "scope": "userId"
}

Service Registry Entry

{
  "serviceName": "order-service",
  "instances": [
    {
      "host": "10.0.1.10",
      "port": 8080,
      "healthy": true
    }
  ]
}

Gateway Admin API

POST /api/gateway/routes
PATCH /api/gateway/routes/{routeId}
GET /api/gateway/metrics

👉 Interview Answer

The gateway is mostly driven by configuration.

Route configs define where traffic goes, policies define how requests are handled, and service discovery tells the gateway which backend instances are healthy.
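To make the config-driven idea concrete, here is a minimal sketch of how a gateway might load the route config above and match an incoming request against it. The `Route` dataclass and the `load_route` and `match_route` helpers are hypothetical names for illustration, and the template-to-regex approach is one possible choice, not how any specific gateway does it.

```python
import json
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    route_id: str
    method: str
    path: str
    upstream_service: str
    auth_required: bool
    rate_limit_policy: str

    def pattern(self):
        # Turn "/api/orders/{orderId}" into a regex with one named group per parameter.
        parts = re.split(r"\{(\w+)\}", self.path)
        regex = "".join(
            f"(?P<{part}>[^/]+)" if index % 2 else re.escape(part)
            for index, part in enumerate(parts)
        )
        return re.compile(f"^{regex}$")


def load_route(raw):
    """Parse one route config document (same keys as the JSON above)."""
    cfg = json.loads(raw)
    return Route(
        route_id=cfg["routeId"],
        method=cfg["method"],
        path=cfg["path"],
        upstream_service=cfg["upstreamService"],
        auth_required=cfg["authRequired"],
        rate_limit_policy=cfg["rateLimitPolicy"],
    )


def match_route(routes, method, path):
    """Return (route, path_params) for the first matching route, else None."""
    for route in routes:
        if route.method != method:
            continue
        match = route.pattern().match(path)
        if match:
            return route, match.groupdict()
    return None


ROUTE_JSON = """
{
  "routeId": "orders-get",
  "method": "GET",
  "path": "/api/orders/{orderId}",
  "upstreamService": "order-service",
  "authRequired": true,
  "rateLimitPolicy": "standard-user"
}
"""

routes = [load_route(ROUTE_JSON)]
print(match_route(routes, "GET", "/api/orders/42"))
```

A linear scan is fine for a handful of routes. With thousands of routes, a prefix tree keyed on path segments would be a more likely choice.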


5️⃣ High-Level Architecture


Client
→ DNS / Global Load Balancer
→ API Gateway Cluster
→ Auth / Policy Engine
→ Rate Limiter
→ Router
→ Load Balancer
→ Backend Services

Gateway Logs / Metrics / Traces
→ Observability Pipeline

Main Components

Listener


Auth Module


Policy Engine


Rate Limiter


Router


Load Balancer


👉 Interview Answer

The gateway receives the request, terminates TLS, authenticates the caller, applies policies, checks rate limits, routes the request, load balances to a healthy backend, and records logs, metrics, and traces.


6️⃣ Request Flow


Client sends request
→ Gateway terminates TLS
→ Match route
→ Authenticate request
→ Authorize access
→ Validate request
→ Apply rate limit
→ Transform request if needed
→ Select backend service
→ Forward request
→ Receive response
→ Transform response if needed
→ Return response to client
→ Emit logs/metrics/traces

👉 Interview Answer

Request processing should be modular.

Each stage handles one responsibility: route matching, authentication, authorization, rate limiting, validation, transformation, forwarding, and observability.

This makes policies easier to configure and reason about.
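One way to picture the modular flow is a list of stages, where each stage either passes the request on or short-circuits with a response. This is only a sketch: `Context`, `Response`, and the three stage functions are hypothetical, and the rate-limit stage is a placeholder that always allows.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Context:
    method: str
    path: str
    headers: dict
    trail: list = field(default_factory=list)  # stage names that ran, useful for tracing


@dataclass
class Response:
    status: int
    body: str


# A stage returns None to continue, or a Response to stop the pipeline.
Stage = Callable[[Context], Optional[Response]]


def authenticate(ctx):
    ctx.trail.append("authenticate")
    if "Authorization" not in ctx.headers:
        return Response(401, "missing credentials")
    return None


def rate_limit(ctx):
    ctx.trail.append("rate_limit")
    return None  # placeholder: a real stage would consult a limiter


def forward(ctx):
    ctx.trail.append("forward")
    return Response(200, f"forwarded {ctx.method} {ctx.path}")


def run_pipeline(stages, ctx):
    for stage in stages:
        response = stage(ctx)
        if response is not None:
            return response
    return Response(500, "no stage produced a response")


PIPELINE = [authenticate, rate_limit, forward]

ctx = Context("GET", "/api/orders/1", {"Authorization": "Bearer example"})
print(run_pipeline(PIPELINE, ctx), ctx.trail)
```

Because each stage has the same signature, adding or reordering a policy is a config change to the stage list rather than a code change to the request handler.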


7️⃣ Routing


Routing Types

Path-based Routing

/api/users/* → user-service
/api/orders/* → order-service

Host-based Routing

api.example.com → public-api
admin.example.com → admin-api

Header-based Routing

X-Version: v2 → service-v2

Weighted Routing

90% → service-v1
10% → service-v2

Used for:

canary deployments
gradual rollout of new service versions


👉 Interview Answer

The gateway should support path-based, host-based, header-based, and weighted routing.

Weighted routing is useful for canary deployments and gradual rollout of new service versions.
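A weighted split like the 90/10 example can be sketched by hashing a stable request key into a bucket. The `pick_upstream` function is hypothetical. Hashing on a user ID (an assumption here) keeps each user pinned to one version during a canary, which random selection per request would not.

```python
import hashlib


def pick_upstream(weights, request_key):
    """Pick an upstream in proportion to its weight, stable for a given request_key."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    cumulative = 0
    for upstream, weight in weights.items():
        cumulative += weight
        if bucket < cumulative:
            return upstream
    raise AssertionError("unreachable: bucket is always below the total weight")


weights = {"service-v1": 90, "service-v2": 10}
picks = [pick_upstream(weights, f"user-{i}") for i in range(10_000)]
print(picks.count("service-v2") / len(picks))  # roughly 0.10
```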


8️⃣ Authentication and Authorization


Authentication

Common methods:

JWT
API key
mTLS certificate


Gateway Responsibilities

validate JWTs or API keys
extract user and tenant context
reject unauthorized requests early


Authorization

Can happen at:

Gateway level: coarse-grained access
Service level: fine-grained business permission

👉 Interview Answer

The gateway should handle coarse-grained authentication and basic authorization.

It can validate JWTs or API keys, extract user and tenant context, and reject unauthorized requests early.

Fine-grained business authorization should still live in backend services.
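As a rough illustration of "validate the token, extract context, reject early", here is a standard-library sketch of JWT verification. It assumes HS256 (a shared secret) only to stay self-contained. A production gateway would more typically verify RS256 or ES256 signatures against public keys fetched from a JWKS endpoint, usually through a maintained JWT library. The `tenant_id` claim name is an assumption, and `sign_token` exists only to produce a token for the demo.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(segment):
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_token(claims, secret):
    """Demo helper: mint an HS256 token so the verifier has something to check."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def verify_token(token, secret, now=None):
    """Return the claims dict if the signature and expiry are valid, else None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        if header.get("alg") != "HS256":
            return None  # never trust the token's own choice of algorithm
        expected = hmac.new(
            secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
        ).digest()
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
        current = time.time() if now is None else now
        if claims.get("exp", 0) <= current:
            return None
        return claims
    except (ValueError, TypeError, AttributeError):
        return None  # malformed token: reject early


def extract_context(claims):
    # "sub" is the standard subject claim; "tenant_id" is a hypothetical custom claim.
    return {"user_id": claims.get("sub"), "tenant_id": claims.get("tenant_id")}


secret = b"demo-secret"
token = sign_token({"sub": "u1", "tenant_id": "t123", "exp": time.time() + 60}, secret)
print(extract_context(verify_token(token, secret)))
```

The extracted context is what the gateway would forward to backend services, which still perform their own fine-grained checks.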


9️⃣ Rate Limiting and Throttling


Why Needed?

Protect system from:


Common Limit Dimensions

IP
user
tenant
API key
route
service


Algorithms


Example

tenant t123:
1000 requests/minute for /api/orders

👉 Interview Answer

Rate limiting protects backend services and enforces fairness.

I would support limits by user, tenant, API key, IP, and route.

Token bucket is a good default because it allows controlled bursts while enforcing average rate.
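A minimal single-node token bucket might look like the sketch below. `TokenBucket` and `RateLimiter` are hypothetical names, and the injectable `clock` parameter is there only to make the behavior easy to test. The bucket starts full, so a caller can burst up to `capacity` and is then held to the refill rate.

```python
import time


class TokenBucket:
    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.updated_at = clock()

    def allow(self, cost=1.0):
        # Refill lazily based on elapsed time, capped at capacity.
        now = self.clock()
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class RateLimiter:
    """One bucket per scope key, for example a user ID or tenant ID."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.buckets = {}

    def allow(self, key):
        bucket = self.buckets.get(key)
        if bucket is None:
            bucket = TokenBucket(self.limit, self.limit / self.window_seconds, self.clock)
            self.buckets[key] = bucket
        return bucket.allow()


# Mirrors the "standard-user" policy: 1000 requests per minute, scoped by user ID.
limiter = RateLimiter(limit=1000, window_seconds=60)
print(limiter.allow("user-42"))
```

This state lives in one process. Enforcing a global limit across many gateway nodes needs shared state or regional limits, which this sketch does not attempt.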


🔟 Request Validation and Transformation


Request Validation

Validate:


Request Transformation

Examples:


Response Transformation

Examples:


👉 Interview Answer

The gateway can validate request shape and apply lightweight transformations.

However, heavy business logic should not live in the gateway, because that makes it harder to maintain and scale independently.
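Shape validation at the gateway can stay very small. The sketch below checks body size, content type, JSON well-formedness, and required top-level fields, and nothing more. The `validate_request` function, the size limit, and the field names in the example call are all assumptions for illustration.

```python
import json

MAX_BODY_BYTES = 1_000_000  # hypothetical limit


def validate_request(headers, body, required_fields):
    """Return a list of validation errors; an empty list means the request is acceptable."""
    errors = []
    if len(body) > MAX_BODY_BYTES:
        errors.append("body too large")
    content_type = headers.get("Content-Type", "").split(";")[0].strip().lower()
    if content_type != "application/json":
        return errors + ["unsupported content type"]
    try:
        payload = json.loads(body)
    except ValueError:
        return errors + ["body is not valid JSON"]
    if not isinstance(payload, dict):
        return errors + ["body must be a JSON object"]
    errors += [f"missing field: {name}" for name in required_fields if name not in payload]
    return errors


print(validate_request(
    {"Content-Type": "application/json"},
    b'{"amount": 10}',
    ["amount", "currency"],
))
```

Whether the amount is within the customer's credit limit is a business rule, so that check belongs in the payment service rather than here.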


1️⃣1️⃣ Service Discovery and Load Balancing


Service Discovery Options


Load Balancing Strategies


Health Checks

The gateway should avoid unhealthy instances.

Only route to healthy endpoints.

👉 Interview Answer

The gateway needs service discovery to know where backend services are running.

It should use health-aware load balancing and avoid sending traffic to unhealthy instances.
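Health-aware load balancing can be sketched as round robin over whichever instances are currently marked healthy. Round robin is just one possible strategy, chosen here for brevity. `Instance` mirrors the service registry entry shown earlier, and the extra host addresses are made up.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Instance:
    host: str
    port: int
    healthy: bool = True


class RoundRobinBalancer:
    def __init__(self, instances):
        self.instances = list(instances)
        self._counter = count()

    def pick(self):
        """Return the next healthy instance, or None if every instance is unhealthy."""
        healthy = [instance for instance in self.instances if instance.healthy]
        if not healthy:
            return None
        return healthy[next(self._counter) % len(healthy)]


a = Instance("10.0.1.10", 8080)
b = Instance("10.0.1.11", 8080)
balancer = RoundRobinBalancer([a, b])
b.healthy = False  # e.g. flipped by a failed health check
print(balancer.pick())
```

When `pick` returns `None`, the gateway has nothing to route to and should return a controlled error such as 503 instead of hanging.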


1️⃣2️⃣ Resilience Policies


Timeout

Every upstream call should have a timeout.


Retry

Retry only safe operations.

Good candidates:

GET
idempotent PUT
idempotent DELETE

Be careful with:

POST payment
POST order

Circuit Breaker

Temporarily stop sending traffic to a failing service.


Bulkhead

Limit how many resources one backend can consume.


👉 Interview Answer

The gateway should enforce resilience policies like timeouts, limited retries, circuit breakers, and bulkheads.

Retries must be used carefully, especially for non-idempotent operations like payments or order creation.
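The retry rule can be sketched as a wrapper that only retries methods HTTP defines as idempotent and backs off exponentially between attempts. `call_with_retry`, `UpstreamError`, and the default values are hypothetical. A real gateway would also cap the total retry budget and add jitter.

```python
import time

# Methods defined as idempotent by HTTP semantics.
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS"}


class UpstreamError(Exception):
    """Raised by `send` when the upstream call fails in a retryable way."""


def call_with_retry(method, send, max_attempts=3, base_delay=0.05, sleep=time.sleep):
    """Call send(); retry with exponential backoff only for idempotent methods."""
    attempts = max_attempts if method.upper() in IDEMPOTENT_METHODS else 1
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except UpstreamError:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


state = {"calls": 0}


def flaky_upstream():
    state["calls"] += 1
    if state["calls"] < 3:
        raise UpstreamError("temporary failure")
    return "ok"


print(call_with_retry("GET", flaky_upstream), state["calls"])
```

A POST could be made safe to retry if the backend honors an idempotency key, but that is a contract with the service, not something the gateway can assume.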


1️⃣3️⃣ API Versioning


Versioning Approaches

Path Versioning

/api/v1/orders
/api/v2/orders

Header Versioning

Accept-Version: v2

Weighted Version Routing

5% traffic → v2
95% traffic → v1

👉 Interview Answer

The gateway can help with API versioning by routing different versions to different backend services.

Path versioning is simple, while header-based or weighted routing gives more flexibility for gradual migration.


1️⃣4️⃣ Observability


Gateway Should Emit

access logs
metrics
distributed traces


Important Fields

request_id
trace_id
user_id
tenant_id
route_id
upstream_service
status_code
latency_ms

👉 Interview Answer

The gateway is a great place for observability because all external traffic passes through it.

I would emit access logs, metrics, and distributed traces with request ID, route ID, user ID, tenant ID, upstream service, status code, and latency.
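A structured access log line carrying the fields listed above might be built like this. The function names and the JSON-per-line format are assumptions. The point is only that every request produces one machine-parseable record with a consistent set of keys.

```python
import json
import uuid


def access_log_entry(*, route_id, upstream_service, status_code, started_at,
                     finished_at, user_id=None, tenant_id=None,
                     request_id=None, trace_id=None):
    """Build one access log record; timestamps are in seconds."""
    return {
        "request_id": request_id or str(uuid.uuid4()),
        "trace_id": trace_id,
        "user_id": user_id,
        "tenant_id": tenant_id,
        "route_id": route_id,
        "upstream_service": upstream_service,
        "status_code": status_code,
        "latency_ms": round((finished_at - started_at) * 1000, 3),
    }


def emit(entry, sink=print):
    # One JSON object per line is easy for a log pipeline to parse.
    sink(json.dumps(entry, sort_keys=True))


emit(access_log_entry(
    route_id="orders-get",
    upstream_service="order-service",
    status_code=200,
    started_at=10.0,
    finished_at=10.25,
    user_id="u1",
    tenant_id="t123",
))
```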


1️⃣5️⃣ Security


Security Responsibilities

TLS termination
token validation
rate limits
CORS
request size limits
header sanitization


Important Rule

The gateway is not the only security boundary.

Backend services should still validate critical permissions.


👉 Interview Answer

The gateway should enforce common security controls, including TLS, token validation, rate limits, CORS, request size limits, and header sanitization.

But backend services should still validate sensitive business permissions.


1️⃣6️⃣ Caching


What Can Be Cached?


Cache Rules


👉 Interview Answer

Gateway caching can reduce backend load, especially for public GET requests.

But caching must be safe.

Cache keys must include user or tenant context when responses are personalized.
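The cache-key rule can be sketched as follows. `cache_key` is a hypothetical helper that returns `None` when a response should not be cached at all. The design choice worth noting is that a personalized response with no caller context is refused rather than cached under a shared key.

```python
import hashlib


def cache_key(method, path, query="", *, personalized=False,
              tenant_id=None, user_id=None):
    """Return a cache key, or None if the response must not be cached."""
    if method.upper() != "GET":
        return None
    parts = ["GET", path, query]
    if personalized:
        scope = [
            f"{name}={value}"
            for name, value in (("tenant", tenant_id), ("user", user_id))
            if value is not None
        ]
        if not scope:
            return None  # refuse to cache rather than risk leaking data across callers
        parts += scope
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


print(cache_key("GET", "/api/products", "page=1"))
print(cache_key("GET", "/api/orders", personalized=True, tenant_id="t123", user_id="u1"))
```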


1️⃣7️⃣ Config Management


Config Includes


Requirements

validation before rollout
versioning
audit trail
rollback support


👉 Interview Answer

Gateway behavior is configuration-driven.

Config changes can affect production traffic immediately, so they should be validated, versioned, audited, and rollbackable.
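Here is a small in-memory sketch of "validated, versioned, audited, and rollbackable". `ConfigStore` and `validate_route` are hypothetical. A real system would persist versions durably and push the active version to gateway nodes.

```python
import copy

REQUIRED_KEYS = ("routeId", "method", "path", "upstreamService")


def validate_route(config):
    """Return a list of problems; an empty list means the config is acceptable."""
    errors = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    if "path" in config and not str(config["path"]).startswith("/"):
        errors.append("path must start with '/'")
    return errors


class ConfigStore:
    def __init__(self, validate):
        self._validate = validate
        self._versions = []   # append-only history doubles as an audit trail
        self._active = None   # index into _versions

    def publish(self, config, author):
        errors = self._validate(config)
        if errors:
            raise ValueError("; ".join(errors))  # invalid config never becomes active
        version = len(self._versions) + 1
        self._versions.append(
            {"version": version, "author": author, "config": copy.deepcopy(config)}
        )
        self._active = len(self._versions) - 1
        return version

    def rollback(self):
        """Re-activate the previous version and return its version number."""
        if not self._active:  # None (nothing published) or 0 (already at the first version)
            raise RuntimeError("no earlier version to roll back to")
        self._active -= 1
        return self._versions[self._active]["version"]

    def active(self):
        if self._active is None:
            return None
        return copy.deepcopy(self._versions[self._active]["config"])

    def history(self):
        return [{"version": v["version"], "author": v["author"]} for v in self._versions]


store = ConfigStore(validate_route)
route = {"routeId": "orders-get", "method": "GET",
         "path": "/api/orders/{orderId}", "upstreamService": "order-service"}
store.publish(route, author="alice")
store.publish({**route, "upstreamService": "order-service-v2"}, author="bob")
store.rollback()
print(store.active()["upstreamService"], store.history())
```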


1️⃣8️⃣ Scaling Patterns


Pattern 1: Stateless Gateway Nodes

Easy horizontal scaling.


Pattern 2: Global Load Balancer

Routes users to nearest healthy region.


Pattern 3: Local Caches

Cache config, JWKS, service discovery, and policies.


Pattern 4: Distributed Rate Limiter

Needed for global limits across gateway nodes.


Pattern 5: Multi-region Deployment

Avoid single-region dependency.


👉 Interview Answer

API Gateway nodes should be mostly stateless, so they can scale horizontally.

Config and discovery data can be cached locally.

For global rate limits, we need a distributed rate limiter or regional limits with reconciliation.


1️⃣9️⃣ Failure Handling


Common Failures


Strategies


👉 Interview Answer

The gateway should fail safely.

If config service is unavailable, use last-known-good config.

If auth key fetching fails, use cached public keys until TTL expires.

If an upstream is unhealthy, route around it or return a controlled error.
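The first two fallbacks share one pattern: keep the last value that was fetched successfully and serve it when a refresh fails. The `LastKnownGood` class below is a hypothetical sketch. With `max_age=None` it models last-known-good config that is served indefinitely, and with a finite `max_age` it models cached auth keys that stop being trusted once their TTL runs out.

```python
import time


class LastKnownGood:
    def __init__(self, fetch, refresh_after, max_age=None, clock=time.monotonic):
        self.fetch = fetch
        self.refresh_after = refresh_after  # seconds before a refresh is attempted
        self.max_age = max_age              # seconds a stale value may still be served
        self.clock = clock
        self._value = None
        self._fetched_at = None

    def get(self):
        now = self.clock()
        age = None if self._fetched_at is None else now - self._fetched_at
        if age is not None and age < self.refresh_after:
            return self._value  # still fresh, no fetch needed
        try:
            self._value = self.fetch()
            self._fetched_at = now
            return self._value
        except Exception:
            # Refresh failed: fall back to the stale value if we are allowed to.
            if age is not None and (self.max_age is None or age < self.max_age):
                return self._value
            raise


def fetch_config():
    raise ConnectionError("config service unavailable")


config = LastKnownGood(fetch_config, refresh_after=10)
try:
    config.get()
except ConnectionError as error:
    print("no last-known-good value yet:", error)
```

The cold-start case shown at the end is the one the pattern cannot cover: a node that has never fetched a value has nothing to fall back to, so it should not start serving traffic.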


2️⃣0️⃣ Consistency Model


Stronger Consistency Needed For


Eventual Consistency Acceptable For


👉 Interview Answer

API Gateways use mixed consistency.

Security-sensitive policies and emergency deny rules need fast and reliable propagation.

Normal config changes, metrics, and logs can be eventually consistent.


2️⃣1️⃣ End-to-End Flow


Normal Request Flow

Client request
→ DNS / Load Balancer
→ API Gateway
→ TLS termination
→ Route match
→ Auth validation
→ Rate limit check
→ Request validation
→ Load balance to backend
→ Backend response
→ Gateway logs metrics/traces
→ Return response

Config Update Flow

Admin updates route config
→ Config validation
→ Versioned config saved
→ Config published
→ Gateway nodes pull or receive update
→ Gateways apply new config
→ Metrics monitored

Failure Flow

Backend errors increase
→ Circuit breaker opens
→ Gateway stops routing temporarily
→ Requests fail fast or use fallback
→ Health checks detect that the service has recovered
→ Circuit breaker closes
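
The failure flow above maps onto a small state machine. This is a sketch under simple assumptions: a count of consecutive failures opens the circuit, and after a fixed timeout the breaker goes half-open and lets trial requests through. `CircuitBreaker` and its default thresholds are hypothetical, and the half-open state here stands in for the recovery check.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.state = "closed"
        self.failures = 0
        self.opened_at = None

    def allow_request(self):
        if self.state == "open":
            if self.clock() - self.opened_at >= self.reset_timeout:
                self.state = "half_open"  # timeout elapsed: allow trial requests
                return True
            return False  # fail fast while the circuit is open
        return True

    def record_success(self):
        self.failures = 0
        self.state = "closed"

    def record_failure(self):
        self.failures += 1
        # A failure while half-open re-opens immediately; otherwise wait for the threshold.
        if self.state == "half_open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()


breaker = CircuitBreaker(failure_threshold=3, reset_timeout=30.0)
for _ in range(3):
    if breaker.allow_request():
        breaker.record_failure()
print(breaker.state, breaker.allow_request())
```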

Key Insight

An API Gateway is not just a reverse proxy; it is a centralized traffic control, policy enforcement, and resilience layer.


🧠 Staff-Level Answer (Final)


👉 Interview Answer (Full Version)

When designing an API Gateway, I think of it as the entry point and traffic control layer for backend services.

The gateway handles cross-cutting concerns such as routing, TLS termination, authentication, authorization, rate limiting, request validation, transformation, observability, and resilience policies.

A request first reaches the gateway through DNS or a load balancer. The gateway terminates TLS, matches the route, validates the caller’s token or API key, extracts user and tenant context, checks rate limits, validates the request, and forwards it to a healthy backend instance.

Routing can be path-based, host-based, header-based, or weighted for canary releases.

For authentication, the gateway can validate JWTs, API keys, or mTLS certificates, but backend services should still enforce fine-grained business authorization.

Rate limiting should support dimensions such as IP, user, tenant, API key, route, and service.

Token bucket is a good default because it supports bursts while controlling average rate.

The gateway should enforce timeouts, retries for safe idempotent requests, circuit breakers, and health-aware load balancing.

API Gateway nodes should be mostly stateless and horizontally scalable.

Route config, service discovery data, auth public keys, and policies can be cached locally.

For failure handling, the gateway should use last-known-good config, cached auth keys, health checks, circuit breakers, and regional failover.

The main trade-offs are latency, reliability, security, operational complexity, and how much logic belongs in the gateway versus backend services.

Ultimately, the goal is to provide a secure, reliable, observable, and scalable entry point for all API traffic without turning the gateway into a business-logic bottleneck.


⭐ Final Insight

At its core, an API Gateway is not a simple reverse proxy. It is an entry-point control layer that combines routing, auth, rate limiting, observability, resilience, and traffic control.



中文部分


🎯 Design API Gateway


1️⃣ 核心框架

在设计 API Gateway 时,我通常从以下几个方面分析:

  1. Request routing 和 service discovery
  2. Authentication 和 authorization
  3. Rate limiting 和 throttling
  4. Request / response transformation
  5. Load balancing 和 resilience
  6. Observability 和 logging
  7. Security 和 policy enforcement
  8. 核心权衡:latency vs control vs reliability

2️⃣ 核心需求


功能需求


非功能需求


👉 面试回答

API Gateway 是 client traffic 的入口。

它处理 routing、authentication、authorization、 rate limiting、TLS termination、request validation、 observability 和 resilience policies。

核心挑战是在执行这些 cross-cutting concerns 的同时, 不引入过多 latency, 也不能成为 single point of failure。


3️⃣ 核心概念


API Gateway

位于 clients 和 backend services 之间的统一入口层。

Client → API Gateway → Backend Services

Route

Route 将 incoming request 映射到 backend service。

示例:

GET /api/orders/{id} → order-service
POST /api/payments → payment-service

Policy

Policy 定义 gateway 上应用的行为。

例如:


Upstream Service

接收请求的 backend service。


👉 面试回答

我会把 API Gateway 看作 policy enforcement 和 routing layer。

它应该集中处理 auth、rate limiting、TLS、 observability 和 traffic control 等通用能力, 但 business logic 应该留在 backend services 中。


4️⃣ Main APIs / Config


Route Config

{
  "routeId": "orders-get",
  "method": "GET",
  "path": "/api/orders/{orderId}",
  "upstreamService": "order-service",
  "authRequired": true,
  "rateLimitPolicy": "standard-user"
}

Rate Limit Policy

{
  "policyId": "standard-user",
  "limit": 1000,
  "window": "1m",
  "scope": "userId"
}

Service Registry Entry

{
  "serviceName": "order-service",
  "instances": [
    {
      "host": "10.0.1.10",
      "port": 8080,
      "healthy": true
    }
  ]
}

Gateway Admin API

POST /api/gateway/routes
PATCH /api/gateway/routes/{routeId}
GET /api/gateway/metrics

👉 面试回答

Gateway 通常主要由 configuration 驱动。

Route configs 定义 traffic 去哪里, policies 定义如何处理 requests, service discovery 告诉 gateway 哪些 backend instances 是健康的。


5️⃣ High-Level Architecture


Client
→ DNS / Global Load Balancer
→ API Gateway Cluster
→ Auth / Policy Engine
→ Rate Limiter
→ Router
→ Load Balancer
→ Backend Services

Gateway Logs / Metrics / Traces
→ Observability Pipeline

Main Components

Listener


Auth Module


Policy Engine


Rate Limiter


Router


Load Balancer


👉 面试回答

Gateway 接收 request, terminate TLS, authenticate caller, 应用 policies, 检查 rate limits, 路由请求, load balance 到健康 backend, 并记录 logs、metrics 和 traces。


6️⃣ Request Flow


Client sends request
→ Gateway terminates TLS
→ Match route
→ Authenticate request
→ Authorize access
→ Validate request
→ Apply rate limit
→ Transform request if needed
→ Select backend service
→ Forward request
→ Receive response
→ Transform response if needed
→ Return response to client
→ Emit logs/metrics/traces

👉 面试回答

Request processing 应该模块化。

每个阶段负责一个职责: route matching、authentication、authorization、 rate limiting、validation、transformation、 forwarding 和 observability。

这样 policies 更容易配置和理解。


7️⃣ Routing


Routing Types

Path-based Routing

/api/users/* → user-service
/api/orders/* → order-service

Host-based Routing

api.example.com → public-api
admin.example.com → admin-api

Header-based Routing

X-Version: v2 → service-v2

Weighted Routing

90% → service-v1
10% → service-v2

用于:


👉 面试回答

Gateway 应该支持 path-based、host-based、 header-based 和 weighted routing。

Weighted routing 对 canary deployments 和新 service version 的 gradual rollout 很有用。


8️⃣ Authentication and Authorization


Authentication

常见方式:


Gateway Responsibilities


Authorization

可以发生在:

Gateway level: coarse-grained access
Service level: fine-grained business permission

👉 面试回答

Gateway 应该处理 coarse-grained authentication 和基础 authorization。

它可以验证 JWTs 或 API keys, 提取 user 和 tenant context, 并提前拒绝 unauthorized requests。

Fine-grained business authorization 仍然应该放在 backend services 中。


9️⃣ Rate Limiting and Throttling


Why Needed?

保护系统免受:


Common Limit Dimensions


Algorithms


Example

tenant t123:
1000 requests/minute for /api/orders

👉 面试回答

Rate limiting 用来保护 backend services 并保证 fairness。

我会支持按 user、tenant、API key、IP 和 route 限流。

Token bucket 是好的默认选择, 因为它允许受控 burst, 同时限制平均速率。


🔟 Request Validation and Transformation


Request Validation

验证:


Request Transformation

示例:


Response Transformation

示例:


👉 面试回答

Gateway 可以验证 request shape 并执行轻量 transformations。

但是 heavy business logic 不应该放在 gateway, 否则维护和独立扩展会变得困难。


1️⃣1️⃣ Service Discovery and Load Balancing


Service Discovery Options


Load Balancing Strategies


Health Checks

Gateway 应该避免 unhealthy instances。

only route to healthy endpoints

👉 面试回答

Gateway 需要 service discovery, 才知道 backend services 运行在哪里。

它应该使用 health-aware load balancing, 避免把流量发送到 unhealthy instances。


1️⃣2️⃣ Resilience Policies


Timeout

每个 upstream call 都应该有 timeout。


Retry

只 retry 安全操作。

适合:

GET
idempotent PUT
idempotent DELETE

谨慎:

POST payment
POST order

Circuit Breaker

临时停止向失败 service 发送流量。


Bulkhead

限制某个 backend 消耗的资源量。


👉 面试回答

Gateway 应该执行 resilience policies, 例如 timeouts、limited retries、 circuit breakers 和 bulkheads。

Retries 必须谨慎使用, 尤其是 payments 或 order creation 这类 non-idempotent operations。


1️⃣3️⃣ API Versioning


Versioning Approaches

Path Versioning

/api/v1/orders
/api/v2/orders

Header Versioning

Accept-Version: v2

Weighted Version Routing

5% traffic → v2
95% traffic → v1

👉 面试回答

Gateway 可以帮助处理 API versioning, 将不同版本路由到不同 backend services。

Path versioning 简单; header-based 或 weighted routing 更适合 gradual migration。


1️⃣4️⃣ Observability


Gateway Should Emit


Important Fields

request_id
trace_id
user_id
tenant_id
route_id
upstream_service
status_code
latency_ms

👉 面试回答

Gateway 是 observability 的好位置, 因为所有 external traffic 都经过它。

我会输出 access logs、metrics 和 distributed traces, 包含 request ID、route ID、user ID、 tenant ID、upstream service、status code 和 latency。


1️⃣5️⃣ Security


Security Responsibilities


Important Rule

Gateway 不是唯一安全边界。

Backend services 仍然应该验证关键权限。


👉 面试回答

Gateway 应该执行通用安全控制, 包括 TLS、token validation、rate limits、 CORS、request size limits 和 header sanitization。

但是 backend services 仍然应该验证敏感业务权限。


1️⃣6️⃣ Caching


What Can Be Cached?


Cache Rules


👉 面试回答

Gateway caching 可以降低 backend load, 特别适合 public GET requests。

但 caching 必须安全。

如果 response 是 personalized, cache key 必须包含 user 或 tenant context。


1️⃣7️⃣ Config Management


Config Includes


Requirements


👉 面试回答

Gateway behavior 是 configuration-driven。

Config changes 可以立即影响 production traffic, 所以它们必须被 validated、versioned、 audited,并支持 rollback。


1️⃣8️⃣ Scaling Patterns


Pattern 1: Stateless Gateway Nodes

方便 horizontal scaling。


Pattern 2: Global Load Balancer

将用户路由到最近健康 region。


Pattern 3: Local Caches

缓存 config、JWKS、service discovery 和 policies。


Pattern 4: Distributed Rate Limiter

用于跨 gateway nodes 的 global limits。


Pattern 5: Multi-region Deployment

避免 single-region dependency。


👉 面试回答

API Gateway nodes 应该尽量 stateless, 这样可以水平扩展。

Config 和 discovery data 可以本地缓存。

对 global rate limits, 需要 distributed rate limiter, 或使用 regional limits 加 reconciliation。


1️⃣9️⃣ Failure Handling


Common Failures


Strategies


👉 面试回答

Gateway 应该 fail safely。

如果 config service 不可用, 使用 last-known-good config。

如果 auth key fetching 失败, 在 TTL 内使用 cached public keys。

如果 upstream 不健康, gateway 应该绕开它或返回受控错误。


2️⃣0️⃣ Consistency Model


需要较强一致性的场景


可以最终一致的场景


👉 面试回答

API Gateway 使用 mixed consistency。

Security-sensitive policies 和 emergency deny rules 需要快速且可靠地传播。

Normal config changes、metrics 和 logs 可以最终一致。


2️⃣1️⃣ End-to-End Flow


Normal Request Flow

Client request
→ DNS / Load Balancer
→ API Gateway
→ TLS termination
→ Route match
→ Auth validation
→ Rate limit check
→ Request validation
→ Load balance to backend
→ Backend response
→ Gateway logs metrics/traces
→ Return response

Config Update Flow

Admin updates route config
→ Config validation
→ Versioned config saved
→ Config published
→ Gateway nodes pull or receive update
→ Gateways apply new config
→ Metrics monitored

Failure Flow

Backend errors increase
→ Circuit breaker opens
→ Gateway stops routing temporarily
→ Requests fail fast or use fallback
→ Health checks recover service
→ Circuit breaker closes

Key Insight

API Gateway 不是简单 reverse proxy, 而是 centralized traffic control、policy enforcement 和 resilience layer。


🧠 Staff-Level Answer(最终版)


👉 面试回答(完整背诵版)

在设计 API Gateway 时, 我会把它看作 backend services 的入口 和 traffic control layer。

Gateway 负责处理 cross-cutting concerns, 包括 routing、TLS termination、authentication、 authorization、rate limiting、request validation、 transformation、observability 和 resilience policies。

一个 request 首先通过 DNS 或 load balancer 到达 gateway。 Gateway terminate TLS, 匹配 route, 验证 caller token 或 API key, 提取 user 和 tenant context, 检查 rate limits, 验证 request, 然后转发到健康的 backend instance。

Routing 可以是 path-based、host-based、 header-based,或者用于 canary release 的 weighted routing。

对 authentication, gateway 可以验证 JWTs、API keys 或 mTLS certificates, 但 backend services 仍然应该执行 fine-grained business authorization。

Rate limiting 应该支持 IP、user、tenant、 API key、route 和 service 等维度。

Token bucket 是好的默认选择, 因为它支持 burst, 同时控制平均速率。

Gateway 应该执行 timeouts、 对安全幂等请求执行 retries、 使用 circuit breakers, 并执行 health-aware load balancing。

API Gateway nodes 应该尽量 stateless, 方便水平扩展。

Route config、service discovery data、 auth public keys 和 policies 可以本地缓存。

对 failure handling, gateway 应该使用 last-known-good config、 cached auth keys、health checks、 circuit breakers 和 regional failover。

核心权衡包括 latency、reliability、security、 operational complexity, 以及哪些逻辑应该放在 gateway, 哪些应该留在 backend services。

最终目标是为所有 API traffic 提供一个 secure、reliable、observable 和 scalable 的入口, 但不要让 gateway 变成 business-logic bottleneck。


⭐ Final Insight

API Gateway 的核心不是简单反向代理, 而是一个集 routing、auth、rate limiting、observability、resilience 和 traffic control 于一体的入口控制层。
