d&d-t System Design Deep Dive

🎯 Design API Gateway

1️⃣ Core Framework

When discussing API Gateway design, I frame it as:

Request routing and service discovery
Authentication and authorization
Rate limiting and throttling
Request / response transformation
Load balancing and resilience
Observability and logging
Security and policy enforcement
Trade-offs: latency vs control vs reliability

2️⃣ Core Requirements

Functional Requirements

Route client requests to backend services
Support path-based and host-based routing
Authenticate requests
Authorize access to APIs
Enforce rate limits
Support TLS termination
Support request validation
Support request / response transformation
Support API versioning
Support logging, metrics, and tracing

Non-functional Requirements

Low latency
High availability
High throughput
Scalable routing
Secure by default
Fault isolation
Good observability
Graceful degradation

👉 Interview Answer

An API Gateway is the entry point for client traffic.

It handles routing, authentication, authorization, rate limiting, TLS termination, request validation, observability, and resilience policies.

The main challenge is enforcing cross-cutting concerns without adding too much latency or becoming a single point of failure.

3️⃣ Core Concepts

API Gateway

A centralized entry layer between clients and backend services.

Client → API Gateway → Backend Services

Route

A route maps an incoming request to a backend service.

Example:

GET /api/orders/{id} → order-service
POST /api/payments → payment-service

Policy

A policy defines behavior applied at the gateway.

Examples:

Auth policy
Rate limit policy
Retry policy
Timeout policy
Logging policy

Upstream Service

The backend service that receives the request.

👉 Interview Answer

I would treat the API Gateway as a policy enforcement and routing layer.

It should centralize cross-cutting concerns like auth, rate limiting, TLS, observability, and traffic control, while keeping business logic inside backend services.

4️⃣ Main APIs / Config

Route Config

{
  "routeId": "orders-get",
  "method": "GET",
  "path": "/api/orders/{orderId}",
  "upstreamService": "order-service",
  "authRequired": true,
  "rateLimitPolicy": "standard-user"
}

Rate Limit Policy

{
  "policyId": "standard-user",
  "limit": 1000,
  "window": "1m",
  "scope": "userId"
}

Service Registry Entry

{
  "serviceName": "order-service",
  "instances": [
    {
      "host": "10.0.1.10",
      "port": 8080,
      "healthy": true
    }
  ]
}

Gateway Admin API

POST /api/gateway/routes
PATCH /api/gateway/routes/{routeId}
GET /api/gateway/metrics

👉 Interview Answer

The gateway is mostly driven by configuration.

Route configs define where traffic goes, policies define how requests are handled, and service discovery tells the gateway which backend instances are healthy.
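
A minimal Python sketch of how a route config like the one above could be matched against an incoming request. The `Route` dataclass mirrors the JSON fields; `compile_template` and `match_route` are hypothetical helper names, and a real gateway would precompile patterns and likely use a trie or radix tree rather than a linear scan.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    route_id: str
    method: str
    path: str               # template such as /api/orders/{orderId}
    upstream_service: str
    auth_required: bool
    rate_limit_policy: str


def compile_template(template):
    """Turn '/api/orders/{orderId}' into a regex with one named group per parameter."""
    parts = []
    for segment in template.strip("/").split("/"):
        if segment.startswith("{") and segment.endswith("}"):
            parts.append(f"(?P<{segment[1:-1]}>[^/]+)")
        else:
            parts.append(re.escape(segment))
    return re.compile("^/" + "/".join(parts) + "$")


def match_route(routes, method, path):
    """Return (route, path_params) for the first matching route, or None."""
    for route in routes:
        if route.method != method:
            continue
        found = compile_template(route.path).match(path)
        if found:
            return route, found.groupdict()
    return None


ROUTES = [
    Route("orders-get", "GET", "/api/orders/{orderId}",
          "order-service", True, "standard-user"),
]
```

Usage: `match_route(ROUTES, "GET", "/api/orders/42")` yields the `orders-get` route plus the extracted `orderId`, while an unknown method or path yields `None`, which the gateway would turn into a 404 or 405.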

5️⃣ High-Level Architecture

Client
→ DNS / Global Load Balancer
→ API Gateway Cluster
→ Auth / Policy Engine
→ Rate Limiter
→ Router
→ Load Balancer
→ Backend Services

Gateway Logs / Metrics / Traces
→ Observability Pipeline

Main Components

Listener

Accepts HTTP / HTTPS requests
Handles TLS termination

Auth Module

Validates tokens or API keys
Extracts user / tenant context

Policy Engine

Applies route-specific rules
Enforces auth, quotas, validation, and transformations

Rate Limiter

Protects backend services
Enforces user / tenant / IP limits

Router

Matches request path and method
Selects upstream service

Load Balancer

Chooses healthy backend instance

👉 Interview Answer

The gateway receives the request, terminates TLS, matches the route, authenticates the caller, applies that route's policies and rate limits, load balances to a healthy backend instance, and records logs, metrics, and traces.

6️⃣ Request Flow

Client sends request
→ Gateway terminates TLS
→ Match route
→ Authenticate request
→ Authorize access
→ Apply rate limit
→ Validate request
→ Transform request if needed
→ Select backend service
→ Forward request
→ Receive response
→ Transform response if needed
→ Return response to client
→ Emit logs/metrics/traces

👉 Interview Answer

Request processing should be modular.

Each stage handles one responsibility: route matching, authentication, authorization, rate limiting, validation, transformation, forwarding, and observability.

This makes policies easier to configure and reason about.
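
One way to picture the modular stages is a simple pipeline in which each stage either enriches the request or rejects it. This is a sketch only: `Request`, `Reject`, `run_pipeline`, and the two toy stages are hypothetical names, and the stage bodies are placeholders rather than real auth or rate limiting.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    method: str
    path: str
    headers: dict = field(default_factory=dict)
    context: dict = field(default_factory=dict)   # filled in by stages


class Reject(Exception):
    """Raised by a stage to stop processing with an HTTP status."""

    def __init__(self, status, reason):
        super().__init__(reason)
        self.status = status
        self.reason = reason


def authenticate(req):
    token = req.headers.get("Authorization")
    if not token:
        raise Reject(401, "missing credentials")
    # Placeholder: a real stage would validate the token and read its claims.
    req.context["user_id"] = "user-" + token[-4:]


def rate_limit(req):
    # Placeholder: a demo header stands in for a real limiter decision.
    if req.headers.get("X-Demo-Over-Limit") == "1":
        raise Reject(429, "rate limit exceeded")


def run_pipeline(req, stages, forward):
    """Run stages in order; the first rejection short-circuits the request."""
    try:
        for stage in stages:
            stage(req)
    except Reject as rejection:
        return rejection.status, rejection.reason
    return forward(req)
```

Because each stage is just a function, a route's policy list can be expressed as configuration: which stages to run, and in what order.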

7️⃣ Routing

Routing Types

Path-based Routing

/api/users/* → user-service
/api/orders/* → order-service

Host-based Routing

api.example.com → public-api
admin.example.com → admin-api

Header-based Routing

X-Version: v2 → service-v2

Weighted Routing

90% → service-v1
10% → service-v2

Used for:

Canary release
Blue-green deployment
A/B testing

👉 Interview Answer

The gateway should support path-based, host-based, header-based, and weighted routing.

Weighted routing is useful for canary deployments and gradual rollout of new service versions.
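
A sketch of weighted routing for a canary, assuming we want the same caller to land on the same version every time (sticky assignment). Hashing a stable key such as the user ID is one common approach; picking randomly per request is the simpler alternative. `pick_upstream` is a hypothetical helper.

```python
import hashlib


def pick_upstream(key, weights):
    """Map a stable key (e.g. user ID) onto one upstream according to weights.

    weights is a list of (upstream_name, weight) pairs with non-negative weights.
    """
    total = sum(weight for _, weight in weights)
    if total <= 0 or any(weight < 0 for _, weight in weights):
        raise ValueError("weights must be non-negative and sum to a positive number")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    cumulative = 0
    for upstream, weight in weights:
        cumulative += weight
        if bucket < cumulative:
            return upstream
    raise AssertionError("unreachable: bucket is always below the total weight")


WEIGHTS = [("service-v1", 90), ("service-v2", 10)]
```

Usage: `pick_upstream("user-42", WEIGHTS)` returns the same service on every call for that user, and across many users roughly 10% are assigned to `service-v2`. Rolling forward is a config change to the weights.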

8️⃣ Authentication and Authorization

Authentication

Common methods:

JWT
OAuth2
API key
mTLS
Session cookie
Service-to-service token

Gateway Responsibilities

Validate token signature
Check token expiration
Extract claims
Attach identity context to request
Reject invalid requests early

Authorization

Can happen at:

Gateway level: coarse-grained access
Service level: fine-grained business permission

👉 Interview Answer

The gateway should handle authentication and coarse-grained authorization.

It can validate JWTs or API keys, extract user and tenant context, and reject unauthorized requests early.

Fine-grained business authorization should still live in backend services.
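
To make the gateway responsibilities concrete (validate signature, check expiration, extract claims), here is a standard-library sketch for an HS256-signed JWT. The shared-secret HS256 setup is an assumption made to keep the example self-contained; production gateways more often verify RS256 or ES256 tokens against cached JWKS keys using a vetted JWT library. `sign_hs256` exists only to build a token for the demo.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(part):
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def _b64url_encode(raw):
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims, secret):
    """Demo helper: build a compact HS256 token from a claims dict."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    mac = hmac.new(secret, f"{header}.{payload}".encode("ascii"), hashlib.sha256)
    return f"{header}.{payload}.{_b64url_encode(mac.digest())}"


def verify_hs256(token, secret, now=None):
    """Return the claims if signature and expiry check out, else raise ValueError."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        signature = _b64url_decode(sig_b64)
        signed = f"{header_b64}.{payload_b64}".encode("ascii")
    except ValueError as exc:   # also covers bad base64, bad JSON, non-ASCII input
        raise ValueError("malformed token") from exc
    # Pin the algorithm instead of trusting whatever the token header claims.
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, signed, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if "exp" not in claims or current >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

After a successful check, the gateway would attach fields such as `sub` and a tenant claim to the request context; the claim names used by a given identity provider are deployment-specific.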

9️⃣ Rate Limiting and Throttling

Why Needed?

Protect system from:

Abuse
DDoS-like traffic
Buggy clients
Noisy tenants
Backend overload

Common Limit Dimensions

IP address
User ID
Tenant ID
API key
Route
Service
Region

Algorithms

Token bucket
Leaky bucket
Fixed window
Sliding window
Distributed counters

Example

tenant t123:
1000 requests/minute for /api/orders

👉 Interview Answer

Rate limiting protects backend services and enforces fairness.

I would support limits by user, tenant, API key, IP, and route.

Token bucket is a good default because it allows controlled bursts while enforcing average rate.
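
A minimal single-node token bucket sketch. The bucket refills at a steady rate and holds at most `capacity` tokens, which is what permits short bursts. The injectable `clock` parameter is an assumption added for testability; a policy of 1000 requests per minute would roughly correspond to `rate_per_sec=1000/60` with a capacity chosen for the allowed burst.

```python
import time


class TokenBucket:
    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec        # tokens added per second
        self.capacity = capacity        # maximum burst size
        self.clock = clock
        self.tokens = float(capacity)   # start full
        self.updated = clock()

    def allow(self, cost=1.0):
        """Refill based on elapsed time, then try to spend `cost` tokens."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

In a gateway, one bucket would exist per limit key (for example per tenant and route). Across many gateway nodes the counters need shared storage or per-node quotas, which this single-process sketch does not cover.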

🔟 Request Validation and Transformation

Request Validation

Validate:

Required headers
Query parameters
JSON schema
Payload size
Content type
API version

Request Transformation

Examples:

Add user context headers
Rewrite path
Convert external API format to internal format
Remove sensitive headers
Add correlation ID

Response Transformation

Examples:

Normalize error response
Remove internal fields
Compress response
Add caching headers

👉 Interview Answer

The gateway can validate request shape and apply lightweight transformations.

However, heavy business logic should not live in the gateway, because that makes it harder to maintain and scale independently.
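
A sketch of what "lightweight" can mean in practice: shape checks on the way in, then header cleanup before forwarding. The header names, the 1 MB limit, and the JSON-only rule are assumptions for illustration, not values from any particular gateway.

```python
import uuid

# Hypothetical names: identity headers the client must never be able to set.
SENSITIVE_HEADERS = {"x-internal-token", "x-user-id", "x-tenant-id"}
MAX_BODY_BYTES = 1_000_000   # assumed payload limit


def validate(method, headers, body):
    """Return a (status, message) error, or None if the request shape is fine."""
    lowered = {k.lower(): v for k, v in headers.items()}
    if len(body) > MAX_BODY_BYTES:
        return 413, "payload too large"
    if method in {"POST", "PUT", "PATCH"}:
        media_type = lowered.get("content-type", "").split(";")[0].strip().lower()
        if media_type != "application/json":
            return 415, "unsupported content type"
    return None


def transform(headers, identity):
    """Drop spoofable headers, then attach verified identity and a correlation ID."""
    out = {k.lower(): v for k, v in headers.items()
           if k.lower() not in SENSITIVE_HEADERS}
    out["x-user-id"] = identity["user_id"]
    out["x-tenant-id"] = identity["tenant_id"]
    # Keep a caller-supplied correlation ID if present, otherwise mint one.
    out.setdefault("x-correlation-id", str(uuid.uuid4()))
    return out
```

The important ordering is strip first, then attach: identity headers sent by the client are discarded before the gateway writes the values it verified itself.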

1️⃣1️⃣ Service Discovery and Load Balancing

Service Discovery Options

Static config
DNS
Consul / Eureka
Kubernetes service discovery
Cloud service registry

Load Balancing Strategies

Round robin
Least connections
Random
Weighted
Locality-aware routing
Health-aware routing

Health Checks

Gateway should avoid unhealthy instances.

Only route to healthy endpoints.

👉 Interview Answer

The gateway needs service discovery to know where backend services are running.

It should use health-aware load balancing and avoid sending traffic to unhealthy instances.
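
A sketch of health-aware round robin over registry entries shaped like the Service Registry Entry shown earlier. It assumes some other component (health checker or discovery client) keeps the `healthy` flag up to date; `HealthAwareRoundRobin` is a hypothetical name.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Instance:
    host: str
    port: int
    healthy: bool = True


class HealthAwareRoundRobin:
    def __init__(self, instances):
        self.instances = list(instances)
        self._counter = itertools.count()

    def pick(self):
        """Rotate through healthy instances only; None means nothing is routable."""
        healthy = [inst for inst in self.instances if inst.healthy]
        if not healthy:
            return None
        return healthy[next(self._counter) % len(healthy)]
```

When `pick()` returns `None`, the gateway would return a controlled error such as 503 rather than forwarding to a known-bad instance.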

1️⃣2️⃣ Resilience Policies

Timeout

Every upstream call should have a timeout.

Retry

Retry only idempotent operations.

Good candidates:

GET
idempotent PUT
idempotent DELETE

Be careful with:

POST payment
POST order

Circuit Breaker

Stop sending traffic to failing service temporarily.

Bulkhead

Limit how many resources one backend can consume.

👉 Interview Answer

The gateway should enforce resilience policies like timeouts, limited retries, circuit breakers, and bulkheads.

Retries must be used carefully, especially for non-idempotent operations like payments or order creation.
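
A sketch of two of these policies: a retry rule keyed on HTTP method, and a small circuit breaker. The thresholds and the `clock` parameter are assumptions for illustration, and the half-open behavior is simplified: after the cool-down every request is let through until one of them reports success or failure.

```python
import time

# Methods defined as idempotent by HTTP semantics; POST is deliberately absent.
IDEMPOTENT_METHODS = {"GET", "HEAD", "OPTIONS", "PUT", "DELETE"}


def should_retry(method, attempt, max_attempts=3):
    """attempt is the number of tries already made (1 after the first failure)."""
    return attempt < max_attempts and method in IDEMPOTENT_METHODS


class CircuitBreaker:
    def __init__(self, failure_threshold=5, reset_after_sec=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_sec = reset_after_sec
        self.clock = clock
        self.failures = 0
        self.opened_at = None   # None means the circuit is closed

    def allow_request(self):
        if self.opened_at is None:
            return True
        # Half-open: once the cool-down has passed, let trial traffic through.
        return self.clock() - self.opened_at >= self.reset_after_sec

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()   # open, or re-open after a failed trial
```

A gateway would typically keep one breaker per upstream service (or per instance) and check `allow_request()` before forwarding, failing fast while the circuit is open.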

1️⃣3️⃣ API Versioning

Versioning Approaches

Path Versioning

/api/v1/orders
/api/v2/orders

Header Versioning

Accept-Version: v2

Weighted Version Routing

5% traffic → v2
95% traffic → v1

👉 Interview Answer

The gateway can help with API versioning by routing different versions to different backend services.

Path versioning is simple, while header-based or weighted routing gives more flexibility for gradual migration.

1️⃣4️⃣ Observability

Gateway Should Emit

Access logs
Request count
Error rate
Latency
Upstream latency
Rate limit rejections
Auth failures
Route match failures
Circuit breaker state
Trace IDs

Important Fields

request_id
trace_id
user_id
tenant_id
route_id
upstream_service
status_code
latency_ms

👉 Interview Answer

The gateway is a great place for observability because all external traffic passes through it.

I would emit access logs, metrics, and distributed traces with request ID, route ID, user ID, tenant ID, upstream service, status code, and latency.
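
As an illustration, the fields above could be emitted as one structured JSON log line per request. The function name and the choice of JSON are assumptions; the field names follow the Important Fields list.

```python
import json


def access_log_line(*, request_id, trace_id, user_id, tenant_id,
                    route_id, upstream_service, status_code, latency_ms):
    """Serialize one access-log record as a single JSON line."""
    record = {
        "request_id": request_id,
        "trace_id": trace_id,
        "user_id": user_id,
        "tenant_id": tenant_id,
        "route_id": route_id,
        "upstream_service": upstream_service,
        "status_code": status_code,
        "latency_ms": latency_ms,
    }
    # Sorted keys keep lines stable and easy to diff or grep.
    return json.dumps(record, sort_keys=True)
```

Keyword-only arguments make it hard to forget a field at a call site, and one JSON object per line is straightforward for most log pipelines to parse.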

1️⃣5️⃣ Security

Security Responsibilities

TLS termination
mTLS for internal services
JWT / API key validation
WAF integration
Request size limits
Header sanitization
CORS policy
IP allowlist / denylist
DDoS protection integration

Important Rule

Gateway is not the only security boundary.

Backend services should still validate critical permissions.

👉 Interview Answer

The gateway should enforce common security controls, including TLS, token validation, rate limits, CORS, request size limits, and header sanitization.

But backend services should still validate sensitive business permissions.

1️⃣6️⃣ Caching

What Can Be Cached?

Public GET responses
Static metadata
Auth public keys / JWKS
Route config
Service discovery data
Rate limit counters

Cache Rules

Respect cache-control headers
Do not cache user-specific sensitive data accidentally
Include tenant/user context in cache key if needed
Use short TTL for dynamic data

👉 Interview Answer

Gateway caching can reduce backend load, especially for public GET requests.

But caching must be safe.

Cache keys must include user or tenant context when responses are personalized.
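
A sketch of a cache key that folds in identity fields when a route is marked as personalized, plus a conservative cacheability check. `vary_on` and both function names are hypothetical, and the check only looks at `no-store` and `private`; real HTTP caching has more directives to honor.

```python
import hashlib


def cache_key(method, path, query, vary_on=(), identity=None):
    """Build a key from the request plus any identity fields the route varies on."""
    normalized_query = "&".join(f"{k}={v}" for k, v in sorted(query.items()))
    parts = [method.upper(), path, normalized_query]
    for name in vary_on:
        # A missing identity field raises instead of silently sharing an entry.
        parts.append(f"{name}={identity[name]}")
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


def is_cacheable(method, response_headers):
    """Only GET responses not marked no-store or private go in the shared cache."""
    cache_control = response_headers.get("cache-control", "").lower()
    directives = {d.strip().split("=")[0] for d in cache_control.split(",") if d.strip()}
    return method == "GET" and not (directives & {"no-store", "private"})
```

With `vary_on=("tenant_id",)`, two tenants requesting the same URL get different keys, so one tenant's response is never served to another.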

1️⃣7️⃣ Config Management

Config Includes

Routes
Upstreams
Auth policies
Rate limit policies
Timeout / retry policies
Transform rules
CORS rules

Requirements

Versioned config
Validated before publish
Rollback support
Gradual rollout
Audit trail
Environment-specific config

👉 Interview Answer

Gateway behavior is configuration-driven.

Config changes can affect production traffic immediately, so they should be validated, versioned, audited, and easy to roll back.
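
A sketch of the validate, version, audit, and roll back requirements as an in-memory store. The top-level keys `routes` and `rateLimitPolicies` are assumed names for a combined config document, and `ConfigStore` is hypothetical; a real system would persist versions and roll changes out gradually.

```python
import copy


def validate_config(config):
    """Return a list of problems; an empty list means the config may be published."""
    errors = []
    policies = {p["policyId"] for p in config.get("rateLimitPolicies", [])}
    seen = set()
    for route in config.get("routes", []):
        route_id = route.get("routeId")
        if not route_id or route_id in seen:
            errors.append(f"missing or duplicate routeId: {route_id!r}")
        seen.add(route_id)
        if not str(route.get("path", "")).startswith("/"):
            errors.append(f"{route_id}: path must start with '/'")
        policy = route.get("rateLimitPolicy")
        if policy is not None and policy not in policies:
            errors.append(f"{route_id}: unknown rateLimitPolicy {policy!r}")
    return errors


class ConfigStore:
    def __init__(self):
        self.versions = []   # index i holds version i + 1
        self.active = None   # version number currently served
        self.audit = []      # (author, action, version) tuples

    def publish(self, config, author):
        errors = validate_config(config)
        if errors:
            raise ValueError("; ".join(errors))
        self.versions.append(copy.deepcopy(config))
        self.active = len(self.versions)
        self.audit.append((author, "publish", self.active))
        return self.active

    def rollback(self, author):
        if not self.active or self.active == 1:
            raise ValueError("no earlier version to roll back to")
        self.active -= 1
        self.audit.append((author, "rollback", self.active))
        return self.active

    def current(self):
        return self.versions[self.active - 1]
```

A rejected publish leaves the active version untouched, which is the property that matters most: a bad config never reaches the gateway nodes.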

1️⃣8️⃣ Scaling Patterns

Pattern 1: Stateless Gateway Nodes

Easy horizontal scaling.

Pattern 2: Global Load Balancer

Routes users to nearest healthy region.

Pattern 3: Local Caches

Cache config, JWKS, service discovery, and policies.

Pattern 4: Distributed Rate Limiter

Needed for global limits across gateway nodes.

Pattern 5: Multi-region Deployment

Avoid single-region dependency.

👉 Interview Answer

API Gateway nodes should be mostly stateless, so they can scale horizontally.

Config and discovery data can be cached locally.

For global rate limits, we need a distributed rate limiter or regional limits with reconciliation.

1️⃣9️⃣ Failure Handling

Common Failures

Backend service down
Service discovery stale
Auth provider unavailable
Rate limiter unavailable
Gateway config bad
Upstream timeout
Partial regional outage
DDoS traffic spike

Strategies

Circuit breaker
Health-aware routing
Fallback to cached auth keys
Last-known-good config
Graceful degradation
Retry safe requests
Regional failover
Emergency deny / allow rules

👉 Interview Answer

The gateway should fail safely.

If config service is unavailable, use last-known-good config.

If auth key fetching fails, use cached public keys until TTL expires.

If an upstream is unhealthy, route around it or return a controlled error.
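
The last-known-good idea can be sketched as a small wrapper around any fetch-and-validate step (config, JWKS, discovery data). `LastKnownGood` and its `stale` flag are hypothetical; TTL handling for cached auth keys is left out to keep the example short.

```python
class LastKnownGood:
    def __init__(self, fetch, validate):
        self._fetch = fetch         # callable returning a fresh value, may raise
        self._validate = validate   # callable raising if the value is unusable
        self.value = None
        self.stale = False          # True while serving an old value after a failure

    def refresh(self):
        """Try to load a fresh value; fall back to the previous one on any failure."""
        try:
            candidate = self._fetch()
            self._validate(candidate)
        except Exception:
            if self.value is None:
                raise               # first load: nothing safe to fall back to
            self.stale = True
            return self.value
        self.value = candidate
        self.stale = False
        return self.value
```

Exposing `stale` as a metric lets operators see that a node is running on old data, so the fallback does not hide the underlying outage.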

2️⃣0️⃣ Consistency Model

Stronger Consistency Needed For

Security policy changes
Auth revocation
Emergency denylist
Critical route changes
Audit logs

Eventual Consistency Acceptable For

Normal route config propagation
Metrics
Logs aggregation
Service discovery updates
Non-critical rate limit dashboards

👉 Interview Answer

API Gateways use mixed consistency.

Security-sensitive policies and emergency deny rules need fast and reliable propagation.

Normal config changes, metrics, and logs can be eventually consistent.

2️⃣1️⃣ End-to-End Flow

Normal Request Flow

Client request
→ DNS / Load Balancer
→ API Gateway
→ TLS termination
→ Route match
→ Auth validation
→ Rate limit check
→ Request validation
→ Load balance to backend
→ Backend response
→ Gateway emits logs/metrics/traces
→ Return response

Config Update Flow

Admin updates route config
→ Config validation
→ Versioned config saved
→ Config published
→ Gateway nodes pull or receive update
→ Gateways apply new config
→ Metrics monitored

Failure Flow

Backend errors increase
→ Circuit breaker opens
→ Gateway stops routing to that service temporarily
→ Requests fail fast or use fallback
→ Health checks or trial requests confirm recovery
→ Circuit breaker closes

Key Insight

An API Gateway is not just a reverse proxy; it is a centralized traffic control, policy enforcement, and resilience layer.

🧠 Staff-Level Answer (Final)

👉 Interview Answer (Full Version)

When designing an API Gateway, I think of it as the entry point and traffic control layer for backend services.

The gateway handles cross-cutting concerns such as routing, TLS termination, authentication, authorization, rate limiting, request validation, transformation, observability, and resilience policies.

A request first reaches the gateway through DNS or a load balancer. The gateway terminates TLS, matches the route, validates the caller’s token or API key, extracts user and tenant context, checks rate limits, validates the request, and forwards it to a healthy backend instance.

Routing can be path-based, host-based, header-based, or weighted for canary releases.

For authentication, the gateway can validate JWTs, API keys, or mTLS certificates, but backend services should still enforce fine-grained business authorization.

Rate limiting should support dimensions such as IP, user, tenant, API key, route, and service.

Token bucket is a good default because it supports bursts while controlling average rate.

The gateway should enforce timeouts, retries for safe idempotent requests, circuit breakers, and health-aware load balancing.

API Gateway nodes should be mostly stateless and horizontally scalable.

Route config, service discovery data, auth public keys, and policies can be cached locally.

For failure handling, the gateway should use last-known-good config, cached auth keys, health checks, circuit breakers, and regional failover.

The main trade-offs are latency, reliability, security, operational complexity, and how much logic belongs in the gateway versus backend services.

Ultimately, the goal is to provide a secure, reliable, observable, and scalable entry point for all API traffic without turning the gateway into a business-logic bottleneck.

⭐ Final Insight

The core of an API Gateway is not simple reverse proxying; it is an entry control layer that combines routing, auth, rate limiting, observability, resilience, and traffic control.
