
System Design Deep Dive - 07 Global Load Balancing Architectures

Post by ailswan May. 24, 2026


🎯 Global Load Balancing Architectures


1️⃣ Core Framework

When discussing Global Load Balancing, I frame it as:

  1. Why global load balancing exists
  2. Traffic distribution strategies
  3. DNS-based load balancing
  4. Anycast-based routing
  5. Global load balancers
  6. Health checks and failover
  7. Regional routing strategies
  8. Trade-offs: latency vs availability vs complexity

2️⃣ What Is Global Load Balancing?

Global Load Balancing (GLB) distributes user traffic across multiple regions.

Instead of:

Users

↓

Single Region

We have:

Users

↓

Global Load Balancer

↓

Region A

Region B

Region C

Goals

  • Lower latency
  • Higher availability
  • Fault tolerance
  • Scalability


👉 Interview Memorization

Global Load Balancing distributes traffic across multiple geographic regions to improve latency, availability, fault tolerance, and scalability.


3️⃣ Why Not Use One Region?


Single Region Architecture

Users

↓

US-East

Problems

Long Latency

Asia User

↓

US-East

Regional Failure

US-East ❌

↓

Entire Service Down

Capacity Limits

One Region

↓

Finite Capacity

👉 Interview Memorization

Single-region architectures are simple but suffer from higher latency, lower availability, and limited scalability.


4️⃣ Global Traffic Routing


High-Level Flow

User

↓

Global Routing Layer

↓

Best Region

↓

Application

Core Question

What is the best region?

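Possible answers:

  • The region with the lowest latency
  • A region that is healthy
  • A region with spare capacity
  • The cheapest region
  • The region required by business rules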


👉 Interview Memorization

Global traffic routing determines which region should serve each request based on latency, health, capacity, cost, or business rules.
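
The decision above can be sketched as a filter followed by a ranking. This is a minimal illustration, not how any particular product works: the `Region` fields, the 80% utilization ceiling, and the region names are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Region:
    name: str
    healthy: bool
    latency_ms: float   # latency observed from the user's vantage point
    load_rps: int       # current traffic
    capacity_rps: int   # maximum sustainable traffic


def pick_region(regions: list[Region], max_utilization: float = 0.8) -> Optional[Region]:
    """Return the lowest-latency region that is healthy and has headroom."""
    eligible = [
        r for r in regions
        if r.healthy and r.load_rps < r.capacity_rps * max_utilization
    ]
    if not eligible:
        # Every healthy region is busy: prefer a busy region over no region.
        eligible = [r for r in regions if r.healthy]
    return min(eligible, key=lambda r: r.latency_ms, default=None)


# Hypothetical snapshot of three regions.
REGIONS = [
    Region("us-east", True, 180.0, 40_000, 100_000),
    Region("eu-west", False, 90.0, 10_000, 100_000),       # unhealthy
    Region("ap-southeast", True, 35.0, 95_000, 100_000),   # nearly full
]
```

With this snapshot, `pick_region(REGIONS)` returns `us-east`: the fastest region is too loaded and the second fastest is unhealthy, so health and capacity override raw latency.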


5️⃣ DNS-Based Global Load Balancing


Architecture

User

↓

DNS

↓

Region A

Example

api.company.com

↓

1.1.1.1

or

api.company.com

↓

2.2.2.2

Routing Logic

DNS can choose the answer by:

  • Client geography
  • Measured latency
  • Configured weights
  • Region health


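Advantages

  • Simple to set up
  • Scales well


Disadvantages

  • Resolver and client caching delays failover until the TTL expires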


👉 Interview Memorization

DNS-based load balancing routes users to regions by returning different IP addresses.

It is simple and scalable but suffers from DNS caching delays.
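
The caching delay is easier to see in a small simulation. The sketch below is an assumption-laden toy, not a real resolver: `HealthAwareAuthority` and `CachingResolver` are hypothetical classes, the IPs come from the documentation address ranges, and time is passed in explicitly so the behavior is deterministic.

```python
class HealthAwareAuthority:
    """Toy authoritative DNS: answers with the first healthy region's IP."""

    def __init__(self, region_ips: dict[str, str]):
        self.region_ips = dict(region_ips)  # insertion order = preference order
        self.healthy = {region: True for region in region_ips}

    def answer(self, name: str) -> str:
        for region, ip in self.region_ips.items():
            if self.healthy[region]:
                return ip
        raise LookupError(f"no healthy region for {name}")


class CachingResolver:
    """Toy recursive resolver that caches each answer for a fixed TTL."""

    def __init__(self, authority: HealthAwareAuthority, ttl_seconds: int):
        self._authority = authority
        self._ttl = ttl_seconds
        self._cache: dict[str, tuple[str, float]] = {}  # name -> (ip, expires_at)

    def resolve(self, name: str, now: float) -> str:
        cached = self._cache.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]  # still within TTL: the authority is not consulted
        ip = self._authority.answer(name)
        self._cache[name] = (ip, now + self._ttl)
        return ip
```

If Region A fails at t=10 with a 60-second TTL, a resolver that cached the answer at t=0 keeps handing out Region A's IP until t=60. Lowering the TTL shortens that window at the cost of more DNS queries.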


6️⃣ GeoDNS Routing


How It Works

US User

↓

US Region

Europe User

↓

Europe Region

Example

New York

↓

US-East

Paris

↓

Europe-West

Benefit

Lower latency.


Problem

Closest geography does not always mean lowest latency.


👉 Interview Memorization

GeoDNS routes users based on geographic location.

It improves latency but may not always select the fastest available region.
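
A geographic lookup can be sketched as a nearest-point search. This is only an illustration: the region coordinates are rough assumptions, and a real GeoDNS service would first map the resolver or client IP to a location, a step omitted here.

```python
import math

# Approximate (latitude, longitude) of each region; illustrative values only.
REGION_COORDS = {
    "us-east": (38.9, -77.0),
    "europe-west": (50.1, 8.7),
    "asia-east": (35.7, 139.7),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Great-circle distance between two (lat, lon) points in kilometers."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_region(user_coords: tuple[float, float]) -> str:
    """Return the region with the smallest great-circle distance to the user."""
    return min(REGION_COORDS, key=lambda name: haversine_km(user_coords, REGION_COORDS[name]))
```

A user in New York (about 40.7, -74.0) maps to `us-east` and a user in Paris (about 48.9, 2.35) maps to `europe-west`. Note that the function only knows distance, which is exactly why it can pick a region that is close on the map but slow on the network.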


7️⃣ Latency-Based Routing


Principle

Route users to the fastest region.


Example

User

↓

Measure RTT

↓

Select Lowest Latency Region

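Benefits

  • Better user experience, because routing follows measured latency


Challenges

  • More complex to implement than geographic routing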


👉 Interview Memorization

Latency-based routing selects the region with the lowest observed latency rather than relying purely on geographic proximity.
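
One way to turn "observed latency" into a decision is to keep a few RTT samples per region and compare a robust statistic. The sketch below assumes the samples already exist (how they are collected is out of scope) and uses the median so that one slow probe does not flip the choice; both the function name and the data are hypothetical.

```python
from statistics import median


def fastest_region(rtt_samples_ms: dict[str, list[float]]) -> str:
    """Return the region with the lowest median RTT among regions with samples."""
    medians = {
        region: median(samples)
        for region, samples in rtt_samples_ms.items()
        if samples  # ignore regions we have not measured yet
    }
    if not medians:
        raise ValueError("no RTT samples available")
    return min(medians, key=medians.get)


# Hypothetical measurements from one user network, in milliseconds.
SAMPLES = {
    "us-east": [210.0, 205.0, 220.0],
    "europe-west": [120.0, 118.0, 400.0],  # one outlier, median still 120
    "asia-east": [140.0, 150.0, 145.0],
}
```

Here `fastest_region(SAMPLES)` selects `europe-west`. Using the mean instead would have picked `asia-east` because of the single 400 ms outlier, which shows how much the choice of statistic matters.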


8️⃣ Weighted Routing


Example

Region A = 80%

Region B = 20%

Use Cases

  • Region migrations
  • Canary deployments


Example

New Region

↓

10% traffic

↓

Validate

↓

Increase gradually

👉 Interview Memorization

Weighted routing distributes traffic according to predefined percentages and is commonly used for migrations and canary deployments.
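
A weighted split can be sketched with hash bucketing, which has the side benefit that a given user keeps landing in the same region while the weights stay fixed. This is one possible approach, not the method any specific DNS or load-balancing product uses; `weighted_region` and the user IDs are assumptions for the example.

```python
import hashlib


def weighted_region(user_id: str, weights: dict[str, int]) -> str:
    """Map a user to a region so that traffic shares follow the integer weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Hash the user ID into a stable bucket in [0, total).
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    cumulative = 0
    for region, weight in weights.items():
        cumulative += weight
        if bucket < cumulative:
            return region
    raise AssertionError("unreachable: bucket is always below the total weight")
```

With `{"region-a": 80, "region-b": 20}`, roughly 80% of user IDs map to `region-a`. Ramping a new region from 10% upward is then just a change to the weights dictionary.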


9️⃣ Anycast Routing


Concept

Multiple regions advertise the same IP.


Example

Region A

1.1.1.1

Region B

1.1.1.1

Region C

1.1.1.1

Network Routes User

User

↓

Nearest Network Path

↓

Closest Healthy Region

Benefits


Challenges


👉 Interview Memorization

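Anycast allows multiple regions to share the same IP address and relies on internet routing protocols (BGP) to direct users to the topologically nearest location. A failed site must withdraw its route announcement so that traffic shifts to the next-nearest healthy location.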
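
The routing side of Anycast can be approximated with a toy model. The sketch below is a deliberate simplification: real BGP best-path selection has many more tie-breakers than AS-path length, and the `Advertisement` type, the path lengths, and the documentation prefix are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional

ANYCAST_PREFIX = "203.0.113.0/24"  # documentation range, announced by every site


@dataclass
class Advertisement:
    site: str
    as_path_length: int     # as seen from one particular user network
    announced: bool = True  # False once the site withdraws the route


def best_site(ads: list[Advertisement]) -> Optional[str]:
    """Pick the announcing site with the shortest AS path, or None if all withdrew."""
    live = [ad for ad in ads if ad.announced]
    if not live:
        return None
    return min(live, key=lambda ad: ad.as_path_length).site
```

From a network that sees Region B two hops away and Region A three hops away, traffic goes to Region B. When Region B withdraws its announcement, the same destination IP starts reaching Region A without any DNS change.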


🔟 Global Load Balancer Layer


Architecture

Users

↓

Global Load Balancer

↓

Regional Load Balancers

↓

Applications

Responsibilities


Examples


👉 Interview Memorization

Global load balancers sit above regional infrastructure and intelligently route traffic to healthy regions based on routing policies.


1️⃣1️⃣ Health-Based Routing


Principle

Never send traffic to unhealthy regions.


Example

Region A ✓

Region B ❌

Region C ✓

Traffic becomes:

Region A

Region C

Only.


Requirements

Reliable health checks.


👉 Interview Memorization

Health-based routing ensures traffic is only directed to healthy regions and is one of the most important capabilities of a global load balancing system.
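
"Reliable" health checks usually mean not reacting to a single failed probe. A minimal sketch of that idea, assuming consecutive-failure and consecutive-success thresholds (the values 3 and 2 are arbitrary, and `HealthTracker` is a hypothetical helper, not a library class):

```python
class HealthTracker:
    """Tracks one region's health with hysteresis to avoid flapping."""

    def __init__(self, unhealthy_after: int = 3, healthy_after: int = 2):
        self._unhealthy_after = unhealthy_after
        self._healthy_after = healthy_after
        self._fail_streak = 0
        self._ok_streak = 0
        self.healthy = True

    def record(self, probe_ok: bool) -> bool:
        """Record one probe result and return the region's current health."""
        if probe_ok:
            self._ok_streak += 1
            self._fail_streak = 0
            if not self.healthy and self._ok_streak >= self._healthy_after:
                self.healthy = True
        else:
            self._fail_streak += 1
            self._ok_streak = 0
            if self.healthy and self._fail_streak >= self._unhealthy_after:
                self.healthy = False
        return self.healthy


def routable(trackers: dict[str, HealthTracker]) -> list[str]:
    """Regions that may currently receive traffic."""
    return [name for name, tracker in trackers.items() if tracker.healthy]
```

After three consecutive failed probes for Region B, `routable` returns only Region A and Region C. Region B rejoins after two consecutive successes, so a single lucky probe does not send traffic back too early.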


1️⃣2️⃣ Active-Active Architecture


Architecture

Region A ✓

Region B ✓

Region C ✓

All regions serve traffic.


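Advantages

  • Maximum availability
  • Lower latency, since users are served by a nearby region
  • Faster failover, since every region is already serving traffic


Challenges

  • Cross-region data replication
  • Keeping data consistent across regions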


👉 Interview Memorization

Active-active architectures maximize availability and performance but require sophisticated replication and consistency mechanisms.


1️⃣3️⃣ Active-Passive Architecture


Architecture

Primary Region

↓

Standby Region

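Advantages

  • Simpler to operate
  • Easier to keep data consistent


Drawbacks

  • Higher recovery time
  • Lower infrastructure utilization, since the standby sits idle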


👉 Interview Memorization

Active-passive architectures are simpler and easier to operate but often result in higher recovery times and lower infrastructure utilization.


1️⃣4️⃣ Traffic Steering Strategies


Common Strategies

Nearest Region

Lowest geographic distance

Lowest Latency

Fastest response time

Least Loaded

Lowest CPU / Request Volume

Cost Optimized

Cheapest Region

Compliance Aware

EU User

↓

EU Region Only

👉 Interview Memorization

Traffic steering strategies can optimize for latency, cost, capacity, compliance, or availability depending on business requirements.
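
These strategies compose naturally: hard constraints such as compliance filter the candidate set, and the chosen objective ranks what is left. The sketch below assumes that structure; `RegionStats`, the strategy names, and every number in the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegionStats:
    name: str
    jurisdiction: str               # e.g. "US" or "EU"
    latency_ms: float
    utilization: float              # 0.0 to 1.0
    cost_per_million_requests: float


# Each strategy is a ranking key; lower is better.
STRATEGIES = {
    "lowest_latency": lambda r: r.latency_ms,
    "least_loaded": lambda r: r.utilization,
    "cost_optimized": lambda r: r.cost_per_million_requests,
}


def steer(regions: list[RegionStats], strategy: str,
          required_jurisdiction: Optional[str] = None) -> str:
    """Apply the compliance filter first, then rank by the chosen strategy."""
    candidates = [
        r for r in regions
        if required_jurisdiction is None or r.jurisdiction == required_jurisdiction
    ]
    if not candidates:
        raise LookupError("no region satisfies the compliance constraint")
    return min(candidates, key=STRATEGIES[strategy]).name


FLEET = [
    RegionStats("us-east", "US", 20.0, 0.7, 1.0),
    RegionStats("eu-west", "EU", 95.0, 0.4, 1.3),
    RegionStats("eu-north", "EU", 110.0, 0.2, 0.9),
]
```

For example, `steer(FLEET, "lowest_latency")` picks `us-east`, while `steer(FLEET, "lowest_latency", required_jurisdiction="EU")` picks `eu-west`: the compliance rule removes the fastest region before latency is even considered.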


1️⃣5️⃣ Global Failover


Failure Scenario

Region A ❌

Traffic Shift

Region B

Region C

Requirements

  • Automated traffic rerouting
  • Spare capacity in the surviving regions


Challenge

Avoid overload.


👉 Interview Memorization

Global failover requires both traffic rerouting and sufficient spare capacity in healthy regions to absorb additional load.
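
The rerouting half of failover can be sketched as renormalizing traffic weights over the surviving regions. This assumes a proportional redistribution policy, which is only one option, and `failover_weights` is a hypothetical helper.

```python
def failover_weights(weights: dict[str, float], failed: set[str]) -> dict[str, float]:
    """Redistribute traffic shares proportionally across the surviving regions."""
    survivors = {
        region: weight
        for region, weight in weights.items()
        if region not in failed and weight > 0
    }
    total = sum(survivors.values())
    if total == 0:
        raise RuntimeError("no healthy region left to receive traffic")
    # Shares sum to 1.0 and keep the survivors' relative proportions.
    return {region: weight / total for region, weight in survivors.items()}
```

Starting from weights of 50, 30, and 20 for Regions A, B, and C, losing Region A yields shares of 0.6 for B and 0.4 for C. Region B's traffic doubles in that case, which is why the capacity question has to be answered before the failure rather than during it.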


1️⃣6️⃣ Capacity Planning


Question

Can surviving regions handle failure traffic?

Example

3 Regions

100k RPS each

Region fails:

Remaining 2 regions

Must absorb the failed region's 100k RPS (150k RPS each if split evenly)

Common Rule

N+1 Capacity

👉 Interview Memorization

Capacity planning is critical because failover is only successful if healthy regions have enough spare capacity to absorb redirected traffic.
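
The arithmetic behind N+1 is short enough to write down. The sketch assumes traffic is spread evenly across surviving regions, which real traffic rarely is; the function names are made up for the example.

```python
def load_per_survivor(total_rps: float, regions: int, failed: int = 1) -> float:
    """Per-region load after `failed` regions drop out, assuming an even split."""
    survivors = regions - failed
    if survivors <= 0:
        raise ValueError("at least one region must survive")
    return total_rps / survivors


def max_safe_utilization(regions: int, failed: int = 1) -> float:
    """Highest steady-state utilization that still leaves room for the failure."""
    if regions - failed <= 0:
        raise ValueError("at least one region must survive")
    return (regions - failed) / regions
```

For three regions carrying 100k RPS each (300k RPS total), `load_per_survivor(300_000, 3)` gives 150k RPS per surviving region. Equivalently, `max_safe_utilization(3)` is two thirds: each region should run at no more than about 67% of its capacity in normal operation.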


1️⃣7️⃣ Observability


Monitor

  • Routing decisions
  • Regional health
  • Traffic distribution per region
  • Failover readiness


Example Dashboard

Global Traffic Map

👉 Interview Memorization

Observability provides visibility into routing behavior, regional health, traffic distribution, and failover readiness.


1️⃣8️⃣ Common Failure Modes


Examples


Lesson

Routing systems fail too.

👉 Interview Memorization

Global load balancing systems must themselves be highly available because routing failures can impact every region simultaneously.


1️⃣9️⃣ Best Practices


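Practical Rules

  • Keep health checks accurate
  • Plan capacity for regional failure
  • Invest in observability
  • Automate failover
  • Test failover continuously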


Design Principle

Traffic should always flow to the best healthy region.

👉 Interview Memorization

Effective global load balancing continuously routes traffic to the healthiest and most appropriate region while minimizing latency and maximizing availability.


🧠 Final Staff-Level Answer


👉 Full Interview Answer

Global Load Balancing distributes traffic across multiple geographic regions to improve availability, latency, fault tolerance, and scalability.

The routing layer determines which region should handle each request using factors such as latency, health status, capacity, compliance requirements, and business policies.

Common approaches include DNS-based routing, GeoDNS, latency-based routing, weighted routing, Anycast, and dedicated global load balancers.

Active-active architectures provide lower latency and faster failover, while active-passive architectures simplify consistency and operations.

Health-based routing is critical because traffic should never be sent to unhealthy regions.

Successful global load balancing requires accurate health checks, sufficient capacity planning, observability, automated failover, and continuous testing.

Ultimately, the goal is to ensure that users are always served by the best healthy region while maintaining high availability and operational resilience.


⭐ Final Insight

The core of Global Load Balancing is not:

“spreading traffic around”

but rather:

Traffic Routing

  • Regional Health
  • Latency Optimization
  • Capacity Planning
  • Failover
  • Compliance
  • Observability

The single most important sentence:

Traffic should always flow to the best healthy region.


Chinese Section (translated)


🎯 Global Load Balancing Architectures


1️⃣ Core Framework

When discussing Global Load Balancing, I usually analyze it from the following angles:

  1. Why Global Load Balancing is needed
  2. Traffic distribution strategies
  3. DNS load balancing
  4. Anycast routing
  5. Global Load Balancer
  6. Health checks and failover
  7. Regional routing strategy
  8. Core trade-off: latency vs availability vs complexity

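2️⃣ What Is Global Load Balancing?

Global Load Balancing (GLB) distributes user traffic across multiple regions.

Users

↓

Global Load Balancer

↓

Region A

Region B

Region C

Goals


👉 Interview Memorization

Global Load Balancing improves a system's latency, availability, disaster tolerance, and scalability by distributing traffic across multiple geographic regions.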


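3️⃣ Why Not Use Only One Region?

Single Region

Users

↓

US-East

Problems:


👉 Interview Memorization

Single-region architectures are simple but suffer from high latency, low availability, and limited scalability, so large internet systems usually adopt a multi-region architecture.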


4️⃣ Global Traffic Routing

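Core flow:

User

↓

Global Routing Layer

↓

Best Region

↓

Application

Selection criteria


👉 Interview Memorization

The core question of global routing is deciding which region is best suited to handle the current request; the decision is usually based on latency, health status, capacity, and business rules.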


5️⃣ DNS-Based Global Load Balancing

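How it works

api.company.com

↓

Region A IP

api.company.com

↓

Region B IP

Advantages


Disadvantages


👉 Interview Memorization

DNS routing is the most common form of global load balancing, but its failover speed is limited by DNS caching.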


6️⃣ GeoDNS

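How it works

New York user

↓

US-East

Paris user

↓

Europe-West

Advantages

Lower latency.


Disadvantages

Geographic distance does not necessarily mean the best network latency.


👉 Interview Memorization

GeoDNS routes traffic based on the user's geographic location, but the closest region is not necessarily the region with the lowest network latency.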


7️⃣ Latency-Based Routing

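How it works

Measure RTT

↓

Select the lowest-latency region

Advantages

Better user experience.


Disadvantages

Complex to implement.


👉 Interview Memorization

Latency-based routing relies on measured network latency rather than geographic location, so it usually delivers a better user experience.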


8️⃣ Weighted Routing

Example

Region A = 80%

Region B = 20%

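Use cases


👉 Interview Memorization

Weighted routing distributes traffic according to preset ratios and is commonly used for gradual rollouts and traffic migration.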


9️⃣ Anycast

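How it works

Multiple regions use the same IP.

1.1.1.1

is announced simultaneously from:


The network automatically selects the best path.


👉 Interview Memorization

Anycast uses internet routing protocols to automatically send user requests to the nearest healthy node, with no DNS changes required.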


🔟 Global Load Balancer

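Responsibilities:


👉 Interview Memorization

The Global Load Balancer sits above all regions and decides where traffic flows based on health status and routing policies.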


1️⃣1️⃣ Health-Based Routing

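Principle:

Never send traffic to an unhealthy region

👉 Interview Memorization

Health-based routing is one of the most important capabilities of a global load balancing system; it ensures traffic is always sent to healthy regions.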


1️⃣2️⃣ Active-Active

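Multiple regions serve traffic at the same time.

Advantages:

Disadvantages:


👉 Interview Memorization

Active-active maximizes availability and performance but requires complex data synchronization mechanisms.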


1️⃣3️⃣ Active-Passive

Primary

↓

Standby

Advantages:

Disadvantages:


👉 Interview Memorization

Active-passive makes consistency easier to guarantee, but resource utilization is lower.


1️⃣4️⃣ Traffic Steering Strategy

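Common strategies:


👉 Interview Memorization

The traffic steering strategy depends on business goals and can optimize for latency, cost, capacity, or compliance.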


1️⃣5️⃣ Global Failover

Region A ❌

↓

Region B

Region C

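Requirements:


👉 Interview Memorization

Global failover requires not only rerouting traffic but also enough capacity in the other regions to take over the workload.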


1️⃣6️⃣ Capacity Planning

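Core question:

A region goes down

Can the remaining regions carry the load?

N+1 Rule

Reserve at least one region's worth of capacity

👉 Interview Memorization

Failover only succeeds if the remaining regions have enough capacity to handle the extra traffic.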


1️⃣7️⃣ Observability

Monitor:


👉 Interview Memorization

Observability helps the team understand global traffic distribution and system health in real time.


1️⃣8️⃣ Common Failure Modes


👉 Interview Memorization

The global routing system is itself critical infrastructure, so it must be extremely reliable.


1️⃣9️⃣ Best Practices


Design Principle

Traffic should always flow to the best healthy region.

👉 Interview Memorization

A well-designed global load balancing system continuously sends traffic to the most suitable, healthiest region.


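🧠 Staff-Level Interview Answer

Global Load Balancing is a core component of modern multi-region systems.

It routes traffic to the best region based on latency, health status, capacity, and business rules.

Common implementations include DNS routing, GeoDNS, latency-based routing, Anycast, and a Global Load Balancer.

Active-active provides higher performance and faster recovery, while active-passive provides a simpler operating model.

A successful global load balancing design requires reliable health checks, capacity planning, automated failover, continuous monitoring, and regular drills.

The ultimate goal is to ensure that users always connect to the best healthy region.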

