🎯 Global Load Balancing Architectures
1️⃣ Core Framework
When discussing Global Load Balancing, I frame it as:
- Why global load balancing exists
- Traffic distribution strategies
- DNS-based load balancing
- Anycast-based routing
- Global load balancers
- Health checks and failover
- Regional routing strategies
- Trade-offs: latency vs availability vs complexity
2️⃣ What Is Global Load Balancing?
Global Load Balancing (GLB) distributes user traffic across multiple regions.
Instead of:
Users
↓
Single Region
We have:
Users
↓
Global Load Balancer
↓
Region A
Region B
Region C
Goals
- Lower latency
- Higher availability
- Better fault tolerance
- Disaster recovery
- Capacity distribution
- Global scalability
👉 Interview Memorization
Global Load Balancing distributes traffic across multiple geographic regions to improve latency, availability, fault tolerance, and scalability.
3️⃣ Why Not Use One Region?
Single Region Architecture
Users
↓
US-East
Problems
High Latency
Asia User
↓
US-East
Regional Failure
US-East ❌
↓
Entire Service Down
Capacity Limits
One Region
↓
Finite Capacity
👉 Interview Memorization
Single-region architectures are simple but suffer from higher latency, lower availability, and limited scalability.
4️⃣ Global Traffic Routing
High-Level Flow
User
↓
Global Routing Layer
↓
Best Region
↓
Application
Core Question
What is the best region?
Possible answers:
- Closest region
- Lowest latency region
- Cheapest region
- Least loaded region
- Healthy region
👉 Interview Memorization
Global traffic routing determines which region should serve each request based on latency, health, capacity, cost, or business rules.
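One way to make the "best region" question concrete is to treat health and capacity as hard filters and latency as the thing to minimize. The sketch below is a minimal illustration under that assumption; the `Region` fields, the `max_load` threshold, and the sample numbers are all hypothetical, not taken from any particular product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Region:
    name: str
    healthy: bool
    latency_ms: float  # observed latency from the user's vantage point (assumed input)
    load: float        # fraction of capacity in use, 0.0 to 1.0 (assumed input)


def pick_best_region(regions: list[Region], max_load: float = 0.9) -> Optional[Region]:
    """Return the healthy, non-saturated region with the lowest latency."""
    candidates = [r for r in regions if r.healthy and r.load < max_load]
    if not candidates:
        return None  # nothing can safely take the request
    # Lowest latency wins; load breaks ties.
    return min(candidates, key=lambda r: (r.latency_ms, r.load))


regions = [
    Region("us-east", healthy=True, latency_ms=180.0, load=0.40),
    Region("europe-west", healthy=False, latency_ms=20.0, load=0.10),  # fastest, but down
    Region("asia-east", healthy=True, latency_ms=35.0, load=0.95),     # fast, but saturated
    Region("asia-south", healthy=True, latency_ms=60.0, load=0.50),
]
best = pick_best_region(regions)
print(best.name if best else "no region available")  # asia-south
```

Cost or business rules could be added as further filters or as extra terms in the sort key.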
5️⃣ DNS-Based Global Load Balancing
Architecture
User
↓
DNS
↓
Region A
Example
api.company.com
↓
1.1.1.1
or
api.company.com
↓
2.2.2.2
Routing Logic
DNS can choose:
- Geo routing
- Weighted routing
- Latency routing
- Failover routing
Advantages
- Simple
- Cheap
- Widely supported
Disadvantages
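- DNS caching (resolvers and clients reuse an answer until its TTL expires, sometimes longer)
- Slow failover (bounded by failure detection time plus the TTL)
- Limited control (the decision is made per DNS lookup, not per request)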
👉 Interview Memorization
DNS-based load balancing routes users to regions by returning different IP addresses.
It is simple and scalable but suffers from DNS caching delays.
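The failover policy and the caching delay can be sketched in a few lines. This is an illustration only: `Record`, `resolve`, and `worst_case_failover_seconds` are hypothetical names, the IP addresses come from the ranges reserved for documentation, and the timing formula is a rough upper bound that assumes resolvers honor the TTL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    ip: str
    priority: int  # lower number = preferred


def resolve(records: list[Record], healthy_ips: set[str]) -> list[str]:
    """Failover policy: answer with the healthy record(s) of the best priority."""
    healthy = [r for r in records if r.ip in healthy_ips]
    if not healthy:
        # One possible choice: fail open and return everything
        # rather than an empty answer.
        return [r.ip for r in records]
    best = min(r.priority for r in healthy)
    return [r.ip for r in healthy if r.priority == best]


def worst_case_failover_seconds(check_interval: int, failures_needed: int, ttl: int) -> int:
    """Time to detect the failure plus time for cached answers to expire."""
    return check_interval * failures_needed + ttl


records = [Record("203.0.113.10", priority=1), Record("198.51.100.10", priority=2)]
both_up = resolve(records, {"203.0.113.10", "198.51.100.10"})
primary_down = resolve(records, {"198.51.100.10"})
delay = worst_case_failover_seconds(check_interval=10, failures_needed=3, ttl=60)
print(both_up)       # ['203.0.113.10']
print(primary_down)  # ['198.51.100.10']
print(delay)         # 90
```

Lowering the TTL shortens the delay at the cost of more DNS queries.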
6️⃣ GeoDNS Routing
How It Works
US User
↓
US Region
Europe User
↓
Europe Region
Example
New York
↓
US-East
Paris
↓
Europe-West
Benefit
Lower latency.
Problem
Closest geography does not always mean lowest latency.
👉 Interview Memorization
GeoDNS routes users based on geographic location.
It improves latency but may not always select the fastest available region.
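A minimal sketch of the geographic decision, assuming the DNS provider has already mapped the resolver or client IP to a latitude and longitude. The region coordinates are rough, illustrative values, and `geo_route` is a hypothetical helper.

```python
import math

# Approximate data-center coordinates (latitude, longitude); illustrative only.
REGIONS = {
    "US-East": (38.9, -77.4),
    "Europe-West": (50.4, 3.8),
    "Asia-East": (24.1, 120.7),
}


def haversine_km(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def geo_route(user_location: tuple[float, float]) -> str:
    """Return the name of the geographically nearest region."""
    return min(REGIONS, key=lambda name: haversine_km(user_location, REGIONS[name]))


print(geo_route((40.7, -74.0)))  # New York -> US-East
print(geo_route((48.9, 2.4)))    # Paris -> Europe-West
```

Distance is only a proxy here: the function knows nothing about network paths, which is exactly the limitation noted above.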
7️⃣ Latency-Based Routing
Principle
Route users to the fastest region.
Example
User
↓
Measure RTT
↓
Select Lowest Latency Region
Benefits
- Better user experience
- Dynamic optimization
Challenges
- Continuous measurements
- Operational complexity
👉 Interview Memorization
Latency-based routing selects the region with the lowest observed latency rather than relying purely on geographic proximity.
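As a rough sketch, the selection step can be a comparison of per-region RTT samples. The example assumes the measurements already exist (collecting them is the hard part) and uses the median so that a single slow probe does not flip the decision; `lowest_latency_region` and the sample values are hypothetical.

```python
from statistics import median


def lowest_latency_region(rtt_samples_ms: dict[str, list[float]]) -> str:
    """Pick the region with the lowest median RTT; regions without samples are skipped."""
    medians = {region: median(samples)
               for region, samples in rtt_samples_ms.items() if samples}
    if not medians:
        raise ValueError("no RTT measurements available")
    return min(medians, key=medians.get)


samples = {
    "us-east": [82.0, 85.0, 79.0, 400.0],      # one outlier probe
    "europe-west": [95.0, 97.0, 96.0, 94.0],
    "asia-east": [210.0, 205.0, 220.0],
}
print(lowest_latency_region(samples))  # us-east (median 83.5 ms vs 95.5 ms)
```

With the mean instead of the median, the 400 ms outlier would push this user to europe-west.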
8️⃣ Weighted Routing
Example
Region A = 80%
Region B = 20%
Use Cases
- Canary releases
- Gradual migration
- Cost optimization
- Capacity balancing
Example
New Region
↓
10% traffic
↓
Validate
↓
Increase gradually
👉 Interview Memorization
Weighted routing distributes traffic according to predefined percentages and is commonly used for migrations and canary deployments.
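One common way to implement a percentage split is to hash a stable key into a bucket and compare it against cumulative weights. The sketch below assumes a user ID is available as that key; `pick_weighted` is a hypothetical helper, not any provider's API.

```python
import hashlib


def pick_weighted(user_id: str, weights: dict[str, int]) -> str:
    """Map a user to a region so that each region gets roughly its share of users.

    Hashing the user ID (rather than rolling a random number per request)
    keeps a given user on the same region while the weights are unchanged.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    cumulative = 0
    for region, weight in weights.items():
        cumulative += weight
        if bucket < cumulative:
            return region
    raise AssertionError("unreachable: bucket is always below the total weight")


weights = {"region-a": 80, "region-b": 20}
counts = {"region-a": 0, "region-b": 0}
for i in range(10_000):
    counts[pick_weighted(f"user-{i}", weights)] += 1
print(counts)  # roughly an 80/20 split
```

For a canary rollout, raising region-b from 10 to 20 to 50 only moves users whose bucket falls in the newly covered range.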
9️⃣ Anycast Routing
Concept
Multiple regions advertise the same IP.
Example
Region A
1.1.1.1
Region B
1.1.1.1
Region C
1.1.1.1
Network Routes User
User
↓
Nearest Network Path
↓
Closest Healthy Region
Benefits
- Routing is done by the network itself, with no extra lookup step
- Automatic failover once a failed site withdraws its route
- No DNS changes
Challenges
- Harder debugging
- Complex networking
👉 Interview Memorization
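Anycast allows multiple regions to share the same IP address and relies on internet routing (BGP) to direct users to the nearest location in network terms. BGP has no notion of application health, so an unhealthy site must withdraw its route before traffic shifts away from it.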
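A heavily simplified model of the idea, assuming the only selection criterion is AS-path length. Real BGP best-path selection considers more attributes (local preference, MED, and so on), and the hop counts below are made up.

```python
from typing import Optional


def anycast_route(as_path_lengths: dict[str, int], advertising: set[str]) -> Optional[str]:
    """Pick the site with the shortest AS path among sites still announcing the prefix."""
    reachable = {site: hops for site, hops in as_path_lengths.items() if site in advertising}
    if not reachable:
        return None
    return min(reachable, key=reachable.get)


# AS-path length from one user's network to each site (made-up numbers).
paths = {"region-a": 2, "region-b": 4, "region-c": 3}
advertising = {"region-a", "region-b", "region-c"}
before = anycast_route(paths, advertising)

# Region A fails its health check and withdraws its route announcement.
advertising.discard("region-a")
after = anycast_route(paths, advertising)

print(before)  # region-a
print(after)   # region-c
```

The user's destination IP never changes; only the set of sites announcing it does.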
🔟 Global Load Balancer Layer
Architecture
Users
↓
Global Load Balancer
↓
Regional Load Balancers
↓
Applications
Responsibilities
- Health checks
- Traffic routing
- Failover
- Capacity balancing
- Latency optimization
Examples
- Google Cloud Load Balancing
- AWS Global Accelerator
- Cloudflare
- Akamai
👉 Interview Memorization
Global load balancers sit above regional infrastructure and intelligently route traffic to healthy regions based on routing policies.
1️⃣1️⃣ Health-Based Routing
Principle
Never send traffic to unhealthy regions.
Example
Region A ✓
Region B ❌
Region C ✓
Traffic is then routed only to:
Region A
Region C
Requirements
Reliable health checks.
👉 Interview Memorization
Health-based routing ensures traffic is only directed to healthy regions and is one of the most important capabilities of a global load balancing system.
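"Reliable" usually means a single failed probe should not remove a region, and a single good probe should not bring it back. A minimal sketch of that hysteresis follows; the `fall` and `rise` thresholds are assumed names and values.

```python
class HealthTracker:
    """Marks a region down after `fall` straight failures, up after `rise` straight successes."""

    def __init__(self, fall: int = 3, rise: int = 2) -> None:
        self.fall = fall
        self.rise = rise
        self.healthy = True
        self._streak = 0  # consecutive probe results that disagree with the current state

    def record(self, probe_ok: bool) -> bool:
        """Feed one probe result and return the (possibly updated) health state."""
        if probe_ok == self.healthy:
            self._streak = 0  # result agrees with the current state
            return self.healthy
        self._streak += 1
        threshold = self.fall if self.healthy else self.rise
        if self._streak >= threshold:
            self.healthy = not self.healthy
            self._streak = 0
        return self.healthy


tracker = HealthTracker(fall=3, rise=2)
probes = [True, False, True, False, False, False, True, True]
states = [tracker.record(ok) for ok in probes]
print(states)  # [True, True, True, True, True, False, False, True]
```

The isolated failure at the second probe is ignored; only three failures in a row take the region out of rotation.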
1️⃣2️⃣ Active-Active Architecture
Architecture
Region A ✓
Region B ✓
Region C ✓
All regions serve traffic.
Advantages
- Lower latency
- Better utilization
- Faster failover
Challenges
- Data consistency
- Cross-region replication
- Operational complexity
👉 Interview Memorization
Active-active architectures maximize availability and performance but require sophisticated replication and consistency mechanisms.
1️⃣3️⃣ Active-Passive Architecture
Architecture
Primary Region
↓
Standby Region
Advantages
- Simpler design
- Easier consistency
- Easier recovery
Drawbacks
- Underutilized capacity
- Slower failover
👉 Interview Memorization
Active-passive architectures are simpler and easier to operate but often result in higher recovery times and lower infrastructure utilization.
1️⃣4️⃣ Traffic Steering Strategies
Common Strategies
Nearest Region
Lowest geographic distance
Lowest Latency
Fastest response time
Least Loaded
Lowest CPU / Request Volume
Cost Optimized
Cheapest Region
Compliance Aware
EU User
↓
EU Region Only
👉 Interview Memorization
Traffic steering strategies can optimize for latency, cost, capacity, compliance, or availability depending on business requirements.
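These strategies compose naturally if compliance is treated as a hard constraint and the others as interchangeable objectives. The sketch below assumes that split; the `Region` fields, strategy names, and numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Region:
    name: str
    jurisdiction: str         # e.g. "EU" or "US"
    latency_ms: float
    load: float               # fraction of capacity in use
    cost_per_million: float   # hypothetical cost per million requests


STRATEGIES: dict[str, Callable[[Region], float]] = {
    "lowest_latency": lambda r: r.latency_ms,
    "least_loaded": lambda r: r.load,
    "cost_optimized": lambda r: r.cost_per_million,
}


def steer(regions: list[Region], strategy: str,
          required_jurisdiction: Optional[str] = None) -> Region:
    """Apply the compliance constraint first, then optimize within what is left."""
    allowed = [r for r in regions
               if required_jurisdiction is None or r.jurisdiction == required_jurisdiction]
    if not allowed:
        raise LookupError("no region satisfies the compliance constraint")
    return min(allowed, key=STRATEGIES[strategy])


regions = [
    Region("us-east", "US", latency_ms=30.0, load=0.7, cost_per_million=1.0),
    Region("eu-west", "EU", latency_ms=90.0, load=0.3, cost_per_million=1.4),
    Region("eu-central", "EU", latency_ms=110.0, load=0.5, cost_per_million=1.1),
]
print(steer(regions, "lowest_latency").name)        # us-east
print(steer(regions, "lowest_latency", "EU").name)  # eu-west
print(steer(regions, "cost_optimized", "EU").name)  # eu-central
```

An EU user is never sent to us-east, even though it is the fastest and cheapest option.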
1️⃣5️⃣ Global Failover
Failure Scenario
Region A ❌
Traffic Shift
Region B
Region C
Requirements
- Health detection
- Route updates
- Capacity headroom
- Data availability
Challenge
Avoid overload.
👉 Interview Memorization
Global failover requires both traffic rerouting and sufficient spare capacity in healthy regions to absorb additional load.
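The "avoid overload" requirement can be illustrated by shifting the failed region's traffic in proportion to each survivor's headroom and reporting whatever cannot be absorbed. This is a sketch under that assumed policy; `redistribute` and the figures are hypothetical.

```python
def redistribute(load_rps: dict[str, float], capacity_rps: dict[str, float],
                 failed: str) -> tuple[dict[str, float], float]:
    """Shift the failed region's traffic to survivors in proportion to their headroom.

    Returns (new_load, shed_rps), where shed_rps is traffic that no region can absorb.
    """
    orphaned = load_rps[failed]
    survivors = [r for r in load_rps if r != failed]
    headroom = {r: max(capacity_rps[r] - load_rps[r], 0.0) for r in survivors}
    total_headroom = sum(headroom.values())
    absorbed = min(orphaned, total_headroom)
    new_load = {}
    for r in survivors:
        share = headroom[r] / total_headroom if total_headroom else 0.0
        new_load[r] = load_rps[r] + absorbed * share
    return new_load, orphaned - absorbed


capacity = {"a": 100_000, "b": 100_000, "c": 100_000}

# Enough headroom: nothing is shed.
ok_load, ok_shed = redistribute({"a": 60_000, "b": 50_000, "c": 40_000}, capacity, "a")
print({r: round(v) for r, v in ok_load.items()}, ok_shed)

# Regions already at 90%: survivors fill up and 70k RPS has nowhere to go.
hot_load, hot_shed = redistribute({"a": 90_000, "b": 90_000, "c": 90_000}, capacity, "a")
print(hot_load, hot_shed)
```

The second case is the failure mode to plan against: rerouting works, but the survivors cannot carry the load.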
1️⃣6️⃣ Capacity Planning
Question
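Can the surviving regions handle the failed region's traffic?
Example
3 Regions
100k RPS each
One region fails:
The remaining 2 regions must absorb its 100k RPS (50k RPS extra each)
Common Rule
N+1 Capacity (losing any one region must still leave enough capacity for peak traffic)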
👉 Interview Memorization
Capacity planning is critical because failover is only successful if healthy regions have enough spare capacity to absorb redirected traffic.
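The arithmetic behind the N+1 rule generalizes easily. The helper below is a small sketch that assumes traffic is spread evenly across regions and redistributes evenly after a failure; `failover_plan` is a hypothetical name.

```python
def failover_plan(regions: int, rps_per_region: float, failures: int = 1) -> dict[str, float]:
    """Per-survivor load after `failures` regions are lost, assuming an even split."""
    survivors = regions - failures
    if survivors <= 0:
        raise ValueError("at least one region must survive")
    total = regions * rps_per_region
    return {
        "rps_per_survivor": total / survivors,
        # Highest steady-state utilization that still leaves room for the failure.
        "max_safe_utilization": survivors / regions,
    }


plan = failover_plan(regions=3, rps_per_region=100_000)
print(plan["rps_per_survivor"])                # 150000.0
print(round(plan["max_safe_utilization"], 3))  # 0.667
```

With three regions, each one has to be sized for 150k RPS and should run at no more than about 67% in normal operation; adding regions raises that ceiling.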
1️⃣7️⃣ Observability
Monitor
- Regional latency
- Error rates
- Traffic distribution
- Health check status
- Routing decisions
- Failover events
- Capacity utilization
Example Dashboard
Global Traffic Map
👉 Interview Memorization
Observability provides visibility into routing behavior, regional health, traffic distribution, and failover readiness.
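Two of the metrics above, traffic distribution and error rate, can be derived from nothing more than a request log. The sketch assumes each log entry is a `(region, http_status)` pair and counts 5xx responses as errors; both are illustrative choices.

```python
from collections import Counter


def summarize(requests: list[tuple[str, int]]) -> dict[str, dict[str, float]]:
    """Per-region traffic share and 5xx error rate from (region, http_status) pairs."""
    totals = Counter(region for region, _ in requests)
    errors = Counter(region for region, status in requests if status >= 500)
    grand_total = sum(totals.values())
    return {
        region: {
            "traffic_share": count / grand_total,
            "error_rate": errors[region] / count,
        }
        for region, count in totals.items()
    }


log = [("us-east", 200)] * 6 + [("us-east", 503)] * 2 + [("europe-west", 200)] * 2
summary = summarize(log)
print(summary["us-east"])      # {'traffic_share': 0.8, 'error_rate': 0.25}
print(summary["europe-west"])  # {'traffic_share': 0.2, 'error_rate': 0.0}
```

A region that takes 80% of traffic with a 25% error rate is the kind of imbalance a global traffic map should make obvious.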
1️⃣8️⃣ Common Failure Modes
Examples
- DNS propagation delays
- Incorrect routing rules
- Traffic imbalance
- Health check failures
- Anycast route instability
- Capacity exhaustion
- Regional overload
Lesson
Routing systems fail too.
👉 Interview Memorization
Global load balancing systems must themselves be highly available because routing failures can impact every region simultaneously.
1️⃣9️⃣ Best Practices
Practical Rules
- Deploy multiple regions
- Use health-based routing
- Automate failover
- Monitor latency continuously
- Maintain spare capacity
- Test failover regularly
- Use Anycast when appropriate
- Design for regional failures
- Track routing decisions
Design Principle
Traffic should always flow to the best healthy region.
👉 Interview Memorization
Effective global load balancing continuously routes traffic to the healthiest and most appropriate region while minimizing latency and maximizing availability.
🧠 Final Staff-Level Answer
👉 Full Interview Answer
Global Load Balancing distributes traffic across multiple geographic regions to improve availability, latency, fault tolerance, and scalability.
The routing layer determines which region should handle each request using factors such as latency, health status, capacity, compliance requirements, and business policies.
Common approaches include DNS-based routing, GeoDNS, latency-based routing, weighted routing, Anycast, and dedicated global load balancers.
Active-active architectures provide lower latency and faster failover, while active-passive architectures simplify consistency and operations.
Health-based routing is critical because traffic should never be sent to unhealthy regions.
Successful global load balancing requires accurate health checks, sufficient capacity planning, observability, automated failover, and continuous testing.
Ultimately, the goal is to ensure that users are always served by the best healthy region while maintaining high availability and operational resilience.
⭐ Final Insight
The core of Global Load Balancing is not:
"spreading traffic around"
but rather:
- Traffic Routing
- Regional Health
- Latency Optimization
- Capacity Planning
- Failover
- Compliance
- Observability
The single most important sentence:
Traffic should always flow to the best healthy region.
Chinese Section (translated)
🎯 Global Load Balancing Architectures
1️⃣ Core Framework
When discussing Global Load Balancing, I usually analyze it from the following angles:
- Why Global Load Balancing is needed
- Traffic distribution strategies
- DNS load balancing
- Anycast routing
- Global load balancers
- Health checks and failover
- Regional routing strategies
- Core trade-off: latency vs availability vs complexity
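2️⃣ What Is Global Load Balancing?
Global Load Balancing (GLB) distributes user traffic across multiple regions.
Users
↓
Global Load Balancer
↓
Region A
Region B
Region C
Goals
- Lower latency
- Higher availability
- Better disaster recovery
- Global scalability
- Capacity balancing
👉 Interview Memorization
Global Load Balancing distributes traffic across multiple geographic regions to improve a system's latency, availability, disaster recovery, and scalability.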
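3️⃣ Why Not Use Only One Region?
Single Region
Users
↓
US-East
Problems:
- High latency for users in Asia
- If the region goes down, the whole site is unavailable
- A single region has limited capacity
👉 Interview Memorization
Single-region architectures are simple but suffer from high latency, low availability, and limited scalability, so large internet systems usually adopt a multi-region architecture.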
4️⃣ Global Traffic Routing
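Core flow:
User
↓
Global Routing Layer
↓
Best Region
↓
Application
Selection Criteria
- Nearest region
- Lowest latency
- Lowest load
- Lowest cost
- Compliance requirements
👉 Interview Memorization
The core question of global routing is which region is best suited to handle the current request; the decision is usually based on latency, health status, capacity, and business rules.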
5️⃣ DNS-Based Global Load Balancing
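How It Works
api.company.com
↓
Region A IP
or
api.company.com
↓
Region B IP
Advantages
- Simple
- Low cost
- Easy to scale
Disadvantages
- DNS caching
- Slow failover
- Limited control
👉 Interview Memorization
DNS routing is the most common approach to global load balancing, but its failover speed is limited by DNS caching.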
6️⃣ GeoDNS
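How It Works
New York user
↓
US-East
Paris user
↓
Europe-West
Benefit
Lower latency.
Drawback
Geographic distance does not necessarily mean the best network latency.
👉 Interview Memorization
GeoDNS routes traffic based on the user's geographic location, but the closest region is not necessarily the one with the lowest network latency.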
7️⃣ Latency-Based Routing
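How It Works
Measure RTT
↓
Select the lowest-latency region
Benefit
Better user experience.
Drawback
Complex to implement.
👉 Interview Memorization
Latency-based routing relies on measured network latency rather than geographic location, so it usually delivers a better user experience.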
8️⃣ Weighted Routing
Example
Region A = 80%
Region B = 20%
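Use Cases
- Canary release
- Traffic migration
- Capacity balancing
👉 Interview Memorization
Weighted routing distributes traffic according to preset percentages and is commonly used for gradual rollouts and traffic migration.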
9️⃣ Anycast
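How It Works
Multiple regions use the same IP.
1.1.1.1
is announced at the same time from:
- Region A
- Region B
- Region C
The network automatically selects the best path.
👉 Interview Memorization
Anycast uses internet routing protocols to automatically send user requests to the nearest healthy node, with no DNS changes required.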
🔟 Global Load Balancer
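Responsibilities:
- Health checks
- Routing
- Failover
- Capacity balancing
- Latency optimization
👉 Interview Memorization
The global load balancer sits above all regions and decides where traffic goes based on health status and routing policies.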
1️⃣1️⃣ Health-Based Routing
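Principle:
Never send traffic to an unhealthy region.
👉 Interview Memorization
Health-based routing is one of the most important capabilities of a global load balancing system; it ensures traffic is always sent to healthy regions.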
1️⃣2️⃣ Active-Active
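Multiple regions serve traffic at the same time.
Advantages:
- Lower latency
- Faster failover
Drawbacks:
- Data consistency is complex
👉 Interview Memorization
Active-active maximizes availability and performance but requires complex data synchronization mechanisms.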
1️⃣3️⃣ Active-Passive
Primary
↓
Standby
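Advantages:
- Simple
- Easy to maintain
Drawbacks:
- Low resource utilization
👉 Interview Memorization
Active-passive makes consistency easier to guarantee, but resource utilization is lower.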
1️⃣4️⃣ Traffic Steering Strategy
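Common strategies:
- Nearest Region
- Lowest Latency
- Least Loaded
- Cost Optimized
- Compliance Aware
👉 Interview Memorization
Traffic steering strategies depend on business goals and can optimize for latency, cost, capacity, or compliance.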
1️⃣5️⃣ Global Failover
Region A ❌
↓
Region B
Region C
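Requirements:
- Health detection
- Traffic switching
- Sufficient capacity
👉 Interview Memorization
Global failover requires not only rerouting traffic but also enough capacity in the other regions to take over the workload.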
1️⃣6️⃣ Capacity Planning
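Core question:
If a region goes down, can the remaining regions handle the load?
N+1 Principle
Reserve at least one region's worth of capacity.
👉 Interview Memorization
Failover succeeds only if the remaining regions have enough capacity to handle the additional traffic.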
1️⃣7️⃣ Observability
Monitor:
- Latency
- Traffic distribution
- Error rate
- Health checks
- Capacity
👉 Interview Memorization
Observability helps the team understand global traffic distribution and system health in real time.
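1️⃣8️⃣ Common Failures
- DNS caching
- Incorrect routing
- Anycast instability
- Capacity exhaustion
- Health check failures
👉 Interview Memorization
The global routing system is itself critical infrastructure, so it must be extremely reliable.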
1️⃣9️⃣ Best Practices
- Multi-Region
- Health-Based Routing
- Automated Failover
- Capacity Planning
- Continuous Monitoring
- Regular Testing
Design Principle
Traffic should always flow to the best healthy region.
👉 Interview Memorization
A good global load balancing system continuously sends traffic to the most appropriate, healthiest region.
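🧠 Staff-Level Interview Answer
Global Load Balancing is a core component of modern multi-region systems.
It routes traffic to the best region based on latency, health status, capacity, and business rules.
Common implementations include DNS routing, GeoDNS, latency-based routing, Anycast, and global load balancers.
Active-active offers higher performance and faster recovery, while active-passive offers a simpler operational model.
A successful global load balancing design requires reliable health checks, capacity planning, automated failover, continuous monitoring, and regular drills.
The ultimate goal is to ensure that users always connect to the best healthy region.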