
System Design Deep Dive - 02 Global Traffic Routing: GeoDNS vs Anycast

Post by ailswan May. 24, 2026


🎯 Global Traffic Routing: GeoDNS vs Anycast


1️⃣ Core Framework

When discussing global traffic routing, I frame it as:

  1. Why global routing is needed
  2. GeoDNS routing
  3. Anycast routing
  4. Health checks and failover
  5. Latency and availability
  6. Operational complexity
  7. Failure modes
  8. Trade-offs: control vs speed vs resilience

2️⃣ Why Global Traffic Routing Exists

Global traffic routing sends users to the best serving location.

The goal is to improve:

  • Latency
  • Availability
  • Failover
  • Load distribution

Basic Flow

User
→ Global Routing Layer
→ Best Region / Edge Location
→ Application Service

👉 Interview Answer

Global traffic routing decides which region or edge location should serve a user request.

It improves latency, availability, failover, and load distribution.

The two common approaches are GeoDNS and Anycast.


3️⃣ What Is GeoDNS?


Definition

GeoDNS routes users based on DNS responses.

The DNS system returns different IP addresses depending on:

  • Geography
  • Latency
  • Weights
  • Health checks

Example

User in New York
→ DNS returns US-East IP

User in London
→ DNS returns Europe IP

👉 Interview Answer

GeoDNS uses DNS responses to route users to different regions.

It can route based on geography, latency, weights, or health checks.

It is simple and widely used, but DNS caching can make failover slower.
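
The New York / London example above can be sketched as a small lookup. This is a minimal sketch, not any DNS provider's API: the region names, the preference table, and the `resolve` function are assumptions made up for illustration, and the IPs are documentation addresses.

```python
from typing import Dict, List, Set

# Hypothetical region table: region name -> IP (documentation addresses).
REGION_IPS: Dict[str, str] = {
    "us-east": "192.0.2.10",
    "europe": "198.51.100.10",
}

# Hypothetical geography policy: client location -> regions in preference order.
GEO_POLICY: Dict[str, List[str]] = {
    "New York": ["us-east", "europe"],
    "London": ["europe", "us-east"],
}

DEFAULT_ORDER: List[str] = ["us-east", "europe"]


def resolve(location: str, healthy: Set[str]) -> str:
    """Return the IP of the first healthy region preferred for this location."""
    for region in GEO_POLICY.get(location, DEFAULT_ORDER):
        if region in healthy:
            return REGION_IPS[region]
    raise LookupError("no healthy region available")


all_up = {"us-east", "europe"}
print(resolve("New York", all_up))     # 192.0.2.10 (US-East)
print(resolve("London", all_up))       # 198.51.100.10 (Europe)
print(resolve("London", {"us-east"}))  # 192.0.2.10 (Europe is down, fall back)
```

The last call shows the health-check input: when Europe is unhealthy, the answer for London changes, but only for lookups made after the change.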


4️⃣ What Is Anycast?


Definition

Anycast allows the same IP address to be announced from multiple locations.

Network routing then delivers each user's packets to the location with the best network path, which is usually, but not always, the nearest one.


Example

Same IP announced from:
US-East
Europe
Asia

User traffic automatically goes to the closest location in network terms.

Key Idea

Same IP.
Multiple locations.
Network chooses path.

👉 Interview Answer

Anycast advertises the same IP address from multiple locations.

Internet routing sends users to the nearest or best available location based on network path.

It is commonly used by CDNs, DNS providers, and edge networks.


5️⃣ Core Difference


GeoDNS

DNS decides where the user goes.

Anycast

Network routing decides where the user goes.

Comparison Table

| Dimension | GeoDNS | Anycast |
| --- | --- | --- |
| Routing layer | DNS | Network / BGP |
| IP address | Different IPs per region | Same IP globally |
| Control | More application-level control | More network-level routing |
| Failover speed | Affected by DNS TTL | Often faster |
| User location accuracy | Depends on DNS resolver | Based on network path |
| Operational complexity | Lower to medium | Higher |
| Best for | Region routing | Edge routing / DNS / CDN |

👉 Interview Answer

The main difference is the routing layer.

GeoDNS makes routing decisions through DNS responses.

Anycast makes routing decisions through network routing.

GeoDNS gives more explicit regional control, while Anycast provides fast edge-level routing.


6️⃣ GeoDNS Architecture


Architecture

User
→ Recursive DNS Resolver
→ Authoritative DNS
→ GeoDNS Policy
→ Return Region IP
→ User connects to selected region

GeoDNS Policy Inputs

  • Geography
  • Health
  • Weights
  • Latency
  • Compliance
  • Region capacity

Example

80% traffic → Region A
20% traffic → Region B

👉 Interview Answer

GeoDNS works by returning different DNS answers based on routing policy.

The policy can consider geography, health, weights, latency, compliance, or region capacity.
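
The 80% / 20% split above can be sketched as a weighted pick. This is a sketch under assumptions: real weighted DNS policies often choose randomly per query, while this version hashes a client key so the same client always lands in the same region. `WEIGHTS`, `pick_region`, and the `client-N` keys are hypothetical names.

```python
import hashlib
from typing import List, Tuple

# Hypothetical policy: (region, weight). Weights mirror the 80/20 example.
WEIGHTS: List[Tuple[str, int]] = [("region-a", 80), ("region-b", 20)]


def pick_region(client_key: str, weights: List[Tuple[str, int]] = WEIGHTS) -> str:
    """Map a client key to a region in proportion to the configured weights."""
    total = sum(weight for _, weight in weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Hash the key into a stable bucket in [0, total).
    digest = hashlib.sha256(client_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    upto = 0
    for region, weight in weights:
        upto += weight
        if bucket < upto:
            return region
    return weights[-1][0]  # not reached when all weights are non-negative


sample = [pick_region(f"client-{i}") for i in range(10_000)]
share_a = sample.count("region-a") / len(sample)
print(f"region-a share: {share_a:.2%}")  # close to 80% on a large sample
```

Changing the weights to 90/10 or 50/50 is how a gradual migration would be expressed in this sketch.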


7️⃣ Anycast Architecture


Architecture

User
→ Same Anycast IP
→ Internet Routing
→ Nearest Edge / Region
→ Application Gateway

How It Works

Multiple locations advertise the same IP prefix using BGP.

Routers choose the best path.


Common Use Cases

  • CDNs
  • DNS
  • DDoS protection
  • Global edge services

👉 Interview Answer

Anycast works by advertising the same IP prefix from multiple locations.

The internet routing system sends traffic to the nearest or best path.

This makes Anycast useful for CDNs, DNS, DDoS protection, and global edge services.
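
A toy model of "routers choose the best path" may help here. It is a deliberate simplification: real BGP best-path selection weighs several attributes (local preference, AS-path length, and more), while this sketch compares AS-path length only. The `Route` type, the site names, and the path lengths are assumptions, and the prefix is a documentation prefix.

```python
from dataclasses import dataclass
from typing import List, Optional

ANYCAST_PREFIX = "203.0.113.0/24"  # documentation prefix, used as a stand-in


@dataclass(frozen=True)
class Route:
    prefix: str
    site: str         # which location announced this route
    as_path_len: int  # shorter is preferred in this simplified model


def best_route(routes: List[Route], prefix: str) -> Optional[Route]:
    """Pick the route with the shortest AS path; tie-break on site name."""
    candidates = [r for r in routes if r.prefix == prefix]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (r.as_path_len, r.site))


# Routes as a hypothetical router in Paris might see them.
routes_seen_in_paris = [
    Route(ANYCAST_PREFIX, "us-east", 4),
    Route(ANYCAST_PREFIX, "europe", 2),
    Route(ANYCAST_PREFIX, "asia", 5),
]
chosen = best_route(routes_seen_in_paris, ANYCAST_PREFIX)
print(chosen.site)  # europe

# Europe withdraws its announcement: the same IP now reaches us-east.
remaining = [r for r in routes_seen_in_paris if r.site != "europe"]
print(best_route(remaining, ANYCAST_PREFIX).site)  # us-east
```

The client never changes the IP it connects to; only the routing table changes.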


8️⃣ Failover Behavior


GeoDNS Failover

Region A unhealthy
→ DNS stops returning Region A IP
→ New DNS lookups go to Region B

Problem

Existing cached DNS records may still point to Region A until TTL expires.


Anycast Failover

Edge A unhealthy
→ Stop announcing route
→ Traffic shifts to next closest location

Difference

Anycast can fail over faster at the routing layer, but it requires stronger network operations.


👉 Interview Answer

GeoDNS failover depends on DNS TTL and resolver caching.

Anycast failover can be faster because unhealthy locations stop advertising the route.

However, Anycast requires more advanced network infrastructure and BGP operations.
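
A rough back-of-the-envelope comparison of the two failover windows, assuming both start with the same health-check detection delay. All numbers are illustrative assumptions, not measurements, and the GeoDNS figure ignores resolvers that hold records past their TTL.

```python
def detection_s(check_interval_s: int, failures_to_trip: int) -> int:
    """Time for health checks to declare a location unhealthy."""
    return check_interval_s * failures_to_trip


def geodns_worst_case_s(check_interval_s: int, failures_to_trip: int, ttl_s: int) -> int:
    """Detection time plus a full TTL for records cached just before the change."""
    return detection_s(check_interval_s, failures_to_trip) + ttl_s


def anycast_worst_case_s(check_interval_s: int, failures_to_trip: int, convergence_s: int) -> int:
    """Detection time plus the time for the route withdrawal to propagate."""
    return detection_s(check_interval_s, failures_to_trip) + convergence_s


# Assumed inputs: 10 s checks, 3 failures to trip, 300 s TTL, 30 s convergence.
print(geodns_worst_case_s(10, 3, 300))  # 330
print(anycast_worst_case_s(10, 3, 30))  # 60
```

The detection term is identical in both; the difference comes entirely from TTL versus routing convergence.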


9️⃣ Latency


GeoDNS Latency

GeoDNS estimates best region using DNS-level signals.

But it may route based on recursive resolver location, not actual user location.


Anycast Latency

Anycast uses network path selection.

It often routes users to a nearby edge location.


Important Caveat

Closest network path does not always mean best application performance.


👉 Interview Answer

GeoDNS can improve latency by returning region-specific IPs, but it may be inaccurate if the DNS resolver is far from the user.

Anycast often gives better edge proximity, but the best network path is not always the best application path.
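
The resolver-location problem can be shown with a tiny sketch. The location table and `geodns_region` are hypothetical; the point is only that the policy sees where the recursive resolver is, not where the user is.

```python
from typing import Dict

# Hypothetical mapping from a location to its nearest serving region.
NEAREST_REGION: Dict[str, str] = {
    "Sydney": "asia",
    "Virginia": "us-east",
    "London": "europe",
}


def geodns_region(resolver_location: str) -> str:
    """GeoDNS in this sketch decides from the resolver's location only."""
    return NEAREST_REGION[resolver_location]


user_location = "Sydney"
resolver_location = "Virginia"  # the user is configured with a far-away resolver

answered = geodns_region(resolver_location)
ideal = NEAREST_REGION[user_location]
print(answered)           # us-east
print(ideal)              # asia
print(answered == ideal)  # False
```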


🔟 Control and Policy


GeoDNS Gives More Explicit Control

GeoDNS supports:

  • Regional compliance routing
  • Weighted migration
  • Controlled failover

Anycast Gives Less Direct Per-user Control

Anycast routing is controlled by network path and BGP behavior.


Example

GeoDNS:
Send German users only to EU region.

Anycast:
Network routes to best announced path.

👉 Interview Answer

GeoDNS gives more explicit traffic-control policies.

It is easier to implement regional compliance, weighted migration, and controlled failover.

Anycast is more automatic, but gives less precise application-level control.
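
The "German users only to EU region" rule can be sketched as a filter applied before any other routing choice. The country codes, region names, and `eligible_regions` are assumptions for illustration. Note the design choice: if no allowed region is healthy, the sketch returns nothing instead of falling back to a non-compliant region.

```python
from typing import Dict, List, Set

ALL_REGIONS: List[str] = ["us-east", "eu-central", "eu-west"]

# Hypothetical compliance table: country -> regions allowed to serve it.
# Countries not listed here have no restriction.
ALLOWED_REGIONS: Dict[str, Set[str]] = {
    "DE": {"eu-central", "eu-west"},
}


def eligible_regions(country: str, healthy: Set[str]) -> List[str]:
    """Healthy regions that this country's traffic is allowed to reach."""
    allowed = ALLOWED_REGIONS.get(country)
    return [
        region
        for region in ALL_REGIONS
        if region in healthy and (allowed is None or region in allowed)
    ]


print(eligible_regions("DE", {"us-east", "eu-central", "eu-west"}))  # ['eu-central', 'eu-west']
print(eligible_regions("DE", {"us-east", "eu-west"}))                # ['eu-west']
print(eligible_regions("DE", {"us-east"}))                           # [] (fails closed)
print(eligible_regions("US", {"us-east", "eu-west"}))                # ['us-east', 'eu-west']
```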


1️⃣1️⃣ Health Checks


Why Health Checks Matter

Routing should avoid unhealthy regions.


GeoDNS Health Checks

DNS provider checks regional endpoints.

If region unhealthy:
stop returning that region IP

Anycast Health Checks

Local edge health controls route announcement.

If edge unhealthy:
withdraw BGP route

👉 Interview Answer

Health checks are critical for global routing.

GeoDNS uses health checks to decide which IPs to return.

Anycast uses health checks to decide whether a location should continue advertising a route.
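
Both approaches need a rule for when a location counts as unhealthy. One common pattern is to require several consecutive probe results before flipping state, so a single dropped probe does not pull a region out of DNS or withdraw a route. A minimal sketch, with hypothetical thresholds and a hypothetical `HealthGate` class:

```python
class HealthGate:
    """Flip health state only after N consecutive disagreeing probe results."""

    def __init__(self, fail_threshold: int = 3, pass_threshold: int = 2) -> None:
        self.fail_threshold = fail_threshold
        self.pass_threshold = pass_threshold
        self.healthy = True
        self._streak = 0  # consecutive probes that disagree with current state

    def observe(self, probe_ok: bool) -> bool:
        """Record one probe result and return the resulting health state."""
        if probe_ok == self.healthy:
            self._streak = 0
        else:
            self._streak += 1
            limit = self.pass_threshold if probe_ok else self.fail_threshold
            if self._streak >= limit:
                self.healthy = probe_ok
                self._streak = 0
        return self.healthy


gate = HealthGate()
probes = [True, False, False, True, False, False, False, True, True]
states = [gate.observe(ok) for ok in probes]
print(states)
# [True, True, True, True, True, True, False, False, True]
```

With GeoDNS, `healthy == False` would mean "stop returning this region's IP"; with Anycast, it would mean "withdraw the route".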


1️⃣2️⃣ DNS Caching Problem


Why DNS Caching Matters

DNS responses are cached by resolvers and clients.

Even if DNS policy changes, some users may keep using old IPs.


Example

TTL = 300 seconds

Region fails now.
Some users may keep using the old region's IP for up to 5 minutes, or longer if a resolver or client holds the record past its TTL.

Mitigation


👉 Interview Answer

DNS caching is the main limitation of GeoDNS failover.

Even after DNS stops returning an unhealthy region, cached records may continue routing users to the failed region until TTL expires.
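
The TTL = 300 example can be simulated with a tiny caching resolver. This is a sketch with a simulated clock; `CachingResolver` and the `state` dict are hypothetical, and the resolver here honors the TTL exactly, which real resolvers and clients do not always do.

```python
from typing import Callable, Optional, Tuple


class CachingResolver:
    """Caches one answer from the authoritative side until its TTL expires."""

    def __init__(self, authoritative: Callable[[], str], ttl_s: int) -> None:
        self._authoritative = authoritative
        self._ttl_s = ttl_s
        self._cached: Optional[Tuple[str, float]] = None  # (ip, expires_at)

    def lookup(self, now_s: float) -> str:
        if self._cached is not None and now_s < self._cached[1]:
            return self._cached[0]  # served from cache, possibly stale
        ip = self._authoritative()
        self._cached = (ip, now_s + self._ttl_s)
        return ip


state = {"answer": "192.0.2.10"}  # Region A (documentation address)
resolver = CachingResolver(lambda: state["answer"], ttl_s=300)

print(resolver.lookup(0))         # 192.0.2.10, cached until t=300
state["answer"] = "198.51.100.10" # Region A fails; policy now returns Region B
print(resolver.lookup(299))       # 192.0.2.10, still the failed region
print(resolver.lookup(300))       # 198.51.100.10, cache expired
```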


1️⃣3️⃣ Anycast Operational Challenges


Challenges

Anycast is powerful, but operationally complex.

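Challenges include:

  • BGP announcements and route withdrawals
  • Route leaks
  • Traffic imbalance
  • Debugging difficulty
  • Stateful connection issues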

Important Point

Anycast works best for stateless or edge-friendly services.


👉 Interview Answer

Anycast requires strong network operations.

Teams must manage BGP announcements, route withdrawals, traffic imbalance, and debugging.

It is especially suitable for stateless edge services.


1️⃣4️⃣ Stateful Connections


Why State Matters

Anycast routing can shift traffic if routes change.

For long-lived TCP connections, this may cause problems.


Examples


Mitigation


👉 Interview Answer

Anycast is easiest for stateless request-response traffic.

Long-lived or stateful connections require careful design, because route changes may move users to a different location.
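
Why a route change hurts a long-lived connection can be shown with a heavily simplified model. Each edge keeps its own connection table, so a packet for an established connection that arrives at a different edge finds no state there. The `Edge` class and the reply strings are a toy stand-in for TCP, not a real implementation.

```python
from typing import Set


class Edge:
    """Toy edge node that only knows connections it accepted itself."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.connections: Set[str] = set()

    def handle(self, conn_id: str, syn: bool = False) -> str:
        if syn:
            self.connections.add(conn_id)  # new connection established here
            return "SYN-ACK"
        if conn_id in self.connections:
            return "ACK"
        return "RST"  # no state for this connection at this edge


edge_a, edge_b = Edge("edge-a"), Edge("edge-b")

print(edge_a.handle("client-1:443", syn=True))  # SYN-ACK
print(edge_a.handle("client-1:443"))            # ACK

# A route change now delivers the same Anycast IP's traffic to edge-b.
print(edge_b.handle("client-1:443"))            # RST
```

For short request-response traffic the client simply reconnects; for a long-lived connection the reset is visible to the user.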


1️⃣5️⃣ When to Use GeoDNS


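Choose GeoDNS When

  • You need explicit regional control
  • You need compliance routing
  • You need weighted traffic shifting
  • You want simpler operations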

Example

Route US users to US region.
Route EU users to EU region.
Shift 10% traffic to new region.

👉 Interview Answer

I would use GeoDNS when I need explicit regional control, compliance routing, weighted traffic shifting, and simpler operations.

It is a good fit for multi-region application routing.


1️⃣6️⃣ When to Use Anycast


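Choose Anycast When

  • You run global edge services, a CDN, or DNS
  • You need DDoS protection
  • You need low-latency edge routing
  • You have mature network operations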

Example

Global CDN edge
→ Same Anycast IP worldwide
→ Users routed to nearest edge

👉 Interview Answer

I would use Anycast for global edge services, CDN, DNS, DDoS protection, and low-latency edge routing.

It is powerful, but requires mature network operations.


1️⃣7️⃣ Hybrid Design


Many Production Systems Use Both

GeoDNS
→ Route user to broad region

Anycast
→ Route user to nearest edge inside global network

Example

DNS routes user to global service IP.
Anycast routes to nearest edge.
Edge proxies to healthy origin region.

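Why Hybrid Works

  • GeoDNS provides regional policy control
  • Anycast provides fast edge routing
  • The edge proxies traffic to the healthiest origin region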

👉 Interview Answer

Many large systems combine GeoDNS and Anycast.

GeoDNS provides regional policy control, while Anycast provides fast edge routing.

The edge can then proxy traffic to the healthiest origin region.
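
The last hop of the hybrid design, where the edge picks an origin, might look like the sketch below. "Healthiest" is interpreted here as "healthy, with the lowest measured round-trip time", which is an assumption; the `Origin` type, region names, and RTT values are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Origin:
    region: str
    healthy: bool
    rtt_ms: float  # round-trip time as measured from this edge


def pick_origin(origins: List[Origin]) -> Origin:
    """Choose the healthy origin with the lowest RTT from this edge."""
    healthy = [o for o in origins if o.healthy]
    if not healthy:
        raise LookupError("no healthy origin region")
    return min(healthy, key=lambda o: o.rtt_ms)


origins_seen_by_edge = [
    Origin("us-east", healthy=True, rtt_ms=85.0),
    Origin("europe", healthy=False, rtt_ms=12.0),  # closest, but unhealthy
    Origin("asia", healthy=True, rtt_ms=210.0),
]
print(pick_origin(origins_seen_by_edge).region)  # us-east
```

Because this decision is made per request at the edge, it is not subject to DNS caching.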


1️⃣8️⃣ Common Failure Modes


Failure Modes

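Global routing can fail because of:

  • Stale DNS
  • Incorrect health checks
  • BGP instability
  • Traffic imbalance
  • Hidden single-region dependencies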

Example

DNS routes user away from failed region,
but cached DNS still points to old IP.

👉 Interview Answer

Global routing failures often come from stale DNS, incorrect health checks, BGP instability, traffic imbalance, or hidden single-region dependencies.

Routing alone does not guarantee application availability.


1️⃣9️⃣ Observability


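What to Monitor

  • DNS
  • Network routing
  • Edge locations
  • Origins
  • Latency
  • Traffic distribution
  • Health checks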

Debugging Questions


👉 Interview Answer

Global routing needs observability across DNS, network routing, edge locations, origins, latency, traffic distribution, and health checks.

Without this, routing failures are extremely hard to debug.
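
One of the signals above, traffic distribution, is easy to check mechanically. The sketch below computes each region's share of requests from a log and flags regions that drift from an expected share. The function names, the 10-point tolerance, and the sample log are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, List


def traffic_share(request_regions: Iterable[str]) -> Dict[str, float]:
    """Fraction of requests served by each region."""
    counts = Counter(request_regions)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {region: n / total for region, n in counts.items()}


def imbalanced(
    share: Dict[str, float], expected: Dict[str, float], tolerance: float = 0.10
) -> List[str]:
    """Regions whose actual share differs from the expected share by more than tolerance."""
    return sorted(
        region
        for region, want in expected.items()
        if abs(share.get(region, 0.0) - want) > tolerance
    )


# Hypothetical request log: one entry per request, tagged with serving region.
log = ["us-east"] * 70 + ["europe"] * 25 + ["asia"] * 5
share = traffic_share(log)
print(imbalanced(share, {"us-east": 0.5, "europe": 0.3, "asia": 0.2}))
# ['asia', 'us-east']
```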


2️⃣0️⃣ Best Practices


Practical Rules


Design Principle

GeoDNS gives control.
Anycast gives proximity.
Healthy systems need both routing and application failover.

👉 Interview Answer

GeoDNS and Anycast solve different parts of global routing.

GeoDNS gives explicit policy control.

Anycast gives network-level proximity and fast edge routing.

A reliable global system usually combines routing, health checks, stateless service design, and application-level failover.


🧠 Staff-Level Answer Final


👉 Interview Answer Full Version

GeoDNS and Anycast are two common approaches for global traffic routing.

GeoDNS makes routing decisions at the DNS layer.

The authoritative DNS server returns different IP addresses based on geography, latency, health checks, weights, compliance, or failover policy.

This gives strong regional control.

It is useful when we need to route users to specific regions, shift traffic gradually, enforce data residency, or fail over from one region to another.

The main weakness of GeoDNS is DNS caching.

Even after a region is marked unhealthy, resolvers and clients may keep using old DNS answers until TTL expires.

Anycast works at the network routing layer.

The same IP address is advertised from multiple locations using BGP.

Internet routing sends users to the nearest or best network path.

This is very useful for CDNs, DNS providers, DDoS protection, global edge services, and low-latency routing.

Anycast can fail over quickly when a location withdraws its route, but it requires mature network operations.

Teams must handle BGP announcements, route leaks, traffic imbalance, debugging difficulty, and stateful connection issues.

The key difference is: GeoDNS lets DNS decide based on policy, while Anycast lets the network route based on path.

GeoDNS gives more explicit control.

Anycast gives better edge proximity and faster network-level failover.

Many production systems combine both.

GeoDNS may route users to broad regions or global service endpoints, while Anycast routes them to the nearest edge.

The edge can then proxy to the healthiest origin region.

The system still needs application-level health checks, retry logic, stateless service design, externalized sessions, and observability.

The core principle is: GeoDNS gives control, Anycast gives proximity, and reliable global systems need both routing and application failover.


⭐ Final Insight

The core of global traffic routing is not:

"GeoDNS or Anycast?"

It is understanding that they solve different problems:

GeoDNS handles policy control.

Anycast handles network proximity.

A truly reliable global system also needs:

  • Health Checks
  • Application Failover
  • Stateless Services
  • Externalized Session State
  • Observability
  • Failover Testing

The most important takeaway:

GeoDNS gives control.

Anycast gives proximity.

Reliable systems need both routing and application failover.


