
🎯 Core Load Balancing Framework

When discussing load balancing in system design, I typically evaluate the options along three dimensions:

L4 vs L7 Load Balancing
Server-side vs Client-side Load Balancing
Trade-offs: performance, flexibility, and system complexity

1️⃣ L4 vs L7 Load Balancing

L4 Load Balancer (Transport Layer)

Definition:

Operates at TCP/UDP level
Routes based on IP + port

Strengths:

Very fast (minimal parsing)
Low latency
High throughput
Simple and stable

Limitations:

No understanding of request content
Cannot route based on URL / headers / cookies

Best fit:

High-throughput systems
Simple routing
Internal service traffic

L4 load balancers are optimized for performance. They forward traffic without inspecting application-level data, making them ideal for high-QPS systems where latency matters most.
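The routing decision described above can be sketched in a few lines. This is a minimal illustration, not how any particular product implements it: the `Flow` type, the `pick_backend_l4` function, and the backend addresses are hypothetical names chosen for the example. The point is that only transport-level fields (IP, port, protocol) feed the decision, and hashing them keeps every packet of one connection on the same backend.

```python
import hashlib
from typing import NamedTuple


class Flow(NamedTuple):
    """Transport-level identity of a connection; nothing from the payload."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str  # "tcp" or "udp"


def pick_backend_l4(flow: Flow, backends: list[str]) -> str:
    """Map a flow to a backend by hashing its 5-tuple (assumed strategy)."""
    if not backends:
        raise ValueError("no backends available")
    # Only IPs, ports, and protocol are hashed: no URL, header, or cookie.
    key = "|".join(str(part) for part in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Hypothetical backend addresses, for illustration only.
BACKENDS = ["10.0.0.1:9000", "10.0.0.2:9000", "10.0.0.3:9000"]
flow = Flow("203.0.113.7", 51234, "198.51.100.10", 443, "tcp")
chosen = pick_backend_l4(flow, BACKENDS)
```

Note that a plain modulo hash remaps most flows when the backend list changes; real L4 balancers often use consistent hashing or connection tracking to limit that.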

L7 Load Balancer (Application Layer)

Definition:

Operates at HTTP / HTTPS level
Routes based on request content

Capabilities:

Path-based routing (/api vs /images)
Header-based routing
Cookie-based routing (session affinity)
TLS termination

Strengths:

Flexible routing
Supports canary / A/B testing
Better observability

Limitations:

Higher latency (parsing overhead)
More CPU intensive

Best fit:

API gateways
Microservices routing
User-facing traffic

L7 load balancers provide rich routing capabilities. They are essential when routing decisions depend on request semantics, such as paths, headers, or user sessions.
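To make the contrast with L4 concrete, here is a rough sketch of a content-based routing decision. Everything named here is an assumption for illustration: the `Request` type, the `route_l7` function, the pool names, the `lb_instance` cookie, and the `x-canary` header are hypothetical and not taken from any specific proxy. It assumes header names have already been lowercased.

```python
from dataclasses import dataclass, field

# Hypothetical instance names that a sticky cookie may pin a session to.
STICKY_INSTANCES = {"api-1", "api-2"}


@dataclass
class Request:
    path: str
    headers: dict[str, str] = field(default_factory=dict)  # lowercase keys assumed
    cookies: dict[str, str] = field(default_factory=dict)


def under(path: str, prefix: str) -> bool:
    """True if path is the prefix itself or lies below it."""
    return path == prefix or path.startswith(prefix + "/")


def route_l7(request: Request) -> str:
    """Pick a target using request content: path, then cookie, then header."""
    # Path-based routing: static assets go to their own pool.
    if under(request.path, "/images"):
        return "image-pool"
    if under(request.path, "/api"):
        # Cookie-based session affinity: honor a known pinned instance.
        pinned = request.cookies.get("lb_instance")
        if pinned in STICKY_INSTANCES:
            return pinned
        # Header-based routing: opt-in canary traffic.
        if request.headers.get("x-canary", "").lower() == "true":
            return "api-canary"
        return "api-pool"
    return "default-pool"
```

Each branch needs the parsed HTTP request, which is exactly the work an L4 balancer skips.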

L4 vs L7 Summary

Aspect	L4	L7
Speed	Very fast	Slower
Routing logic	IP/Port	URL/Header/Cookie
Flexibility	Low	High
CPU cost	Low	Higher
Use case	Internal traffic	User-facing APIs

2️⃣ Server-side vs Client-side Load Balancing

Server-side Load Balancing

How it works:

Client → Load Balancer → Service instances

Examples:

NGINX
HAProxy
AWS Application Load Balancer

Strengths:

Centralized control
Simple client logic
Easy to manage

Limitations:

Additional network hop
Potential bottleneck
Cost overhead

Server-side load balancing is the most common approach. It simplifies client logic and centralizes routing decisions, but introduces an extra hop in the request path.
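As a sketch of what "centralized routing decisions" means, the class below keeps the instance list and health state in one place, so clients only need the balancer's address. The `RoundRobinBalancer` name and its methods are hypothetical, and round-robin is just one assumed policy; NGINX and HAProxy offer several others.

```python
class RoundRobinBalancer:
    """Central balancer: owns the instance list and skips unhealthy ones."""

    def __init__(self, instances: list[str]) -> None:
        self._instances = list(instances)
        self._healthy = set(self._instances)
        self._next = 0  # index of the next instance to try

    def mark_down(self, instance: str) -> None:
        """Record a failed health check."""
        self._healthy.discard(instance)

    def mark_up(self, instance: str) -> None:
        """Record a recovered instance (only if it is a known one)."""
        if instance in self._instances:
            self._healthy.add(instance)

    def pick(self) -> str:
        """Return the next healthy instance in rotation."""
        # Try each instance at most once per call.
        for _ in range(len(self._instances)):
            candidate = self._instances[self._next]
            self._next = (self._next + 1) % len(self._instances)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy instances")
```

A short usage example: with instances `a`, `b`, `c`, four calls to `pick()` yield `a`, `b`, `c`, `a`; after `mark_down("b")` the rotation continues with `c`, then `a`.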

Client-side Load Balancing

How it works:

Client queries service registry
Client directly calls instance

Examples:

Netflix Ribbon
gRPC (built-in client-side balancing policies)
Envoy (sidecar pattern: a local proxy balances on the client's behalf)

Strengths:

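Removes the extra hop → lower latency
Balancing decisions can use client-local signals (e.g., in-flight requests per instance)
No central balancer to scale: capacity grows with the number of clients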

Challenges:

More complex client logic
Requires service discovery (e.g., Consul, Eureka)
Harder to manage globally

Client-side load balancing pushes routing logic to the client. This improves latency and scalability, but increases system complexity and requires service discovery.
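The two steps above (query the registry, then call an instance directly) might look like the following sketch. `ServiceRegistry` is a hypothetical in-memory stand-in for a system such as Consul or Eureka, not their real client API, and picking the instance with the fewest in-flight requests is an assumed policy chosen for the example.

```python
class ServiceRegistry:
    """Hypothetical in-memory stand-in for a registry like Consul or Eureka."""

    def __init__(self) -> None:
        self._services: dict[str, list[str]] = {}

    def register(self, service: str, address: str) -> None:
        self._services.setdefault(service, []).append(address)

    def lookup(self, service: str) -> list[str]:
        return list(self._services.get(service, []))


class ClientSideBalancer:
    """Lives inside the client: caches instances and picks one per call."""

    def __init__(self, registry: ServiceRegistry, service: str) -> None:
        self._registry = registry
        self._service = service
        self._instances: list[str] = []
        self._in_flight: dict[str, int] = {}
        self.refresh()

    def refresh(self) -> None:
        """Re-query the registry, keeping counters for surviving instances."""
        self._instances = self._registry.lookup(self._service)
        self._in_flight = {a: self._in_flight.get(a, 0) for a in self._instances}

    def acquire(self) -> str:
        """Pick the instance with the fewest in-flight requests from this client."""
        if not self._instances:
            raise LookupError(f"no instances for {self._service}")
        chosen = min(self._instances, key=lambda a: self._in_flight[a])
        self._in_flight[chosen] += 1
        return chosen

    def release(self, address: str) -> None:
        """Call when the request to `address` finishes."""
        if self._in_flight.get(address, 0) > 0:
            self._in_flight[address] -= 1
```

The extra moving parts (a registry lookup, a refresh cycle, per-client state) are the complexity cost mentioned above; each client only sees its own in-flight counts, not the global load.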

3️⃣ Trade-offs & System Design Decisions

When to choose L4

Ultra low latency requirement
Very high throughput (e.g., millions of QPS)
No need for smart routing

👉 Example:

Internal RPC traffic
Database proxy layer

When to choose L7

Need routing by path / user / feature
API gateway / edge layer
Canary release / A/B testing

👉 Example:

Microservices architecture
Public APIs
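The canary release / A/B testing case listed above is a good example of a decision that needs L7 information. A minimal sketch, assuming the balancer can read a user ID from the request: hash the ID into one of 100 buckets and send the lowest buckets to the canary. The function names and the `"canary"` / `"stable"` labels are hypothetical.

```python
import hashlib


def bucket_for(user_id: str, buckets: int = 100) -> int:
    """Deterministically map a user ID to a bucket in [0, buckets)."""
    digest = hashlib.sha256(user_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def choose_version(user_id: str, canary_percent: int) -> str:
    """Send roughly canary_percent% of users to the canary, the rest to stable."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return "canary" if bucket_for(user_id) < canary_percent else "stable"
```

Because the bucket depends only on the user ID, a given user keeps seeing the same version, and raising the percentage only moves users from stable to canary, never back.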

When to use Client-side LB

Large-scale microservices
Internal service-to-service calls
Need to reduce hop latency

👉 Often combined with:

Service discovery
Service mesh (e.g., Envoy)

Hybrid Approach (Very Common)

Most real systems combine L7, L4, and client-side load balancing:

Client → L7 (API Gateway) → L4 (internal LB) → Services
                           ↑
                   client-side LB (optional)

In practice, modern architectures rarely rely on a single approach. We typically combine L7 at the edge for flexibility, L4 internally for performance, and sometimes client-side load balancing for service-to-service calls.

🧠 Senior / Staff-Level Answer

When discussing load balancing, I distinguish between L4 and L7 based on routing capability and performance. L4 is optimized for speed and high throughput, while L7 enables flexible routing based on request semantics. I also differentiate between server-side and client-side load balancing. Server-side simplifies architecture but adds an extra hop, whereas client-side improves latency and scalability at the cost of complexity. In large-scale systems, we usually adopt a hybrid approach — L7 at the edge, L4 internally, and client-side load balancing for service-to-service communication.

⭐ Staff-Level Insight (Bonus)

Load balancing is not just about distributing traffic — it’s about where you place intelligence in the system.

L7 centralizes intelligence, client-side distributes it, and L4 removes it for performance.

The real design decision is choosing the right level of intelligence for each layer of your architecture.

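Chinese Summary (translated)

🎯 Core Framework

When discussing load balancing in system design, I usually analyze it along three dimensions:

L4 vs L7
Server-side vs Client-side
Trade-offs among performance, flexibility, and complexity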

1️⃣ L4 vs L7

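L4

Forwards based on IP + port
Very high performance
No understanding of request content

👉 Best fit:

Internal service communication
High-throughput scenarios

L7

Routes based on HTTP content
Supports path / header / cookie routing
Supports canary releases

👉 Best fit:

API gateways
Microservices routing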

2️⃣ Server vs Client LB

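Server-side

Single entry point
Simple
One extra hop

Client-side

No extra hop
Faster
More complex

3️⃣ Design Trade-offs

L4 → performance first
L7 → flexibility first
Client-side → latency optimization + scalability

🧠 Summary

L4 solves the performance problem, L7 solves the routing problem, and client-side load balancing solves the scaling problem.

Real systems usually combine all three.

Note (an alternative layering, with L4 in front):

Client → L4 Load Balancer (very high performance) → L7 Gateway (intelligent processing) → Services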
