🎯 Core Load Balancing Framework
When discussing load balancing in system design, I typically evaluate across three dimensions:
- L4 vs L7 Load Balancing
- Server-side vs Client-side Load Balancing
- Trade-offs: performance, flexibility, and system complexity
1️⃣ L4 vs L7 Load Balancing
L4 Load Balancer (Transport Layer)
Definition:
- Operates at TCP/UDP level
- Routes based on IP + port
Strengths:
- Very fast (minimal parsing)
- Low latency
- High throughput
- Simple and stable
Limitations:
- No understanding of request content
- Cannot route based on URL / headers / cookies
Best fit:
- High-throughput systems
- Simple routing
- Internal service traffic
L4 load balancers are optimized for performance. They forward traffic without inspecting application-level data, making them ideal for high-QPS systems where latency matters most.
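To make "forward without inspecting application-level data" concrete, here is a minimal sketch of one way an L4 balancer could choose a backend: hash the connection tuple and index into the backend list. The `Flow` type, the `pick_backend_l4` function, and the choice of SHA-256 are illustrative assumptions, not a description of any specific product.

```python
import hashlib
from typing import NamedTuple


class Flow(NamedTuple):
    """Connection metadata visible at the transport layer (hypothetical type)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str  # "tcp" or "udp"


def pick_backend_l4(flow: Flow, backends: list[str]) -> str:
    """Pick a backend from the connection tuple only; the payload is never read."""
    key = f"{flow.src_ip}:{flow.src_port}-{flow.dst_ip}:{flow.dst_port}-{flow.protocol}"
    digest = hashlib.sha256(key.encode()).digest()
    # The same flow always hashes to the same index, so a connection stays
    # on one backend for as long as the backend list is unchanged.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```

Because the decision uses only addresses, ports, and protocol, there is nothing to parse per request, which is where the speed advantage comes from.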
L7 Load Balancer (Application Layer)
Definition:
- Operates at HTTP / HTTPS level
- Routes based on request content
Capabilities:
- Path-based routing (/api vs /images)
- Header-based routing
- Cookie-based routing (session affinity)
- TLS termination
Strengths:
- Flexible routing
- Supports canary / A/B testing
- Better observability
Limitations:
- Higher latency (parsing overhead)
- More CPU intensive
Best fit:
- API gateways
- Microservices routing
- User-facing traffic
L7 load balancers provide rich routing capabilities. They are essential when routing decisions depend on request semantics, such as paths, headers, or user sessions.
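As a rough illustration of routing on request semantics, the sketch below picks a backend pool from the path and a header. The `Request` type, the pool names, and the `x-canary` header are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Minimal stand-in for a parsed HTTP request (hypothetical type)."""
    path: str
    headers: dict[str, str] = field(default_factory=dict)


def route_l7(req: Request) -> str:
    """Return the name of the backend pool that should serve this request."""
    # Path-based routing: static assets go to a separate pool.
    if req.path.startswith("/images"):
        return "static-pool"
    if req.path.startswith("/api"):
        # Header-based routing: opted-in requests reach the canary pool.
        if req.headers.get("x-canary") == "true":
            return "api-canary"
        return "api-pool"
    return "default-pool"
```

Each of these decisions requires parsing the HTTP request first, which is the source of the extra latency and CPU cost listed above.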
L4 vs L7 Summary
| Aspect | L4 | L7 |
|---|---|---|
| Speed | Very fast | Slower |
| Routing logic | IP/Port | URL/Header/Cookie |
| Flexibility | Low | High |
| CPU cost | Low | Higher |
| Use case | Internal traffic | User-facing APIs |
2️⃣ Server-side vs Client-side Load Balancing
Server-side Load Balancing
How it works:
- Client → Load Balancer → Service instances
Examples:
- NGINX
- HAProxy
- AWS Application Load Balancer
Strengths:
- Centralized control
- Simple client logic
- Easy to manage
Limitations:
- Additional network hop
- Potential bottleneck
- Cost overhead
Server-side load balancing is the most common approach. It simplifies client logic and centralizes routing decisions, but introduces an extra hop in the request path.
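Centralized control means the balancer can hold state that no single client sees. The sketch below assumes a least-connections policy, which is one common choice and not something these notes prescribe; the class and method names are hypothetical.

```python
class LeastConnectionsBalancer:
    """Central balancer that tracks in-flight requests per backend (sketch)."""

    def __init__(self, backends: list[str]) -> None:
        self._active = {backend: 0 for backend in backends}

    def acquire(self) -> str:
        """Choose the backend with the fewest in-flight requests."""
        # Ties resolve to the backend listed first.
        backend = min(self._active, key=self._active.get)
        self._active[backend] += 1
        return backend

    def release(self, backend: str) -> None:
        """Mark one request to this backend as finished."""
        self._active[backend] -= 1
```

Clients only need the balancer's address; the bookkeeping lives in one place, which is also why that place can become a bottleneck.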
Client-side Load Balancing
How it works:
- Client queries service registry
- Client directly calls instance
Examples:
- Netflix Ribbon
- gRPC
- Envoy (sidecar pattern)
Strengths:
- Removes extra hop → lower latency
- Each client can balance on signals it observes directly (e.g., latency, errors)
- Scales naturally
Challenges:
- More complex client logic
- Requires service discovery (e.g., Consul, Eureka)
- Harder to manage globally
Client-side load balancing pushes routing logic to the client. This improves latency and scalability, but increases system complexity and requires service discovery.
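The two steps above (query the registry, then call an instance directly) can be sketched as follows. `StaticRegistry` is a hypothetical in-memory stand-in for a real discovery system such as Consul or Eureka, and round-robin is an assumed policy.

```python
import itertools


class StaticRegistry:
    """In-memory stand-in for a service registry (hypothetical)."""

    def __init__(self, services: dict[str, list[str]]) -> None:
        self._services = services

    def lookup(self, name: str) -> list[str]:
        return list(self._services.get(name, []))


class RoundRobinClient:
    """Client that resolves instances itself and rotates through them."""

    def __init__(self, registry: StaticRegistry, service: str) -> None:
        self._registry = registry
        self._service = service
        self._counter = itertools.count()

    def next_instance(self) -> str:
        # Step 1: ask the registry for the current instance list.
        instances = self._registry.lookup(self._service)
        if not instances:
            raise LookupError(f"no instances registered for {self._service}")
        # Step 2: pick one locally; the call then goes straight to it.
        return instances[next(self._counter) % len(instances)]
```

Every client now carries this logic and depends on the registry being fresh, which is the complexity cost described above.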
3️⃣ Trade-offs & System Design Decisions
When to choose L4
- Ultra low latency requirement
- Very high throughput (e.g., millions of QPS)
- No need for smart routing
👉 Example:
- Internal RPC traffic
- Database proxy layer
When to choose L7
- Need routing by path / user / feature
- API gateway / edge layer
- Canary release / A/B testing
👉 Example:
- Microservices architecture
- Public APIs
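A canary release at L7 is often implemented as a stable percentage split. The sketch below assumes users are identified by a string ID and hashed into 100 buckets; the function name, the bucket count, and the version labels are illustrative assumptions.

```python
import hashlib


def choose_version(user_id: str, canary_percent: int) -> str:
    """Send roughly canary_percent% of users to the canary, consistently per user."""
    digest = hashlib.sha256(user_id.encode()).digest()
    # Map each user to a bucket in 0..99; the same user always gets the same bucket.
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "canary" if bucket < canary_percent else "stable"
```

Hashing the user ID instead of picking randomly per request keeps each user on one version, so their experience does not flip between releases.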
When to use Client-side LB
- Large-scale microservices
- Internal service-to-service calls
- Need to reduce hop latency
👉 Often combined with:
- Service discovery
- Service mesh (e.g., Envoy)
Hybrid Approach (Very Common)
Most real systems combine all three:
Client → L7 (API Gateway) → L4 (internal LB) → Services
Service → Service: client-side LB (optional)
In practice, modern architectures rarely rely on a single approach. We typically combine L7 at the edge for flexibility, L4 internally for performance, and sometimes client-side load balancing for service-to-service calls.
🧠 Senior / Staff-Level Answer
When discussing load balancing, I distinguish between L4 and L7 based on routing capability and performance. L4 is optimized for speed and high throughput, while L7 enables flexible routing based on request semantics. I also differentiate between server-side and client-side load balancing. Server-side simplifies architecture but adds an extra hop, whereas client-side improves latency and scalability at the cost of complexity. In large-scale systems, we usually adopt a hybrid approach — L7 at the edge, L4 internally, and client-side load balancing for service-to-service communication.
⭐ Staff-Level Insight (Bonus)
Load balancing is not just about distributing traffic — it’s about where you place intelligence in the system.
L7 centralizes intelligence, client-side distributes it, and L4 removes it for performance.
The real design decision is choosing the right level of intelligence for each layer of your architecture.
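Quick Recap (translated from the Chinese section)
🎯 Core Framework
When discussing load balancing in system design, I usually analyze it along three dimensions:
- L4 vs L7
- Server-side vs Client-side
- Trade-offs among performance, flexibility, and complexity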
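1️⃣ L4 vs L7
L4
- Forwards based on IP + port
- Very high performance
- No understanding of request content
👉 Best fit:
- Internal service communication
- High-throughput scenarios
L7
- Routes based on HTTP content
- Supports path / header / cookie routing
- Supports canary releases
👉 Best fit:
- API Gateway
- Microservices routing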
2️⃣ Server-side vs Client-side LB
Server-side
- Single entry point
- Simple
- One extra hop
Client-side
- No extra hop
- Lower latency
- More complex
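3️⃣ Design Trade-offs
- L4 → performance first
- L7 → flexibility first
- Client-side → latency optimization + scalability
🧠 Summary
L4 addresses performance, L7 addresses routing, and client-side load balancing addresses scalability.
Real systems usually combine all three.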
Note:
Client → L4 Load Balancer (very high performance) → L7 Gateway (intelligent processing) → Services
Implement