🎯 How Airbnb Handles Search & Ranking

1️⃣ Core Search Framework (Staff-Level)

When discussing an Airbnb-like search and ranking system, I frame it as:

Query understanding
Candidate retrieval
Availability and policy filtering
Feature enrichment
Ranking and personalization
Diversity and marketplace controls
Booking feedback loops
Trade-offs: relevance vs availability freshness vs latency vs fairness

2️⃣ Core Problem

Airbnb search is not just text search.

It must return listings that are:

geographically relevant
available for the requested dates
suitable for guest filters
trusted and high quality
likely to convert
fair to hosts and healthy for the marketplace

👉 Interview Answer

Airbnb search combines information retrieval with real-time marketplace constraints. A listing is not a good result if it is unavailable, unsafe, too expensive for the intent, or unlikely to convert. The system must retrieve broadly, filter accurately, and rank for both guest relevance and marketplace quality.

3️⃣ High-Level Architecture

Search Request
  location + dates + guests + filters
        ↓
Query Understanding
        ↓
Search Index Candidate Retrieval
        ↓
Availability / Policy Filtering
        ↓
Feature Enrichment
        ↓
Ranking Model
        ↓
Diversity / Business Rules
        ↓
Search Results Page
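
The flow above can be read as a chain of stages that each take the request and the current candidate list and return a new list. The sketch below is only an illustration of that shape: `Stage`, `run_pipeline`, and the toy stages are hypothetical names, not Airbnb components.

```python
from typing import Callable, Dict, List, Sequence

Candidate = Dict[str, object]
# Hypothetical stage signature: (request, candidates) -> candidates
Stage = Callable[[Dict[str, object], List[Candidate]], List[Candidate]]


def run_pipeline(request: Dict[str, object], stages: Sequence[Stage]) -> List[Candidate]:
    """Run the stages in order; each stage narrows or reorders the candidates."""
    candidates: List[Candidate] = []
    for stage in stages:
        candidates = stage(request, candidates)
    return candidates


# Toy stages standing in for retrieval, availability filtering, and ranking.
def retrieve(request, _candidates):
    return [
        {"id": "a", "available": True, "score": 0.2},
        {"id": "b", "available": False, "score": 0.9},
        {"id": "c", "available": True, "score": 0.7},
    ]


def filter_available_stage(request, candidates):
    return [c for c in candidates if c["available"]]


def rank_stage(request, candidates):
    return sorted(candidates, key=lambda c: c["score"], reverse=True)


results = run_pipeline({"guests": 2}, [retrieve, filter_available_stage, rank_stage])
```

Keeping every stage behind the same signature makes it easier to swap a stage for a cheaper fallback when one dependency is slow.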

4️⃣ Candidate Retrieval

First-stage retrieval may use:

location radius
map bounding box
destination intent
text query
listing type
price range
capacity
basic quality filters

The goal is high recall.

👉 Interview Answer

Candidate retrieval should be fast and recall-oriented. I would use a search index with geographic fields, listing metadata, and basic filters to produce a few thousand possible listings before doing expensive checks and ranking.
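
A minimal sketch of the recall-oriented first stage, assuming a map bounding box plus a few cheap metadata filters. A real system would run this against a search index; the linear scan, the `Listing` fields, and `retrieve_candidates` are hypothetical stand-ins used only to show which checks belong here.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Listing:
    listing_id: str
    lat: float
    lng: float
    capacity: int
    nightly_price: float
    listing_type: str


@dataclass(frozen=True)
class BoundingBox:
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat: float, lng: float) -> bool:
        # Simplification: ignores boxes that cross the antimeridian.
        return self.south <= lat <= self.north and self.west <= lng <= self.east


def retrieve_candidates(
    listings: Iterable[Listing],
    box: BoundingBox,
    guests: int,
    max_price: Optional[float] = None,
    listing_types: Optional[Set[str]] = None,
    limit: int = 2000,
) -> List[Listing]:
    """Cheap coarse filters only; availability is deliberately not checked here."""
    out: List[Listing] = []
    for listing in listings:
        if not box.contains(listing.lat, listing.lng):
            continue
        if listing.capacity < guests:
            continue
        if max_price is not None and listing.nightly_price > max_price:
            continue
        if listing_types and listing.listing_type not in listing_types:
            continue
        out.append(listing)
        if len(out) >= limit:  # cap the work handed to later, costlier stages
            break
    return out
```

The `limit` default mirrors the "few thousand candidates" budget; the exact number is an assumption.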

5️⃣ Availability Filtering

Availability is harder than normal search filtering because booking state changes constantly.

Important data:

listing calendar
nightly availability
minimum stay
maximum stay
blocked dates
host rules
instant book eligibility
pricing and fees

👉 Interview Answer

Availability filtering needs fresher data than the general search index. I would avoid relying only on stale indexed availability. The common design is to retrieve candidates from the index and then perform a fresher availability check before final ranking.
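
As a rough illustration of the fresher check, the sketch below validates a stay against a per-listing calendar with minimum and maximum stay rules. The `Calendar` shape and the in-memory `calendars` dict are assumptions; in practice this data would come from an availability service rather than the search index.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Dict, List, Set


@dataclass
class Calendar:
    min_nights: int = 1
    max_nights: int = 365
    # Nights that are already booked or blocked by the host.
    unavailable: Set[date] = field(default_factory=set)


def is_bookable(cal: Calendar, check_in: date, check_out: date) -> bool:
    nights = (check_out - check_in).days
    if nights < cal.min_nights or nights > cal.max_nights:
        return False
    # The check-out day is not a stayed night, so it is excluded from the range.
    return all(
        check_in + timedelta(days=i) not in cal.unavailable for i in range(nights)
    )


def filter_available(
    candidate_ids: List[str],
    calendars: Dict[str, Calendar],
    check_in: date,
    check_out: date,
) -> List[str]:
    """Keep only candidates whose fresh calendar allows the requested stay."""
    return [
        cid
        for cid in candidate_ids
        if cid in calendars and is_bookable(calendars[cid], check_in, check_out)
    ]
```

Candidates with no calendar record are dropped here; whether to drop or keep them when the data is missing is a product decision.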

6️⃣ Ranking Signals

Guest-side signals:

query location match
price fit
review quality
photos
amenities
cancellation policy
historical conversion
personalization

Host and marketplace signals:

host reliability
response rate
booking acceptance
inventory diversity
new listing exploration
trust and safety signals

👉 Interview Answer

Ranking should optimize more than clicks. For Airbnb, a good ranking target includes booking probability, guest satisfaction, host reliability, price fit, and trust signals. Optimizing only click-through can surface attractive but poorly converting or unreliable listings.
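
To make the point concrete, here is a toy comparison between sorting by predicted click-through and sorting by a blended marketplace score. The feature names and hand-picked weights are purely illustrative assumptions; a production ranker would more likely be a learned model than a fixed linear formula.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Features:
    listing_id: str
    p_click: float           # predicted click-through probability
    p_book: float            # predicted booking probability
    review_score: float      # normalized to 0..1
    host_reliability: float  # normalized to 0..1
    price_fit: float         # normalized to 0..1

# Illustrative weights only; not tuned and not from any real system.
WEIGHTS = {"p_book": 0.5, "review_score": 0.2, "host_reliability": 0.2, "price_fit": 0.1}


def click_score(f: Features) -> float:
    return f.p_click


def marketplace_score(f: Features) -> float:
    return (
        WEIGHTS["p_book"] * f.p_book
        + WEIGHTS["review_score"] * f.review_score
        + WEIGHTS["host_reliability"] * f.host_reliability
        + WEIGHTS["price_fit"] * f.price_fit
    )


def rank(candidates: List[Features], score: Callable[[Features], float]) -> List[str]:
    return [f.listing_id for f in sorted(candidates, key=score, reverse=True)]


candidates = [
    Features("flashy", p_click=0.9, p_book=0.1, review_score=0.5, host_reliability=0.4, price_fit=0.5),
    Features("solid", p_click=0.4, p_book=0.3, review_score=0.9, host_reliability=0.9, price_fit=0.8),
]
by_click = rank(candidates, click_score)
by_marketplace = rank(candidates, marketplace_score)
```

With these made-up numbers, the click-only ordering puts the attractive but weakly converting listing first, while the blended score prefers the reliable one.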

7️⃣ Personalization and Diversity

Personalization examples:

family trip vs solo trip
budget preference
preferred neighborhoods
amenity preference
past booking behavior

Diversity controls:

avoid showing identical homes
mix price ranges
avoid over-concentrating one neighborhood
give high-quality new listings some exposure
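
One simple way to express a diversity control such as "avoid over-concentrating one neighborhood" is a greedy re-rank with a per-group cap. This is a sketch under that assumption; `diversify`, `max_per_group`, and the neighborhood grouping are hypothetical, and real systems may use more sophisticated methods.

```python
from collections import Counter
from typing import Callable, Hashable, List, TypeVar

T = TypeVar("T")


def diversify(
    ranked: List[T],
    page_size: int,
    max_per_group: int,
    group_of: Callable[[T], Hashable],
) -> List[T]:
    """Walk the best-first list and cap how many items one group may take on the page."""
    page: List[T] = []
    deferred: List[T] = []
    counts: Counter = Counter()
    for item in ranked:
        group = group_of(item)
        if len(page) < page_size and counts[group] < max_per_group:
            page.append(item)
            counts[group] += 1
        else:
            deferred.append(item)
    # If the cap left the page short, backfill with deferred items in rank order.
    for item in deferred:
        if len(page) >= page_size:
            break
        page.append(item)
    return page


ranked = [("a", "Mission"), ("b", "Mission"), ("c", "Mission"), ("d", "SoMa"), ("e", "Sunset")]
page = diversify(ranked, page_size=4, max_per_group=2, group_of=lambda item: item[1])
```

The same function can group by price band or listing type instead of neighborhood by passing a different `group_of`.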

8️⃣ Freshness and Consistency

Common race:

Search result says available
        ↓
User clicks listing
        ↓
Another guest books it
        ↓
Checkout fails

Mitigations:

fresher availability checks near booking
optimistic validation at checkout (re-check availability when the booking is committed)
a clear "no longer available" state when that validation fails
cache invalidation from booking events
short TTL for availability-derived results
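
The race above is commonly handled with an optimistic, version-checked write at checkout. The sketch below assumes a calendar record that carries a version number; the in-memory `CalendarStore` and its lock stand in for a database conditional update and are hypothetical.

```python
import threading
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, Set, Tuple


@dataclass
class CalendarRecord:
    version: int = 0
    booked: Set[str] = field(default_factory=set)  # ISO date strings of booked nights


class CalendarStore:
    """In-memory stand-in for a store that supports conditional writes."""

    def __init__(self) -> None:
        self._records: Dict[str, CalendarRecord] = {}
        self._lock = threading.Lock()

    def read(self, listing_id: str) -> Tuple[int, FrozenSet[str]]:
        with self._lock:
            rec = self._records.setdefault(listing_id, CalendarRecord())
            return rec.version, frozenset(rec.booked)

    def try_book(self, listing_id: str, nights: Iterable[str], expected_version: int) -> bool:
        """Commit only if nothing changed since the caller's read; otherwise reject."""
        wanted = set(nights)
        with self._lock:
            rec = self._records.setdefault(listing_id, CalendarRecord())
            if rec.version != expected_version or rec.booked & wanted:
                return False  # stale view: caller should re-read and show "unavailable"
            rec.booked |= wanted
            rec.version += 1
            return True
```

The version check is strict: any intervening change to the listing rejects the write, even for non-overlapping nights, so the caller would re-read and retry if the nights are still free.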

9️⃣ Staff-Level Trade-offs

Decision                      Benefit                    Cost
Index more availability       Faster filtering           Staleness risk
Check availability online     More accurate              Higher latency
Rich personalization          Higher conversion          Harder debugging
Diversity controls            Better marketplace health  May reduce short-term CTR
Exploration for new listings  Better supply growth       Less certain relevance

🔟 Failure Handling

If the ranking model is slow:

degrade to simpler ranker
use cached feature values
reduce candidate count
apply deterministic fallback sort

If the availability service is degraded:

treat date-sensitive results as lower confidence
show fewer results with stricter validation
validate again on listing page or checkout
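
A minimal sketch of the "degrade to a simpler ranker" path, assuming the model ranker is called with a deadline and a deterministic sort is used when it times out or fails. `rank_with_deadline`, the 150 ms default, and the review-score fallback key are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, Dict, List, Tuple

Candidate = Dict[str, object]

# Shared pool so a timed-out call does not block the request while it finishes.
_EXECUTOR = ThreadPoolExecutor(max_workers=4)


def fallback_sort(candidates: List[Candidate]) -> List[Candidate]:
    """Deterministic fallback: best review score first, ties broken by id."""
    return sorted(candidates, key=lambda c: (-c["review_score"], c["id"]))


def rank_with_deadline(
    candidates: List[Candidate],
    model_ranker: Callable[[List[Candidate]], List[Candidate]],
    timeout_s: float = 0.15,
) -> Tuple[List[Candidate], str]:
    """Return (ranked candidates, which ranker produced them)."""
    future = _EXECUTOR.submit(model_ranker, candidates)
    try:
        return future.result(timeout=timeout_s), "model"
    except FutureTimeout:
        future.cancel()  # only has an effect if the task has not started yet
    except Exception:
        pass  # model error: fall through to the deterministic sort
    return fallback_sort(candidates), "fallback"
```

Returning the source label alongside the results lets the caller log how often the fallback path is taken.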

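Recap Section

Quick Notes

In One Sentence

Airbnb search is not ordinary search; it is marketplace search: search recall + real-time inventory and bookability filtering + personalized ranking.

Key Points to Memorize

The first stage uses the search index for high recall
Availability changes quickly, so stale data in the index cannot be fully trusted
The ranking target is not CTR alone, but booking probability, trust, price fit, and guest satisfaction
Availability must be validated again before checkout
The core trade-off is query latency vs booking-state freshness

Recap Interview Answer

I would split the Airbnb search system into two layers. The first layer is high-recall candidate retrieval, which uses the search index to find candidate listings by location, dates, guest count, price, listing type, and basic filters. The second layer is a fresher availability and policy check, because listing calendars and booking state change quickly and the bookable state stored in the index may already be stale.

The ranking stage cannot optimize click-through rate alone. A more reasonable target combines booking conversion, price fit, listing quality, reviews, host reliability, cancellation policy, trust and safety, and the guest's personal preferences. Finally, diversity and marketplace controls are needed so that results do not all cluster in the same area, the same price band, or the same listing type.

The staff-level point is that a listing that looks relevant but cannot be booked is not a good result. The system should therefore separate fast search from fresh inventory validation, and perform the final validation on the listing page or before checkout.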

✅ Final Interview Answer

An Airbnb-like search system is a marketplace search system, not just a document search engine. I would first use a search index to retrieve geographically and semantically relevant listings with high recall. Then I would apply fresher availability, policy, and trust filters because indexed data can be stale. After that, I would enrich candidates with price, review, host, guest preference, and conversion features and rank them with a model.

The key challenge is balancing fast search latency with fresh booking state. A great result is useless if it cannot be booked. At staff level, I would separate fast candidate retrieval from fresher validation and ranking, then use feedback from impressions, clicks, booking attempts, cancellations, and guest satisfaction to improve the system continuously.
