System Design Deep Dive - 25 Design Email System

Post by ailswan May. 18, 2026

🎯 Design Email System

1️⃣ Core Framework

When discussing Email System design, I frame it as:

  1. Email composition and submission
  2. Message storage and metadata
  3. Sending pipeline and SMTP delivery
  4. Queue, retry, bounce, and failure handling
  5. Inbox, search, and folder management
  6. Templates, bulk email, and notification email
  7. Spam, abuse, reputation, and rate limiting
  8. Trade-offs: deliverability vs latency vs reliability

2️⃣ Core Requirements


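Functional Requirements

Compose, send, and receive email
Store messages and attachments
Organize messages into folders and labels
Search the mailbox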

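Non-functional Requirements

Reliable delivery
Durable message storage
Scalable mailbox search
Spam prevention and sender reputation protection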

👉 Interview Answer

An email system stores, sends, receives, indexes, and organizes messages.

The core challenges are reliable delivery, durable message storage, scalable mailbox search, spam prevention, bounce handling, and maintaining sender reputation.


3️⃣ Core Concepts


Email Message

An email contains:

From address
Recipients (to, cc, bcc)
Subject
Body
Headers
Attachments

Mailbox

A mailbox stores messages for a user.

Common folders:

Inbox
Sent
Drafts
Trash
Spam
Archive

SMTP

SMTP is used to send email between mail servers.


IMAP / POP3

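IMAP and POP3 are used by email clients to retrieve email from the mail server. IMAP keeps messages on the server and syncs state across clients; POP3 typically downloads messages to a single client.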


👉 Interview Answer

I would separate message storage from mailbox views.

The message body and attachments are stored durably, while mailbox metadata tracks folders, labels, read state, and thread relationships.


4️⃣ Main APIs


Send Email

POST /api/emails/send

Request:

{
  "from": "alice@example.com",
  "to": ["bob@example.com"],
  "cc": [],
  "bcc": [],
  "subject": "Hello",
  "body": "Hi Bob",
  "attachments": ["file_123"]
}

Save Draft

POST /api/emails/drafts

Get Inbox

GET /api/mailbox/inbox?limit=50&cursor=xxx

Get Email

GET /api/emails/{emailId}

Search Email

GET /api/emails/search?q=invoice

Update Mailbox State

POST /api/emails/{emailId}/labels

👉 Interview Answer

I would expose APIs for sending email, saving drafts, reading mailbox folders, fetching individual messages, searching email, and updating labels or read state.

Sending should be asynchronous because SMTP delivery can be slow or fail.
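
The inbox endpoint above takes `limit` and `cursor` parameters. One common way to implement that is keyset pagination over `(received_at, mailbox_item_id)`. The sketch below is only an illustration under assumptions: an in-memory list stands in for the `mailbox_item` table, and the cursor encoding and the names `encode_cursor`, `decode_cursor`, and `list_inbox` are hypothetical.

```python
from __future__ import annotations

import base64
import json


def encode_cursor(received_at: str, item_id: str) -> str:
    """Pack the sort key of the last returned row into an opaque cursor."""
    raw = json.dumps([received_at, item_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> tuple[str, str]:
    received_at, item_id = json.loads(base64.urlsafe_b64decode(cursor))
    return received_at, item_id


def list_inbox(items: list[dict], limit: int = 50, cursor: str | None = None):
    """Return (page, next_cursor), newest first.

    Assumes received_at is an ISO-8601 string in one timezone, so that
    string order matches time order.
    """
    def sort_key(item: dict) -> tuple[str, str]:
        return (item["received_at"], item["mailbox_item_id"])

    ordered = sorted(items, key=sort_key, reverse=True)
    if cursor:
        last = decode_cursor(cursor)
        # Keep only rows strictly older than the last row already returned.
        ordered = [i for i in ordered if sort_key(i) < last]
    page = ordered[:limit]
    has_more = len(ordered) > limit
    next_cursor = encode_cursor(*sort_key(page[-1])) if has_more else None
    return page, next_cursor
```

Unlike offset pagination, the cursor stays stable when new mail arrives at the top of the inbox between requests.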


5️⃣ Data Model


Message Table

email_message (
  message_id VARCHAR PRIMARY KEY,
  sender_id VARCHAR,
  from_address VARCHAR,
  subject TEXT,
  body_location TEXT,
  headers JSON,
  status VARCHAR,
  created_at TIMESTAMP
)

Recipient Table

email_recipient (
  message_id VARCHAR,
  recipient_address VARCHAR,
  recipient_type VARCHAR, -- to, cc, bcc
  delivery_status VARCHAR,
  provider_response JSON,
  updated_at TIMESTAMP,
  PRIMARY KEY (message_id, recipient_address)
)

Mailbox Item Table

mailbox_item (
  user_id VARCHAR,
  mailbox_item_id VARCHAR,
  message_id VARCHAR,
  folder VARCHAR,
  labels ARRAY,
  read BOOLEAN,
  starred BOOLEAN,
  thread_id VARCHAR,
  received_at TIMESTAMP,
  PRIMARY KEY (user_id, mailbox_item_id)
)

Attachment Table

attachment (
  attachment_id VARCHAR PRIMARY KEY,
  message_id VARCHAR,
  file_name VARCHAR,
  content_type VARCHAR,
  size_bytes BIGINT,
  storage_location TEXT,
  checksum VARCHAR
)

Delivery Event Table

email_delivery_event (
  event_id VARCHAR PRIMARY KEY,
  message_id VARCHAR,
  recipient_address VARCHAR,
  event_type VARCHAR, -- sent, delivered, bounced, complained, opened, clicked
  created_at TIMESTAMP,
  metadata JSON
)

👉 Interview Answer

I would store message content, recipients, mailbox metadata, attachments, and delivery events separately.

This allows one physical message to appear in multiple user mailboxes with different read states, labels, and folders.


6️⃣ High-Level Architecture


Client
→ Email API
→ Message Service
→ Attachment Storage
→ Send Queue
→ Mail Delivery Workers
→ SMTP / Email Provider
→ Bounce / Complaint Handler

Incoming Mail Server
→ Inbound Processor
→ Spam Filter
→ Mailbox Service
→ Search Index

Main Components

Email API


Message Service


Send Queue


Delivery Workers


Inbound Processor


Search Index


👉 Interview Answer

I would split the system into outbound and inbound pipelines.

Outbound email goes through message storage, send queue, delivery workers, and SMTP or provider APIs.

Inbound email goes through receiving servers, spam filtering, mailbox placement, and search indexing.


7️⃣ Send Email Flow


Basic Flow

User clicks send
→ Email API validates request
→ Store message and attachments
→ Create sent mailbox item
→ Enqueue send job
→ Delivery worker sends via SMTP/provider
→ Update delivery status
→ Record delivery events

Why Async?

SMTP delivery can:

Be slow
Fail temporarily and need retries
Fail permanently and bounce

👉 Interview Answer

Sending email should be asynchronous.

Once the message is durably stored and queued, the API can return success to the user.

Delivery workers then send the email, retry failures, and update delivery status.
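
The store-then-enqueue step can be sketched as follows. This is a minimal illustration, not a production design: a dict and `queue.Queue` stand in for the durable message store and send queue, and `EmailService`, `send`, and `run_worker_once` are hypothetical names.

```python
import queue
import uuid
from dataclasses import dataclass, field


@dataclass
class EmailService:
    # In-memory stand-ins; a real system would use a database and a durable queue.
    messages: dict = field(default_factory=dict)
    send_queue: queue.Queue = field(default_factory=queue.Queue)

    def send(self, request: dict) -> dict:
        """Validate, store, enqueue, and return without waiting for SMTP."""
        if not request.get("to"):
            raise ValueError("at least one recipient is required")
        message_id = str(uuid.uuid4())
        self.messages[message_id] = {**request, "status": "queued"}
        self.send_queue.put(message_id)
        return {"messageId": message_id, "status": "queued"}

    def run_worker_once(self, deliver) -> None:
        """Take one job off the queue and record the delivery outcome.

        `deliver` is a caller-supplied function that returns True on success.
        """
        message_id = self.send_queue.get_nowait()
        message = self.messages[message_id]
        message["status"] = "sent" if deliver(message) else "failed"
```

The API response says "queued", not "delivered"; the client learns the final outcome later from the delivery status.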


8️⃣ SMTP Delivery and Retry


Delivery Flow

Delivery worker
→ Resolve recipient domain MX record
→ Connect to recipient mail server
→ Send message via SMTP
→ Receive response
→ Mark sent / retry / bounced

Failure Types

Temporary Failure

Examples:

Mailbox temporarily unavailable
Server busy
Rate limited
Network timeout

Action:

retry with backoff

Permanent Failure

Examples:

Invalid recipient
Domain does not exist
Mailbox does not exist

Action:

mark bounced

👉 Interview Answer

SMTP delivery can fail temporarily or permanently.

Temporary failures should be retried with exponential backoff.

Permanent failures should mark the recipient as bounced and stop retrying.
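
A worker can make the retry decision from the SMTP reply code: 2xx replies mean success, 4xx replies are transient failures, and 5xx replies are permanent failures. The sketch below shows that classification with capped exponential backoff. The base delay, cap, and attempt limit are assumed values, and the function names are hypothetical.

```python
import random


def classify_smtp_reply(code: int) -> str:
    """Map an SMTP reply code to sent / retry / bounced."""
    if 200 <= code < 300:
        return "sent"
    if 400 <= code < 500:
        return "retry"      # transient failure
    if 500 <= code < 600:
        return "bounced"    # permanent failure
    raise ValueError(f"unexpected SMTP reply code: {code}")


def backoff_delay(attempt: int, base: float = 60.0, cap: float = 6 * 3600.0,
                  jitter: bool = True) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    delay = min(cap, base * (2 ** attempt))
    # Full jitter spreads retries out so workers do not hit a domain in lockstep.
    return random.uniform(0, delay) if jitter else delay


def next_action(code: int, attempt: int, max_attempts: int = 8):
    """Return (outcome, delay_seconds); delay is None unless retrying."""
    outcome = classify_smtp_reply(code)
    if outcome == "retry":
        if attempt + 1 >= max_attempts:
            return ("bounced", None)  # give up after too many transient failures
        return ("retry", backoff_delay(attempt))
    return (outcome, None)
```

Network timeouts carry no reply code; a worker would normally treat them like a 4xx reply and retry.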


9️⃣ Bounce and Complaint Handling


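Bounce Types

Hard bounce: a permanent failure, such as an invalid recipient or a domain that does not exist
Soft bounce: a temporary failure, such as a busy server or a rate limit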

Complaint

A recipient marks email as spam.


Handling Flow

Provider sends bounce/complaint event
→ Verify event
→ Store delivery event
→ Update recipient delivery status
→ Suppress future sends if needed
→ Update sender reputation

👉 Interview Answer

Bounce and complaint handling are critical for deliverability.

Hard bounces and spam complaints should update suppression lists so the system avoids repeatedly sending to bad or risky addresses.
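
A suppression list can be sketched as a small lookup that delivery workers consult before sending. This is an illustration only: the event shape loosely follows the `email_delivery_event` table, the `bounce_type` field is an assumed addition, and the class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SuppressionList:
    # address -> reason it was suppressed
    reasons: dict = field(default_factory=dict)

    @staticmethod
    def _normalize(address: str) -> str:
        return address.strip().lower()

    def handle_event(self, event: dict) -> None:
        """Suppress on spam complaints and hard bounces; ignore soft bounces."""
        kind = event["event_type"]
        hard_bounce = kind == "bounced" and event.get("bounce_type") == "hard"
        if kind == "complained" or hard_bounce:
            self.reasons.setdefault(self._normalize(event["recipient_address"]), kind)

    def filter_recipients(self, recipients: list) -> list:
        """Drop suppressed addresses before a send job is enqueued."""
        return [r for r in recipients if self._normalize(r) not in self.reasons]
```

Checking the list at enqueue time, rather than only at delivery time, keeps suppressed addresses out of the queue.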


🔟 Inbound Email Flow


Flow

External sender sends email
→ Our MX server receives message
→ Validate domain and recipient
→ Run spam / virus scanning
→ Store message
→ Create inbox mailbox item
→ Index message for search
→ Notify user

Important Checks

Recipient exists in our domain
SPF / DKIM / DMARC validation
Spam scoring
Virus scanning

👉 Interview Answer

For inbound email, the system receives messages through MX servers, validates the recipient, runs spam and virus checks, stores the message, creates mailbox entries, indexes the message, and notifies the user.


1️⃣1️⃣ Mailbox and Folder Management


Mailbox Operations

Move a message between folders
Add or remove labels
Mark read or unread
Star or unstar

Design

Mailbox state is per user.

Example:

same message_id
→ Alice: folder=sent
→ Bob: folder=inbox, read=false

👉 Interview Answer

Mailbox state should be separate from message content.

The same email message may appear in multiple mailboxes, but each user has their own folder, labels, read state, and starred state.
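
The split between shared content and per-user state can be sketched like this. It is a simplified illustration: it keys items by `(user_id, message_id)` rather than by a separate `mailbox_item_id`, and `MailboxItem` and `MailboxStore` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class MailboxItem:
    user_id: str
    message_id: str
    folder: str
    read: bool = False
    starred: bool = False
    labels: set = field(default_factory=set)


class MailboxStore:
    """Per-user view of messages; the message body lives elsewhere."""

    def __init__(self):
        self._items = {}  # (user_id, message_id) -> MailboxItem

    def add(self, user_id: str, message_id: str, folder: str, read: bool = False) -> MailboxItem:
        item = MailboxItem(user_id, message_id, folder, read=read)
        self._items[(user_id, message_id)] = item
        return item

    def get(self, user_id: str, message_id: str) -> MailboxItem:
        return self._items[(user_id, message_id)]

    def mark_read(self, user_id: str, message_id: str) -> None:
        self.get(user_id, message_id).read = True

    def move(self, user_id: str, message_id: str, folder: str) -> None:
        self.get(user_id, message_id).folder = folder
```

Because each user has a separate item, Bob archiving or reading a message never changes what Alice sees in her Sent folder.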


1️⃣2️⃣ Threading / Conversation View


Threading Signals

Message-ID
In-Reply-To
References
Normalized subject (fallback)

Thread Table

email_thread (
  thread_id VARCHAR PRIMARY KEY,
  normalized_subject VARCHAR,
  latest_message_at TIMESTAMP,
  participant_addresses ARRAY
)

Why Threading Matters


👉 Interview Answer

Threading groups related emails into conversations.

I would use email headers such as Message-ID, In-Reply-To, and References, with subject normalization as fallback.
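
A minimal sketch of that lookup order follows. It makes one assumption the text above does not state: the subject fallback is applied only when the subject carries a reply or forward prefix, so that unrelated messages with the same subject do not merge. The `Threader` class and the thread ID format are hypothetical.

```python
import itertools
import re

# Leading "Re:", "Fw:", "Fwd:" prefixes, possibly repeated.
_PREFIX = re.compile(r"^\s*((re|fwd?)\s*:\s*)+", re.IGNORECASE)


def normalize_subject(subject: str) -> str:
    return _PREFIX.sub("", subject).strip().lower()


class Threader:
    def __init__(self):
        self.by_message_id = {}   # Message-ID -> thread_id
        self.by_subject = {}      # normalized subject -> thread_id
        self._counter = itertools.count(1)

    def assign(self, headers: dict) -> str:
        parents = headers.get("References", "").split()
        if headers.get("In-Reply-To"):
            parents.append(headers["In-Reply-To"])
        # Prefer the closest known ancestor named in the headers.
        thread_id = next(
            (self.by_message_id[p] for p in reversed(parents) if p in self.by_message_id),
            None,
        )
        raw = headers.get("Subject", "")
        subject = normalize_subject(raw)
        looks_like_reply = subject != raw.strip().lower()
        if thread_id is None and looks_like_reply:
            thread_id = self.by_subject.get(subject)
        if thread_id is None:
            thread_id = f"thread-{next(self._counter)}"
        self.by_message_id[headers["Message-ID"]] = thread_id
        self.by_subject.setdefault(subject, thread_id)
        return thread_id
```

A real system would scope these lookups per user, which this sketch omits.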


1️⃣3️⃣ Search System


Search Fields

Subject
Body
Sender
Recipients
Date
Labels
Attachment metadata

Search Architecture

Message stored
→ Indexing event emitted
→ Search indexer parses content
→ Search engine indexes fields
→ User queries search service

Search Store

Use:

Elasticsearch / OpenSearch / custom inverted index

👉 Interview Answer

Email search should use an inverted index.

The message storage system emits indexing events, and a search indexer indexes subject, body, sender, recipients, date, labels, and attachment metadata.
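
To make the idea concrete, here is a toy inverted index scoped per user. It is a sketch under assumptions, not how Elasticsearch or OpenSearch work internally: the tokenizer is a simple ASCII regex, queries are AND-only, and `MailSearchIndex` is a hypothetical name.

```python
import re
from collections import defaultdict

_TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text: str) -> list:
    return _TOKEN.findall(text.lower())


class MailSearchIndex:
    def __init__(self):
        # (user_id, term) -> set of message_ids containing that term
        self.postings = defaultdict(set)

    def index(self, user_id: str, message_id: str, fields: dict) -> None:
        """Index every text field (subject, body, sender, ...) of one message."""
        for value in fields.values():
            for term in tokenize(value):
                self.postings[(user_id, term)].add(message_id)

    def search(self, user_id: str, query: str) -> set:
        """Return the messages in this user's mailbox that contain every query term."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = None
        for term in terms:
            ids = self.postings.get((user_id, term), set())
            result = set(ids) if result is None else result & ids
        return result
```

Keying postings by user keeps one user's query from ever touching another user's mail, which matters for access control as much as for speed.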


1️⃣4️⃣ Attachments


Storage

Attachments should be stored in object storage.

attachment_id → object storage path

Flow

Upload attachment
→ Virus scan
→ Store in object storage
→ Attach reference to email

Requirements

Virus scanning before storage
Size limits
Checksum for integrity

👉 Interview Answer

Attachments should not be stored directly in the message database.

I would store them in object storage, scan for viruses, enforce size limits, and reference them from message metadata.
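
The upload step can be sketched as a function that validates the file and produces a row for the `attachment` table. The 25 MB limit, the content-addressed path layout, and the `scan` callback are assumptions for illustration; the sketch does not call a real virus scanner or object store.

```python
import hashlib

MAX_ATTACHMENT_BYTES = 25 * 1024 * 1024  # assumed limit, not from the source


def prepare_attachment(file_name: str, content_type: str, data: bytes,
                       scan=lambda data: True,
                       max_bytes: int = MAX_ATTACHMENT_BYTES) -> dict:
    """Enforce the size limit, run the scan hook, and build attachment metadata."""
    if len(data) > max_bytes:
        raise ValueError("attachment exceeds size limit")
    if not scan(data):
        raise ValueError("attachment failed virus scan")
    checksum = hashlib.sha256(data).hexdigest()
    return {
        "attachment_id": f"att_{checksum[:16]}",
        "file_name": file_name,
        "content_type": content_type,
        "size_bytes": len(data),
        # Content-addressed key: identical files map to the same object.
        "storage_location": f"attachments/{checksum[:2]}/{checksum}",
        "checksum": checksum,
    }
```

Only this metadata goes into the database; the bytes themselves would be written to object storage under `storage_location`.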


1️⃣5️⃣ Templates and Notification Emails


Use Cases


Template Data

{
  "templateId": "order_receipt",
  "variables": {
    "userName": "Alice",
    "orderId": "o123"
  }
}

Flow

Service requests template email
→ Template service renders content
→ Personalization applied
→ Email queued
→ Delivery pipeline sends email

👉 Interview Answer

For transactional and notification emails, I would use a template service.

Business services send template ID and variables, and the email system renders the message, queues it, and sends it through the normal delivery pipeline.
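
Rendering can be sketched with Python's standard `string.Template`. The template text and the `TEMPLATES` registry below are made up for illustration; only the request shape comes from the Template Data example above.

```python
from string import Template

# Hypothetical template registry; a real service would load these from storage.
TEMPLATES = {
    "order_receipt": {
        "subject": Template("Your order $orderId"),
        "body": Template("Hi $userName, thanks for order $orderId."),
    }
}


def render_email(request: dict) -> dict:
    """Render subject and body for a template request.

    substitute() raises KeyError if a variable is missing, so a bad
    request fails before anything is queued.
    """
    template = TEMPLATES[request["templateId"]]
    variables = request["variables"]
    return {
        "subject": template["subject"].substitute(variables),
        "body": template["body"].substitute(variables),
    }
```

Failing at render time is deliberate: sending a receipt with a blank order ID is worse than rejecting the request.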


1️⃣6️⃣ Bulk Email and Rate Limiting


Bulk Email Challenges


Strategies

Batching
Domain-based throttling
Suppression lists
Unsubscribe enforcement
Campaign pacing

👉 Interview Answer

Bulk email must be carefully controlled.

I would use batching, domain-based throttling, suppression lists, unsubscribe enforcement, and campaign pacing to protect sender reputation.
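
Domain-based throttling can be sketched as one token bucket per recipient domain. The rate and burst values are placeholders, `DomainThrottle` is a hypothetical name, and the caller passes the current time so the example stays deterministic.

```python
from collections import defaultdict


class DomainThrottle:
    """Token bucket per recipient domain."""

    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.burst = burst
        self.tokens = defaultdict(lambda: float(burst))  # each domain starts full
        self.updated = {}                                # domain -> last refill time

    def allow(self, recipient: str, now: float) -> bool:
        domain = recipient.rsplit("@", 1)[-1].lower()
        last = self.updated.get(domain, now)
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens[domain] = min(self.burst, self.tokens[domain] + (now - last) * self.rate)
        self.updated[domain] = now
        if self.tokens[domain] >= 1:
            self.tokens[domain] -= 1
            return True
        return False
```

When `allow` returns False, the worker would put the job back on the queue with a delay rather than drop it.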


1️⃣7️⃣ Spam, Abuse, and Reputation


Spam Signals


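Abuse Prevention

Rate limits
Sender reputation
Domain verification
SPF / DKIM / DMARC
Complaint monitoring
Spam scoring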

👉 Interview Answer

Email systems must protect against spam and abuse.

I would use rate limits, sender reputation, domain verification, SPF/DKIM/DMARC, complaint monitoring, and spam scoring to protect deliverability.


1️⃣8️⃣ Tracking


Tracking Events

Sent
Delivered
Bounced
Complained
Opened
Clicked

How Tracking Works

Open Tracking

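A tiny tracking pixel (a 1x1 image) is embedded in the HTML body. When the recipient's client loads the image, the server records an open.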


Click Tracking

Links are rewritten to pass through a tracking URL, which records the click and then redirects to the original destination.


Privacy Concern

Tracking should respect user privacy and consent.


👉 Interview Answer

Email tracking can record delivery, opens, clicks, bounces, and complaints.

However, open and click tracking have privacy implications, so tracking should respect consent, user settings, and legal requirements.
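
Click tracking with a consent switch can be sketched with the standard `urllib.parse` module. The tracking host, the query parameter names, and both function names are hypothetical. A real redirect endpoint should also sign the parameters, because an unsigned `u` parameter would act as an open redirect.

```python
from urllib.parse import parse_qs, urlencode, urlparse

TRACKING_BASE = "https://track.example.com/c"  # hypothetical tracking endpoint


def tracking_url(message_id: str, recipient: str, target_url: str,
                 tracking_enabled: bool = True) -> str:
    """Rewrite a link to go through the tracker, unless tracking is disabled."""
    if not tracking_enabled:
        return target_url  # respect consent: leave the link untouched
    query = urlencode({"m": message_id, "r": recipient, "u": target_url})
    return f"{TRACKING_BASE}?{query}"


def resolve_click(url: str):
    """Return (delivery_event, redirect_target) for an incoming tracked click."""
    params = parse_qs(urlparse(url).query)
    event = {
        "event_type": "clicked",
        "message_id": params["m"][0],
        "recipient_address": params["r"][0],
    }
    return event, params["u"][0]
```

The returned event has the same shape as a row in `email_delivery_event`, so clicks can flow through the same pipeline as bounces and complaints.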


1️⃣9️⃣ Consistency Model


Stronger Consistency Needed For

Message storage
Access control
Unsubscribe enforcement
Suppression lists

Eventual Consistency Acceptable For

Search indexing
Analytics
Delivery tracking
Unread counts

👉 Interview Answer

Email systems use mixed consistency.

Message storage, access control, unsubscribe enforcement, and suppression lists require stronger correctness.

Search indexing, analytics, delivery tracking, and unread counts can be eventually consistent.


2️⃣0️⃣ Scaling Patterns


Pattern 1: Queue-based Sending

Decouple user request from SMTP delivery.


Pattern 2: Separate Message Store and Mailbox Metadata

Avoid duplicating message content.


Pattern 3: Object Storage for Large Bodies / Attachments

Reduce database pressure.


Pattern 4: Search Index for Query

Do not scan mailbox tables for full-text search.


Pattern 5: Domain-based Delivery Throttling

Protect sender reputation.


👉 Interview Answer

To scale an email system, I would use queue-based sending, separate message content from mailbox metadata, store attachments in object storage, index messages for search, and throttle delivery by domain and sender reputation.


2️⃣1️⃣ Failure Handling


Common Failures


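Strategies

Store messages durably before sending
Retry temporary failures with backoff
Stop retrying on permanent failures
Record delivery events for auditing and reconciliation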

👉 Interview Answer

Email delivery should assume partial failures.

Messages should be durably stored before sending.

Delivery jobs should retry temporary failures, stop on permanent failures, and record delivery events for auditing and reconciliation.


2️⃣2️⃣ Observability


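Key Metrics

Send queue depth
Delivery success rate
Bounce rate
Complaint rate
SMTP latency
Provider errors
Spam classification
Search indexing lag
Attachment scan failures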

👉 Interview Answer

I would monitor send queue depth, delivery success rate, bounce rate, complaint rate, SMTP latency, provider errors, spam classification, search indexing lag, and attachment scan failures.

These metrics directly affect reliability and deliverability.


2️⃣3️⃣ End-to-End Flow


Outbound Send Flow

User sends email
→ Validate request
→ Store message and attachments
→ Create sent mailbox item
→ Enqueue send job
→ Delivery worker sends via SMTP/provider
→ Update recipient delivery status
→ Record delivery event

Inbound Receive Flow

External sender sends email
→ MX server receives message
→ Validate recipient
→ Spam and virus scan
→ Store message
→ Create inbox mailbox item
→ Index message
→ Notify user

Bounce Flow

Recipient server rejects email
→ Bounce event received
→ Update recipient status
→ Store delivery event
→ Add to suppression list if hard bounce
→ Update sender reputation

Key Insight

An email system is not just message sending: it is a durable messaging, delivery, search, spam-control, and reputation system.


🧠 Staff-Level Answer (Final)


👉 Interview Answer (Full Version)

When designing an email system, I think of it as a durable messaging and delivery platform.

The system must support composing, sending, receiving, storing, searching, and organizing email messages.

I would separate message content from mailbox metadata. Message content and attachments are stored durably, while mailbox items store user-specific state such as folder, labels, read status, and thread ID.

For outbound email, the API stores the message first, then enqueues a send job. Delivery workers send the message through SMTP or an email provider, retry temporary failures, and mark permanent failures as bounces.

For inbound email, MX servers receive messages, validate recipients, run spam and virus checks, store messages, create inbox entries, index messages, and notify users.

Attachments should be stored in object storage, scanned for viruses, and referenced from message metadata.

Search should use an inverted index, because full-text search over mailbox tables does not scale.

Deliverability is a major concern. I would use rate limiting, suppression lists, bounce handling, complaint handling, sender reputation, and SPF/DKIM/DMARC validation.

Email systems require mixed consistency. Message storage, access control, unsubscribe enforcement, and suppression lists need stronger correctness. Search, analytics, delivery status, and unread counts can be eventually consistent.

The main trade-offs are delivery latency, reliability, storage cost, search freshness, spam prevention, and sender reputation.

Ultimately, the goal is to reliably store and deliver messages, protect users from spam and abuse, and provide fast mailbox access and search.


⭐ Final Insight

The core of an email system is not simply sending mail. It is a large-scale messaging system that combines durable message storage, SMTP delivery, mailbox indexing, spam control, and sender reputation.



