Alex Xu Notes

Here's a concise summary of Chapter 1: Scale from Zero to Millions of Users from System Design Interview – An Insider's Guide by Alex Xu:

Chapter 1: Scale from Zero to Millions of Users – The Evolution of System Design

This chapter introduces the idea that system design evolves as the number of users grows. What works for
1,000 users may not work for a million. The chapter walks through five stages of system growth:

1. Single Server Setup (0 – 1,000 users)

• All components (web server, database, business logic) are hosted on one machine.
• Simple to build and deploy.
• Bottlenecks: limited scalability and availability.

2. Application Layer Separation (1,000 – 10,000 users)

• Separate web server, application server, and database server.


• Enables independent scaling and better organization.
• Still relatively simple architecture

3. Database Optimization (10,000 – 100,000 users)

• Introduce replication (read replicas), caching (Redis, Memcached), and connection pooling.
• Improve performance and reduce database load.
• Focus on optimizing slow queries and indexing.

4. Horizontal Scaling (100,000 – 1 million users)

• Add load balancers and multiple application servers.


• Introduce CDNs for static content delivery.
• Focus on statelessness in services and auto-scaling.

5. Distributed Systems (1 million+ users)

• Systems are now complex and highly distributed.


• Use message queues, distributed storage, microservices, and asynchronous processing.
• High availability, fault tolerance, and disaster recovery become critical.
Key Takeaway:

Design should evolve gradually with growth. Premature optimization can overcomplicate the system. Focus
on solving the right problems at the right scale.

Here's a summary of Reliability, a core system design concept:

Reliability

Reliability refers to the system’s ability to work correctly and consistently over time. It’s crucial for user
trust and business continuity.

1. What is Reliability?

• A reliable system performs the correct functions as expected.


• It includes:
o Availability: System is up and accessible.
o Durability: Data is safely stored.
o Consistency: System returns correct responses.

2. Types of Failures

• Hardware Failures: Disk crash, power outage.


• Software Bugs: Logic errors, memory leaks.
• Human Errors: Misconfiguration, bad deployments.
• Overload: Traffic spikes causing service degradation.

3. Techniques to Improve Reliability


Technique Purpose
Redundancy Duplicate components (e.g. servers, DBs) to prevent single points of failure.
Replication Copy data across nodes for fault tolerance and availability.
Failover Automatically switch to standby system if one fails.
Load Balancing Distribute traffic evenly to avoid overloading any one server.
Health Checks Monitor system components and remove unhealthy instances.
Rate Limiting Prevent abuse and protect backend during traffic spikes.
Timeouts & Retries Avoid hanging on failed calls; retry with exponential backoff.
Monitoring & Alerting Detect and respond to incidents quickly.
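
To make the "Timeouts & Retries" row concrete, here is a minimal Python sketch of calling a dependency with a timeout and retrying with exponential backoff plus jitter. The endpoint URL, retry limits, and delays are illustrative assumptions, not values from the book.

```python
import time
import random
import urllib.request

def call_with_retries(url, max_attempts=4, base_delay=0.5, timeout=2.0):
    """Call a dependency with a timeout; on failure, retry with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.read()
        except Exception:
            if attempt == max_attempts - 1:
                raise                                   # give up after the last attempt
            # Backoff doubles each attempt (0.5s, 1s, 2s, ...) with a little jitter
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))

# Usage (hypothetical internal endpoint):
# payload = call_with_retries("https://internal.example.com/health")
```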

4. Tradeoffs

• High reliability often increases complexity and cost.


• There’s a balance between performance, cost, and reliability.
• Prioritize reliability for critical systems (e.g. banking, healthcare).

Key Takeaway:

Design with failure in mind. Build systems assuming something will go wrong, and ensure graceful
degradation or recovery when it does.

Here’s a summary of Chapter 3: A Framework for System Design Interviews from System Design Interview
– An Insider’s Guide by Alex Xu:

Chapter 3: A Framework for System Design Interviews

This chapter presents a step-by-step framework to tackle system design interviews in a structured,
confident, and comprehensive way.

🔶 Why a Framework?

• System design problems are open-ended.


• Without structure, candidates may miss important parts.
• A framework helps organize thoughts, communicate clearly, and cover all aspects.
🧠 Pro Tips:

• Think out loud so the interviewer can follow your reasoning.


• Always discuss tradeoffs (e.g. SQL vs NoSQL, cache invalidation strategies).
• Use diagrams to illustrate your design clearly.

Key Takeaway:

Use a clear framework to handle design interviews like a pro — starting from understanding the problem to
proposing a scalable, maintainable solution.

Certainly! Here’s a clear and concise summary of Chapter 2: Back-of-the-Envelope Estimation from
System Design Interview – An Insider's Guide by Alex Xu:

Chapter 2: Back-of-the-Envelope Estimation

🔍 Purpose of Back-of-the-Envelope Estimation

• To quickly estimate scale, load, and storage needs of a system.


• Helps determine feasibility and drive design choices.
• Interviewers look for your ability to reason with numbers and make realistic assumptions.
📌 What to Estimate

This chapter teaches you to estimate key system metrics:

Metric Description
Users Daily Active Users (DAU), Monthly Active Users (MAU)
QPS (Queries Per Second) How much traffic your system handles per second
Storage Needs How much data your system stores over time
Bandwidth How much data is transferred between components
Read/Write Ratios Determines database and cache optimization needs

🧠 How to Estimate – General Approach

1. Define a scenario

e.g., "Design Instagram: 100 million MAU"

2. Make reasonable assumptions


a. DAU = 10% of MAU
b. Each user uploads 1 photo/day
c. Each photo = 2 MB
d. Each request = 1 KB metadata
3. Calculate metrics
a. Daily Uploads: 10M photos
b. Daily Storage: 10M * 2MB = 20 TB/day
c. QPS: Total requests per second = (DAU × actions/user) ÷ seconds/day
4. Use powers of 10 for simplicity

Example: 1 day ≈ 100,000 seconds
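
As a quick sanity check, the same arithmetic can be scripted. This is a minimal sketch of the Instagram-style scenario above; the constants are the assumptions just listed (10% DAU, 1 photo/user/day, 2 MB/photo, 1 day ≈ 100,000 s), not authoritative figures.

```python
MAU = 100_000_000             # 100 million monthly active users
DAU = int(MAU * 0.10)         # assumption: DAU = 10% of MAU
PHOTOS_PER_USER_PER_DAY = 1
PHOTO_SIZE_MB = 2
SECONDS_PER_DAY = 100_000     # rounded from 86,400 for easy math

daily_uploads = DAU * PHOTOS_PER_USER_PER_DAY            # 10M photos/day
daily_storage_tb = daily_uploads * PHOTO_SIZE_MB / 1e6   # MB -> TB
upload_qps = daily_uploads / SECONDS_PER_DAY             # write requests/sec

print(f"Daily uploads: {daily_uploads:,}")               # 10,000,000
print(f"Daily storage: {daily_storage_tb:.0f} TB/day")   # 20 TB/day
print(f"Upload QPS:    {upload_qps:,.0f}")               # 100
```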

⚖ Why It Matters in System Design

• Helps identify performance bottlenecks


• Informs decisions like:
o Should we use caching?
o How should we partition data?
o Do we need a CDN?
• Proves you're thinking practically and at scale

✅ Key Takeaways

• Don’t aim for perfect precision—estimate within a reasonable range


• Be explicit with your assumptions
• Break down large problems into manageable parts
• Practice estimating for common systems like:
o Twitter feeds
o YouTube views
o WhatsApp messages
o Amazon orders

Great! Here's a Back-of-the-Envelope Estimation Cheat Sheet and Worksheet Template you can use to
practice and structure your answers during system design interviews:

✅ Cheat Sheet: Quick Reference for Estimations

Metric Approximate Value


1 Day ~100,000 seconds
1 Month ~30 days
1 Year ~365 days
1 KB (text) ~1 short message (e.g., tweet)
1 MB (image) ~1 medium-quality photo
1 GB ~500 high-res photos
1 User Action ~1 KB (request size)
Active Users (DAU) ~10% of Monthly Active Users (MAU)
Photos per User/day ~1 (for Instagram-like apps)
Videos per User/day ~0.1 (for YouTube-like apps)
Avg. Size of Video ~10–100 MB
Network Bandwidth Requests/sec × Data/request

🧾 Worksheet Template: Estimation Step-by-Step

Use this template during interviews or practice sessions.

🔹 1. Define the Problem

Example: Design a photo-sharing app like Instagram for 100M MAU

🔹 2. Estimate Active Users

• MAU = 100 million


• DAU = 10% of MAU = 10 million

🔹 3. Estimate User Actions

Assume:
• 1 photo upload per user per day
• 1 photo = 2 MB
• 10 views per photo (read-heavy app)

🔹 4. Estimate Storage

• Daily Storage = DAU × photo size


• 10M × 2 MB = 20 TB/day
• Monthly Storage = 20 TB/day × 30 = 600 TB/month

🔹 5. Estimate QPS (Queries Per Second)

Assume:

• 10 views per photo (from above) → 10M photos × 10 = 100M views/day
• Requests per second = total requests / seconds in a day

→ (10M uploads + 100M views) / 100,000

→ ~1,100 QPS

🔹 6. Estimate Bandwidth

Assume:

• Each view transfers 500 KB

→ 100M × 500 KB = 50 TB/day

🔹 7. Read/Write Ratio

• Writes = 10M (uploads)


• Reads = 100M (views)
• Read:Write = 10:1 → Optimize for reads (e.g., CDN, cache)

🧠 Final Thoughts

• Add caching for views (Redis, CDN)


• Use object storage (e.g., S3) for media
• Store metadata in a database

🔹 A. Twitter (X)

Estimate scale for a real-time microblogging platform:


• Millions of tweets/day
• Read-heavy, fast response

🔹 B. YouTube

Estimate scale for a video platform:

• Large storage for video content


• Heavy reads and bandwidth usage

👉 Below, both are worked step-by-step, from defining the problem to calculating QPS, storage, and bandwidth.

Let's start with a Back-of-the-Envelope Estimation for a system like Twitter (X) using the framework.

🐦 System Design: Twitter-like Platform

🔹 1. Define the Problem

Design a Twitter-like platform that supports:

• Tweeting
• Reading tweets
• User timelines

Assume: 200 million Monthly Active Users (MAU)

🔹 2. Estimate Active Users (DAU)

• DAU ≈ 10% of MAU → 20 million DAU

🔹 3. Estimate User Actions

Assume per user per day:

• 5 tweets posted
• 50 tweets read

So per day:

• Tweets written = 20M × 5 = 100 million


• Tweets read = 20M × 50 = 1 billion

🔹 4. Estimate Storage

Assume:
• Each tweet = 280 characters ≈ 0.5 KB
• Each tweet is stored with metadata = ~1 KB total

Daily Tweet Storage:

→ 100M × 1 KB = 100 GB/day

Monthly Tweet Storage:

→ 100 GB/day × 30 = ~3 TB/month

(Excludes media, which is stored separately.)

🔹 5. Estimate QPS (Queries per Second)

• Total requests/day = Writes (tweets) + Reads (reads of tweets)

→ 100M + 1B = 1.1B requests/day

• 1 day ≈ 100,000 seconds

→ QPS ≈ 1.1B / 100,000 = ~11,000 QPS

🔹 6. Estimate Bandwidth

• Tweets read = 1B/day × 1 KB = ~1 TB/day


• Tweets posted = 100M/day × 1 KB = ~100 GB/day

Total Bandwidth: ~1.1 TB/day

(Assuming no images/videos; media would increase this significantly.)

🔹 7. Read/Write Ratio

• Reads = 1B
• Writes = 100M
• Read:Write = 10:1

🔁 Implication: Must optimize for reads → use caching (Redis, Memcached), fan-out strategies, timeline
pre-computation.
Excellent! Let's now walk through a Back-of-the-Envelope Estimation for a YouTube-like platform—a
large-scale, video-streaming service.

📺 System Design: YouTube-like Platform

🔹 1. Define the Problem

Design a platform that allows:

• Users to upload videos


• Others to stream/watch videos

Assume: 2 billion Monthly Active Users (MAU)

🔹 2. Estimate Active Users (DAU)

• DAU ≈ 10% of MAU

→ 200 million DAU

🔹 3. Estimate User Actions

Assume per user per day:

• 0.1 video uploads


• 10 video views

So daily:
• Videos uploaded = 200M × 0.1 = 20 million
• Videos watched = 200M × 10 = 2 billion

🔹 4. Estimate Storage

Assume:

• Avg. video = 10 minutes


• Encoding = 5 MB/min → 50 MB per video

Daily Storage = 20M × 50 MB = 1 PB/day (petabyte = 1,000 TB)

Monthly Storage = 1 PB/day × 30 = ~30 PB/month

📦 Note: YouTube likely stores multiple resolutions (e.g. 480p, 720p, 1080p) → actual usage could be 3× to
5× more.

🔹 5. Estimate QPS (Queries per Second)

• Requests/day = video plays = 2B


• 1 day ≈ 100,000 seconds

QPS = 2B / 100,000 = 20,000 QPS

• Writes (uploads): 20M / 100,000 = ~200 QPS


• Reads (views): 2B / 100,000 = ~20,000 QPS

🔹 6. Estimate Bandwidth

Assume:

• Avg. video size streamed = 50 MB


• Daily views = 2B

Bandwidth/day = 2B × 50 MB = 100 PB/day

💡 A CDN is essential to serve this volume.

🔹 7. Read/Write Ratio

• Reads = 2B
• Writes = 20M
• Read:Write = 100:1

🚨 Very read-heavy. Caching, buffering, and CDN delivery are crucial.



Absolutely! Practicing more Back-of-the-Envelope Estimation examples will sharpen your system design
skills. Here are 5 practical examples across different domains, each designed to test different aspects of
system thinking:
🧮 Back-of-the-Envelope Estimation Examples

1. WhatsApp-like Messaging App

• MAU: 1.5 billion


• Assumptions:
o DAU = 150 million
o Each user sends 50 messages/day
o Message = ~1 KB
• Daily Messages: 150M × 50 = 7.5 billion
• Storage/day: ~7.5 TB
• QPS: (7.5B + 15B reads) / 100,000 ≈ 225,000 QPS
• Challenges:
o Low latency delivery
o End-to-end encryption
o High fan-out for group messages

2. Instagram-like Photo App

• MAU: 800 million


• Assumptions:
o DAU = 80 million
o Each uploads 2 photos/day
o Each photo = 2 MB
• Daily Uploads: 160M photos
• Storage/day: 320 TB
• Reads: Assume 20 views/photo → 3.2B reads
• Read:Write ≈ 20:1
• Bandwidth/day: ~6 PB

3. Netflix-like Streaming App

• MAU: 300 million


• DAU: 30 million
• Avg. view time: 1 hour/user/day
• Bitrate: 3 Mbps = 1.35 GB/hour
• Daily Bandwidth: 30M × 1.35 GB = 40.5 PB/day
• Peak QPS (Concurrent viewers): ~10 million
• Key Design:
o Global CDN
o Video compression/transcoding
o Smooth adaptive bitrate switching

4. Amazon-like E-commerce Platform

• MAU: 500 million


• DAU: 50 million
• Actions/day: 20 page views + 1 purchase
• Total requests/day: 1B (views) + 50M (orders)
• QPS: ~10,500
• Storage/day: Orders (~1 KB/order) = 50 GB
• Challenges:
o Inventory updates
o Real-time recommendations
o Secure payment processing

5. Google Maps-like Navigation System

• MAU: 1 billion
• DAU: 100 million
• Active requests/hour: 10M users requesting route every 30 mins
• QPS: ~140K for route calculation
• Data size: Each map tile = ~20–100 KB
• Key Systems:
o Real-time traffic ingestion
o Fast geo queries
o Location-based caching



Next, let's dive deep into the Back-of-the-Envelope Estimation for designing a WhatsApp-like messaging platform.

📱 WhatsApp-like Messaging System – Estimation & Design Breakdown

🔹 1. Define the Problem

Design a real-time messaging system like WhatsApp, supporting:

• 1:1 and group messaging


• Media sharing (images, videos, voice notes)
• Message delivery & read receipts
• End-to-end encryption

Assume:

• 1.5 billion Monthly Active Users (MAU)

🔹 2. Estimate Daily Active Users (DAU)

• DAU ≈ 10%–20% of MAU


• We'll take 10% as a conservative estimate →
• DAU = 150 million

🔹 3. Estimate User Actions

Assume per user per day:

• Messages sent: 50
• Photos sent: 5
• Voice/video notes sent: 1
• Each message = ~1 KB
• Each media = ~1 MB (average)

Totals:

• Messages/day = 150M × 50 = 7.5 billion messages/day


• Media/day = 150M × (5+1) = 900 million media/day
🔹 4. Estimate Storage Requirements

Messages:

• 7.5B × 1 KB = ~7.5 TB/day (text messages only)

Media:

• 900M × 1 MB = ~900 TB/day

🔐 Only sender & recipient have access due to end-to-end encryption — metadata like timestamps, delivery
receipts might still be stored centrally.

🔹 5. Estimate QPS (Queries per Second)

• Messages: 7.5B/day = 7.5B / 100,000 ≈ 75,000 QPS


• Reads (delivered & seen): ~2× messages = 15B/day ≈ 150,000 QPS
• Total QPS: ~225,000

🔹 6. Estimate Bandwidth

Text:

• 7.5B × 1 KB = 7.5 TB/day

Media:

• 900M × 1 MB = 900 TB/day

Most bandwidth comes from media; needs aggressive compression and CDN delivery (even for encrypted
media, you can cache encrypted blobs).

🔹 7. Read-Write Ratio

• Writes: 7.5B
• Reads: ~15B (delivery, read receipts, sync across devices)
• Read:Write ≈ 2:1

This is bi-directional traffic; not as skewed as YouTube/Twitter.



🔒 Security Architecture

• Encryption: E2E using the Signal Protocol (per session key)


• Key Management: Device-held keys, session-based
• Media: Encrypted with a separate key before upload

Sure! Here’s a detailed summary of Chapter: Design a URL Shortener from System Design Interview – An
Insider’s Guide by Alex Xu.

🔗 Chapter: Design a URL Shortener (e.g., TinyURL or Bitly)

This chapter walks you through the complete system design of a URL shortener service, where users submit
a long URL and get a shortened version like short.ly/xyz123.

✅ 1. Functional Requirements

• Shorten a long URL to a short one (e.g., short.ly/abc123)


• Redirect users from short URL to the original long URL
• Expiration support (optional)
• Basic analytics (click count, etc. – optional)
⚙ 2. Non-Functional Requirements

• High availability – redirection must always work


• Low latency – fast redirection
• Scalability – handle billions of URLs and redirections
• Data durability – URLs must not be lost

🧩 3. High-Level Design

The system has two major operations:

1. Shorten URL:

Accept long URL → generate short code → store mapping

2. Redirect URL:

Accept short code → look up long URL → redirect
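
One common way to generate the short code (an assumption here; hash-then-truncate schemes are another option) is to base-62 encode a unique numeric ID produced by an ID generator. A minimal sketch:

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"  # 62 chars

def encode_base62(num: int) -> str:
    """Turn a unique numeric ID into a compact short code."""
    if num == 0:
        return ALPHABET[0]
    code = []
    while num > 0:
        num, rem = divmod(num, 62)
        code.append(ALPHABET[rem])
    return "".join(reversed(code))

def decode_base62(code: str) -> int:
    """Reverse mapping used on the redirection path before the DB lookup."""
    num = 0
    for ch in code:
        num = num * 62 + ALPHABET.index(ch)
    return num

# 62^7 ≈ 3.5 trillion codes, so 7 characters comfortably cover billions of URLs
print(encode_base62(2_000_000_000))        # a 6-character code
```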


6. Redirection Flow

1. User accesses short.ly/abc123


2. System queries database to fetch long URL
3. Redirects user using HTTP 302 (temporary) or 301 (permanent)

7. Capacity Estimation (Back-of-the-Envelope)

• 1B short URLs stored


• Average long URL: 500 bytes
• Total storage: 500 GB
• QPS: 10K–100K requests/sec
• Use caching (Redis) to speed up redirection

9. Additional Features (Optional)

• Custom short URLs (short.ly/myproduct)


• URL expiration and deletion
• Rate limiting and abuse detection
• QR code generation

10. Trade-Offs

• Consistency vs Availability: Choose based on user experience


• Short code uniqueness: Needs coordination (UUID or centralized counter)
• Analytics vs Performance: Analytics logging might be async

✅ Key Takeaways

• Keep it simple and scalable


• Choose ID generation wisely
• Optimize redirection path with caching
• Store mappings reliably, possibly with replication

Certainly! Here's a clear and structured explanation of the chapter “Design a Unique ID Generator in
Distributed Systems” from System Design Interview – An Insider’s Guide by Alex Xu.
🔐 Chapter: Design a Unique ID Generator in Distributed Systems

In distributed systems, generating globally unique, ordered, and scalable IDs is essential for things like:

• Primary keys in databases


• Order IDs, user IDs, tweet IDs
• Message IDs in queues, logs

This chapter explores how to design such a service efficiently and safely.

✅ 1. Requirements

📌 Functional

• Generate unique, non-colliding IDs


• High throughput and low latency
• Support distributed, concurrent requests

🛡 Non-Functional

• Scalable: Handle millions of ID requests/sec


• Reliable: No duplicated or missing IDs
• Ordered (optional): Time-sortable (like Twitter Snowflake)

🔢 3. ID Formats

Option A: UUID

• Randomly generated
• 128-bit, e.g. 550e8400-e29b-41d4-a716-446655440000
• ❌ Not sortable, relatively large
Option B: Database Auto-Increment

• Central sequence generator (e.g., MySQL, Redis INCR)


• ❌ Bottleneck under scale

✅ Option C: Snowflake-style ID

Most preferred in large-scale systems.

🔧 5. Components of a Snowflake System

1. Timestamp Generator – ensures monotonically increasing time


2. Worker ID Assigner – assigns unique machine IDs
3. Sequence Tracker – maintains count for current millisecond
4. Clock Synchronization – handles clock drift/rollback
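
A minimal, single-node sketch of how these components compose a 64-bit Snowflake-style ID. The 41/10/12 bit split and custom epoch below are common choices assumed for illustration, and clock-rollback handling is omitted.

```python
import time
import threading

EPOCH_MS = 1_600_000_000_000   # custom epoch in milliseconds (assumed)

class SnowflakeLike:
    """Compose IDs as: 41-bit timestamp | 10-bit worker ID | 12-bit sequence."""
    def __init__(self, worker_id: int):
        assert 0 <= worker_id < 1024
        self.worker_id = worker_id
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now_ms = int(time.time() * 1000)
            if now_ms == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF   # 12-bit sequence
                if self.sequence == 0:                        # exhausted this millisecond
                    while now_ms <= self.last_ms:             # wait for the next millisecond
                        now_ms = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now_ms
            return ((now_ms - EPOCH_MS) << 22) | (self.worker_id << 12) | self.sequence

gen = SnowflakeLike(worker_id=7)
print(gen.next_id(), gen.next_id())   # time-sortable, unique per worker
```
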
🧠 7. Design Considerations

• Use caching for speed


• Optionally prefetch ID ranges
• Store state locally to avoid remote calls
• Provide failover logic for clock issues

Key Takeaways

• Snowflake ID generation is highly efficient and widely adopted


• Tradeoffs include ordering, size, clock reliance, and machine coordination
• Always consider failure modes (e.g., time rollback, machine duplication)

Certainly! Here’s a detailed explanation of the chapter “Design a Web Crawler” from System Design
Interview – An Insider’s Guide by Alex Xu. This is one of the more complex system design problems and tests
your understanding of scalability, distribution, concurrency, and data processing.

🌐 Chapter: Design a Web Crawler

🧭 What is a Web Crawler?

A web crawler (or spider) is a bot that browses the web in a methodical, automated manner to:

• Download web pages


• Extract links from them
• Follow those links recursively

🧠 Used by: Search engines (e.g., Google), SEO tools, data mining applications
✅ 1. Functional Requirements

• Start from a list of seed URLs


• Crawl and download web pages
• Extract and follow hyperlinks recursively
• Avoid crawling the same URL multiple times
• Respect robots.txt (optional)
• Allow for parallel crawling

2. Non-Functional Requirements

• Scalable: Handle billions of URLs


• Efficient: High throughput; fast crawling
• Distributed: Multiple workers, nodes
• Polite: Respect domain rate limits
• Reliable: Handle failures, retries
• Extensible: Easy to plug in parsers or filters

🔁 4. Workflow

1. Seed: Add initial URLs to the queue (URL Frontier)


2. Download: Fetcher downloads the page
3. Parse: Extract HTML and links
4. Filter: Remove duplicates, non-HTTP links, blacklisted domains
5. Store: Save content in storage or DB
6. Requeue: Valid links go back to the frontier
📥 5. URL Frontier Design

• Needs to be scalable and distributed


• Implemented with a priority queue, where:
o Popular domains get lower priority to avoid hammering them
o Depth-first or breadth-first crawling can be prioritized

✅ Use tools like:

• Kafka / RabbitMQ (for distribution)


• Redis / Cassandra (for stateful queues)

📛 6. Duplicate URL Elimination

• Crucial for efficiency and correctness


• Use Bloom Filters or Hash Sets to track visited URLs

✅ Options:

• Redis Set (if fits in memory)


• Distributed hash (like DHT)
• Bloom filter (if space is constrained, allow false positives)

🚥 7. Politeness & Rate Limiting

Respect domain-specific rules:

• robots.txt: Honor crawling rules


• Throttle: Limit requests per domain/IP (e.g., 1 req/sec)
• Retry: On 429 (Too Many Requests) or 5xx errors, retry later

✅ Use:

• Token bucket algorithm


• Domain-wise rate-limit queue
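
A minimal sketch of a per-domain token bucket, assuming roughly 1 request/sec per domain as mentioned above; the class and helper names are illustrative.

```python
import time
from collections import defaultdict

class TokenBucket:
    """Allow about `rate` requests/sec per domain, with small bursts up to `capacity`."""
    def __init__(self, rate: float = 1.0, capacity: float = 2.0):
        self.rate, self.capacity = rate, capacity
        self.tokens, self.last = capacity, time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

buckets = defaultdict(TokenBucket)      # one bucket per domain

def may_fetch(domain: str) -> bool:
    """Fetcher checks this before downloading a URL from the domain."""
    return buckets[domain].allow()

print([may_fetch("example.com") for _ in range(3)])   # small burst allowed, then throttled
```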

📁 8. Storage Strategy

Store:

• Raw HTML
• Parsed content (e.g., title, meta tags, text)
• Metadata (timestamp, status, size, latency)
✅ Options:

• S3 / HDFS (for HTML blobs)


• Elasticsearch (for indexed search)
• Relational/NoSQL DB (for metadata)

🧮 10. Back-of-the-Envelope Estimation

Let’s say you want to crawl 1 billion pages in 30 days:

• Pages/day = 1B / 30 = ~33.3M
• Pages/second ≈ 33.3M / 86,400 ≈ 385 QPS
• Assuming 100 KB/page → 1B × 100 KB = 100 TB total

🔋 With 1,000 worker nodes, each must do ~1 page every 2.5 seconds

🧩 11. Optional Features

• Domain-based prioritization (e.g., gov or edu sites first)


• Multi-threaded fetching
• Language detection
• Media crawling (images, PDFs, videos)
• Change detection (re-crawl based on delta)
✅ Key Takeaways

• A web crawler is a distributed pipeline of fetching, parsing, filtering, and queuing


• Key challenges include duplication, politeness, and scaling
• Maintain a reliable, persistent frontier and robust filtering
• Use caching, queues, rate limiters, and bloom filters smartly

Certainly! Here’s a detailed explanation of the chapter “Design Notification System (Notification as a
Service)” from System Design Interview – An Insider’s Guide by Alex Xu.

This chapter explores how to build a generic, scalable, multi-channel notification service, like those used
by apps to send alerts via email, SMS, push notifications, or in-app messages.

🔔 Chapter: Design Notification as a Service

🧾 1. Problem Statement

Design a multi-channel, asynchronous, reliable notification system that can:

• Handle high volumes of notification requests


• Send notifications via email, SMS, push, etc.
• Allow scheduling (now or future)
• Support retries, failure handling, and delivery logging

✅ 2. Functional Requirements

• Accept a request to send notification(s)


• Support multiple channels (Email, SMS, Push)
• Allow broadcast or targeted messages
• Optionally support scheduled or delayed delivery
• Provide status tracking: sent, delivered, failed

🛡 3. Non-Functional Requirements

• High throughput and low latency


• Scalability to millions of users
• Reliability: guarantee delivery or retries
• Extensibility: add more channels easily
• Observability: logs, metrics, audit trails

🔁 5. Notification Flow

1. Producer sends request to Notification API


2. Notification is validated, templated, and stored in DB
3. Message is placed in queue (e.g., Kafka topic)
4. Workers consume from queue and invoke relevant channel service
5. Channel service talks to provider API (e.g., Twilio, SendGrid, FCM)
6. Delivery results logged; retries scheduled if needed
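
A minimal sketch of the worker side of this flow (steps 4–6), with hypothetical channel handlers standing in for real provider integrations (Twilio, SendGrid, FCM) and an in-process queue standing in for a Kafka topic.

```python
import queue

# Hypothetical channel handlers; real ones would call provider APIs.
def send_email(n): print(f"email -> {n['to']}: {n['body']}")
def send_sms(n):   print(f"sms   -> {n['to']}: {n['body']}")
def send_push(n):  print(f"push  -> {n['to']}: {n['body']}")

HANDLERS = {"email": send_email, "sms": send_sms, "push": send_push}
notification_queue = queue.Queue()            # stands in for the message queue

def worker():
    """Consume notifications and dispatch to the right channel; requeue on failure."""
    while not notification_queue.empty():
        n = notification_queue.get()
        try:
            HANDLERS[n["channel"]](n)
        except Exception:
            n["attempts"] = n.get("attempts", 0) + 1
            if n["attempts"] < 3:             # real systems add backoff and a DLQ
                notification_queue.put(n)
        finally:
            notification_queue.task_done()

notification_queue.put({"channel": "email", "to": "user@example.com", "body": "Order shipped"})
notification_queue.put({"channel": "push", "to": "device-123", "body": "Sale starts now"})
worker()
```
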
🔃 7. Retry Strategy & Failure Handling

• Use Exponential Backoff on failure


• Define retry limits per channel
• Persist undeliverable messages in Dead Letter Queue (DLQ)
• Alert via monitoring if too many failures from one channel/provider

🧠 8. Scheduling (Optional Feature)

To support delayed delivery:

• Store jobs with timestamp in a delayed queue or DB with cron poller


• Use a scheduler service to enqueue messages at the right time

✅ Tools: Redis (with sorted sets), Quartz Scheduler, or delay queues in Kafka

📊 9. Monitoring and Metrics

Track:

• Number of notifications per type/channel


• Success/failure ratio
• Retry count
• Latency per channel/provider

Tools: Prometheus + Grafana, Elastic + Kibana, DataDog

🧮 10. Back-of-the-Envelope Estimation

Let’s estimate for a large-scale system:


• 50 million DAU, avg. 2 notifications/day → 100M/day
• = 1.15K/sec (100M / 86,400 seconds)
• Email = 30%, SMS = 20%, Push = 50%
• Peak loads (e.g., sale event) → 10× = ~11,500/sec

💾 Storage: Logs & metadata → ~1 KB/notification = ~100 GB/day

🏗 12. Scalability Tips

• Partition queues per channel or tenant


• Horizontally scale workers per channel
• Use bulk APIs from providers when possible
• Add circuit breakers for unstable providers

✅ Key Takeaways

• A notification system is a pipeline of producers, queues, workers, and channel-specific delivery handlers
• Use queues to decouple systems and allow retries
• Make the system modular, extensible, and resilient
• Always think in terms of user experience (fast, reliable, respectful of preferences)

🔄 Optional Features You Can Add

• A UI dashboard to show notification logs


• A rules engine (e.g., "if order delayed, send SMS")
• Notification priorities
• Batch notifications (digest emails)

Absolutely! Here's a detailed explanation of the chapter “Design a News Feed System” from System
Design Interview – An Insider’s Guide by Alex Xu.
This chapter focuses on how to build a scalable, low-latency, and personalized feed — the kind you see in
social media apps like Facebook, Instagram, or Twitter.

📰 Chapter: Design a News Feed System

🧾 1. What Is a News Feed System?

A News Feed system aggregates content (posts, updates, videos, etc.) from many sources (friends, pages,
followed users) and shows a personalized list of the most relevant or recent content to the user.

Examples:

• Facebook timeline
• Instagram feed
• Twitter home timeline
• LinkedIn activity feed

✅ 2. Functional Requirements

• Users can post content (text, image, video)


• Users can follow others
• Each user sees a personalized feed
• Feed supports pagination or infinite scroll
• Show latest or most relevant posts
• Deliver posts from thousands of followees

3. Non-Functional Requirements

• Low latency: Feed should load instantly


• Scalable: Handle billions of users and posts
• Highly available: Can’t go down even if one part fails
• Consistent enough: Slight feed delay is tolerable

4. News Feed Design Approaches

Alex Xu highlights two main strategies:

Option A: Pull Model ("On-demand Feed Generation")

• Feed is generated when user opens the app


• System queries the database:

→ Who do I follow? → What are their latest posts?


→ Sort + merge + paginate → Return feed

✅ Pros:

• Always fresh, latest posts


• Simple, no pre-computation

❌ Cons:

• High latency
• Heavy DB load on every feed request
• Harder to scale

Option B: Push Model ("Fan-out on Write")

• Feed is precomputed and stored per user


• When a user posts, their post is pushed to followers' feeds
• Followers simply read from their own feed store

✅ Pros:

• Super fast feed loading (just read from DB or cache)


• Reduces read-time computation

❌ Cons:

• Write amplification: One post → pushed to millions


• Storage overhead
• May have stale data if user changes follow list

🔄 Hybrid Model (Used by Facebook & Twitter)

• Push for normal users (e.g., 100s of followers)


• Pull for celebrities or hot users (e.g., millions of followers)

➡ Reduces push overhead while keeping feed fast for most users
🔁 6. Feed Generation Logic

Pull Model

1. User opens app


2. Get list of followed users
3. Fetch their recent posts (from DB or cache)
4. Sort by time or relevance
5. Return top N posts
Push Model
1. User A posts content
2. System fetches A's followers (1000+)
3. Pushes the post to each follower’s feed DB/cache
4. When followers open app, they read prefilled feed
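
A minimal in-memory sketch contrasting the two paths: fan-out on write pushes each post into followers' precomputed feeds, so the read path becomes a cheap list lookup. The dictionaries below stand in for the follower graph and the per-user feed store (Redis lists or a sharded table in practice).

```python
from collections import defaultdict

followers = {"alice": ["bob", "carol"], "dave": ["bob"]}   # author -> followers (toy data)
feeds = defaultdict(list)                                  # user -> precomputed feed

def post(author: str, content: str) -> None:
    """Fan-out on write: push the new post into every follower's feed, newest first."""
    for user in followers.get(author, []):
        feeds[user].insert(0, (author, content))

def read_feed(user: str, limit: int = 10):
    """Read path: no per-request merging of followees' posts, just a list read."""
    return feeds[user][:limit]

post("alice", "hello world")
post("dave", "shipping a new feature")
print(read_feed("bob"))   # [('dave', 'shipping a new feature'), ('alice', 'hello world')]
```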

7. Feed Ranking Strategies

• Time-based: Show latest posts first


• Engagement-based: Likes, shares, comments
• ML-based personalization: Based on user interest

➡ A ranking engine may use:

• Collaborative filtering
• Behavioral signals
• Prior engagement
8. Back-of-the-Envelope Estimation

Assume:

• 500M DAU
• Each user follows 500 users
• Each followee posts 2 times/day

→ 500 × 2 = 1000 posts/user/day

→ Feed system must handle:

• 500M × 1000 = 500 billion feed items/day

Assume:

• Each feed item = 1 KB

→ 500 TB/day to store feeds

Feed read requests:

500M opens/day → ≈ 5800 QPS

🔄 9. Data Storage and Caching

• Use Redis/Memcached for fast feed retrieval


• Sharded DBs or NoSQL (Cassandra, DynamoDB) for feed storage
• Time series databases for temporal feed patterns

📊 10. Optional Features

• Infinite scrolling
• Filters: only photos, videos, friends
• Stories (temporary content)
• Ads injected into feed
• Real-time feed updates with WebSocket or polling
✅ Key Takeaways

• Pull model is simple but slow and hard to scale


• Push model is fast for reads but expensive for writes
• Most real systems use a hybrid based on user popularity
• Feed design depends heavily on latency, freshness, and scale goals

Absolutely! Here's a detailed explanation of the chapter “Design a Chat System” from System Design
Interview – An Insider’s Guide by Alex Xu.

A chat system is fundamental to messaging apps like WhatsApp, Messenger, Slack, Telegram, etc. It must
handle real-time communication, message reliability, and scalability for millions (or billions) of users.

💬 Chapter: Design a Chat System

✅ 1. Functional Requirements

• Send and receive instant messages


• Support for 1-to-1 chats and group chats
• Store chat history
• Delivery status: sent, delivered, read
• Online/offline presence (optional)

2. Non-Functional Requirements

• Low latency: Real-time messaging


• High availability
• Scalable to millions of concurrent users
• Message reliability: No loss or duplication
• Security: Encryption (optional)

🔗 4. Communication Protocol

Use WebSockets or gRPC Streaming for:

• Real-time bi-directional communication


• Persistent connection (avoids polling overhead)
• Fallback options:
• Long polling or HTTP polling for browsers

📦 5. Message Flow (1-to-1 Chat)

1. User A sends message to User B via WebSocket


2. Gateway validates and forwards to Message Service
3. Message Service:
a. Stores message in DB
b. Checks if recipient is online
i. ✅ Online → Send via WebSocket
ii. ❌ Offline → Queue for push notification
4. Updates delivery status
5. Optionally stores read receipts
👥 7. Group Chat Considerations

• A message sent to group → fan-out to all group members


• Store message once, but store delivery/read status per member
• Can use Pub/Sub model:
o Group channel = topic
o Each user = subscriber

📬 8. Delivery Guarantees

• Exactly-once delivery is difficult (because of retries, disconnects)


• Most systems aim for:
o At least once: Resend if unsure
o De-duplication at client using message ID
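
A minimal sketch of client-side de-duplication under at-least-once delivery, keyed by message ID; the field names and the unbounded set are simplifying assumptions (real clients bound this with an LRU or per-conversation watermark).

```python
seen_ids = set()          # assumption: unbounded for simplicity
conversation = []

def on_message(msg: dict) -> None:
    """Apply each unique message exactly once, even if the server re-sends it."""
    if msg["id"] in seen_ids:
        return            # duplicate caused by a retry or reconnect; ignore it
    seen_ids.add(msg["id"])
    conversation.append(msg["text"])

on_message({"id": "m1", "text": "hey"})
on_message({"id": "m1", "text": "hey"})   # re-delivered because an ack was lost
on_message({"id": "m2", "text": "are you there?"})
print(conversation)                        # ['hey', 'are you there?']
```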

🧮 9. Back-of-the-Envelope Estimation

Assume:

• 200M DAU
• Avg. 50 messages/user/day → 10B messages/day
• = ~115,000 messages/sec (average)

Storage:

• 100 bytes/message → 1 TB/day

Active connections:

• 20M concurrent users → need persistent socket connections


• 1 Gateway server ≈ 50K sockets → need 400 servers

⚙ 10. Storage System

• Write-heavy system
• Use NoSQL DB like Cassandra or DynamoDB for fast writes
• Store recent messages in Redis or Memcached for quick access
• Archive old messages to blob store (e.g., S3)

🔃 11. Presence Management (Optional)

Track user online/offline status:

• Use heartbeat pings via WebSocket


• Store online users in Redis (fast and in-memory)
• TTL (Time to Live) ensures cleanup on disconnect

🔔 12. Notifications & Offline Delivery

• If recipient is offline:
o Store message
o Trigger push notification via FCM / APNs
• Retry sending on reconnect

🔐 13. Security

• Use end-to-end encryption (E2EE) (e.g., Signal Protocol)


• Encrypt message content before sending
• Server only stores ciphertext

✅ Key Takeaways

• Chat systems require low latency, high throughput, and reliability


• Use WebSockets for real-time, queues for decoupling, NoSQL for fast writes
• Handle disconnection, retry, and message ordering
• Think about group fan-out, rate limits, and deduplication
Absolutely! Here's a detailed explanation of the chapter “Design a Search Autocomplete System” from
System Design Interview – An Insider’s Guide by Alex Xu.

This chapter focuses on designing a responsive and scalable autocomplete system like the one you see in
Google, Amazon, YouTube, or LinkedIn, where suggestions are shown as you type.

🔍 Chapter: Design a Search Autocomplete System

🧾 1. Problem Statement

Design a search autocomplete service that returns top relevant suggestions as the user types, usually
within milliseconds.

Examples:

• Typing "cha" shows: chatGPT, chase bank, chai latte

✅ 2. Functional Requirements

• Given a query prefix (e.g., "cha"), return top N suggestions


• Suggestions should be:
o Fast (low latency)
o Ranked by popularity, context, or recency
• Support typo-tolerance and multi-language (optional)
• Suggestions should update in real time (e.g., trending terms)
3. Non-Functional Requirements

• Latency: <50ms per request


• High availability and scalability
• Real-time freshness (new searches reflected quickly)
• Memory-efficient

🧠 5. Data Sources for Suggestions

Suggestions can come from:

• Search query logs (historical popularity)


• User-specific queries (personalization)
• Trending keywords
• Product/item database (for e-commerce)
• Predefined lists (for bootstrapping)

🔤 6. Data Structure – Trie (Prefix Tree)

The most common structure for autocomplete:

• Nodes represent characters


• Paths from root → node form valid prefixes
• Each node can store:
o Top K suggestions (with popularity)
o Frequency count

✅ Efficient prefix lookup: O(m), where m = prefix length

✅ Fast traversal for autocomplete
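
A minimal sketch of a trie that keeps a top-K list at every node, so a prefix lookup is O(m) plus a constant-size read. The value of K and ranking purely by query frequency are assumptions for illustration.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.top = []                      # (count, query) pairs, highest count first

class AutocompleteTrie:
    def __init__(self, k: int = 5):
        self.root, self.k = TrieNode(), k

    def add(self, query: str, count: int) -> None:
        """Insert a query and refresh the top-K list on every prefix node."""
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
            node.top = [t for t in node.top if t[1] != query]   # drop stale entry
            node.top.append((count, query))
            node.top.sort(reverse=True)
            node.top = node.top[: self.k]

    def suggest(self, prefix: str):
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        return [q for _, q in node.top]

trie = AutocompleteTrie()
for q, c in [("chase bank", 900), ("chatgpt", 1500), ("chai latte", 300)]:
    trie.add(q, c)
print(trie.suggest("cha"))   # ['chatgpt', 'chase bank', 'chai latte']
```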

📊 8. Real-Time Updates

• Periodically rebuild trie (batch processing) OR


• Use incremental updates on new queries
• Use streaming logs (Kafka/Spark) to update popularity in near-real-time

🧪 10. Caching and Performance Optimization

• Frequently searched prefixes → stored in cache


• Use Bloom filters to skip unknown prefixes
• Limit prefix length (e.g., only suggest after 2 characters)

🧮 11. Back-of-the-Envelope Estimation

Assume:

• 100M DAU × 5 queries/day → 500M query logs/day


• Popularity scores must update hourly
• ~100M unique prefixes tracked → store the top 5 suggestions per prefix

Estimated storage:

• 5 suggestions × 100M prefixes × 50 bytes = ~25 GB


• Can fit in memory on commodity hardware (or Redis cluster)
✅ Key Takeaways

• Use tries or prefix trees for fast lookup


• Rank by popularity, personalization, recency
• Ensure low latency and high QPS support
• Keep data fresh using logs and streaming updates
• Cache hot prefixes for instant access

Absolutely! Here's a detailed explanation of the chapter “Design YouTube” from System Design Interview –
An Insider’s Guide by Alex Xu.

This is one of the most complex and multi-component system design problems because it involves:

• Video uploads and streaming


• Metadata management
• Recommendation systems
• Storage, encoding, caching, and scalability
📺 Chapter: Design YouTube (or any large-scale video-sharing platform)

✅ 1. Functional Requirements

• Users can upload videos


• Other users can view videos on-demand (VOD)
• Support search and recommendations
• Support likes, comments, shares
• Video thumbnails, titles, playback speed, etc.
• Optional: Live streaming, subscriptions, monetization

🛡 2. Non-Functional Requirements

• Low latency video playback


• High availability across regions
• Scalability to millions of videos and users
• Efficient storage for massive video files
• Fault tolerance
🔁 4. Workflow: Video Upload & Playback

📤 Upload Flow

1. User uploads video → goes to Upload Service


2. Video is chunked and stored temporarily
3. Job is added to an encoding queue

⚙ Encoding Flow

4. Encoding service:
a. Converts video into multiple formats/resolutions (1080p, 720p, 480p, etc.)
b. Stores in object storage
c. Generates thumbnail and preview

▶ Playback Flow

5. User plays video → Player fetches manifest (e.g., .m3u8)


6. Streaming server/CDN delivers video chunks in required resolution

🎞 5. Adaptive Bitrate Streaming (ABR)

• Ensures smooth playback even with varying network speeds


• Use HLS (HTTP Live Streaming) or DASH (Dynamic Adaptive Streaming over HTTP)
• Video is broken into chunks (2–10 seconds each) and encoded in different resolutions

➡ Player automatically switches resolution based on bandwidth
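
A toy sketch of the player-side decision: pick the highest rendition whose bitrate fits within the measured bandwidth. The rendition ladder and the 80% headroom factor are illustrative assumptions, not real encoder settings.

```python
# (resolution, required Mbps) - illustrative rendition ladder
RENDITIONS = [("1080p", 5.0), ("720p", 2.5), ("480p", 1.0), ("240p", 0.4)]

def pick_rendition(measured_mbps: float, headroom: float = 0.8) -> str:
    """Choose the best resolution that fits within ~80% of measured bandwidth."""
    budget = measured_mbps * headroom
    for name, required in RENDITIONS:      # ordered from highest to lowest quality
        if required <= budget:
            return name
    return RENDITIONS[-1][0]               # worst case: fall back to the lowest rendition

print(pick_rendition(7.0))   # 1080p
print(pick_rendition(1.5))   # 480p (1.5 Mbps * 0.8 = 1.2 >= 1.0)
```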


🧮 7. Back-of-the-Envelope Estimation

Assume:

• 10M video uploads/day


• Avg. video size = 100MB → 1PB per day of raw data
• Encoded in 3 resolutions → ~3PB/day total
• 1B DAUs → Assume 5 video views/day → 5B views/day

QPS:

• 5B / 86,400 seconds ≈ 57,000 QPS


• 1 video = ~100 chunks → ~5.7M chunk fetches/second

🧭 8. Search & Recommendation

🔎 Search

• Use ElasticSearch or Solr for full-text search


• Index:
o Title, tags, uploader, views, language, transcripts

🤖 Recommendation

• Collaborative filtering: "Users who watched X also watched Y"


• Content-based filtering: Tags, topic, user interest
• ML-based ranking: History, engagement, click-through rates

Can be precomputed and cached for active users

🌍 9. CDN & Video Delivery

To reduce load and latency:

• Use CDNs (Akamai, Cloudflare, etc.)


• Cache video chunks close to users
• Prefetch popular videos to regional edge servers

Use signed URLs for security and cache invalidation strategies


🔐 10. Security & Access Control

• Authentication for upload and viewing private videos


• Rate limiting and CAPTCHA to prevent abuse
• DRM (Digital Rights Management) for licensed content
• Signed URLs for secure delivery

✅ Key Takeaways

• YouTube is a media-heavy system with complex needs:


o Encoding
o Storage
o Search
o Recommendations
o Streaming
• Design focuses on:
o Efficient video storage & delivery
o Smart indexing and metadata
o Personalized recommendations
o High scalability and availability

Absolutely! Here's a detailed explanation of the chapter “Design Google Drive (or Dropbox)” from System
Design Interview – An Insider’s Guide by Alex Xu.
Designing Google Drive involves building a cloud storage service that lets users upload, store, share, and
collaborate on files and folders, while maintaining reliability, synchronization, scalability, and security.

☁ Chapter: Design Google Drive (or Dropbox)

✅ 1. Functional Requirements

• Upload/download files of any type


• Folder structure: nested directories like a file system
• Share files/folders with others (read/write)
• Sync files across devices
• Versioning: track changes and restore versions
• Preview files (images, PDFs, videos)

🛡 2. Non-Functional Requirements

• High availability and durability (no data loss)


• Scalability to billions of files
• Efficient file sync and updates
• Access control and data security
• Low latency for metadata and file retrieval
• Support for concurrent access and edits
📁 4. File and Folder Structure

Replicate a tree-like file system:

• Each file/folder has a unique ID


• Folders store references to child items
• Metadata DB keeps:
o Name
o Type (file/folder)
o Parent ID
o Owner, timestamps, permissions

➡ Folder structure can be modeled using a relational DB or document store (e.g., MongoDB)

🧩 5. File Upload and Storage

Large files:

• Split into chunks (e.g., 4MB)


• Each chunk is:
o Checksummed (e.g., SHA-256)
o Stored in object storage (e.g., Amazon S3, GCP GCS)
o Deduplicated (if same chunk exists already)
Storage layout:

• Chunk ID → Location in storage


• File ID → Ordered list of chunk IDs

➡ Enables resumable uploads and partial updates
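
A minimal sketch of the chunk-and-deduplicate idea: split the file into fixed-size chunks, key each chunk by its SHA-256 checksum, and store a chunk only if it is not already present. The in-memory dict stands in for object storage such as S3 or GCS.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024          # 4 MB, as assumed above
chunk_store = {}                      # sha256 hex -> bytes; stands in for object storage

def upload(data: bytes) -> list:
    """Return the ordered list of chunk IDs (the file manifest)."""
    chunk_ids = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i : i + CHUNK_SIZE]
        cid = hashlib.sha256(chunk).hexdigest()
        if cid not in chunk_store:    # deduplication: identical chunks stored once
            chunk_store[cid] = chunk
        chunk_ids.append(cid)
    return chunk_ids

manifest = upload(b"A" * (9 * 1024 * 1024))   # 9 MB file -> 3 chunks, 2 unique
print(len(manifest), "chunks in manifest,", len(chunk_store), "chunks stored")
```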

🔁 6. Sync Service

Sync is at the heart of Google Drive:

• Client polls or uses WebSocket for change notifications


• Syncs only changed files or chunks (delta sync)
• Maintains local cache and a change log (journal)

To detect changes:

• Use file hash comparison


• Maintain version number or modification timestamp

👥 7. Sharing and Permissions

Each file or folder has an Access Control List (ACL):

• User ID or email → permission (owner, editor, viewer)

Permissions cascade:

• If you share a folder, child files are shared too (unless overridden)

Support for:

• Shared links (anyone with link can view/edit)


• Role-based access (user, group)

📜 8. Versioning System

• Each file edit results in a new version


• Versions stored as:
o Full copy (for small files)
o Or as diffs (for large files)

Version DB stores:

• File ID → Version number, timestamp, user ID

Users can:

• View version history


• Restore to previous version
• Compare changes

📬 9. Notifications and Collaboration

To support real-time collaboration (like in Google Docs):

• Use WebSockets or gRPC streams to notify clients of:


o New uploads
o Edits
o Permission changes
• Clients update local view or prompt refresh

Optional:

• Use Operational Transformation (OT) or CRDTs for collaborative document editing

🔐 10. Security and Access Control

• All communications over HTTPS


• Files encrypted:
o In transit (TLS)
o At rest (AES-256)
• Use OAuth 2.0 for authentication and authorization
• Store access logs (IP, time, action)
🧮 12. Back-of-the-Envelope Estimation

Assume:

• 1B users
• Avg 10 files/user/day → 10B new files/day
• Each file = 1MB → 10 PB/day
• 4MB chunk size → each 1MB file fits in a single chunk → ~10B chunk writes/day

➡ Heavy read/write system, needs to support:

• On the order of 100K file writes/sec (10B / 86,400 seconds)
• Millions of sync messages/sec

✅ Key Takeaways

• Google Drive = Large-scale distributed file system + sync engine


• Core pillars:
o Chunked, deduplicated storage
o Tree-based metadata
o Access control
o Real-time sync
• Needs massive scalability, durability, and reliability
• Can extend to collaboration, previews, and integrations

Certainly! Here's a detailed explanation of the chapter "Design a Key-Value Store" from System Design Interview – An Insider's Guide by Alex Xu.

A Key-Value Store is a core component in distributed systems, serving as the foundation for databases like Redis, Amazon DynamoDB, RocksDB, and LevelDB. It enables quick reads and writes with a simple interface.

🧰 Chapter: Design a Key-Value Store

✅ 1. Functional Requirements

• GET(key) → value
• PUT(key, value) → insert or update
• DELETE(key)
• Support large data size and millions of requests per second
• Support persistence, replication, and fault tolerance

🛡 2. Non-Functional Requirements

• High performance: low-latency reads/writes
• Scalability: horizontal scale across machines
• Durability: data persists through failures
• Availability: should tolerate machine failure
• Consistency trade-offs (CAP theorem)

🔍 3. Use Cases of Key-Value Stores

• Session storage (user_id → session object)
• Caching layer (product_id → product data)
• User preferences
• Shopping cart data
• Metadata (e.g., file_id → location info)

🔁 7. Write Path (PUT)

• Write to WAL for durability
• Update MemTable in memory
• When MemTable full:
o It becomes Immutable MemTable
o Flushed to disk as an SSTable
o Trigger compaction to merge SSTables

🔍 8. Read Path (GET)

• Search in MemTable
• If not found, check Immutable MemTable
• Then search in SSTables (on-disk)
o May use a Bloom filter to skip irrelevant files
o Use index block in SSTable for efficient lookup
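
A toy, in-process sketch of these two paths: writes append to a WAL and land in a MemTable, which is flushed to an SSTable-like sorted segment when full; reads check the newest data first. Real stores add Bloom filters, index blocks, and background compaction on top of this.

```python
MEMTABLE_LIMIT = 3
wal, memtable, sstables = [], {}, []      # sstables: list of immutable sorted segments

def put(key, value):
    wal.append(("PUT", key, value))       # 1. write-ahead log for durability
    memtable[key] = value                 # 2. update the in-memory MemTable
    if len(memtable) >= MEMTABLE_LIMIT:   # 3. flush when the MemTable is full
        sstables.append(dict(sorted(memtable.items())))   # newest segment goes last
        memtable.clear()

def get(key):
    if key in memtable:                   # check the MemTable first (newest data)
        return memtable[key]
    for segment in reversed(sstables):    # then segments, newest to oldest
        if key in segment:                # real stores consult a Bloom filter + index here
            return segment[key]
    return None

for k, v in [("a", 1), ("b", 2), ("c", 3), ("a", 99)]:
    put(k, v)
print(get("a"), get("b"), get("z"))       # 99 2 None
```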

🔨 9. Compaction

• Periodic background process
• Merges multiple SSTables into fewer, larger SSTables
• Removes tombstone entries (from DELETEs)
• Keeps read performance high and disk space efficient

🧮 10. Back-of-the-Envelope Estimation

Assume:

• 10 million keys
• Key = 20B, Value = 500B → ~520B/entry
• Total storage = 10M × 520B ≈ 5 GB

If 1,000 writes/sec:

• WAL must handle ~500 KB/sec write throughput

For scalability:

• Partition by key hash across 10 nodes
• Each node handles 1M keys (~500 MB/node)

📦 11. Horizontal Sharding

• Distribute keys across nodes using consistent hashing
• Add or remove nodes with minimal rebalancing
• Each node maintains its own LSM Tree instance

✅ Key Takeaways

• A modern key-value store uses an LSM Tree for efficient writes and compaction
• Combines memory (MemTable) and disk (SSTables) for performance and durability
• Read optimization via Bloom filters and indexes
• Replication, sharding, and compaction ensure performance and reliability at scale


Certainly! Here's a detailed explanation of the chapter “Design Consistent Hashing” from System Design
Interview – An Insider’s Guide by Alex Xu.
Consistent Hashing is a fundamental concept in distributed systems, especially in systems where data
needs to be partitioned across multiple servers/nodes—such as caches (like Memcached), databases
(like DynamoDB), file storage systems, etc.

🌀 Chapter: Design Consistent Hashing

🎯 3. Core Idea of Consistent Hashing

Keys and servers are both placed on a hash ring (0 to 2³² or 0 to 1).
• Hash every node/server to a point on the ring
• Hash every key to a point on the ring
• A key is assigned to the first node clockwise from its hash value

Key Properties:

• Only K/N keys are remapped when a node is added/removed


• Ideal for scalable and fault-tolerant systems

🧠 4. How It Works (Step by Step)

1. Define a hash space (e.g., 0 to 2³² - 1)


2. Hash each server (e.g., IP or name) to a point on the circle
3. Hash each key to a point on the same circle
4. Route the key to the first server in clockwise direction

Example:

• Servers hashed to ring positions: A → 20, B → 60, C → 120
• A key X hashed to 90 is assigned to C (the first node clockwise from 90)
• If C is removed, only keys in C's range move: X is reassigned to the next node clockwise, wrapping around the ring to A
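
A minimal sketch of a consistent-hash ring with virtual nodes, using bisect over sorted hash positions. The MD5-based hash and the virtual-node count are implementation choices assumed for illustration.

```python
import bisect
import hashlib

def h(s: str) -> int:
    """Map a string to a point on the ring (0 .. 2^32 - 1)."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % (2 ** 32)

class HashRing:
    def __init__(self, nodes, vnodes: int = 100):
        # Each physical node is hashed `vnodes` times onto the ring
        self.ring = sorted((h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
        self.points = [p for p, _ in self.ring]

    def node_for(self, key: str) -> str:
        """Assign the key to the first (virtual) node clockwise from its hash."""
        idx = bisect.bisect_right(self.points, h(key)) % len(self.points)
        return self.ring[idx][1]

ring = HashRing(["A", "B", "C"])
keys = ["user:1", "user:2", "user:3", "user:4"]
before = {k: ring.node_for(k) for k in keys}
ring_d = HashRing(["A", "B", "C", "D"])
after = {k: ring_d.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(before)
print("keys that moved after adding D:", moved)   # only keys now owned by D move
```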

🔁 5. Adding/Removing Nodes

• Adding a node → only keys between the previous server and new node move
• Removing a node → only its keys need to be reassigned to next clockwise node

➡ Ensures minimal disruption

🧩 6. Problems with Basic Consistent Hashing

Uneven load distribution:

• If hash function isn’t uniform


• If number of nodes is small

Solution: Virtual Nodes (VNodes)

• Each physical node is hashed multiple times to create virtual nodes


• Example: Server A → A1, A2, A3 … placed in different ring positions
• Keys are assigned to the virtual node, which maps back to the physical node
Benefits:

• Better load balancing


• Handles heterogeneous nodes (different capacities)

🧮 8. Back-of-the-Envelope Estimation

Assume:

• 100M keys
• 10 servers

Without consistent hashing:

• Adding 1 server → all keys rehashed → 100M moves

With consistent hashing:

• ~100M / 11 ≈ ~9M keys move to the new server

If each key = 1KB → 9M KB ≈ 9GB of movement instead of 100GB

🔐 9. CAP Theorem Relationship

• Consistent hashing doesn’t enforce CAP trade-offs directly


• But it supports scalability and availability by allowing nodes to be added/removed dynamically
• Commonly used in eventually consistent systems (e.g., DynamoDB)

✅ 10. Key Takeaways

• Traditional hashing fails with node changes → massive rebalancing


• Consistent hashing minimizes data movement
• It places keys and nodes on a ring and maps each key to the next clockwise node
• Virtual nodes improve load balancing and fault tolerance
• Widely used in caching, databases, file storage, and load balancing
