20 System Design Interview Questions (With Complete Walkthroughs)
Updated April 2026. Estimated reading time: 18 minutes.
System design interviews are where most senior engineering candidates crash. Not because they lack knowledge — because they've memorized answers instead of learning the underlying patterns.
This guide gives you the 20 most common system design questions, organized by pattern, with complete walkthroughs that translate to any variation.
The Framework (Memorize This)
Every system design interview, regardless of the question, should follow this structure. Here is how to distribute a 45-minute slot:
| Phase | Time | What you do |
|-------|------|-------------|
| Clarify requirements | 5 min | Functional + non-functional. Ask, don't assume |
| Estimate scale | 5 min | QPS, storage, bandwidth (back-of-envelope) |
| High-level design | 10 min | Draw the architecture. Services + data flow |
| Deep dive | 15 min | Focus on 1-2 critical components |
| Trade-offs | 5 min | Discuss alternatives and why you chose what you chose |
| Wrap up | 5 min | Answer follow-up questions about scale, failure modes |
The #1 mistake: jumping to architecture diagrams at minute 2. You'll miss requirements and score low on collaboration.
The 7 Core Patterns
Every system design question is a variation of one or more of these seven patterns:
- Read-heavy with caching — Feeds, profiles, product pages
- Write-heavy with fan-out — Notifications, activity streams
- Real-time messaging — Chat, collaborative editing
- Geospatial queries — Uber, Yelp, DoorDash
- Analytics pipelines — Watch time, click tracking
- Search — Web search, product search, autocomplete
- Transactions — Payments, inventory, booking
Master these seven. Then every variation becomes tractable.
Pattern 1: Read-Heavy with Caching
Question 1: Design Twitter / Instagram feed
Requirements clarification questions:
- Read-to-write ratio? (Answer: 100:1 read-heavy)
- Do users follow celebrities with millions of followers? (Yes — this changes everything)
- How recent does the feed need to be? (15 min staleness is usually fine)
- Do we need to support media (images, videos)?
Scale estimation:
- 500M daily active users
- Each user reads feed ~5x/day = 2.5B feed reads/day = ~30K QPS
- Each user posts 0.5 times/day = 250M writes/day = ~3K QPS
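Back-of-envelope math: 2.5B reads / 86,400 seconds ≈ 29K QPS average (round to 30K, and expect peaks a few times higher), and 250M writes / 86,400 seconds ≈ 3K QPS.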
High-level design:
- Write path: User posts tweet → Tweet service → Fanout service → Insert into each follower's timeline cache
- Read path: User opens app → Feed service → Read pre-computed timeline from Redis → Return
Key insight: Fanout-on-write vs fanout-on-read
- Fanout-on-write (push): When user tweets, push to all followers' timelines. Fast reads, slow writes for celebrities.
- Fanout-on-read (pull): When user opens feed, query recent tweets from everyone they follow. Fast writes, slow reads.
- Hybrid: Push for most users. For celebrities (> 10K followers), pull on read.
Deep dive: Timeline cache
- Redis sorted set per user: members are tweet IDs, scores are post timestamps (see the sketch after this list)
- Max 800 tweets per user (covers ~2 weeks for avg user)
- Background job evicts after 2 weeks
- Use consistent hashing to shard by user_id
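A minimal sketch of those cache operations with the redis-py client (key names, the 800-entry cap, and the two-week cutoff follow the bullets above; sharding and error handling are omitted):

```python
import time
import redis

r = redis.Redis()  # a single local instance here; production shards by user_id

MAX_TIMELINE_LEN = 800
TWO_WEEKS = 14 * 24 * 3600

def fan_out_tweet(tweet_id: str, follower_ids: list[str]) -> None:
    """Push a new tweet into each follower's timeline (fanout-on-write path)."""
    now = time.time()
    pipe = r.pipeline()
    for follower_id in follower_ids:
        key = f"timeline:{follower_id}"
        pipe.zadd(key, {tweet_id: now})                          # score = post timestamp
        pipe.zremrangebyrank(key, 0, -(MAX_TIMELINE_LEN + 1))    # keep only the newest 800
    pipe.execute()

def read_timeline(user_id: str, count: int = 50) -> list[bytes]:
    """Read the pre-computed timeline, newest first (fast read path)."""
    return r.zrevrange(f"timeline:{user_id}", 0, count - 1)

def evict_old_entries(user_id: str) -> None:
    """Background job: drop anything older than two weeks."""
    cutoff = time.time() - TWO_WEEKS
    r.zremrangebyscore(f"timeline:{user_id}", "-inf", cutoff)
```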
Trade-offs:
- Celebrity problem: Taylor Swift tweeting causes 100M fanouts
- Consistency: You might see a friend's tweet 30 sec after they post
- Cache eviction strategy matters — LRU vs TTL
Question 2: Design Reddit / Hacker News
Question 3: Design YouTube (video delivery part)
Question 4: Design Netflix homepage
All use the same "read-heavy with caching" pattern. Differences:
- YouTube: CDN for video files, not DB caching
- Netflix: Personalized recommendations = more complex cache keys
- Reddit: Votes require eventually-consistent counters
Pattern 2: Write-Heavy with Fanout
Question 5: Design a notification system
Requirements:
- Push, email, SMS channels
- User preferences (which channel, which notifications)
- Delivery guarantees (at-least-once for financial, best-effort for social)
- Rate limiting per user
Scale:
- 100M DAU, avg 5 notifications/user/day = 500M notifications/day = ~6K QPS
Architecture:
Source Services (feed, chat, etc.)
↓
Kafka (pub-sub)
↓
Notification Processor
↓
[User preferences DB]
↓
Channel Dispatchers
├── Push (APNs/FCM)
├── Email (SendGrid)
└── SMS (Twilio)
Deep dive: deduplication
- Use a Redis Bloom filter; add the composite key (user_id, event_id) for every notification sent (sketch below)
- TTL 24 hours — catches retry storms
- Cost: roughly 1.2 MB of RAM for 1M events at a 1% false-positive rate
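A sketch of that dedup check, assuming the RedisBloom module is loaded on the server (BF.ADD returns 1 only the first time an item is added, so a 0 means "probably already sent"; the key naming and daily rotation are illustrative):

```python
import redis

r = redis.Redis()

def _filter_key(day: str) -> str:
    return f"notif_dedup:{day}"

def ensure_filter(day: str) -> None:
    """One filter per day, sized for ~1M events at a 1% false-positive rate (~1.2 MB)."""
    key = _filter_key(day)
    if not r.exists(key):
        r.execute_command("BF.RESERVE", key, 0.01, 1_000_000)
        r.expire(key, 24 * 3600)  # expiring the whole key stands in for per-item TTL

def should_send(user_id: str, event_id: str, day: str) -> bool:
    """True if this (user, event) pair has probably not been seen today."""
    ensure_filter(day)
    added = r.execute_command("BF.ADD", _filter_key(day), f"{user_id}:{event_id}")
    return added == 1
```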
Trade-offs:
- SMS costs real money — rate limit aggressively
- APNs caps notification payloads at 4 KB
- User preference checks are the bottleneck — cache them
Question 6: Design Pinterest (pin fanout)
Question 7: Design an activity feed
Pattern 3: Real-Time Messaging
Question 8: Design WhatsApp / Slack / Discord
Requirements:
- 1-on-1 and group messaging
- Message delivery (sent, delivered, read receipts)
- Online presence
- Media attachments
- Message history
Scale:
- 1B users, avg 10 messages/user/day = 10B messages/day = ~100K QPS
Architecture:
Mobile Client
↓ (WebSocket)
Connection Service (stateful, sticky sessions)
↓
Chat Service
↓
Message DB (Cassandra, partitioned by conversation_id)
↓
Media Service (S3 + CDN for attachments)
Deep dive: Connection routing
- Each user connects to one "connection server" via WebSocket
- Redis hash map: user_id → connection_server_id
- When Alice sends to Bob: look up Bob's connection server, route there
- If Bob is offline, store message, deliver on reconnect
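A sketch of that lookup-and-route step (the hash name and the per-server outbox lists are illustrative; a real system forwards over an internal RPC or message bus rather than Redis lists):

```python
import json
import redis

r = redis.Redis()

ROUTING_KEY = "user_connections"  # hash: user_id -> connection_server_id

def register_connection(user_id: str, server_id: str) -> None:
    """Called by a connection server when a user's WebSocket comes up."""
    r.hset(ROUTING_KEY, user_id, server_id)

def unregister_connection(user_id: str) -> None:
    r.hdel(ROUTING_KEY, user_id)

def deliver(recipient_id: str, message: dict) -> None:
    """Look up the recipient's connection server and route the message there."""
    server_id = r.hget(ROUTING_KEY, recipient_id)
    if server_id is None:
        # Offline: queue for delivery on reconnect (stand-in for the message DB write)
        r.rpush(f"offline:{recipient_id}", json.dumps(message))
    else:
        # Online: hand off to the connection server that holds the WebSocket
        r.rpush(f"outbox:{server_id.decode()}", json.dumps({"to": recipient_id, **message}))
```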
Deep dive: Group chat (fanout)
- For groups < 100: push to each member's connection server
- For groups > 100 (Slack channels): pull-based. Client polls.
- For groups > 10K (Discord servers): pub-sub via Redis streams
Trade-offs:
- Message ordering within a conversation — use client-generated timestamps + conflict resolution
- Read receipts at scale — compress and batch
- Media files — separate from message metadata
Question 9: Design collaborative editing (Google Docs)
Use CRDT or OT for conflict resolution. This is advanced — usually only asked at L6+.
Pattern 4: Geospatial Queries
Question 10: Design Uber / Lyft
Requirements:
- Match riders with nearby drivers (< 5 miles, < 10 sec)
- Real-time driver location updates
- Surge pricing
- Trip history
Scale:
- 1M active drivers, each updating location every 5 sec
- = 200K QPS on location updates
- Rider → driver match: 100K QPS at peak
Key insight: Geospatial indexing
You need a spatial index to answer "find drivers within 5 miles of (lat, lng)" efficiently. Options:
- Geohash: Encode lat/lng into a string. Prefix match = nearby area.
- Quadtree: Recursive spatial partitioning
- Google S2: Hilbert curve on a sphere; Google's cell-based spatial indexing library
- H3: Uber's own hexagonal hierarchical spatial index
For interview purposes, geohash is easiest to explain.
Architecture:
Write path (driver):
Driver App → Location Update Service
↓
Redis (geohash-indexed)

Read path (rider):
Rider Request → Matching Service → query Redis for drivers in region
↓
ETA Service (Google Maps or internal)
↓
Match confirmation
Deep dive: Location update at 200K QPS
- Write to Redis using its in-memory geospatial commands (GEOADD, GEORADIUS; see the sketch after this list)
- Don't persist every update to disk — too expensive
- Persist every N seconds or on ride completion
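A sketch of that hot path with redis-py's geospatial commands (redis-py 4.x signatures; the key name and the 20-driver cap are illustrative):

```python
import redis

r = redis.Redis()

DRIVERS_KEY = "driver_locations"

def update_location(driver_id: str, lng: float, lat: float) -> None:
    """Called every ~5 seconds per driver; the index lives entirely in memory."""
    r.geoadd(DRIVERS_KEY, (lng, lat, driver_id))

def nearby_drivers(lng: float, lat: float, radius_miles: float = 5.0) -> list:
    """Candidate drivers within the radius, closest first, with distances."""
    return r.georadius(DRIVERS_KEY, lng, lat, radius_miles,
                       unit="mi", withdist=True, sort="ASC", count=20)
```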
Trade-offs:
- Geohash edge case: drivers near cell boundaries. Solution: query 9 cells (center + 8 neighbors)
- Location accuracy vs update cost: 5-sec updates are usually enough
- Surge pricing is a separate system entirely — don't conflate
Question 11: Design Yelp / restaurant search
Question 12: Design DoorDash delivery matching
Pattern 5: Analytics Pipelines
Question 13: Design a URL shortener (bit.ly, tinyurl)
This is THE most common system design question. Nail this one.
Requirements:
- Generate short URL from long URL
- Redirect from short → long
- Analytics (clicks per short URL)
- Custom aliases (optional)
- Expiration (optional)
Scale:
- 100M URLs created/day = ~1K QPS writes
- 10B clicks/day = ~100K QPS reads (read-heavy)
High-level design:
Create:
POST /shorten {url} → Short URL service → DB → Return short URL
Redirect:
GET /{short} → Redirect service → Cache → DB → 302 redirect
Deep dive: Short URL generation
Three approaches:
- Hash-based (first 6 chars of MD5): collisions, and the same long URL always produces the same short URL (bad for privacy)
- Counter-based (encode incrementing integer to base62): Predictable, enables scraping
- Random generation + collision check: Most common. Generate 7-char random base62, check DB for collision (rare).
Go with approach 3.
Base62 (a-zA-Z0-9) with 7 chars = 62^7 = 3.5 trillion URLs. Enough for years.
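A sketch of approach 3 (an in-memory dict stands in for the database; the real collision check is an INSERT guarded by a unique constraint):

```python
import secrets

BASE62 = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
CODE_LEN = 7

url_store: dict[str, str] = {}  # stand-in for the DB table: short_code -> long_url

def shorten(long_url: str) -> str:
    while True:
        code = "".join(secrets.choice(BASE62) for _ in range(CODE_LEN))
        if code not in url_store:       # collisions are rare with a 62^7 keyspace
            url_store[code] = long_url
            return code

def resolve(code: str) -> str | None:
    """Cache + DB lookup behind the 302 in production."""
    return url_store.get(code)
```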
Deep dive: Click analytics
Don't update the DB row on every click (write amplification). Instead:
Click → Redirect service (fast 302) → Async Kafka event → Analytics service → Time-series DB
Aggregate clicks by hour/day. Batch updates to main DB.
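A sketch of the redirect-side hot path, assuming kafka-python as the client (topic name and event fields are illustrative):

```python
import json
import time
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def handle_redirect(short_code: str, long_url: str) -> str:
    """Return the redirect target immediately; analytics happens off the hot path."""
    producer.send("url_clicks", {"code": short_code, "ts": int(time.time())})
    return long_url  # the web framework turns this into a 302
```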
Trade-offs:
- Consistency: a just-created URL may not appear in the cache for the first ~100ms
- Expired URLs: batch job or TTL at DB level
- Custom aliases: check for duplicates at write time
Question 14: Design Google Analytics
Question 15: Design a metrics collection system
Pattern 6: Search
Question 16: Design Twitter search
Question 17: Design autocomplete (Google search bar)
Core concepts:
- Trie for prefix matching
- Top-K heap for popular queries
- Distributed inverted index
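A small sketch of the first two ideas combined: a trie whose nodes cache their top-K queries (in production these per-node lists are rebuilt offline from aggregated query logs; the distributed inverted index is out of scope here):

```python
import heapq

class TrieNode:
    def __init__(self):
        self.children: dict[str, "TrieNode"] = {}
        self.top: list[tuple[int, str]] = []   # min-heap of (count, query), size <= k

class Autocomplete:
    def __init__(self, k: int = 5):
        self.root = TrieNode()
        self.k = k

    def add_query(self, query: str, count: int) -> None:
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
            heapq.heappush(node.top, (count, query))
            if len(node.top) > self.k:          # keep only the k most popular at this prefix
                heapq.heappop(node.top)

    def suggest(self, prefix: str) -> list[str]:
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        return [q for _, q in sorted(node.top, reverse=True)]

# ac = Autocomplete(); ac.add_query("system design interview", 300)
# ac.suggest("sys")  -> most popular completions for the prefix
```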
Question 18: Design product search (Amazon, Shopify)
Pattern 7: Transactions
Question 19: Design a payment system (Stripe)
Requirements:
- Accept card payments
- Idempotency (don't charge twice for same request)
- Refunds
- Webhooks to merchants
- PCI compliance
Scale:
- 100K transactions/sec at peak (Black Friday)
- Must be strongly consistent (no double-charge)
Key patterns:
- Idempotency keys: Client sends unique key. If we see the same key twice in 24hr, return cached response.
- Two-phase commit for transaction state machine
- Event sourcing: Every state change is an immutable event. The DB is a projection.
Architecture:
Merchant API → Payment Service → [Idempotency cache]
↓
State machine (initiated → captured → settled)
↓
Payment processor (Visa/MC network)
↓
Event bus → Webhook dispatcher → Merchant
Deep dive: Idempotency
POST /charge { amount: 100, idempotency_key: "abc123" }
Server:
1. Acquire lock on idempotency_key
2. Check Redis: seen_keys[abc123]?
- If exists: return cached response
- If not: process charge
3. Cache response in Redis (TTL 24h)
4. Release lock
Without this, a network retry causes double-charge.
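A minimal sketch of those four steps with redis-py (key names, TTLs, and the process_charge stub are illustrative; a real implementation also handles lock expiry and requests still in flight):

```python
import json
import redis

r = redis.Redis()
RESPONSE_TTL = 24 * 3600

def process_charge(amount: int) -> dict:
    """Stand-in for the real call to the payment processor."""
    return {"status": "captured", "amount": amount}

def charge(amount: int, idempotency_key: str) -> dict:
    with r.lock(f"lock:{idempotency_key}", timeout=10):        # step 1: acquire lock
        cached = r.get(f"idem:{idempotency_key}")              # step 2: seen before?
        if cached is not None:
            return json.loads(cached)                          # replay the original response
        response = process_charge(amount)                      # otherwise process the charge
        r.set(f"idem:{idempotency_key}", json.dumps(response), ex=RESPONSE_TTL)  # step 3: cache 24h
        return response                                        # step 4: lock released on exit
```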
Trade-offs:
- Strong consistency costs latency. Accept 200-500ms for payment APIs.
- Multi-region becomes hard — most payment systems are single-region with read replicas
- Card data — tokenize and store in PCI-compliant vault, not main DB
Question 20: Design a rate limiter
Classic at all levels. Algorithms:
- Fixed window: Simple, but bursty at window boundary
- Sliding window log: Accurate, expensive (stores every request)
- Sliding window counter: Approximation, low overhead
- Token bucket: Smooth traffic, used by AWS API Gateway
- Leaky bucket: Smooth traffic, simpler than token bucket
Use Redis INCR with a TTL for a fixed window (sketch below). For more accuracy, implement a token bucket as a Redis Lua script.
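A fixed-window sketch along those lines (limits and window size are illustrative; the token-bucket variant would move the read-modify-write into a Lua script so it stays atomic):

```python
import time
import redis

r = redis.Redis()

def allow_request(user_id: str, limit: int = 100, window_secs: int = 60) -> bool:
    """Fixed-window limiter: at most `limit` requests per user per window."""
    window = int(time.time()) // window_secs
    key = f"ratelimit:{user_id}:{window}"
    pipe = r.pipeline()
    pipe.incr(key)
    pipe.expire(key, window_secs)    # the counter disappears after the window closes
    count, _ = pipe.execute()
    return count <= limit
```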
The Secret: Interviewer's Scoring Rubric
You're graded on 5 dimensions:
- Requirements gathering — Did you clarify before designing?
- Scale estimation — Can you estimate back-of-envelope?
- Architecture — Did you choose appropriate patterns?
- Deep dive — Can you go 2-3 levels deep on 1-2 components?
- Trade-offs — Can you articulate alternatives and why you chose?
Most candidates focus on #3. The differentiator between hire/no-hire is #4 and #5.
Week-by-Week Prep Plan
- Week 1: Read Designing Data-Intensive Applications chapters 1-6
- Week 2: Pattern 1 + 2 (caching, fanout) — 5 practice questions
- Week 3: Pattern 3 + 4 (messaging, geospatial) — 5 practice questions
- Week 4: Pattern 5 + 6 + 7 — 5 practice questions
- Week 5-6: 10 mock interviews (interviewing.io or peers)
Do not skip mock interviews. Reading != being able to answer under time pressure.
The 3-Hour Night-Before Protocol
If your interview is tomorrow:
- (1 hour) Review the 7 patterns. Memorize one canonical example for each.
- (1 hour) Practice the framework: requirements → scale → architecture → deep dive → trade-offs
- (1 hour) Sketch 3 systems on paper, out loud, with a timer
Sleep. Don't learn new material. Don't cram.
Get system design questions tailored to your target company and seniority: HiredPathway. Paste any job URL, get 25+ questions specific to your role. 3 free, no card needed.