20 System Design Interview Questions (With Complete Walkthroughs)
Updated April 2026. Estimated reading time: 18 minutes.
System design interviews are where most senior engineering candidates crash. Not because they lack knowledge — because they've memorized answers instead of learning the underlying patterns.
This guide gives you the 20 most common system design questions, organized by pattern, with complete walkthroughs that translate to any variation.
The Framework (Memorize This)
Every system design interview, regardless of the question, should follow this structure. Here is how to distribute a 45-minute slot:
| Phase | Time | What you do |
|-------|------|-------------|
| Clarify requirements | 5 min | Functional + non-functional. Ask, don't assume |
| Estimate scale | 5 min | QPS, storage, bandwidth (back-of-envelope) |
| High-level design | 10 min | Draw the architecture. Services + data flow |
| Deep dive | 15 min | Focus on 1-2 critical components |
| Trade-offs | 5 min | Discuss alternatives and why you chose what you chose |
| Wrap up | 5 min | Answer follow-up questions about scale, failure modes |
The #1 mistake: jumping to architecture diagrams at minute 2. You'll miss requirements and score low on collaboration.
The 7 Core Patterns
Every system design question is a variation of one or more of these seven patterns:
- Read-heavy with caching — Feeds, profiles, product pages
- Write-heavy with fan-out — Notifications, activity streams
- Real-time messaging — Chat, collaborative editing
- Geospatial queries — Uber, Yelp, DoorDash
- Analytics pipelines — Watch time, click tracking
- Search — Web search, product search, autocomplete
- Transactions — Payments, inventory, booking
Master these seven. Then every variation becomes tractable.
Pattern 1: Read-Heavy with Caching
Question 1: Design Twitter / Instagram feed
Requirements clarification questions:
- Read-to-write ratio? (Answer: 100:1 read-heavy)
- Do users follow celebrities with millions of followers? (Yes — this changes everything)
- How recent does the feed need to be? (15 min staleness is usually fine)
- Do we need to support media (images, videos)?
Scale estimation:
- 500M daily active users
- Each user reads feed ~5x/day = 2.5B feed reads/day = ~30K QPS
- Each user posts 0.5 times/day = 250M writes/day = ~3K QPS
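Back-of-envelope math: 2.5B reads / 86,400 seconds ≈ 29K QPS average (round to 30K, and expect peaks a few times higher), and 250M writes / 86,400 seconds ≈ 3K QPS.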
High-level design:
- Write path: User posts tweet → Tweet service → Fanout service → Insert into each follower's timeline cache
- Read path: User opens app → Feed service → Read pre-computed timeline from Redis → Return
Key insight: Fanout-on-write vs fanout-on-read
- Fanout-on-write (push): When user tweets, push to all followers' timelines. Fast reads, slow writes for celebrities.
- Fanout-on-read (pull): When user opens feed, query recent tweets from everyone they follow. Fast writes, slow reads.
- Hybrid: Push for most users. For celebrities (> 10K followers), pull on read.
Deep dive: Timeline cache
- Redis sorted set per user: members are tweet IDs, scores are post timestamps (see the sketch after this list)
- Max 800 tweets per user (covers ~2 weeks for avg user)
- Background job evicts after 2 weeks
- Use consistent hashing to shard by user_id
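A minimal sketch of those cache operations with the redis-py client (key names, the 800-entry cap, and the two-week cutoff follow the bullets above; sharding and error handling are omitted):

```python
import time
import redis

r = redis.Redis()  # a single local instance here; production shards by user_id

MAX_TIMELINE_LEN = 800
TWO_WEEKS = 14 * 24 * 3600

def fan_out_tweet(tweet_id: str, follower_ids: list[str]) -> None:
    """Push a new tweet into each follower's timeline (fanout-on-write path)."""
    now = time.time()
    pipe = r.pipeline()
    for follower_id in follower_ids:
        key = f"timeline:{follower_id}"
        pipe.zadd(key, {tweet_id: now})                          # score = post timestamp
        pipe.zremrangebyrank(key, 0, -(MAX_TIMELINE_LEN + 1))    # keep only the newest 800
    pipe.execute()

def read_timeline(user_id: str, count: int = 50) -> list[bytes]:
    """Read the pre-computed timeline, newest first (fast read path)."""
    return r.zrevrange(f"timeline:{user_id}", 0, count - 1)

def evict_old_entries(user_id: str) -> None:
    """Background job: drop anything older than two weeks."""
    cutoff = time.time() - TWO_WEEKS
    r.zremrangebyscore(f"timeline:{user_id}", "-inf", cutoff)
```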
Trade-offs:
- Celebrity problem: Taylor Swift tweeting causes 100M fanouts
- Consistency: You might see a friend's tweet 30 sec after they post
- Cache eviction strategy matters — LRU vs TTL
Question 2: Design Reddit / Hacker News
Question 3: Design YouTube (video delivery part)
Question 4: Design Netflix homepage
All use the same "read-heavy with caching" pattern. Differences:
- YouTube: CDN for video files, not DB caching
- Netflix: Personalized recommendations = more complex cache keys
- Reddit: Votes require eventually-consistent counters
Pattern 2: Write-Heavy with Fanout
Question 5: Design a notification system
Requirements:
- Push, email, SMS channels
- User preferences (which channel, which notifications)
- Delivery guarantees (at-least-once for financial, best-effort for social)
- Rate limiting per user
Scale:
- 100M DAU, avg 5 notifications/user/day = 500M notifications/day = ~6K QPS
Architecture:
Source Services (feed, chat, etc.)
↓
Kafka (pub-sub)
↓
Notification Processor
↓
[User preferences DB]
↓
Channel Dispatchers
├── Push (APNs/FCM)
├── Email (SendGrid)
└── SMS (Twilio)
Deep dive: deduplication
- Use a Redis Bloom filter; add the composite key (user_id, event_id) for every notification sent (sketch below)
- TTL 24 hours — catches retry storms
- Cost: roughly 1.2 MB of RAM for 1M events at a 1% false-positive rate
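A sketch of that dedup check, assuming the RedisBloom module is loaded on the server (BF.ADD returns 1 only the first time an item is added, so a 0 means "probably already sent"; the key naming and daily rotation are illustrative):

```python
import redis

r = redis.Redis()

def _filter_key(day: str) -> str:
    return f"notif_dedup:{day}"

def ensure_filter(day: str) -> None:
    """One filter per day, sized for ~1M events at a 1% false-positive rate (~1.2 MB)."""
    key = _filter_key(day)
    if not r.exists(key):
        r.execute_command("BF.RESERVE", key, 0.01, 1_000_000)
        r.expire(key, 24 * 3600)  # expiring the whole key stands in for per-item TTL

def should_send(user_id: str, event_id: str, day: str) -> bool:
    """True if this (user, event) pair has probably not been seen today."""
    ensure_filter(day)
    added = r.execute_command("BF.ADD", _filter_key(day), f"{user_id}:{event_id}")
    return added == 1
```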
Trade-offs:
- SMS costs real money — rate limit aggressively
- APNs caps notification payloads at 4 KB
- User preference checks are the bottleneck — cache them
Question 6: Design Pinterest (pin fanout)
Question 7: Design an activity feed
Pattern 3: Real-Time Messaging
Question 8: Design WhatsApp / Slack / Discord
Requirements:
- 1-on-1 and group messaging
- Message delivery (sent, delivered, read receipts)
- Online presence
- Media attachments
- Message history
Scale:
- 1B users, avg 10 messages/user/day = 10B messages/day = ~100K QPS
Architecture:
Mobile Client
↓ (WebSocket)
Connection Service (stateful, sticky sessions)
↓
Chat Service
↓
Message DB (Cassandra, partitioned by conversation_id)
↓
Media Service (S3 + CDN for attachments)
Deep dive: Connection routing
- Each user connects to one "connection server" via WebSocket
- Redis hash map: user_id → connection_server_id
- When Alice sends to Bob: look up Bob's connection server, route there
- If Bob is offline, store message, deliver on reconnect
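A sketch of that lookup-and-route step (the hash name and the per-server outbox lists are illustrative; a real system forwards over an internal RPC or message bus rather than Redis lists):

```python
import json
import redis

r = redis.Redis()

ROUTING_KEY = "user_connections"  # hash: user_id -> connection_server_id

def register_connection(user_id: str, server_id: str) -> None:
    """Called by a connection server when a user's WebSocket comes up."""
    r.hset(ROUTING_KEY, user_id, server_id)

def unregister_connection(user_id: str) -> None:
    r.hdel(ROUTING_KEY, user_id)

def deliver(recipient_id: str, message: dict) -> None:
    """Look up the recipient's connection server and route the message there."""
    server_id = r.hget(ROUTING_KEY, recipient_id)
    if server_id is None:
        # Offline: queue for delivery on reconnect (stand-in for the message DB write)
        r.rpush(f"offline:{recipient_id}", json.dumps(message))
    else:
        # Online: hand off to the connection server that holds the WebSocket
        r.rpush(f"outbox:{server_id.decode()}", json.dumps({"to": recipient_id, **message}))
```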
Deep dive: Group chat (fanout)
- For groups < 100: push to each member's connection server
- For groups > 100 (Slack channels): pull-based. Client polls.
- For groups > 10K (Discord servers): pub-sub via Redis streams
Trade-offs:
- Message ordering within a conversation — use client-generated timestamps + conflict resolution
- Read receipts at scale — compress and batch
- Media files — separate from message metadata
Question 9: Design collaborative editing (Google Docs)
Use CRDT or OT for conflict resolution. This is advanced — usually only asked at L6+.
Pattern 4: Geospatial Queries
Question 10: Design Uber / Lyft
Requirements:
- Match riders with nearby drivers (< 5 miles, < 10 sec)
- Real-time driver location updates
- Surge pricing
- Trip history
Scale:
- 1M active drivers, each updating location every 5 sec
- = 200K QPS on location updates
- Rider → driver match: 100K QPS at peak
Key insight: Geospatial indexing
You need a spatial index to answer "find drivers within 5 miles of (lat, lng)" efficiently. Options:
- Geohash: Encode lat/lng into a string. Prefix match = nearby area.
- Quadtree: Recursive spatial partitioning
- Google S2: Hilbert curve on a sphere; Google's cell-based spatial indexing library
- H3: Uber's own hexagonal hierarchical spatial index
For interview purposes, geohash is easiest to explain.
Architecture:
Write path (driver):
Driver App → Location Update Service
↓
Redis (geohash-indexed)

Read path (rider):
Rider Request → Matching Service → query Redis for drivers in region
↓
ETA Service (Google Maps or internal)
↓
Match confirmation
Deep dive: Location update at 200K QPS
- Write to Redis using its in-memory geospatial commands (GEOADD, GEORADIUS; see the sketch after this list)
- Don't persist every update to disk — too expensive
- Persist every N seconds or on ride completion
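A sketch of that hot path with redis-py's geospatial commands (redis-py 4.x signatures; the key name and the 20-driver cap are illustrative):

```python
import redis

r = redis.Redis()

DRIVERS_KEY = "driver_locations"

def update_location(driver_id: str, lng: float, lat: float) -> None:
    """Called every ~5 seconds per driver; the index lives entirely in memory."""
    r.geoadd(DRIVERS_KEY, (lng, lat, driver_id))

def nearby_drivers(lng: float, lat: float, radius_miles: float = 5.0) -> list:
    """Candidate drivers within the radius, closest first, with distances."""
    return r.georadius(DRIVERS_KEY, lng, lat, radius_miles,
                       unit="mi", withdist=True, sort="ASC", count=20)
```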
Trade-offs:
- Geohash edge case: drivers near cell boundaries. Solution: query 9 cells (center + 8 neighbors)
- Location accuracy vs update cost: 5-sec updates are usually enough
- Surge pricing is a separate system entirely — don't conflate
Question 11: Design Yelp / restaurant search
Question 12: Design DoorDash delivery matching
Pattern 5: Analytics Pipelines
Question 13: Design a URL shortener (bit.ly, tinyurl)
This is THE most common system design question. Nail this one.
Requirements:
- Generate short URL from long URL
- Redirect from short → long
- Analytics (clicks per short URL)
- Custom aliases (optional)
- Expiration (optional)
Scale:
- 100M URLs created/day = ~1K QPS writes
- 10B clicks/day = ~100K QPS reads (read-heavy)
High-level design:
Create:
POST /shorten {url} → Short URL service → DB → Return short URL
Redirect:
GET /{short} → Redirect service → Cache → DB → 302 redirect
Deep dive: Short URL generation
Three approaches:
- Hash-based (first 6 chars of MD5): collisions, and the same long URL always produces the same short URL (bad for privacy)
- Counter-based (encode incrementing integer to base62): Predictable, enables scraping
- Random generation + collision check: Most common. Generate 7-char random base62, check DB for collision (rare).
Go with approach 3.
Base62 (a-zA-Z0-9) with 7 chars = 62^7 = 3.5 trillion URLs. Enough for years.
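A sketch of approach 3 (an in-memory dict stands in for the database; the real collision check is an INSERT guarded by a unique constraint):

```python
import secrets

BASE62 = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
CODE_LEN = 7

url_store: dict[str, str] = {}  # stand-in for the DB table: short_code -> long_url

def shorten(long_url: str) -> str:
    while True:
        code = "".join(secrets.choice(BASE62) for _ in range(CODE_LEN))
        if code not in url_store:       # collisions are rare with a 62^7 keyspace
            url_store[code] = long_url
            return code

def resolve(code: str) -> str | None:
    """Cache + DB lookup behind the 302 in production."""
    return url_store.get(code)
```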
Deep dive: Click analytics
Don't update the DB row on every click (write amplification). Instead:
Click → Redirect service (fast 302) → Async Kafka event → Analytics service → Time-series DB
Aggregate clicks by hour/day. Batch updates to main DB.
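A sketch of the redirect-side hot path, assuming kafka-python as the client (topic name and event fields are illustrative):

```python
import json
import time
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def handle_redirect(short_code: str, long_url: str) -> str:
    """Return the redirect target immediately; analytics happens off the hot path."""
    producer.send("url_clicks", {"code": short_code, "ts": int(time.time())})
    return long_url  # the web framework turns this into a 302
```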
Trade-offs:
- Consistency: a just-created URL may not appear in the cache for the first ~100ms
- Expired URLs: batch job or TTL at DB level
- Custom aliases: check for duplicates at write time
Question 14: Design Google Analytics
Question 15: Design a metrics collection system
Pattern 6: Search
Question 16: Design Twitter search
Question 17: Design autocomplete (Google search bar)
Core concepts:
- Trie for prefix matching
- Top-K heap for popular queries
- Distributed inverted index
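A small sketch of the first two ideas combined: a trie whose nodes cache their top-K queries (in production these per-node lists are rebuilt offline from aggregated query logs; the distributed inverted index is out of scope here):

```python
import heapq

class TrieNode:
    def __init__(self):
        self.children: dict[str, "TrieNode"] = {}
        self.top: list[tuple[int, str]] = []   # min-heap of (count, query), size <= k

class Autocomplete:
    def __init__(self, k: int = 5):
        self.root = TrieNode()
        self.k = k

    def add_query(self, query: str, count: int) -> None:
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
            heapq.heappush(node.top, (count, query))
            if len(node.top) > self.k:          # keep only the k most popular at this prefix
                heapq.heappop(node.top)

    def suggest(self, prefix: str) -> list[str]:
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        return [q for _, q in sorted(node.top, reverse=True)]

# ac = Autocomplete(); ac.add_query("system design interview", 300)
# ac.suggest("sys")  -> most popular completions for the prefix
```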
Question 18: Design product search (Amazon, Shopify)
Pattern 7: Transactions
Question 19: Design a payment system (Stripe)
Requirements:
- Accept card payments
- Idempotency (don't charge twice for same request)
- Refunds
- Webhooks to merchants
- PCI compliance
Scale:
- 100K transactions/sec at peak (Black Friday)
- Must be strongly consistent (no double-charge)
Key patterns:
- Idempotency keys: Client sends unique key. If we see the same key twice in 24hr, return cached response.
- Two-phase commit for transaction state machine
- Event sourcing: Every state change is an immutable event. The DB is a projection.
Architecture:
Merchant API → Payment Service → [Idempotency cache]
↓
State machine (initiated → captured → settled)
↓
Payment processor (Visa/MC network)
↓
Event bus → Webhook dispatcher → Merchant
Deep dive: Idempotency
POST /charge { amount: 100, idempotency_key: "abc123" }
Server:
1. Acquire lock on idempotency_key
2. Check Redis: seen_keys[abc123]?
- If exists: return cached response
- If not: process charge
3. Cache response in Redis (TTL 24h)
4. Release lock
Without this, a network retry causes double-charge.
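A minimal sketch of those four steps with redis-py (key names, TTLs, and the process_charge stub are illustrative; a real implementation also handles lock expiry and requests still in flight):

```python
import json
import redis

r = redis.Redis()
RESPONSE_TTL = 24 * 3600

def process_charge(amount: int) -> dict:
    """Stand-in for the real call to the payment processor."""
    return {"status": "captured", "amount": amount}

def charge(amount: int, idempotency_key: str) -> dict:
    with r.lock(f"lock:{idempotency_key}", timeout=10):        # step 1: acquire lock
        cached = r.get(f"idem:{idempotency_key}")              # step 2: seen before?
        if cached is not None:
            return json.loads(cached)                          # replay the original response
        response = process_charge(amount)                      # otherwise process the charge
        r.set(f"idem:{idempotency_key}", json.dumps(response), ex=RESPONSE_TTL)  # step 3: cache 24h
        return response                                        # step 4: lock released on exit
```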
Trade-offs:
- Strong consistency costs latency. Accept 200-500ms for payment APIs.
- Multi-region becomes hard — most payment systems are single-region with read replicas
- Card data — tokenize and store in PCI-compliant vault, not main DB
Question 20: Design a rate limiter
Classic at all levels. Algorithms:
- Fixed window: Simple, but bursty at window boundary
- Sliding window log: Accurate, expensive (stores every request)
- Sliding window counter: Approximation, low overhead
- Token bucket: Smooth traffic, used by AWS API Gateway
- Leaky bucket: Smooth traffic, simpler than token bucket
Use Redis INCR with a TTL for a fixed window (sketch below). For more accuracy, implement a token bucket as a Redis Lua script.
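A fixed-window sketch along those lines (limits and window size are illustrative; the token-bucket variant would move the read-modify-write into a Lua script so it stays atomic):

```python
import time
import redis

r = redis.Redis()

def allow_request(user_id: str, limit: int = 100, window_secs: int = 60) -> bool:
    """Fixed-window limiter: at most `limit` requests per user per window."""
    window = int(time.time()) // window_secs
    key = f"ratelimit:{user_id}:{window}"
    pipe = r.pipeline()
    pipe.incr(key)
    pipe.expire(key, window_secs)    # the counter disappears after the window closes
    count, _ = pipe.execute()
    return count <= limit
```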
The Secret: Interviewer's Scoring Rubric
You're graded on 5 dimensions:
- Requirements gathering — Did you clarify before designing?
- Scale estimation — Can you estimate back-of-envelope?
- Architecture — Did you choose appropriate patterns?
- Deep dive — Can you go 2-3 levels deep on 1-2 components?
- Trade-offs — Can you articulate alternatives and why you chose?
Most candidates focus on #3. The differentiator between hire/no-hire is #4 and #5.
Week-by-Week Prep Plan
- Week 1: Read Designing Data-Intensive Applications chapters 1-6
- Week 2: Pattern 1 + 2 (caching, fanout) — 5 practice questions
- Week 3: Pattern 3 + 4 (messaging, geospatial) — 5 practice questions
- Week 4: Pattern 5 + 6 + 7 — 5 practice questions
- Week 5-6: 10 mock interviews (interviewing.io or peers)
Do not skip mock interviews. Reading != being able to answer under time pressure.
The 3-Hour Night-Before Protocol
If your interview is tomorrow:
- (1 hour) Review the 7 patterns. Memorize one canonical example for each.
- (1 hour) Practice the framework: requirements → scale → architecture → deep dive → trade-offs
- (1 hour) Sketch 3 systems on paper, out loud, with a timer
Sleep. Don't learn new material. Don't cram.
Get system design questions tailored to your target company and seniority: HiredPathway. Paste any job URL, get 25+ questions specific to your role. 3 free, no card needed.