System Design 101: How Every App Handles Millions of Users
Netflix, Uber, Discord, Google — they all follow the same pattern. One click. Seven layers. Under 100 milliseconds.
Scroll to explore ↓
Every app you use — Netflix, Uber, Instagram, Google — follows the same fundamental architecture. A request leaves your browser, hops through DNS servers, CDN edge nodes, load balancers, and API gateways before hitting a database or cache and returning with your data. The entire journey takes under 100 milliseconds.
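That hop-by-hop journey can be turned into a back-of-envelope latency budget. The figures below are illustrative orders of magnitude, not measurements from any particular app:

```python
# Rough per-hop latency budget for one request (illustrative typical
# values -- real numbers vary by provider, region, and cache state).
HOPS_MS = {
    "DNS lookup (cached at resolver)": 1,
    "TLS handshake (resumed session)": 5,
    "CDN edge / load balancer": 2,
    "API gateway routing": 1,
    "Application service": 10,
    "Cache lookup (in-memory, e.g. Redis)": 1,
    "Database query (on cache miss)": 20,
    "Response back over the network": 15,
}

total = sum(HOPS_MS.values())
for hop, ms in HOPS_MS.items():
    print(f"{hop:<40} {ms:>3} ms")
print(f"{'Total':<40} {total:>3} ms")
assert total < 100  # the whole round trip fits under 100 ms
```

Even with a database hit on the critical path, the budget closes well under 100ms — which is exactly why so much engineering effort goes into keeping the slowest hops (database, cross-network) off the common path.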
System design is the art of choosing the right components for the right job. Netflix uses Cassandra for viewing history because it needs availability over consistency — a stale “Continue Watching” position is annoying, but Netflix being down is unacceptable. They use MySQL for billing because ACID transactions are non-negotiable when money is involved. Discord stores trillions of messages in ScyllaDB; migrating off Cassandra dropped their p99 latency from 200ms to 5ms.
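The “right database for each job” idea can be sketched as a toy routing function. The store names mirror the Netflix stack described above; the requirement fields and thresholds are invented purely for illustration:

```python
# Toy illustration of polyglot persistence: route each workload to the
# store whose strengths match it. The decision rules here are invented
# for illustration, not Netflix's actual logic.
def pick_store(workload: dict) -> str:
    if workload.get("needs_acid"):           # money: transactions are non-negotiable
        return "MySQL"
    if workload.get("full_text_search"):     # search queries
        return "Elasticsearch"
    if workload.get("latency_budget_ms", 100) <= 1:  # hot-path reads
        return "EVCache"
    # Everything else: favor availability and write throughput
    return "Cassandra"

print(pick_store({"needs_acid": True}))        # billing
print(pick_store({"full_text_search": True}))  # catalog search
print(pick_store({"latency_budget_ms": 1}))    # homepage personalization
print(pick_store({"write_heavy": True}))       # viewing history
```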
This page covers the complete system design landscape. You won't just read about these concepts — you'll interact with them, break them, and build intuition that sticks.
The Living Architecture Map
Every web app follows this blueprint. Hover to see connections. Click any component to learn what it does and who uses it.
Why does every app look like this?
It wasn't always this way. In 2004, Facebook ran on a single MySQL database. Twitter launched in 2006 on a single Ruby on Rails server. Instagram handled its first million users with just 3 engineers and a handful of AWS instances.
But as users grew — 10x, 100x, 1000x — each company hit the same walls in the same order. The database became a bottleneck. Latency spiked for distant users. A single server failure meant total downtime. And one by one, they added the same components: load balancers, caches, CDNs, message queues, and database replicas.
The architecture map above isn't a design philosophy. It's convergent evolution — every team solving the same scaling problems arrives at the same answer, independently.
Scale the App
Start with one server. Drag the slider to add users. Watch things break — then fix them.
No problems. Life is good.
One machine runs everything — your app, database, and file storage. A $5/month DigitalOcean droplet.
“Every startup begins here. One machine doing everything.”
The $20 to $500K journey
Notice the cost column. At 10 users, you're paying $20/month for a DigitalOcean droplet. At 10 million users, you're paying $500K/month for a fleet of servers, databases, caches, CDNs, and a team of 50+ engineers to keep it all running.
This is why premature optimization is a trap. Don't shard your database at 1,000 users. Don't add Kafka before you need async processing. Each component adds operational complexity — monitoring, debugging, on-call rotations. Add them when the pain of NOT having them exceeds the pain of maintaining them.
Stack Overflow is the canonical example: 1.3 billion page views per month on 9 web servers and 2 SQL Servers. They got absurdly far with vertical scaling, aggressive caching, and excellent engineering before touching horizontal scaling. The lesson? Simple architectures that you deeply understand will outperform complex architectures that nobody fully grasps.
How Real Companies Actually Build This
Theory is one thing. Here's how Netflix, Discord, and Uber actually chose their tech stacks — and the hard-won reasons behind each decision.
300M+ subscribers, 15% of global internet traffic
Netflix doesn't use one database. They use the RIGHT database for each job: Cassandra for availability, MySQL for consistency, Elasticsearch for search, EVCache for speed.
Source: Netflix Tech Blog (netflixtechblog.com)
Three lessons from these architectures
1. Nobody uses just one database. Netflix uses Cassandra AND MySQL AND Elasticsearch AND EVCache. Uber uses MySQL AND Cassandra AND Redis. Each database excels at something and fails at something else. The skill is matching the right database to the right data.
2. Microservices are a consequence, not a goal. Uber has 4,000 microservices because they have 10,000 engineers. Shopify is a monolith serving 2 million stores. The question isn't “should we use microservices?” — it's “is our team large enough that a monolith slows us down?”
3. Every decision is a tradeoff. Discord chose ScyllaDB over Cassandra: faster, but a smaller community. Netflix chose to build their own CDN: cheaper at scale, but massive upfront investment. There's no universally correct architecture — only the right architecture for your constraints.
Every Concept at a Glance
16 building blocks that every system uses. Click any card to see when to use it — and when not to. Filter by layer to focus.
Click any card to see when to use it and when not to.
The Tradeoffs You Can't Escape
There are no right answers in system design. Only tradeoffs. Click each side to see who chose what — and why.
The best engineers don't pick the “right” technology. They pick the right tradeoff for their specific problem.
Thinking in tradeoffs
Junior engineers ask “what's the best database?” Senior engineers ask “what are we optimizing for?”
This mindset shift is the most important thing system design teaches you. There is no “best” technology. PostgreSQL is excellent — until you need 100,000 writes per second across multiple continents. Cassandra handles that beautifully — until you need a JOIN. Redis is blindingly fast — until the server loses power and your cached data vanishes.
The six tradeoffs above aren't just interview trivia. They're the lens through which every architectural decision gets made. When Amazon chose DynamoDB (AP) for their shopping cart, they decided that an available but slightly stale cart was better than an unavailable but perfectly consistent one. When banks choose PostgreSQL (CP), they decide that rejecting a transaction is better than processing a wrong one.
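Amazon's cart decision can be sketched in a few lines. This is a minimal illustration of the idea behind a Dynamo-style merge, not their actual implementation: when replicas diverge during a partition, reconcile by set union, accepting that a deleted item may occasionally resurface:

```python
# Minimal sketch of the AP shopping-cart idea: during a network partition,
# both replicas keep accepting writes. Afterward, merge by set union --
# no add is ever lost, but an item deleted on one side can come back
# (the known anomaly described in the Dynamo paper).
def merge_carts(replica_a: set, replica_b: set) -> set:
    return replica_a | replica_b

# Each replica accepted writes independently during the partition:
cart_us = {"book", "headphones"}   # user added headphones via one region
cart_eu = {"book", "coffee"}       # ...and coffee via another

print(sorted(merge_carts(cart_us, cart_eu)))  # ['book', 'coffee', 'headphones']
```

A CP system would instead reject one side's writes during the partition — the banking choice: better to refuse a transaction than to process a wrong one.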
The Numbers That Matter
Not all operations are created equal. The gap between a RAM access and a cross-continent packet is 1,500,000x. The race below makes it visceral.
Throughput: How Fast Can You Move Data?
SSD is 33x faster than HDD. This is why every database guide says “use SSDs.”
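To see what 33x means in practice, here's a back-of-envelope scan-time comparison. The throughput figures are illustrative assumptions (a spinning disk at roughly 150 MB/s sequential, an NVMe SSD at roughly 5 GB/s), not benchmarks:

```python
# How long does it take to scan a 100 GB table end to end?
# Throughput figures are illustrative assumptions, not benchmarks.
TABLE_GB = 100
HDD_MBPS = 150     # ~sequential throughput of a spinning disk
SSD_MBPS = 5_000   # ~sequential throughput of an NVMe SSD

hdd_seconds = TABLE_GB * 1_000 / HDD_MBPS   # ~667 s -- over 11 minutes
ssd_seconds = TABLE_GB * 1_000 / SSD_MBPS   # 20 s
print(f"HDD: {hdd_seconds:.0f} s, SSD: {ssd_seconds:.0f} s "
      f"({hdd_seconds / ssd_seconds:.0f}x faster)")
```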
Every architectural decision comes back to these numbers
Why does every company use Redis? Because RAM is 1,500x faster than SSD. Why do CDNs exist? Because a cross-continent packet takes 150ms but an edge server nearby takes 1ms. Why does Netflix build their CDN boxes inside ISPs? Because even one network hop matters when you're serving 15% of global internet traffic.
Amazon famously found that every 100ms of latency cost them 1% of sales. Google found that adding 500ms to search results dropped traffic by 20%. These aren't abstract numbers — they're the reason billion-dollar companies invest millions in infrastructure to shave off milliseconds.
When you're doing back-of-envelope calculations in a system design interview — or in real life — these latency numbers are your foundation. Memory is fast. Disk is slow. Network is slower. And crossing the globe is an eternity in computer time.
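Those numbers are easy to put to work. The sketch below uses approximate, canonical latency figures (exact values vary by hardware generation) to estimate the effective read latency of a RAM cache in front of an SSD-backed database:

```python
# Approximate "latency numbers every programmer should know"
# (orders of magnitude; exact figures vary by hardware generation).
NS = {
    "L1 cache reference": 1,
    "Main memory reference": 100,
    "SSD random read": 150_000,                   # 150 us
    "Disk seek": 10_000_000,                      # 10 ms
    "Round trip, same datacenter": 500_000,       # 0.5 ms
    "Round trip, cross-continent": 150_000_000,   # 150 ms
}

# Why caches pay for themselves: effective read latency with a 95% hit
# rate served from RAM, misses falling through to SSD-backed storage.
hit_rate = 0.95
effective_ns = (hit_rate * NS["Main memory reference"]
                + (1 - hit_rate) * NS["SSD random read"])
print(f"Effective read latency at 95% hit rate: {effective_ns / 1000:.1f} us")
# vs. ~150 us if every read went to SSD -- the cache cuts it ~20x
```

The same table gives you the 1,500,000x headline: a cross-continent round trip (150ms) divided by a main-memory reference (100ns).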
Go Deeper
This was the bird's eye view. Each topic below gets its own interactive deep dive with hands-on simulations. All seven are live — follow in order or jump to any topic.
Page 2: The Network Layer
DNS, HTTP, Load Balancers, CDN, Proxies, WebSockets — every hop explained.
Explore →
Page 3: The Data Layer
SQL vs NoSQL, caching strategies, message queues, replication, object storage.
Explore →
Page 4: Distributed Systems
CAP theorem, consistent hashing, rate limiting, fault tolerance patterns.
Explore →
Page 5: Security & Authentication
TLS, OAuth, JWT, common attacks, Zero Trust — how systems protect themselves.
Explore →
Page 6: Monitoring & Observability
Golden signals, SLOs, distributed tracing, alerting, incident response.
Explore →
Page 7: The Interview Framework
RESHADED framework, estimation gym, latency numbers, common mistakes.
Explore →
Page 8: Design Netflix
Full case study. Every concept from the series applied to one real system.
Explore →
Frequently Asked Questions
What is system design?
Why do I need to learn system design?
What's the difference between a load balancer and an API gateway?
When should I use SQL vs NoSQL?
What is the CAP theorem?
Why does Netflix use multiple databases?
What is caching and why is it so important?
How do I start learning system design?
Sources & References
Jeff Dean & Peter Norvig — Latency Numbers Every Programmer Should Know
Donne Martin — The System Design Primer (github.com/donnemartin/system-design-primer)
ByteByteGo — System Design 101 (github.com/ByteByteGoHq/system-design-101)
Netflix Tech Blog (netflixtechblog.com) — Architecture, EVCache, Zuul, Chaos Engineering
Discord Engineering Blog — Scaling to Trillions of Messages
Uber Engineering Blog — Load Balancing, Microservices Architecture
Stack Overflow Architecture (nickcraver.com) — Performance and Scaling