How Does WhatsApp Work?
The Insane Scale of WhatsApp
Let’s start with numbers, because they’re genuinely mind-blowing. WhatsApp isn’t just a chat app — it’s one of the largest distributed systems ever built by humans.

| Metric | Number |
|---|---|
| Monthly Active Users | 2 billion+ |
| Messages sent per day | 100 billion |
| Peak messages per second | 1 million+ |
| Photos shared daily | 4.5 billion (as reported in 2017) |
And here’s the kicker: in 2013, when WhatsApp hit 27 billion messages in a single day, it had only a few dozen engineers (32 is the figure widely quoted around the 2014 Facebook acquisition). That’s the kind of engineering efficiency that makes the rest of the tech world jealous.
“WhatsApp is proof that you don’t need thousands of engineers to build something used by billions — you need brilliant architecture.”
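A quick sanity check on the table above: dividing the daily message count by the number of seconds in a day gives the average rate the system has to sustain. The figures are the ones quoted in the table; the variable names are only for illustration.

```python
# Back-of-the-envelope check using the figures quoted in the table above.
MESSAGES_PER_DAY = 100_000_000_000   # 100 billion (from the table)
SECONDS_PER_DAY = 24 * 60 * 60       # 86,400

average_per_second = MESSAGES_PER_DAY / SECONDS_PER_DAY
print(f"Average: {average_per_second:,.0f} messages per second")
# prints: Average: 1,157,407 messages per second
```

So the "1 million+" per second row is consistent with the daily total: the average alone is already above one million, and real peaks sit higher.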
Bird’s-Eye Architecture
At the highest level, WhatsApp is built around a few key ideas: keep the servers as dumb as possible, push logic to the clients, store as little as possible centrally, and make delivery fast and reliable.
Here’s how the major layers stack up:

WhatsApp was originally built on XMPP (Extensible Messaging and Presence Protocol), an open standard for real-time messaging. They heavily modified it over the years to fit their scale, but the core idea remains: a protocol designed for real-time, push-based communication.
The Journey of a Single Message
You tap send. What happens next? Let’s trace the path of a single “Hey! 👋” from your phone to your friend’s screen.
- Step 1 — You hit send
- Your app encrypts the message on-device using the Signal Protocol (more on this below). The raw text never leaves your phone unencrypted.
- Step 2 — Sent to WhatsApp’s chat server
- The encrypted blob travels over a persistent connection that your app keeps open with the nearest WhatsApp data center: a long-lived TCP socket in the mobile apps, a WebSocket in WhatsApp Web. This is why WhatsApp feels instant: there’s no “dialing in” every time you send a message.
- Step 3 — Server acknowledges receipt
- The server sends back an acknowledgement to your app — this is when you see the single grey tick ✓. The server received it. Job done on your side.
- Step 4 — Server looks up the recipient
- WhatsApp checks if your friend is online (has an active connection). If yes, it pushes the message immediately. If not, it queues the message and waits.
- Step 5 — Message delivered to the recipient’s device
- The encrypted message arrives on your friend’s phone. Their app decrypts it locally. You see the double grey tick ✓✓ — delivered.
- Step 6 — Read receipt fires back
- When your friend opens the chat and the message is displayed, their app fires a “read” event back to the server, which relays it to you — turning those ticks blue ✓✓.
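The server's part in the steps above can be sketched as a tiny relay. This is a toy model, not WhatsApp's code: `ChatServer`, its fields, and the event names are all hypothetical, and a real server would only report "delivered" after the recipient's device sends its own acknowledgement.

```python
from dataclasses import dataclass, field


@dataclass
class ChatServer:
    """Toy relay: it moves opaque ciphertext and never decrypts it."""
    online: set = field(default_factory=set)      # users with a live connection
    queued: dict = field(default_factory=dict)    # user -> list of pending blobs

    def send(self, sender: str, recipient: str, ciphertext: bytes) -> list:
        """Handle one message and return the events the sender would see."""
        events = ["sent"]                          # Step 3: server ack (single tick)
        if recipient in self.online:               # Step 4: recipient lookup
            # Step 5: pushed to the device (simplified: no device ack modelled)
            events.append("delivered")
        else:
            # Recipient offline: hold the blob until they reconnect
            self.queued.setdefault(recipient, []).append(ciphertext)
        return events


server = ChatServer(online={"bob"})
print(server.send("alice", "bob", b"\x8f\x02"))    # ['sent', 'delivered']
print(server.send("alice", "carol", b"\x11\xa7"))  # ['sent']
```

Note that the server logic never inspects `ciphertext`; routing depends only on who the recipient is and whether they are connected.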
End-to-End Encryption — The Signal Protocol
This is one of WhatsApp’s most impressive engineering decisions. Every message is end-to-end encrypted, meaning even WhatsApp cannot read your messages. Not even if a government asks nicely.
They use the Signal Protocol, the same protocol that powers the Signal app, developed by Open Whisper Systems. Here’s how it works at a conceptual level.
Key Exchange — The Double Ratchet Algorithm
When you first message someone, your apps do a cryptographic handshake — they exchange public keys without ever revealing private keys. Think of it like passing padlocks through the mail. Anyone can lock something with your padlock, but only you have the key to open it.
WhatsApp uses a mechanism called the Double Ratchet Algorithm, which does something clever: every single message gets its own unique encryption key, derived through a one-way function. Even if a hacker somehow got the key for message #47, they can’t work backwards to decrypt message #46 (this is called forward secrecy), and they can’t derive the key for message #48 from it either.
Why this matters for system design: WhatsApp’s servers are essentially routing encrypted blobs they cannot understand. This massively simplifies their legal liability and trust model: they cannot hand over message content, because they never hold the keys needed to decrypt it. (Metadata, such as who messaged whom and when, is still visible to the server.)
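To make "a new key for every message" concrete, here is a minimal sketch of the symmetric half of the ratchet, using only Python's standard library. It follows the HMAC-based chain suggested in the public Double Ratchet specification (constant `0x01` for the message key, `0x02` for the next chain key). The starting secret is a placeholder, and the full algorithm also mixes in fresh Diffie-Hellman outputs, which this sketch leaves out.

```python
import hashlib
import hmac


def ratchet_step(chain_key: bytes) -> tuple:
    """Advance the chain once: return (message_key, next_chain_key)."""
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return message_key, next_chain_key


# Placeholder for the secret both sides agree on during the handshake.
chain_key = hashlib.sha256(b"placeholder shared secret").digest()

message_keys = []
for _ in range(3):
    message_key, chain_key = ratchet_step(chain_key)
    message_keys.append(message_key)

print(len(set(message_keys)))  # 3: every message got a different key
```

Because HMAC is one-way, holding one message key does not reveal the chain key it came from, so it cannot be used to compute the keys of neighbouring messages.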
What Happens When You’re Offline?
This is where message queuing comes in. If your friend is offline when you send a message, WhatsApp’s server holds onto it — encrypted — in a message store.
The moment your friend’s phone reconnects (opens the app, comes back online), their device establishes a persistent connection to WhatsApp’s servers. The server notices the connection, looks up any queued messages, and pushes them all immediately.
One important detail: WhatsApp only stores messages on its servers until delivery. Once a message is delivered to your device, it’s deleted from the server. Undelivered messages aren’t kept forever either: WhatsApp’s stated policy is to delete them after 30 days. Your chat history lives on your phone, not in a central WhatsApp database. This is a deliberate design choice: it’s cheaper to store, more private, and reduces liability.
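A store-until-delivered queue can be sketched in a few lines. `OfflineStore` and its method names are hypothetical; a production system would persist the queue and delete entries only after the device acknowledges them, which this sketch simplifies to "delete on hand-over".

```python
from collections import defaultdict, deque


class OfflineStore:
    """Holds encrypted blobs for offline users and forgets them once handed over."""

    def __init__(self):
        self._pending = defaultdict(deque)   # recipient -> FIFO of ciphertext blobs

    def enqueue(self, recipient, ciphertext):
        self._pending[recipient].append(ciphertext)

    def flush(self, recipient):
        """Called on reconnect: return everything in order and keep nothing."""
        return list(self._pending.pop(recipient, ()))

    def pending_count(self, recipient):
        return len(self._pending.get(recipient, ()))


store = OfflineStore()
store.enqueue("carol", b"blob-1")
store.enqueue("carol", b"blob-2")
print(store.flush("carol"))          # [b'blob-1', b'blob-2']
print(store.pending_count("carol"))  # 0
```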
Push Notifications
If your app is killed in the background, WhatsApp relies on the platform’s push service to wake it up when a message arrives:
- APNs (Apple Push Notification service) for iOS
- Firebase Cloud Messaging (FCM) for Android

This is how you see notifications even when the app isn’t running.
How Media Works — Photos, Videos & Documents
Text messages are tiny. A photo is 2–5 MB. A video might be 50 MB. You can’t run these through the same messaging pipeline without everything grinding to a halt. So WhatsApp separates the two completely.
When you send a photo, here’s what actually happens:
- Step 1 — Compress and encrypt on-device
- Your app compresses the image and encrypts it with a one-time key generated just for this file.
- Step 2 — Upload to WhatsApp’s media servers
- The encrypted file is uploaded to blob storage (backed by data centers globally). You get back a URL.
- Step 3 — Send a tiny text message
- Instead of sending the photo through the chat server, WhatsApp sends a small text message containing:
- The URL of the encrypted file
- The decryption key
- Metadata (thumbnail, dimensions, file size)
- The chat server only ever sees tiny, fast-moving messages.
- Step 4 — Recipient downloads & decrypts
- Your friend’s app receives the message, fetches the encrypted file from the CDN, decrypts it with the embedded key, and shows them the photo. Clean separation of concerns.
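The four steps above can be sketched end to end. Everything here is illustrative: `blob_store` is a dictionary standing in for the media servers, the URL is made up, and `toy_cipher` is a deliberately simple XOR keystream used only because Python's standard library has no AES. A real client would use an authenticated cipher from a vetted crypto library.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 keystream. Symmetric; for illustration only."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


blob_store = {}  # hypothetical stand-in for media servers: url -> encrypted bytes


def send_media(photo: bytes) -> dict:
    key = secrets.token_bytes(32)                          # Step 1: one-time key
    encrypted = toy_cipher(key, photo)
    url = "https://media.example.invalid/" + hashlib.sha256(encrypted).hexdigest()
    blob_store[url] = encrypted                            # Step 2: upload
    return {"url": url, "key": key, "size": len(photo)}    # Step 3: small pointer


def receive_media(pointer: dict) -> bytes:
    encrypted = blob_store[pointer["url"]]                 # Step 4: download
    return toy_cipher(pointer["key"], encrypted)           # then decrypt locally


photo = b"\xff\xd8 pretend JPEG bytes " * 1000
pointer = send_media(photo)
assert receive_media(pointer) == photo
```

The pointer is tiny compared with the photo, and it travels through the normal encrypted chat path, so the media store only ever holds bytes it has no key for.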
Read Receipts & Presence — The Ticks Explained
The humble tick is one of the most data-efficient features in WhatsApp. Here’s what each state actually represents in system terms:
| Symbol | Status | What it means technically |
|---|---|---|
| 🕐 | Sending | Message is in the outgoing queue on your device, not yet sent to server |
| ✓ | Sent | Server received and acknowledged the message |
| ✓✓ | Delivered | Message pushed to recipient’s device; their app sent a delivery ack |
| ✓✓ (blue) | Read | Recipient opened the chat; their app fired a “seen” event |
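One way a client could track these states is as an ordered enum that only ever moves forward. This is an assumed design, not something the source confirms about WhatsApp's client; the names `Status` and `apply_receipt` are hypothetical.

```python
from enum import IntEnum


class Status(IntEnum):
    SENDING = 0    # clock icon
    SENT = 1       # one grey tick
    DELIVERED = 2  # two grey ticks
    READ = 3       # two blue ticks


def apply_receipt(current: Status, incoming: Status) -> Status:
    """Statuses only move forward, so late or duplicate receipts are harmless."""
    return max(current, incoming)


status = Status.SENDING
# Receipts can arrive out of order on a flaky network.
for receipt in (Status.SENT, Status.READ, Status.DELIVERED):
    status = apply_receipt(status, receipt)
print(status.name)  # READ
```

Each receipt carries little more than a message ID and one of four values, which is why the feature is so cheap in bandwidth terms.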
“Last Seen” — How Presence Works
When you open WhatsApp, your client tells the server you’re online. When you close it, it sends an offline event with a timestamp. That timestamp becomes your last seen time. Data like this belongs in a fast in-memory store so it can be served to your contacts instantly; Redis is the usual pick in system design discussions, while WhatsApp’s own engineers have described using Erlang’s built-in Mnesia database.
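A presence service along these lines can be modelled with a set and a dictionary. `PresenceStore` is a hypothetical name, and the in-process data structures stand in for whatever in-memory store a real deployment would use.

```python
import time


class PresenceStore:
    def __init__(self):
        self._online = set()
        self._last_seen = {}     # user -> Unix timestamp of the last disconnect

    def connect(self, user):
        self._online.add(user)

    def disconnect(self, user, now=None):
        self._online.discard(user)
        # Record when the user went offline; `now` is injectable for testing.
        self._last_seen[user] = time.time() if now is None else now

    def status(self, user):
        if user in self._online:
            return "online"
        seen = self._last_seen.get(user)
        return "unknown" if seen is None else f"last seen at {seen:.0f}"


presence = PresenceStore()
presence.connect("alice")
print(presence.status("alice"))              # online
presence.disconnect("alice", now=1_700_000_000)
print(presence.status("alice"))              # last seen at 1700000000
```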
Group Chats — The Fan-Out Problem
Here’s where things get tricky. When you send a message to a group of 256 people, WhatsApp needs to deliver that message to 255 other devices — each with their own encryption key, potentially on different servers, with different online/offline states.
This is called the fan-out problem: one message needs to branch out to many recipients.
WhatsApp’s Solution — Sender Key
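Instead of encrypting the message separately 255 times (once for each recipient), the sender encrypts the message once with a group key, called a Sender Key. Each sender generates their own Sender Key and distributes it to all members once, over the individual encrypted sessions they already share; later messages reuse it until the group’s membership changes.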
| Scenario | Breakdown | Result |
|---|---|---|
| Without Sender Key | 1 message × 255 encryptions | = 255 encrypt operations |
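| With Sender Key | 255 one-time key deliveries, then 1 encryption per message | = Much cheaper at scale |
The result? Encrypting and uploading a group message costs the sender roughly the same whether the group has 5 people or 256. The server still has to fan the ciphertext out to every member, but it is copying one opaque blob rather than receiving 255 separately encrypted ones.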
Group metadata vs. message content: Group membership info, admin roles, and settings are stored on WhatsApp’s servers. But the actual message content is end-to-end encrypted — the server routes the encrypted blob without reading it.
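The saving from Sender Keys is easiest to see by counting encryption operations on the sender's side. The two functions below are a simple cost model under the assumption that the key is distributed once and then reused; the function names are illustrative.

```python
def pairwise_encryptions(members: int, messages: int) -> int:
    """Every message is encrypted separately for each other member."""
    return messages * (members - 1)


def sender_key_encryptions(members: int, messages: int) -> int:
    """One-time key delivery to each other member, then one encryption per message."""
    one_time_key_deliveries = members - 1
    return one_time_key_deliveries + messages


print(pairwise_encryptions(256, 100))    # 25500
print(sender_key_encryptions(256, 100))  # 355
```

The gap widens with every additional message, since the key distribution is paid once while the per-message cost stays at one encryption.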
Why WhatsApp Runs on Erlang
Most startups reach for Node.js or Go. WhatsApp chose Erlang — a 1986 language designed by Ericsson for telecom switches. This sounds like a weird choice until you understand what Erlang was built for: handling millions of concurrent connections with incredibly high reliability.
What makes Erlang special
Lightweight processes: Erlang can run millions of tiny concurrent “processes” (not OS threads). Each user connection gets its own process, which maps perfectly to WhatsApp’s model where every user session is stateful and persistent.
“Let it crash” philosophy: Erlang’s approach is to write simple code and let the runtime handle failures gracefully. Processes crash, supervisors restart them automatically, the user never notices. This is how WhatsApp achieves high uptime without heroic error handling.
Hot code reloading: You can deploy new code to a running Erlang system without restarting it. Zero-downtime deployments are built into the language itself.
The modified ejabberd XMPP server that WhatsApp ran was reported to handle over 2 million TCP connections per server, a number that would make most Java developers weep.
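The "let it crash" idea can be imitated in other languages, which helps show what a supervisor actually does. The sketch below is a rough Python analogue, not Erlang/OTP: real supervisors are separate processes with configurable restart strategies, and `supervise` and `flaky_connection_handler` are made-up names.

```python
def supervise(worker, max_restarts=3):
    """Run worker(attempt); on a crash, restart it rather than handling the error inside."""
    for attempt in range(1, max_restarts + 2):   # first run plus max_restarts restarts
        try:
            return worker(attempt), attempt
        except Exception as exc:
            print(f"worker crashed on attempt {attempt}: {exc!r}; restarting")
    raise RuntimeError("restart limit reached")


def flaky_connection_handler(attempt):
    if attempt < 3:                              # simulate two crashes
        raise ConnectionError("socket reset")
    return "session established"


result, attempts = supervise(flaky_connection_handler)
print(result, "after", attempts, "attempts")
# prints: session established after 3 attempts
```

The worker contains no recovery logic at all; the decision about what to do after a failure lives entirely in the supervisor, which is the separation Erlang builds into its runtime.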
Key Takeaways for System Designers
If you’re studying system design or building your own messaging feature, here’s what WhatsApp teaches us:
- Use persistent connections, not polling
- WebSockets/TCP keep-alive connections are why WhatsApp feels instant. HTTP polling at scale would be catastrophically expensive.
- Separate text from binary data
- Never run large files through your messaging pipeline. Upload to blob storage, pass a reference. Keep your message broker handling only small, fast-moving messages.
- Message queues are your offline strategy
- Assume recipients are offline. Queue messages server-side and push on reconnect. This is more reliable than any “retry from client” approach.
- Store as little centrally as possible
- WhatsApp deletes messages from servers after delivery. Less data = lower cost, simpler architecture, and better privacy by default.
- Pick your tech for the problem, not the hype
- Erlang was more than 20 years old when WhatsApp picked it. It was simply the right tool. Be boring, be reliable, be scalable.
“The best system design is usually the one with the fewest moving parts that still solves the problem reliably at scale.”
WhatsApp is a masterclass in restraint — minimal central state, maximum client responsibility, clever use of encryption to simplify the trust model, and a language ecosystem built for exactly this kind of concurrent, fault-tolerant problem.
Not bad for an app that started as a simple status update tool.