Blogs


How Spotify Works?

A fraction of a second before music starts flowing through your headphones, an invisible chain of systems has already sprung into action. Your device must figure out what track to play next, determine whether the audio is stored locally or needs to be fetched, connect to the nearest CDN edge server, stream and decode compressed audio packets in real time, and deliver uninterrupted playback before you even notice the delay. At Spotify’s scale — serving hundreds of millions of listeners across wildly different devices, bandwidth conditions, and geographies — this is not just streaming. It is a massive distributed system constantly balancing speed, reliability, and personalization, while quietly predicting the next song you are most likely to fall in love with.

That is not a simple problem. It is one of the most interesting distributed systems challenges in consumer software, combining real-time media delivery, personalization at scale, search infrastructure, offline sync, and a licensing system that would give most engineers a headache. This article is a deep walk through all of it. Whether you are preparing for a system design interview, curious about how streaming infrastructure really works, or building something similar at a smaller scale, the goal is to leave you with a genuine mental model, not just a list of buzzwords.

Why Music Streaming Is Hard

Before diving into architecture, it is worth spending a moment on why this problem is genuinely difficult, because the instinctive answer — “just serve audio files from a server” — misses most of what makes Spotify interesting.

Read on →

How Stock Exchange Works?

There is a moment, roughly once every market quarter, when some piece of news hits the wire and millions of traders hit their buy or sell buttons simultaneously. The exchange absorbs that shock. Prices move. Trades match. Confirmations fly back in milliseconds. Nobody on the outside thinks twice about it.

But if you crack open what actually happened in those milliseconds, you find one of the most carefully engineered distributed systems ever built. Stock exchanges are not just websites that match buyers and sellers. They are real-time, deterministic, ultra-low-latency financial infrastructure where a microsecond of delay can represent thousands of dollars of opportunity lost, and where a single bug in the matching engine can destabilize an entire market.

This blog is for engineers who want to understand what is actually happening under the hood. We will start from first principles and work our way through every major subsystem, from order entry to trade settlement, touching on the hardware, software, data structures, and architectural tradeoffs that make modern exchanges tick.

Why This Problem Is Hard

Before jumping into architecture, it helps to understand the constraints.

Read on →

How Amazon S3 Works?

There is a particular kind of quiet confidence in systems that just work. You upload a file, get a URL back, and years later that file is still exactly where you left it. No corruption. No missing bytes. The same object, bit-for-bit identical, retrieved in milliseconds from the other side of the planet. That is Amazon S3 in everyday terms. But the engineering underneath that simplicity is anything but simple.

Amazon Simple Storage Service launched in 2006 and redefined what developers expected from infrastructure. Before S3, running storage at scale meant buying racks of hardware, managing replication yourself, worrying about disk failures, planning for capacity, and building your own data durability systems. S3 flipped that model completely. You pay for what you store, you never think about the hardware, and the system is designed for eleven nines (99.999999999%) of durability: store 10 million objects, and you would expect to lose one of them roughly once every 10,000 years.

That number sounds like marketing. It is also one of the hardest engineering targets in existence.
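
To make the eleven-nines figure concrete, here is a back-of-the-envelope calculation. It is only a sketch: it assumes the durability figure can be read as a per-object, per-year survival probability of 99.999999999%, and the function names are made up for illustration.

```python
def expected_annual_losses(object_count: int, nines: int = 11) -> float:
    """Expected number of objects lost per year at the given durability."""
    # Assumption: "eleven nines" means each object has a 1e-11 chance
    # of being lost in any given year.
    loss_probability_per_year = 10.0 ** -nines
    return object_count * loss_probability_per_year


def years_per_lost_object(object_count: int, nines: int = 11) -> float:
    """Average number of years between single-object losses."""
    return 1.0 / expected_annual_losses(object_count, nines)


if __name__ == "__main__":
    for count in (10_000_000, 100_000_000_000):
        years = years_per_lost_object(count)
        print(f"{count:>15,} objects: about one loss every {years:,.0f} years")
```

Under that reading, 10 million stored objects works out to about one lost object every 10,000 years, while 100 billion objects would lose about one per year on average.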

This article is a real engineering walkthrough of how S3 works at the architecture level. We will go from the basics of what object storage actually is, through upload pipelines, metadata infrastructure, replication systems, consistency models, and scaling strategies. By the end, you should have a clear mental model of what makes S3 tick — and why the decisions its architects made were the right ones.

Read on →

How Kafka Works?

There is a moment in every backend engineer’s career when a simple queue stops being enough. Maybe you’re logging user activity to a database and the writes start choking the system. Maybe you’re moving data between microservices with REST calls and latency starts creeping up. Maybe a product team asks for “real-time analytics” and you start wondering what that even means at scale.

That’s when engineers usually discover Kafka.

Apache Kafka was originally built at LinkedIn to solve a very unglamorous problem: moving enormous amounts of log data between systems without breaking everything. What they ended up building wasn’t just a message queue. It was a distributed commit log, a unified event backbone, and arguably one of the most influential pieces of infrastructure in modern software engineering.

But here’s the thing nobody tells beginners: distributed messaging is genuinely hard. Not hard like “this will take an afternoon.” Hard like “this is a decade of systems research made practical.”

Think about what you’re actually trying to do when you build a distributed messaging system. You want to accept millions of messages per second from producers that don’t know or care about consumers. You want to store those messages durably on disk so nothing gets lost even if half your servers crash. You want to replay old messages if a consumer fails and needs to reprocess. You want ordering guarantees for related events. You want to fan out a single message to dozens of different consumers. You want horizontal scalability so you can throw more hardware at the problem as traffic grows. And you want to do all of this with single-digit millisecond latency.
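
A few of those requirements (decoupled producers and consumers, replay, fan-out, ordering) fall out of one idea: an append-only log where every consumer tracks its own offset. The toy below is an in-memory sketch of that idea only. It is not Kafka's API, it has no durability or partitioning, and `ToyLog` and its methods are hypothetical names.

```python
from collections import defaultdict


class ToyLog:
    """A single in-memory, append-only log with per-consumer offsets."""

    def __init__(self) -> None:
        self._records: list = []
        # Each consumer name maps to the offset of the next record it will read.
        self._next_offset: dict = defaultdict(int)

    def append(self, record) -> int:
        """Producer side: append a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, consumer: str, max_records: int = 10) -> list:
        """Consumer side: read the next batch in order and advance the offset."""
        start = self._next_offset[consumer]
        batch = self._records[start:start + max_records]
        self._next_offset[consumer] = start + len(batch)
        return batch

    def seek(self, consumer: str, offset: int) -> None:
        """Rewind (or skip ahead) so a consumer can reprocess old records."""
        if not 0 <= offset <= len(self._records):
            raise ValueError(f"offset {offset} is out of range")
        self._next_offset[consumer] = offset


if __name__ == "__main__":
    log = ToyLog()
    for event in ("signup", "login", "purchase"):
        log.append(event)

    print(log.poll("analytics"))   # reads all three events in order
    print(log.poll("billing", 2))  # an independent consumer starts from offset 0
    log.seek("analytics", 1)       # replay from the second event
    print(log.poll("analytics"))
```

Because records are never removed when they are read, adding another consumer costs nothing on the write path, and replay is just moving an offset backwards.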

Read on →

How Pastebin Works?

There is something deceptively simple about Pastebin. You paste some text, click a button, and get a short URL back. You share that URL with someone else, they open it, and they see your text. That is the entire product in one sentence. And yet, building Pastebin at the scale of tens of millions of daily users, billions of stored pastes, and a global audience requires you to make dozens of careful engineering decisions that would fill a whiteboard wall from top to bottom.

The reason Pastebin is a classic system design interview question is not that it is hard to understand. It is that it forces you to think through the full lifecycle of a piece of data in a distributed system: write it, store it, serve it to millions of readers, expire it gracefully, cache it intelligently, and make sure no one abuses it to host malware or leak credentials. Each of those steps has gotchas.

This post is a full engineering deep dive. We will go through the architecture layer by layer, explain every major decision, and make sure you understand not just what Pastebin does but why it is built the way it is.

What Pastebin Actually Is

Before we go deep into the systems, let us be clear about the product. Pastebin is a text-sharing service. Users paste raw text, often source code, configuration files, log dumps, or command output, and get a short URL they can share. The core workflow is:

  • A user submits a block of text
  • The system assigns it a unique short ID and URL
  • Other users visit that URL and read the text
  • The paste may expire after a set time or persist forever
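
The workflow above can be sketched in a few lines. This is a single-process illustration under stated assumptions, not how Pastebin is actually implemented: the `PasteStore` class, the 8-character random ID, the `https://paste.example` base URL, and the lazy expiry-on-read behavior are all hypothetical choices.

```python
import secrets
import string
import time
from typing import Optional

ALPHABET = string.ascii_letters + string.digits
BASE_URL = "https://paste.example"  # hypothetical domain, for illustration only


class PasteStore:
    """In-memory paste storage with random short IDs and optional expiry."""

    def __init__(self, id_length: int = 8) -> None:
        self._id_length = id_length
        # Maps paste ID -> (text, expiry timestamp or None for "forever").
        self._pastes: dict = {}

    def create(self, text: str, ttl_seconds: Optional[float] = None,
               now: Optional[float] = None) -> str:
        """Store the text and return its unique short ID."""
        now = time.time() if now is None else now
        while True:
            paste_id = "".join(
                secrets.choice(ALPHABET) for _ in range(self._id_length)
            )
            if paste_id not in self._pastes:  # retry on the rare collision
                break
        expires_at = None if ttl_seconds is None else now + ttl_seconds
        self._pastes[paste_id] = (text, expires_at)
        return paste_id

    def url_for(self, paste_id: str) -> str:
        """Build the shareable URL for a paste ID."""
        return f"{BASE_URL}/{paste_id}"

    def read(self, paste_id: str, now: Optional[float] = None) -> Optional[str]:
        """Return the paste text, or None if it is missing or has expired."""
        now = time.time() if now is None else now
        entry = self._pastes.get(paste_id)
        if entry is None:
            return None
        text, expires_at = entry
        if expires_at is not None and now >= expires_at:
            del self._pastes[paste_id]  # expire lazily, on first read after TTL
            return None
        return text


if __name__ == "__main__":
    store = PasteStore()
    paste_id = store.create("print('hello')", ttl_seconds=3600)
    print(store.url_for(paste_id))
    print(store.read(paste_id))
```

The `now` parameter exists only so that expiry can be exercised without waiting on a real clock.
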
Read on →