Blogs


How a Stock Exchange Works

There is a moment, roughly once every market quarter, when some piece of news hits the wire and millions of traders hit their buy or sell buttons simultaneously. The exchange absorbs that shock. Prices move. Trades match. Confirmations fly back in milliseconds. Nobody on the outside thinks twice about it.

But if you crack open what actually happened in those milliseconds, you find one of the most carefully engineered distributed systems ever built. Stock exchanges are not just websites that match buyers and sellers. They are real-time, deterministic, ultra-low-latency financial infrastructure where a microsecond of delay can represent thousands of dollars of opportunity lost, and where a single bug in the matching engine can destabilize an entire market.

This blog is for engineers who want to understand what is actually happening under the hood. We will start from first principles and work our way through every major subsystem, from order entry to trade settlement, touching on the hardware, software, data structures, and architectural tradeoffs that make modern exchanges tick.

Why This Problem Is Hard

Before jumping into architecture, it helps to understand the constraints.

Read on →

How Amazon S3 Works

There is a particular kind of quiet confidence in systems that just work. You upload a file, get a URL back, and years later that file is still exactly where you left it. No corruption. No missing bytes. The same object, bit-for-bit identical, retrieved in milliseconds from the other side of the planet. That is Amazon S3 in everyday terms. But the engineering underneath that simplicity is anything but simple.

Amazon Simple Storage Service launched in 2006 and redefined what developers expected from infrastructure. Before S3, running storage at scale meant buying racks of hardware, managing replication yourself, worrying about disk failures, planning for capacity, and building your own data durability systems. S3 flipped that model completely. You pay for what you store, you never think about the hardware, and the system promises eleven nines of durability (99.999999999%), meaning that if you store 10 million objects you would expect to lose one of them roughly once every 10,000 years.

That number sounds like marketing. It is also one of the hardest engineering targets in existence.
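
To make eleven nines concrete, here is a back-of-the-envelope sketch. It assumes the figure is an annual, per-object durability of 99.999999999% and that object losses are independent; both are simplifications for illustration, and the function names are made up for this example.

```python
from fractions import Fraction

# Assumption: "eleven nines" means each object has a 1-in-10^11 chance of
# being lost in any given year, independently of every other object.
ANNUAL_LOSS_PROBABILITY = Fraction(1, 10**11)


def expected_losses_per_year(object_count: int) -> Fraction:
    """Expected number of objects lost per year for a given fleet size."""
    return object_count * ANNUAL_LOSS_PROBABILITY


def years_per_expected_loss(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_losses_per_year(object_count)


if __name__ == "__main__":
    # Ten million stored objects: one expected loss every 10,000 years.
    print(years_per_expected_loss(10_000_000))  # prints 10000
```

Exact fractions are used instead of floats so the tiny probability does not pick up rounding noise.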

This article is a real engineering walkthrough of how S3 works at the architecture level. We will go from the basics of what object storage actually is, through upload pipelines, metadata infrastructure, replication systems, consistency models, and scaling strategies. By the end, you should have a clear mental model of what makes S3 tick — and why the decisions its architects made were the right ones.

Read on →

How Kafka Works

There is a moment in every backend engineer’s career when a simple queue stops being enough. Maybe you’re logging user activity to a database and the writes start choking the system. Maybe you’re moving data between microservices with REST calls and latency starts creeping up. Maybe a product team asks for “real-time analytics” and you start wondering what that even means at scale.

That’s when engineers usually discover Kafka.

Apache Kafka was originally built at LinkedIn to solve a very unglamorous problem: moving enormous amounts of log data between systems without breaking everything. What they ended up building wasn’t just a message queue. It was a distributed commit log, a unified event backbone, and arguably one of the most influential pieces of infrastructure in modern software engineering.

But here’s the thing nobody tells beginners: distributed messaging is genuinely hard. Not hard like “this will take an afternoon.” Hard like “this is a decade of systems research made practical.”

Think about what you’re actually trying to do when you build a distributed messaging system. You want to accept millions of messages per second from producers that don’t know or care about consumers. You want to store those messages durably on disk so nothing gets lost even if half your servers crash. You want to replay old messages if a consumer fails and needs to reprocess. You want ordering guarantees for related events. You want to fan out a single message to dozens of different consumers. You want horizontal scalability so you can throw more hardware at the problem as traffic grows. And you want to do all of this with single-digit millisecond latency.
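
Several of these requirements (ordering, replay, and fan-out) fall out of one idea: an append-only log in which every consumer tracks its own read position. The sketch below is a toy, in-memory illustration of that idea, not Kafka's API. The names `PartitionLog` and `Consumer` are hypothetical, and durability, replication, and networking are deliberately left out.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PartitionLog:
    """In-memory stand-in for one ordered, append-only log."""

    records: list[bytes] = field(default_factory=list)

    def append(self, record: bytes) -> int:
        """Append a record and return its offset (its position in the log)."""
        self.records.append(record)
        return len(self.records) - 1

    def read(self, offset: int, max_records: int = 100) -> list[tuple[int, bytes]]:
        """Return up to max_records (offset, record) pairs starting at offset."""
        batch = self.records[offset:offset + max_records]
        return list(enumerate(batch, start=offset))


@dataclass
class Consumer:
    """Each consumer owns its position, so readers never affect each other."""

    log: PartitionLog
    position: int = 0

    def poll(self, max_records: int = 100) -> list[bytes]:
        """Read the next batch and advance this consumer's position."""
        batch = self.log.read(self.position, max_records)
        if batch:
            self.position = batch[-1][0] + 1
        return [record for _, record in batch]

    def seek(self, offset: int) -> None:
        """Rewind (or skip ahead) to reprocess records from a chosen offset."""
        self.position = offset
```

Because reading never removes anything from the log, adding another consumer costs the producer nothing, and a consumer that crashes can call `seek` with its last known offset and reprocess from there.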

Read on →

How Pastebin Works

There is something deceptively simple about Pastebin. You paste some text, click a button, and get a short URL back. You share that URL with someone else, they open it, and they see your text. That is the entire product in one sentence. And yet, building Pastebin at the scale of tens of millions of daily users, billions of stored pastes, and a global audience requires you to make dozens of careful engineering decisions that would fill a whiteboard wall from top to bottom.

The reason Pastebin is a classic system design interview question is not because it is hard to understand — it is because it forces you to think through the full lifecycle of a piece of data in a distributed system: write it, store it, serve it to millions of readers, expire it gracefully, cache it intelligently, and make sure no one abuses it to host malware or leak credentials. Each of those steps has gotchas.

This post is a full engineering deep dive. We will go through the architecture layer by layer, explain every major decision, and make sure you understand not just what Pastebin does but why it is built the way it is.

What Pastebin Actually Is

Before we go deep into the systems, let us be clear about the product. Pastebin is a text-sharing service. Users paste raw text, often source code, configuration files, log dumps, or command output, and get a short URL they can share. The core workflow is:

  • A user submits a block of text
  • The system assigns it a unique short ID and URL
  • Other users visit that URL and read the text
  • The paste may expire after a set time or persist forever
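
The workflow above can be sketched in a few lines. This is a single-process, in-memory illustration under stated assumptions, not how Pastebin is actually implemented: the `PasteStore` class, the 8-character ID length, and the random ID scheme are all choices made for this example.

```python
import secrets
import string
import time

ALPHABET = string.ascii_letters + string.digits
ID_LENGTH = 8  # assumption: 62^8 possible IDs is plenty for a sketch


class PasteStore:
    """Toy paste store: random short IDs plus optional expiry."""

    def __init__(self, clock=time.time):
        self._pastes = {}    # paste_id -> (text, expires_at or None)
        self._clock = clock  # injectable so expiry can be tested

    def create(self, text, ttl_seconds=None):
        """Store text and return its short ID; ttl_seconds=None keeps it forever."""
        while True:
            paste_id = "".join(secrets.choice(ALPHABET) for _ in range(ID_LENGTH))
            if paste_id not in self._pastes:  # retry on the rare collision
                break
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._pastes[paste_id] = (text, expires_at)
        return paste_id

    def read(self, paste_id):
        """Return the paste text, or None if it is unknown or has expired."""
        entry = self._pastes.get(paste_id)
        if entry is None:
            return None
        text, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._pastes[paste_id]  # lazy expiry on read
            return None
        return text
```

Expiry here is lazy: an expired paste is only deleted when someone tries to read it. A real service would also need a background sweep so unread pastes do not pile up.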
Read on →

How Tinder Works

There is a moment every Tinder engineer has probably thought about: a user swipes right, and within a second, both people get a match notification. That notification feels instant, almost magical. But behind that single interaction is an entire distributed system firing in coordination — a recommendation engine, a geo-spatial query, a mutual-match check, a real-time push notification, and a chat channel being provisioned, all happening faster than the human brain can process what just occurred.

Tinder is not a simple CRUD app with a swiping UI on top. It is one of the most sophisticated consumer-grade distributed systems ever built. On any given day, Tinder processes over 1.6 billion swipes globally, serves users in nearly 200 countries, and must deliver personalized, geo-aware recommendation feeds to millions of concurrent users, all while keeping latency below perceptible thresholds.

The engineering challenges here are real and genuinely hard. You are dealing with write-heavy workloads from swipe events, read-heavy workloads from feed generation, real-time geo queries at planetary scale, ML-based ranking pipelines that need to be both fast and personalized, and a messaging layer that must guarantee delivery even when mobile connections are flaky. Understanding how Tinder solves these problems teaches you almost everything you need to know about modern distributed systems engineering.

This blog is going to walk through the entire architecture, piece by piece, from how a swipe is processed to how the recommendation engine decides whose profile appears next on your screen. We will cover the tradeoffs, the bottlenecks, and the engineering reasoning behind each decision. By the end, you should feel like you genuinely understand how this system works at production scale.
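
As a first taste, the mutual-match check mentioned earlier reduces to a simple question: has the other person already swiped right on you? The sketch below is a minimal in-memory illustration of that logic only. `SwipeLedger` is a hypothetical name, and nothing here reflects Tinder's real storage or services.

```python
class SwipeLedger:
    """Toy record of right swipes, used to detect mutual matches."""

    def __init__(self):
        # Each entry is a (swiper, target) pair for one right swipe.
        self._right_swipes = set()

    def swipe_right(self, swiper, target):
        """Record the swipe; return True if it completes a mutual match."""
        self._right_swipes.add((swiper, target))
        return (target, swiper) in self._right_swipes


if __name__ == "__main__":
    ledger = SwipeLedger()
    print(ledger.swipe_right("alice", "bob"))  # prints False: bob has not swiped yet
    print(ledger.swipe_right("bob", "alice"))  # prints True: both swiped right
```

At production scale the hard part is not this lookup but doing it against a write-heavy, sharded swipe store without two simultaneous swipes racing each other.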

Core Features of Tinder

Before diving into the architecture, it helps to understand exactly what the system needs to do. Tinder’s feature set is wider than most people realize.

Read on →