Building Minute93: A Real-Time Football Platform Built to Scale

Chapter 1

What Minute93 Does

At its core, Minute93 is a football intelligence platform. You can check live scores as matches happen, see goals and cards and substitutions appear the moment they occur, browse league standings, look up teams and players, and search across the entire dataset with fuzzy matching.

The frontend is a Next.js app. The backend is a NestJS API behind an Nginx reverse proxy, with Kafka handling event streaming and Redis managing caching and real-time delivery. Load testing is done with k6. Data comes from API-Football, which provides real-time match data for leagues like the Champions League, La Liga, and the Premier League.

But the interesting part is not what the app does. It is how the data moves through the system.

Chapter 2

The Data Pipeline

Everything starts with a poller, a background worker written in Node.js that hits the API-Football endpoint on a schedule. During live matches it polls every 30 seconds; when nothing is happening, it polls every 5 minutes. It runs per league, so Champions League matches get their own polling cycle separate from La Liga or the Premier League.
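A minimal sketch of that scheduling rule, under the assumption that the poller only needs to know how many matches are live in a league. The function and constant names are hypothetical; only the 30-second and 5-minute values come from the description above.

```typescript
// Hypothetical sketch: pick the next polling delay for one league.
const LIVE_INTERVAL_MS = 30 * 1000; // 30 seconds while a match is live
const IDLE_INTERVAL_MS = 5 * 60 * 1000; // 5 minutes when nothing is live

function pollIntervalMs(liveMatchCount: number): number {
  return liveMatchCount > 0 ? LIVE_INTERVAL_MS : IDLE_INTERVAL_MS;
}

// Each league gets its own delay, based only on its own live matches.
function nextDelays(liveCountByLeague: Record<string, number>): Record<string, number> {
  const delays: Record<string, number> = {};
  for (const league of Object.keys(liveCountByLeague)) {
    delays[league] = pollIntervalMs(liveCountByLeague[league]);
  }
  return delays;
}
```

With one live Champions League match and nothing live in La Liga, the first league would be polled again in 30 seconds and the second in 5 minutes.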

When the poller picks up new data, the first thing it does is check for duplicates. Every event (a goal, a card, a substitution) gets an ID built from the match ID, event type, minute, and player name. That ID gets added to a Redis set with SADD, which returns the number of members that were actually added. If Redis returns 0, the event was already seen, and the poller drops it. If it returns 1, the event is new, and it goes into Kafka.
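Here is a rough sketch of that duplicate check. The field names, the `seen-events` key, and the ID separator are assumptions for illustration, and the store is an in-memory stand-in that mirrors the return value of Redis SADD rather than a real client (which would be asynchronous).

```typescript
// Shape of an incoming event; field names are assumptions for illustration.
interface MatchEvent {
  matchId: number;
  type: "goal" | "card" | "substitution";
  minute: number;
  player: string;
}

// In-memory stand-in for a Redis set. Like SADD with a single member,
// it returns 1 if the member was added and 0 if it already existed.
class InMemorySetStore {
  private sets = new Map<string, Set<string>>();

  sadd(key: string, member: string): number {
    let members = this.sets.get(key);
    if (!members) {
      members = new Set<string>();
      this.sets.set(key, members);
    }
    if (members.has(member)) return 0;
    members.add(member);
    return 1;
  }
}

// Build the dedup ID from match ID, event type, minute, and player name.
function eventId(e: MatchEvent): string {
  return [e.matchId, e.type, e.minute, e.player].join(":");
}

// True only the first time a given event is seen.
function isNewEvent(store: InMemorySetStore, e: MatchEvent): boolean {
  return store.sadd("seen-events", eventId(e)) === 1;
}
```

Only events for which `isNewEvent` returns true would be forwarded to Kafka; anything else is dropped at the door.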

The Kafka topic is called match.events, and this is where things get interesting.

Chapter 3

Four Consumers, One Event Stream

When an event lands in match.events, four independent Kafka consumer groups pick it up. Each one does something different with the same event, and none of them know about each other.

Postgres Writer

Persists everything to Postgres. Writes or updates match rows with the current score, status, and minute, and stores individual events like goals, cards, and substitutions.

Cache Updater

Updates Redis with live score data, cached with a 5-minute TTL. Invalidates standings caches on every match update so the next request fetches fresh data.

Stats Aggregator

Watches for goals and match-end events. Fetches fresh standings from API-Football and upserts them to keep standings accurate in near real-time.

SSE Publisher

Pushes every event to a Redis Pub/Sub channel. Each match gets its own channel, so clients watching one match never receive noise from another.

This is the core advantage of the fan-out pattern. One event enters the system, and four different things happen with it independently. If the Postgres writer falls behind, it does not slow down the SSE publisher. If the cache updater has a hiccup, standings still get written to the database. Each consumer can be scaled, restarted, or debugged without touching the others.
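To make the independence concrete, here is a toy model of one topic read by several consumer groups. It is not Kafka client code: it is an in-memory log with one read offset per group, which is the property that lets a slow consumer fall behind without holding back a fast one. The group names are hypothetical.

```typescript
// A toy append-only log with one read offset per consumer group.
class ToyTopic<T> {
  private log: T[] = [];
  private offsets = new Map<string, number>();

  publish(event: T): void {
    this.log.push(event);
  }

  // Return up to `max` unread events for this group and advance only that group's offset.
  poll(groupId: string, max: number): T[] {
    const start = this.offsets.get(groupId) ?? 0;
    const batch = this.log.slice(start, start + max);
    this.offsets.set(groupId, start + batch.length);
    return batch;
  }

  // Number of events this group has not read yet.
  lag(groupId: string): number {
    return this.log.length - (this.offsets.get(groupId) ?? 0);
  }
}

const topic = new ToyTopic<string>();
["goal", "card", "substitution"].forEach((e) => topic.publish(e));

// The SSE publisher keeps up; the Postgres writer is slow and reads one event at a time.
const sseBatch = topic.poll("sse-publisher", 10);
const pgBatch = topic.poll("postgres-writer", 1);
```

After these two polls the SSE publisher has no lag while the Postgres writer still has two events to catch up on, and neither affects the other's position.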

Chapter 4

How Live Updates Reach the Browser

The live update path is worth walking through because the hot path completely bypasses both Postgres and the main API.

When you open a live match on Minute93, your browser opens an EventSource connection to the SSE endpoint. On the server side, the NestJS controller subscribes to the Redis Pub/Sub channel for that match. When the SSE Publisher consumer pushes an event to that channel, Redis delivers it to the subscriber, which pushes it down the SSE stream to your browser.

Your browser parses the event and updates the UI: the score changes, a new event appears in the timeline, the match minute ticks forward. When the match status changes to finished, the client closes the stream and fetches the final match data via a normal REST call.

Browser (EventSource)
    |
    v
GET /matches/:id/stream
    |
    v
NestJS SSE Controller -- subscribes to --> Redis Pub/Sub (match:{id}:events)
                                               ^
                                               | publishes to
Kafka (match.events) -- read by --> SSE Publisher Consumer

During a live match, the SSE path generates zero database queries. If you had 10,000 people watching the same match, the database would not feel it at all. The load stays on Redis, which is built for exactly this kind of workload.
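The per-match channel idea can be sketched without Redis at all. The class below is an in-memory stand-in for Pub/Sub, and the helper names are hypothetical. The channel name follows the diagram above, and the frame format is the standard Server-Sent Events wire format (a `data:` line followed by a blank line).

```typescript
type Listener = (message: string) => void;

// In-memory stand-in for Redis Pub/Sub: a message reaches only that channel's subscribers.
class ToyPubSub {
  private channels = new Map<string, Set<Listener>>();

  // Returns an unsubscribe function, to be called when the SSE connection closes.
  subscribe(channel: string, listener: Listener): () => void {
    let listeners = this.channels.get(channel);
    if (!listeners) {
      listeners = new Set<Listener>();
      this.channels.set(channel, listeners);
    }
    listeners.add(listener);
    const subscribed = listeners;
    return () => {
      subscribed.delete(listener);
    };
  }

  // Returns how many subscribers received the message.
  publish(channel: string, message: string): number {
    const listeners = this.channels.get(channel);
    if (!listeners) return 0;
    listeners.forEach((l) => l(message));
    return listeners.size;
  }
}

function channelFor(matchId: number): string {
  return `match:${matchId}:events`;
}

// One SSE frame: a "data:" line terminated by a blank line.
function toSseFrame(payload: unknown): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

const bus = new ToyPubSub();
const framesForMatch1: string[] = [];
const framesForMatch2: string[] = [];
bus.subscribe(channelFor(1), (m) => framesForMatch1.push(toSseFrame(JSON.parse(m))));
bus.subscribe(channelFor(2), (m) => framesForMatch2.push(toSseFrame(JSON.parse(m))));
bus.publish(channelFor(1), JSON.stringify({ type: "goal", minute: 93 }));
```

The viewer of match 1 receives one frame and the viewer of match 2 receives nothing. No database call appears anywhere on this path.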

Chapter 5

Redis is Doing Four Different Jobs

One thing I am proud of in this design is how Redis serves four completely different purposes, each one a textbook pattern used in production systems everywhere.

Cache-Aside

Live scores and standings get cached with short TTLs. Most read requests hit Redis first and only touch Postgres on a cache miss.

Pub/Sub

Powers live update delivery. Kafka consumers publish events to per-match channels, and SSE controllers subscribe to them.

Deduplication

The poller pushes event IDs into a Redis set with a 24-hour TTL. If an event ID already exists, it gets dropped.

Rate Limiting

A fixed-window counter using INCR and EXPIRE keeps any single client from hammering the server.

Four patterns, one Redis instance, zero overlap between them.
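Two of these patterns are small enough to sketch side by side. What follows is an illustration under assumptions, not the project's code: `ToyRedis` is an in-memory stand-in with an injected clock, the key names are hypothetical, and a real Redis client would be asynchronous. Note that INCR plus EXPIRE on its own gives a fixed-window counter: the count resets when the key expires.

```typescript
// In-memory stand-in for the few Redis commands used here; time is injected for determinism.
class ToyRedis {
  private data = new Map<string, { value: string; expiresAt: number }>();
  private now: () => number;

  constructor(now: () => number) {
    this.now = now;
  }

  get(key: string): string | null {
    const entry = this.data.get(key);
    if (!entry || entry.expiresAt <= this.now()) {
      this.data.delete(key);
      return null;
    }
    return entry.value;
  }

  set(key: string, value: string, ttlMs: number): void {
    this.data.set(key, { value, expiresAt: this.now() + ttlMs });
  }

  // Like INCR: a missing key counts from 0 and has no expiry until expire() is called.
  incr(key: string): number {
    const current = this.get(key);
    const next = (current === null ? 0 : Number(current)) + 1;
    const expiresAt = current === null ? Infinity : this.data.get(key)!.expiresAt;
    this.data.set(key, { value: String(next), expiresAt });
    return next;
  }

  expire(key: string, ttlMs: number): void {
    const entry = this.data.get(key);
    if (entry) entry.expiresAt = this.now() + ttlMs;
  }
}

// Cache-aside: read the cache first, call the loader only on a miss, then populate the cache.
function getOrLoad(redis: ToyRedis, key: string, ttlMs: number, load: () => string): string {
  const cached = redis.get(key);
  if (cached !== null) return cached;
  const fresh = load();
  redis.set(key, fresh, ttlMs);
  return fresh;
}

// Fixed-window rate limit: the first request in a window starts the expiry clock.
function allowRequest(redis: ToyRedis, clientId: string, limit: number, windowMs: number): boolean {
  const key = `rate:${clientId}`;
  const count = redis.incr(key);
  if (count === 1) redis.expire(key, windowMs);
  return count <= limit;
}
```

With a 5-minute TTL, repeated reads of the same live score call the loader (the Postgres query, in the real system) once per five minutes no matter how many requests arrive in between.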

Chapter 6

Search & The Poller's Tricks

The search feature uses PostgreSQL's pg_trgm extension for fuzzy matching. GIN trigram indexes on player names and team names let the similarity operator use an index instead of scanning every row, so searching for “Ramos” will find results even if you type “Ramoss” or “Ramo.” Results from both the players and teams tables get combined, sorted by similarity score, and returned as a unified list. It is fast, it is forgiving of typos, and it runs entirely inside Postgres without needing a separate search engine.
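To show why those typos still match, here is an approximation of how trigram similarity is computed. It is a simplified ASCII-only re-implementation for illustration, not what runs in production (that is Postgres itself). It assumes pg_trgm's documented behavior of lowercasing, padding each word with two leading spaces and one trailing space, and dividing shared trigrams by the total number of distinct trigrams.

```typescript
// Extract the set of trigrams from a string, roughly the way pg_trgm does.
function trigrams(text: string): Set<string> {
  const result = new Set<string>();
  const words = text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((w) => w.length > 0);
  for (const word of words) {
    const padded = `  ${word} `; // two leading spaces, one trailing space
    for (let i = 0; i + 3 <= padded.length; i++) {
      result.add(padded.slice(i, i + 3));
    }
  }
  return result;
}

// Similarity = shared trigrams / distinct trigrams across both strings.
function similarity(a: string, b: string): number {
  const ta = trigrams(a);
  const tb = trigrams(b);
  let shared = 0;
  ta.forEach((t) => {
    if (tb.has(t)) shared++;
  });
  const union = ta.size + tb.size - shared;
  return union === 0 ? 0 : shared / union;
}
```

Under this model, "Ramos" and "Ramoss" share 5 of 8 distinct trigrams, for a score of 0.625, which is comfortably above pg_trgm's default threshold of 0.3. "Ramos" and "Messi" share none.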

The polling logic handles a few edge cases that are easy to miss. When a match disappears from the live feed, the poller does not just assume it is over. It tracks previously-live match IDs in memory per league. If a match was live in the last poll but is gone from the current one, the poller fetches that specific match by ID to figure out the real final status. Was it full time? Extra time? Penalties? This matters.
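The disappearing-match check boils down to a set difference per league. A minimal sketch, with hypothetical names and plain numeric match IDs assumed:

```typescript
// IDs that were live in the previous poll but are missing from the current one.
// These are the matches that need a by-ID lookup to learn their real final status.
function vanishedMatchIds(previouslyLive: Set<number>, currentlyLive: Set<number>): number[] {
  const vanished: number[] = [];
  previouslyLive.forEach((id) => {
    if (!currentlyLive.has(id)) vanished.push(id);
  });
  return vanished;
}

// In-memory tracking, one set of live IDs per league.
const liveByLeague = new Map<string, Set<number>>();

// Record the latest poll for a league and return the matches that dropped out of the feed.
function recordPoll(league: string, liveIds: number[]): number[] {
  const previous = liveByLeague.get(league) ?? new Set<number>();
  const current = new Set<number>(liveIds);
  liveByLeague.set(league, current);
  return vanishedMatchIds(previous, current);
}
```

Because the state lives in memory, it is lost if the poller restarts. The time-based sweep described next covers that gap.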

On every poll cycle, the poller also checks for any matches still marked as live whose kickoff was more than 2.5 hours ago. Almost no football match is still being played that long after kickoff, so a match in that state is most likely stale. When the poller finds one, it resolves the final status through Kafka if possible, or falls back to updating Postgres directly. This guarantees that no match can ever be stuck in the live tab indefinitely.
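The sweep itself is a simple filter. In this sketch the `LiveMatch` shape and the `"live"` status string are assumptions; the 2.5-hour cutoff is the one described above.

```typescript
const MAX_LIVE_MS = 2.5 * 60 * 60 * 1000; // 2.5 hours after kickoff

// Assumed shape of a match row for this illustration.
interface LiveMatch {
  id: number;
  status: string;
  kickoff: Date;
}

// Matches still marked live even though kickoff was more than 2.5 hours ago.
function staleLiveMatches(matches: LiveMatch[], now: Date): LiveMatch[] {
  return matches.filter(
    (m) => m.status === "live" && now.getTime() - m.kickoff.getTime() > MAX_LIVE_MS
  );
}
```

Each match returned here would then have its final status resolved, first through the normal event path and otherwise by a direct database update.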

Chapter 7

Database & Infrastructure

The Postgres schema has 12 tables and a couple of materialized views. The tables cover users, leagues, teams, players, matches, match events, lineups, analytics, and more. Indexing is designed around the actual access patterns. B-tree indexes on match status, kickoff time, and league ID handle the filtered queries that power the main pages. GIN trigram indexes handle search. JSONB columns store match statistics and event details where a flexible schema makes more sense than strict columns.

The backend API runs on 2 CPU cores with 4 GB of RAM. Postgres has 0.5 CPU and 1 GB of RAM. Redis has 1 GB of memory and 1,000 connections. Kafka is on a managed cloud tier that has been rock solid. The frontend is on a free hosting tier.

That is the whole stack. No Kubernetes, no multi-region deployment, no managed container orchestration. Just a handful of services, each sized for the job.

Chapter 8

Load Testing: The Part Where Things Broke

Building a system that works under zero load is easy. The real question was whether Minute93 could handle match-day traffic, and the only way to answer that was to throw simulated users at it.

I used k6 and designed tests around five realistic user behavior scenarios. 45% of virtual users were casual viewers who check live scores, glance at a match, look at standings, and leave. 25% were live match watchers who pick a match and poll for updates every few seconds for minutes at a time. 15% were explorers browsing teams, players, and results. 10% were searchers using the fuzzy search feature. 5% were power users drilling deep into multiple matches, events, and lineups.
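One way to express that split is a weighted pick over cumulative percentages. This is a sketch with hypothetical scenario names, not the actual test script; in a k6 script the roll could come from `Math.random() * 100`, and k6's own `scenarios` option is another way to divide virtual users.

```typescript
interface Scenario {
  name: string;
  weight: number; // percentage of virtual users
}

const scenarios: Scenario[] = [
  { name: "casual_viewer", weight: 45 },
  { name: "live_watcher", weight: 25 },
  { name: "explorer", weight: 15 },
  { name: "searcher", weight: 10 },
  { name: "power_user", weight: 5 },
];

// Map a roll in [0, 100) to a scenario by walking the cumulative weights.
function pickScenario(roll: number): string {
  let cumulative = 0;
  for (const s of scenarios) {
    cumulative += s.weight;
    if (roll < cumulative) return s.name;
  }
  // Fallback for a roll at or above 100.
  return scenarios[scenarios.length - 1].name;
}
```

A roll of 50 lands in the 45 to 70 band and produces a live match watcher; a roll of 97 produces a power user.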

The test patterns simulated real match-day traffic too. Pre-match ramp as users arrive before kickoff. A sudden spike when the match starts. Sharp surges when goals are scored. A halftime dip. A second-half return. A gradual cooldown after the final whistle.
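In k6, a traffic shape like this is usually described as a list of stages, each with a `duration` and a `target` number of virtual users. The builder below is a sketch: the durations and the halftime ratio are illustrative placeholders rather than the values from the real tests, and the function name is hypothetical.

```typescript
// Same shape as a k6 stage: ramp linearly to `target` virtual users over `duration`.
interface Stage {
  duration: string;
  target: number;
}

// Durations are illustrative placeholders, not the values used in the actual tests.
function matchDayStages(base: number, spike: number): Stage[] {
  return [
    { duration: "5m", target: base }, // pre-match ramp
    { duration: "30s", target: spike }, // kickoff spike
    { duration: "10m", target: base }, // first half settles
    { duration: "30s", target: spike }, // goal surge
    { duration: "5m", target: Math.round(base * 0.5) }, // halftime dip
    { duration: "10m", target: base }, // second-half return
    { duration: "5m", target: 0 }, // cooldown after the final whistle
  ];
}
```

Calling `matchDayStages(500, 800)` would produce a profile that sits at 500 virtual users and spikes to 800, which is the shape of the final test described later in this chapter.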

The Baseline Tests

The first round of tests ran at moderate load, around 200 virtual users, with varying durations. The results were clean. 100% pass rates, sub-second median latencies, zero errors. The system worked exactly as designed.

The Ambitious Tests

Then I pushed it. 3,000 concurrent virtual users, simulating a full Champions League evening. The system collapsed. Error rates above 90%. Requests timing out at 30 seconds. One test had to be aborted less than a minute in because the server was already unresponsive.

The Diagnosis

Here is the thing though. The failure was not architectural. Nothing about the design broke. The Kafka consumers kept doing their jobs. The Redis patterns still worked. The SSE path still functioned. What happened was purely a resource issue: the CPU was completely saturated, and requests piled up waiting for compute time that was not there.

The Proof

So I designed a final test at the ceiling of what the infrastructure could handle. 500 sustained virtual users with spikes to 800. Same match-day patterns. Same user behavior scenarios. Same chaos.

The result: 100% checks passed. Zero percent error rate. Zero failed requests. Every pattern held. Kafka fan-out, Redis cache-aside, Pub/Sub for SSE delivery, event deduplication. All of it working exactly as intended under sustained load.

The only thing that ran hot was the CPU, and that is not a design problem. That is an infrastructure sizing knob.

Chapter 9

What This Proves About the Architecture

The whole point of this exercise was to answer a question: does this architecture actually scale, or does it just work when nobody is using it?

Stateless API layer

NestJS handles requests without storing session state in memory. You can put more instances behind a load balancer and each one handles its share. No sticky sessions, no shared state between processes.

Cache absorbs read traffic

With Redis sitting in front of Postgres, 80-90% of requests at scale would never touch the database. Most traffic is live scores and standings, cached with short TTLs and served in sub-millisecond reads.

Per-consumer scaling

Because each Kafka consumer group is independent, you can scale the one that is lagging without touching the others. If SSE delivery is slow, add more SSE publisher instances.

Live path skips the database

Kafka to Redis Pub/Sub to the browser. During a live match with thousands of viewers, the database does not see any of that traffic.

Tuned access patterns

Smart indexing means the database does less work per query, so each unit of CPU goes further.

If you wanted to take this to true scale (10,000+ concurrent users), you would add a load balancer, Postgres read replicas, a CDN for caching API responses at edge locations, a Redis cluster for more memory and connections, and more Kafka partitions for parallel event processing. None of that requires code changes. The architecture already supports all of it because the API is stateless, the consumers are idempotent, the Redis patterns work across clusters, and the database uses proper indexes.

The proof is in the test results. Under sustained load with realistic traffic patterns, the system hit 100% success with zero errors. The architecture did not break. It just asked for more CPU. And that is exactly what good distributed system design looks like.

Chapter 10

Why I Built This

I read system design articles, case studies, and scaling postmortems all the time. But reading about how someone else handled 50,000 concurrent users is very different from actually sitting in front of a terminal, watching your own system fall over at 3,000, and figuring out why.

A lot of us work at startups. And at a startup, you do not get to practice this stuff. The traffic is not there yet. You build features, ship fast, and hope the architecture holds when growth finally comes. But when it does come, you are suddenly expected to know how to handle it, and learning on the fly with real users and real money on the line is not a great position to be in.

Minute93 was my way of getting ahead of that. I wanted to build something real, throw serious load at it, watch it break, diagnose the failures, and prove that the underlying design was sound. So that when the day comes at an actual job where traffic doubles overnight or a product goes viral, I already know what levers to pull.

Beyond the system architecture, this project is also a showcase of my frontend and backend development skills end to end, and of a real workflow around building with AI coding agents. I had defined engineering standards at my workplace to ensure code quality, maintainability, and reliability, regardless of whether code is written by a human or generated by AI. Using structured prompting techniques, maintaining context across sessions, and knowing when to lean on AI tooling and when to think through a problem yourself is a skill in itself, and it is one that compounds over time.

This was not just a portfolio project. It was practice for the real thing.

The name Minute93 comes from Sergio Ramos' header against Atletico Madrid in the 2014 Champions League final. 93rd minute, his team trailing, the trophy slipping away. He did not score that goal by accident. He scored it because he had put in the work before. That is the whole idea behind this project. Do the work now, so when the real 93rd minute comes, you do not have to think. You just know what to do.

Read this article on Medium

This article is also published on Medium. Read it there, share it, or explore the codebase on GitHub.