How does Netflix know what you want to watch before you do?

Length: 8 min

Published: April 29, 2025

Did you know Netflix has a huge team of researchers, and that up to 80% of what you watch on Netflix is shaped by their title recommendation system? Have you ever wondered how it works?

The recommendations you see are the result of powerful recommendation models. Originally, each section, for example "Continue Watching" and "Top Picks for You", had its own model. Each model drew data from the same sources as the others, but was trained separately. Maintaining and improving the individual models was getting harder and more expensive.

This year, Netflix is starting to move towards a unified, comprehensive system. It's building a powerful foundation model that understands user behaviour and preferences and can share that knowledge across all recommendation systems.

From many models to one supermodel

Originally, Netflix had a bunch of smaller models, each trained on its own. One remembered what you liked in action movies, another recommended shows that were currently popular. But the models didn't talk to each other. That caused problems, especially whenever the models needed to be updated or improved.

Netflix's new approach draws on how large language models (LLMs for short) work. Instead of building lots of small models, Netflix now builds one big one that understands your watching habits as a whole. It then helps the other systems by sharing what it has learned, either directly or through reusable embeddings.

Tokenization, or turning watching habits into tokens

Netflix is a professional stalker. It watches your every interaction: what you watch, for how long, what you skip, even on what device and in what language. But raw, unlabeled data alone isn't enough. So Netflix converts these interactions into tokens, units of behaviour such as "watched Stranger Things for 40 minutes on my phone tonight".

The model is fed these tokens so it learns how users behave over time. This is where the next challenge comes in: users do a lot of things, so the sequences get long. Netflix has to decide how much detail to keep while still processing the data quickly.
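To make the idea concrete, here is a minimal sketch of turning interactions into tokens. Everything in it is an assumption for illustration: the `Interaction` fields, the duration buckets, and the token format are made up, and Netflix's real tokenization is not described in this article. It also shows the detail-versus-speed trade-off in miniature: watch time is coarsened into a few buckets, and only the most recent events are kept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """One user event. The fields are hypothetical, chosen for illustration."""
    title: str
    action: str      # e.g. "play" or "skip"
    minutes: float   # how long the user watched
    device: str      # e.g. "phone" or "tv"


def duration_bucket(minutes: float) -> str:
    """Coarsen watch time so the token vocabulary stays small."""
    if minutes < 5:
        return "short"
    if minutes < 30:
        return "medium"
    return "long"


def to_token(event: Interaction) -> str:
    """Turn one interaction into a single behaviour token."""
    title = event.title.lower().replace(" ", "_")
    return "|".join([event.action, title, duration_bucket(event.minutes), event.device])


def tokenize(history, max_len=3):
    """Keep only the most recent max_len events to bound sequence length."""
    return [to_token(event) for event in history[-max_len:]]


history = [
    Interaction("Stranger Things", "play", 40, "phone"),
    Interaction("Some Trailer", "play", 3, "tv"),
]
tokens = tokenize(history)
print(tokens)
```

Finer buckets or a longer history would keep more detail, but at the cost of a larger vocabulary and longer sequences to process.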

The model learns like a person, not just like a machine

As we mentioned, Netflix took inspiration from LLMs that predict the next word, or token. But Netflix wants to predict the next action a user might take. And there are lots of actions, so it has to give them different weights. Watching a full movie, for example, carries more weight and meaning than watching a three-minute trailer. So the model learns to sense what matters, which lets it recommend shows you might like more accurately.
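One way to picture "different weights" is a next-action loss in which each training example is scaled by how meaningful the action was. The sketch below is only an illustration under assumptions: the action names and the weight values in `ACTION_WEIGHTS` are invented, and the article does not say what loss Netflix actually uses.

```python
import math

# Hypothetical weights: finishing a movie counts ten times more than a trailer.
ACTION_WEIGHTS = {"finished_movie": 1.0, "watched_trailer": 0.1}


def softmax(scores):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_next_action_loss(examples):
    """Weighted cross-entropy over next-action predictions.

    Each example is (scores, target_index, action_type), where scores are the
    model's raw scores for every candidate next action.
    """
    total_loss = 0.0
    total_weight = 0.0
    for scores, target, action_type in examples:
        weight = ACTION_WEIGHTS[action_type]
        prob_of_target = softmax(scores)[target]
        total_loss += -weight * math.log(prob_of_target)
        total_weight += weight
    return total_loss / total_weight


# The model strongly prefers candidate 0 in both cases.
scores = [5.0, 0.0]
wrong_on_movie = [(scores, 1, "finished_movie"), (scores, 0, "watched_trailer")]
wrong_on_trailer = [(scores, 0, "finished_movie"), (scores, 1, "watched_trailer")]

loss_movie = weighted_next_action_loss(wrong_on_movie)
loss_trailer = weighted_next_action_loss(wrong_on_trailer)
print(loss_movie > loss_trailer)
```

Getting a full movie wrong costs the model far more than getting a trailer wrong, so training pushes it towards the signals that matter.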

Solving the "new show" problem

When a new movie or series comes out and no one has seen it yet, how can Netflix start recommending it?

It handles this in two ways:

  1. Incremental training. New titles get initial embeddings based on similar existing titles in the database, and are then gradually ranked based on real interactions from users.
  2. Metadata. Even though no one has seen the show yet, the model knows the genre, the language, and the mood, and uses that to judge where best to place it.

That way, brand-new shows can show up in your recommendations on day one. From then on, though, a show is ranked by how users interact with it.
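The two ideas above can be combined in a small sketch: give a new title a starting embedding by averaging the embeddings of existing titles, weighted by how much metadata they share. This is an assumption-heavy illustration, not Netflix's method. The catalog entries, the tags, the two-dimensional vectors, and the choice of Jaccard overlap as the weight are all invented here.

```python
def jaccard(a, b):
    """Overlap between two tag sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cold_start_embedding(new_tags, catalog):
    """Average existing embeddings, weighted by metadata overlap.

    catalog maps a title to a (tags, embedding) pair.
    """
    dims = len(next(iter(catalog.values()))[1])
    total = [0.0] * dims
    weight_sum = 0.0
    for tags, embedding in catalog.values():
        weight = jaccard(new_tags, tags)
        weight_sum += weight
        total = [t + weight * x for t, x in zip(total, embedding)]
    if weight_sum == 0:
        # No shared metadata at all: fall back to the plain catalog average.
        vectors = [embedding for _, embedding in catalog.values()]
        return [sum(col) / len(vectors) for col in zip(*vectors)]
    return [t / weight_sum for t in total]


# Hypothetical catalog with toy 2-dimensional embeddings.
catalog = {
    "Existing Title A": ({"sci-fi", "english", "dark"}, [1.0, 0.0]),
    "Existing Title B": ({"comedy", "english", "light"}, [0.0, 1.0]),
}

new_show = cold_start_embedding({"sci-fi", "english", "dark"}, catalog)
unknown = cold_start_embedding({"documentary"}, catalog)
print(new_show, unknown)
```

The starting vector is only a first guess. Once real viewing data arrives, incremental training would move it to wherever user behaviour says it belongs.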

Embeddings, the secret ingredient

Embeddings are like digital fingerprints of each show, user, or genre. They capture subtle patterns of behaviour and preference. Netflix then shares these vectors with its other tools, for example to find similar shows, predict your next watch, or personalise your homepage.
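Finding similar shows with embeddings usually comes down to comparing vectors. Here is a minimal sketch using cosine similarity. The show names and the three-dimensional vectors are toy values made up for this example, and real embeddings have far more dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity: close to 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, embeddings, k=2):
    """Rank every other title by similarity to the query title."""
    scored = [
        (title, cosine(embeddings[query], vector))
        for title, vector in embeddings.items()
        if title != query
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy embeddings, invented for illustration.
embeddings = {
    "show_a": [0.9, 0.1, 0.0],
    "show_b": [0.8, 0.2, 0.1],
    "show_c": [0.0, 0.1, 0.9],
}

ranking = most_similar("show_a", embeddings)
print(ranking)
```

In this toy data, `show_b` ranks above `show_c` because its vector points in nearly the same direction as `show_a`.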

But there's a catch. The embeddings change every time the model is re-trained. So Netflix uses special mathematical transformations that take old embeddings and turn them into new ones. The vectors stay as stable as possible, and the other systems can keep working with them.
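The article doesn't say which transformations Netflix uses, so the following is just one simple way to picture the idea. Assume retraining left the relationships between titles intact but rotated the whole embedding space. A rotation fitted on titles present in both versions can then map one space onto the other. The sketch works in two dimensions, where the best-fitting rotation has a closed form. For real, high-dimensional embeddings the analogous tool is an orthogonal Procrustes fit. The function names and data are hypothetical.

```python
import math


def rotate(vec, theta):
    """Rotate a 2-D vector counter-clockwise by theta radians."""
    x, y = vec
    c, s = math.cos(theta), math.sin(theta)
    return (x * c - y * s, x * s + y * c)


def fit_rotation(source, target):
    """Angle of the rotation that best maps source onto target (least squares)."""
    dot = sum(sx * tx + sy * ty for (sx, sy), (tx, ty) in zip(source, target))
    cross = sum(sx * ty - sy * tx for (sx, sy), (tx, ty) in zip(source, target))
    return math.atan2(cross, dot)


# Embeddings of three titles before retraining (toy values).
previous = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# Simulate a retrain that rotated the whole space by 0.7 radians.
retrained = [rotate(v, 0.7) for v in previous]

# Fit the map on the shared titles, then bring the retrained vectors back
# into the coordinate frame that downstream systems already understand.
theta = fit_rotation(retrained, previous)
aligned = [rotate(v, theta) for v in retrained]
print(theta, aligned)
```

After alignment, a downstream system that was built against the earlier vectors sees almost the same numbers, even though the model underneath has been retrained.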

Conclusion

Netflix's goal is that, ideally, you don't have to search for anything at all. It tries to discover things for you while taking your preferences into account. Those preferences form from how you behave on Netflix, but also from how users with similar histories behave.

Their foundation model is a significant step towards a single system instead of many small tools. It rests on centralising data, drawing on the principles of LLMs, and using embeddings.

The model learns better, adapts faster, and gives better recommendations. Just as large language models changed the way we work with text, this approach can transform how recommender systems work. What does it mean for us? More accurate recommendations and more shows we actually want to watch, without having to search for them.

