How Netflix keeps you watching
By Bartek Bezemer
Google Deepmind
14 March 2024

Netflix uses clever ways to keep you watching, from tailored thumbnails to recommendations in search.

Netflix uses a toolbox of techniques to keep you watching. Over the years it has perfected the art of thumbnail A/B-testing and fine-tuned its recommendation algorithm to grab and keep your attention. These optimizations have helped keep subscribers on board and get them to sign in day after day.

Netflix Thumbnails

An important part of getting users to discover the next piece of content is delivering the right visual cue to get the click. Engineering teams at Netflix have been perfecting this art by delivering tailored experiences to Netflix members. A subtle but important piece is delivering custom thumbnails that entice users to watch a new episode or movie.

In 2016, Nick Nelson, Head of Product Creative at Netflix, detailed how the team was trying to grab the attention of its members before they leave to do something else. Nelson highlighted that if Netflix fails to capture a user’s attention within 90 seconds, the chance that the member loses interest increases, and the user ultimately opts for another activity. Nelson noted that the brain needs only 13 milliseconds to process an image. This means the team has to deliver a captivating thumbnail that immediately piques the interest of a Netflix member.

The team started experimenting with artwork in 2014. Netflix ran consumer research studies and discovered that artwork was not only one of the biggest influences on deciding which content to watch, but also captured 82 percent of members’ attention when browsing the service. In other words, text-based information was taking a backseat to images.

Furthermore, the team found that members only spent an average of 1.8 seconds considering content presented to them. Nelson commented that this even surprised the team at Netflix. They weren’t aware that Netflix users used so little time to consider their next watch. 

The Short Game

To better understand the power of thumbnails, engineers at Netflix used the release of its original documentary “The Short Game” to study image delivery and its effect on user behavior, a practice known in the industry as A/B-testing. In the technical version of Nelson’s blog, Gopal Krishnan, Director of Creative Production & Promotion Engineering at Netflix, noted that the impact of artwork had been largely unexplored, both in the movie industry and at Netflix itself.

The comment is interesting, as vast sums are poured into promoting blockbusters and TV shows, yet little research had been done into how promotional art affects audience behavior. Whether this was due to the industry’s conservativeness, reluctance or unawareness, one can only speculate. Either way, Netflix was set to drastically redefine the meaning and importance of promotional artwork and to catapult A/B-testing into a decades-old industry.

Krishnan pointed out that studios designed promotional images for billboards, separate from the existing material. Other images were meant for DVD covers, which followed a grid layout that wasn’t scalable across devices such as wide-screen televisions and portrait-oriented smartphones.

The team at Netflix wanted to redefine the purpose of promotional images across its service. Krishnan and the team went to work on a data framework that would select the best artwork per video and fit within the context of the overall Netflix experience. This meant the engineering team could only run A/B-tests incrementally. However, by running many tests, the team could quickly move on to the next one.

For the documentary The Short Game, three different pieces of artwork were used. Engineers measured the click-through rate (CTR) per thumbnail, aggregate play duration, the fraction of plays with short duration, the fraction of content viewed and much more. Simply put, generating the click was important, but the thumbnail should also keep users locked in.
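To make those metrics concrete, here is a minimal Python sketch of how they could be computed per thumbnail variant. It is not Netflix’s code: the Impression record, its field names and the five-minute cutoff for a "short" play are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Impression:
    """One time a thumbnail was shown to a member (hypothetical record)."""
    variant: str            # which artwork was shown
    clicked: bool           # did the member start playback from this tile?
    minutes_played: float   # 0.0 when there was no click
    runtime_minutes: float  # total length of the title


SHORT_PLAY_MINUTES = 5.0  # assumed cutoff for a "short" play


def summarize(impressions):
    """Return CTR, play time, short-play share and fraction viewed per variant."""
    stats = {}
    for imp in impressions:
        s = stats.setdefault(
            imp.variant,
            {"shown": 0, "plays": 0, "minutes": 0.0, "short": 0, "viewed": 0.0},
        )
        s["shown"] += 1
        if imp.clicked:
            s["plays"] += 1
            s["minutes"] += imp.minutes_played
            if imp.minutes_played < SHORT_PLAY_MINUTES:
                s["short"] += 1
            s["viewed"] += min(imp.minutes_played / imp.runtime_minutes, 1.0)

    report = {}
    for variant, s in stats.items():
        plays = s["plays"]
        report[variant] = {
            "ctr": plays / s["shown"],
            "play_minutes": s["minutes"],
            "short_play_fraction": s["short"] / plays if plays else 0.0,
            "avg_fraction_viewed": s["viewed"] / plays if plays else 0.0,
        }
    return report


# Invented sample: variant B gets more clicks, variant A holds attention longer.
sample = [
    Impression("A", True, 90.0, 100.0),
    Impression("A", False, 0.0, 100.0),
    Impression("A", False, 0.0, 100.0),
    Impression("A", False, 0.0, 100.0),
    Impression("B", True, 2.0, 100.0),
    Impression("B", True, 50.0, 100.0),
    Impression("B", False, 0.0, 100.0),
    Impression("B", False, 0.0, 100.0),
]
report = summarize(sample)
print(report)
```

In this made-up sample, variant B wins on clicks while variant A wins on how much of the title is actually watched, which is the tension between the click and the lock-in described above.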

You might have experienced this yourself when browsing platforms such as YouTube: a captivating title and a strong thumbnail, only for the content to fall flat. All these signals are pinged back to the database that ranks the video, for better or for worse. Krishnan’s team had to strike the right balance between captivating thumbnails and keeping the user engaged with the content.

The results showed significant performance differences between the images, with some generating between 6 and 14 percent more engagement than the default artwork. Krishnan notes that onlookers could point out a flaw in the reasoning behind such simple A/B-tests: Netflix might merely be pushing users from one thumbnail to another, drawing viewing hours away from other content.
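As a rough illustration of what a "14 percent" difference means and how one might check that it is not noise, the sketch below computes a relative lift and a standard two-proportion z-test. The click counts are invented, and the article does not say which statistical method Netflix actually used, so treat this as one common textbook approach rather than the team’s method.

```python
import math


def relative_lift(ctr_variant, ctr_default):
    """Relative improvement of a variant over the default artwork."""
    return (ctr_variant - ctr_default) / ctr_default


def two_proportion_z(clicks_a, shown_a, clicks_b, shown_b):
    """Two-sided z-test for a difference between two click-through rates."""
    p_a = clicks_a / shown_a
    p_b = clicks_b / shown_b
    pooled = (clicks_a + clicks_b) / (shown_a + shown_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / shown_a + 1 / shown_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented numbers: default artwork vs. one alternative thumbnail.
default_clicks, default_shown = 1000, 20000   # 5.0 percent CTR
variant_clicks, variant_shown = 1140, 20000   # 5.7 percent CTR

lift = relative_lift(variant_clicks / variant_shown, default_clicks / default_shown)
z, p_value = two_proportion_z(default_clicks, default_shown,
                              variant_clicks, variant_shown)
print(f"lift={lift:.1%}, z={z:.2f}, p={p_value:.4f}")
```

Note that a test like this only compares thumbnails for a single title. It says nothing about whether the extra clicks were taken from other titles, which is exactly the objection Krishnan raises.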

The psychology behind images

Krishnan’s point is valid, as many who venture into the realm of A/B-testing for the first time will run isolated tests and make adjustments based on a limited subset of data. This can be countered through multivariate tests. However, even in these more complex tests, a flawed hypothesis will lead the team running the experiment to draw faulty conclusions. To counterbalance early findings, the team ran multiple tests across a wide variety of titles.

Engineers at Netflix played around with every aspect of a thumbnail. Subtle changes were tested, such as title localization, new-episode badging and the background image itself, which in certain tests was enhanced or replaced with a high-resolution alternative. All this data was fed back into the system to deliver tailored thumbnails to each user.

Different facial expressions elicit different emotional responses

Running these tests delivered valuable insights, not only for Netflix but also for the A/B-testing community and the movie industry as a whole. Nelson highlighted that it was commonly known that faces elicit a response in humans. But even small variations in facial expression could provoke different behavior. This makes thumbnail selection especially difficult, as many factors influence the interpretation of an image.

Additionally, the team found that thumbnail effectiveness differed per region. Nelson pointed out how different nationalities favored different thumbnails of Sense8. This might seem obvious to many working or entering the industry today, but less than a decade ago, this was a novel concept and baffled many within the field of marketing.  

During other tests, the team discovered that showing antagonists resulted in increased engagement. This effect is not limited to Netflix thumbnails. Similar findings have come out of app icon tests. Adding faces to thumbnails and adjusting colors and other content cues can result in massive conversion improvements.

Recommendations in search

In September 2021, Netflix published how it uses the search option to recommend new and interesting content to its users. Sudarshan Lamkhede and Christoph Kofler open their research by highlighting that the Netflix homepage delivers personalized recommendations based on a user’s viewing behavior and that of similar users. This is an efficient way for users to discover new content without putting in extra effort, similar to how YouTube immediately suggests content when you open the app. But the team at Netflix wanted to extend this experience to moments when users themselves were actively searching for titles or things to watch.

Matching content to a query seems easy, but behind it lie fundamental principles and technologies that ensure the right content is served to users. Lamkhede and Kofler point out that users who search on Netflix have a specific intent, similar to users of a search engine. These intents range from users knowing exactly what they are searching for to users not knowing what to look for and using search as an exploration tool to discover new content.

The researchers bring forward the example of a Netflix user searching for Sonic. This query can be served immediately with videos that contain Sonic the Hedgehog, alongside content related to Sonic. The same principle extends to users who search for Adam Sandler: movies and series that directly feature the actor are shown. Additionally, results are displayed that feature actors similar to Adam Sandler.

These queries and suggestions operate in the realm of the abstract, which the researchers and engineers at Netflix have to address in order to satisfy users. Keyword matching is the easiest layer for the video streaming service to handle, as videos can be linked to content directly matching the keyword. The recommendations, however, meaning the videos that appear alongside the results for the user’s initial intent, are more difficult to determine. This is what the researchers call the query context.

One might wonder why Netflix would bother adding related content to search results to begin with. Lamkhede and Kofler, however, argue that providing recommended content related to the search query removes friction from the product experience. First, it helps handle the unavailability problem, which many Netflix customers will undoubtedly have experienced. When a search query cannot return the requested content because it is not available on the platform, the service can soften the user’s disappointment by recommending content that might satisfy them.

By adding this recommended content to the search section, dead ends are avoided, keeping the user engaged. Additionally, related titles can help users navigate to their content when they have forgotten details such as the title of the movie or the name of the actor. This takes away friction on the user side, as users can find their content through a multitude of alternatives. Lastly, it signals to the user that much more content is available to watch, beyond the recommendations on the homepage.
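The idea of returning both direct matches and related titles, and of falling back to related titles when the requested one is unavailable, can be sketched in a few lines of Python. Everything here is hypothetical: the toy catalog, the tag-overlap scoring and the search function are assumptions for illustration, and a real system would rely on learned models rather than hand-written tags.

```python
from dataclasses import dataclass, field


@dataclass
class Title:
    name: str
    available: bool
    tags: set = field(default_factory=set)


# Toy catalog with invented titles and availability.
CATALOG = [
    Title("Sonic Film", False, {"sonic", "video game", "family"}),
    Title("Sonic Series", True, {"sonic", "video game", "animation"}),
    Title("Platformer Cartoon", True, {"video game", "animation"}),
    Title("Kart Racers", True, {"video game"}),
    Title("Baking Contest", True, {"reality", "food"}),
]


def search(query, catalog, max_related=3):
    """Return (direct matches, related recommendations) for a query."""
    q = query.lower().strip()
    matches = [t for t in catalog if q in t.name.lower() or q in t.tags]
    direct = [t for t in matches if t.available]

    # Build a crude "query context" from every matched title, including
    # unavailable ones, so a dead-end query still hints at the intent.
    context = set()
    for t in matches:
        context |= t.tags

    related = [
        t for t in catalog
        if t.available and t not in matches and t.tags & context
    ]
    # Titles sharing more tags with the context rank higher.
    related.sort(key=lambda t: len(t.tags & context), reverse=True)
    return direct, related[:max_related]


direct, related = search("sonic", CATALOG)
print([t.name for t in direct], [t.name for t in related])

# The only title matching this query is unavailable, so the member
# gets related titles instead of an empty page.
dead_end_direct, dead_end_related = search("sonic film", CATALOG)
print([t.name for t in dead_end_direct], [t.name for t in dead_end_related])
```

The second query shows the unavailability case: no direct result exists, yet the page is not empty, which is the friction the researchers want to remove.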

Netflix Recommendations

Delivering the perfect thumbnail and matching search intent with appropriate content are pieces of a grand puzzle: the recommendation algorithm. This algorithm lies at the heart of Netflix and its ability to keep users engaged on the platform. The blog post in which engineers Xavier Amatriain and Justin Basilico outline the recommendation system is already over a decade old, published in April 2012. However, the details give us a rare glimpse into how the team at Netflix laid the foundations that would redefine how video streaming customers receive their content.

Amatriain and Basilico wind back the clock to 2006, when Netflix announced the Netflix Prize. The competition with a $1 million reward was set to find the best machine learning and data mining model to predict movie ratings. The aim for the team at Netflix was to improve the accuracy of its Cinematch rating system by 10 percent, which would further enhance its recommendation system to members. 

The competition revolved around the root mean squared error (RMSE), a measure of how far predicted ratings fall from the ratings members actually gave. For a non-mathematician this doesn’t necessarily ring a bell, but the point remains that the team was looking for a new methodology to ensure Netflix members would receive better recommendations. During the initial run, none of the competing teams was able to achieve the desired 10 percent improvement. However, the Korbell team managed to improve on Cinematch’s accuracy by 8.43 percent, besting all competing teams.
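For readers who want to see what RMSE looks like in practice, here is a small Python sketch. The ratings and predictions are invented; only the RMSE formula itself and the idea of expressing progress as a relative improvement come from the Netflix Prize setup described above.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual star ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def improvement(baseline_rmse, new_rmse):
    """Relative RMSE reduction, the quantity the prize targeted at 10 percent."""
    return (baseline_rmse - new_rmse) / baseline_rmse


# Invented example: four ratings and two sets of predictions.
actual = [4, 3, 5, 2]
baseline_predictions = [3, 3, 4, 4]
new_predictions = [4, 3, 4, 3]

baseline = rmse(baseline_predictions, actual)
challenger = rmse(new_predictions, actual)
print(f"baseline={baseline:.3f}, challenger={challenger:.3f}, "
      f"improvement={improvement(baseline, challenger):.1%}")
```

Because errors are squared before averaging, a prediction that is off by two stars is penalized four times as heavily as one that is off by a single star.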

The Korbell team poured more than 2,000 hours into developing the combination of 107 algorithms that got them the win. Two of these algorithms delivered the largest leaps: Matrix Factorization and Restricted Boltzmann Machines. Amatriain and Basilico noted that the models devised by the Korbell team had some drawbacks. They were built to handle only 100 million ratings, far fewer than the 5 billion ratings Netflix was already handling at the time. Despite these drawbacks, the models proved valuable additions to the Cinematch program.
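To give a feel for matrix factorization, the sketch below learns a small vector of hidden "taste" factors per user and per title, so that their dot product approximates a rating. It is a generic textbook version trained with stochastic gradient descent on invented ratings, not the Korbell team’s or Netflix’s implementation, and the hyperparameters are arbitrary assumptions.

```python
import math
import random

# Toy (user, title, stars) triples, invented for illustration.
# Users 0 and 1 like titles 0 and 1; users 2 and 3 like titles 2 and 3.
RATINGS = [
    (0, 0, 5), (0, 1, 5), (0, 2, 1),
    (1, 0, 5), (1, 1, 4), (1, 3, 1),
    (2, 2, 5), (2, 3, 5), (2, 0, 1),
    (3, 2, 4), (3, 3, 5), (3, 1, 1),
]


def train(ratings, n_users, n_items, n_factors=2,
          epochs=500, lr=0.05, reg=0.02, seed=0):
    """Learn user and title factor vectors with stochastic gradient descent."""
    rng = random.Random(seed)
    user_f = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
              for _ in range(n_users)]
    item_f = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
              for _ in range(n_items)]
    mean = sum(r for _, _, r in ratings) / len(ratings)

    for _ in range(epochs):
        for u, i, r in ratings:
            pred = mean + sum(a * b for a, b in zip(user_f[u], item_f[i]))
            err = r - pred
            for f in range(n_factors):
                pu, qi = user_f[u][f], item_f[i][f]
                # Move both vectors to shrink the error, with a small
                # regularization penalty to keep the factors from growing.
                user_f[u][f] += lr * (err * qi - reg * pu)
                item_f[i][f] += lr * (err * pu - reg * qi)
    return mean, user_f, item_f


def predict(model, user, item):
    """Predicted rating: global mean plus the user-title dot product."""
    mean, user_f, item_f = model
    return mean + sum(a * b for a, b in zip(user_f[user], item_f[item]))


def rmse(model, ratings):
    errors = [(r - predict(model, u, i)) ** 2 for u, i, r in ratings]
    return math.sqrt(sum(errors) / len(errors))


model = train(RATINGS, n_users=4, n_items=4)
mean_only = (RATINGS[0][2] * 0 + model[0], [[0.0]] * 4, [[0.0]] * 4)
print("RMSE predicting the mean:", round(rmse(mean_only, RATINGS), 3))
print("RMSE after factorization:", round(rmse(model, RATINGS), 3))
print("Unseen pair (user 0, title 3):", round(predict(model, 0, 3), 2))
```

The interesting part is the last line: user 0 never rated title 3, yet the model can still produce a prediction for that pair from what similar users did, which is the whole point of the technique.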

One might wonder why Netflix ventured outward to improve its recommendation system. Amatriain and Basilico explained that Netflix itself had changed drastically. It had gone from an offline DVD postal service to a video streaming service with millions of customers, redefining digital content consumption. While that pace was very challenging to keep up with, for Netflix engineers it unlocked a treasure trove of data that could propel its service beyond its competitors.

The DVD postal service, they noted, sent a selection of DVDs to customers, who could enjoy series and movies over multiple days and weeks. Streaming, however, drastically shifted this dynamic. Netflix members who logged into the service wanted to watch something immediately. Additionally, users could now enjoy new content across a wide variety of devices, which increased competition.

Consumers could watch series on the Xbox 360, on the iPhone or on Android devices. Netflix had to come up with something that would keep users on the service, ensuring they wouldn’t venture out to competitors. We can see its latest attempt to capitalize on the attention economy in Netflix Games. Amatriain and Basilico pointed out that 75 percent of what members watched came from some form of recommendation.

As Netflix engineers kept tinkering with algorithms to serve content to customers, they found that recommendations had become an integral part of the service. Recommendations kept improving over time as customers fed more data into the system, producing an organically evolving selection of content. Each time members opened Netflix, they would be treated to a new experience and could jump right in where they left off.

The Netflix Homepage

These elements all come together on the Netflix homepage. The visual design has changed drastically since 2012, but the principles of how the page is structured still apply today. When a user logged back in, Netflix displayed a top 10 selection, which consisted of recommendations for the members of the household. This row is long gone, replaced by a top 10 selection for the region, but it shows how Netflix back then used a social aspect to push content to its members.

Tiles themselves were enhanced with relevant information about the movie or series. The Cinematch score still consisted of a 5-star rating, which today is replaced by a match score expressed as a percentage. Once a tile was expanded, users would be informed about friends who had already watched the series. This feature, which relied on Facebook Connect, has also disappeared. Amatriain and Basilico commented that it allowed Netflix to push content to members that was already circulating within their social circles.

Content was divided into genres, a feature we still see today. Netflix added three layers of personalization to the genre rows: the choice of the genre itself, the titles selected within the genre and the ranking of those titles. The genre rows have proven to be a valuable addition to the Netflix service, as users came to this section to discover more of the content they loved.

Global recommendations

Paradoxically, the sophistication behind the recommendation algorithm gives the individual user a sense of uniqueness, turning Netflix into a living organism that caters to the individual’s wants and needs. But in actuality, across the board, members had a lot in common, as highlighted by Vice President, Product Innovation at Netflix, Carlos Gomez-Uribe in a blog post published in February 2016. 

As the team started to roll out its ever-evolving recommendation system to users across the globe, it discovered that the personalization algorithm delivered similar suggestions to individual users around the world, revealing that tastes were eerily similar across subsections of its subscriber base. This could in part be traced back to the algorithm itself, which pools users into groups with similar interests and delivers recommendations popular within that community.

Gomez-Uribe highlights the anime community, which allowed engineers at Netflix to recommend content to members with similar interests. The team at Netflix discovered that anime was, unsurprisingly, widely popular in Japan. However, only 10 percent of anime viewers on Netflix were situated in Japan, meaning there was a thriving global community for anime content. Optimizing recommendations only for anime’s domestic audience would severely limit the reach of the genre as a whole.

Netflix has been optimizing the recommendation system ever since it gave a glimpse into its ranking methodology. In August 2022, Ehtsham Elahi described how the team used reinforcement learning to deliver recommendations to budget-constrained Netflix members. Elahi noted that this poses significant challenges for recommendation algorithms, which have only limited resources to deliver strong suggestions to users.

The suggestions have to be orders of magnitude better than the alternatives the customer has to choose from. Elahi and his team go into greater detail, elaborating on the models available to optimize recommendations for this user cohort. Sparing you the details, the effort engineers put into optimizing for this customer segment shows how important the recommendation system is to Netflix’s business.

Driving Business Value

The impact of improved recommendations is further elaborated by Gomez-Uribe in a 2016 paper describing the business potential of the recommender system and its vital importance in growing Netflix into what it has become today. Gomez-Uribe commented that the team considers its own recommender system a part of its core business. He added that personalization enables Netflix to find audiences for niche content, which runs against the principles of traditional linear television broadcasters, who have to deliver content that captures large audiences so that advertisements reach as many people as possible.

Having this customized recommender system in place helps Netflix spread viewing more evenly across its catalog. This means users don’t have to leave the service when content is available that piques their interest. Netflix measures the effectiveness of its recommendations through what it calls the effective catalog size (ECS). This metric helps Netflix calculate how viewing is spread across the items in its catalog.
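The article does not give the formula behind the metric. The Python sketch below assumes the definition used in the Gomez-Uribe paper as best recalled here: sort titles by their share of viewing hours, from largest to smallest, and compute ECS = 2 × (sum of share × rank) − 1. The viewing hours are invented.

```python
def effective_catalog_size(hours_by_title):
    """ECS under the assumed definition: 2 * sum(share_i * rank_i) - 1,
    with titles ranked from most-watched (rank 1) to least-watched."""
    total = sum(hours_by_title)
    if total <= 0:
        raise ValueError("need a positive total of viewing hours")
    shares = sorted((h / total for h in hours_by_title), reverse=True)
    return 2 * sum(rank * share for rank, share in enumerate(shares, start=1)) - 1


# Invented viewing hours for a four-title catalog.
one_hit = effective_catalog_size([100, 0, 0, 0])        # everyone watches one title
evenly_spread = effective_catalog_size([25, 25, 25, 25])  # viewing spread equally
skewed = effective_catalog_size([70, 20, 10, 0])        # somewhere in between
print(one_hit, evenly_spread, skewed)
```

Under this definition the value runs from 1, when all viewing goes to a single title, up to the number of titles in the catalog, when viewing is spread perfectly evenly. A recommender that helps niche titles find their audience should therefore push the number up.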

The ECS can reveal which content is popular across the entire Netflix base, which in turn signals to the recommender system that this content should be pushed to a wider spectrum of members. Gomez-Uribe notes that by enhancing recommendations, the service has a higher success rate of delivering good content. This leads to higher engagement with the product, such as increased streaming hours, and to reduced churn, meaning fewer subscription cancellations.

Failing to deliver strong content can be detrimental to business performance. This is something Disney+ is struggling with, seeing millions of users leave the service. While this might not be directly attributable to the recommender systems at the service, failing to provide strong content consistently will hurt a business in the mid and long term. Gomez-Uribe points out that Netflix’s churn figures are low and that the primary cause for members canceling their subscription is payment failure rather than other, more explicit reasons.

Technological prowess

The ability of Netflix to deliver strong, customized recommendations year after year, seamlessly across multiple devices, could without a doubt be called a technological marvel. But this didn’t come easy. The engineers at Netflix have poured countless hours into fine-tuning algorithms to deliver the most engaging thumbnail, expand the search section and deliver a customized homepage every time a member logs in.

Many of these things will go unnoticed by the average Netflix user, who takes these technological leaps for granted. On the flip side, it shows how important it is for Netflix to keep expanding its customization services to prevent users from moving to a competitor. All these incremental changes signal to customers that much more content is available and there’s no need to switch to alternatives.

Extending this to other consumer-facing companies, the Netflix case study shows how important customization can be in growing a company. Adding customization to the customer journey, whether by adding the first name to an email or rewarding a user for their trust in the service, contributes to a seemingly tailored experience. Marketers create a bond with the customer through subtle cues.

Bartek Bezemer graduated in Communications (BA) at the Rotterdam University of Applied Sciences, Netherlands. He has worked in the digital marketing field for over a decade at companies home to the largest corporations in the world.
