Building the Future of Content: Inside Booking.com’s Intelligent Content Enrichment Platform

Oh, the sweet nineties. Back then, the internet had only simple tags for pictures, and booking your summer vacation meant calling a random hotel and hoping for the best. It’s now 2024, and both things have become much smarter. How smart, you ask? Read on to discover how Booking.com uses advanced machine learning algorithms to enhance content selection, delivering a more personalized customer experience.

In this article, we’ll examine the architecture of Booking.com’s Content Intelligence Platform, a system designed to maximize the use of photos and text.

The Evolution of Content-Related ML in Booking.com

Users of Booking.com engage with content from the moment they land on the site until the moment of truth: booking their next vacation. The homepage greets you with pictures of trending destinations. When you seek unbiased info about a property, reviews from other guests await. And, of course, no one books a stay without checking the room’s pictures first. The property owner can either select all this content or let AI select it (can you guess which option brings better results? ;) ).

As Machine Learning models gained popularity, more teams added ML features to their products. One team used several quality models to rearrange images to increase diversity and better represent users. Another team used a mix of models to suggest improved photo selections to hotel owners. However, many teams lacked the resources to develop similar models and couldn’t reuse the existing ones, which left their needs unmet. This highlighted the need for accessible, generic Machine Learning models that could serve multiple teams.

To address these needs, a dedicated team has managed a Model-as-Service platform, providing real-time and batch access to various ML models. However, invoking these models in real time increased latency and complexity, and required precise knowledge of each model’s input specifications. This highlighted the necessity for a more efficient solution: a platform that could collect and persist model results offline in a simple and accessible manner. Enter the Content Intelligence Platform.

An Overview of the Platform

Today, when a Booking.com team wants to integrate a content-related ML model into their product, all they need to do is choose one. The platform serves hundreds of different models for pictures and text via an API (and through other consumption channels, as we’ll soon see).

The platform serves offline results, meaning the models don’t receive requests directly from the app flow. The data flow is somewhat inverted: every photo or piece of text that enters Booking.com is broadcast through the company’s systems for general use via Kafka. The platform consumes this live stream and runs all the relevant ML models, one after the other, on each input. We then persist the results for each piece of content in a relational DB (the specific DB varies per use case).

This means that when a client fetches ML-based information from the platform, the response is fast, because the result has already been computed and only needs to be looked up. Models can be integrated into every part of the app or website without impacting performance or depending on model availability.
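
To make the inverted flow concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the platform’s actual code: the Kafka stream is simulated with a list of JSON strings, the relational DB is an in-memory SQLite database, and the model functions, the registry, the table layout and the event fields (`id`, `type`, `width`, `height`, `text`) are all hypothetical names invented for this example.

```python
import json
import sqlite3
from typing import Callable, Dict, Iterable

# Hypothetical stand-ins for real ML models. Each takes a content event
# (a dict) and returns a numeric score.
def aspect_ratio(event: dict) -> float:
    return event["width"] / event["height"]

def word_count(event: dict) -> float:
    return float(len(event["text"].split()))

# Assumed registry: which models are relevant for which content type.
MODELS_BY_TYPE: Dict[str, Dict[str, Callable[[dict], float]]] = {
    "photo": {"aspect_ratio": aspect_ratio},
    "text": {"word_count": word_count},
}

def create_store() -> sqlite3.Connection:
    """Create an in-memory relational store for model results."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results ("
        "content_id TEXT, model TEXT, score REAL, "
        "PRIMARY KEY (content_id, model))"
    )
    return conn

def ingest(messages: Iterable[str], conn: sqlite3.Connection) -> None:
    """Offline path: run every relevant model per event and persist the results."""
    for raw in messages:
        event = json.loads(raw)
        # Content types with no registered models are simply skipped.
        models = MODELS_BY_TYPE.get(event["type"], {})
        for name, model in models.items():  # one model after the other
            conn.execute(
                "INSERT OR REPLACE INTO results VALUES (?, ?, ?)",
                (event["id"], name, model(event)),
            )
    conn.commit()

def fetch_results(conn: sqlite3.Connection, content_id: str) -> Dict[str, float]:
    """Read path: a plain lookup by content ID; no model is invoked here."""
    rows = conn.execute(
        "SELECT model, score FROM results WHERE content_id = ?",
        (content_id,),
    ).fetchall()
    return dict(rows)

if __name__ == "__main__":
    store = create_store()
    # Simulated stream; a real consumer would read these from a Kafka topic.
    stream = [
        '{"id": "p1", "type": "photo", "width": 1600, "height": 800}',
        '{"id": "t1", "type": "text", "text": "sea view room"}',
    ]
    ingest(stream, store)
    print(fetch_results(store, "p1"))
    print(fetch_results(store, "t1"))
```

Note that `fetch_results` never touches a model: the cost of inference is paid once, at ingestion time, which is why a client read stays fast even if a model is slow or temporarily unavailable.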