AI-Powered Content
Building the Future of Content: Inside Booking.com’s Intelligent Content Enrichment Platform
Discover the system architecture that enables seamless content enrichment across Booking.com.
Intro
Oh, the sweet nineties. Back then, the internet had only simple <img> tags for pictures, and booking your summer vacation meant calling a random hotel and hoping for the best. It’s now 2024, and both things have become much smarter. How smart, you ask? Read on to discover how Booking.com applies advanced machine learning algorithms to enhance content selection, delivering a more personalized customer experience.
In this article, we’ll examine the architecture of Booking.com’s Content Intelligence Platform, a system designed to maximize the use of photos and text.
The Evolution of Content-Related ML in Booking.com
Users of Booking.com engage with content from the moment they land on the site until the moment of truth: booking their next vacation. The homepage greets you with pictures of trending destinations. When you seek unbiased info about a property, reviews from other guests await. And, of course, no one books a stay without checking the room’s pictures first. The property owner can either select all this content or let AI select it (can you guess which option brings better results? ;) ).
As Machine Learning models gained popularity, more teams added ML features to their products. One team used different quality models to rearrange images to increase diversity and better represent users. Another team used a mix of models to suggest improved photo selections to hotel owners. However, many teams lacked the resources to develop similar models and couldn’t reuse the existing ones, which left needs unmet. This highlighted the need for accessible, generic Machine Learning models that could serve multiple teams.

To address these needs, a dedicated team built and managed a Model-as-a-Service platform, providing real-time and batch access to various ML models. However, invoking these models in real time added latency and complexity, and required precise knowledge of each model’s input specifications. This highlighted the necessity for a more efficient solution: a platform that could collect and persist model results offline in a simple, accessible manner. Enter the Content Intelligence Platform.
An Overview of the Platform
Today, when a Booking.com team wants to integrate a content-related ML model into their product, all they need to do is choose. The platform serves hundreds of different models for pictures and text via an API (and other ways to consume, as we’ll soon see).
The platform serves offline results, meaning the models don’t receive requests directly from the app flow. The data flow is somewhat inverted: every photo or piece of text that enters Booking.com is broadcast through the company’s systems via Kafka for general use. The platform consumes this live stream and runs every relevant ML model on each piece of content, one after the other. We then persist the results for each piece of content in a relational DB (the specific DB varies per use case).
This means that when a client fetches ML-based information from the platform, response time is very fast. Models can be integrated into every part of the app or website without impacting performance or being dependent on model availability.

From a consumer’s perspective, the platform is as simple as it gets. To receive ML enhancements for a photo, for example, all it takes is the photo’s unique ID in Booking.com’s ecosystem and the name of the model you’re interested in. There is no need to know the model’s inner workings, such as what its features are or what its actual input looks like.
With this information in hand, you can simply send an HTTP request to the platform and receive the enhanced data in response. For example, users can send a photo ID and ask for the “Image Captioning” model results for it. In response, they get a string that describes the photo. One neat way this model is used is to generate alt text for images on the site to help with page accessibility.
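To make this concrete, here is a minimal sketch of what such a client call could look like. The endpoint URL, parameter names, and response shape below are illustrative assumptions, not the platform’s real API:

```python
import json
from urllib.parse import urlencode

# Hypothetical base URL for the platform's results API (an assumption).
BASE_URL = "https://content-intelligence.example.internal/v1/results"

def build_request_url(photo_id: int, model_name: str) -> str:
    """Build the query URL for one photo ID and one model name."""
    return f"{BASE_URL}?{urlencode({'photo_id': photo_id, 'model': model_name})}"

def parse_caption(response_body: str) -> str:
    """Extract the caption string from a (hypothetical) JSON response."""
    return json.loads(response_body)["result"]

url = build_request_url(12345, "image-captioning")
# A response might look like:
# {"photo_id": 12345, "model": "image-captioning",
#  "result": "A sunlit hotel room with a sea view"}
```

Because the results are precomputed, a call like this is a simple key lookup on the platform’s side rather than a live model invocation.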
Data Streaming — What? Why? How?
At the core of our platform stands a data streaming pipeline, which allows us to process live streams of content. The pipeline gets streams of photos and text from several sources via Kafka. It then runs a series of ML models on each piece of content and saves the results.
We use Apache Flink to implement our streaming pipeline. It is a framework for stateful computations over unbounded (infinite) and bounded data streams. Flink provides (among other things) three necessary abstractions:
- Sources define inputs for the pipeline. Source types may vary; a classic source type is a Kafka topic.
- Operators are the building blocks defining how data transforms and moves around within the pipeline. If you’ve used Java streams before, you’ll be familiar with some of the actions performed with operators, such as filter, map, and collect.
- Sinks are terminal operators that determine where we store the results of the pipeline. Sinks can write to databases but also create unbounded output streams like a Kafka topic, for example.
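The three abstractions above can be illustrated with plain Python generators; this is only a conceptual toy, a real pipeline would use the Flink DataStream API:

```python
# Toy illustration of Flink's three core abstractions using Python generators.

def source(records):
    """Source: yields a stream of input records (e.g., a Kafka topic)."""
    yield from records

def op_filter(stream, predicate):
    """Operator: drop records that fail the predicate."""
    return (r for r in stream if predicate(r))

def op_map(stream, fn):
    """Operator: transform each record."""
    return (fn(r) for r in stream)

def sink(stream, store):
    """Sink: terminal step that persists the results (here, a list)."""
    store.extend(stream)

results = []
photos = source([{"id": 1, "ok": True}, {"id": 2, "ok": False}])
valid = op_filter(photos, lambda p: p["ok"])
scored = op_map(valid, lambda p: {**p, "score": 0.9})
sink(scored, results)
# results now holds only the valid photo, enriched with a score
```

Note that, as in Flink, nothing is computed until the sink pulls records through the chain: sources and operators are lazy.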

Before diving further, let’s take a small step back — why even bother to set up a seemingly complex data streaming platform? For example, why not create a classic batch-processing periodic job in Apache Spark?
The main advantage of data streaming is performance. Periodic jobs suffer from latency of minutes, hours, or even days, depending on the data size and its partitioning. With data streaming, you can expect latency of milliseconds. Another big appeal of data streaming is that it allows us to process tiny pieces of information instead of querying big batches. This makes our code lightweight, scalable, and simple to monitor. It also improves failure recovery.
A Platform with Two Faces
The data streaming pipeline we built operates in two modes: Realtime and Backfill. The pipeline consumes content IDs (photo IDs or text IDs) in Realtime via the aforementioned Kafka topics. These are the Sources of our Flink pipelines.
Once we have an ID, we download the corresponding picture or text from dedicated services. Then we call a series of ML models for inference, one after the other, in an asynchronous manner. The models we serve may be deployed on-premises, on a public cloud, or, in some cases, be third-party models.
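A rough sketch of this step, under the assumption that each model call is an async network request: models run one after the other for a given item, while different content items can be processed concurrently. The model names and the inference stub are illustrative:

```python
import asyncio

async def infer(model: str, content: bytes) -> dict:
    """Stand-in for an async call to a deployed model (an assumption)."""
    await asyncio.sleep(0)  # placeholder for network latency
    return {"model": model, "score": len(content) % 10}

async def run_models(content: bytes, models: list[str]) -> list[dict]:
    """Run the models sequentially for one piece of content."""
    results = []
    for model in models:  # one after the other, as in the pipeline
        results.append(await infer(model, content))
    return results

async def process_batch(items: list[bytes], models: list[str]) -> list[list[dict]]:
    """Different content items can still be processed concurrently."""
    return await asyncio.gather(*(run_models(c, models) for c in items))
```

In the real pipeline, Flink’s async I/O operators play a similar role, keeping throughput high while individual model calls are in flight.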
For each model, we receive some kind of result; some models return a simple numerical score, for example, the attractiveness of an image. Other models return tags or descriptions. We stream those results forward using Sinks. Depending on the use case, our sinks typically pour the data into relational databases and Kafka topics. Some models return embeddings, which require different treatment and are stored in vector databases. Data is also usually replicated to a data lake, where our ML scientists prefer to find it for research and experiments.
The Backfill mode operates similarly to Realtime. Most of the flow is the same, except for the Sources. They are now limited to one managed source: a topic that’s populated periodically. A process scans the company’s main images and texts database for the latest entries and compares them with our calculated results tables. If we find a diff, meaning some content doesn’t have results in our platform, we queue the missing IDs for backfill. Gaps can occur due to model failures or other factors beyond our control. There are usually only a few items to backfill, but for data integrity, the platform must cover every ID in the system.
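The diff step itself boils down to a set difference between the main content store and the results tables. A minimal sketch, with illustrative names for the stores and queue:

```python
# Sketch of the backfill diff: compare IDs in the main content store with IDs
# that already have computed results, and queue whatever is missing.

def find_missing_ids(content_ids, result_ids):
    """Return content IDs that have no computed results yet."""
    return sorted(set(content_ids) - set(result_ids))

def enqueue_backfill(missing, queue):
    """Publish missing IDs to the managed backfill topic (here, a list)."""
    queue.extend(missing)

backfill_queue = []
missing = find_missing_ids(content_ids=[1, 2, 3, 4], result_ids=[1, 3])
enqueue_backfill(missing, backfill_queue)
# backfill_queue is now [2, 4]
```

In production this comparison would run as a periodic job against the databases rather than over in-memory lists, but the logic is the same.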

Backfill mode is also used to onboard new models, so that they produce results for historical content and not only for newly arriving content.
Flink enables us to run the exact same logic for Realtime and Backfill with little to no code duplication. All we need to do is define new sources and create another pipeline instance. This improves code reusability and enables us to introduce new models quickly in both modes.
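Conceptually, the reuse looks like this: one shared pipeline definition, parameterized only by its source and sink. This is a simplified sketch with illustrative names, not the actual pipeline code:

```python
# Sketch: one pipeline definition, two instances differing only in the source.

def build_pipeline(source_stream, store):
    """Shared processing logic, reused across Realtime and Backfill modes."""
    for content_id in source_stream:
        # Placeholder for downloading content and running the model chain.
        store.append({"id": content_id, "caption": f"caption-{content_id}"})

realtime_results, backfill_results = [], []
build_pipeline(iter([101, 102]), realtime_results)   # e.g., live Kafka topic
build_pipeline(iter([7, 8, 9]), backfill_results)    # e.g., managed backfill topic
```

In Flink terms, only the source definition changes between the two instances; the operators and sinks are shared.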
Wrapping Up
In conclusion, the Content Intelligence Platform empowers Booking.com to seamlessly integrate content-based machine learning models, enhancing the user experience with personalized and insightful content. By leveraging Flink for efficient data streaming, the platform processes and stores results from many sources, making them readily accessible via an API. This innovative approach has been embraced across hundreds of use cases within the company, transforming how content is curated and presented. As a result, Booking.com continues to lead the way in delivering a richer, more engaging experience for travelers worldwide, setting a new standard for intelligent content enrichment.