Myntra Store Federated Architecture Part 1: Automatic Layout Widget Reordering

Myntra has two categories of merchandising stores: category stores and brand stores.

The store operations team actively manages category and brand store root pages to run different campaigns in coordination with business and category teams. Some of the brands also manage their brand stores.

They change layouts, add new widgets to a layout, add new content to widgets, and change the positions of widgets.

Definitions

Page — It is the fully rendered content on the app for a specific URI.

Layout — It is a wireframe of the page. The layout consists of a collection of widgets. A page can have multiple layouts targeted to different sets of users.

Widget — A widget is a rendered UI component of the page. It has two parts: the UX component and the data component. The UX component defines the look and feel; the data component defines what data is shown inside the widget.

Static layouts

Our store page layouts were static: the order of widgets in a layout was predefined and fixed.

The store operations team decided which widget should be placed in which position, based on their experience of how different widgets performed.

Since Myntra has multiple business teams and home page real estate is limited, the teams coordinate and decide whether a business team's real estate visibility should change based on the business use case.

The above process was time-consuming and required coordination among multiple business teams. After alignment, widgets were configured manually.

The case for dynamic layouts

Myntra gets most of its traffic from mobile channels. Mobile has limited real estate due to its small screen size. On average, a user clicks only within the first two scrolls. With every additional scroll, a widget's visibility and engagement drop drastically. Showing the most relevant widgets at the top leads to higher user engagement.

Every user has different preferences and engagement levels for different types of widgets, so showing the globally best-performing widgets at the top for everyone, without personalisation, is not optimal for each individual user.

Model selection

Typical recommender systems that are trained continuously are plagued by a feedback loop: the next iteration of model training is guided by the training data generated by the current model in production. This is known as algorithmic or selection bias. It causes the newly trained model to act greedily and favour items that users have already engaged with. This behaviour is particularly harmful in personalised recommendations, as it can cause new ads/widgets to remain unexplored.

One of the toolkits for addressing this bias is bandits. For widget reordering, we explored a class of bandit algorithms called Contextual Multi-Armed Bandits (CMABs), particularly LinUCB and LinTS.

Overview of Multi-Armed Bandits (MAB)


  • The multi-armed bandit (MAB) technique is a type of reinforcement learning algorithm.
  • It provides a formal framework for balancing exploration and exploitation.
  • Imagine a casino where a player can play on different machines to earn rewards. There are k machines, and the player gets x turns, choosing one of the k machines in each turn.
  • Each machine k gives a reward Rk with some mean and standard deviation.
  • The player is allowed to play only x times and has to maximise the total reward across those plays.
  • Explore — The player plays on a machine that has not been tried before.
  • Exploit — The player plays on previously played machines, using knowledge of previous outcomes and favouring machines that have given higher average rewards.
  • Out of x turns, say 10% are explore turns and 90% are exploit turns. During an explore turn, the player randomly picks a machine that has not been explored yet. During an exploit turn, the player picks the machine with the maximum average reward seen so far (a minimal sketch of this loop follows this list).
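A minimal sketch of this explore/exploit loop as standard ε-greedy; the machine reward means and the 10% explore rate below are illustrative assumptions, not values from the post:

```python
import random

def epsilon_greedy(pull, k, turns, epsilon=0.1):
    """Play `turns` rounds over `k` machines, exploring with probability `epsilon`."""
    counts = [0] * k           # times each machine was played
    totals = [0.0] * k         # total reward observed per machine
    for _ in range(turns):
        if random.random() < epsilon:
            arm = random.randrange(k)                       # explore: pick a machine at random
        else:
            averages = [totals[i] / counts[i] if counts[i] else 0.0 for i in range(k)]
            arm = max(range(k), key=lambda i: averages[i])  # exploit: best average so far
        reward = pull(arm)
        counts[arm] += 1
        totals[arm] += reward
    return totals, counts

# Illustrative use: 5 machines with hidden mean rewards.
means = [0.2, 0.5, 0.3, 0.8, 0.1]
pull = lambda arm: random.gauss(means[arm], 0.1)
totals, counts = epsilon_greedy(pull, k=5, turns=1000)
```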

Contextual Multi-Armed Bandit

We used a contextual multi-armed bandit approach. A contextual MAB uses context to improve performance. For our use case, the context is the user feature vector.

By assuming a linear relationship between the expected reward and context, contextual bandit algorithms can enhance recommendations in cold-start scenarios with limited data.
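A minimal sketch of disjoint LinUCB, one of the CMAB algorithms named above, which assumes exactly this linear relationship between context and expected reward. The arm names, feature dimension, and alpha value are illustrative assumptions, not Myntra's production setup:

```python
import numpy as np

class LinUCBArm:
    """Disjoint LinUCB: one linear model per widget (arm)."""
    def __init__(self, dim, alpha=1.0):
        self.alpha = alpha
        self.A = np.eye(dim)        # A = I + sum of x x^T over observed contexts
        self.b = np.zeros(dim)      # b = sum of reward * x

    def score(self, x):
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b                                   # ridge-regression estimate
        return theta @ x + self.alpha * np.sqrt(x @ A_inv @ x)   # mean + exploration bonus

    def update(self, x, reward):
        self.A += np.outer(x, x)
        self.b += reward * x

# Illustrative ranking of widgets for one user context vector.
dim = 8
arms = {w: LinUCBArm(dim) for w in ["banner", "recently_viewed", "cross_sell"]}
user_context = np.random.rand(dim)
ranking = sorted(arms, key=lambda w: arms[w].score(user_context), reverse=True)
# After the impression, feed back a click (reward=1.0) or non-click (reward=0.0):
arms[ranking[0]].update(user_context, reward=1.0)
```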

Because the MAB optimises the reward function, it tends to rank all similar widgets with high user affinity at the top. A diversity layer is therefore required so that a user is not shown too many similar widgets.
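The post does not describe how the diversity layer works internally; one common approach, sketched here purely as an assumption, is a greedy re-rank that penalises widgets whose core theme has already been placed:

```python
def diversify(ranked_widgets, core_theme, penalty=0.3):
    """Greedy re-rank: each repeat of a core theme lowers a widget's effective score.

    `ranked_widgets` is a list of (widget_id, score) pairs from the bandit;
    `core_theme` maps widget_id -> its annotated core theme.
    """
    placed_themes = {}                    # core theme -> number of times already placed
    result = []
    remaining = dict(ranked_widgets)
    while remaining:
        best = max(
            remaining,
            key=lambda w: remaining[w] - penalty * placed_themes.get(core_theme[w], 0),
        )
        result.append(best)
        placed_themes[core_theme[best]] = placed_themes.get(core_theme[best], 0) + 1
        del remaining[best]
    return result
```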

Goals

Layout reranking can be optimised for one or more of the goals identified below.

  • CTR
  • Freshness
  • Diversity
  • Click Quality
  • Revenue

For a layout, the store operations team can select one or more of these goals, each with a weightage.
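How the selected goals and weightages combine into a single reward is not spelled out in the post; a minimal sketch, assuming each goal produces a normalised per-impression signal:

```python
def combined_reward(signals, weights):
    """Blend per-goal signals (each assumed normalised to [0, 1]) using the
    weightages configured for a layout."""
    assert abs(sum(weights.values()) - 1.0) < 1e-6, "weightages should sum to 1"
    return sum(weights[goal] * signals.get(goal, 0.0) for goal in weights)

# Illustrative layout configuration: CTR-dominant with some freshness.
weights = {"ctr": 0.7, "freshness": 0.2, "diversity": 0.1}
signals = {"ctr": 1.0, "freshness": 0.4, "diversity": 0.0}   # one observed impression
reward = combined_reward(signals, weights)
```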

High-level requirements

  • Auto-rank home page widgets with user affinity.

  • The solution should work for both logged-in and non-logged-in users.

  • The solution should be extendable to all other pages on the Myntra app.

  • In the initial phase, the scope was limited to optimizing widgets for the CTR goal, with the flexibility to extend to other goals in the future.

  • Identify features of the widget from the content.

Initially, we thought we could extract features of widgets from the content of the widget.

But there was a lot of variety in the domains of the content — the product page, list page, product recommendations, lists of banners, bank offers, ratings for a recent purchase, and recommendations for brands and article types.

With this approach, we would need to extract features from each content type. This seemed like a large effort with a slow time to market, so we discarded it.

  • Annotate each widget with feature metadata

With this approach, we labeled the widgets with feature metadata such as theme, core theme, and intent, which conveys most of the information about the widget: what type of widget it is and what type of content it is powering.

We came up with a curated list of all labels by working with the store business team. Each new widget must then select its core theme and intent from the curated label values. Adding new label values is admin-controlled and expected to happen very infrequently: the category team can request a new value, but the addition is made by the engineering admin after approval from the product manager and data science stakeholders.

Intent identifies the broad category of the widget. The core theme identifies the sub-category of the widget.
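A sketch of how this label metadata could be modelled; the enum values below are illustrative and not Myntra's actual curated list:

```python
from dataclasses import dataclass
from enum import Enum

class Intent(Enum):            # broad category of the widget
    BANNER = "banner"
    RECOMMENDATION = "recommendation"
    CTA_JOURNEY = "cta_journey"

class CoreTheme(Enum):         # sub-category within an intent
    MERCHANDISING = "merchandising"
    MONETIZATION = "monetization"
    BANK_OFFER = "bank_offer"
    RECENTLY_VIEWED = "recently_viewed"
    CROSS_SELL = "cross_sell"

@dataclass(frozen=True)
class WidgetLabels:
    widget_id: str
    intent: Intent             # mandatory at widget creation
    core_theme: CoreTheme      # mandatory at widget creation

labels = WidgetLabels("w_123", Intent.RECOMMENDATION, CoreTheme.CROSS_SELL)
```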

Different types of widgets

Banner Widgets

  • Merchandising banners
  • Monetization banners
  • Bank Offers

Recommendation widgets

  • Recently viewed
  • Cross-sell
  • Studio widgets
  • UGC widgets

CTA user journey widgets

  • Product Rating — rate the recently purchased product
  • Order Tracking — order tracking view of the recent order

Identification of features

We identified features of interest as below:

  • Product vector — product embedding generated from product catalogue definition.
  • User vector — weighted average of the vectors of all products with which the user has interacted. Every interaction type has a defined weight (a sketch follows this list).
  • Core user features — these features define the user profile based on the past purchase history of the user.
  • Widget features — core theme, intent, widget size, page context, creation date
  • User widget feature — a nested map with the user as the first-level key, the widget as the second-level key, and the click-through rate as the value.
  • For non-logged-in sessions, we use device vector and device widget features.
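A minimal sketch of the user vector computation described above; the interaction weights are hypothetical, since the actual per-interaction weights are internal:

```python
import numpy as np

# Assumed weights per interaction type; the real values are defined internally.
INTERACTION_WEIGHTS = {"view": 1.0, "add_to_cart": 3.0, "purchase": 5.0}

def user_vector(interactions, product_vectors):
    """Weighted average of product embeddings over a user's interactions.

    `interactions` is a list of (product_id, interaction_type) pairs;
    `product_vectors` maps product_id -> catalogue embedding (np.ndarray).
    """
    weighted_sum, total_weight = None, 0.0
    for product_id, kind in interactions:
        w = INTERACTION_WEIGHTS.get(kind, 0.0)
        contribution = w * product_vectors[product_id]
        weighted_sum = contribution if weighted_sum is None else weighted_sum + contribution
        total_weight += w
    return weighted_sum / total_weight if total_weight else None
```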

Type of features

  • Lifetime — these features are aggregated over the lifetime of data points for that entity.
  • Real-time feature — these features are aggregated over the last 24 hours of data points for that entity.

We used both lifetime and real-time features in our model, with a predefined weightage for each.
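A minimal sketch of blending a lifetime aggregate with its real-time counterpart; the 0.7/0.3 split is an assumed weightage, not the production value:

```python
def blended_feature(lifetime_value, realtime_value, lifetime_weight=0.7):
    """Blend a lifetime aggregate with its last-24h counterpart for the same entity."""
    return lifetime_weight * lifetime_value + (1.0 - lifetime_weight) * realtime_value

# e.g. a user's CTR on banner widgets: lifetime 0.08, last 24 hours 0.15
ctr_feature = blended_feature(0.08, 0.15)
```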

Architecture diagram

  • Data lake — all clickstream events and server-side events are ingested in the data lake.
  • Near Realtime (NRT) pipeline — The NRT aggregation platform powers NRT pipelines with a p99 lag of around 2 minutes.
  • Batch pipeline — The batch aggregation platform powers scheduled aggregation jobs by spawning the on-demand compute infra required for the jobs.
  • Ranking layer — All inference models are hosted on the machine learning inference platform, which allows inference models to be hosted with agility and also owns the feature store.

This ranking layer will eventually be used across all pages. Any additional latency in the ranking layer adds directly to the page load time.

Users see the Myntra home page on app launch, so home page loading must be fast.

So for the home page use case, our ranking p99 SLA requirement was 50 ms.

Runtime ranking involved fetching batch and NRT features and running inference, which took much longer than 50 ms.

To solve this, we built an offline ranking inference layer backed by a cache, with a degraded but fast online ranking path for cache misses.

This cache stores the widget ranking output for all logged-in and non-logged-in users for all requested pages within the TTL window.

For cache misses after the TTL expires, widget static features, which are loaded into memory by the ranking service during startup, are used along with the runtime features passed in the request for low-latency ranking. The costly ranking operation, which involves fetching all features from the batch and NRT feature stores and then running inference, is immediately triggered in the background. This background operation updates the cache with a better widget ranking for this user and page, which is then served on future cache hits for the same user and page.
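A simplified sketch of this serve path; all function names and scores are hypothetical, and the real system's cache is an external store (described below) rather than an in-process dict:

```python
from concurrent.futures import ThreadPoolExecutor

# In-memory static widget features loaded at ranking-service startup (illustrative values).
STATIC_WIDGET_FEATURES = {"banner": 0.30, "recently_viewed": 0.55, "cross_sell": 0.40}

def fast_rank(static_scores, runtime_boost):
    """Degraded low-latency ranking: static widget priors plus runtime signals from the request."""
    return sorted(static_scores,
                  key=lambda w: static_scores[w] + runtime_boost.get(w, 0.0),
                  reverse=True)

def full_rank_and_cache(user_id, page_id, cache):
    """Costly path: fetch batch + NRT features, run model inference, refresh the cache."""
    ranking = fast_rank(STATIC_WIDGET_FEATURES, {})   # stand-in for the real feature fetch + inference
    cache[(user_id, page_id)] = ranking

def rank_widgets(user_id, page_id, runtime_features, cache, executor):
    cached = cache.get((user_id, page_id))                          # precomputed ranking, if within TTL
    if cached is not None:
        return cached                                               # cache hit: serve directly
    quick = fast_rank(STATIC_WIDGET_FEATURES, runtime_features)     # degraded but fast ranking
    executor.submit(full_rank_and_cache, user_id, page_id, cache)   # refresh cache in background
    return quick

cache, executor = {}, ThreadPoolExecutor(max_workers=4)
ranking = rank_widgets("user_1", "home", {"cross_sell": 0.2}, cache, executor)
```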

With our user base and the average number of widgets on the home page, the cache data size came to hundreds of GBs. In the future, the same solution will be leveraged for other pages, which will further increase this size as more pages are enabled.

To reduce the total cost of ownership, we wanted a horizontally scalable disk-based key-value store for caching. After evaluation, we chose Aerospike, a disk-based key-value store optimised for SSD-based workloads with low single-digit millisecond latencies for reads and writes.

Results

In A/B tests, keeping revenue per user and conversion as guardrail metrics, we saw significant improvements in the number of widgets clicked by users and in the average CTR of widgets on the home page. We also deployed an XGBoost model as a variant in the A/B experiment alongside the MAB model; the MAB model performed better.

Future

In the follow-up posts, we will share the next set of capabilities we are building.

  • Page Layout Platform
  • Page Federation Platform
  • Algorithmic Store

Thanks

Special thanks to all the team members from Data Science, Data Science Engineering, Machine Learning Platform and Merchandising teams for this multi-quarter collaborative effort.

Reference

Data science publication: Bandits and diversity for an enhanced e-commerce homepage experience (PDF)
