Optimizing Myntra’s Pricing System for Serving Millions of Requests under 30ms Latency

Co-Contributors : Suraj Rajan, Nilesh Mittal, Arnav

Nikhil Anand
Myntra Engineering

--

Overview

Myntra’s pricing system is responsible for uploading various discounts for different products and serving them to multiple services. The pricing system currently serves discount information with a latency of around 150ms to millions of requests. With increasing scale and the need to disseminate discount information to several other systems, it became essential to scale the system to handle the increased load while maintaining its resilience, robustness, and fault tolerance. Several new features are being developed that require pricing information to be available in synchronous flows. To enable all of these features, our pricing system needs to scale to tens of millions of requests with latency as low as 30ms.

Pricing System Architecture

The Pricing System comprises the following components:-

  • Discount Creator Service:- The Discount Creator Service is responsible for the creation of discounts, providing a user-friendly interface for sellers to upload discounts for various products. The service’s backend layer processes these discounts and stores them in the Redis cache and a SQL transactional data store for quick and efficient retrieval. It also ensures seamless distribution of discounts across the system by pushing them to the Discount Injector Service through a Kafka queue on a scheduled basis, allowing the Discount Injector Service to make the discounts available to the serving layer so that end-users always have access to the latest discounts.
  • Discount Injector Service:- The Discount Injector Service acts as a vital intermediary between the Discount Creator Service and the Discount Fulfilment Service. It listens to and processes the Kafka events emitted by the Discount Creator Service’s scheduled jobs, and stores the discount data in the Redis cache and MongoDB of the Discount Fulfilment Service, ensuring that only the currently active discounts for each product are retained. This allows for real-time updates and efficient retrieval of the latest discounts for end-users.
  • Discount Fulfilment Service:- The Discount Fulfilment Service serves as the primary point of access for providing discounts to its downstream services. Its primary responsibility is to efficiently retrieve and serve the requested discounts by fetching the data from the Redis cache or MongoDB as necessary. The cache is populated by the Discount Injector Service, ensuring that the data is always up-to-date and accurate, providing a smooth and efficient experience for the end-users.
(Figure: Pricing System Architecture)

As the Discount Fulfilment Service is the primary access point for providing discounts to its downstream services, this article will look at the architecture and concepts implemented in the Discount Fulfilment Service to enable it to efficiently handle high traffic volumes with minimal latency.

Discount Fulfilment Service Scaling

Here are the scaling hurdles we encountered and their solutions:-

Hurdle #1 : Sequential Synchronous Redis Calls

We were previously using a synchronous approach for making Redis calls for each product individually. This resulted in a significant delay when processing bulk requests containing multiple products. For example, if we had a bulk request with 50 products, we would make 50 separate requests to Redis one after the other. This meant that the Redis call for the second product would not be made until a response for the first product had been received. With an average latency of 3ms per Redis call, a bulk request of 50 products would take 150ms to complete.
The solution to this problem is discussed later in this article.
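For illustration, here is a minimal sketch of that earlier per-product, blocking access pattern (method and variable names are hypothetical; the discount/<productId> key format is described later in this article):

import io.lettuce.core.cluster.api.StatefulRedisClusterConnection;
import io.lettuce.core.cluster.api.sync.RedisAdvancedClusterCommands;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

Map<String, Map<String, String>> fetchDiscountsSequentially(
        StatefulRedisClusterConnection<String, String> connection, List<String> productIds) {
    RedisAdvancedClusterCommands<String, String> sync = connection.sync();
    Map<String, Map<String, String>> discounts = new HashMap<>();
    for (String productId : productIds) {
        // Each HGETALL blocks until Redis responds (~3ms on average),
        // so a bulk request of 50 products adds up to ~150ms.
        discounts.put(productId, sync.hgetall("discount/" + productId));
    }
    return discounts;
}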

Hurdle #2 : Too Many Network Calls

Since we were using Lettuce to make synchronous Redis calls, each call was written directly to the network layer. Because we made a separate Redis request for each product, the number of network calls equaled the number of products in the request. This high number of network calls further contributed to the delay in processing bulk requests.
The solution to this problem is the same as the one above and is discussed later in this article.

Hurdle #3 : No Latency Capping

No timeouts or retries were configured for the Redis calls, which led to delays in processing bulk requests. For example, if a Redis call was taking 100ms, a 30ms timeout with two or three retries could have been applied instead. This increases the chance of receiving a response within a shorter window, reducing delays and improving overall performance.

Solution:- Timeouts and retries have been added from the application side for Redis calls to have a cap on the overall latency.
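One possible way to do this, sketched here with illustrative values rather than our exact production configuration, is to set a fixed command timeout through Lettuce’s TimeoutOptions and wrap the call in a simple application-side retry:

import io.lettuce.core.RedisCommandTimeoutException;
import io.lettuce.core.TimeoutOptions;
import io.lettuce.core.cluster.ClusterClientOptions;
import io.lettuce.core.cluster.RedisClusterClient;
import io.lettuce.core.cluster.api.sync.RedisAdvancedClusterCommands;
import java.time.Duration;
import java.util.Collections;
import java.util.Map;

// Cap every Redis command at 30ms instead of waiting on slow calls.
RedisClusterClient clusterClient = RedisClusterClient.create("redis://localhost:6379");
clusterClient.setOptions(ClusterClientOptions.builder()
        .timeoutOptions(TimeoutOptions.builder()
                .fixedTimeout(Duration.ofMillis(30))
                .build())
        .build());

// A simple application-side retry wrapper: up to 3 attempts, then the caller falls back (e.g. to Mongo).
Map<String, String> hgetallWithRetry(RedisAdvancedClusterCommands<String, String> sync, String key) {
    for (int attempt = 0; attempt < 3; attempt++) {
        try {
            return sync.hgetall(key);
        } catch (RedisCommandTimeoutException e) {
            // Timed out after 30ms; retry, or give up after the last attempt.
        }
    }
    return Collections.emptyMap(); // signal the caller to use the Mongo fallback
}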

Hurdle #4 : Redis Read Preference

Our previous Redis configuration used REPLICA_PREFERRED as the read preference, causing an imbalance in the distribution of read traffic. Our setup consisted of X Redis master nodes and Y Redis slave nodes in a 1:2 master-to-slave ratio, with P partitions per node. With REPLICA_PREFERRED, one application server ended up with only (Y/2 * P) transport connections via one connection object. This meant that read traffic from one application server was directed to only Y/2 replicas instead of being distributed evenly across all Y available replicas. This imbalance hurt the performance and scalability of the application and highlighted the need for a more balanced distribution of read traffic.
(Ref :- https://lettuce.io/core/release/reference/#redis-cluster.connection-count)

Solution:- We changed the Redis read preference to ANY in Lettuce, so a single connection object now has ((X+Y) * P) transport connections and reads can be served by any node.
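A minimal sketch of this change, assuming a recent Lettuce version and a cluster connection (names and URI are illustrative):

import io.lettuce.core.ReadFrom;
import io.lettuce.core.cluster.RedisClusterClient;
import io.lettuce.core.cluster.api.StatefulRedisClusterConnection;

RedisClusterClient clusterClient = RedisClusterClient.create("redis://localhost:6379");
StatefulRedisClusterConnection<String, String> connection = clusterClient.connect();

// Before: reads were restricted to replicas, leaving part of the cluster under-utilized.
// connection.setReadFrom(ReadFrom.REPLICA_PREFERRED);

// After: reads may be served by any node (master or replica), spreading traffic across all (X + Y) nodes.
connection.setReadFrom(ReadFrom.ANY);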

Hurdle #5 : More Resources Allocated To Mongo Client

We had allocated more resources to the Mongo client than necessary, even though only 0.1% of the traffic required a fallback to Mongo. The maximum connection pool size for Mongo was configured as 100, with a minimum pool size of 24. With X master and Y slave Mongo nodes, this resulted in the following connection counts:

  • The maximum number of connections from one application server to Mongo was (100 * (X+Y)).
  • The minimum number of connections was (24 * (X+Y)).

Solution:- We now allocate the right amount of resources to the Mongo client and more resources to Lettuce, since most of the traffic is served by Redis alone. The max and min connection pool sizes for the Mongo client have been reduced to 4, resulting in a max/min of (4 * (X+Y)) Mongo connections per application server.
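A sketch of the reduced pool configuration with the MongoDB Java driver (the connection string and values shown here are illustrative):

import com.mongodb.ConnectionString;
import com.mongodb.MongoClientSettings;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;

// With a max/min pool size of 4, each application server opens at most (4 * (X + Y)) Mongo connections.
MongoClientSettings settings = MongoClientSettings.builder()
        .applyConnectionString(new ConnectionString("mongodb://localhost:27017"))
        .applyToConnectionPoolSettings(pool -> pool
                .maxSize(4)
                .minSize(4))
        .build();
MongoClient mongoClient = MongoClients.create(settings);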

Solutions Considered for Sequential Synchronous Redis & Too Many Network Calls

Here are the approaches that we have considered to address Hurdle #1 & Hurdle #2 :-

Approach #1 : Command Batching (Pipelining) & Flushing

The Lettuce library supports pipelining for asynchronous Redis calls by default. However, pipelining is implemented at a global level, which means that all application threads write their commands to the same pipeline and Lettuce automatically flushes the commands to the network layer. With this approach, the batch size and the timing of flushes cannot be defined by the application. To gain more control over this process, the auto-flush mode of the Redis connection can be turned off and flushing can be performed manually according to a specific batch size or as needed.

Implementation

  1. During the application startup process, a connection to Redis will be established and the auto-flush mode of that connection will be turned off.
  2. When an application thread receives a request, it will prepare and batch all the commands for all products present in the request, and add them to the pipeline of the connection.
  3. The commands will be flushed to Redis according to a configured batch size. This call will be an asynchronous call with the application thread waiting for a response.
  4. The application thread will wait until a response is received or a configured timeout has been reached.
  5. Additionally, retries can also be implemented with a configured timeout for each Redis call.
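A minimal sketch of these steps, assuming a shared Lettuce cluster connection and illustrative batch and timeout values:

import io.lettuce.core.LettuceFutures;
import io.lettuce.core.RedisFuture;
import io.lettuce.core.cluster.api.async.RedisAdvancedClusterAsyncCommands;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;

// 'connection' is the shared StatefulRedisClusterConnection<String, String> created at startup.
// Step 1: disable auto-flush so commands stay in the pipeline until we flush them ourselves.
connection.setAutoFlushCommands(false);

RedisAdvancedClusterAsyncCommands<String, String> async = connection.async();
List<RedisFuture<Map<String, String>>> futures = new ArrayList<>();

// Step 2: queue one HGETALL per product; nothing is written to the network yet.
for (String productId : productIds) {
    futures.add(async.hgetall("discount/" + productId));
}

// Step 3: write the whole batch to the network in one go
// (in practice this would be done per configured batch size).
connection.flushCommands();

// Step 4: wait for the responses, capped by a timeout (e.g. 30ms); retries can wrap this block.
boolean completedInTime = LettuceFutures.awaitAll(30, TimeUnit.MILLISECONDS,
        futures.toArray(new RedisFuture[0]));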

Drawbacks of this approach

  • Note that the connection is shared between threads, so if one thread invokes a flush, the commands added by other threads are also flushed. One thread’s behaviour therefore affects the others: the number of commands flushed at once is dynamic and can exceed the configured batch size, potentially causing issues.
  • The timeout for Redis calls is configured according to the batch size, but since the number of commands flushed at once can exceed the batch size, as discussed above, this can create the following issues:
    a) More Redis retries will be triggered from the application side, putting 2x or 3x load on the Redis servers.
    b) More Redis retry failures will occur on the application side, causing more calls to fall back to Mongo. Since our Mongo infrastructure and Mongo client are not sized to handle that load, this results in high API latency and the Redis and Mongo circuit breakers opening.

Ref :- https://github.com/lettuce-io/lettuce-core/wiki/Pipelining-and-command-flushing

Approach #2 : Command Batching & Flushing With Connection Pooling

Lettuce has built-in support for asynchronous connection pooling.

Implementation

  1. When an application thread receives a request, it retrieves a connection from the connection pool (which is initialized during application startup) and turns off the auto-flush feature of that connection.
  2. It then prepares and batches all the commands (for all products present in the request) in the pipeline of that connection and flushes them to Redis (according to the configured batch size). This call is an asynchronous one, with the application thread waiting for the response.
  3. The thread will wait until the response is received or the configured timeout has passed. Afterwards, the thread releases the connection and the connection is returned to the connection pool. Retries can also be implemented with a timeout configured for each Redis call.
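A rough sketch of what this approach could look like with Lettuce’s asynchronous pooling support (pool limits, URI, and names are illustrative):

import io.lettuce.core.cluster.RedisClusterClient;
import io.lettuce.core.cluster.api.StatefulRedisClusterConnection;
import io.lettuce.core.codec.StringCodec;
import io.lettuce.core.support.AsyncConnectionPoolSupport;
import io.lettuce.core.support.AsyncPool;
import io.lettuce.core.support.BoundedPoolConfig;

// Created once at application startup: a bounded async pool of cluster connections.
RedisClusterClient clusterClient = RedisClusterClient.create("redis://localhost:6379");
AsyncPool<StatefulRedisClusterConnection<String, String>> pool =
        AsyncConnectionPoolSupport.createBoundedObjectPool(
                () -> clusterClient.connectAsync(StringCodec.UTF8),
                BoundedPoolConfig.builder().maxTotal(50).build());

// Per request: acquire a connection, batch and flush the commands, then release it.
StatefulRedisClusterConnection<String, String> connection = pool.acquire().join();
try {
    connection.setAutoFlushCommands(false);
    // ... queue HGETALL commands for all products in the request, then:
    connection.flushCommands();
    // ... await the futures with a timeout, as in Approach #1.
} finally {
    connection.setAutoFlushCommands(true); // restore the default before returning it to the pool
    pool.release(connection);
}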

Drawbacks of this approach

  • The problem with this approach is that at high load, the application can run short of Redis connections in the pool, even with retries. Increasing the number of connections can make things worse: both Redis and the application then have a large number of connections to manage, leading to an increase in TIMED_WAITING (I/O) threads, high CPU utilization (up to 98%), and longer round-trip times between the application and Redis, resulting in high API latency.

Ref :- https://lettuce.io/core/release/reference/#_connection_pooling

Approach #3 (Opted Approach) : Command Batching Through Lettuce Command Interfaces

Lettuce Command interfaces support command batching to collect multiple commands in a batch queue and flush the batch in a single write to the transport.
Command batching can be enabled on two levels :-

  • At class level, by annotating the command interface with @BatchSize. All methods participate in command batching.
  • At method level, by adding CommandBatching to the arguments. The method participates selectively in command batching.

This approach reduces the number of network calls required and batches commands to improve performance, which is why we chose it to solve the problems of sequential synchronous Redis calls (Hurdle #1) and too many network calls (Hurdle #2).

Implementation

  1. This approach involves defining an interface that extends the Lettuce Commands interface, with a batch size configured at the class level using the @BatchSize annotation.
  2. This ensures that when the number of commands in the batch queue reaches the configured batch size, a flush will be made and all the commands in the queue will be written to the network layer.
  3. Then, an object of this interface is created using the Redis command factory, which creates a proxy class for the given interface. This object can be used to make Redis calls with batching behavior.
Redis calls can then be made in two ways: with CommandBatching.queue(), which only adds the command to the batch queue, and with CommandBatching.flush(), which adds the command and immediately flushes the whole batch. Both are shown in the snippet below.

import io.lettuce.core.RedisFuture;
import io.lettuce.core.dynamic.Commands;
import io.lettuce.core.dynamic.RedisCommandFactory;
import io.lettuce.core.dynamic.batch.BatchSize;
import io.lettuce.core.dynamic.batch.CommandBatching;
import java.util.Map;

/* Defining an interface that extends the Commands interface and includes
** all the methods required to make Redis calls.
*/
@BatchSize(50)
public interface RedisBatchExecutor extends Commands {
    RedisFuture<Map<String, String>> hgetall(String key, CommandBatching commandBatching);
    RedisFuture<Boolean> hset(String key, String field, String value, CommandBatching commandBatching);
    // .. and so on
}

// Object creation using RedisCommandFactory
RedisCommandFactory redisCommandFactory = new RedisCommandFactory(redisClusterConnection);
RedisBatchExecutor redisBatchExecutor = redisCommandFactory.getCommands(RedisBatchExecutor.class);

// Invoking Redis calls using the created object

// With queueing: the command is only added to the batch queue
futures.add(redisBatchExecutor.hgetall(key, CommandBatching.queue()));

// With flushing: the command is added and the whole batch is flushed to the transport
futures.add(redisBatchExecutor.hgetall(key, CommandBatching.flush()));

Ref :- https://lettuce.io/core/release/reference/#command-interfaces.batch

Redis Key & Mongo Document Structure

Redis Key Structure

The Discount Fulfilment Service stores the discount data for each product in a Redis hash. For example, for product id 1000 with seller ids 100 and 101, the corresponding Redis key and value would be :-

Key :- discount/1000
Value :-
1) {"key":"sellerId", "value":100}
2) {"mrp":2000, "discountId":140823004, "discountPercent":20%, "discountedAmount":1600, … so on}
3) {"key":"sellerId", "value":101}
4) {"mrp":2000, "discountId":140823005, "discountPercent":30%, "discountedAmount":1400, … so on}

Mongo Document Structure

{"productId":1000, "sellerId":100, "mrp":2000, "discountId":140823004, "discountPercent":20%, "discountedAmount":1600, … so on}
{"productId":1000, "sellerId":101, "mrp":2000, "discountId":140823005, "discountPercent":30%, "discountedAmount":1400, … so on}

In conclusion, we’ve discussed the approaches we considered, and the ones we adopted, to scale Myntra’s pricing system so that it can efficiently deliver discounts with latency as low as 30ms, even during high-traffic periods. By implementing solutions such as command batching, timeouts and retries, and resource redistribution, we’ve improved the performance and scalability of the system. This has enabled new features such as coupons on the product listing page, product-level display of discount expiration timers, instant discount cut-offs, and many more. We hope this article has provided some valuable insights; please feel free to share it with friends and colleagues who may be interested.
