The Next Chapter of Coupon’s Scaling
What is a Coupons Platform?
Myntra’s coupon platform is the powerhouse behind creating, distributing, and serving personalized and generic coupons to customers for their orders. Today we serve the best coupon on Product Display Page calls, which involve a single product. As the platform scales, supporting coupons on list pages, with tens of products in a single call, has become essential, making a redesign of the coupon system imminent. Previously, such requirements were not even considered feasible. To keep up with the ever-evolving e-commerce landscape and ever-increasing scale requirements, the coupon platform must etch the next chapter in its evolution journey.

Terminologies
- PDP — Product Display Page
- PLP — Product List Page
- HRD days — High Revenue Days during which marquee sales are held
- Generic Coupons — A type of coupon that is valid and can be applied by any customer on the platform
- CouponBase — All of the coupon configurations related to a specific coupon in the system
- Global usage — The number of times the coupon has been used globally by all users on the platform
- User Usage — The number of times the coupon has been used by a particular user on the platform
- Master Template ID — A unique key to differentiate coupons construct
Why do we want to re-architect? The Bottlenecks
As user demand and coupon availability grow, the coupon platform must adapt to handle the increasing complexity. The introduction of coupons on PLP necessitates a reassessment of the system’s latency and scalability. Ensuring fast responses under significant load has become a high priority.
The following bottlenecks were recognized as actionable items requiring immediate attention and resolution:
- User Usage as a Major Contributor to Latency Increase
During the analysis of the apply-coupon serving call, we found that around 80% of the API time was spent retrieving user usage. This inefficiency stemmed from fetching multiple coupons synchronously from Redis, with a fallback to a database if the cache was unavailable. The root cause was the method of storing user usage as separate keys in Redis, where the number of keys equaled the number of distinct coupons the user had previously used. For example, if a user had used x coupons, this approach led to the creation of x distinct keys and x Redis calls to retrieve them.
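A minimal sketch of this old layout, with Redis simulated by a plain dict (coupon and user identifiers are purely illustrative, not the production schema):

```python
# Old layout: one Redis key per (coupon, user) pair.
# Fetching a user's full usage history costs one round trip per coupon.

redis_sim = {
    "SAVE10_user42": 2,
    "FEST50_user42": 1,
    "NEW100_user42": 1,
}

def fetch_user_usage(user_id, coupon_ids):
    """One GET per coupon -> len(coupon_ids) round trips."""
    round_trips = 0
    usage = {}
    for cid in coupon_ids:
        round_trips += 1  # each GET is a separate network call
        usage[cid] = redis_sim.get(f"{cid}_{user_id}", 0)
    return usage, round_trips

usage, calls = fetch_user_usage("user42", ["SAVE10", "FEST50", "NEW100"])
print(calls)  # 3 round trips for 3 coupons
```

The cost grows linearly with the number of coupons a user has ever used, which is exactly the latency contributor described above.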

- Global usage fetch was acting as a blocker!
Fetching global usage was a challenging task. We were making bulk calls to Redis with all the coupons, which took 15–20ms, but with the increasing number of coupons and users, the time taken for fetching data became a matter of concern. The current approach might not be sustainable in the long run, and we need to explore alternative solutions to optimize the time taken for bulk calls.

- Handling coupons for multiple products
Including coupons on the list page means handling approximately 50 products in a request, each requiring corresponding coupon responses. We rely on a product mapping data store to determine applicable coupons for each product. However, serving coupons for all 50 products within a single request using this approach is impractical, as it would require 50 synchronous bulk Redis calls.

- Multiple Sequential Downstream calls
Our serving APIs rely on synchronized external dependency calls. However, the introduction of coupon support on PLP necessitates a substantial increase in synchronized external dependency calls. This includes critical calls to User Attribute Service, Segment Service, Discount Service, Seller Service, and multiple Redis operations, each serving a crucial purpose. For instance, Segment service fetches user segments for determining segment-based eligible coupons, while User Attribute service distinguishes between new and repeat users, making new user coupons available for consideration. Moreover, Discount service retrieves running trade discounts on all 50 products in the request, enabling accurate calculation of final prices after coupon deductions. This would increase the latency of the system up to 400ms, posing a significant problem in delivering fast and efficient services to customers. Thus, we must strategize to reduce latency, ensuring a seamless user experience while embracing new features on our innovative platform.
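To illustrate why the sequential chain hurts, here is a minimal sketch comparing sequential downstream calls with firing them concurrently (service names are taken from the text above; latencies are purely illustrative). Wall time drops from roughly the sum of latencies to roughly the slowest single call:

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Illustrative latencies (seconds) for the downstream dependencies.
SERVICES = {
    "user_attribute": 0.03,
    "segment": 0.04,
    "discount": 0.05,
    "seller": 0.02,
    "redis": 0.01,
}

def call(name):
    time.sleep(SERVICES[name])  # stand-in for a network call
    return name

def sequential():
    start = time.monotonic()
    results = [call(n) for n in SERVICES]
    return results, time.monotonic() - start

def parallel():
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=len(SERVICES)) as pool:
        results = list(pool.map(call, SERVICES))
    return results, time.monotonic() - start

_, seq_t = sequential()  # ~ sum of latencies
_, par_t = parallel()    # ~ max single latency
print(f"sequential: {seq_t:.3f}s, parallel: {par_t:.3f}s")
```

This is the intuition behind the multithreading solution described later: concurrency cannot make any single dependency faster, but it stops their latencies from adding up.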

- Uncontrolled Accumulation of Labels and Cache Invalidation
The platform relies on labels to effectively map coupons and their visibility concepts. Sellers have complete control over label management, with the ability to add or remove labels as deemed fit. However, it’s essential to note that labels persist in the system even after a coupon has expired in case the seller fails to remove them. This can lead to an unmanageable accumulation of data on the label front, creating challenges for system performance and optimization. We retrieve labeled coupons during serving API calls and subsequently identify those that remain valid for further computation. Furthermore, cache invalidation has proven to be a conundrum that we continue to refine our processes to address. If we invalidate the cache, it can trigger a mass exodus of calls to the database, resulting in a surge in latency and potential disruption for our users.
Our Solutioning

- Streamlining Redis I/O Calls for Enhanced Efficiency
To optimize our system, we addressed the issue of excessive Redis I/O calls when retrieving user-specific coupon information. Previously, each coupon usage required multiple Redis calls, which increased with the user’s cumulative coupon usage. To improve efficiency, we restructured the user usage key by creating a comprehensive map of all coupons utilized by a user. This redesign enables us to retrieve complete coupon usage information in a single Redis call, reducing the number of calls required.
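A minimal sketch of the restructured layout, again with Redis simulated by a plain dict (identifiers illustrative): one fetch now returns the user’s entire usage map, regardless of how many coupons the user has used.

```python
# New layout: one Redis hash per user, all coupon usages inside it.
# A single HGETALL-style fetch replaces x separate GETs.

redis_hash_sim = {
    "user42": {"SAVE10": 2, "FEST50": 1, "NEW100": 1},
}

def fetch_user_usage(user_id):
    """One round trip retrieves the full usage map."""
    return dict(redis_hash_sim.get(user_id, {}))

usage = fetch_user_usage("user42")
print(usage)  # all coupon usages in one call
```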
Previous Structure:
<couponidentifier_useridentifier> : <usage count>

Current Structure:
<useridentifier> : {
  <couponidentifier> : <usage count>,
  <couponidentifier> : <usage count>
}

- Enhancing Coupon Validation with global usage changes
Our system relies on Redis to track global coupon usage, using coupon codes and master template IDs as key identifiers. When a user applies a coupon, we retrieve this information to validate its eligibility, which is crucial for revenue integrity. To safeguard against breaches of usage limits and prevent illegal usage, we implement concurrency measures and fail-safe mechanisms. Our current approach utilizes Redis calls to efficiently verify coupon codes, retrieve relevant data, and perform checks to ensure validity and adherence to usage limits.
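A minimal in-process sketch of the fail-safe, using a lock to model the atomicity that a Redis-side operation would provide in production (the class and its names are illustrative, not the production implementation):

```python
import threading

class GlobalUsageTracker:
    """Thread-safe check-and-increment guard for a coupon's global limit.
    In production this guarantee would come from an atomic Redis-side
    operation; a lock models the same property in-process."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0
        self._lock = threading.Lock()

    def try_redeem(self):
        with self._lock:
            if self.used >= self.limit:
                return False  # limit breached: reject redemption
            self.used += 1
            return True

tracker = GlobalUsageTracker(limit=100)
results = []
threads = [
    threading.Thread(target=lambda: results.append(tracker.try_redeem()))
    for _ in range(150)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sum(results))  # exactly 100 redemptions succeed, never more
```

Even with 150 concurrent redemption attempts, the guard never lets usage exceed the configured limit, which is the revenue-integrity property the text describes.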

- Using Non-Distributed Cache Layer
To improve efficiency, we utilize our Ehcache layer to fetch the global usage of a coupon, reducing reliance on Redis calls. This includes storing the configuration and tracking the coupon’s exhaustion status. We also ensure that the value is updated when modifying a coupon template. Additionally, we address concurrency concerns to prevent unauthorized breaches of usage, which could lead to significant revenue loss.
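A sketch of the idea, with a simple in-process TTL cache standing in for the Ehcache layer (names, the TTL, and the Redis stub are all illustrative assumptions):

```python
import time

class LocalCache:
    """A minimal in-process TTL cache, standing in for the Ehcache layer."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None or entry[1] < time.monotonic():
            return None  # miss or expired
        return entry[0]

    def put(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

redis_calls = 0

def fetch_exhaustion_from_redis(coupon_id):
    global redis_calls
    redis_calls += 1  # count remote round trips
    return 0  # 0 = not exhausted (illustrative encoding)

cache = LocalCache(ttl_seconds=6 * 3600)

def get_exhaustion(coupon_id):
    status = cache.get(coupon_id)
    if status is None:  # only a cache miss reaches Redis
        status = fetch_exhaustion_from_redis(coupon_id)
        cache.put(coupon_id, status)
    return status

for _ in range(1000):
    get_exhaustion("FEST50")
print(redis_calls)  # 1: all subsequent reads served locally
```

A thousand reads of the exhaustion status cost a single remote call; every subsequent read is served from process memory, which is the Redis-load reduction described above.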
Backward Compatibility and Migration Strategy
1. Implement a BackFill API to update the global usage field within the CouponBase after the release.
2. Set the global exhaustion status to null by default. When it is null, Redis is used as the data source in the serving flow. When a user places an order, coupon usage is tracked, and the status moves from its initial null state to either 0 or 1, reflecting whether usages remain available. New coupons default to 0. The system rejects invalid updates: sellers can only increase usage limits, so if a coupon had unlimited use and already has 50 redemptions, the limit can only be set above 50.
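One plausible reading of the limit-update rule can be sketched as follows (the function name and the None-means-unlimited convention are assumptions for illustration):

```python
def validate_limit_update(current_limit, redemptions, new_limit):
    """Sellers may only raise a coupon's global limit, and never below the
    redemptions already made. current_limit None = previously unlimited."""
    if new_limit <= redemptions:
        return False  # would invalidate redemptions already granted
    if current_limit is not None and new_limit <= current_limit:
        return False  # limits can only be increased
    return True

# A previously unlimited coupon with 50 redemptions:
print(validate_limit_update(None, 50, 40))  # False: below existing usage
print(validate_limit_update(None, 50, 60))  # True: raised above 50
```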

- Leveraging Multithreading for Optimal Latency Reduction
In our quest to enhance performance and minimize latencies, we placed significant emphasis on implementing a robust multithreading approach within the coupon on PLP flow. To achieve this, we conducted a meticulous assessment to strike the ideal balance between thread creation and context switching. Through extensive analysis, we determined that harnessing the power of multithreading for external calls would be the most effective strategy.
However, we took a measured approach by keeping the bulk of the computation on the main thread. Auxiliary threads were strategically employed to retrieve relevant data from downstream services. By offloading this task to dedicated threads, we substantially reduced the time spent on context switching, leading to notable gains in overall performance.

- Efficient Cache Usage: Streamlining Future Coupon Retention
We don’t want to clutter our top-level map with upcoming coupons, yet coupons were being stored in the EhCache layer unnecessarily even when scheduled to start several days in the future. To optimize this, we will only keep coupons in the higher-level map if their start date falls within currentTime + 6 hours, which aligns with our EhCache expiry time of 6 hours.
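The retention rule above can be sketched as a simple predicate (names assumed; the 6-hour window mirrors the stated EhCache expiry):

```python
from datetime import datetime, timedelta

EHCACHE_TTL = timedelta(hours=6)

def should_retain(coupon_start, now):
    """Keep a coupon in the top-level map only if it starts within the
    cache's lifetime; coupons starting later would sit in the cache unused."""
    return coupon_start <= now + EHCACHE_TTL

now = datetime(2024, 1, 1, 12, 0)
print(should_retain(datetime(2024, 1, 1, 15, 0), now))  # True: starts in 3h
print(should_retain(datetime(2024, 1, 3, 12, 0), now))  # False: starts in 2 days
```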

Performance Benchmarks

