Caching Dynamic Content at the Edge

Photo by NASA on Unsplash

PayPal upstream Pay Later messages inform customers about financing options for purchases ranging from t-shirts to treadmills. In addition to notifying customers of a safe and secure pay later option, these messages can increase merchants' conversion rates and average order values. There is much upside for consumers and merchant customers in a small piece of HTML.

An example US Pay in 4 message

Delivering messages to customers may seem simple; however, at internet-scale, delivering messages to millions of consumers worldwide on thousands of merchant sites requires skilled engineering and sophisticated infrastructure. In addition to delivering the correct message to the proper merchant at the right time, PayPal’s merchant customers demand delivery of these messages at a ludicrous speed.

The CDN Caching Dream (or, How I Learned to Stop Worrying and Love the Edge)

Dramatic recreation of the team analyzing global upstream messaging demand (Source: filmschoolrejects)

PayPal customers are located all around the world. When a customer visits a merchant’s website integrated with Pay Later messaging, the message request is routed to a PayPal data center where the Pay Later infrastructure determines the best Pay Later message to offer based on the merchant and amount of the item. Depending on the physical distance between the customer’s device and the PayPal data center, the request can take a long time to complete.

Most of this time penalty is unavoidable if we go all the way to the data center, limited by the speed of light over a trans-oceanic cable. If we can move the decisioning that today happens in a PayPal data center closer to the consumer, we avoid making this trip altogether. To accomplish this, we looked towards our Content Delivery Networks (CDNs).

CDNs are primarily used to improve the delivery of static digital media, such as images and audio/video content, or assets with idempotent URLs. If we can move the rendering of our messages to these Edge servers, we can eliminate that round-trip time for the consumers. But why do we need to make this round-trip in the first place?

Benefits of Server-Side Rendering

PayPal's upstream messages are server-side rendered (SSR). This means that when the messaging front-tier service receives a request for a message, it generates the HTML markup for that message, and that markup is delivered directly into an iframe on the merchant's page. One of the main benefits of SSR is the iframe sandbox. The iframe sandbox provides improved security by separating the message code from the merchant's web page and gives us an isolated CSS (Cascading Style Sheets) context that eliminates the possibility of the PayPal message's styles interfering with the merchant's page.

SSR also enables merchants to deliver improved website performance for their customers. Performance is enhanced by generating the message markup in the PayPal data center, which reduces or even eliminates JavaScript execution in the consumer's browser. SSR also removes the need for the customer's browser to download a large code bundle before message rendering can begin.

A final reason to love SSR is that it allows us to more easily update how our messages are rendered without worrying about browsers having cached versions of client-side rendering code. This lets us avoid potential compliance issues from showing outdated messages and makes bug-fixing easier by allowing us direct control over what content is cached.

Configuring the Edge

With a CDN, an HTTP request from a user is routed to the closest physical server. If the edge server has a cached response that matches that request, it is returned immediately. If not, it forwards that request to the “origin” server, which in this case is a PayPal data center, to get the response and return it to the user. Our response will include headers to inform the edge server that the response should be cached, as well as how long to cache it. Other aspects of edge behavior can be configured in a GUI or code, typically Varnish/VCL. The default configuration is to use the full URL as the cache key.
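As a rough illustration (not PayPal's actual service code), an origin handler could attach those caching headers like the Express-style sketch below; the route, TTL, and tag header are assumptions, and tag headers vary by provider (for example, Akamai's Edge-Cache-Tag versus Fastly's Surrogate-Key).

const express = require('express');
const app = express();

// Stand-in for the real server-side rendering pipeline.
const renderMessage = (query) => `<p>Pay Later message for amount ${query.amount}</p>`;

app.get('/message', (req, res) => {
  const html = renderMessage(req.query);

  // Standard directive telling the edge (and browsers) that the response
  // is cacheable and for how long.
  res.set('Cache-Control', 'public, max-age=3600');

  // Provider-specific tag header used later for targeted flushes.
  res.set('Edge-Cache-Tag', `locale-${req.query.locale}`);

  res.type('html').send(html);
});

app.listen(8080);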

A sample message URL

When a request from a customer arrives for the same URL, the CDN can serve up the cached response. With this approach, there are thousands of unique keys. There are over 35,000 PayPal Pay Later-enabled merchants, many of whom are passing product amounts and various configuration options for what the message should look like. You might be wondering, can a CDN even handle this many cached objects? We asked this exact question of an Akamai engineer in the initial planning phases, and they were confused by the question, which gave us our answer: YES! CDNs are equipped to handle this type of volume; this is precisely what CDNs are designed for. However, this approach introduced another problem: it severely limits the cache hit rate. Many customers will still experience latency in message delivery because the message they are requesting has yet to be cached; it is the first time ANY customer has requested that exact message configuration.

Message Selection

The most crucial piece of information for rendering a message is the merchant displaying it. This information comes in the form of a client-id. The client-id is used to look up a merchant's account, which tells the messaging infrastructure everything it needs to know to determine the content of the message to return. The factors used to determine a message are:

1. Products + Offers: what products + offers are available to this merchant? That includes offers available to all merchants in their country and any customized offers the merchant has configured.

2. Eligibility: which of the available offers is this merchant eligible to display? This is based on many factors, the primary one being the merchant's industry.

3. Suppression: if the merchant has opted to suppress messaging for a particular product, we want to respect that in our decision.

4. Risk + Other factors: each message request is evaluated by our risk systems to determine if the merchant should be allowed to show a given product.

Because these factors are merchant-specific, the same message cannot be cached for all merchants. A naïve approach could be taken here by using the merchant account (client-id) in the cache key, but that would result in many unique keys, and, in turn, a poor cache hit rate. With such a large number of merchants, some of them are bound to match the same factors. And in fact, many merchants do have perfectly matching configurations. These merchants are effectively the same for message selection and can be treated as such.

Moving Decisioning to the Edge — the “Merchant Profile”

The distribution of Merchant Profiles shows the benefits of sharing cached objects between merchants

Given that many merchants will have the same message decision, the “decision” can be moved to the CDN if the CDN can be made aware of what “type” of merchant is requesting a message. The “type” of merchant is called the “merchant profile.” The “merchant profile” is a hash of all pieces of information that affect the decision outlined above. Any two merchants with the same hash will have the same decision factors, and thus for a given amount and display configuration, will have the same message.
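A minimal sketch of how such a profile hash could be derived, assuming the decision factors for a merchant are available as a plain object (the field names below are illustrative, not PayPal's actual schema):

const crypto = require('crypto');

// Illustrative decision factors for one merchant (names and values are assumptions).
const decisionFactors = {
  country: 'US',
  availableOffers: ['PAY_IN_4', 'PAYPAL_CREDIT'],
  eligibleOffers: ['PAY_IN_4'],
  suppressedProducts: [],
};

// Serialize with a stable key order so that any two merchants with
// identical factors produce exactly the same hash.
const canonical = JSON.stringify(decisionFactors, Object.keys(decisionFactors).sort());
const merchantProfile = crypto.createHash('sha256').update(canonical).digest('hex');

console.log(merchantProfile); // identical for every merchant with the same factors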

To inform the edge of the “merchant profile,” a hash must be passed along in the request URL of every message. The hash value is included in the code bundle of the PayPal JS SDK (Software Development Kit), which creates the message iframe. This allows the safe removal of the client-id from the cache key without impacting the message that gets rendered. This setup moved all the message decisioning to the edge.
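Conceptually, the SDK then carries that hash into the message URL (a simplified sketch; the endpoint and parameter names are assumptions, not the real SDK's URL):

// Hypothetical value baked into this merchant's SDK bundle at build time.
const MERCHANT_PROFILE = '9f2b4c0d';

function createMessageIframe(amount, style) {
  const src = new URL('https://example.com/message'); // stand-in for the real endpoint
  src.searchParams.set('merchant_profile', MERCHANT_PROFILE);
  src.searchParams.set('amount', amount);
  src.searchParams.set('style', style);

  const iframe = document.createElement('iframe');
  iframe.src = src.toString();
  return iframe;
}

document.body.appendChild(createMessageIframe('99.99', 'text'));

Every merchant whose bundle contains the same profile hash now produces identical message URLs for a given amount and display configuration, which is what lets the edge share one cached object among all of them.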

The rendered markup was updated, and any information tied to a single merchant or render was removed. A few more parameters that change infrequently were added to the key for cache-busting. The resulting cache key, defined in VCL, looks like this:

The VCL code to configure our custom cache key

Cache Management

The result is an operating cache that can serve hundreds of millions of messages per day, at light speed, with high effectiveness, and the ability to manage it at a granular level. Two key pieces were built to enable this management: a CDN provider API (Application Programming Interface) wrapper library that could abstract away the differences between the CDN providers, and an internal control panel application that allows operators to view the status of the cache and flush specific segments as needed.

The CDN provider API wrapper (affectionately titled "C.R.E.A.M.," for Cache Rules Everything Around Me) provides methods for flushing the cache by an individual URL, by one or more "cache tags," or in its entirety. The cache tags allow the grouping of cached objects, giving control of the cache with semantics that make sense for our use case. The cache can be flushed for a particular merchant profile, or the entirety of one locale, by using the appropriate cache tag. An individual merchant's profile can be updated (for example, when their configuration changes) by flushing the cached SDK bundle that includes the old profile hash value, ensuring that a new one will be present when the SDK is served for that merchant on the subsequent request.
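As a rough sketch of the shape of such a wrapper (not the actual library), the provider-specific purge clients can sit behind one small interface:

// Hypothetical interface; akamaiClient and fastlyClient stand in for real provider API calls.
class CacheClient {
  constructor(providers) {
    this.providers = providers; // e.g. [akamaiClient, fastlyClient]
  }

  // Flush a single cached object by its full URL.
  flushUrl(url) {
    return Promise.all(this.providers.map((p) => p.purgeUrl(url)));
  }

  // Flush every object labeled with one or more cache tags,
  // such as a merchant profile hash or an entire locale.
  flushTags(tags) {
    return Promise.all(this.providers.map((p) => p.purgeTags(tags)));
  }

  // Flush the whole cache (used sparingly).
  flushAll() {
    return Promise.all(this.providers.map((p) => p.purgeAll()));
  }
}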

The control panel also reports metrics about each CDN’s traffic and hit rate over time, and this data is displayed in a cache dashboard for monitoring.

The cache management panel in the cache control dashboard

With messages that effectively render instantly, merchants can integrate upstream messaging with confidence that it will not harm their consumer experience and will appear as a seamless part of their website. Those messages will inform consumers that they have options in how to pay and lead to increased average order values for the merchant. In the chart below, you can see the effect that edge caching had on render performance for our pilot merchants. The average render duration decreased nearly 90%, with some users seeing render speeds as low as 10–25ms.

Distribution of message render times with Edge Caching enabled

C̶a̶c̶h̶e Collaboration Rules Everything Around Me

This project would not have been possible without deep collaboration with many partner teams throughout the enterprise. We are incredibly grateful for their support to enable this drastic improvement in the experience of both merchants and consumers and for their continued engagement as we make further improvements.

The JS SDK team, led by Greg Jopa, helped us to add the merchant profile to the bundling process, the importance of which we have already discussed. We also worked closely to enable flushing those bundles when merchant configurations change.

Our Edge Engineering team, which operates our CDN integrations and interfaces with our providers, led by Brent Busby, consulted with us when designing our cache key scheme and helped us implement that scheme and test our changes in various stages. A special shout out to Arijit Ghosh and Ashutosh Srivastava from this team for being responsive as we worked through initial bugs.

We also consulted with our CDNX team, led by Shaun Warman, which focuses on the developer experience of interfacing with the CDNs. They provided sound advice on working with the provider APIs that they had experience with and guidance on creating something that could be useful to others outside of our project.

Finally, I want to thank all my teammates that worked together to make this a reality.

Team Gemini: Justin Doan, Rene Osman, Dan Haas, Anthony Re

Team Mercury: Nate Schott, Josh Dutterer, Julia Furman, Grant Black, Merlin Patterson

Edge-cached messaging is now available to all merchants with upstream presentment.
