The Journey to Server Driven UI At Lyft Bikes and Scooters
by Alex Hartwell and Tim Miko
Over the past couple of years, mobile app teams across Lyft have been moving to Server Driven UI (SDUI) for three main reasons:
- To deal with business complexity
- To increase release velocity
- To be more flexible in how we staff and build features
This post is about Lyft Bikes and Scooters’ journey to SDUI, why we’ve gone down this path, and what’s worked well for us.
(If you don’t care about the background story that led to us thinking about SDUI, you can skip down to The Core Primitives of SDUI for the nitty gritty tech discussion.)
A Quick History of Lyft Bikes and Scooters at Lyft
In 2018, Lyft had just launched dockless scooters (scooters that can start and end a ride without dedicated docking stations) in Denver. We sent what were simply database models directly to the mobile clients (we’ll call these “business models” from here on out) and used ride, vehicle, and hardware properties to drive the UI. Things were simple and they were good.
There was a simple mapping of ride state to UI elements (pseudo code)
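The pseudocode itself did not survive in this copy of the post. As a rough stand-in, a direct mapping from ride state to UI elements might look like the sketch below. The state names, titles, and button labels are hypothetical and are not taken from Lyft's code.

```python
from enum import Enum


class RideState(Enum):
    # Hypothetical states; the post does not list the real ones.
    NO_RIDE = "no_ride"
    UNLOCKING = "unlocking"
    IN_RIDE = "in_ride"
    ENDED = "ended"


# Each ride state maps directly to what the panel shows:
# (panel title, primary button label, or None when there is no button).
UI_FOR_RIDE_STATE = {
    RideState.NO_RIDE: ("Find a scooter", "Scan to unlock"),
    RideState.UNLOCKING: ("Unlocking...", None),
    RideState.IN_RIDE: ("Enjoy your ride", "End ride"),
    RideState.ENDED: ("Thanks for riding", "View receipt"),
}


def panel_for(state: RideState) -> dict:
    """Return the UI elements the client would show for a ride state."""
    title, button = UI_FOR_RIDE_STATE[state]
    return {"title": title, "primary_button": button}
```

With a single vehicle type, one lookup like this is enough to drive the whole panel.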
Later that year, we got to work adding dockless electric bikes to the app. Things started to get a little more complex. The ride flow of scooters and bikes was similar, but not exactly the same. The mobile app needed to know what the user was riding so we could change the UI.
We added an enum with two cases: “electric scooter” and “electric bike”. We could swap out UI elements and feature sets based on that enum. The enum allowed us to abstract all the product differences behind a single value we could check on the client. Things were a little more server-driven and they were still good.
Around that same time, Lyft acquired Motivate and became the leading bike-share operator in North America. We quickly got to work unifying Lyft and Motivate’s systems. This time around we were adding yet another vehicle type, with major differences in its ride flow: docked bikes. This introduced even more UI/UX differences between the flows. A new “docked bike” enum case was added. We added more switch cases on the vehicle type enum, implemented the new flows, and were on our way.
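To make the pattern concrete, here is a minimal sketch of client code switching on a vehicle type enum. The three cases come from the story above. The function names, instruction strings, and per-vehicle behavior are assumptions made up for illustration.

```python
from enum import Enum


class VehicleType(Enum):
    ELECTRIC_SCOOTER = "electric_scooter"
    ELECTRIC_BIKE = "electric_bike"
    DOCKED_BIKE = "docked_bike"


def end_ride_instructions(vehicle_type: VehicleType) -> str:
    """Pick end-of-ride copy by switching on the vehicle type (hypothetical copy)."""
    if vehicle_type is VehicleType.ELECTRIC_SCOOTER:
        return "Park the scooter upright, then tap End ride."
    if vehicle_type is VehicleType.ELECTRIC_BIKE:
        return "Lock the bike to a rack, then tap End ride."
    if vehicle_type is VehicleType.DOCKED_BIKE:
        return "Push the bike firmly into a dock to end your ride."
    raise ValueError(f"unhandled vehicle type: {vehicle_type}")


def shows_end_ride_button(vehicle_type: VehicleType) -> bool:
    # A second feature needs its own switch on the same enum (assumed rule:
    # docked bikes end the ride at the dock, so no button is shown).
    return vehicle_type is not VehicleType.DOCKED_BIKE
```

Every new enum case means revisiting every function like these, on both iOS and Android, which is where the approach starts to strain.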
It may seem like we were adding accidental complexity to our product by releasing so many variations on similar flows, but this was necessary complexity. Our clients are both individual riders and the cities we operate in. Every city has specific needs and desires when it comes to bike and scooter share and we work closely with them to launch experiences that help make the city more accessible, more human, and more livable. While this does add complexity, it is essential to find a way to support each unique client.
Around the same time, we replaced Motivate’s city-specific bike share apps (Citi Bike, Bay Wheels, Divvy, etc.) with apps built inside the Lyft mobile codebase, sharing client and server code/infrastructure with the core Rideshare app. Things got a lot more complicated in this world: features didn’t just change per vehicle type, now they could change per app as well. The combinatorial complexity of our client code was growing quickly.
As part of the “white label” app launch, we introduced new kinds of dockless e-bikes. This is where complexity shot out of control. Across our markets, there were three vehicle-docking configurations: fully docked, fully dockless, and hybrid docked and dockless. The vehicle enum and switch statements we had grown to rely on were starting to fall apart. For example, not every e-bike acted the same and had the same feature set. The low-level hardware properties we were reading on the mobile client were making less and less sense for the world we were in. There was too much complexity for mobile engineers to stay up to date with. At this point, we needed a better abstraction for mobile to build features on top of.
Our first experiment with Server Supplemented UI
At this point, we started to play with moving UI/UX configuration to the server. We used an approach we call Server Supplemented UI (what we initially called “capabilities”; shout out to Julien Fantin for coining the phrase and pushing for the pattern). Server Supplemented UI allowed us to drive configuration of small contained parts of the screen from the server, while the client retains control over the big picture. We’d basically return a big list of boolean values, localized strings, image URLs, and UI elements like alerts, prompt panels, and labels. We could start abstracting away a lot of the implementation details of the hardware from the mobile clients and make new product and market launches easier, faster, and cheaper. In a lot of places, changes became configuration changes rather than code changes. It removed the need for client work on both iOS and Android for many types of changes.
As an example of this, we added a “station_message” field to our station object that included an image, title, description, and deep link to trigger on tap. This was originally added to denote when a station was offline, but because it was so generic we were able to reuse it for other features, like indicating “valet” stations and membership sales.
Another example was reservations: not all markets or vehicle types support reservations. Rather than baking this logic into the mobile applications, we added a capability called “is_reservable” so that the reservation feature could be enabled or disabled from the server for each and every vehicle.
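The two examples above can be sketched as follows. Only the names station_message and is_reservable (and the four parts of the station message) come from the post. The class layout, the other field names, and the rendering functions are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class StationMessage:
    image_url: str
    title: str
    description: str
    deep_link: str  # triggered when the rider taps the message


@dataclass(frozen=True)
class Station:
    name: str
    station_message: Optional[StationMessage] = None


@dataclass(frozen=True)
class Vehicle:
    vehicle_id: str
    is_reservable: bool


def station_rows(station: Station) -> List[str]:
    """The client renders whatever message the server attached, if any."""
    rows = [station.name]
    if station.station_message is not None:
        message = station.station_message
        rows.append(f"{message.title}: {message.description}")
    return rows


def vehicle_buttons(vehicle: Vehicle) -> List[str]:
    """No market or vehicle-type checks: the server-sent flag decides."""
    buttons = ["Unlock"]
    if vehicle.is_reservable:
        buttons.append("Reserve")
    return buttons
```

The client never asks why a station has a message or why a vehicle is reservable, so the same code can serve an offline notice, a valet station, or a membership offer.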
This approach served us well for a long time. We slowly added more and more areas of our UI that could be server supplemented. These generic, remotely configurable areas of the UI/feature sets allowed us to ship many MVPs/markets quickly without client work. The tradeoff was poorer UX, since these supplemented areas of the screen were generic but not all that flexible. We could live with that, and with every new feature, releasing got easier.
From Server Supplemented to Server Driven
It wasn’t until early 2022 that we had a pressing need to rethink this approach. We were working on integrating Spin scooters into the Lyft app and we knew we’d need changes to many different flows. While we had been making progress with server supplementation, a lot of the core parts of our experience still relied on that vehicle enum and scooter/bike models that looked a little too much like Lyft hardware: they had become monolithic business models that were hard to separate from UI-level concerns. The way a Spin scooter’s locks, ride states, hardware, etc. worked didn’t match one-to-one with our hardware, and we knew we didn’t want to deal with hacky mappings forever. Rethinking the approach would allow a smoother integration with Spin, while future-proofing us against subsequent vehicle/hardware types. We wanted mobile-first abstractions that would stop us from worrying about the underlying hardware and server implementation details and that would allow us to elegantly manage market, hardware, and product complexity.
We took a step back and thought about what was and wasn’t working with our APIs, and talked with other mobile teams throughout Lyft to see how they were dealing with this kind of complexity. It turns out that Rideshare faces similar problems: many markets and products with varying feature sets, UX flavors, and differing configurations. A lot of what was working for Lyft Bikes and Scooters was also working for Rideshare: namely, Server Driven (or Supplemented) UI. When we could control/configure the UI/UX directly from the server, feature or product launches went much more smoothly. Everywhere we had capabilities, localized strings, and APIs that modeled UI interactions and elements, we gained flexibility and control that were invaluable for moving quickly and supporting ever-growing product complexity. We knew we wanted to lean on and supercharge that approach. We decided to move from Server Supplemented to Server Driven.
What is SDUI (Server Driven UI)?
The key change was the introduction of a mobile-specific “backend for frontend”, or BFF (and yes, you might say we hope to be best friends forever with this architecture!).
At a certain level, all UI within the Lyft app is server-driven, since the core functionality requires an internet connection. However, that definition isn’t useful when talking about an architectural pattern. We define SDUI as an architecture that shifts the majority of the business and display logic from the client to the server. It breaks down the classic boundaries of server and client concerns, creating an extension of the client that lives on a server. While still being structured around a micromobility use case, the clients become increasingly passive, doing exactly what the server tells them to. The app doesn’t need to know what a bike ride is for the currently visible feature set; the server just tells the app which components/features to render.
You’ve ventured into SDUI land when returning “view models” from the server rather than “business models”. If you’re interested in learning more, check out our mobile podcast episode on Lyft’s SDUI philosophy.
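One way to picture the difference between the two kinds of model is the sketch below. Every field name, the pricing rule, and the display strings are hypothetical. The point is only that the server does the formatting and the client receives display-ready values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RideBusinessModel:
    """Roughly what a database-shaped model looks like (hypothetical fields)."""
    ride_id: str
    vehicle_type: str
    started_at_epoch_s: int
    cents_per_minute: int


@dataclass(frozen=True)
class RidePanelViewModel:
    """What the client actually needs in order to draw the panel."""
    title: str
    subtitle: str
    primary_button_label: str


def to_view_model(ride: RideBusinessModel, now_epoch_s: int) -> RidePanelViewModel:
    """Server-side mapping: business and display logic live here, not on the client."""
    minutes = max(0, (now_epoch_s - ride.started_at_epoch_s) // 60)
    cost_dollars = minutes * ride.cents_per_minute / 100
    return RidePanelViewModel(
        title=f"Riding for {minutes} min",
        subtitle=f"${cost_dollars:.2f} so far",
        primary_button_label="End ride",
    )
```

A client that receives the view model no longer needs to know about pricing rules or vehicle types in order to draw this panel.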
Why did we go from Server Supplemented to Server Driven?
A few forces were pushing us to standardize our Server Driven UI approach:
- Server Supplementation had served us well, but everything was ad hoc. There was no standardized approach to defining how the server represented a piece of UI. Every new feature required a lot of work and back and forth.
- There weren’t very many shared building blocks we were working with, so every time we migrated a part of the screen to be server supplemented, we’d be starting mostly from scratch.
- We were adding a lot of tech debt to our core micro-mobility services by mixing core business logic with UI concerns.
- The mobile apps were hitting the Bikes & Scooters “platform” services directly. APIs that originally returned business models were now returning business models with UI fields sprinkled throughout them. Other consumers of these APIs had to deal with a growing collection of fields that made no sense for them (ex. deep links make sense for mobile clients but not for the web).
- The models were getting way too big!
- Our ride and vehicle models had so many fields, we had to come up with hacks to keep the app from crashing when we created and passed them around.
So Where Did We Start And What Did We Build?
We wanted to completely rethink the client-server relationship for our most common flows. We wanted to declare tech debt bankruptcy with our current APIs. We proposed starting with the golden path panels (the ones everyone sees when they take a ride) as they are our most used and most important screens. These views are where users understand the supply of our system, unlock vehicles, learn more about their current ride’s status, and end rides. They are also where most of the changes in the app happen, so having flexibility there would be more valuable than anywhere else. We wanted to migrate “all the things” to SDUI, but we had to be realistic. We didn’t want to jump too far, too fast into a new approach; smaller steps meant we could undo anything we got wrong.
We proposed a new BFF (backend for frontend) microservice named lbsbff exclusively for the Lyft bikes & scooters experience to abstract away hardware and provider implementation details from the client. That way we could build an SDUI platform while allowing the business platform layer to stop caring about the mobile clients. The server team could deal with data and business logic while mostly ignoring what the UI looks like.
The lbsbff service would house the new endpoints the mobile apps hit to light up our panels. It would be responsible for fetching data from upstream services, merging it, and returning view representations of the data to the clients. We just had to figure out what those view representations would look like.
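The fetch, merge, and return flow described above could be sketched like this. It is not the lbsbff implementation: the function name, the upstream fetchers, the payload fields, and the component shapes are all assumptions, and the upstream services are replaced by in-memory fakes so the example runs on its own.

```python
from typing import Callable


def build_vehicle_panel(
    vehicle_id: str,
    fetch_vehicle: Callable[[str], dict],
    fetch_pricing: Callable[[str], dict],
) -> dict:
    """Fetch from upstream services, merge, and return a view representation."""
    vehicle = fetch_vehicle(vehicle_id)
    pricing = fetch_pricing(vehicle_id)
    return {
        "components": [
            {"type": "title", "text": vehicle["display_name"]},
            {"type": "label", "text": f"{vehicle['battery_percent']}% battery"},
            {"type": "label", "text": f"${pricing['unlock_cents'] / 100:.2f} to unlock"},
        ]
    }


# In-memory stand-ins for upstream platform services (hypothetical payloads).
def fake_fetch_vehicle(vehicle_id: str) -> dict:
    return {"display_name": "E-bike", "battery_percent": 82}


def fake_fetch_pricing(vehicle_id: str) -> dict:
    return {"unlock_cents": 100}


panel = build_vehicle_panel("bike-123", fake_fetch_vehicle, fake_fetch_pricing)
```

Passing the fetchers in as parameters keeps the merge logic easy to test and keeps the platform services free of UI concerns.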
We landed on a set of primitives, a group of common nouns, to refer to the major parts of our SDUI architecture to make it easier to communicate and reason about things. We were surprised by how few were necessary to add lots of power to our BFF.
The Core Primitives of SDUI
Components
A component represents a view appearing inside of our panel. There are two types of components in our platform:
Declarative Components
We think of Declarative Components (also called Generic Components) as similar to HTML. They are representations of native views sent over the wire (using protobuf in our case).
These are the components you probably first think of when you think of Server Driven UI. The client has no domain knowledge about these components; they are completely generic. We define a view declaratively on the server and send down a protobuf representation of the view primitives, which the client then parses and renders.
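As an illustration of the idea, a generic client-side renderer might walk a tree of view primitives like the sketch below. Plain dictionaries stand in for the protobuf messages, the primitive names (stack, text, image) are invented for this example, and the output is a list of strings rather than native views.

```python
from typing import List


def render(node: dict, depth: int = 0) -> List[str]:
    """Render a declarative view tree with no domain knowledge at all."""
    pad = "  " * depth
    kind = node["type"]
    if kind == "text":
        return [f"{pad}Text({node['value']!r})"]
    if kind == "image":
        return [f"{pad}Image({node['url']!r})"]
    if kind == "stack":
        lines = [f"{pad}Stack({node['axis']})"]
        for child in node["children"]:
            lines.extend(render(child, depth + 1))
        return lines
    # Unknown primitive (for example, sent to an older client): skip it.
    return []


tree = {
    "type": "stack",
    "axis": "vertical",
    "children": [
        {"type": "text", "value": "Station offline"},
        {"type": "image", "url": "https://example.com/offline.png"},
        {"type": "hologram"},  # not understood by this client
    ],
}
rendered = render(tree)
```

Nothing in the renderer mentions bikes, stations, or rides, which is what makes the same code reusable across unrelated features.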
We were extremely lucky that the Support XP team at Lyft had been developing a framework to define and render generic views from the server, originally for the Lyft app’s help flows. We were able to reuse their framework here and avoid reinventing the wheel. Shoutout to Brett Jones and David Stemmer for their amazing work here!
Semantic Components
We think of Semantic Components as client-driven components. The server says to render component X but doesn’t define what it looks like or how it acts. They can act as an escape hatch from server-driven land.
Semantic Components are an important part of our server-driven strategy. Declarative Components are powerful but come with a lot of limitations. It’s hard to model complex animations and view hierarchies from the server. They work remarkably well for simpler experiences but don’t support highly responsive (i.e. fully client-side) interactions. To get around these limitations, we support Semantic Components that the client parses and knows how to render. These components map to predetermined layouts/views on the client; the server just sends down the data necessary to hydrate them. After the client inflates the views, it can then control that area of the screen however it needs to.
Sometimes the dividing line between semantic and declarative can be hazy. Many of our semantic components are still highly configurable (swapping out images, text, colors, actions, etc.), but the core layout and interactions in all of them are defined by the client.
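A minimal sketch of the semantic side, assuming a simple name-to-renderer registry on the client: the server names a component and sends data, and the client decides what it looks like. The component names, data fields, and string output are hypothetical.

```python
from typing import Callable, Dict, Optional

SEMANTIC_RENDERERS: Dict[str, Callable[[dict], str]] = {}


def semantic(name: str):
    """Register a client-defined layout under the name the server will send."""
    def register(fn: Callable[[dict], str]) -> Callable[[dict], str]:
        SEMANTIC_RENDERERS[name] = fn
        return fn
    return register


@semantic("battery_gauge")
def render_battery_gauge(data: dict) -> str:
    # Layout is owned by the client; the server only hydrates it with data.
    filled = data["percent"] // 10
    return "[" + "#" * filled + "." * (10 - filled) + "]"


@semantic("ride_timer")
def render_ride_timer(data: dict) -> str:
    return f"RideTimer(started_at={data['started_at_epoch_s']})"


def render_semantic(component: dict) -> Optional[str]:
    """Look up the predetermined layout; return None if this client lacks it."""
    renderer = SEMANTIC_RENDERERS.get(component["name"])
    if renderer is None:
        return None
    return renderer(component["data"])
```

In contrast to a declarative tree, the payload here carries no layout information, only a name and the values needed to fill the view in.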
The Rideshare team was also working on server-driven panels and we were able to reuse/build client libraries and API definitions with them to return and render components in the panel. Shoutout to Ryan Demo and Jaden Choi. And in turn, they had been building off earlier approaches shipped for parts of the ride flow. Shoutout Keren Tevet and Dmitry Parshin!
Besides these two types of components, we have one other core primitive: Actions.
Actions
An Action represents a single piece of logic that can be associated with preconfigured trigger points, like a button tap, a view load, a checkbox toggle, etc. Actions can be deep links, but generally they are pre-registered commands on the client that run associated client code (specific network requests, launching a flow like unlocking a bike or opening the help center, changing client state, etc). The action returned by the server contains data to configure how it works, but the client is responsible for actually triggering and controlling the action.
An example action to trigger the unlock bike flow might look like:
message TriggerUnlockBikeFlow {
  string bike_id = 1;
  bool default_open_to_qr_scanner = 2;
}
The server doesn’t define any specifics about how the flow works besides the initial configuration. It just tells the client to go and do its thing.
Components (both Semantic and Declarative) define their Action trigger points and can be configured with any of the supported actions. Actions can also be chained. For example, you can trigger an Action that shows an alert, which then triggers further actions when buttons inside of that alert are tapped.
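To illustrate pre-registered actions and chaining, here is a small sketch of a client-side dispatcher. The TriggerUnlockBikeFlow fields come from the example above. The dispatcher class, the show_alert action, the dictionary payloads, and the log (which stands in for real side effects) are assumptions.

```python
class ActionDispatcher:
    """Client-side registry of pre-registered action handlers."""

    def __init__(self):
        self.handlers = {}
        self.log = []  # stands in for real side effects so the sketch is testable

    def register(self, action_type, handler):
        self.handlers[action_type] = handler

    def dispatch(self, action: dict) -> None:
        handler = self.handlers.get(action["type"])
        if handler is None:
            self.log.append(f"ignored unknown action: {action['type']}")
            return
        handler(self, action)


def trigger_unlock_bike_flow(dispatcher, action):
    # The server only supplies initial configuration; the client owns the flow.
    start = "qr_scanner" if action["default_open_to_qr_scanner"] else "code_entry"
    dispatcher.log.append(f"unlock flow for {action['bike_id']} starting at {start}")


def show_alert(dispatcher, action):
    dispatcher.log.append(f"alert shown: {action['title']}")


def tap_alert_button(dispatcher, alert_action, label):
    """Simulate a tap: run the actions the server attached to that button."""
    for button in alert_action["buttons"]:
        if button["label"] == label:
            for next_action in button["on_tap"]:
                dispatcher.dispatch(next_action)


dispatcher = ActionDispatcher()
dispatcher.register("show_alert", show_alert)
dispatcher.register("trigger_unlock_bike_flow", trigger_unlock_bike_flow)

unlock = {"type": "trigger_unlock_bike_flow", "bike_id": "bike-123",
          "default_open_to_qr_scanner": True}
alert = {"type": "show_alert", "title": "Helmet reminder",
         "buttons": [{"label": "Unlock", "on_tap": [unlock]},
                     {"label": "Cancel", "on_tap": []}]}

dispatcher.dispatch(alert)                      # trigger point: e.g. a button tap
tap_alert_button(dispatcher, alert, "Unlock")   # chained action fires
```

Because the alert only carries a list of actions, the server can attach the same unlock action to any other trigger point without a client change.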
Actions have been an incredibly powerful pattern that has increased the flexibility and reusability of many of our components. You can reuse a layout in multiple places but trigger different actions depending on the context.
Decoupling Actions and Components has been a major win in our SDUI platform. Even in Semantic Components, we decouple layout from actions. That means that layouts made for one purpose can be easily used for another without client work or releasing new app versions.
Primitives Changed How We Think About Feature Development
One of the core ways we have approached things in the BFF is by creating the smallest and most atomic components and actions we can. This makes it easier to test everything, but it has also made it incredibly easy to build new features based on existing primitives. If you already have “unlock a bike” as an Action you can then add new entry points to unlocking a bike with a simple change to the BFF.
When you start using Server Driven UI, every feature expands the platform and makes every future feature easier, faster, and of higher quality. If we are missing Actions or Components we can add them and they become available for all future work.
Evolving SDUI in a Sustainable Way Across Many Teams
We don’t have a single framework to build SDUI at Lyft yet. Instead, feature teams have had a lot of freedom to experiment on their own and figure out what works. Over time, we have collectively converged on similar solutions and patterns to common SDUI problems. We’ve been able to learn from each other and push the boundaries in our respective platforms. As we learned more about what each team was building, we realized there was a lot of opportunity for sharing code so that we didn’t have to solve the same problem multiple times in each of the different SDUI systems.
It started small with aligning on solutions to common problems like “Capabilities” (which components are supported in which client versions) and progressively worked up to bigger things like shared rich text content and rendering for common UI components like the panel. Teams from across the company could benefit from each other’s work, and still explore the problem space and build solutions that worked for them. Eventually, we plan to create a common suite of generic building blocks that all teams can use to craft their own Server Driven UI experiences.
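The post does not say how “Capabilities” are communicated, so the following is only one possible shape. It assumes the client reports the set of component names it supports and the server drops unsupported components or swaps in a fallback. All names here are hypothetical.

```python
from typing import List, Set


def select_components(candidates: List[dict], supported: Set[str]) -> List[dict]:
    """Keep only what this client version can render, using fallbacks if given."""
    selected = []
    for component in candidates:
        if component["name"] in supported:
            selected.append(component)
            continue
        fallback = component.get("fallback")
        if fallback is not None and fallback["name"] in supported:
            selected.append(fallback)
        # Otherwise the component is dropped for this client.
    return selected


candidates = [
    {"name": "ride_summary_v2", "fallback": {"name": "ride_summary_v1"}},
    {"name": "promo_banner"},
    {"name": "unlock_button"},
]
old_client = {"ride_summary_v1", "unlock_button"}
names = [c["name"] for c in select_components(candidates, old_client)]
```

A check like this lets the server start returning new components while older app versions are still in use.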
TL;DR
- SDUI is a great approach when your product has a lot of variation and complex configuration. It allows you to move that configuration to the server, where changes don’t require app updates, release cycles, or duplication across platforms.
- This allows you to experiment with your product ideas quickly.
- It allows for building highly contextual and personalized experiences in a scalable way.
- It allows for flexible staffing: in many cases, either client or server engineers can power the UI.
- It has a snowball effect where the more you build on top of the platform the more powerful and useful it grows.