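Company: Lyft
Lyft is a transportation network company headquartered in San Francisco, California, United States. Its mobile app connects riders with drivers, offering vehicle-for-hire and ride-matching services as part of the sharing economy. Riders can book a vehicle by sending a text message or by using the mobile app, which also lets them track the vehicle's location.
With a 30% market share, Lyft is the second-largest ride-hailing company in the United States, after Uber.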
From Big Data to Better Data: Ensuring Data Quality with Verity
High-quality data is necessary for the success of every data-driven company. It enables everything from reliable business logic to insightful decision-making and robust machine learning modeling. It is now the norm for tech companies to have a well-developed data platform. This makes it easy for engineers to generate, transform, store, and analyze data at the petabyte scale. As such, we have reached a point where the quantity of data is no longer a boundary. Yet this has come at the cost of quality.
In this post we will define data quality at a high level and explore our motivation to achieve better data quality. We will then introduce our in-house product, Verity, and showcase how it serves as a central platform for ensuring data quality in our Hive Data Warehouse. In future posts we will discuss how Verity addresses data quality elsewhere in our data platform.
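To make the idea of a data quality check concrete, here is a minimal sketch of the kind of assertions such a platform might run against a table. It is not Verity's API: the function name `run_checks`, the row format (a list of dicts), and the check names are all assumptions made for illustration.

```python
def run_checks(rows, key, required):
    """Run a few generic data quality checks over in-memory rows.

    Hypothetical helper for illustration only; not part of Verity.
    rows: list of dicts, one per record
    key: column expected to be unique across rows
    required: columns that must never be None
    """
    total = len(rows)
    keys = [row.get(key) for row in rows]
    results = {
        # Volume check: the table should not be empty.
        "non_empty": total > 0,
        # Uniqueness check: no duplicate values in the key column.
        "unique_key": len(set(keys)) == total,
    }
    # Completeness check: required columns must have no missing values.
    for col in required:
        nulls = sum(1 for row in rows if row.get(col) is None)
        results[f"no_nulls:{col}"] = nulls == 0
    return results


rides = [
    {"ride_id": 1, "rider_id": "a", "fare": 12.5},
    {"ride_id": 2, "rider_id": "b", "fare": None},
    {"ride_id": 2, "rider_id": "c", "fare": 8.0},
]
report = run_checks(rides, key="ride_id", required=["rider_id", "fare"])
```

In this made-up sample, the duplicate `ride_id` and the missing `fare` would each fail their check, while the volume check and the `rider_id` completeness check would pass.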
Building a Control Plane for Lyft’s Shared Development Environment
Note: This publication assumes you have basic familiarity with the service mesh pattern (e.g. Istio, Linkerd, Envoy — created at Lyft!)
Where’s My Data — A Unique Encounter with Flink Streaming’s Kinesis Connector
For years now, Lyft has not only been a proponent of but also a contributor to Apache Flink. Lyft’s pipelines have evolved drastically over the years, yet, time and time again, we run into unique cases that stretch Flink to its breaking points — this is one of those times.
Building Real-time Machine Learning Foundations at Lyft
In early 2022, Lyft already had a comprehensive Machine Learning Platform called LyftLearn composed of model serving, training, CI/CD, feature serving, and model monitoring systems.
On the real-time front, LyftLearn supported real-time inference and input feature validation. However, streaming data was not supported as a first-class citizen across many of the platform’s systems — such as training, complex monitoring, and others.
While several teams were using streaming data in their Machine Learning (ML) workflows, doing so was a laborious process, sometimes requiring weeks or months of engineering effort. On the flip side, there was a substantial appetite among developers at Lyft to build real-time ML systems.
Lyft is a real-time marketplace and many teams benefit from enhancing their machine learning models with real-time signals.
To meet the needs of our customers, we kicked off the Real-time Machine Learning with Streaming initiative. Our goal was to develop foundations that would enable the hundreds of ML developers at Lyft to efficiently develop new models and enhance existing models with streaming data.
In this blog post, we will discuss some of what we built in support of that goal and the lessons we learned along the way.
Gotchas of Streaming Pipelines: Profiling & Performance Improvements
Discover how Lyft identified and fixed performance issues in our streaming pipelines.
Building a large scale unsupervised model anomaly detection system — Part 2
Building ML Models with Observability at Scale.
Building a large scale unsupervised model anomaly detection system — Part 1
Distributed Profiling of Model Inference Logs.
Big Savings On Big Data
How Lyft’s ML Platform Saves Time and Money on Big Data/ML Workloads.
The Recommendation System at Lyft
Recommendation plays an important role in Lyft’s understanding of its riders and allows for customizing app experiences to better fulfill their needs. At times, recommendations are also leveraged to manage the marketplace, making sure there’s a healthy balance between ride demand and driver supply. This allows ride requests to be fulfilled with more desirable dispatch outcomes such as matching riders with the best driver nearby.
This blog post focuses on the scope and the goals of the recommendation system, and explores some of the most recent changes the Rider team has made to better serve Lyft’s riders.
SimulatedRides: How Lyft uses load testing to ensure reliable service during peak events
We know what you’re thinking — testing in production is one of the cardinal sins of software development. However, at Lyft we have come to realize that load testing in production is a powerful tool to prepare systems for unexpected bursty traffic and peak events. We’ll explore why Lyft needed a custom performance testing framework that worked in production, how we built a cross-functional solution, and how we’ve continued to improve this testing platform since its launch in 2016.
What exactly do we mean by “Load Testing”? In the context of this article we mean any tool that creates traffic to stress test systems and see how they perform at the limits of their capacity.
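As a rough illustration of that definition, the sketch below generates concurrent traffic against a target and summarizes how it behaved. It is not the SimulatedRides framework: `load_test`, its parameters, and the `fake_service` stand-in are hypothetical, and a real load test would call a network endpoint rather than a local function.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def load_test(target, total_requests, concurrency):
    """Call target(i) total_requests times using `concurrency` worker threads.

    Hypothetical helper for illustration; returns request count, error count,
    and an approximate 95th-percentile latency in seconds.
    """
    def timed_call(i):
        start = time.perf_counter()
        try:
            target(i)
            ok = True
        except Exception:
            # Any exception from the target counts as a failed request.
            ok = False
        return ok, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total_requests)))

    latencies = sorted(duration for _, duration in results)
    errors = sum(1 for ok, _ in results if not ok)
    p95 = None
    if latencies:
        # Nearest-rank style index, clamped to the last element.
        p95 = latencies[min(len(latencies) - 1, int(0.95 * len(latencies)))]
    return {"requests": len(results), "errors": errors, "p95_seconds": p95}


def fake_service(i):
    """Stand-in for a real endpoint: fails on every tenth request."""
    if i % 10 == 0:
        raise ValueError("simulated failure")


summary = load_test(fake_service, total_requests=50, concurrency=5)
```

Raising `concurrency` and `total_requests` until the error count or tail latency degrades is one simple way to probe the limits of capacity mentioned above.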
lyft2vec — Embeddings at Lyft
Graph learning methods can reveal interesting insights that capture the underlying relational structures. Graph learning methods have many industry applications in areas such as product or content recommender systems and network analysis.
In this post, we discuss how we use graph learning methods at Lyft to generate embeddings, which are compact vector representations of high-dimensional information. We will share interesting rideshare insights uncovered by embeddings of riders, drivers, locations, and time. As the examples will show, trained embeddings from graphs can represent information and patterns that are hard to capture with traditional, straightforward features.
The Journey to Server Driven UI At Lyft Bikes and Scooters
Over the past couple of years, different mobile app teams across Lyft have been moving to Server Driven UI (SDUI) for three main reasons:
- To deal with business complexity
- To increase release velocity
- To be more flexible in how we staff and build features
This post is about Lyft Bikes and Scooters’ journey to SDUI, why we’ve gone down this path, and what’s worked well for us.
Quantifying Efficiency in Ridesharing Marketplaces
The health of Lyft’s marketplace depends on how riders and drivers are distributed across space and time. Within the complex rideshare space, it is not easy to define typical marketplace concepts like “market efficiency” and “supply-demand balance”. A simple question such as “Do we have enough drivers right now?” has different answers depending on context:
- Are there enough drivers in the right places to maintain good service levels?
- Are there enough drivers system-wide, assuming a ride request will be accepted no matter how far away it is?
- Are there enough to maintain an attractive earning rate?
Each question leads in a different direction. Being able to answer such questions is the interesting (and challenging!) part of operating a healthy two-sided marketplace.
Powering Millions of Real-Time Decisions with LyftLearn Serving
Hundreds of millions of real-time decisions are made each day at Lyft by online machine learning models. These model-based decisions include price optimization for rides, incentives allocation for drivers, fraud detection, ETA prediction, and innumerable others that impact how riders move and drivers earn.
A Review of Multi-Armed Bandits Applications at Lyft
Lyft hosts a dynamic marketplace connecting millions of people to a robust transportation network. In order to offer high value and quality service for both riders and drivers we need to make complex optimization decisions in near-real time. The environment can change quickly with traffic, events and weather, making these decisions even more challenging.
We have employed multi-armed bandit (MAB) algorithms, a common machine learning method for decision making using long-term rewards, to improve our real-time decision making capability. MABs not only allow us to iterate at a faster cadence and lower cost, but also enable dynamic user experiences and responsive marketplace systems. We will walk through some of our most impactful MAB applications in UI optimization and personalized messaging, concluding with applications in our marketplace algorithms.
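For readers new to bandits, the following is a minimal sketch of one common MAB strategy, epsilon-greedy, run against simulated binary rewards. The source does not say which bandit algorithms Lyft uses, so the choice of epsilon-greedy, the class and function names, and the reward rates are all assumptions for illustration.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit: explore with probability epsilon, else exploit."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # number of pulls per arm
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        # Explore: pick a uniformly random arm.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        # Exploit: pick the arm with the highest estimated reward.
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the running mean for the pulled arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(true_rates, steps=5000, seed=0):
    """Run the bandit against arms that pay 1.0 with the given probabilities."""
    bandit = EpsilonGreedyBandit(len(true_rates), seed=seed)
    env = random.Random(seed + 1)
    for _ in range(steps):
        arm = bandit.select_arm()
        reward = 1.0 if env.random() < true_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit


# Hypothetical example: two message variants with different success rates.
result = simulate([0.1, 0.9])
```

Over time the bandit shifts most of its traffic to the better-performing arm while still spending a small fraction of pulls on exploration, which is how it adapts if conditions change.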
Detecting Android memory leaks in production
Monitoring mobile performance and resource consumption at Lyft.