Company: Lyft

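Lyft is a transportation network company headquartered in San Francisco, California, USA. It develops a mobile app that connects passengers with drivers, offering sharing-economy services for vehicle hire and ride-share matching. Passengers can book a vehicle by sending a text message or by using the mobile app; when using the app, they can also track the vehicle's location.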

Lyft holds a 30% market share, making it the second-largest ride-hailing company in the United States after Uber.

Crafting Seamless Journeys with Live Activities

In this edition, we transition from pixels to code, exploring the technical side that orchestrates Live Activities at Lyft.

Lyft’s Reinforcement Learning Platform

Tackling decision making problems with a platform for developing & serving Reinforcement Learning models with a focus on Contextual Bandits
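For readers new to the term, a contextual bandit picks an action given some context, observes a reward, and updates its estimate for that context and action. The block below is a minimal epsilon-greedy sketch of that loop. It is not Lyft's platform or API; the class name, the context labels, and the arm names are all hypothetical.

```python
import random
from collections import defaultdict


class EpsilonGreedyContextualBandit:
    """Toy contextual bandit: one running mean reward per (context, arm)."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon            # probability of exploring
        self.rng = random.Random(seed)    # seeded for reproducibility
        self.counts = defaultdict(int)    # (context, arm) -> observations
        self.values = defaultdict(float)  # (context, arm) -> mean reward

    def choose(self, context):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.values[(context, arm)])

    def update(self, context, arm, reward):
        # Incremental mean: new_mean = old_mean + (reward - old_mean) / n
        key = (context, arm)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]


# Hypothetical contexts and arms, purely for illustration.
bandit = EpsilonGreedyContextualBandit(["offer_a", "offer_b"], epsilon=0.0)
bandit.update("rush_hour", "offer_a", 0.0)
bandit.update("rush_hour", "offer_b", 1.0)
bandit.update("late_night", "offer_a", 1.0)
```

With `epsilon=0.0` the policy is purely greedy, so the same bandit can prefer different arms in different contexts once it has seen rewards for them.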

Postgres Aurora DB major version upgrade with minimal downtime

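To upgrade the database with minimal downtime, the following steps are performed: configure GET requests in the downstream pods to return 503 errors; enable the circuit breaker to protect the database; put the PG10 database into read-only mode and verify that write transactions are disabled; terminate all connections to the PG10 database; check replication lag to confirm that the PG10 and PG13 databases are in sync; reset sequences to avoid sequence number conflicts; update Route53 so that the database connection string points to the PG13 database; verify the Route53 DNS update; verify writes on the PG13 database by running a write script from an application pod; and disable the circuit breaker to restore ingress traffic to the application. This blue-green deployment approach successfully reduced downtime during the upgrade. Thanks go to Shailesh Rangani and Suyog Pagare, whose Postgres expertise kept the downtime of this upgrade to a minimum.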
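The ordering of these steps matters: each one should succeed before the next begins. The sketch below models the cutover as an ordered checklist that halts at the first failed step. It is an illustration only, not the tooling described in the post; `run_cutover`, `placeholder_check`, and `CUTOVER_STEPS` are hypothetical names, and every check is a stub that a real runbook would replace with an actual verification.

```python
from typing import Callable, List, Tuple

# A step is a description plus a check that returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_cutover(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first one that fails."""
    completed = []
    for name, check in steps:
        if not check():
            raise RuntimeError(f"cutover halted at step: {name}")
        completed.append(name)
    return completed


def placeholder_check() -> bool:
    # Hypothetical stub; a real check would query the database, DNS, etc.
    return True


CUTOVER_STEPS: List[Step] = [
    ("return 503 for GET requests in downstream pods", placeholder_check),
    ("enable circuit breaker", placeholder_check),
    ("set PG10 read-only and verify writes are disabled", placeholder_check),
    ("terminate all connections to PG10", placeholder_check),
    ("confirm replication lag shows PG10 and PG13 in sync", placeholder_check),
    ("reset sequences on PG13", placeholder_check),
    ("update Route53 to point at PG13", placeholder_check),
    ("verify Route53 DNS update", placeholder_check),
    ("verify writes on PG13 from an application pod", placeholder_check),
    ("disable circuit breaker and restore ingress traffic", placeholder_check),
]

completed_steps = run_cutover(CUTOVER_STEPS)
```

Failing fast keeps a partially completed cutover from silently proceeding, for example from repointing DNS before replication lag has been confirmed.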

Python Upgrade Playbook

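This post describes the Lyft team's experience and practices with Python upgrades. The team shared progress and answered questions through regular update emails and a Slack channel, and encouraged upgrades by making new features available only on newer Python versions. They successfully upgraded more than 1,500 repositories without major issues, thanks to strong CI and staging environments. Their upgrades have become steadily faster, and they made progress while other major projects were underway. The work also brought side benefits, such as a faster development workflow and standardized datasets. The team plans to extend its tooling to the entire infrastructure in order to track all upgrades and promote best practices.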

Druid Deprecation and ClickHouse Adoption at Lyft

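ClickHouse is an open-source, high-performance, column-oriented database for online analytical processing. Lyft decided to scale out ClickHouse and deprecate Druid, migrating its existing Druid use cases to ClickHouse. Compared with Druid, ClickHouse offers simpler infrastructure management, a gentler learning curve, data deduplication, lower cost, and specialized engines. Lyft evaluated ClickHouse through benchmarking and performance analysis and carried out a smooth migration. Its ClickHouse architecture is based on Altinity's Kubernetes Operator, runs in HA mode, and uses AWS M5 compute instances with EBS volumes for storage. Data is ingested mainly through Kafka and Kinesis, and read queries go through an internal proxy and visualization tools. Lyft processes large volumes of data on ClickHouse and has tuned query performance with techniques including sorting keys, skip indexes, and projections. ClickHouse serves multiple use cases, including market health, policy reporting, spend tracking, forecasting, and experimentation. Lyft also ran into some problems along the way, such as query cache performance and issues with the Kafka integration. Looking ahead, Lyft plans to expand its use of ClickHouse, including stabilizing the batch architecture and adopting streaming Kinesis ingestion. It also plans to migrate Flink SQL to ClickHouse and is considering replacing ZooKeeper with ClickHouse Keeper to reduce dependencies on external components.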
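To give a feel for why skip indexes speed up queries, the sketch below mimics the idea behind a minmax-style data-skipping index: keep the minimum and maximum of a column for each block of rows, then skip any block whose range cannot match the filter. This is a conceptual Python illustration, not ClickHouse's implementation; the function names, the granule size, and the toy `fares` column are assumptions made for the example.

```python
from typing import List, Tuple


def build_minmax_index(values: List[int], granule_size: int) -> List[Tuple[int, int]]:
    """Store (min, max) for each fixed-size block of rows."""
    return [
        (min(values[i:i + granule_size]), max(values[i:i + granule_size]))
        for i in range(0, len(values), granule_size)
    ]


def granules_to_read(index: List[Tuple[int, int]], low: int, high: int) -> List[int]:
    """Return the block numbers whose [min, max] range overlaps [low, high]."""
    return [
        n for n, (block_min, block_max) in enumerate(index)
        if block_max >= low and block_min <= high
    ]


# Toy column with 12 rows, grouped 4 rows per block.
fares = [5, 7, 6, 8, 20, 22, 21, 25, 9, 40, 41, 45]
index = build_minmax_index(fares, granule_size=4)

# Filter: fare BETWEEN 21 AND 23
candidates = granules_to_read(index, 21, 23)
```

Here the first block (range 5 to 8) is skipped outright. The last block (range 9 to 45) must still be read even though it holds no matching rows, which shows why such an index helps most when the indexed column correlates with the table's sort order.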

From Big Data to Better Data: Ensuring Data Quality with Verity

High-quality data is necessary for the success of every data-driven company. It enables everything from reliable business logic to insightful decision-making and robust machine learning modeling. It is now the norm for tech companies to have a well-developed data platform. This makes it easy for engineers to generate, transform, store, and analyze data at the petabyte scale. As such, we have reached a point where the quantity of data is no longer a boundary. Yet this has come at the cost of quality.

In this post we will define data quality at a high-level and explore our motivation to achieve better data quality. We will then introduce our in-house product, Verity, and showcase how it serves as a central platform for ensuring data quality in our Hive Data Warehouse. In future posts we will discuss how Verity addresses data quality elsewhere in our data platform.

Building a Control Plane for Lyft’s Shared Development Environment

Note: This publication assumes you have basic familiarity with the service mesh pattern (e.g. Istio, Linkerd, Envoy — created at Lyft!)

Where’s My Data — A Unique Encounter with Flink Streaming’s Kinesis Connector

For years now, Lyft has not only been a proponent of but also a contributor to Apache Flink. Lyft’s pipelines have evolved drastically over the years, yet, time and time again, we run into unique cases that stretch Flink to its breaking points — this is one of those times.

Building Real-time Machine Learning Foundations at Lyft

In early 2022, Lyft already had a comprehensive Machine Learning Platform called LyftLearn composed of model serving, training, CI/CD, feature serving, and model monitoring systems.

On the real-time front, LyftLearn supported real-time inference and input feature validation. However, streaming data was not supported as a first-class citizen across many of the platform’s systems — such as training, complex monitoring, and others.

While several teams were using streaming data in their Machine Learning (ML) workflows, doing so was a laborious process, sometimes requiring weeks or months of engineering effort. On the flip side, there was a substantial appetite to build real-time ML systems from developers at Lyft.

Lyft is a real-time marketplace and many teams benefit from enhancing their machine learning models with real-time signals.

To meet the needs of our customers, we kicked off the Real-time Machine Learning with Streaming initiative. Our goal was to develop foundations that would enable the hundreds of ML developers at Lyft to efficiently develop new models and enhance existing models with streaming data.

In this blog post, we will discuss some of what we built in support of that goal and the lessons we learned along the way.

Gotchas of Streaming Pipelines: Profiling & Performance Improvements

Discover how Lyft identified and fixed performance issues in our streaming pipelines.

Building a large scale unsupervised model anomaly detection system — Part 2

Building ML Models with Observability at Scale.

Building a large scale unsupervised model anomaly detection system — Part 1

Distributed Profiling of Model Inference Logs.

Big Savings On Big Data

How Lyft’s ML Platform Saves Time and Money on Big Data/ML Workloads.

The Recommendation System at Lyft

Recommendation plays an important role in Lyft’s understanding of its riders and allows for customizing app experiences to better fulfill their needs. At times, recommendations are also leveraged to manage the marketplace, making sure there’s a healthy balance between ride demand and driver supply. This allows ride requests to be fulfilled with more desirable dispatch outcomes such as matching riders with the best driver nearby.

This blog post focuses on the scope and the goals of the recommendation system, and explores some of the most recent changes the Rider team has made to better serve Lyft’s riders.

SimulatedRides: How Lyft uses load testing to ensure reliable service during peak events

We know what you’re thinking — testing in production is one of the cardinal sins of software development. However, at Lyft we have come to realize that load testing in production is a powerful tool to prepare systems for unexpected bursty traffic and peak events. We’ll explore why Lyft needed a custom performance testing framework that worked in production, how we built a cross-functional solution, and how we’ve continued to improve this testing platform since its launch in 2016.

What exactly do we mean by “Load Testing”? In the context of this article we mean any tool that creates traffic to stress test systems and see how they perform at the limits of their capacity.
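As a rough illustration of that definition, the sketch below sends a burst of concurrent calls at a handler and reports how many failed and how slow the slowest call was. It is a generic toy, not the SimulatedRides framework; `run_load` and `fake_service` are hypothetical names, and the handler is an in-process function that fails on a fixed pattern rather than a real service.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_load(handler: Callable[[int], None], requests: int, concurrency: int) -> Dict[str, float]:
    """Call handler `requests` times using `concurrency` worker threads."""

    def one_call(request_id: int):
        start = time.perf_counter()
        try:
            handler(request_id)
            ok = True
        except Exception:
            ok = False  # count any exception as a failed request
        return ok, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(one_call, range(requests)))

    latencies = [elapsed for _, elapsed in results]
    return {
        "total": len(results),
        "errors": sum(1 for ok, _ in results if not ok),
        "max_latency_s": max(latencies) if latencies else 0.0,
    }


def fake_service(request_id: int) -> None:
    # Hypothetical handler: every tenth request fails to simulate overload.
    if request_id % 10 == 0:
        raise RuntimeError("simulated overload")


report = run_load(fake_service, requests=50, concurrency=8)
```

Raising `requests` and `concurrency` step by step while watching the error count and latency is the basic loop behind finding where a system reaches the limit of its capacity.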

lyft2vec — Embeddings at Lyft

Graph learning methods can reveal interesting insights that capture the underlying relational structures. Graph learning methods have many industry applications in areas such as product or content recommender systems and network analysis.

In this post, we discuss how we use graph learning methods at Lyft to generate embeddings — compact vector representation of high-dimensional information. We will share interesting rideshare insights uncovered by embeddings of riders, drivers, locations, and time. As the examples will show, trained embeddings from graphs can represent information and patterns that are hard to capture with traditional, straightforward features.

Copyright © 2011-2024 iteam. Current version is 2.124.0. UTC+08:00, 2024-04-25 10:29