
Company: Uber

Related topic: 优步 (Uber)

Uber (优步, /ˈuːbər/) is a transportation network company headquartered in San Francisco, California, United States. It develops mobile applications that connect riders with drivers, providing vehicle-for-hire and ridesharing services in the sharing economy. Riders can use the app to book these vehicles and track their location. Uber operates in 785 metropolitan areas worldwide, and the platform can be accessed through its website or mobile app.

The name Uber is generally thought to derive from the German word über, a cognate of the English "over", meaning "above".

However, its business model has run into legal problems in some regions: its atypical mode of operation can amount to unlicensed passenger transport in certain jurisdictions, while some countries and regions have passed legislation to legalize it, for example California in the United States and Beijing and Shanghai in China. The underlying reason is that Uber turns the taxi business into a community platform: riders requesting a car through the mobile app are connected with Uber users who want to drive part-time and with owners of idle vehicles, and once a trip is completed Uber takes a proportional commission and shares the revenue, a largely deregulated financial arrangement.

On May 10, 2019, Uber went public through an initial public offering of its shares, but the stock fell below the offering price on its first day of trading.

Uber is estimated to have 110 million active users worldwide and a 69% market share in the United States. Uber also operates in Greater China: it has established mainstream ride-hailing platforms in Hong Kong and Taiwan, and in mainland China it acquired, through a share swap, a 17.7% economic interest in Xiaoju Technology (小桔科技), the parent company of Didi Chuxing, the largest ride-hailing platform in that market.

Containerizing the Beast – Hadoop NameNodes in Uber’s Infrastructure

There are several online references on how to run Apache Hadoop® (referred to as “Hadoop” in this article) in Docker for demo and test purposes. In a previous blog post, we described how we containerized Uber’s production Hadoop infrastructure spanning 21,000+ hosts.

HDFS NameNode is the most performance-sensitive component within large multi-tenant Hadoop clusters. We had deferred NameNode containerization to the end of our containerization journey to leverage learnings from containerizing other Hadoop components. In this blog post, we’ll share our experience on how we containerized HDFS NameNodes and architected a zero-downtime migration for 32 clusters.

Scaling Adoption of Kerberos at Uber

At Uber, we have been operating an Apache Hadoop®-based data analytics platform since 2015. As adoption picked up exponentially in 2016, we decided to secure our platform with Kerberos authentication. Since then, Kerberos has become a critical component of our security infrastructure, supporting not only Uber's Hadoop ecosystem but also other services that are considered mission critical to our tech stack.

Uber has a large and diverse tech stack deployed on thousands of machines and storing more than 300 PB of analytics data. Systems using Kerberos authentication include YARN, HDFS, Apache Hive™, Apache Kafka®, Apache ZooKeeper™, Presto®, Apache Spark™, Apache Flink®, and Apache Pinot™, to name a few. Besides open-source systems, many of our internally developed platforms and services also use Kerberos for authentication. All these services need to communicate with each other securely, as well as provide secure access for all their user clients.

In this blog post, we’ll share some of the key challenges and the solutions that enabled us to scale adoption of Kerberos for authentication at Uber.

Devpod: Improving Developer Productivity at Uber with Remote Development

In this blog, we share how we improved the daily edit-build-run developer experience using Devpod, our remote development environment. We will start with some of the initial challenges and the pain points we addressed with Devpod, then cover our architecture and some of our recent successes in terms of adoption and cost reduction. Finally, we will leave you with some thoughts on the future of remote development at Uber.

LeakProf: Featherlight In-Production Goroutine Leak Detection

Go is a popular programming language used in microservice development, with one of the key features being first-class support for concurrency. Given its burgeoning popularity, it is no surprise that Uber adopted it: the Go monorepo is a centerpiece of its development platform, housing a significant portion of Uber’s codebase in the form of critical business logic, support libraries, or key components of the infrastructure.

Go's concurrency model is built on goroutines, lightweight threads managed by the Go runtime. Any function call prefixed with the "go" keyword launches that function asynchronously. Goroutines are used abundantly in Go codebases owing to their low syntactic overhead and resource requirements, and programs often involve tens, hundreds, or even thousands of goroutines running simultaneously. Two or more goroutines can communicate with each other via message passing over channels, a paradigm inspired by Hoare's Communicating Sequential Processes. While traditional shared-memory communication is still an option, the Go development team encourages users to favor channels, arguing that they are less prone to data races when used properly.
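As a small illustration (not code from the post), the Go sketch below shows the channel-passing style described above and the kind of bug a tool like LeakProf is built to surface: a goroutine blocked forever on a channel send that no receiver will ever drain.

```go
package main

import (
	"fmt"
	"time"
)

// fetchFirst launches two goroutines and returns whichever result arrives first.
// Because the channel is unbuffered, the slower goroutine blocks forever on its
// send and leaks -- the blocked-on-channel pattern a leak detector looks for.
func fetchFirst() string {
	ch := make(chan string) // unbuffered: a send blocks until someone receives

	go func() { ch <- slowQuery("replica-1") }()
	go func() { ch <- slowQuery("replica-2") }()

	return <-ch // only one send is ever received; the other goroutine leaks
}

// fetchFirstFixed uses a buffered channel so the losing goroutine can still
// complete its send and exit instead of leaking.
func fetchFirstFixed() string {
	ch := make(chan string, 2) // room for both results

	go func() { ch <- slowQuery("replica-1") }()
	go func() { ch <- slowQuery("replica-2") }()

	return <-ch
}

func slowQuery(replica string) string {
	time.Sleep(10 * time.Millisecond)
	return "result from " + replica
}

func main() {
	fmt.Println(fetchFirst())
	fmt.Println(fetchFirstFixed())
}
```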

How Uber Optimizes the Timing of Push Notifications using ML and Linear Programming

Push notifications are an integral channel for Uber Eats customers to discover new restaurants, valuable promotions, new offerings such as grocery and alcohol, and the perks of becoming a member, among other things. Push notifications are sent by various internal teams, such as Marketing, City Operations, and Product. Since marketing push notifications were introduced in March 2020, not only has the list of teams sending notifications grown quickly, but volume has also grown rapidly, reaching billions of notifications per month by the end of 2020.

Speed Up Presto at Uber with Alluxio Local Cache

At Uber, data informs every decision. Presto is one of the core engines that power all sorts of data analytics at Uber. For example, the operations team makes heavy use of Presto for services such as dashboarding, and the Uber Eats and marketing teams rely on the results of its queries to make decisions on pricing. In addition, Presto is used by Uber's compliance and growth marketing departments, as well as for ad-hoc data analytics.

The scale of Presto at Uber is large. Currently, Presto has 9,000 daily active users, processing 500K queries per day and handling over 50PB of data. Uber’s infrastructure encompasses 2 data centers with 7,000 nodes and 20 Presto clusters across 2 regions.

Simplifying Developer Testing Through SLATE

Modern system deployments generally follow an SOA or microservice-based architecture, which allows for clearer separation of concerns, ownership, and well-defined dependencies, and abstracts each service into a single unit of business logic.

Uber has thousands of services coordinating to power the platform that drives the company at scale. To offer a great experience to customers, developers have to ensure that their service meets its functional requirements. Building confidence that those requirements are met requires testing the service.

Introducing WorkflowGuard: The Workflow Governance and Observability System That Oversees over 120,000 Data Workflows

At Uber, over 120,000 production workflows are orchestrated, scheduled, and executed every day. These workflows are owned by over 3,000 users across many teams within Uber, powering critical ETL jobs, business metrics, dashboards, machine learning models, and regulatory reports. Internally, the Data Workflow Platform (DWP) team makes this happen by developing Uber's centralized workflow management system with high infrastructure reliability and minimal scheduling latency. The workflow management system also comes with a user-friendly application that allows users to create, author, and manage both streaming and batch workflows in a self-serve way.

Reducing Logging Cost by Two Orders of Magnitude using CLP

Long, long ago, the amount of data our systems output to logs was small enough that we were able to retain all of the log files. This allowed our engineers to freely analyze the logs, say for troubleshooting our systems or improving applications. But as Uber's business grew rapidly, the amount of data being logged increased dramatically, and we were forced to discard log files after just a short period of time, given the prohibitive cost of retaining them. That changed when we integrated CLP into the logging library (Log4j) of our big data platform. In aggregate, CLP achieves a 169x compression ratio on our log data, saving storage, memory, and disk/network bandwidth at every level. As a result, we can now retain all logs at a fraction of the cost, without throwing away any insights, and the compressed logs can be efficiently searched without decompression.
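CLP's published approach separates each log message into a static log type (the template) and the variable values it contains, so the highly repetitive templates compress extremely well and can be searched without decompressing raw text. The Go sketch below is only a toy illustration of that idea; the regular expression and types are hypothetical and not CLP's actual implementation.

```go
package main

import (
	"fmt"
	"regexp"
)

// Toy illustration of template extraction: numbers and paths are treated as
// variables (for illustration only) and replaced by a placeholder so the
// remaining "log type" is shared by many messages.
var varPattern = regexp.MustCompile(`\d+(\.\d+)?|/[\w./-]+`)

type encodedMessage struct {
	LogType   string   // template with variables replaced by a placeholder
	Variables []string // the extracted variable values, in order
}

func encode(msg string) encodedMessage {
	vars := varPattern.FindAllString(msg, -1)
	logType := varPattern.ReplaceAllString(msg, "\x11") // placeholder marks a variable slot
	return encodedMessage{LogType: logType, Variables: vars}
}

func main() {
	m := encode("Task 102 finished in 3.5 s, output written to /tmp/part-00017")
	fmt.Printf("log type:  %q\n", m.LogType)
	fmt.Printf("variables: %v\n", m.Variables)
}
```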

Crane: Uber’s Next-Gen Infrastructure Stack

Uber has been on a multi-year journey to reimagine our infrastructure stack for a hybrid, multi-cloud world. The internal code name for this project is Crane. In this post we’ll examine the original motivation behind Crane, requirements we needed to satisfy, and some key features of our implementation. Finally, we’ll wrap up with some forward-looking views for Uber’s infrastructure.

Deduping and Storing Images at Uber Eats

The Uber Eats system handles several hundred million product images, and millions of image updates are performed every hour. We implemented a content-addressable caching layer that very effectively detects duplicates, thereby reducing download times, processing times, and storage costs. The full system was developed and completely rolled out in less than 2 months, which improved the latency and reliability of the image service and unblocked work on our new catalog API development.
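A minimal sketch of the content-addressable idea, assuming an in-memory map stands in for the real blob store; the names below are illustrative, not Uber's actual service APIs. The key point is that the storage key is a hash of the image bytes, so identical content is detected and stored only once.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

type imageStore struct {
	blobs map[string][]byte // keyed by the hash of the image content
}

func newImageStore() *imageStore {
	return &imageStore{blobs: make(map[string][]byte)}
}

// Put stores the image only if the same bytes have not been seen before and
// reports whether this upload was a duplicate of existing content.
func (s *imageStore) Put(img []byte) (key string, duplicate bool) {
	sum := sha256.Sum256(img)
	key = hex.EncodeToString(sum[:])
	if _, ok := s.blobs[key]; ok {
		return key, true // identical content: no re-processing or re-storage needed
	}
	s.blobs[key] = img
	return key, false
}

func main() {
	store := newImageStore()
	img := []byte("...jpeg bytes...")

	k1, dup1 := store.Put(img)
	k2, dup2 := store.Put(img) // same bytes uploaded again, e.g. by another restaurant

	fmt.Println(k1 == k2, dup1, dup2) // true false true
}
```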

MySQL to MyRocks Migration in Uber's Distributed Datastores

Uber uses MySQL as the underlying database engine for Schemaless and Docstore, our distributed databases. By default, MySQL uses InnoDB, its most popular storage engine, which stores data in a B+Tree structure. MyRocks is a MySQL storage engine that integrates with RocksDB, an open-source project. RocksDB is based on the log-structured merge-tree (LSM tree); it is optimized for fast storage and combines outstanding space and write efficiency with acceptable read performance.
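To make the write-efficiency argument concrete, the Go sketch below is a toy model of an LSM-tree write path (not how RocksDB is actually implemented): writes land in an in-memory memtable, full memtables are flushed as sorted, immutable runs with sequential I/O, and reads consult the memtable and then runs from newest to oldest, which is why read performance is merely acceptable.

```go
package main

import (
	"fmt"
	"sort"
)

type kv struct{ key, value string }

type lsm struct {
	memtable map[string]string
	limit    int
	runs     [][]kv // flushed sorted runs, newest last
}

func newLSM(limit int) *lsm {
	return &lsm{memtable: make(map[string]string), limit: limit}
}

func (t *lsm) Put(key, value string) {
	t.memtable[key] = value
	if len(t.memtable) >= t.limit {
		t.flush()
	}
}

// flush writes the memtable out as one sorted run; on disk this would be a
// single sequential write, which is what makes LSM trees absorb writes cheaply.
func (t *lsm) flush() {
	run := make([]kv, 0, len(t.memtable))
	for k, v := range t.memtable {
		run = append(run, kv{k, v})
	}
	sort.Slice(run, func(i, j int) bool { return run[i].key < run[j].key })
	t.runs = append(t.runs, run)
	t.memtable = make(map[string]string)
}

// Get checks the memtable, then each run from newest to oldest.
func (t *lsm) Get(key string) (string, bool) {
	if v, ok := t.memtable[key]; ok {
		return v, true
	}
	for i := len(t.runs) - 1; i >= 0; i-- {
		run := t.runs[i]
		j := sort.Search(len(run), func(k int) bool { return run[k].key >= key })
		if j < len(run) && run[j].key == key {
			return run[j].value, true
		}
	}
	return "", false
}

func main() {
	db := newLSM(3)
	db.Put("rider:1", "a")
	db.Put("rider:2", "b")
	db.Put("rider:3", "c") // triggers a flush
	db.Put("rider:2", "b2")
	v, _ := db.Get("rider:2")
	fmt.Println(v) // "b2": the newer memtable entry shadows the flushed value
}
```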

The Uber Storage Platform team has migrated all Schemaless instances and some Docstore instances to MyRocks since 2019. In this post, we are going to talk about the journey of our migration to MyRocks.

Uber Freight Carrier Metrics with Near-Real-Time Analytics

While most people are familiar with Uber, not all are familiar with Uber Freight. Uber Freight has been around since 2016 and is dedicated to providing a platform that seamlessly connects shippers with carriers. We're simplifying the lives of trucking companies by giving carriers a platform to browse all available shipment opportunities with upfront pricing and book with the tap of a button, and by making the fulfillment process more scalable and efficient.

Providing reliable service to shippers is critical for Uber Freight in order to gain their trust. Because carriers' performance can significantly impact the reliability of Freight's service, we need to be transparent with carriers about the level to which we hold them accountable, providing them with a clear view of how well they are performing and, if needed, where they can improve.

To achieve this, Uber Freight developed the Carrier Scorecard to show carriers several metrics, including engagement with the app, on-time pickup/delivery, tracking automation, and late cancellations. By showing this information in near real time in the Carrier App, we can give carriers timely feedback and set ourselves apart from most of our competitors in the industry.

Uber’s Next Gen Push Platform on gRPC

In our last blog post, we talked about how we moved from polling to refresh the app to a push-based flow for building our app experience.

All our apps need to be synced with real-time information, whether it's the pickup time, arrival time, and route lines on the screen, or the nearby drivers shown when you open the app. We use our push platform to deliver the messages that power these real-time user experiences, as described in our previous post, which we strongly recommend reviewing to learn the details of the architecture before proceeding.

This blog post will cover how we changed our protocol from Server-Sent Events (HTTP/1.1) to gRPC-based bidirectional streaming (QUIC/HTTP/3), the challenges we faced, the final results, and some key learnings.
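For context on the protocol shape, the Go sketch below shows what a bidirectional-streaming handler could look like. The Message type and PushStream interface stand in for code that protoc/gRPC would normally generate; they are hypothetical, not Uber's actual push API. The point is that acknowledgements flow up from the client on the same long-lived stream that push messages flow down, which one-way Server-Sent Events could not offer.

```go
package push

import (
	"fmt"
	"io"
)

// Message is a placeholder for a generated protobuf message.
type Message struct {
	Type    string
	Payload []byte
}

// PushStream mirrors the Send/Recv surface of a gRPC bidirectional stream:
// both sides can send and receive independently over one long-lived connection.
type PushStream interface {
	Send(*Message) error
	Recv() (*Message, error)
}

// servePush sketches a server-side handler: it pushes real-time updates down
// the stream while reading client acks/heartbeats coming up the same stream.
func servePush(stream PushStream, updates <-chan *Message) error {
	errc := make(chan error, 1)

	// Read client acknowledgements in the background.
	go func() {
		for {
			msg, err := stream.Recv()
			if err == io.EOF {
				errc <- nil
				return
			}
			if err != nil {
				errc <- err
				return
			}
			fmt.Println("client ack:", msg.Type)
		}
	}()

	// Push updates as they become available.
	for update := range updates {
		if err := stream.Send(update); err != nil {
			return err
		}
	}
	return <-errc
}
```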

ML Education at Uber: Program Design and Outcomes

If you have read our previous article, ML Education at Uber: Frameworks Inspired by Engineering Principles, you have seen several examples of how Uber benefits from applying Engineering Principles to drive the ML Education Program’s content design and program frameworks.

In this follow-up, we will dig deeper into what we believe to be other unique aspects of ML Education at Uber: our approach to Content Components, Content Delivery, Observability, and Marketing & Reach.

ML Education at Uber: Frameworks Inspired by Engineering Principles

At Uber, millions of machine learning (ML) predictions are made every second, and hundreds of applied scientists, engineers, product managers, and researchers work on ML solutions daily.

Uber wins by scaling machine learning. We recognize org-wide that a powerful way to scale machine learning adoption is through education. That's why we created the Machine Learning Education Program: a program driven by Engineering Principles that provides a framework for delivering Uber-specific ML educational resources to Uber Tech employees.

Like a production system, educational resources, content, and distribution channels need to be continuously measured, evaluated, and improved. Designing each component of the ML Education Program on this premise enabled us to quickly deliver new courses and curricula tailored to engineers and scientists from various backgrounds.

This 2-part article will focus on how we have applied engineering principles when designing and scaling this program, and how it has helped us achieve the desired outcome. Part 1 will introduce our design principles and explain the benefits of applying these principles to technical education content design and program frameworks, specifically in the ML domain. Part 2 will take a closer look at critical components of the program and reflect on the outcomes that make ML Education at Uber a success.
