ML Education at Uber: Program Design and Outcomes

If you have read our previous article, ML Education at Uber: Frameworks Inspired by Engineering Principles, you have seen several examples of how Uber benefits from applying Engineering Principles to drive the ML Education Program’s content design and program frameworks.

In this follow-up, we will dig deeper into what we believe to be other unique aspects of ML Education at Uber: our approach to Content Components, Content Delivery, Observability, and Marketing & Reach.

Our Library of Content Modules 

As mentioned in our first post, Modularity and Extensibility were huge motivations in building our learning content out of unique components. Having a uniform library of content modules provides 2 major benefits:

  1. Ability to tailor the inclusion or exclusion of component types based on the unique scope of a learning topic
  2. Consistent delivery of component types across course offerings, which reduces cognitive load for users

Our library is always evolving, but 4 types of core modules have been with our program since the beginning: Prerequisites, Theory, Hands-on Codelabs, and “Getting Started” Packs.

Prerequisites

Courses may enforce 2 primary types of prerequisites: knowledge-based and action-based.

An example of a knowledge prerequisite is: “understanding of basic ML concepts and familiarity with Uber’s ML development workflow.” Action-based prerequisites range from requesting permissions and access to tooling to completing a unique development setup prior to the course.

To complete most of the hands-on modules in our training courses, users are required to build a unique development setup on their local machine. We streamlined this process by repurposing the same development setup across most of our course offerings and storing all technical assets (like datasets) in a central repository within Uber’s ML monorepo. This enables users to complete the setup just once, rather than a new setup each time they attend an ML Education course.

Prerequisite development setups are facilitated by Uber’s internal tool called “codelabs.”

Theory and Q&A

Visual learners are welcomed to ML Education offerings with slides, diagrams, and graphs, while auditory learners are welcomed with recordings and voiceovers. While post-course surveys often show that participants rate the hands-on components as most helpful, theory remains critical to our courses. We feel that theory complements our hands-on activities in 3 key ways:

  • Introduces the functionality of the tools or services
  • Ties the specific subject back to Uber’s broader ML ecosystem
  • Details how the specific subject should be applied in the context of Uber use cases

Mindful that long stints of theory can be overwhelming or disengaging to some learner types, we built our theory modules to incorporate:

  • Pre-recorded instructor videos, rather than relying solely on real-time instructor facilitation
  • Checkpoint questions to recap sections and inspire open Q&A

Hands-on Codelabs

For kinesthetic learners especially, hands-on resources are essential. We embedded interactive codelabs within each course and prompted students with questions to think critically about what they were learning. We also included case studies of current Uber applications that ground the learning back in reality. Across all courses facilitated live in H1 2022, we found that on average 88% of attendees felt the hands-on codelab module was the most helpful component of the course.

These hands-on exercises contain some or all of the following:

  • Interactive Jupyter notebooks
  • Uber-specific datasets 
  • ML project scaffolds or model templates for students to replicate
  • Assets stored in Uber’s ML monorepo to streamline environment setup and access to course assets that participants need to use locally

“Getting Started” Packs

After obtaining knowledge on a new subject, it is sometimes difficult to know exactly how to apply it to a real-world use case. Our “Getting Started” packs aim to reduce this uncertainty for users post-course.

We know our hands-on codelabs do an excellent job reinforcing what a user has just learned by allowing them to explore a tool or service in a controlled environment. However, in order to teach a diverse range of skill sets at scale, some exercises must rely on pre-determined datasets and cloned copies of existing projects. This removes an element of decision-making from the ML development workflow for users, even in our most complex courses.

This may cause a user to be left with a feeling of “great, now what?” when preparing to dive headfirst into using our tools/services to build their own ML solution from scratch. To aid the users who need help taking their first pass at “real world” application of the course’s content, we equip each class of participants with a starter pack.

The structure of each starter pack varies per topic, but we commonly provide things like direction to relevant project templates, high-level guidance for building their own project, and which resources or helpdesk channels to leverage if they need additional help. Providing this stepping stone also gives us a clear view into the impact metrics we establish for a given course (see the “Observability” section for more). 

Content Delivery 

Once a new course’s components are determined, the ML Education team decides what delivery format best suits the course subject matter.

When ML Education’s inaugural pilot courses launched, we delivered courses using the only delivery format we had available at the time: Instructor-led Live Facilitation. Today, we have 3 delivery formats, each of which evolved to meet the combined needs of different learning topics, participants, and instructors. 

Figure 1: Delivery methods for ML Education content.

Live

Our live delivery method encapsulates a full learning experience that commonly includes theory, applications, and hands-on activity components. Each theory module contains a combination of pre-recorded and live content, and our hands-on activities are facilitated through Uber codelabs.

Figure 2: Live content delivery method explained.

Using pre-recorded material in a “live” session might sound counterintuitive, but we leverage recordings to meet time constraints and to minimize variance in content delivery. Where does that leave our on-call instructors? Their primary responsibility is to facilitate the Q&A that follows sections of theory and further elaborate on any important concepts in real time.

To facilitate kinesthetic learning, we include an instructor-guided lab that applies theory content to a more tangible example, giving students the chance to get hands-on experience with the content and tooling just demonstrated.

We commonly use live delivery for high-priority topics that range from intermediate to advanced level of difficulty. For instance, adoption of Canvas (Uber’s internal tool that brings reproducible code to the ML development process) was highly prioritized by UberAI in 2021. The ML Education program developed curriculum in response to the desired behavior change, and facilitated several live Canvas training instances in the US and India.

So far, user feedback has overwhelmingly indicated that live content delivery is the most effective method for topics where we want to drive behavior changes like tool adoption. In live sessions, users can work closely with our ML experts in real time to fully understand theory concepts and troubleshoot throughout the hands-on course module. By the time they leave the session, users have had the opportunity to clarify anything they need to know in order to immediately implement the tools, services, and/or practices they just learned into a production use case.

Semi-Guided

This is the ML Education Program’s blended learning approach to content delivery. It’s not quite as involved as a live course, but not as hands-off as an online course. 

In a semi-guided course, participants assume most of the ownership of absorbing theory content (see figure below).

Figure 3: Semi-guided content delivery method explained.

This format allows users to learn theory content independently rather than with the entire group. Instructors are utilized only where and when they are needed, which is largely during the course’s codelab (hands-on) components. 

We have found that semi-guided delivery works well for topics where:

  1. Theory content is still important, but high-level enough that users can confidently consume key takeaways on their own
  2. A thorough explanation from a subject matter expert is not required to understand key concepts (independent reading is sufficient)  

It is more likely that beginner-level topics will fit the requirements listed above. Our semi-guided Intro to Regression and Intro to Classification courses have received overwhelmingly positive feedback to date, both earning 100% CSAT scores in H1 2022.

Online

Our commitment to reproducibility is exemplified by online courses. For every live course that matures, an online version is created, stored, and advertised to users. The online course versions contain all the same components as a topic’s live version, but are available at whatever time is most convenient for the user.

An element of engagement is inevitably lost when leveraging online delivery. We combat this by providing Q&A support through dedicated Slack channels for online course participants.

We have found extreme value in offering a near 1:1 mapping of live to online course offerings. Online delivery dramatically increases accessibility, which is critical for a large-scale global organization. At Uber, our engineers implement ML solutions around the globe and we are committed to ensuring all engineers have access to our unified library of learning resources regardless of location. Scheduling time-zone-inclusive live course instances can be challenging, so having an online offering available to anyone at any time is necessary in order for us to fulfill our commitment to global accessibility.

While each of our 3 delivery formats is uniquely its own, we strive for consistency across methods where applicable. For example, we provide users with a consistent approach to troubleshooting and Q&A regardless of delivery format.

Each course is allocated a unique Slack channel where participants submit questions or request help troubleshooting as they complete the course. We like using Slack because it allows for continued engagement between participants and instructors post-course. All questions are captured and archived, which enables users to search questions asked in previous live sessions.

Additionally, each delivery method uses the same branded templates and artifacts. Course introductions and conclusions are organized and facilitated in a similar manner regardless of delivery format, both to reduce cognitive load on users and to provide a consistent “look and feel” across all ML Education learning resources.

Which delivery method works best? There’s no right answer for that. They all satisfy different needs (whether it be needs of the content, audience, or both).

Observability

In the early days, when ML Education was an informally established program of 2 courses, the initiative was considered to be a pilot until value proved otherwise.

How were the ML Education program creators able to capture and communicate this value so that the program could scale to what it is today? By weaving a disciplined observability strategy into the program’s foundation.

When we think of observability, we do so in the context of our tooling as well as our individual courses. The capabilities embedded in Uber’s ML tooling provide ML Education with the opportunity to observe the positive business impact our courses deliver to the organization.

Thankfully, Uber’s internal ML infrastructure is deeply integrated with Uber’s existing comprehensive observability tool stack (e.g., M3, ELK). Each step of the ML workflow is checkpointed and logged, so we can easily identify points of friction and user drop-off as people work through our codelabs. To reduce friction and drive towards completion, we made sure every step in the codelab is resumable or bypassable if a user so chooses. Instructors can also use our logged metrics to build Grafana dashboards that track engagement, or to set up alerts that flag when a component used in the hands-on portion is broken. One of our favorite use cases for this is monitoring the spike in new users during and following our annual internal machine learning conference, UberML.

Customer testimonials are a less systematic but incredibly powerful approach to observability. Former course attendees regularly reach out to share the ML accomplishments that followed the training courses they’ve taken. For example, an engineer shared that after attending Intro to Deep Learning he applied the skills he learned to productionize a phishing/fraud detection model that realized substantial cost savings for Uber. To more concretely establish our attribution models, we are also considering publishing specific Docker images as part of ML Education and tracking their usage as base images.
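The base-image attribution idea could work roughly as sketched below: scan Dockerfiles for `FROM` lines and count how many build on an education-published image. The registry prefix and image names are invented for illustration; Uber does not publish its internal registry layout.

```python
import re
from collections import Counter

# Hypothetical registry prefix for images published by ML Education.
EDU_PREFIX = "registry.example.com/ml-education/"

FROM_RE = re.compile(r"^\s*FROM\s+(\S+)", re.IGNORECASE | re.MULTILINE)

def count_edu_base_images(dockerfiles: dict[str, str]) -> Counter:
    """Map {path: Dockerfile text} to usage counts of education base images."""
    usage = Counter()
    for _path, text in dockerfiles.items():
        for image in FROM_RE.findall(text):
            if image.startswith(EDU_PREFIX):
                usage[image] += 1
    return usage
```

A periodic scan like this over the monorepo would turn “someone built on our course image” into a countable adoption signal.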

We understand that, in general, measuring shifts in behavior is difficult. But having excellent observability capabilities in our tooling gets us one step closer to accurately capturing a user’s motivation for adopting a new tool and appropriately attributing it to the investment we make in ML educational resources.

Knowing that business impact can be a less concrete area of observability, we couple it with several other KPIs to enable observability for overall program health and success.

ML Education defines “health” from 4 major points of view:

  • Sentiment (participant, volunteer)
  • Quality (content freshness)
  • Consumption (enrollments, attendance) 
  • Business Impact

To measure progress in each of these 4 areas, we use some or all of the following methods for data collection and analysis:

  • Surveys
  • Google Analytics
  • Learning Management System (LMS) user data 
  • Logging functionality in our ML tool stack
  • Existing usage tracking in our ML tool stack
  • Anecdotal feedback

Outcomes

We have a comprehensive list of key outcomes that nicely tie into each of the 4 pillars of program health mentioned above. Of this list, the 4 outcomes that we want to highlight in detail are Participant Satisfaction, Instructor Satisfaction, Launching in Beta, and ML Market Size.

Participant Satisfaction

We have always measured participant overall satisfaction (OSAT) at a course level. Early on, this course-level satisfaction metric played a heavier role in quantifying overall program success: if participant OSAT for our courses was high, the program could be deemed “healthy.” If participant OSAT was low, that was flagged as a risk not only for the specific course that generated the response, but for the program as a whole. Today, we still highly value participant OSAT, but we now follow a much more comprehensive approach to determining overall program health.

Our aggregate participant OSAT for 2022 is 94% YTD. Factors including communication, discoverability of learning resources, and delivery mechanisms strongly impact a user’s overall experience during a training course.
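For illustration, one common way to roll individual survey responses up into a single OSAT percentage is “top-two-box” scoring on a 5-point scale. The article does not state Uber’s exact formula, so treat this as an assumption:

```python
def osat(scores: list[int], threshold: int = 4) -> float:
    """Percent of responses at or above `threshold` on a 1-5 scale
    (top-two-box scoring; the actual internal formula may differ)."""
    if not scores:
        raise ValueError("no survey responses")
    satisfied = sum(1 for s in scores if s >= threshold)
    return round(100 * satisfied / len(scores), 1)
```

With a convention like this, an aggregate figure such as “94% YTD” would mean 94% of all responses year-to-date rated the course a 4 or 5.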

Measuring and investing in participant satisfaction has yielded 2 outcomes of note:

  1. Repeat customers
  2. Some participants convert into program volunteers

These results pay high dividends because they positively contribute to critical program pipelines (our customer and volunteer bases).

Instructor Satisfaction

After establishing a standard for measuring OSAT for participants, we quickly did the same for instructor OSAT. Instructor OSAT is measured at a program level rather than course level, because we want to account for the E2E experience of being a member of Uber’s ML Education Team. Are instructors satisfied with the program’s operating procedures? Do they feel that their time and talents are utilized effectively? These insights are critical for instructor recruitment and retention.

Measuring instructor OSAT gives us clear, actionable insights into how we can make the ML Education team experience great for a diverse group of volunteers. The first instance of measuring instructor OSAT nudged us to implement changes to our operations that yielded the following outcomes:

  1. Formally recognize instructors’ contributions to training during Uber’s perf/promo cycle. This includes formalizing contributions to ML Education as an example to demonstrate Engineering Competencies at Uber.
  2. Allow volunteers the flexibility to choose their level of commitment to the program on a per-half basis (versus annual) 

Launching in Beta

All new courses and updates to existing courses are rolled out in beta. This provides an incubator-esque environment for each course development team to experiment with new frameworks, modules, and delivery mechanisms. We communicate to users ahead of time that they are participating in a beta rollout and that their feedback is critically important in finalizing the content or delivery prior to formal rollout. A similar approach is applied when rolling out new program processes or offerings. 

By having a beta-first attitude to course launches and processes, we foster a program culture of creativity and data-driven experimentation. As a result, we have a healthy 2-way channel for user feedback and the ability for ML teams to effectively test out new features quickly.

ML Market Size

Over time, all live courses tend to attract a higher proportion of employees from non-ML backgrounds than they did when initially launched. We are particularly excited about this trend: when a larger percentage of non-ML engineers attend ML Education courses, it means we are distilling ML expertise out to a broader audience, increasing the overall internal ML market size for Uber.

Uber’s ML Education Program has covered a lot of ground in just 18 months. In 2021 we served 818 employees and are on track to increase the total number of program participants by over 50% in 2022. Our curriculum library has grown 3x, and our instructor base has grown 2x since the program’s inception.

These incredible results are realized by a passionate group of volunteers, strong buy-in from our Program Sponsor, the support and collaboration of other Tech Enablement professionals, a data-driven approach to decision making, and a preference towards experimentation.

Thank you for devoting time to learning about how we use education as a means to scale ML at Uber. We hope you found value in this series of articles, and wish you all the best in your machine learning enablement efforts.

Acknowledgements 

The ML Education Program would not be possible without Thommen Korah, David Morales, Juan Marcano, Program Sponsor (Smitha Shyam), and the hard work of our ML Education core group and course instructors. This team has dedicated a significant amount of their time to educating Uber Engineers to recognize ML business problems, apply ML solutions at scale, and accelerate their work using our internal tools at Uber.
