Migrating Large-Scale Interactive Compute Workloads to Kubernetes Without Disruption
Millions of people worldwide use Uber every day, generating vast amounts of data on traffic, routes, estimated times, and more. We learn from this data and use it to improve their experience with Uber.
To facilitate this learning, we developed DSW (Data Science Workbench), an interactive notebook platform for applied scientists, data scientists, ML engineers, and operations specialists. Through a web interface, DSW supports data exploration, analysis, model training, workflow scheduling, visualization, and collaboration.
Behind the scenes, the DSW team provides access to Jupyter® and RStudio® notebooks by allocating isolated containers that bundle internal tooling and the necessary open-source software. These containers vary in memory, compute, and GPU resources. Each user session offers multiple Python® kernels, as well as Apache Spark™ PySpark and Sparkmagic kernels, with independent environments. Users can install additional Python packages into these environments; before the migration, those packages had to be reinstalled every time the container restarted.
Managing Python dependencies is complex, and migrating tech stacks without disruption is challenging. In this post, we explore how we migrated 3,500 interactive Jupyter and RStudio user sessions from Peloton, an Apache Mesos®-based container orchestrator, to Kubernetes® with minimal disruption. We also highlight how intelligently debouncing inotify events helped us track which Python packages users installed or uninstalled, so that this state could be carried across restarts during the migration.
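To make the idea of debouncing concrete: a single package install touches many files, so a file watcher sees a burst of events where only one "something changed" signal is wanted. The following is a minimal sketch of that idea, not the implementation used in DSW. It assumes events have already been collected as (timestamp, path) pairs; the `debounce` function, the `quiet_period` parameter, and the example paths are all hypothetical names chosen for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical event shape: (timestamp in seconds, path that changed).
Event = Tuple[float, str]


def debounce(events: Iterable[Event], quiet_period: float) -> List[Tuple[float, List[str]]]:
    """Collapse bursts of file events into one batch per burst.

    A burst ends once more than `quiet_period` seconds pass without
    a new event. Each batch is (time of the last event in the burst,
    sorted unique paths seen in the burst).
    """
    batches: List[Tuple[float, List[str]]] = []
    current_paths = set()
    last_ts = None
    for ts, path in sorted(events):
        # A gap longer than the quiet period closes the current burst.
        if last_ts is not None and ts - last_ts > quiet_period:
            batches.append((last_ts, sorted(current_paths)))
            current_paths = set()
        current_paths.add(path)
        last_ts = ts
    # Flush the final burst, if any events were seen at all.
    if last_ts is not None:
        batches.append((last_ts, sorted(current_paths)))
    return batches


# Illustrative input: three rapid events from one install, then a later one.
example_events = [
    (0.0, "site-packages/requests/__init__.py"),
    (0.1, "site-packages/requests/api.py"),
    (0.2, "site-packages/requests/__init__.py"),
    (5.0, "site-packages/numpy/__init__.py"),
]
example_batches = debounce(example_events, quiet_period=1.0)
```

With a one-second quiet period, the three rapid events collapse into a single batch and the later event forms its own batch, so a downstream step (for example, recording the current package list) would run twice instead of four times.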
This ...