Maestro: Netflix's Workflow Orchestrator
By Jun He, Natallia Dzenisenka, Praneeth Yenugutala, Yingyi Zhang, and Anjali Norwood
TL;DR
We are thrilled to announce that the Maestro source code is now open to the public! Please visit the Maestro GitHub repository to get started. If you find it useful, please give us a star.
What is Maestro
Maestro is a general-purpose, horizontally scalable workflow orchestrator designed to manage large-scale workflows such as data pipelines and machine learning model training pipelines. It oversees the entire lifecycle of a workflow, from start to finish, including retries, queuing, and task distribution to compute engines. Users can package their business logic in various formats, such as Docker images, notebooks, bash scripts, SQL, and Python. Unlike traditional workflow orchestrators that only support Directed Acyclic Graphs (DAGs), Maestro supports both acyclic and cyclic workflows and includes multiple reusable patterns, such as foreach loops, subworkflows, and conditional branches.
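To make the patterns named above more concrete, here is a minimal, in-memory Python sketch of retries, a foreach loop, and a conditional branch. It is not Maestro's API or workflow definition format: the helpers `run_with_retries`, `foreach`, and `branch`, as well as the `flaky_load` step and its partition names, are hypothetical and exist only for illustration. A real orchestrator would also persist state, queue work, and dispatch steps to compute engines.

```python
from typing import Any, Callable, Dict, Iterable, List


def run_with_retries(step: Callable[[], Any], max_attempts: int = 3) -> Any:
    """Run a step, retrying on any exception up to max_attempts times."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return step()
        except Exception as err:  # a toy runner treats every error as retryable
            last_error = err
    raise RuntimeError(f"step failed after {max_attempts} attempts") from last_error


def foreach(items: Iterable[Any], body: Callable[[Any], Any],
            max_attempts: int = 3) -> List[Any]:
    """Apply the same step body to every item, with retries per iteration."""
    return [
        run_with_retries(lambda item=item: body(item), max_attempts)
        for item in items
    ]


def branch(condition: bool, if_true: Callable[[], Any],
           if_false: Callable[[], Any]) -> Any:
    """Run exactly one of two downstream steps, depending on a condition."""
    return if_true() if condition else if_false()


# Hypothetical step: fails once with a transient error, then succeeds.
calls: Dict[str, int] = {"n": 0}


def flaky_load(partition: str) -> str:
    calls["n"] += 1
    if calls["n"] == 1:
        raise IOError("transient failure")
    return f"loaded {partition}"


results = foreach(["2024-01-01", "2024-01-02"], flaky_load)
summary = branch(len(results) == 2, lambda: "publish", lambda: "alert")
```

In this sketch the first iteration fails once and is retried, the second succeeds immediately, and the branch selects the publish path because both iterations returned a result.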
Our Journey with Maestro
Since we first introduced Maestro in this blog post, we have successfully migrated hundreds of thousands of workflows to it on behalf of users with minimal interruption. The transition was seamless, and Maestro has met our design goals by handling our ever-growing workloads. Over the past year, we’ve seen a remarkable 87.5% increase in executed jobs. Maestro now launches thousands of workflow instances and runs half a million jobs daily on average, and has completed around 2 million jobs on particularly busy days.