Streaming SQL in Data Mesh
Keeping the logic of individual Processors simple made them reusable, so we could centrally manage and operate them at scale. It also made them composable, so users could combine different Processors to express the logic they needed.
However, this design decision led to a different set of challenges.
Some teams found that the provided building blocks were not expressive enough. For use cases that could not be solved with existing Processors, users had to express their business logic by building a custom Processor. To do this, they had to use the low-level DataStream API from Flink and the Data Mesh SDK, both of which came with a steep learning curve. Once a custom Processor was built, the team also had to operate it themselves.
Furthermore, many pipelines needed to be composed of multiple Processors. Because each Processor was implemented as a separate Flink job, with Kafka topics connecting one job to the next, many pipelines carried a relatively high runtime overhead.
We explored various options to solve these challenges, and eventually settled on building the Data Mesh SQL Processor, which would provide additional flexibility for expressing users’ business logic.
The existing Data Mesh Processors have a lot of overlap with SQL. For example, filtering and projection can be expressed in SQL through SELECT and WHERE clauses. Additionally, instead of implementing business logic by composing multiple individual Processors together, users could express their logic in a single SQL query, avoiding the additional resource and latency overhead that came from multiple Flink jobs and Kafka topics. Furthermore, SQL can support User Defined Functions (UDFs) and cus...
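To make the overlap concrete, here is a minimal sketch of how a filter stage followed by a projection stage collapses into one SELECT ... WHERE query. It is an illustration under stated assumptions, not the Data Mesh implementation: the `events` records, their field names, and the helper functions `filter_processor`, `projection_processor`, `composed_pipeline`, and `sql_pipeline` are all hypothetical, and Python's built-in `sqlite3` module stands in for Flink SQL so the example runs on a bounded in-memory table without a cluster.

```python
import sqlite3

# Hypothetical event records; the field names are illustrative only.
events = [
    {"user_id": 1, "country": "US", "title": "A", "duration_sec": 120},
    {"user_id": 2, "country": "BR", "title": "B", "duration_sec": 45},
    {"user_id": 3, "country": "US", "title": "C", "duration_sec": 300},
]


def filter_processor(records, predicate):
    """Stand-in for a filter Processor: keep records matching the predicate."""
    return [r for r in records if predicate(r)]


def projection_processor(records, fields):
    """Stand-in for a projection Processor: keep only the named fields."""
    return [{f: r[f] for f in fields} for r in records]


def composed_pipeline(records):
    # Two stages chained together. In the architecture described in the text,
    # each stage would be its own Flink job, linked to the next by a Kafka topic.
    us_only = filter_processor(records, lambda r: r["country"] == "US")
    return projection_processor(us_only, ["user_id", "title"])


def sql_pipeline(records):
    # The same logic as a single query: WHERE does the filtering and the
    # SELECT column list does the projection. sqlite3 is only a stand-in here.
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute(
        "CREATE TABLE events "
        "(user_id INTEGER, country TEXT, title TEXT, duration_sec INTEGER)"
    )
    conn.executemany(
        "INSERT INTO events VALUES (:user_id, :country, :title, :duration_sec)",
        records,
    )
    rows = conn.execute(
        "SELECT user_id, title FROM events WHERE country = 'US' ORDER BY user_id"
    ).fetchall()
    conn.close()
    return [dict(row) for row in rows]


if __name__ == "__main__":
    print(composed_pipeline(events))
    print(sql_pipeline(events))
```

Both functions return the same rows for this input. The difference the text points to is operational rather than logical: the composed version needs one stage per step, while the SQL version expresses the whole pipeline in a single statement.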