Middleware and Databases: Flink-Related Resources

Flink SQL Gateway: Principles and Practice

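When developing real-time jobs with Flink, we generally use the DataStream API that the framework provides, which forces users to write business logic in Java, Scala, or even Python. This approach is flexible and expressive, but it sets a development barrier for users, and as new versions keep arriving, the DataStream API has also accumulated many incompatibilities with older versions.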

OPPO Tech

Real-Time Exactly-Once Ad Event Processing with Apache Flink and Kafka

Uber recently launched a new capability: Ads on UberEats. With this new ability came new challenges that needed to be solved at Uber, such as systems for ad auctions, bidding, attribution, reporting, and more. This article focuses on how we leveraged open source technology to build Uber’s first “near real-time” exactly-once events processing system. We’ll dive into the details of how we achieved exactly-once processing as well as the inner workings of our event processing jobs.

Uber Tech

NetEase Games: Building a Flink SQL Platform in Practice

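As the theory of streaming SQL has matured in recent years, it has become possible to offer a SQL development experience in real-time stream computing similar to that of offline batch computing. This article describes NetEase Games' exploration and practice in building a platform around Flink SQL.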

Flink + Hologres in Practice: Strategy Algorithms at an Online School

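The online school's service strategy team focuses on algorithms for student class assignment, teacher scheduling, customer service chatbots, and similar areas. These scenarios require user behavior features in real time. Typically, behavior logs and the binlogs of related databases are written to Kafka, and Flink then consumes the Kafka data to produce real-time behavior features or statistical metrics, which are made available for interactive use. Several steps are involved along the way, such as preprocessing and pre-aggregation. Online training also needs to join dimension tables or aggregated features; these features may be loaded in full onto the compute nodes, or may need a second computation over historical data. That calls for a real-time OLAP platform plus a high-concurrency point-query service, forming an interactive loop, with the features generated in real time finally pushed to the algorithm modules. The difficulty lies in finding a database that can deliver both real-time OLAP and high-concurrency point queries.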

TAL Education Tech

Unified Flink Source at Pinterest: Streaming Data Processing

To best serve Pinners, creators, and advertisers, Pinterest leverages Flink as its stream processing engine. Flink is a data processing engine for stateful computation over data streams. It provides rich streaming APIs, exactly-once support, and state checkpointing, which are essential to build stable and scalable streaming applications. Nowadays Flink is widely used in companies like Alibaba, Netflix, and Uber in mission-critical use cases.

Pinterest Tech

Flink in Practice at Vipshop

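Since 2017, Vipshop has been building a high-performance, stable, reliable, and easy-to-use real-time computing platform on k8s, supporting the smooth operation of Vipshop's internal businesses both day to day and during major promotions. The platform currently supports mainstream frameworks such as Flink, Spark, and Storm. This article shares Vipshop's experience containerizing Flink and turning it into a product.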

Vipshop Tech

Detecting Image Similarity in (Near) Real-time Using Apache Flink

Pinterest is a visual platform at its core, so the need to understand and act on images is paramount. A couple of years ago, the Content Quality team designed and implemented our own batch pipeline to detect similar images. The similarity signal is widely used at Pinterest for use cases varying from improving recommendations based on similar images to taking down spam and abusive content. However, it was taking several hours for the signal to be computed for newly created images, which was a long window for spammers and abusers to harm the platform. So recently, the team implemented a streaming pipeline to detect similar images in near-real-time.

Pinterest Tech

Pinterest Flink Deployment Framework

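Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Its features include exactly-once guarantees, low latency, high throughput, and a powerful computation model. At Pinterest, we have adopted Flink as our unified stream processing engine.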

Pinterest Tech

Youzan: Exploring and Practicing Resource Optimization for Flink Real-Time Jobs

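As Flink moved onto K8s and the real-time cluster migration completed, more and more of Youzan's Flink real-time jobs now run on K8s clusters. Running Flink on K8s improves the real-time cluster's ability to scale elastically during major promotions and lowers the cost of adding and removing machines during those periods. In addition, because a dedicated team inside the company maintains K8s, running Flink on K8s also reduces the company's operations cost.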

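However, resources for Flink jobs on K8s are currently configured by users on the real-time platform, and users have little experience judging how many resources a real-time job actually needs, so jobs are often allocated more than they use. For example, a Flink job whose workload could be handled with a parallelism of 4 ends up configured with a parallelism of 16. This wastes real-time compute resources and affects both the cluster's resource utilization level and the underlying machine cost. Against this background, this article explores and puts into practice resource optimization for Flink jobs from two angles: job memory and message processing capacity.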

Youzan Tech

Middleware and Databases: Flink

Flink SQL Gateway: Principles and Practice

Real-Time Exactly-Once Ad Event Processing with Apache Flink and Kafka

NetEase Games: Building a Flink SQL Platform in Practice

Flink + Hologres in Practice: Strategy Algorithms at an Online School

Unified Flink Source at Pinterest: Streaming Data Processing

Flink in Practice at Vipshop

Detecting Image Similarity in (Near) Real-time Using Apache Flink

Pinterest Flink Deployment Framework

Youzan: Exploring and Practicing Resource Optimization for Flink Real-Time Jobs

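Building a Real-Time Data Warehouse on Flink in Practice

[Flink] Computing Product Order Churn in Real Time with Flink

Dynamic Stream Splitting in Flink

ByteDance's Flink-Based MQ-Hive Real-Time Data Integration

Alibaba's Hands-On Experience Running Flink at Scale: Approaches to Diagnosing Common Problems

Dada Group: Moving Real-Time Computing Jobs to SQL in Practice

Real-Time Error Log Alerting with Apache Flink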