Middleware and Databases: Kafka Resources

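Qunar Kafka Performance Optimization: Saving 2,000 CPU Cores

Qunar's Kafka log cluster ran into performance problems during load testing for the Spring Festival peak, causing backlogs on some clients and failures in data production. The cluster's network idle ratio fell below 0.4 while some machines sat close to idle, so adding machines could not solve the problem. Investigation showed that growing data volume, together with pod scale-out at peak hours, increased the number of network connections and degraded performance. Changing the num.io.threads parameter from 32 to 128 tuned Kafka itself, resolved the problem, and saved 2,000 CPU cores. Switching from a single disk to two disks, by contrast, did not improve the idle ratio.

Ctrip Tech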

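This article walks through creating a consumer and consuming data with the kafka-go library in Go, focusing on the mechanism and implementation of rebalancing. By analyzing the kafka-go source code, it shows how the heartbeat mechanism is implemented and explains how, when the coordinator notifies a consumer of a rebalance, the consumer pauses consumption and rejoins the consumer group. During the rebalance, consumers stop consuming and partitions are reassigned. Afterwards, the consumer creates new goroutines to fetch data. It is a useful reference for developers who want to consume data with kafka-go in Go.

37 Interactive Entertainment Tech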
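
The rebalance flow summarized in the entry above (heartbeat, pause consumption, rejoin the group, start fresh fetch workers) can be sketched without a broker. The following is a minimal, language-neutral simulation in Python, not kafka-go's actual code: `FakeCoordinator`, `fetch_loop`, and `run_consumer` are hypothetical names invented for illustration, and threads stand in for the goroutines the article describes.

```python
import threading


class FakeCoordinator:
    """Hypothetical in-memory stand-in for the group coordinator."""

    def __init__(self, assignments, beats_per_generation):
        self.assignments = assignments  # partitions handed out per generation
        self.beats_per_generation = beats_per_generation
        self.generation = 0
        self.beats = 0

    def join_group(self):
        """Join (or rejoin) the group and receive this generation's partitions."""
        self.generation += 1
        self.beats = 0
        return self.generation, self.assignments[self.generation - 1]

    def heartbeat(self):
        """Return False once the coordinator wants the group to rebalance."""
        self.beats += 1
        return self.beats < self.beats_per_generation


def fetch_loop(generation, partition, stop, events, lock):
    """One fetch worker per assigned partition; a real one would poll the broker."""
    with lock:
        events.append(("start", generation, partition))
    while not stop.is_set():
        stop.wait(0.005)  # placeholder for a fetch request
    with lock:
        events.append(("stop", generation, partition))


def run_consumer(coordinator, heartbeat_interval=0.01):
    """Run one generation per assignment and return the recorded worker events."""
    events, lock = [], threading.Lock()
    for _ in coordinator.assignments:
        generation, partitions = coordinator.join_group()
        stop = threading.Event()
        workers = [
            threading.Thread(
                target=fetch_loop, args=(generation, p, stop, events, lock)
            )
            for p in partitions
        ]
        for worker in workers:
            worker.start()
        # Keep heartbeating until the coordinator signals a rebalance.
        while coordinator.heartbeat():
            stop.wait(heartbeat_interval)
        # Pause consumption: stop every fetch worker before rejoining the group.
        stop.set()
        for worker in workers:
            worker.join()
    return events


if __name__ == "__main__":
    coordinator = FakeCoordinator([[0, 1], [1]], beats_per_generation=3)
    for event in run_consumer(coordinator):
        print(event)
```

The point of the sketch is the ordering: every worker from the old generation is stopped and joined before the consumer rejoins the group, so no partition is read by two generations at once.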

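Interview Questions I've Memorized Over the Years: Kafka Edition

This is the Kafka installment of a tech interview series. What Kafka fundamentals do you need to know for an interview? This article covers them in detail.

Alibaba Tech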

Kafka on Kubernetes: Reloaded for fault tolerance

Coban - Grab’s real-time data streaming platform - has been operating Kafka on Kubernetes with Strimzi in production for about two years. In a previous article (Zero trust with Kafka), we explained how we leveraged Strimzi to enhance the security of our data streaming offering.

In this article, we are going to describe how we improved the fault tolerance of our initial design, to the point where we no longer need to intervene if a Kafka broker is unexpectedly terminated.

Grab Tech

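Kafka Tiered Storage at Tencent Cloud: Practice and Evolution

This article introduces a series of technical articles on microservices and message queues, including cloud-native API gateway support for WAF object onboarding, Apache RocketMQ in practice at Tencent Cloud, and a source-code walkthrough of RocketMQ 5.X PopAck. The series spans multiple technical areas and provides detailed information and practical case studies.

Tencent Tech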

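Kafka at Bilibili: Exploration and Practice

Kafka is an important piece of data middleware for every department at our company, used mainly to report, temporarily store, and distribute all kinds of data.

Bilibili Tech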

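Troubleshooting a Data Sync Problem When Flink Consumes Kafka

We have a Flink job that consumes data from Kafka and writes it to ES. The logic is very simple, yet data was being lost.

Hello Tech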

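Built for Beginners: Getting Started with Kafka in One Article

Kafka is a message queue (MQ) and one of the most commonly used middleware components. Its main features are decoupling, asynchronous processing, and rate limiting/peak shaving.

Like traditional messaging systems (also called message middleware), Kafka provides system decoupling, redundant storage, traffic peak shaving, buffering, asynchronous communication, scalability, and recoverability. In addition, Kafka offers message ordering guarantees and replayable consumption, which most messaging systems find hard to provide.

JD.com Tech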
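
The ordering and replay properties mentioned in the entry above both follow from the per-partition, offset-addressed log. As a rough illustration only, here is an in-memory Python sketch; `PartitionLog` and `Consumer` are hypothetical classes, not part of any Kafka client, and the sketch assumes a single partition with no retention or replication.

```python
class PartitionLog:
    """Append-only log for a single partition (in-memory, hypothetical)."""

    def __init__(self):
        self._records = []

    def append(self, value):
        """Store a record at the end of the log and return its offset."""
        self._records.append(value)
        return len(self._records) - 1

    def read(self, offset, max_records=100):
        """Return up to max_records records starting at the given offset."""
        return self._records[offset:offset + max_records]


class Consumer:
    """Tracks its own read position, so it can rewind and re-read."""

    def __init__(self, log):
        self.log = log
        self.position = 0

    def poll(self, max_records=100):
        """Read the next batch in offset order and advance the position."""
        batch = self.log.read(self.position, max_records)
        self.position += len(batch)
        return batch

    def seek(self, offset):
        """Move the read position, for example back to an earlier offset."""
        self.position = offset


log = PartitionLog()
for value in ["created", "paid", "shipped"]:
    log.append(value)

consumer = Consumer(log)
first_pass = consumer.poll()  # records come back in append order
consumer.seek(1)              # rewind to offset 1
replayed = consumer.poll()    # the same records are delivered again
```

Because reading does not delete anything, replay is just a matter of moving the consumer's position back; in real Kafka this works only for offsets that are still within the topic's retention window.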

Scaling Kafka to Support PayPal’s Data Growth

Apache Kafka is an open-source distributed event streaming platform that is used for data streaming pipelines, integration, and ingestion at PayPal. It supports our most mission-critical applications and ingests trillions of messages per day into the platform, making it one of the most reliable platforms for handling the enormous volumes of data we process every day.

To handle the tremendous growth of PayPal’s streaming data since its introduction, Kafka needed to scale seamlessly while ensuring high availability, fault tolerance, and optimal performance. In this blog post, we will provide a high-level overview of Kafka and discuss the steps taken to achieve high performance at scale while managing operational overhead, and our key learnings and takeaways.

PayPal Tech

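Inside eBay's Cross-Data-Center High-Availability Solution for Kafka

This article discusses how to design a cross-data-center high-availability solution for Kafka based on a local-aggregation cluster topology, one that also supports high availability and continuity for upstream and downstream data and services.

eBay Tech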

Monitoring Apache Kafka with JMX Exporter and Kafka Exporter

At Mixpanel, we use Apache Kafka to ingest trillions of data points per month. Continuous and reliable monitoring of our Apache Kafka brokers is crucial to avoid any unexpected service degradation or loss of data.

Zero traffic cost for Kafka consumers

Coban, Grab’s real-time data streaming platform team, has been building an ecosystem around Kafka, serving all Grab verticals. Along with stability and performance, one of our priorities is also cost efficiency.

In this article, we explain how the Coban team has substantially reduced Grab’s annual cost for data streaming by enabling Kafka consumers to fetch from the closest replica.

Grab Tech
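
Fetching from the closest replica, as described in the entry above, is an upstream Kafka feature: to the best of my knowledge it is enabled with `client.rack` on the consumer plus `broker.rack` and `replica.selector.class` on the brokers, though the Grab summary here does not list its exact settings. The selection idea can be sketched in Python as follows. This is a simplified model, not Kafka's implementation; `Replica` and `select_read_replica` are hypothetical names, and the sketch assumes every replica passed in is in sync.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Replica:
    broker_id: int
    rack: str  # for example an availability zone
    is_leader: bool = False
    log_end_offset: int = 0


def select_read_replica(replicas: List[Replica], client_rack: Optional[str]) -> Replica:
    """Prefer a replica in the consumer's rack; otherwise fall back to the leader."""
    leader = next(r for r in replicas if r.is_leader)
    if not client_rack:
        return leader  # the consumer did not declare a rack
    same_rack = [r for r in replicas if r.rack == client_rack]
    if not same_rack:
        return leader  # no replica lives in the consumer's rack
    # Among local replicas, pick the most caught-up one.
    return max(same_rack, key=lambda r: r.log_end_offset)


replicas = [
    Replica(broker_id=1, rack="az-a", is_leader=True, log_end_offset=100),
    Replica(broker_id=2, rack="az-b", log_end_offset=98),
    Replica(broker_id=3, rack="az-c", log_end_offset=99),
]
local = select_read_replica(replicas, "az-b")     # broker 2, same zone as the consumer
fallback = select_read_replica(replicas, "az-d")  # broker 1, the leader
```

The cost saving comes from the first branch: when a consumer reads from a replica in its own zone, the fetch traffic never crosses a zone boundary.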

Low Cost, Small Error: Ctrip's Kafka-Based Serverless Delay Queue in Practice

Built on serverless products, a low-cost delay queue is easy to implement.

Ctrip Tech

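A Brief Look at Kafka

In today's big-data era, high throughput and high reliability have become key metrics for distributed systems. Apache Kafka, a high-performance, distributed, scalable message queue system, is drawing attention and adoption from more and more companies and developers.

This article introduces Kafka's basic concepts, including its architecture, how messages are stored and processed, and its application scenarios, to help readers quickly grasp Kafka's characteristics and strengths. It also explores some of Kafka's more advanced topics, such as configuration, the file storage mechanism, and partitioning, to help readers make better use of Kafka when building distributed systems and applications.

JD.com Tech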

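Ad Hoc Queries on Real-Time Kafka Data: Application and Practice

Real-time data in Kafka is categorized and stored by topic, and topic data is retained only for a limited time. When investigating a case that involves real-time data, if that data has not been archived, there are no logs to trace back through, which makes it hard to pinpoint which stage caused the problem.

vivo Tech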

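Message Queues: MetaQ vs. Kafka, Which Is Better?

This article first introduces the MetaQ message queue and then presents the author's understanding of the two message queues, MetaQ and Kafka.

Alibaba Tech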