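Middleware and Databases: Kafka
Made for Beginners: Getting Started with Kafka in One Article
Kafka is a message queue (MQ) and one of the most widely used pieces of middleware. Its main benefits are decoupling, asynchronous processing, and rate limiting/peak shaving.
Like traditional messaging systems (also called message middleware), Kafka provides system decoupling, redundant storage, traffic peak shaving, buffering, asynchronous communication, scalability, and recoverability. In addition, Kafka offers message ordering guarantees and replayable consumption, which most messaging systems find hard to implement.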
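To make the ordering and replay ideas above concrete, here is a minimal in-memory sketch. It is not a Kafka client and not Kafka's actual implementation: `ToyLog`, its methods, and the CRC32-based key hashing are all hypothetical stand-ins chosen for illustration. It only shows why routing a key to a single append-only partition preserves per-key order, and why reading by offset without deleting records allows a consumer to replay.

```python
from __future__ import annotations

import zlib


class ToyLog:
    """In-memory stand-in for a partitioned topic (hypothetical, not a real Kafka API)."""

    def __init__(self, num_partitions: int) -> None:
        # Each partition is an append-only list; the list index plays the role of the offset.
        self.partitions: list[list[tuple[str, str]]] = [[] for _ in range(num_partitions)]

    def append(self, key: str, value: str) -> tuple[int, int]:
        # Assumption: CRC32 is used here only as a simple deterministic hash.
        # The same key always maps to the same partition, so per-key order is kept.
        partition = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[partition].append((key, value))
        offset = len(self.partitions[partition]) - 1
        return partition, offset

    def read(self, partition: int, start_offset: int = 0) -> list[tuple[str, str]]:
        # Reading never removes records, so a consumer can rewind to an old offset.
        return self.partitions[partition][start_offset:]


log = ToyLog(num_partitions=3)
partition, first_offset = log.append("order-1", "created")
log.append("order-1", "paid")
log.append("order-1", "shipped")

first_pass = log.read(partition, first_offset)
replayed = log.read(partition, first_offset)  # rewind to the same offset and read again
```

Both reads return the three events in the order they were appended, which is the property a queue that deletes messages on delivery cannot offer.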
Scaling Kafka to Support PayPal’s Data Growth
Apache Kafka is an open-source distributed event streaming platform that is used for data streaming pipelines, integration, and ingestion at PayPal. It supports our most mission-critical applications and ingests trillions of messages per day into the platform, making it one of the most reliable platforms for handling the enormous volumes of data we process every day.
To handle the tremendous growth of PayPal’s streaming data since its introduction, Kafka needed to scale seamlessly while ensuring high availability, fault tolerance, and optimal performance. In this blog post, we will provide a high-level overview of Kafka and discuss the steps taken to achieve high performance at scale while managing operational overhead, and our key learnings and takeaways.
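Inside eBay's Cross-Data-Center High Availability Solution for Kafka
This article discusses the thinking behind a cross-data-center high availability design for Kafka based on the local-aggregation cluster topology, a design that also supports high availability and continuity for upstream and downstream data and services.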
Monitoring Apache Kafka with JMX Exporter and Kafka Exporter
At Mixpanel, we use Apache Kafka to ingest trillions of data points per month. Continuous and reliable monitoring of our Apache Kafka brokers is crucial to avoid any unexpected service degradation or loss of data.
Zero traffic cost for Kafka consumers
Coban, Grab’s real-time data streaming platform team, has been building an ecosystem around Kafka, serving all Grab verticals. Along with stability and performance, one of our priorities is also cost efficiency.
In this article, we explain how the Coban team has substantially reduced Grab’s annual cost for data streaming by enabling Kafka consumers to fetch from the closest replica.
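The idea of fetching from the closest replica can be sketched as a small selection function. This is an illustrative model only, not Grab's code and not Kafka's selector implementation: the `Replica` type, the `pick_fetch_replica` function, and the rack names are hypothetical. In Apache Kafka itself the feature is enabled through the consumer's `client.rack` setting together with a rack-aware `replica.selector.class` on the brokers.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """Hypothetical description of one replica of a partition."""

    broker_id: int
    rack: str        # for example, an availability zone name
    is_leader: bool
    in_sync: bool


def pick_fetch_replica(replicas: list[Replica], client_rack: str | None) -> Replica:
    """Prefer an in-sync replica in the client's rack, otherwise fall back to the leader."""
    leader = next(r for r in replicas if r.is_leader)
    if client_rack is None:
        # No rack information: behave like a classic consumer and read from the leader.
        return leader
    for replica in replicas:
        if replica.in_sync and replica.rack == client_rack:
            return replica
    return leader


replicas = [
    Replica(broker_id=1, rack="az-a", is_leader=True, in_sync=True),
    Replica(broker_id=2, rack="az-b", is_leader=False, in_sync=True),
    Replica(broker_id=3, rack="az-c", is_leader=False, in_sync=False),
]

same_zone = pick_fetch_replica(replicas, "az-b")     # in-sync follower in the client's zone
lagging_zone = pick_fetch_replica(replicas, "az-c")  # local replica is out of sync, use leader
no_rack = pick_fetch_replica(replicas, None)         # no rack configured, use leader
```

Reading from a replica in the consumer's own zone avoids cross-zone traffic, which is where the cost saving described above comes from.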
Low Cost, Small Error: Ctrip's Practice of a Kafka-Based Serverless Delay Queue
How to build a low-cost delay queue with little effort using serverless products.
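A Brief Look at Kafka
In today's big data era, high throughput and high reliability have become important metrics for distributed systems. As a high-performance, distributed, scalable message queue system, Apache Kafka is attracting attention and adoption from more and more companies and developers.
This article introduces Kafka's basic concepts, including its architecture, how messages are stored and processed, and its application scenarios, to help readers quickly understand Kafka's characteristics and strengths. It also explores some of Kafka's advanced features, such as configuration, the file storage mechanism, and partitions, to help readers make better use of Kafka when building distributed systems and applications.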
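Ad Hoc Queries on Kafka Real-Time Data: Application and Practice
Real-time data in Kafka is stored by category under the concept of a Topic, and Topic data is retained only for a limited time. When investigating a case involving real-time data, if that data has not been archived, there are no logs to trace back through, and it becomes hard to pinpoint which stage caused the problem.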
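Message Queues: MetaQ or Kafka, Which Is Better?
This article first introduces the MetaQ message queue, then presents the author's understanding of the two message queues, MetaQ and Kafka.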
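Building Real-Time Site Search with Kafka and Elasticsearch in Practice
We are currently building a multi-tenant, multi-product website. To help users find the products they need, we need a site search feature that updates in real time. This article discusses the core infrastructure for building this feature and the technology stack that supports the search capability.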
Kafka SASL Authentication
This article describes how to configure SASL security authentication for Kafka.
Zero trust with Kafka
Grab’s real-time data platform team, also known as Coban, has been operating large-scale Kafka clusters for all Grab verticals, with a strong focus on ensuring best-in-class performance and 99.99% availability.
Security has always been one of Grab’s top priorities and as fraudsters continue to evolve, there is an increased need to continue strengthening the security of our data streaming platform. One of the ways of doing this is to move from a pure network-based access control to state-of-the-art security and zero trust by default.
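Monitoring Kafka with Prometheus: Which Metrics Should We Watch?
This article shares Alibaba Cloud Prometheus monitoring practices for both Alibaba Cloud Kafka and self-managed Kafka.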
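How to Use Kafka Better?
This article covers best practices for Kafka from the perspectives of consumption, backlog, stability, contingency plans, and cost control.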
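Sina Weibo's Evolution from Kafka to Pulsar
Sina's existing Kafka clusters mainly handle data from Sina News, Weibo, and other sources. Data types include feature logs, order data, ad impressions, and tracking/monitoring/service logs. After being processed by the Kafka online cluster, ad-dedicated cluster, log cluster, offline cluster, machine learning training cluster, and others, this data is used for production purposes such as recommendation training, landing in HDFS, the offline data warehouse, real-time monitoring, data reports, and real-time analytics.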
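Kafka Load Balancing at vivo: Implementation in Practice
Cruise Control is an operations tool for Kafka. Its features include bringing Kafka brokers online and offline, intra-cluster load balancing, replica scaling, repairing missing replicas, and broker demotion.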