Pinterest Tiered Storage for Apache Kafka®️: A Broker-Decoupled Approach

Jeff Xiang | Senior Software Engineer, Logging Platform; Vahid Hashemian | Staff Software Engineer, Logging Platform

When it comes to PubSub solutions, few have achieved higher degrees of ubiquity, community support, and adoption than Apache Kafka®️, which has become the industry standard for data transportation at large scale. At Pinterest, petabytes of data are transported through PubSub pipelines every day, powering foundational systems such as AI training, content safety and relevance, and real-time ad bidding, bringing inspiration to hundreds of millions of Pinners worldwide. Given the continuous growth in PubSub-dependent use cases and organic data volume, it became paramount that PubSub storage must be scaled to meet growing storage demands while lowering the per-unit cost of storage.

Tiered Storage is a design pattern that addresses this problem by offloading data typically stored on broker disk to a cheaper remote storage, such as Amazon S3®️. This allows the brokers themselves to keep less data on expensive local disks, reducing the overall storage footprint and cost of PubSub clusters. MemQ is a PubSub solution that maximally employs this design pattern by keeping all data in object storage, eliminating the need for local disk storage to decouple storage from compute.

KIP-405 adopts the Tiered Storage design pattern for open-source Kafka (available in Kafka 3.6.0+). It details a broker-coupled implementation, which natively integrates Tiered Storage functionality into the broker process itself.

At Pinterest, we productionalized and are now open-sourcing our implementation of Tiered Storage for Apache Kafka®️, which is decoupled from the broker. This brings the benefits of storage-compute decoupling from MemQ to Kafka and unlocks key advantages in flexibility, ease of adoption, cost reduction, and resource utilization when compared to the native implementation in KIP-405.

With 20+ production topics onboarded since May 2024, our broker-decoupled Tiered Storage implementation currently offloads ~200 TB of data every day from broker disk to a cheaper object storage. In this blog, we share the approach we took and the learnings we gained.

Why Tiered Storage?

Data sent through Kafka is temporarily stored on the broker’s disk, replicated across followers for each partition, and removed once the data exceeds the configured retention time or size threshold. This means that the cost of Kafka storage is a function of data volume, retention, and replication factor. Data volume growth is typically organic in nature, while retention and replication factor are often rigid and necessary for non-negotiables such as fault tolerance and recovery. When faced with growth in any of those variables, horizontal or vertical scaling of Kafka clusters is needed to support higher storage demands.

Traditionally, horizontal scaling of Kafka clusters involved adding more brokers to the cluster in order to increase the total storage capacity, while vertical scaling involved replacing existing brokers with new ones that have higher storage capacity. This meant that the total storage cost of scaling up Kafka clusters grew as a factor of the per-unit storage cost to store data on broker disk. This fact is evident through the following equations:

totalCost = costPerGB * totalGB * replicationFactor

totalGB = GBperSecond * retentionSeconds

Substituting the second equation into the first results in:

totalCost = costPerGB * GBperSecond * retentionSeconds * replicationFactor

As mentioned previously, we usually do not have control over throughput (GBperSecond), retention (retentionSeconds), or replication factor. Therefore, lowering the total cost of storage is most effectively achieved by decreasing the per-unit storage cost (costPerGB). Tiered Storage achieves this by offloading data from expensive broker disks to a cheaper remote storage.

Decoupling from the Broker

The native Tiered Storage offering in Apache Kafka®️ 3.6.0+ incorporates its features into the broker process itself, resulting in an inseparable coupling between Tiered Storage and the broker. While the tight coupling approach allows the native implementation of Tiered Storage to access Kafka internal protocols and metadata for a highly coordinated design, it also comes with limitations in realizing the full potential of Tiered Storage. Most notably, integrating Tiered Storage into the broker process means that the broker is always in the active serving path during consumption. This leaves behind the opportunity to leverage the remote storage system as a secondary serving path.

To address this, we applied the MemQ design pattern to Tiered Storage for Apache Kafka®️ by decoupling Tiered Storage from the broker, allowing for direct consumption from the remote storage. This delegates the active serving path to the remote storage, freeing up resources on the Kafka cluster beyond just storage and dramatically reducing the cost of serving. It also provides a higher degree of flexibility in adopting Tiered Storage and applying feature updates. The table below illustrates several key advantages of a broker-decoupled approach compared against the native implementation.

Design & Implementation

Figure 1: Architecture Overview

The implementation of broker-decoupled Tiered Storage consists of three main components:

  1. Segment Uploader: A continuous process that runs on each broker and uploads finalized log segments to a remote storage system
  2. Tiered Storage Consumer: A consumer client capable of reading data from both remote storage and local broker disk
  3. Remote Storage: A storage system that should have a per-unit storage cost that is lower than that of broker disk and supports operations necessary to the Segment Uploader and Tiered Storage Consumer

The diagram shown in Figure 1 depicts how the Segment Uploader and Tiered Storage Consumer interact with remote storage, as well as other relevant systems and processes. Let’s dive deeper into each of these components.

Segment Uploader

Figure 2: Segment Uploader Architecture

The Segment Uploader is an independent process that runs on every broker as a sidecar in a Kafka cluster that has enabled Tiered Storage. Its primary responsibility is to upload finalized log segments for the partitions that the broker leads. Achieving this objective while maintaining decoupling from the broker process requires designing these critical mechanisms:

  • Log directory monitoring
  • Leadership change detection
  • Fault tolerance

Log Directory Monitoring

The Segment Uploader’s function is to upload data residing on its local broker’s disk to a remote storage system. Due to the fact that it runs independently of the broker process, the Segment Uploader monitors the broker file system for indications signaling that certain data files are ready for upload. Let’s dive into how this monitoring mechanism works.

The Kafka broker process writes data to a local directory specified by the broker configuration log.dir. The directory structure and content of log.dir is maintained by the broker process and typically looks like the following:

<log.dir>
| - - - topicA-0
| - - - 00100.index
| - - - 00100.log
| - - - 00100.timeindex
| - - - 00200.index
| - - - 00200.log
| - - - 00200.timeindex
| - - - 00300.index ← active
| - - - 00300.log ← active
| - - - 00300.timeindex ← active
| - - - topicA-3 | - - -

| - - - topicB-1

| - - -

| - - - topicB-10

| - - -

This directory contains all the topic-partitions that this broker is a leader or follower for. The broker process writes data it receives from producer applications into log segment files (files ending in .log), as well as corresponding indexing metadata into .index and .timeindex files that allow for efficient lookups. The files names correspond to the earliest offset contained in the segment. Incoming data is written to the active log segment, which is the segment with the largest offset value.

The Segment Uploader uses a file system watcher on the topic-partition directories that this broker is currently leading to monitor events on those directories. File system events indicate to the Segment Uploader when a log segment is finalized (i.e. it is rotated and no longer receiving new data) and ready to upload. To do this, the Segment Uploader maintains a set of active log segments for each topic-partition that it is watching and uploads the most recently rotated segment upon detecting a rotation event.

In the example above, suppose that the broker is a leader for topicA-0 (topicA, partition 0), and 00300 is the active segment for topicA-0. Let’s also suppose that 00100 and 00200 are already uploaded to the remote storage, and each log segment contains 100 offsets. The Kafka broker process will continue to write incoming data into 00300 until it reaches the configured size or time threshold specified by broker configurations. In most cases, segments are rotated via the size threshold, but time-based rotations can occur when the throughput is low enough where the size threshold cannot be reached before the topic’s retention.ms configuration takes effect. Upon reaching either threshold, 00300 will be finalized and closed, while a new active segment (e.g. 00400) will be created. This triggers a file system event which notifies the Segment Uploader that 00300 is ready to upload. The Segment Uploader then enqueues all three files for 00300 for upload to remote storage and sets 00400 as the current active segment for topicA-0.

<log.dir>
| - - - topicA-0
| - - - 00100.index
| - - - 00100.log
| - - - 00100.timeindex
| - - - 00200.index
| - - - 00200.log
| - - - 00200.timeindex
| - - - 00300.index ← enqueue & upload
| - - - 00300.log ← enqueue & upload
| - - - 00300.timeindex ← enqueue & upload
| - - - 00400.index ← active
| - - - 00400.log ← active
| - - - 00400.timeindex ← active

Upon the successful completion of the uploads, the Segment Uploader will upload an offset.wm file whose content is the last successfully uploaded offset for that topic-partition. This mechanism commits the progress of the Segment Uploader and allows it to recover from restarts and interruptions by resuming uploads only for segments that came after the last committed offset.

Leadership Change Detection

Decoupling the Segment Uploader from the broker process necessitates a reliable mechanism for discovering topic-partition leadership and detecting changes in real time. The Segment Uploader achieves this by monitoring the ZooKeeper endpoints for the Kafka cluster, which is updated and maintained by the Kafka cluster’s controller.

Upon startup, the Segment Uploader bootstraps the current state by reading the /brokers/topics/<topic>/partitions/<partition>/state path in ZooKeeper, which informs the Segment Uploader about the topic-partitions that the broker is currently leading. The Segment Uploader correspondingly places a file system watch for those topic-partition subdirectories in log.dir.

Leadership changes are reflected in real time under the same ZooKeeper path. When a broker becomes the leader of a topic-partition, it must be already considered an in-sync replica (ISR), which means that it is fully caught up on replicating from the original leader. This also means that the Segment Uploader can be confident that when the Kafka cluster controller updates the leadership metadata in ZooKeeper, the Segment Uploader on the new leader can immediately place a watch on the topic-partition directory, while the Segment Uploader on the old leader can remove the watch.

In Apache Kafka®️ 3.3+, ZooKeeper is replaced by KRaft. As such, the Segment Uploader will need to monitor leadership via KRaft when deployed alongside the newest Kafka versions. Note that KRaft support in the Segment Uploader is currently under development.

Fault Tolerance

In Tiered Storage, missed uploads means data loss. As such, the main objective of fault tolerance in the Segment Uploader is to guarantee the continuity of data in the remote storage by ensuring that there are no missed uploads. Since the Segment Uploader is decoupled from the broker process, designing it for fault tolerance requires special attention.

The most common risk factors which might compromise data integrity of the uploaded log segments come primarily from the following areas:

  • Transient upload failures
  • Broker or Segment Uploader unavailability
  • Unclean leader election
  • Log segment deletion due to retention

Transient upload failures are usually mitigated by the Segment Uploader’s retry mechanism. Broker or Segment Uploader downtime can be recovered via the last committed offset specified by offset.wm as described in the previous section. Unclean leader election results in data loss on the broker level, so we accept this as a built-in risk when unclean leader election is enabled.

The most interesting problem is the last one — how can the Segment Uploader prevent missed uploads when log segment management and deletions are performed separately by the broker process?

The decoupling between Segment Uploader and the broker process implies that the Segment Uploader must upload every rotated log segment before it is cleaned up by Kafka retention policies, and the mechanism to do so must rely on signals outside of the Kafka internal protocol. Doing so while maintaining decoupling from the broker requires careful consideration of how log segments are managed by the broker.

Preventing Missed Uploads Due to Log Segment Deletion

To better understand how the Segment Uploader prevents missed uploads due to log segment deletion, let’s first explore how a log segment is managed by the Kafka broker’s LogManager. A given log segment stored on local Kafka broker disk undergoes four phases in its lifecycle: active, rotated, staged for deletion, and deleted. The timing of when a log segment transitions between these phases is determined by data throughput, as well as several configuration values at the topic and broker levels. The diagram below explains the phase transitions visually:

Figure 3: Log Segment Lifecycle & Timing

The broker configuration log.segment.bytes (default 1G) determines the size threshold for each log segment file. Once an active segment is filled to this threshold, the segment is rotated and a new active segment is created to accept subsequent writes. Thereafter, the rotated segment remains on the local broker’s disk until the topic-level retention thresholds of either retention.ms (time threshold) or retention.bytes (size threshold) is reached, whichever happens first. Upon reaching the retention threshold, the Kafka broker stages the segment for deletion by appending a “.deleted” suffix to the segment filename, and remains in this state for the duration specified by the broker configuration log.segment.delete.delay.ms (default 60 seconds). Only after this does the segment get permanently deleted.

A log segment is only eligible for upload after it is rotated and will no longer be modified. When a log segment is deleted in the final phase, it is considered permanently lost. Therefore, the Segment Uploader must upload the log segment while it is in the rotated or staged for deletion phases. Under normal circumstances, the Segment Uploader will upload the most recently-rotated log segment within one to two minutes of the segment’s rotation, as it is immediately enqueued for upload upon rotation. As long as the topic retention is large enough for the Segment Uploader to have some buffer room for the upload, this typically does not present a problem. The diagram below compares the time sequences of the segment lifecycle and the triggering of uploads.

Figure 4: Log segment lifecycle vs. Segment Uploader uploads

This works well in practice provided that log segments are rotated on a size-based threshold defined by log.segment.bytes. However, in some circumstances, the log segments might be rotated on a time-based threshold. This scenario occurs when a topic-partition receives low enough traffic where it does not have enough data to fill up log.segment.bytes within the configured retention.ms. In this situation, the broker rotates the segment and simultaneously stages it for deletion when the segment’s last-modified timestamp is beyond the topic’s retention.ms. This is illustrated in the following diagram:

Figure 5: Segment rotation due to time-based threshold

It is imperative that the Segment Uploader is able to upload during the time that it is in the staged for deletion phase, the duration of which is determined by the broker configuration log.segment.delete.delay.ms. Moreover, the segment filename upon rotation is different from the normal scenario due to the appended “.deleted” suffix, so attempting to upload the segment with the regular filename (without the suffix) will result in a failed upload. As such, the Segment Uploader retries the upload with a “.deleted” suffix upon encountering a FileNotFoundException. Additionally, the broker configuration of log.segment.delete.delay.ms should be adjusted to a slightly higher value (e.g. 5 minutes) to provide more buffer room for the Segment Uploader to complete the upload.

It is worth mentioning that the above scenario with low volume topics is generally not a concern because the benefits of Tiered Storage are most effectively achieved when applied to high volume topics.

Tiered Storage Consumer

The Segment Uploader is only part of the story — data residing in remote storage is only useful if it can be read by consumer applications. Many advantages of a broker-decoupled approach (i.e. Pinterest Tiered Storage) are realized on the consumption side, such as bypassing the broker in the consumption path to save on compute and cross-AZ network transfer costs. Tiered Storage Consumer comes out-of-the-box with the capability of reading data from both remote storage and the broker in a fully transparent manner to the user, bringing the theoretical benefits of broker-decoupled Tiered Storage from concept to reality.

Consumer Architecture

Figure 6: How Tiered Storage Consumer works

Tiered Storage Consumer is a client library that wraps the native KafkaConsumer client. It delegates operations to either the native KafkaConsumer or the RemoteConsumer depending on the desired serving path and where the requested data is stored. Tiered Storage Consumer accepts native KafkaConsumer configurations, with some additional properties that users can supply to specify the desired behavior of Tiered Storage Consumer. Most notably, the user should specify the mode of consumption, which is one of the following:

  • Remote Only: Bypass the broker during consumption and read directly and only from remote storage.
  • Kafka Only: Only read from Kafka brokers and never from remote storage (this is the same as a regular KafkaConsumer).
  • Remote Preferred: If the requested offset exists in both remote and broker, read from remote.
  • Kafka Preferred: If the requested offset exists in both remote and broker, read from broker.

Figure 7: Consumption modes and serving path for requested offsets

This section will focus on the design and behavior of Tiered Storage Consumer when reading from remote storage.

Reading From Remote Storage

Consumer Group Management

Tiered Storage Consumer leverages the existing functionalities of Kafka’s consumer group management, such as partition assignment (via subscription) and offset commits, by delegating those operations to the KafkaConsumer regardless of its consumption mode. This is the case even when consuming from remote storage. For example, a Tiered Storage Consumer in Remote Only mode can still participate in group management and offset commit mechanisms the same way that a regular KafkaConsumer does, even when the offsets are only available in the remote storage system and cleaned up on the broker.

Storage Endpoint Discovery

When reading from remote storage, Tiered Storage Consumer needs to know where the data resides for the particular topic-partition that the consumer is attempting to read. The storage endpoint for any particular topic-partition is provided by the user-defined implementation of StorageServiceEndpointProvider, which should be shared between the Segment Uploader and the Tiered Storage Consumer. The StorageServiceEndpointProvider class name is provided in the Tiered Storage Consumer configurations, which deterministically constructs a remote storage endpoint for a given topic-partition. In practice, the same implementation of StorageServiceEndpointProvider should be packaged into the classpaths of both the Segment Uploader and the consumer application to guarantee that constructed endpoints are consistent between the two.

Figure 8: StorageServiceEndpointProvider usage

With the remote endpoints constructed upon subscribe() or assign() calls, a Tiered Storage Consumer in its consumption loop will delegate read responsibilities to either the KafkaConsumer or the RemoteConsumer, depending on the user-specified consumption mode and where the requested offsets reside. Reading from remote storage is possible in every consumption mode except Kafka Only.

Let’s walk through the details of Kafka Preferred and Remote Only consumption modes. Details for the other two consumption modes are assumed to be self-explanatory after understanding these two.

Kafka Preferred Consumption

When in Kafka Preferred consumption mode, Tiered Storage Consumer delegates read operations first to the native KafkaConsumer, then to the RemoteConsumer if the desired offsets do not exist on the Kafka broker. This mode allows for consumer applications to read data in near real-time under normal circumstances, and read earlier data from remote storage if the consumer is lagging beyond the earliest offsets on Kafka brokers.

Figure 9: Kafka Preferred consumption mode

When the requested offset is not on the Kafka broker, the KafkaConsumer’s poll call throws a NoOffsetForPartitionException or OffsetOutOfRangeException. This is caught internally by the Tiered Storage Consumer in Kafka Preferred mode, which then delegates the RemoteConsumer to try and find the data from remote storage. If the offset exists in remote storage, Tiered Storage Consumer returns those records to the application layer after directly fetching those records from the remote storage system, skipping the broker when accessing the actual data.

Remote Only Consumption — Skip the Broker

When in Remote Only consumption mode, Tiered Storage Consumer solely delegates the read operations to the RemoteConsumer. The RemoteConsumer will directly request data from the remote storage based on the constructed endpoints for the assigned topic-partitions, allowing it to avoid contacting the Kafka cluster directly during the consumption loop except for offset commits and group rebalances / partition assignments, which still rely on Kafka’s internal consumer group and offset management protocols.

Figure 10: Remote Only consumption mode

When skipping the broker in Remote Only consumption mode, the full set of benefits in broker-decoupled Tiered Storage is realized. By rerouting the serving path to the remote storage system instead of the broker, consumer applications can perform historical backfills and read older data from remote storage while leveraging only the compute resources of the remote storage system, freeing up those resources from the Kafka broker. Simultaneously, depending on the pricing model of the remote storage system, cross-AZ network transfer cost can be avoided (e.g. Amazon does not charge for bandwidth between S3 and EC2 in the same region). This pushes the benefits of adopting Tiered Storage beyond just storage cost savings.

Remote Storage

The choice of the remote storage system backing Tiered Storage is critical to the success of its adoption. For this reason, special attention should be paid to the following areas when evaluating a remote storage system:

  • Interface compatibility and support for operations necessary to Tiered Storage
  • Pricing and mechanisms of data storage, transfer, replication, and lifecycle management
  • Scalability and partitioning

Interface Compatibility

At a minimum, the remote storage system should support the following generic operations over the network:

void putObject(byte[] object, Path path);

List<String> listObjectNames(Path path);

byte[] getObject(Path path);

Some more operations are generally needed for improved performance and scalability. These include, but are not limited to:

Future putObjectAsync(byte[] object, Path path, Callback callback);

InputStream getObjectInputStream(Path path);

Clearly, in-place updates and modifications to uploaded log segments are unnecessary. For this reason, object storage systems are usually preferred for their scalability and cost benefits. An eligible and compatible remote storage system can be added to Pinterest Tiered Storage as long as they are able to implement the relevant interfaces in the Segment Uploader and Tiered Storage Consumer modules.

Data Storage, Transfer, Replication, & Lifecycle Management

The purpose of adopting broker-decoupled Tiered Storage is to lower the cost and resources of storage and serving on the Kafka cluster. Therefore, it is important to understand the technical mechanisms and pricing model of the remote storage system when it comes to operations offloaded from the Kafka cluster to the remote storage. These include data storage, transfer, replication, and lifecycle management.

Most of the common remote storage systems with widespread industry adoption have publicly available documentation and pricing for each of those operations. It is important to note that the savings and benefits that can be achieved with Tiered Storage adoption is critically dependent on holistic evaluation of these factors of the remote storage system, in tandem with existing case-specific factors such as data throughput, read / write patterns, desired availability and data consistency, locality and placement of services, etc.

Unlike the native implementation of Tiered Storage in KIP-405, the lifecycle management of data uploaded to remote storage in this broker-decoupled implementation is delegated to native mechanisms on the remote storage system. For this reason, the retention of uploaded data on remote storage should be configured according to the mechanisms available to the remote storage system of choice.

Scalability & Partitioning

Writing data to a remote storage system and serving consumption using its resources requires preparations for scale. The most common bottleneck on the remote storage system comes from compute resources, which is typically enforced via request rate limits. For example, Amazon S3®️ details its request rate limits for both reads and writes on a per partitioned prefix basis. For Tiered Storage to operate at large scale, it is critical that the remote storage is pre-partitioned in a thoughtful manner that evenly distributes request rates across the remote storage partitions in order to avoid hotspots and rate limiting errors.

Taking Amazon S3®️ as an example, partitioning is achieved via common prefixes between object keys. A bucket storing Tiered Storage log segments should ideally be pre-partitioned in a way that evenly distributes request rate load across partitions. To do so, the object keyspace must be designed in such a way that allows for prefix-based partitioning, where each prefix-partition receives similar request rates as every other prefix-partition.

In Tiered Storage, log segments uploaded to S3 generally adhere to a keyspace that looks like the following examples:

topicA-0 - - - -

s3://my-bucket/custom-prefix/kafkaCluster1/topicA-0/00100.log

s3://my-bucket/custom-prefix/kafkaCluster1/topicA-0/00200.log
s3://my-bucket/custom-prefix/kafkaCluster1/topicA-0/00300.logtopicA-1 - - - -

s3://my-bucket/custom-prefix/kafkaCluster1/topicA-1/00150.log

s3://my-bucket/custom-prefix/kafkaCluster1/topicA-1/00300.logtopicB-0 - - - -

s3://my-bucket/custom-prefix/kafkaCluster1/topicB-0/01000.log

s3://my-bucket/custom-prefix/kafkaCluster1/topicB-0/02000.log

The Kafka cluster name, topic name, and Kafka partition ID are all part of the key. This is so that the Tiered Storage Consumer can reconstruct the prefix solely based on those pieces of information when assigned to those topic-partitions. However, what if topicA receives much higher read and write traffic than topicB? The above keyspace scheme does not allow for S3 prefix partitioning in a way that evenly spreads out request rate load, and so the S3 prefix-partition custom-prefix/kafkaCluster1/topicA becomes a request rate hotspot.

The solution to this problem is to introduce prefix entropy into the keyspace in order to randomize the S3 prefix-partitions that host data across different topics and partitions. The concept of partitioning the remote storage via prefix entropy was introduced in MemQ and has been battle-tested in production for several years.

To support this in Tiered Storage, the Segment Uploader allows users to configure a value for ts.segment.uploader.s3.prefix.entropy.bits which injects an N-digit MD5 binary hash to the object key. The hash is calculated from the cluster name, topic name, and Kafka partition ID combination. Assuming that N=5, we get the following keys for the same examples above:

topicA-0 - - - -

s3://my-bucket/custom-prefix/01011/kafkaCluster1/topicA-0/00100.log

s3://my-bucket/custom-prefix/01011/kafkaCluster1/topicA-0/00200.log
s3://my-bucket/custom-prefix/01011/kafkaCluster1/topicA-0/00300.logtopicA-1 - - - -

s3://my-bucket/custom-prefix/01010/kafkaCluster1/topicA-1/00150.log

s3://my-bucket/custom-prefix/01010/kafkaCluster1/topicA-1/00300.logtopicB-0 - - - -

s3://my-bucket/custom-prefix/11100/kafkaCluster1/topicB-0/01000.log

s3://my-bucket/custom-prefix/11100/kafkaCluster1/topicB-0/02000.log

With this new keyspace, if topicA receives much higher request rates than other topics, the request load is spread evenly between different S3 prefix-partitions, assuming that each of its Kafka partitions receives relatively even read and write throughput. Applying this concept across a large number of Kafka clusters, topics, and partitions will statistically lead to an even distribution of request rates between S3 prefix-partitions.

When constructing a Tiered Storage Consumer, the user must supply the same N value as the Segment Uploader so that the consumer is able to reconstruct the correct key for each topic-partition it is assigned to.

Prefix-partitioning my-bucket with N=5

Resulting 32 prefix-partitions:

custom-prefix/00000
custom-prefix/00001
custom-prefix/00010
custom-prefix/00011

custom-prefix/11111

Next Steps

Decoupling from the broker means that Tiered Storage feature additions can be rolled out and applied without needing to upgrade broker versions. Here are some of the features that are currently planned:

  • Integration with PubSub Client, a backend-agnostic client library
  • Integration with Apache Flink®️ (via PubSub Client integration)
  • Support for more remote storage systems (e.g. HDFS)
  • Support for Parquet log segment storage format to enable real-time analytics (dependent on adoption of KIP-1008)

Check It Out!

Pinterest Tiered Storage for Apache Kafka®️ is now open-sourced on GitHub. Check it out here! Feedback and contributions are welcome and encouraged.

Acknowledgements

The current state of Pinterest Tiered Storage for Apache Kafka®️ would not have been possible without significant contributions and support provided by Ambud Sharma, Shardul Jewalikar, and the Logging Platform team. Special thanks to Ang Zhang and Chunyan Wang for continuous guidance, feedback, and support.

Disclaimer

Apache®️, Apache Kafka®️, Kafka®️, Apache Flink®️, and Flink®️ are trademarks of the Apache Software Foundation (https://www.apache.org/).
Amazon®️, AWS®️, S3®️, and EC2®️ are trademarks of Amazon.com, Inc. or its affiliates.

To learn more about engineering at Pinterest, check out the rest of our Engineering Blog and visit our Pinterest Labs site. To explore and apply to open roles, visit our Careers page.

首页 - Wiki
Copyright © 2011-2025 iteam. Current version is 2.146.0. UTC+08:00, 2025-10-23 14:51
浙ICP备14020137号-1 $访客地图$