Why partition your data in Kafka?
This post shows how the processing your consumers do with the data should shape the partitioning strategy your producers use. If you have enough traffic that you need multiple instances of an application, you need to partition your data, and how you partition it determines how load is balanced across the downstream application. The producer clients decide which topic partition each record goes to, but what the consumer applications will do with the data is what should drive that decision. If at all possible, random partitioning is the best approach.
However, you may need to partition the data by an attribute if:
- Consumers of the topic need to aggregate by some attribute of the data.
- Consumers need some sort of ordering guarantee.
- Another resource is a bottleneck and you need to shard the data.
- You want to concentrate data for the efficiency of storage and/or indexing.
Random partitioning of Kafka data
We use random partitioning for the input topic of our most CPU-intensive application, the matching service. Every instance of the matching service must know about all registered queries in order to match any event. While the event volume is large, the number of registered queries is relatively small, so a single application instance can handle holding them all in memory, for now at least.
Random partitioning gives consumers the evenest spread of load, which makes scaling them easier. It is particularly well suited for stateless or “embarrassingly parallel” services.
This is the behaviour you get when you use the default partitioner without manually specifying a partition or a message key. From version 2.4 onwards, Kafka’s default partitioner uses a “sticky” algorithm, which groups all the messages in a batch into the same random partition.
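To make the two behaviours concrete, here is a minimal Java producer sketch. The topic names and payloads are made up for illustration; the point is simply that a keyless send falls through to the default (sticky) partitioner, while a keyed send hashes the key to pick a partition.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class PartitioningDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // No partition, no key: the default partitioner decides. Since Kafka 2.4
            // it is "sticky", filling a batch on one random partition before moving on.
            producer.send(new ProducerRecord<>("events", "some-event-payload"));

            // With a key: all records carrying the same key hash to the same
            // partition, giving per-key ordering and co-located data.
            producer.send(new ProducerRecord<>("events-by-query", "query-42", "some-event-payload"));
        }
    }
}
```

Every record keyed with “query-42” lands on the same partition, which is exactly the property the aggregation use cases below rely on.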

Partition by aggregate
On the topic consumed by the query aggregation service, however, we must partition by query identifier, since we need all of the events we are aggregating over to end up in the same place.
After deploying the original version of the service, we noticed that the top 1.5 percent of queries accounted for almost 90 percent of the events processed for aggregation. As you can imagine, this resulted in some very hot spots on the unlucky partitions.
So we split the aggregation service into two pieces. The first stage can be randomly partitioned; it partially aggregates the data and then partitions its per-window results by query ID. This way, the first stage condenses the largest streams, making load balancing at the second stage much easier.
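Here is a rough sketch of what that first stage could look like in Java. Everything here is a simplification under assumed names: the topics (`events`, `partials-by-query`), the window handling (flushing on every poll), and the `extractQueryId` helper are all hypothetical.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.LongSerializer;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class PartialAggregator {
    public static void main(String[] args) {
        Properties cProps = new Properties();
        cProps.put("bootstrap.servers", "localhost:9092");
        cProps.put("group.id", "partial-aggregators");
        cProps.put("key.deserializer", StringDeserializer.class.getName());
        cProps.put("value.deserializer", StringDeserializer.class.getName());

        Properties pProps = new Properties();
        pProps.put("bootstrap.servers", "localhost:9092");
        pProps.put("key.serializer", StringSerializer.class.getName());
        pProps.put("value.serializer", LongSerializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cProps);
             KafkaProducer<String, Long> producer = new KafkaProducer<>(pProps)) {
            consumer.subscribe(List.of("events"));        // randomly partitioned input
            Map<String, Long> partials = new HashMap<>(); // queryId -> partial count

            while (true) {
                for (ConsumerRecord<String, String> rec : consumer.poll(Duration.ofSeconds(1))) {
                    partials.merge(extractQueryId(rec.value()), 1L, Long::sum);
                }
                // At each window boundary (simplified to "every poll" here), emit the
                // partials keyed by query ID; the final aggregator reads this topic.
                partials.forEach((queryId, count) ->
                        producer.send(new ProducerRecord<>("partials-by-query", queryId, count)));
                partials.clear();
            }
        }
    }

    // Hypothetical helper: pull the query ID out of the raw event payload.
    private static String extractQueryId(String event) {
        return event.split(",")[0];
    }
}
```

The second stage then consumes `partials-by-query`, where all partials for a given query ID arrive on the same partition, and folds them into the final per-window aggregate.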
Of course, this approach comes with a resource cost. Writing an extra hop through Kafka and splitting the service in two means we spend more on network traffic and service costs.
Planning for resource bottlenecks and storage efficiency
It’s important to consider resource bottlenecks and storage efficiency when deciding on a partitioning scheme.
Resource bottleneck: Another of our services depends on several databases that are split into shards; the databases are the resource bottleneck, so we partition the topic to match how the data is split across the shards. The result resembles the diagram in the partition-by-aggregate example above. Each consumer connects only to its own database shard, so problems with other shards do not affect that instance’s ability to consume from its partition. And if the application keeps database-related state in memory, there is less reshuffling. Naturally, this form of partitioning is still prone to hotspots.
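As an illustrative sketch of that alignment (the shard naming scheme and the connection helpers below are hypothetical, not a real API), a consumer can use a rebalance listener to connect only to the shards backing the partitions it currently owns:

```java
import java.util.Collection;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.common.TopicPartition;

// Assumes the topic is partitioned the same way the database is sharded,
// so partition N is backed by shard N. Connection handling is stubbed out.
public class ShardAwareListener implements ConsumerRebalanceListener {
    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> assigned) {
        for (TopicPartition tp : assigned) {
            // Hypothetical naming scheme: shard N backs partition N.
            String url = "jdbc:postgresql://shard-" + tp.partition() + ".db.internal:5432/app";
            openShardConnection(tp.partition(), url);
        }
    }

    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> revoked) {
        revoked.forEach(tp -> closeShardConnection(tp.partition()));
    }

    // Hypothetical helpers: open/close a pooled connection to one shard.
    private void openShardConnection(int shard, String url) { /* ... */ }
    private void closeShardConnection(int shard) { /* ... */ }
}
```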
Storage efficiency: The source topic in our query processing system is shared with a service that stores the event data indefinitely. It reads all of the same data using a separate consumer group. The data on this topic is partitioned by the customer account it belongs to. For efficiency of storage and access, we concentrate an account’s data into as few nodes as possible. Most accounts are small enough to fit on a single node, but some require multiple nodes. If an account grows too large, we have custom logic to spread it across nodes, adjusting the node count as needed.
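One way to express that kind of placement is a custom producer Partitioner. The sketch below is illustrative only: the hard-coded `LARGE_ACCOUNTS` table stands in for whatever logic actually decides how widely to spread an oversized account.

```java
import java.util.Map;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.utils.Utils;

// Sketch: small accounts are pinned to one partition; accounts known to be
// too large rotate through a contiguous block of partitions. Assumes every
// record is keyed by account ID.
public class AccountPartitioner implements Partitioner {
    // Hypothetical: accountId -> number of partitions to spread it across.
    private static final Map<String, Integer> LARGE_ACCOUNTS = Map.of("acct-123", 4);

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        int numPartitions = cluster.partitionsForTopic(topic).size();
        int base = Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
        int spread = LARGE_ACCOUNTS.getOrDefault((String) key, 1);
        // Offset within the account's block; cheap rotation keeps it evenly used.
        int offset = (int) (System.nanoTime() % spread);
        return (base + offset) % numPartitions;
    }

    @Override public void configure(Map<String, ?> configs) {}
    @Override public void close() {}
}
```

Small accounts hash to a single stable partition, while a large account’s records rotate through a block of partitions starting at its hash.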
Consumer partition assignment
When a consumer joins or leaves a consumer group, the brokers rebalance the partitions across the consumers, meaning Kafka handles load balancing, in terms of the number of partitions per application instance, for you. This is great; it’s a major feature of Kafka. We use consumer groups in almost all of our services.
By default, when a rebalance occurs, all consumers drop their partitions and are reassigned new ones (known as the “eager” protocol). If your application keeps state associated with the consumed data, as our aggregator service does, you need to drop that state and start fresh with data from the newly assigned partitions.
To reduce partition shuffling on stateful services, you can use the StickyAssignor. This assignor tries to keep partition numbers assigned to the same instance for as long as it remains in the group, while still distributing partitions evenly across the members. Because all partitions are revoked at the start of a rebalance, if partition movement matters to the application’s logic, the consumer client code must track whether it has kept, lost, or gained partitions. This is the approach we take for our aggregator service.
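A minimal sketch of that pattern follows; the `dropStateFor` and `rebuildStateFor` helpers are hypothetical stand-ins for the real state management.

```java
import java.time.Duration;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.StickyAssignor;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class StickyStatefulConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "aggregators");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Keep assignments stable across rebalances where possible.
        props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG, StickyAssignor.class.getName());

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        Set<TopicPartition> owned = new HashSet<>();

        consumer.subscribe(List.of("partials-by-query"), new ConsumerRebalanceListener() {
            @Override
            public void onPartitionsRevoked(Collection<TopicPartition> revoked) {
                // Under the eager protocol everything is revoked here, even partitions
                // we'll get straight back, so defer cleanup to onPartitionsAssigned.
            }

            @Override
            public void onPartitionsAssigned(Collection<TopicPartition> assigned) {
                Set<TopicPartition> lost = new HashSet<>(owned);
                lost.removeAll(assigned);                  // partitions we did not get back
                Set<TopicPartition> gained = new HashSet<>(assigned);
                gained.removeAll(owned);                   // genuinely new partitions
                lost.forEach(StickyStatefulConsumer::dropStateFor);
                gained.forEach(StickyStatefulConsumer::rebuildStateFor);
                owned.clear();
                owned.addAll(assigned);
            }
        });

        while (true) {
            consumer.poll(Duration.ofSeconds(1)).forEach(rec -> { /* aggregate */ });
        }
    }

    private static void dropStateFor(TopicPartition tp) { /* discard in-memory state */ }
    private static void rebuildStateFor(TopicPartition tp) { /* reload state */ }
}
```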

There are a couple of other options worth pointing out. The CooperativeStickyAssignor is available in Kafka 2.4 and later. Instead of revoking all partitions at the start of a rebalance, the consumer’s rebalance listener is only given the difference in partitions revoked as the rebalance progresses. Rebalances take longer overall, though, and in our application we needed to optimise for a short rebalance time when a partition does move. That’s why we stuck with the “eager” protocol under the StickyAssignor for our aggregator service. However, as of Kafka 2.5, consumption can continue from partitions during a cooperative rebalance, so it’s worth revisiting.
Additionally, if your consumers consistently identify themselves as the same member, you may be able to take advantage of static membership, which can avoid triggering rebalances altogether.
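Both options are plain consumer configuration. Here is a sketch; the group, instance ID, and timeout values are made up:

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.CooperativeStickyAssignor;

public class RebalanceTuning {
    static Properties consumerProps() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "aggregators");
        // Cooperative (incremental) rebalancing, Kafka 2.4+: only moved
        // partitions are revoked, in phases, instead of everything at once.
        props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
                CooperativeStickyAssignor.class.getName());
        // Static membership: a stable, unique per-instance ID; an instance that
        // restarts within session.timeout.ms rejoins without a rebalance.
        props.put(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG, "aggregator-7");
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "60000");
        return props;
    }
}
```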
Instead of using a consumer group, you can assign partitions directly through the consumer client, which does not trigger rebalances. In that case, of course, you must balance the partitions yourself and make sure all of them are consumed. We do this in situations where we use Kafka to snapshot state: we manually match snapshot messages up with the partitions of the input topic our service reads from.
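A minimal sketch of direct assignment, with a hypothetical snapshot topic mirroring two input partitions:

```java
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ManualAssignment {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Note: no group.id is needed when partitions are assigned manually.
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Hypothetical: this instance owns input-topic partitions 0 and 1,
            // so it reads the matching partitions of the snapshot topic directly.
            List<TopicPartition> mine = List.of(
                    new TopicPartition("state-snapshots", 0),
                    new TopicPartition("state-snapshots", 1));
            consumer.assign(mine);          // direct assignment: no rebalances
            consumer.seekToBeginning(mine); // replay the snapshot from the start
            // ... poll and rebuild state ...
        }
    }
}
```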
Conclusion
The shape of your data and the processing your applications do will determine your partitioning strategy. As they grow, you may need to adapt your strategies to handle the new volume and shape of data. Consider where the resource bottlenecks are in your architecture, and distribute load across your data pipelines accordingly. The principle is the same whether the resource is CPU, database traffic, or storage space: make the most of those that are scarce or expensive.
About Enteros
Enteros offers a patented database performance management SaaS platform. It proactively identifies root causes of complex business-impacting database scalability and performance issues across a growing number of RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.