Kafka used as Delivery Mechanism in News Feed

Can I create topics called update_i for different kinds of updates and partition them by user_id in Kafka? I've read this post from Confluent: https://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/ . I also know that I cannot create a topic with a dynamic number of partitions. Given these two facts (the guidance in that post and Kafka's static number of partitions), what is the alternative delivery mechanism?

There are 2 best solutions below

Can I create topics called update_i for different kinds of updates and partition them using user_id in a Kafka MQ ?

If I understand you correctly, the answer is yes.

In a nutshell, here is what you would need to do:

  • Topic configuration: Determine the required number of partitions for your topic(s). The number of partitions is usually driven by (1) the anticipated volume of incoming data, i.e. the write side of scaling, and/or (2) the parallelism required when consuming the messages for processing, i.e. the read side of scaling. See https://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/ for details.

  • Writing messages to these Kafka topics (the "Kafka producer" side): In Kafka, messages are key-value pairs. In your case, you would set the message key to the user_id. With Kafka's default "partitioner", all messages with the same message key (here: user_id) are then automatically sent to the same partition, as long as the number of partitions does not change, which is what you want to achieve.
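
To illustrate the key-to-partition mapping described above, here is a minimal Python sketch. It does not talk to Kafka at all: `partition_for_key` is a hypothetical helper, the partition count of 6 is an assumption, and `zlib.crc32` is only a stand-in for the hash a real partitioner applies to the serialized key (the Java client's default uses murmur2), so the partition numbers it prints will not match what a real cluster assigns.

```python
import zlib

NUM_PARTITIONS = 6  # assumed; fixed when the topic is created


def partition_for_key(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key to a partition: hash(key bytes) mod partition count.

    crc32 is a stand-in hash for illustration only, not Kafka's own hash.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Hypothetical messages: (key = user_id, value = update payload)
messages = [
    ("user-42", "liked a post"),
    ("user-7", "posted a photo"),
    ("user-42", "commented"),
]

for key, value in messages:
    # Both "user-42" messages print the same partition number.
    print(f"{key!r} -> partition {partition_for_key(key)}: {value}")
```

The point of the sketch is only that the mapping is a pure function of the key and the partition count: the same user_id always yields the same partition, and changing the partition count changes the mapping.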

As a possible solution, I would suggest creating a fixed number of partitions and then setting up the producers to select the partition using the following rule:

user_id mod <number_of_partitions>

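That will allow you to preserve the order of messages for any particular user_id, because all of that user's messages land in the same partition and Kafka preserves ordering within a partition.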
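
As a rough sketch of the rule above, the following Python snippet routes a few events into an in-memory stand-in for a topic. Everything here is an assumption made for illustration: `select_partition` is a hypothetical helper, user ids are taken to be integers, the partition count is 4, and the `topic` dictionary only mimics append-only partitions rather than using a Kafka client.

```python
from collections import defaultdict

NUM_PARTITIONS = 4  # assumed; must stay fixed for the mapping to be stable


def select_partition(user_id: int, num_partitions: int = NUM_PARTITIONS) -> int:
    """Apply the rule: user_id mod <number_of_partitions>."""
    return user_id % num_partitions


# In-memory stand-in for a topic: partition number -> append-only list
topic = defaultdict(list)

# Hypothetical events in the order the producer sends them
events = [(5, "a"), (9, "b"), (5, "c"), (2, "d"), (9, "e")]
for user_id, payload in events:
    topic[select_partition(user_id)].append((user_id, payload))

# Users 5 and 9 share partition 1 (5 % 4 == 1 and 9 % 4 == 1),
# yet each user's own events keep their send order within it.
user_5_events = [payload for uid, payload in topic[1] if uid == 5]
print(user_5_events)
```

Note that several users end up in the same partition, so per-user ordering is preserved but a partition is not exclusive to one user.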

Then, if you need a consumer that processes only the messages for a particular user_id, you can write a (low-level) consumer that reads only the partition that user_id maps to, processes the messages sent for that user, and ignores all other messages.
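
The filtering step could look roughly like the sketch below. It is a simulation rather than a real consumer: `consume_for_user` and the `Record` shape are hypothetical, and the partition is represented by a plain Python list instead of records fetched from a broker.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[int, str]  # (user_id, payload), a hypothetical record shape


def consume_for_user(partition_log: Iterable[Record], user_id: int) -> Iterator[str]:
    """Yield, in log order, only the payloads that belong to user_id."""
    for record_user_id, payload in partition_log:
        if record_user_id != user_id:
            continue  # another user who happens to share this partition
        yield payload


# Assumed contents of the single partition that user 9 maps to
partition_1 = [(5, "a"), (9, "b"), (5, "c"), (9, "e")]

print(list(consume_for_user(partition_1, 9)))  # ['b', 'e']
```

With a real client, the consumer would be assigned to that one partition explicitly instead of subscribing through a consumer group (in the Java client this is done with `KafkaConsumer.assign`), and the same per-record check would run inside the poll loop.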