Question

How to configure Kafka connector to include multiple partitions when running in custom mode

  • 12 September 2019
  • 2 replies
  • 1 view


I am using the Kafka connector in my FME job to consume data from a Kafka topic. The job uses a particular consumer group. In a restart scenario, I want to run the job with the Kafka connector configured to read from a specific topic, which requires specifying a partition number in the connector. My topic has multiple partitions, so how do I tell the Kafka connector that it should consider all partitions on restart?


2 replies


Hi @rbeerak,

If you use the earliest or latest offset option, the consumer will consume from all partitions of your topic, starting either from the beginning or from the last committed offset when restarted.
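For reference, the equivalent behaviour in a plain Kafka consumer configuration might look something like the sketch below. The broker address and group id are placeholders, and note that `auto.offset.reset` only takes effect when the group has no committed offset yet:

```python
# Hypothetical settings mirroring the connector's earliest/latest option.
# No partition appears anywhere in the configuration: a consumer that
# subscribes to a topic is assigned all of its partitions automatically.
consumer_config = {
    "bootstrap.servers": "broker:9092",   # placeholder broker address
    "group.id": "fme-consumer-group",     # placeholder group id
    # Used only when the group has no committed offset (or it is out of
    # range); otherwise the consumer resumes from the committed position.
    "auto.offset.reset": "earliest",      # or "latest"
}
print(consumer_config["auto.offset.reset"])
```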

You only have to specify a partition number if you want to consume from a specific offset. A custom offset is specific to a single topic and a single partition.

I do realize that our implementation is somewhat limiting in this scenario, as you would have to use one consumer (one instance of the KafkaConnector) per (topic, partition, offset) tuple, which is not really a good workaround. So at the moment, the custom offset option is really only useful for getting a specific message from a single partition.
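As a rough illustration of what a multi-partition restart would need, here is a sketch (in plain Python, outside FME) of the per-partition resume logic; the partition numbers and offsets are invented:

```python
# Sketch of the restart logic a multi-partition custom offset would need:
# given the last committed offset per partition, compute the offset each
# partition should resume from. All numbers here are illustrative.

def seek_targets(committed):
    """Map {partition: last_committed_offset} to {partition: resume_offset}."""
    # Resume at the message after the last one successfully processed.
    return {part: offset + 1 for part, offset in committed.items()}

# With a real Kafka client (e.g. kafka-python, not available inside the
# connector), these targets would feed consumer.assign()/consumer.seek():
#   for part, offset in seek_targets(last_committed).items():
#       consumer.seek(TopicPartition(topic, part), offset)

last_committed = {0: 120, 1: 98, 2: 131, 3: 77}   # hypothetical 4 partitions
print(seek_targets(last_committed))
```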

Can you share more details about the use case in your scenario? Specifically, why did you choose to use a custom offset instead of the earliest/latest option?

I want to make sure that any enhancements we make in this area fit your scenario.

Thanks!



Thanks for getting back. Here is my use case.


I have a Kafka topic with 4 partitions. I am using Batch mode (instead of Stream) with the earliest offset, running the job every 5 minutes and consuming a specific number of messages, so the data stays close to real time.

In an error scenario where I need to rerun the job with a specific offset for each partition, based on the last committed offsets, I am not sure how to configure the Kafka connector. Can you please help with a workaround so we can build an on-demand job for error handling?

