I am using the KafkaConnector to consume data from a Kafka topic that has 5 partitions. I want to configure the FME job so that the KafkaConnector consumes data at the partition level, making the partition the unit of parallelism for better speed. Does the KafkaConnector have a built-in feature that handles partition parallelism transparently for the developer? That way I could be assured that parallelism takes place with a single KafkaConnector in the job. Please confirm.
Configuring the KafkaConnector to read data concurrently from multiple partitions of a topic
Best answer by gerhardatsafe
Hi @rbeerak,
The KafkaConnector will use all available partitions by default if the starting offset is set to earliest or latest.
To scale for faster throughput, a workspace consuming from a Kafka topic can be published to FME Server and then submitted to run on multiple engines in parallel. As long as all running workspaces specify the same consumer group ID in the KafkaConnector, they will not consume duplicate messages; instead, the consumed messages are spread across the FME Server jobs. To make sure a job is restarted immediately after a crash, use the RTC (run until canceled) option in the advanced Run Workspace parameters (https://docs.safe.com/fme/html/FME_Server_Documentation/WebUI/Run-Workspace.htm).
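To illustrate how a shared consumer group spreads partitions across parallel jobs, here is a small Python sketch. It is a simplified round-robin model, not the KafkaConnector's or Kafka's actual assignment logic (the real assignment is made by the Kafka group coordinator using the configured assignor). The function name `assign_partitions` and the engine names are hypothetical and used only for illustration.

```python
def assign_partitions(partitions, consumers):
    """Simplified round-robin model of consumer-group assignment.

    Each partition is given to exactly one consumer in the group,
    so no two consumers in the same group read the same partition.
    This is an illustrative assumption, not Kafka's real assignor.
    """
    assignment = {consumer: [] for consumer in consumers}
    if not consumers:
        return assignment
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


if __name__ == "__main__":
    topic_partitions = range(5)  # the topic in the question has 5 partitions
    for engine_count in (1, 3, 5, 6):
        # Hypothetical engine names standing in for parallel FME Server jobs
        engines = [f"engine-{n}" for n in range(1, engine_count + 1)]
        print(engine_count, assign_partitions(topic_partitions, engines))
```

In this model a single job owns all 5 partitions, three jobs split them 2/2/1, and a sixth job receives no partition. In Kafka generally, a partition is consumed by at most one member of a consumer group, so running more jobs than partitions should not add throughput.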
I hope this answers your questions and helps!
