In a Kafka-based application, producers generate messages for specific topics and send them to the Kafka brokers. The brokers perform the required replication and distribute the messages to the consumers of the respective topics. After receiving messages from the brokers, a consumer performs its tasks and lets the brokers know that the messages have been committed (or consumed). ZooKeeper maintains the offset of the last message sent by a producer for a topic, and the offset of the last committed message reported by a consumer for that topic. When the brokers receive a burst of messages and a consumer cannot process them fast enough, the messages stay in the queues longer, which degrades overall application performance.

To handle the dynamic nature of the message production rate, Horizontal Pod Autoscaling (HPA) of the Kafka consumers is used to scale the number of consumers so that the production and consumption rates of a topic are matched while using a reasonable number of consumer replicas (minimizing resource costs). At the same time, HPA also needs to maintain a low message-processing latency, which is a Key Performance Indicator (KPI) of the Kafka consumers. In particular, the number of replicas is calibrated as a trade-off between resource cost and message-processing latency.
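The relationship between the two offsets and the production and consumption rates can be illustrated with a small calculation. The following is only a minimal sketch and not the scaling logic of any particular product: the function names, the per-consumer rate, and the `max_replicas` cap are hypothetical values chosen for illustration, and it assumes every consumer replica processes messages at the same constant rate.

```python
import math


def consumer_lag(last_produced_offset: int, last_committed_offset: int) -> int:
    """Messages that were produced but not yet committed by the consumer."""
    return max(0, last_produced_offset - last_committed_offset)


def replicas_to_match_rate(production_rate: float,
                           per_consumer_rate: float,
                           max_replicas: int) -> int:
    """Smallest replica count whose combined consumption rate is at least
    the production rate, clamped to the range [1, max_replicas].

    Rates are in messages per second; max_replicas is a hypothetical cap.
    """
    if per_consumer_rate <= 0:
        raise ValueError("per_consumer_rate must be positive")
    needed = math.ceil(production_rate / per_consumer_rate)
    return max(1, min(needed, max_replicas))


def estimated_drain_seconds(lag: int,
                            production_rate: float,
                            replicas: int,
                            per_consumer_rate: float) -> float:
    """Time to work off the current lag at the given replica count.

    Returns infinity when consumption does not exceed production,
    because the queue then never shrinks.
    """
    surplus = replicas * per_consumer_rate - production_rate
    if surplus <= 0:
        return math.inf
    return lag / surplus


if __name__ == "__main__":
    lag = consumer_lag(last_produced_offset=1500, last_committed_offset=1200)
    replicas = replicas_to_match_rate(950.0, 200.0, max_replicas=10)
    print(lag, replicas, estimated_drain_seconds(lag, 950.0, replicas, 200.0))
```

With these illustrative numbers, four replicas would fall behind the producers indefinitely, while five replicas keep up and clear the backlog. This is the cost versus latency trade-off in miniature: each extra replica shortens the time messages wait in the queue but adds resource cost.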
A Kubernetes HPA controller determines the number of pods of a deployment, a replica set, or a stateful set. The controller measures the relevant metrics and computes the number of pods required to meet the criteria defined in the HPA’s configuration, which is implemented as an API resource with information such as the desiredMetricValue. A native HPA controller supports the following types of metrics to determine how to scale: