Kafka Confluent K8s Part I: Single Cluster

Sridharan r.g
3 min read · Jun 11, 2024

Kafka is an open source, distributed publish-subscribe messaging system for handling high-volume, high-throughput, and real-time streaming data. You can use Kafka to build streaming data pipelines that move data reliably across different systems and applications for processing and analysis.

This guide is intended for platform administrators, cloud architects, and operations professionals interested in deploying Kafka clusters on Kubernetes.

You can also use the CFK operator to deploy other components of the Confluent Platform, such as the web-based Confluent Control Center, Schema Registry, or ksqlDB.

In this post, you will:

  • Plan and deploy GKE infrastructure for Apache Kafka
  • Deploy and configure the CFK operator
  • Configure Apache Kafka using the CFK operator to ensure availability, security, observability, and performance

CFK provides the following benefits:

  • Automated rolling updates for configuration changes.
  • Automated rolling upgrades with no impact on Kafka availability.
  • If a failure occurs, CFK restores a Kafka Pod with the same Kafka broker ID, configuration, and persistent storage volumes.
  • Automated rack awareness to spread replicas of a partition across different racks (or zones), improving the availability of Kafka brokers and limiting the risk of data loss.
Install the CFK operator using Helm:

helm repo add confluentinc https://packages.confluent.io/helm

helm repo update

kubectl create ns kafka

helm install confluent-operator confluentinc/confluent-for-kubernetes -n kafka

helm ls -n kafka
The Kafka cluster you will deploy has the following characteristics:

  • Three Kafka broker replicas, with a minimum of two in-sync replicas required for cluster consistency.
  • Three ZooKeeper replicas, forming a cluster.
  • Two Kafka listeners: one without authentication, and one secured with TLS using a certificate generated by CFK.
  • Tolerations, nodeAffinities, and topologySpreadConstraints configured for each workload, ensuring proper distribution across nodes and zones using their respective node pools.
  • Communication inside the cluster secured by self-signed certificates using a Certificate Authority that you provide.

Generate a CA pair with OpenSSL, then store it as a Kubernetes TLS secret for CFK to use.
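CFK does not generate the CA for you, so any OpenSSL invocation that yields a key/certificate pair works; a minimal sketch that produces the ca.pem and ca-key.pem files used in the secret below (the CN value is illustrative):

```shell
# Generate a 2048-bit CA private key (file name matches the secret command below)
openssl genrsa -out ca-key.pem 2048

# Create a self-signed CA certificate valid for one year; the CN is an arbitrary example
openssl req -new -x509 -key ca-key.pem -out ca.pem -days 365 \
  -subj "/CN=kafka-ca"
```

For production, you would typically use a longer-lived key from your organization's PKI rather than a throwaway self-signed CA.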

kubectl create secret tls ca-pair-sslcerts --cert=ca.pem --key=ca-key.pem -n kafka

vi kafka-cluster.yaml

---
apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
  name: kafka-cluster-confluent
spec:
  replicas: 3
  tls:
    autoGeneratedCerts: true
  image:
    application: confluentinc/cp-server:7.4.0
    init: confluentinc/confluent-init-container:2.6.0
  dataVolumeCapacity: 50Gi
  storageClass:
    name: premium-rwo
  configOverrides:
    server:
      - offsets.topic.replication.factor=3
      - transaction.state.log.replication.factor=3
      - transaction.state.log.min.isr=2
      - default.replication.factor=3
      - min.insync.replicas=2
      - auto.create.topics.enable=true
  listeners:
    custom:
      - name: tls
        port: 9093
        tls:
          enabled: true
  podTemplate:
    tolerations:
      - key: "app.stateful/component"
        operator: "Equal"
        value: "kafka-broker"
        effect: NoSchedule
    affinity:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            preference:
              matchExpressions:
                - key: "app.stateful/component"
                  operator: In
                  values:
                    - "kafka-broker"
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: "topology.kubernetes.io/zone"
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: kafka-cluster-confluent
            clusterId: kafka
            platform.confluent.io/type: kafka
    envVars:
      - name: KAFKA_HEAP_OPTS
        value: "-Xmx4G -Xms4G"
    resources:
      requests:
        memory: 3Gi
        cpu: "1"
      limits:
        memory: 4Gi
        cpu: "2"
    probe:
      readiness:
        failureThreshold: 15
  dependencies:
    zookeeper:
      endpoint: zookeeper.kafka.svc.cluster.local:2182
      tls:
        enabled: true
---
apiVersion: platform.confluent.io/v1beta1
kind: Zookeeper
metadata:
  name: zookeeper
spec:
  replicas: 3
  tls:
    autoGeneratedCerts: true
  image:
    application: confluentinc/cp-zookeeper:7.4.0
    init: confluentinc/confluent-init-container:2.6.0
  dataVolumeCapacity: 50Gi
  logVolumeCapacity: 10Gi
  storageClass:
    name: premium-rwo
  podTemplate:
    affinity:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            preference:
              matchExpressions:
                - key: "app.stateful/component"
                  operator: In
                  values:
                    - "zookeeper"
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: "topology.kubernetes.io/zone"
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: zookeeper
            clusterId: kafka
            platform.confluent.io/type: zookeeper
    resources:
      requests:
        memory: 3Gi
        cpu: "1"
      limits:
        memory: 3Gi
        cpu: "2"

kubectl apply -f kafka-cluster.yaml -n kafka

kubectl get pod,svc,statefulset,deploy -n kafka

Check that the pods and services are up and running with the commands above. In my next post, I will discuss Kafka topics, producers, and consumers.
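Once the cluster is running, a client inside the cluster could reach the TLS listener with settings along these lines. This is only a sketch: the bootstrap DNS name follows the usual Kubernetes service naming for the Kafka resource above, and the truststore path and password are placeholders you would replace with a truststore built from ca.pem.

```properties
# Assumed bootstrap address for the Kafka CR named kafka-cluster-confluent in namespace kafka,
# on the custom TLS listener port from the manifest
bootstrap.servers=kafka-cluster-confluent.kafka.svc.cluster.local:9093
security.protocol=SSL
# Truststore containing the CA certificate (ca.pem); location and password are illustrative
ssl.truststore.location=/mnt/certs/truststore.jks
ssl.truststore.password=changeit
```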

Thanks to GCP for providing good documentation to understand these concepts.
