This tutorial shows how to deploy a Kafkorama cluster with Kafka support, in conjunction with Azure Event Hubs, on Azure Kubernetes Service (AKS).
Before deploying Kafkorama on AKS, ensure that you have a Microsoft Azure account and have installed the Azure CLI (az) and kubectl, the two tools used throughout this tutorial.
Let's use the following shell variables for this tutorial:
export RESOURCE_GROUP=rg-kafkorama
export AKS_CLUSTER=aks-kafkorama
export EVENTHUBS_NAMESPACE=evhns-kafkorama
export EVENTHUBS_TOPIC=server
Log in to Azure:
az login
Create a new resource group:
az group create --name $RESOURCE_GROUP --location eastus
Create an AKS cluster with at least three and at most five nodes, and enable cluster autoscaling:
az aks create \
--resource-group $RESOURCE_GROUP \
--name $AKS_CLUSTER \
--node-count 3 \
--vm-set-type VirtualMachineScaleSets \
--generate-ssh-keys \
--load-balancer-sku standard \
--enable-cluster-autoscaler \
--min-count 3 \
--max-count 5
Connect to the AKS cluster:
az aks get-credentials \
--resource-group $RESOURCE_GROUP \
--name $AKS_CLUSTER
Check if the nodes of the AKS cluster are up:
kubectl get nodes
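You should see the three nodes in the Ready state, with output similar to the following (node names and versions will differ):
NAME                                STATUS   ROLES   AGE     VERSION
aks-nodepool1-12345678-vmss000000   Ready    agent   5m10s   v1.27.7
aks-nodepool1-12345678-vmss000001   Ready    agent   5m12s   v1.27.7
aks-nodepool1-12345678-vmss000002   Ready    agent   5m9s    v1.27.7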
Create a namespace kafkorama for all the resources created for this environment by copying the following to a file kafkorama-namespace.yaml:
apiVersion: v1
kind: Namespace
metadata:
  name: kafkorama
Then, execute the command:
kubectl apply -f kafkorama-namespace.yaml
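Alternatively, if you prefer not to keep a manifest file around, the same namespace can be created imperatively:
kubectl create namespace kafkorama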
First, create an Event Hubs namespace:
az eventhubs namespace create --name $EVENTHUBS_NAMESPACE \
--resource-group $RESOURCE_GROUP -l eastus
Create a Kafka topic on Azure Event Hubs as follows:
az eventhubs eventhub create --name $EVENTHUBS_TOPIC \
--resource-group $RESOURCE_GROUP \
--namespace-name $EVENTHUBS_NAMESPACE
List the authorization rules (policies) of the Event Hubs namespace and get the value of the name attribute from the JSON response of the following command:
az eventhubs namespace authorization-rule list \
--resource-group $RESOURCE_GROUP \
--namespace-name $EVENTHUBS_NAMESPACE
Supposing the policy obtained above is RootManageSharedAccessKey, get the value of the primaryConnectionString attribute from the JSON response of the following command:
az eventhubs namespace authorization-rule keys list \
--resource-group $RESOURCE_GROUP \
--namespace-name $EVENTHUBS_NAMESPACE \
--name RootManageSharedAccessKey
The value of the primaryConnectionString attribute from the response of the last command should look as follows:
Endpoint=sb://evhns-kafkorama.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=xxxxxxxxxxxxxxxxx
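As a shortcut, the Azure CLI can extract this value directly using its global --query option (a JMESPath expression) with tsv output, avoiding manual copying from the JSON response:
az eventhubs namespace authorization-rule keys list \
  --resource-group $RESOURCE_GROUP \
  --namespace-name $EVENTHUBS_NAMESPACE \
  --name RootManageSharedAccessKey \
  --query primaryConnectionString -o tsv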
Therefore, the JAAS config to authenticate to Azure Event Hubs with SASL should look as follows:
KafkaClient {
  org.apache.kafka.common.security.plain.PlainLoginModule required
  username="$ConnectionString"
  password="Endpoint=sb://evhns-kafkorama.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=xxxxxxxxxxxxxxxxx";
};
Copy the JAAS config to a file jaas.config. We will need this configuration later to connect a Kafka consumer and producer to Azure Event Hubs with SASL.
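For example, assuming the RootManageSharedAccessKey policy from above, the file can be generated in one step. The following sketch stores the connection string in a shell variable and writes jaas.config (note the escaped \$ConnectionString, which must stay literal in the file):
# Fetch the connection string into a shell variable
CONNECTION_STRING=$(az eventhubs namespace authorization-rule keys list \
  --resource-group $RESOURCE_GROUP \
  --namespace-name $EVENTHUBS_NAMESPACE \
  --name RootManageSharedAccessKey \
  --query primaryConnectionString -o tsv)

# Write the JAAS config; \$ConnectionString remains literal,
# while $CONNECTION_STRING is expanded by the shell
cat > jaas.config <<EOF
KafkaClient {
  org.apache.kafka.common.security.plain.PlainLoginModule required
  username="\$ConnectionString"
  password="$CONNECTION_STRING";
};
EOF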
Because the JAAS config file obtained in the previous step must be included in the pod configuration, we should create a secret from jaas.config, which will be mounted as a volume in Kubernetes:
kubectl create secret generic kafkorama-secret \
--from-file=jaas.config -n kafkorama
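You can verify that the secret was created with:
kubectl get secret kafkorama-secret -n kafkorama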
We will use the following Kubernetes manifest to build a cluster of three Kafkorama Gateways. The following command substitutes the variables $EVENTHUBS_NAMESPACE and $EVENTHUBS_TOPIC and creates a file named kafkorama-cluster.yaml:
cat > kafkorama-cluster.yaml <<EOL
apiVersion: v1
kind: Service
metadata:
  namespace: kafkorama
  name: kafkorama-cs
  labels:
    app: kafkorama
spec:
  type: LoadBalancer
  ports:
    - name: client-port
      port: 80
      protocol: TCP
      targetPort: 8800
  selector:
    app: kafkorama
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: kafkorama
  name: kafkorama
spec:
  selector:
    matchLabels:
      app: kafkorama
  replicas: 3
  template:
    metadata:
      labels:
        app: kafkorama
    spec:
      containers:
        - name: kafkorama-cluster
          imagePullPolicy: Always
          image: kafkorama/kafkorama-gateway:latest
          volumeMounts:
            - name: kafkorama-secret
              mountPath: "/kafkorama/secrets/jaas.config"
              subPath: jaas.config
              readOnly: true
          env:
            - name: MIGRATORYDATA_EXTRA_OPTS
              value: "-DMemory=128MB \
                -DLogLevel=INFO \
                -DX.ConnectionOffload=true \
                -DClusterEngine=kafka"
            - name: MIGRATORYDATA_KAFKA_EXTRA_OPTS
              value: "-Dbootstrap.servers=$EVENTHUBS_NAMESPACE.servicebus.windows.net:9093 \
                -Dtopics=$EVENTHUBS_TOPIC \
                -Dsecurity.protocol=SASL_SSL \
                -Dsasl.mechanism=PLAIN \
                -Djava.security.auth.login.config=/kafkorama/secrets/jaas.config"
            - name: MIGRATORYDATA_JAVA_GC_LOG_OPTS
              value: "-XX:+PrintCommandLineFlags -XX:+PrintGC -XX:+PrintGCDetails -XX:+DisableExplicitGC -Dsun.rmi.dgc.client.gcInterval=0x7ffffffffffffff0 -Dsun.rmi.dgc.server.gcInterval=0x7ffffffffffffff0 -verbose:gc"
          resources:
            requests:
              memory: "256Mi"
              cpu: "0.5"
          ports:
            - name: client-port
              containerPort: 8800
          readinessProbe:
            tcpSocket:
              port: 8800
            initialDelaySeconds: 20
            failureThreshold: 5
            periodSeconds: 5
          livenessProbe:
            tcpSocket:
              port: 8800
            initialDelaySeconds: 10
            failureThreshold: 5
            periodSeconds: 5
      volumes:
        - name: kafkorama-secret
          secret:
            secretName: kafkorama-secret
EOL
This manifest contains a Service and a Deployment. The Service is used to handle the clients of the Kafkorama cluster over port 80.
In this manifest, we've used the MIGRATORYDATA_EXTRA_OPTS environment variable, which can define specific parameters or adjust the default value of any parameter listed in the Configuration Guide. Here, we've used it to modify the default values of parameters such as Memory, and to change the default value of the ClusterEngine parameter in order to enable the Kafka native add-on.
To customize Kafkorama's native add-on for Kafka, the MIGRATORYDATA_KAFKA_EXTRA_OPTS environment variable offers the flexibility to define specific parameters or adjust the default value of any parameter of the Kafka native add-on. In the manifest above, we've used this environment variable to modify the default values of the parameters bootstrap.servers and topics, among others, to connect to Azure Event Hubs.
Because the command above already substituted the variables $EVENTHUBS_NAMESPACE and $EVENTHUBS_TOPIC and created the file kafkorama-cluster.yaml, you can deploy the Kafkorama cluster by running the command:
kubectl apply -f kafkorama-cluster.yaml
Because the deployment concerns the namespace kafkorama, switch to this namespace as follows:
kubectl config set-context --current --namespace=kafkorama
Later, to return to the default namespace, you can run:
kubectl config set-context --current --namespace=default
Check the running pods to ensure the kafkorama pods are running:
kubectl get pods
The output of this command should include something similar to the following:
NAME READY STATUS RESTARTS AGE
kafkorama-57848575bd-4tnbz 1/1 Running 0 4m32s
kafkorama-57848575bd-gjmld 1/1 Running 0 4m32s
kafkorama-57848575bd-tcbtf 1/1 Running 0 4m32s
You can check the logs of each cluster member by running a command as follows:
kubectl logs kafkorama-57848575bd-4tnbz
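To follow the logs of all cluster members at once, you can rely on the app=kafkorama label defined in the manifest:
kubectl logs -f -l app=kafkorama --prefix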
Now, you can check that the service of the manifest above is up and running:
kubectl get svc
You should see an output similar to the following:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kafkorama-cs LoadBalancer 10.0.39.44 YourExternalIP 80:32210/TCP 17s
You should now be able to connect to the address assigned by AKS to the load balancer service, shown under the EXTERNAL-IP column. In this case the external IP address is YourExternalIP and the port is 80. Open the corresponding URL http://YourExternalIP in your browser. You should see a welcome page that features a demo application under the Debug Console menu for publishing to and consuming real-time messages from the Kafkorama cluster.
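You can also verify from the command line that the service responds, replacing YourExternalIP with the address from your own output:
curl -i http://YourExternalIP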
The stateless nature of the Kafkorama cluster when deployed in conjunction with Azure Event Hubs, where each cluster member is independent of the others, greatly simplifies horizontal scaling on AKS.
For example, if the load of your system increases substantially, and supposing your nodes have enough resources available, you can add two new members to the cluster by modifying the replicas field as follows:
kubectl scale deployment kafkorama --replicas=5
If the load of your system decreases significantly, then you might remove three members from the cluster by modifying the replicas field as follows:
kubectl scale deployment kafkorama --replicas=2
Manual scaling is practical if the load of your system changes gradually. Otherwise, you can use the autoscaling feature of Kubernetes.
Kubernetes can monitor the load of your system, typically expressed in CPU usage, and scale your Kafkorama cluster up and down by automatically modifying the replicas field.
In the example above, to add one or more new members up to a maximum of 5 cluster members when the CPU usage of the existing members rises above 50%, or to remove one or more of the existing members when their CPU usage falls below 50%, use the following command:
kubectl autoscale deployment kafkorama \
--cpu-percent=50 --min=3 --max=5
Alternatively, you can use a YAML manifest as follows:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  namespace: kafkorama
  name: kafkorama-autoscale # you can use any name here
spec:
  maxReplicas: 5
  minReplicas: 3
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kafkorama
  targetCPUUtilizationPercentage: 50
Save it to a file named, for example, kafkorama-autoscale.yaml, then apply it as follows:
kubectl apply -f kafkorama-autoscale.yaml
Now, you can display information about the autoscaler object above using the following command:
kubectl get hpa
and display CPU usage of cluster members with:
kubectl top pods
While testing cluster autoscaling, it is important to understand that the Kubernetes autoscaler periodically retrieves CPU usage information from the cluster members. As a result, the autoscaling process may not appear instantaneous, but this delay aligns with the normal behavior of Kubernetes.
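To observe this behavior while generating load, you can watch the autoscaler update in place:
kubectl get hpa -w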
Kafkorama clustering tolerates a number of cluster members being down or failing, as detailed in the Clustering section.
To simulate an AKS node failure, drain a node:
kubectl drain <node-name> --force --delete-local-data \
--ignore-daemonsets
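After the drain, the pods that were running on that node are rescheduled onto the remaining nodes; you can verify their placement with:
kubectl get pods -o wide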
Then, to make the node schedulable again, use:
kubectl uncordon <node-name>
Delete the Kubernetes resources created for this deployment with:
kubectl delete -f kafkorama-namespace.yaml
Go back to the default namespace:
kubectl config set-context --current --namespace=default
Finally, when you no longer need the AKS cluster, delete its resource group:
az group delete --name $RESOURCE_GROUP --yes --no-wait
First, please read the documentation of the Kafka native add-on to understand the automatic mapping between Kafkorama subjects and Kafka topics.
Utilize Kafkorama's client APIs to create real-time applications that communicate with your Kafkorama cluster via Azure Event Hubs.
Also, employ the APIs or tools of Azure Event Hubs to generate real-time messages, which are subsequently delivered to Kafkorama's clients. Similarly, consume real-time messages from Azure Event Hubs that originate from Kafkorama's clients.
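For example, here is a sketch using the standard Kafka console tools against the Event Hubs Kafka endpoint, reusing the same SASL settings as above. The file name client.properties is illustrative, and you must replace the connection string placeholder with your own:
# SASL settings matching jaas.config; the quoted heredoc keeps
# $ConnectionString literal
cat > client.properties <<'EOF'
bootstrap.servers=evhns-kafkorama.servicebus.windows.net:9093
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="Endpoint=sb://evhns-kafkorama.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=xxxxxxxxxxxxxxxxx";
EOF

# Publish a message to the Event Hubs topic; Kafkorama clients
# subscribed to the corresponding subject receive it in real time
echo "hello" | kafka-console-producer.sh \
  --bootstrap-server evhns-kafkorama.servicebus.windows.net:9093 \
  --topic server --producer.config client.properties

# Consume the messages published by Kafkorama's clients
kafka-console-consumer.sh \
  --bootstrap-server evhns-kafkorama.servicebus.windows.net:9093 \
  --topic server --consumer.config client.properties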