High Availability

Task 5: Setting Up High Availability (HA) in Kubernetes

High Availability (HA) keeps your Kubernetes cluster functional even if some control plane nodes fail. A highly available Kubernetes cluster has:

✅ Multiple control plane nodes
✅ Multiple etcd members
✅ A load balancer for the API server
✅ Worker nodes spread across availability zones


Step 1: Deploy Multiple Control Plane Nodes

For HA, at least 3 control plane nodes are recommended.

Method 1: Using kubeadm (Self-Managed Kubernetes)

1️⃣ Initialize the first control plane node:

kubeadm init --control-plane-endpoint "LOAD_BALANCER_IP:6443" --upload-certs

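2️⃣ Generate a join command and a fresh certificate key for additional control plane nodes (the certificates uploaded by --upload-certs expire after two hours):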

kubeadm token create --print-join-command
kubeadm init phase upload-certs --upload-certs

3️⃣ Join additional control plane nodes:

kubeadm join LOAD_BALANCER_IP:6443 --token <TOKEN> \
    --discovery-token-ca-cert-hash sha256:<HASH> \
    --control-plane --certificate-key <CERT_KEY>

Common Issues & Solutions

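| Issue | Cause | Solution |
| --- | --- | --- |
| error execution phase preflight | firewalld or SELinux is blocking traffic. | Open the required ports (for example 6443/tcp, 2379-2380/tcp, 10250/tcp), or in a test environment stop the firewall: systemctl stop firewalld. |
| unable to connect to server | Load balancer is misconfigured. | Ensure the load balancer forwards traffic to the API servers. |
| certificate expiry error | Certificates have expired. | Check with kubeadm certs check-expiration, then renew with kubeadm certs renew all. |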


Step 2: Set Up etcd High Availability

A highly available etcd cluster should have an odd number of members (3, 5, or 7) so that a majority (quorum) of members survives node failures.
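The preference for odd sizes follows from quorum arithmetic: etcd needs a majority of members to agree before it accepts a write. The sketch below uses two illustrative helper functions (the names `etcd_quorum` and `etcd_fault_tolerance` are hypothetical, not part of any tool) to show that a fourth member raises the quorum without raising the number of failures the cluster can survive.

```shell
#!/usr/bin/env bash
# Quorum arithmetic for an etcd cluster of n members.
# quorum             = floor(n / 2) + 1
# tolerated failures = n - quorum

etcd_quorum() {
  echo $(( $1 / 2 + 1 ))
}

etcd_fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5; do
  echo "members=$n quorum=$(etcd_quorum "$n") tolerates=$(etcd_fault_tolerance "$n")"
done
# members=3 quorum=2 tolerates=1
# members=4 quorum=3 tolerates=1
# members=5 quorum=3 tolerates=2
```

Going from 3 to 4 members adds another node that can fail but no additional tolerance, which is why 3, 5, or 7 are the usual choices.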

1️⃣ Check existing etcd members:

2️⃣ Add a new etcd member:

3️⃣ Restart etcd on each node, one at a time so the cluster keeps quorum:
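The three steps above can be sketched as follows. This is a minimal sketch under several assumptions: etcd runs as a systemd service named etcd (as the journalctl command in the table below implies), the certificates live in the kubeadm default directory /etc/kubernetes/pki/etcd, and the member name and peer URL in the usage comments are hypothetical. With kubeadm's stacked etcd, the member is normally added for you when a control plane node joins, and etcd runs as a static pod rather than a systemd unit.

```shell
#!/usr/bin/env bash
# Assumed defaults; override the variables to match your installation.
ETCD_PKI="${ETCD_PKI:-/etc/kubernetes/pki/etcd}"
ETCD_ENDPOINT="${ETCD_ENDPOINT:-https://127.0.0.1:2379}"

# Wrapper that adds the endpoint and TLS flags to every etcdctl call.
etcdctl_secure() {
  ETCDCTL_API=3 etcdctl \
    --endpoints="$ETCD_ENDPOINT" \
    --cacert="$ETCD_PKI/ca.crt" \
    --cert="$ETCD_PKI/server.crt" \
    --key="$ETCD_PKI/server.key" \
    "$@"
}

# Step 1: list the current members.
list_members() {
  etcdctl_secure member list -w table
}

# Step 2: register a new member (name, peer URL) BEFORE starting etcd on it.
# The new node must then start with --initial-cluster-state=existing.
add_member() {
  etcdctl_secure member add "$1" --peer-urls="$2"
}

# Step 3: restart the local etcd service (run on one node at a time).
restart_etcd() {
  sudo systemctl restart etcd
}

# Usage (hypothetical name and address):
#   list_members
#   add_member etcd-4 https://10.0.0.14:2380
#   restart_etcd
```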

Common Issues & Solutions

| Issue | Cause | Solution |
| --- | --- | --- |
| etcd cluster is unhealthy | Incorrect peer URL. | Check logs: journalctl -u etcd -f. |
| etcd member add failed | Duplicate node name. | Use a unique name. |
| etcdctl: connection refused | etcd is not running. | Restart etcd and check logs. |


Step 3: Deploy an API Server Load Balancer

A load balancer distributes API requests across control plane nodes.

Method 1: Use HAProxy

1️⃣ Install HAProxy on a separate node or existing control plane node:

2️⃣ Configure /etc/haproxy/haproxy.cfg:

3️⃣ Restart HAProxy:

4️⃣ Verify Load Balancer:
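The four steps above can be sketched as follows. This is a sketch under stated assumptions, not a definitive configuration: the package manager is apt (the ufw command in the table below suggests Ubuntu), the function name `render_haproxy_cfg` and the section names kubernetes-api and kubernetes-control-plane are illustrative, and the IP addresses in the usage comments are hypothetical. The function prints a TCP-mode frontend and backend that can be appended to the stock haproxy.cfg.

```shell
#!/usr/bin/env bash
# Print an haproxy.cfg fragment that balances TCP connections across
# the API servers. Arguments: one control plane IP per argument.
# Note: if HAProxy shares a node with an API server, port 6443 is already
# in use there, so bind HAProxy to a different port on that node.
render_haproxy_cfg() {
  cat <<'EOF'
frontend kubernetes-api
    bind *:6443
    mode tcp
    option tcplog
    default_backend kubernetes-control-plane

backend kubernetes-control-plane
    mode tcp
    option tcp-check
    balance roundrobin
EOF
  local i=1 ip
  for ip in "$@"; do
    echo "    server cp${i} ${ip}:6443 check"
    i=$((i + 1))
  done
}

# Usage (hypothetical addresses):
#   sudo apt-get install -y haproxy                                  # step 1
#   render_haproxy_cfg 10.0.0.11 10.0.0.12 10.0.0.13 \
#     | sudo tee -a /etc/haproxy/haproxy.cfg                         # step 2
#   sudo haproxy -c -f /etc/haproxy/haproxy.cfg \
#     && sudo systemctl restart haproxy                              # step 3
#   curl -k https://LOAD_BALANCER_IP:6443/healthz                    # step 4
```

TCP mode is used here so that TLS is terminated by the API servers themselves and HAProxy never needs the cluster certificates.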

Common Issues & Solutions

| Issue | Cause | Solution |
| --- | --- | --- |
| 503 Service Unavailable | API servers unreachable. | Check the control plane node IPs in haproxy.cfg. |
| Connection refused | Firewall blocking traffic. | Allow traffic: ufw allow 6443/tcp. |
| haproxy not starting | Config syntax error. | Run haproxy -c -f /etc/haproxy/haproxy.cfg to check syntax. |


Step 4: Deploy Worker Nodes Across Availability Zones

1️⃣ Get the join command:

2️⃣ Join worker nodes:

3️⃣ Verify worker nodes joined successfully:
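The three steps above can be sketched as follows. The function names are illustrative, and the node and zone names in the usage comments are hypothetical. The sketch assumes the well-known topology.kubernetes.io/zone label is used to record each node's availability zone; cloud providers usually set it automatically, while on bare metal it has to be applied by hand.

```shell
#!/usr/bin/env bash
# Step 1 (on a control plane node): print a ready-to-use join command.
print_join_command() {
  kubeadm token create --print-join-command
}

# Step 2 (on each worker, as root): run the printed command, which has
# the same shape as the control plane join without the control plane flags:
#   kubeadm join LOAD_BALANCER_IP:6443 --token <TOKEN> \
#       --discovery-token-ca-cert-hash sha256:<HASH>

# Record a node's availability zone if it is not set automatically.
# Arguments: node name, zone name.
label_zone() {
  kubectl label node "$1" "topology.kubernetes.io/zone=$2" --overwrite
}

# Step 3: list all nodes with their zone in an extra column.
list_nodes_by_zone() {
  kubectl get nodes -L topology.kubernetes.io/zone
}

# Usage (hypothetical names):
#   print_join_command
#   label_zone worker-1 zone-a
#   list_nodes_by_zone
```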

Common Issues & Solutions

| Issue | Cause | Solution |
| --- | --- | --- |
| node not ready | Kubelet is not running. | Restart kubelet: systemctl restart kubelet. |
| timeout error | Network issues. | Verify cluster network setup. |


Step 5: Validate High Availability Setup

1️⃣ Check all control plane nodes are Ready:

2️⃣ Verify etcd health:

3️⃣ Check HAProxy Load Balancing:

If HA is working correctly, API requests are balanced across the control plane nodes, and the API stays reachable when one of them goes down.
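The three checks above can be sketched as follows. This sketch assumes control plane nodes carry the node-role.kubernetes.io/control-plane label (which kubeadm applies on recent versions), that etcd certificates are in the kubeadm default directory, and that the API server's /healthz endpoint is reachable anonymously, which is the default. The function names are illustrative.

```shell
#!/usr/bin/env bash
# Read `kubectl get nodes --no-headers` output on stdin and print how many
# nodes have a STATUS other than exactly "Ready". A cordoned node
# ("Ready,SchedulingDisabled") is counted as not ready.
count_not_ready() {
  awk '$2 != "Ready" { n++ } END { print n + 0 }'
}

# Check 1: number of control plane nodes that are not Ready (expect 0).
check_control_plane() {
  kubectl get nodes -l node-role.kubernetes.io/control-plane --no-headers \
    | count_not_ready
}

# Check 3: query the API server health endpoint through the load balancer.
# Argument: load balancer address.
check_api_via_lb() {
  curl -sk --max-time 5 "https://$1:6443/healthz"
}

# Check 2 (run on a control plane node, paths are kubeadm defaults):
#   ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
#     --cacert=/etc/kubernetes/pki/etcd/ca.crt \
#     --cert=/etc/kubernetes/pki/etcd/server.crt \
#     --key=/etc/kubernetes/pki/etcd/server.key \
#     endpoint health --cluster
#
# Usage:
#   check_control_plane
#   check_api_via_lb LOAD_BALANCER_IP
```

A stronger test of failover is to stop one control plane node and confirm that `check_api_via_lb` still returns ok.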


Summary

✅ Multiple control plane nodes deployed
✅ Highly available etcd cluster configured
✅ API server load balancer set up
✅ Worker nodes deployed across multiple availability zones
✅ Cluster verified for HA setup


