Task 5: Setting Up High Availability (HA) in Kubernetes
High Availability (HA) ensures that your Kubernetes cluster remains functional even if some control plane nodes fail. A highly available Kubernetes cluster has:
✅ Multiple control plane nodes
✅ Multiple etcd members
✅ Load balancer for API server
✅ Worker nodes spread across availability zones
Step 1: Deploy Multiple Control Plane Nodes
For HA, at least 3 control plane nodes are recommended.
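The source does not say which installer it uses. Assuming kubeadm, bringing up several control plane nodes might look like the sketch below. The endpoint `k8s-api.example.internal:6443` is a hypothetical load balancer address, and the token, CA hash, and certificate key are placeholders for the values that `kubeadm init` prints. Commands are only echoed while `DRY_RUN=1`.

```shell
# Sketch only: assumes kubeadm; the endpoint below is a hypothetical placeholder.
LB_ENDPOINT="k8s-api.example.internal:6443"
DRY_RUN=1   # set to 0 to execute the commands instead of printing them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# First control plane node: register the shared API endpoint and upload the
# control plane certificates so that further control plane nodes can fetch them.
init_first_control_plane() {
  run sudo kubeadm init --control-plane-endpoint "$LB_ENDPOINT" --upload-certs
}

# Additional control plane nodes: $1 = token, $2 = CA cert hash, $3 = certificate key
# (all three are printed by `kubeadm init`).
join_control_plane() {
  run sudo kubeadm join "$LB_ENDPOINT" \
    --token "$1" \
    --discovery-token-ca-cert-hash "$2" \
    --control-plane \
    --certificate-key "$3"
}

init_first_control_plane
```

Pointing `--control-plane-endpoint` at the load balancer rather than at a single node means clients and joining nodes keep working when any one control plane node is down.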
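Step 2: Configure a Highly Available etcd Cluster
A highly available etcd cluster should have an odd number of members (3, 5, or 7), so that a majority (quorum) of members remains available when a minority fails.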
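The reason for odd member counts comes from quorum arithmetic: etcd needs a strict majority of members to agree on writes. The small sketch below (function names are illustrative) prints the quorum size and the number of member failures each cluster size can tolerate.

```shell
# Quorum is a strict majority of members: floor(n/2) + 1.
quorum() { echo $(( $1 / 2 + 1 )); }

# Number of members that can fail while the cluster still has quorum.
fault_tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 3 4 5 6 7; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

A 4-member cluster tolerates one failure, the same as a 3-member cluster, so the extra even-numbered member adds replication cost without adding fault tolerance.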
1️⃣ Check existing etcd members:
2️⃣ Add a new etcd member:
3️⃣ Restart etcd on all nodes:
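The three steps above can be sketched with `etcdctl` as follows. This assumes etcdctl 3.4 or later (v3 API by default) and an etcd managed by systemd, as the `journalctl -u etcd` hint in this guide suggests. The endpoint, certificate directory, member name `etcd-4`, and peer URL are hypothetical values. Commands are only echoed while `DRY_RUN=1`.

```shell
# Sketch only: endpoint, certificate paths, member name, and peer URL are hypothetical.
DRY_RUN=1   # set to 0 to execute the commands instead of printing them
ENDPOINT="https://127.0.0.1:2379"
PKI="/etc/etcd/pki"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# etcdctl with TLS client flags; extra arguments are passed through.
etcdctl_tls() {
  run sudo etcdctl --endpoints="$ENDPOINT" \
    --cacert="$PKI/ca.crt" --cert="$PKI/client.crt" --key="$PKI/client.key" "$@"
}

# 1. Check existing etcd members.
list_members() { etcdctl_tls member list --write-out=table; }

# 2. Add a new member: $1 = unique member name, $2 = its peer URL.
add_member() { etcdctl_tls member add "$1" --peer-urls="$2"; }

# 3. Restart etcd on a node (run on each node in turn).
restart_etcd() { run sudo systemctl restart etcd; }

list_members
add_member etcd-4 https://10.0.0.14:2380
restart_etcd
```

After `member add`, the new etcd process is normally started with its initial cluster state set to `existing`, so that it joins the running cluster instead of bootstrapping a new one. Restarting nodes one at a time keeps quorum intact during the change.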
Common Issues & Solutions
Issue | Cause | Solution
etcd cluster is unhealthy | Incorrect peer URL. | Check logs: journalctl -u etcd -f.
etcd member add failed | Duplicate node name. | Use a unique name.
etcdctl: connection refused | etcd is not running. | Restart etcd and check logs.
Step 3: Deploy an API Server Load Balancer
A load balancer distributes API requests across control plane nodes.
Method 1: Use HAProxy
1️⃣ Install HAProxy on a separate node or existing control plane node:
2️⃣ Configure /etc/haproxy/haproxy.cfg:
3️⃣ Restart HAProxy:
4️⃣ Verify Load Balancer:
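The four HAProxy steps above might look like the following sketch. It assumes a Debian/Ubuntu host (consistent with the `ufw` hint in this guide). The backend IPs `10.0.0.11` to `10.0.0.13`, the server names, and the fragment path are hypothetical. The script only writes an example fragment to a temporary path, and the other commands are only echoed while `DRY_RUN=1`.

```shell
# Sketch only: node IPs, server names, and the fragment path are hypothetical.
DRY_RUN=1   # set to 0 to execute the commands instead of printing them
CFG_FRAGMENT="${TMPDIR:-/tmp}/haproxy-k8s-api.cfg"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# 1. Install HAProxy.
install_haproxy() { run sudo apt-get install -y haproxy; }

# 2. Write the frontend/backend sections to append to /etc/haproxy/haproxy.cfg.
#    TCP mode passes TLS through to the API servers unmodified.
#    If HAProxy shares a host with an API server, bind to a port other than 6443.
write_fragment() {
  cat > "$CFG_FRAGMENT" <<'EOF'
frontend kubernetes-api
    bind *:6443
    mode tcp
    option tcplog
    default_backend control-plane

backend control-plane
    mode tcp
    balance roundrobin
    option tcp-check
    server cp1 10.0.0.11:6443 check
    server cp2 10.0.0.12:6443 check
    server cp3 10.0.0.13:6443 check
EOF
}

# 3. Validate the full configuration, then restart HAProxy.
restart_haproxy() {
  run sudo haproxy -c -f /etc/haproxy/haproxy.cfg
  run sudo systemctl restart haproxy
}

# 4. Probe the API server health endpoint through the load balancer ($1 = LB address).
verify_lb() { run curl -k "https://$1:6443/healthz"; }

write_fragment
```

The `check` keyword on each `server` line makes HAProxy stop routing to a control plane node whose port 6443 no longer accepts connections. The `/healthz` probe assumes anonymous access to health endpoints is left at the Kubernetes default.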
Common Issues & Solutions
Issue | Cause | Solution
503 Service Unavailable | API servers unreachable. | Check master node IPs in haproxy.cfg.
Connection refused | Firewall blocking traffic. | Allow traffic: ufw allow 6443/tcp.
haproxy not starting | Config syntax error. | Run haproxy -c -f /etc/haproxy/haproxy.cfg to check syntax.
Step 4: Deploy Worker Nodes Across Availability Zones
1️⃣ Get the join command:
2️⃣ Join worker nodes:
3️⃣ Verify worker nodes joined successfully:
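Assuming the cluster was built with kubeadm (the source does not say), the join steps above can be sketched as below. The zone label `topology.kubernetes.io/zone` is the standard Kubernetes topology label. The node name, zone name, token, and hash passed to these functions are placeholders. Commands are only echoed while `DRY_RUN=1`.

```shell
# Sketch only: assumes kubeadm; all arguments are placeholders supplied by the caller.
DRY_RUN=1   # set to 0 to execute the commands instead of printing them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# 1. On a control plane node: print a complete join command with a fresh token.
print_join_command() { run sudo kubeadm token create --print-join-command; }

# 2. On each worker: $1 = API endpoint (the load balancer), $2 = token, $3 = CA cert hash.
join_worker() {
  run sudo kubeadm join "$1" --token "$2" --discovery-token-ca-cert-hash "$3"
}

# 3. Verify the nodes joined, showing each node's zone label as an extra column.
verify_workers() { run kubectl get nodes -L topology.kubernetes.io/zone; }

# Optional: set the zone label by hand when no cloud provider sets it
# ($1 = node name, $2 = zone name).
label_zone() { run kubectl label node "$1" "topology.kubernetes.io/zone=$2"; }

print_join_command
verify_workers
```

Listing the zone column makes it easy to confirm that workers are actually spread across zones rather than concentrated in one.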
Common Issues & Solutions
Issue | Cause | Solution
node not ready | Kubelet is not running. | Restart kubelet: systemctl restart kubelet.
timeout error | Network issues. | Verify cluster network setup.
Step 5: Validate High Availability Setup
1️⃣ Check all control plane nodes are Ready:
2️⃣ Verify etcd health:
3️⃣ Check HAProxy Load Balancing:
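If HA is working correctly, API requests should be distributed across all healthy control plane nodes, and the API should stay reachable when any single control plane node is stopped.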
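The validation checks above could be scripted roughly as follows. The load balancer address, etcd endpoint, and certificate paths are hypothetical, and the `node-role.kubernetes.io/control-plane` label is the one kubeadm applies (older clusters may use a `master` label instead). Commands are only echoed while `DRY_RUN=1`.

```shell
# Sketch only: addresses and certificate paths are hypothetical.
DRY_RUN=1   # set to 0 to execute the commands instead of printing them
ENDPOINT="https://127.0.0.1:2379"
PKI="/etc/etcd/pki"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# 1. Every control plane node should report STATUS Ready.
check_control_plane() {
  run kubectl get nodes -l node-role.kubernetes.io/control-plane
}

# 2. Ask every etcd member in the cluster for its health.
check_etcd() {
  run sudo etcdctl --endpoints="$ENDPOINT" \
    --cacert="$PKI/ca.crt" --cert="$PKI/client.crt" --key="$PKI/client.key" \
    endpoint health --cluster
}

# 3. Send several requests through the load balancer ($1 = LB address).
check_lb() {
  for attempt in 1 2 3; do
    run curl -k "https://$1:6443/healthz"
  done
}

check_control_plane
check_etcd
check_lb 10.0.0.10
```

The `curl` responses alone do not show which backend answered. If HAProxy logging is configured, its connection log should show requests rotating across the backend servers.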
Summary
✅ Multiple control plane nodes deployed
✅ Highly available etcd cluster configured
✅ API server load balancer set up
✅ Worker nodes deployed across multiple availability zones
✅ Cluster verified for HA setup
Next Task: Do you want to proceed with Disaster Recovery Automation? 😊