Kubeadm

This guide breaks down the common scenarios and issues you might face when deploying a Kubernetes cluster using kubeadm, along with solutions.


Task 1: Deploying a Kubernetes Cluster Using kubeadm

Step 1: Install Dependencies

Commands

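# Install the tools needed to fetch and verify the repository key.
sudo apt update && sudo apt install -y apt-transport-https ca-certificates curl gpg
sudo mkdir -p -m 755 /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt update
sudo apt install -y kubelet kubeadm kubectl
# Pin the packages so routine upgrades do not change the cluster version.
sudo apt-mark hold kubelet kubeadm kubectl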

Common Issues & Solutions

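| Issue | Cause | Solution |
| --- | --- | --- |
| `kubeadm: command not found` | kubeadm is not installed or is not on the PATH. | Check with `which kubeadm`, then reinstall using `sudo apt install -y kubeadm`. |
| `The following signatures were invalid: EXPKEYSIG...` | The repository's GPG key has expired. | Re-download `Release.key` from the repository URL used in the commands above, overwrite the stored key file, and run `sudo apt update`. On older systems that still ship `apt-key`, `sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys <KEY>` may also work, but `apt-key` is deprecated on current Debian and Ubuntu releases. |
| `kubelet is not running` | The service is not enabled. | Run `sudo systemctl enable --now kubelet`. Until `kubeadm init` or `kubeadm join` has run, the kubelet restarting every few seconds is expected. |
| `modprobe: FATAL: Module br_netfilter not found` | The required kernel module is missing. | Run `sudo modprobe br_netfilter` and `echo 'br_netfilter'` … |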


Step 2: Disable Swap & Configure Sysctl

Commands
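A minimal sketch of this step, assuming the standard kubeadm prerequisites: swap disabled, the `overlay` and `br_netfilter` modules loaded, and bridged traffic visible to iptables. The path `/etc/sysctl.d/k8s.conf` is the one the table below refers to; `/etc/modules-load.d/k8s.conf` and all function names are assumptions made for illustration.

```shell
# Kernel modules to load at boot (assumed target: /etc/modules-load.d/k8s.conf).
k8s_modules_conf() {
  printf '%s\n' overlay br_netfilter
}

# Sysctl settings for bridged traffic and IP forwarding
# (target: /etc/sysctl.d/k8s.conf).
k8s_sysctl_conf() {
  cat <<'EOF'
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF
}

# Comment out swap entries so swap stays off after a reboot.
# Takes the fstab path as an optional argument (default: /etc/fstab).
comment_out_swap() {
  local fstab="${1:-/etc/fstab}"
  sed -i '/[[:space:]]swap[[:space:]]/ s/^#*/#/' "$fstab"
}

# Apply everything. Defined here but not called; run it from a root shell.
apply_k8s_prereqs() {
  swapoff -a
  comment_out_swap /etc/fstab
  k8s_modules_conf > /etc/modules-load.d/k8s.conf
  modprobe overlay && modprobe br_netfilter
  k8s_sysctl_conf > /etc/sysctl.d/k8s.conf
  sysctl --system
}
```

Calling `apply_k8s_prereqs` as root on every node applies the settings immediately and persists them across reboots. Keeping the file contents in separate functions makes it easy to review them before anything is written.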

Common Issues & Solutions

| Issue | Cause | Solution |
| --- | --- | --- |
| `swapoff: command not found` | Running on a minimal OS image. | Install the util-linux package: `sudo apt install util-linux`. |
| `sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables` | Kernel module missing. | Load it using `sudo modprobe br_netfilter`. |
| `bridge-nf-call-iptables = 0` | Kernel not applying settings. | Run `sudo sysctl -p /etc/sysctl.d/k8s.conf`. |


Step 3: Initialize the Kubernetes Cluster

Command
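A minimal sketch of the initialization command, assuming Calico is the CNI plugin (as in Step 5). The pod CIDR `192.168.0.0/16` is an assumption that matches the default pool in Calico's stock manifest, and the helper name `kubeadm_init_cmd` is hypothetical. The snippet only prints the command so it can be reviewed first.

```shell
# Assumed pod network CIDR; override by exporting POD_CIDR before sourcing.
POD_CIDR="${POD_CIDR:-192.168.0.0/16}"

# Build the kubeadm init command line.
# Optional argument: the address the API server should advertise.
kubeadm_init_cmd() {
  local advertise="${1:-}"
  local cmd="kubeadm init --pod-network-cidr=${POD_CIDR}"
  if [ -n "$advertise" ]; then
    cmd="$cmd --apiserver-advertise-address=${advertise}"
  fi
  printf '%s\n' "$cmd"
}

# Print the command for review; nothing is executed here.
kubeadm_init_cmd
```

On the control-plane node, run the printed command with `sudo`. Pass the node's own IP as the argument if it has more than one network interface.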

Common Issues & Solutions

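| Issue | Cause | Solution |
| --- | --- | --- |
| `kubeadm init` hangs at "pre-pulling images" | Slow internet or registry unreachable. | Pre-pull images manually: `kubeadm config images pull`. |
| `Failed to start API server: failed to get etcd endpoints` | etcd is not running. | kubeadm runs etcd as a static pod, not a systemd unit, so `journalctl -u etcd` shows nothing. Find the container with `sudo crictl ps -a --name etcd`, read its logs with `sudo crictl logs <container-id>`, and check the kubelet with `journalctl -u kubelet -f`. |
| `Port 6443 is already in use` | Another process is using it. | Find and kill it: `sudo netstat -tulnp` … |
| `timed out waiting for the condition` | Firewall is blocking the API server. | Open the port: `sudo ufw allow 6443/tcp`. |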


Step 4: Set Up Kubeconfig

Commands
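A minimal sketch of the kubeconfig setup, following the steps `kubeadm init` prints on success: copy `/etc/kubernetes/admin.conf` into the user's home directory and take ownership of it. The function name and the `SUDO` variable are assumptions made so the copy can be tried against other paths.

```shell
# Copy the admin kubeconfig to the current user's kubeconfig location.
# Arguments (both optional): source file, destination file.
# SUDO defaults to "sudo" because admin.conf is readable only by root.
setup_kubeconfig() {
  local src="${1:-/etc/kubernetes/admin.conf}"
  local dest="${2:-$HOME/.kube/config}"
  local sudo_cmd="${SUDO-sudo}"

  mkdir -p "$(dirname "$dest")"
  $sudo_cmd cp "$src" "$dest"
  # id runs as the calling user, so ownership goes to that user, not root.
  $sudo_cmd chown "$(id -u):$(id -g)" "$dest"
  chmod 600 "$dest"
}
```

Run `setup_kubeconfig` as the regular user who will use `kubectl`, then confirm access with `kubectl get nodes`.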

Common Issues & Solutions

| Issue | Cause | Solution |
| --- | --- | --- |
| `The connection to the server 127.0.0.1:6443 was refused` | The kubeconfig is missing or the API server is not running. | Confirm that `$HOME/.kube/config` exists, then check the kubelet, which runs kube-apiserver as a static pod: `sudo systemctl status kubelet`. |
| Nodes are `NotReady` | CNI plugin is not installed. | Install a CNI plugin (Calico or Flannel). |
| `kubectl: command not found` | kubectl is not installed. | Reinstall it using `sudo apt install kubectl`. |


Step 5: Install a CNI Plugin (Calico)

Command
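A minimal sketch of installing Calico from its published manifest. The version `v3.27.0` is an assumption; check Calico's release notes for the version that matches your Kubernetes release. The helper name is hypothetical, and the snippet only prints the `kubectl apply` command.

```shell
# Assumed Calico release; override by exporting CALICO_VERSION.
CALICO_VERSION="${CALICO_VERSION:-v3.27.0}"

# URL of the Calico manifest for the selected release.
calico_manifest_url() {
  printf 'https://raw.githubusercontent.com/projectcalico/calico/%s/manifests/calico.yaml\n' "$CALICO_VERSION"
}

# Print the install command for review; nothing is applied here.
echo "kubectl apply -f $(calico_manifest_url)"
```

After applying the manifest, `kubectl get pods -n kube-system -l k8s-app=calico-node` should show one calico-node pod per node.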

Common Issues & Solutions

| Issue | Cause | Solution |
| --- | --- | --- |
| Pods stuck in `ContainerCreating` | CNI plugin is missing. | Ensure `kubectl get pods -n kube-system` shows a running CNI pod. |
| `Readiness probe failed` for calico-node | Networking issue. | Restart Calico: `kubectl delete pod -n kube-system -l k8s-app=calico-node`. |


Step 6: Join Worker Nodes

Command (Run on worker nodes)
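A minimal sketch of assembling the join command from its three parts: the API server endpoint, a bootstrap token, and the CA certificate hash. All values are placeholders you supply; the function names are hypothetical. In practice, `kubeadm token create --print-join-command` on the control-plane node prints the complete command with real values.

```shell
# Returns 0 when the argument has the kubeadm bootstrap token shape:
# six characters, a dot, then sixteen characters, all from [a-z0-9].
is_bootstrap_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Build the kubeadm join command line.
# Arguments: <host:port> <token> <ca-cert-hash without the sha256: prefix>
kubeadm_join_cmd() {
  local endpoint="$1"
  local token="$2"
  local ca_hash="$3"
  if ! is_bootstrap_token "$token"; then
    echo "invalid bootstrap token format" >&2
    return 1
  fi
  printf 'kubeadm join %s --token %s --discovery-token-ca-cert-hash sha256:%s\n' \
    "$endpoint" "$token" "$ca_hash"
}
```

Run the resulting command with `sudo` on each worker node. Validating the token shape first catches copy-and-paste mistakes before the preflight checks run.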

Common Issues & Solutions

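| Issue | Cause | Solution |
| --- | --- | --- |
| `[ERROR] preflight: couldn't validate the identity of the API Server` | Token expired (bootstrap tokens last 24 hours by default). | Generate a new token on the control-plane node: `kubeadm token create --print-join-command`. |
| `kubelet: failed to start container runtime` | The container runtime (Docker or containerd) is not running or is misconfigured. | Restart the container runtime: `sudo systemctl restart containerd`. |
| cgroup driver mismatch | The container runtime and the kubelet are using different cgroup drivers. | For containerd, set `SystemdCgroup = true` under the runc runtime options in `/etc/containerd/config.toml`, then restart containerd. `cgroupDriver: systemd` is the matching kubelet setting, which kubeadm has used by default since v1.22. |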
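For the cgroup driver mismatch, note that containerd's own key is `SystemdCgroup`, while `cgroupDriver: systemd` belongs to the kubelet configuration. A minimal sketch of the containerd side, assuming the config file was generated with `containerd config default` and so already contains a `SystemdCgroup = false` line; the helper name is hypothetical.

```shell
# Switch containerd's runc runtime to the systemd cgroup driver.
# Optional argument: path to the containerd config file.
enable_systemd_cgroup() {
  local config="${1:-/etc/containerd/config.toml}"
  # Keep the original indentation and flip only the boolean value.
  sed -i 's/^\([[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*\)false/\1true/' "$config"
}
```

Run `enable_systemd_cgroup` from a root shell, then restart both services with `systemctl restart containerd` and `systemctl restart kubelet`.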


Step 7: Verify Cluster Health

Command
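A minimal sketch of a health check, assuming the usual verification commands `kubectl cluster-info`, `kubectl get nodes`, and `kubectl get pods -n kube-system`. The helper below is hypothetical; it filters node output down to the nodes that need attention.

```shell
# Reads `kubectl get nodes --no-headers` output on stdin and prints the
# names of nodes whose STATUS column is not exactly "Ready".
# Cordoned nodes (for example "Ready,SchedulingDisabled") are listed too.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}
```

Usage: `kubectl get nodes --no-headers | not_ready_nodes`. Empty output means every node reports `Ready`.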

Common Issues & Solutions

| Issue | Cause | Solution |
| --- | --- | --- |
| `kubectl cluster-info` fails | API server is not running. | Restart the kubelet, which manages the API server static pod: `sudo systemctl restart kubelet`. |
| Pods stuck in `ContainerCreating` | Storage class issue. | Ensure a storage class is configured: `kubectl get sc`. |
| CoreDNS is not running | Network issue. | Restart CoreDNS: `kubectl rollout restart deployment coredns -n kube-system`. |


Summary

By now, your Kubernetes cluster should be fully deployed and operational. This guide has covered each installation step along with the common errors, their causes, and fixes.

