Baremetal Kubernetes with kubeadm in 10 minutes? Easy!
A while ago, being on sick leave, I challenged myself with setting up a baremetal K8s cluster. I chose Kubeadm back then, and so now I am sharing the results.
Yeh, maybe you already have the following question popped up in your mind:
Well, you may want to consider a bare-metal cluster, because
๐ you are setting up an internal infrastructure for your company, and using cloud providers is not an option,
๐ you wish to receive close-to-production experience with you own pet K8s,
๐ you are well-aware of Minikube, MicroK8s and others, but always wanted to have a real multi-node cluster out there,
๐ you don't want to pay for a pre-configured AKS/GKE/EKS/whatever solution just yet,
๐ you want to set everything up from scratch, at least once, out of curiosity,
๐ you have free time :)
Why making another article while there is plenty of them already? Well because most of the material is obviously not for beginners. Sometimes the material feels fragmented, ripped ouf of the context of something bigger or simply not covering the entire process.
What you should understand just before we continue.
โ๏ธ It still won't be a production-ready environment for mature projects! ๐จ๐จ๐จ
โ๏ธ However, with extra hours dedicated, the cluster can become production-ready.
โ๏ธ But even so, it will be good enough for hobby projects, proof of concepts, learning and exploration.
โ๏ธ You will not receive any GUI control panel, only raw kubectl.
Resonates with your needs? Then let's do it!
Nodes
Kubernetes is a cluster, it means that typically more than one machine is involved. I got for myself two instances of VDS (VPS), with 2 Cores and 4 GB of RAM each (those are the minimum requirements), Centos7 on-board. It was a cheap russian hosting with a data center in Moscow and unlimited bandwidth. It works for me just fine, but you can pick AWS's EC2, DigitalOcean or Vultr, or any other provider.
You may as well get two virtual machines with VirtualBox and Vagrant, but it will not give you the same experience.
- X.X.X.X - IP address of Node A
- Y.Y.Y.Y - IP address of Node B
Just as a part of pre-flight preparations, I run the following commands to make the whole process a bit easier.
On my machine:
printf "\nX.X.X.X k8s-master\nY.Y.Y.Y k8s-node01\n" | sudo tee -a /etc/hosts;scp ~/.ssh/id_rsa.pub root@X.X.X.X:~scp ~/.ssh/id_rsa.pub root@Y.Y.Y.Y:~
On each node under root:
yum update# create a non-root useruseradd admin# enable sudousermod -aG wheel adminpasswd adminpasswd -l admin# enable key-based sign-incd /home/admin/mkdir ./.sshchmod 711 ./.sshcat ~/id_rsa.pub >> ./.ssh/authorized_keyschmod 600 ./.ssh/authorized_keyschown -R admin:admin ./.ssh
You may also want to disable password-based SSH authentication, disable root login and switch SSH to a different port to ward off brute-forcers.
Now I can sign-in the nodes by typing ssh admin@k8s-node01 without password and able to run sudo.
Common setup
Kubernetes logically consists of two parts: Control plane and Compute.
- Control plane operates the cluster, and it is hosted on so-called master node(s).
- Compute runs containerized applications and is hosted on worker nodes.
Therefore, I choose Node A to be the master node, whereas Node B - worker node.
The following set of commands I need to run on each node:
โก๏ธ Allow nodes to address each other by name:
printf "\nX.X.X.X master\nY.Y.Y.Y node01\n" | sudo tee -a /etc/hosts;
Make sure nodes can actually talk to each other:
ping masterping node01
โก๏ธ K8s does not play nicely with SELinux, switching it off:
sudo setenforce 0sudo sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux
โก๏ธ K8s uses br_netfilter kernel module for it's internal networking. So I check if module even exists, load it, and enable on-boot load:
sudo modprobe br_netfiltersudo echo "br_netfilter" | sudo tee -a /etc/modules-load.d/br_netfilter.conf
โก๏ธ Tell the bridge module to use iptables:
echo '1' | sudo tee -a /proc/sys/net/bridge/bridge-nf-call-iptablesprintf "\nnet.bridge.bridge-nf-call-iptables = 1\nnet.ipv4.ip_forward=1\n" | sudo tee -a /etc/sysctl.confsudo sysctl -p
โก๏ธ Disable firewalld ๐ป ๐ฑ
Some out-of-the-box firewall rules are set in a way it paralyzes cluster networking. In production it totally makes sense to find what causes that and fix the issue. Since I am building a non-production environment, I don't bother investing time. So I just disable firewalld and by doing so I wipe out all firewall rules.
sudo systemctl stop firewalldsudo systemctl disable firewalldsudo systemctl mask --now firewalld# making sure there is not a single rule left, and the policy is "ACCEPT" on every chainsudo iptables -L -v -n
โก๏ธ K8s requires the swap partition to be disabled for good:
sudo swapoff -asudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab;
โก๏ธ Install and configure Docker
sudo yum install -y yum-utils device-mapper-persistent-data lvm2sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.reposudo yum install -y docker-ce# changing docker cgroupdriver to "systemd"sudo mkdir /etc/dockersudo tee /etc/docker/daemon.json > /dev/null <<'EOF'{"exec-opts": ["native.cgroupdriver=systemd"],"log-driver": "json-file","log-opts": {"max-size": "100m"},"storage-driver": "overlay2","storage-opts": ["overlay2.override_kernel_check=true"]}EOF
โก๏ธ Install kubelet, kubectl and kubeadm
sudo tee /etc/yum.repos.d/kubernetes.repo > /dev/null <<'EOF'[kubernetes]name=Kubernetesbaseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64enabled=1gpgcheck=1repo_gpgcheck=1gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpghttps://packages.cloud.google.com/yum/doc/rpm-package-key.gpgEOFsudo yum install -y kubelet kubeadm kubectl
โก๏ธ Launch Docker daemon
sudo systemctl start dockersudo systemctl enable docker
โก๏ธ Launch Kubelet daemon
Kubeled daemon in charge of doing all K8s-related stuff on this particular node.
sudo systemctl start kubeletsudo systemctl enable kubelet
Allright, hopefully no errors there, and we can proceed to more neat stuff.
Master-specific setup
The commands below should be executed on the master node only.
โก๏ธ Let kubeadm do all the heavy lifting:
sudo kubeadm init --apiserver-advertise-address=X.X.X.X --pod-network-cidr=10.244.0.0/16 --apiserver-cert-extra-sans=localhost,127.0.0.1
If everything went well, I just need to do a few more extra steps.
โก๏ธ Create a configuration file for the current user (admin)
mkdir -p ${HOME}/.kubesudo cp -i /etc/kubernetes/admin.conf ${HOME}/.kube/configsudo chown $(id -u):$(id -g) ${HOME}/.kube/config
โก๏ธ Deploy a pod network
So we have containers. A container is a containerized application written in any language (Node, Java, Go, ...) and running inside of the cluster. A pod is a group of containers, which share common resources. Pods are exposed to the outer world via services.
Yeh, sounds complicated. But this is what makes K8s quite abstracted and therefore unbiased and flexible.
So the pod network then is an implementation of K8s networking model: it enables pods talking to each other, to services, and via services - to the outer world.
There are numerous implementations out there, but I will stick to the kinda default one, which is called flannel. It is basic and proven to be working.
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Joining the nodes
Hooray! Technically nothing stops us from joining the nodes together now.
On the master node I create a new token for joining the cluster:
sudo kubeadm token create
This will get me the token of format xxxxxx.xxxxxxxxxxxxxxxx.
Worker-specific setup
The commands below should be executed on the worker node(s) only.
โก๏ธ Add a worker node to the cluster
Member that token I made? I need that one to execute the join command on each worker node:
sudo kubeadm join X.X.X.X:6443 --token xxxxxx.xxxxxxxxxxxxxxxx --discovery-token-unsafe-skip-ca-verification
From this moment on, there will be no need to do something else on the worker node, so I may end that ssh session.
Checking the status of the cluster
I can now go back to the master node and check how the cluster is doing:
kubectl get nodes
If I see something like this...
NAME STATUS ROLES AGE VERSIONnode01 Ready <none> 4h7m v1.19.3master Ready master 45h v1.19.3
... that means: "Congratulations ๐๐๐ You and I just got our own K8s clusters up and running!"
Most of the tutorials usually end here, or they maybe show how to expose an Apache container via a NodePort-type service (make a container accessible on the worker node directly via a random port greater than 1023). This obviously is not commonly-used IRL, as it defeats one of the key advantages of K8s. So instead let's go a bit further and setup ingress.
โก๏ธ Ingress
Basically, ingress tells the cluster ยซHey, for this given domain, path and protocol route all requests to the service A. For other domain, path and protocol - to the service B (and so on)ยป. So ingress works as a router for the incoming requests.
Sounds quite powerful, doesn't it? I am gonna use ingress-nginx, the default kubernetes ingress, as it is pretty simple and easily configurable:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.40.2/deploy/static/provider/baremetal/deploy.yaml
Let's check how ingress is doing:
kubectl get pods -n ingress-nginx
In a few minutes, the status should look somewhat like this:
NAME READY STATUS RESTARTS AGEingress-nginx-admission-create-7dhcj 0/1 Completed 0 22mingress-nginx-admission-patch-kcsvt 0/1 Completed 0 22mingress-nginx-controller-785557f9c9-7ldcl 1/1 Running 0 22m
โก๏ธ Load balancer
What ingress does not know, obviously, is how exactly the cluster is connected to the outer world. This task is solved by a load balancer that typically works outside of the cluster. Every cloud provider implements it's own one, and their load balancers vary from one to another. Since I don't have any cloud provider here, I need a substitution.
So there is a project called Metallb, which I am going to leverage for this matter. Please find which version is the latest. Also, sometimes they do backward-incompatible installation instruction updates, so if something does not work, feel free to check their docs out!
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.4/manifests/namespace.yamlkubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.4/manifests/metallb.yamlkubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"
I also need to provide a config file, otherwise the balancer stays dormant.
tee /tmp/metallb-conf.yml > /dev/null <<'EOF'apiVersion: v1kind: ConfigMapmetadata:namespace: metallb-systemname: configdata:config: |address-pools:- name: defaultprotocol: layer2addresses:- Y.Y.Y.YEOFkubectl apply -f /tmp/metallb-conf.yml;rm /tmp/metallb-conf.yml;
To see how the balancer is doing:
kubectl get pods -n metallb-system
It should look like this:
NAME READY STATUS RESTARTS AGEcontroller-8687cdc65-h6d5m 1/1 Running 0 14sspeaker-b4p95 1/1 Running 0 14sspeaker-xhm72 1/1 Running 0 14s
Okay, so this is pretty it. I have set the cluster up, and it is working. In the next article (which is coming soon) I will deploy an infrastructure using Terraform, with actual domains and SSL support! Stay tuned.
Troubleshooting
๐๐ When joining a new node, the process hangs at the "[preflight] Running pre-flight checks" stage forever.
Endless process or timeout error sometimes lie in the I/O plane. Please make sure that nodes can ping each other by IP and the master node has only kubelet-produced rules in iptables.
๐๐ Pods are stuck at "ContainerCreating" step
This problem may have many causes. In general, to see the events of a pod and where it stuck, a describe pods command is used:
kubectl describe pods -n ingress-nginx
If something goes totally pear-shaped, you may consider running the following piece of commands on each node to revert the cluster to pre kubectl init (join) state:
sudo kubeadm reset -fsudo rm -rf /etc/cni/net.dsudo iptables -P INPUT ACCEPTsudo iptables -P FORWARD ACCEPTsudo iptables -P OUTPUT ACCEPTsudo iptables -t nat -Fsudo iptables -t mangle -Fsudo iptables -Fsudo iptables -Xsudo systemctl restart dockersudo systemctl restart kubeletrm -rf ${HOME}/.kube
Useful links
- kubeadm - K8s's official bootstrapping tool
- kops - K8s installer on AWS/GCE/OpenStack/Digital Ocean/Spot Ocean
- kubespray - another installer
- rancher
- k8s cheatsheet
- list of pod networks in the alphabetical order
- list of ingress controllers
Sergei Gannochenko
Golang, React, TypeScript, Docker, AWS, Jamstack.
19+ years in dev.