I used Ubuntu 20.04 on all the servers. This manual is also suitable for a bare-metal setup.
In this manual we will use 7 VMs in the same 10.0.0.0/24 network:
- Kubeapi Load Balancer with HAProxy (10.0.0.90)
- 3x master nodes (10.0.0.71-10.0.0.73)
- 3x worker nodes (10.0.0.81-83)
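If you want the nodes to resolve each other by name, a small helper like the following can print matching /etc/hosts lines. This is only a sketch: master01..master03 are the names used later in this manual, while kubeapi-lb and worker01..worker03 are names I made up for illustration.

```shell
# Print /etc/hosts lines for the seven VMs listed above.
# master01..03 match the names used later in this manual;
# "kubeapi-lb" and worker01..03 are assumed (hypothetical) names.
cluster_hosts() {
  echo "10.0.0.90 kubeapi-lb"
  for i in 1 2 3; do
    echo "10.0.0.7$i master0$i"
  done
  for i in 1 2 3; do
    echo "10.0.0.8$i worker0$i"
  done
}
```

Run cluster_hosts to review the lines, and cluster_hosts | sudo tee -a /etc/hosts on each VM to append them.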
The quickest way to create seven VMs is to prepare one and clone it (or its disk image) before the cluster setup. Once the OS (Ubuntu/Debian) is installed, update and upgrade the packages:
sudo apt update && sudo apt upgrade
If you set up one VM and clone it, each clone needs a new machine ID and new host SSH keys (run the following as root):
rm -f /etc/machine-id
dbus-uuidgen --ensure=/etc/machine-id
rm /var/lib/dbus/machine-id
dbus-uuidgen --ensure
/bin/rm -v /etc/ssh/ssh_host_*
dpkg-reconfigure openssh-server
systemctl restart ssh
reboot
Installing CA and creating certificates
We will use the CFSSL tool from Cloudflare:
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
chmod +x cfssl*
sudo mv cfssl_linux-amd64 /usr/local/bin/cfssl
sudo mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
Check:
cfssl version
Installing HAProxy as kubeapi load balancer
On the kubeapi load balancer node (10.0.0.90), run:
sudo apt install haproxy
Then edit the configuration in /etc/haproxy/haproxy.cfg:
global
log /dev/log local0
log /dev/log local1 notice
chroot /var/lib/haproxy
stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
stats timeout 30s
user haproxy
group haproxy
daemon
# Default SSL material locations
ca-base /etc/ssl/certs
crt-base /etc/ssl/private
# See: https://ssl-config.mozilla.org/#server=haproxy&server-version=2.0.3&config=intermediate
ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets
defaults
log global
mode http
option httplog
option dontlognull
timeout connect 5000
timeout client 50000
timeout server 50000
errorfile 400 /etc/haproxy/errors/400.http
errorfile 403 /etc/haproxy/errors/403.http
errorfile 408 /etc/haproxy/errors/408.http
errorfile 500 /etc/haproxy/errors/500.http
errorfile 502 /etc/haproxy/errors/502.http
errorfile 503 /etc/haproxy/errors/503.http
errorfile 504 /etc/haproxy/errors/504.http
# PART BEFORE IS NOT CHANGED, ONLY THE FOLLOWING ADDED:
frontend kubernetes
bind 10.0.0.90:6443
option tcplog
mode tcp
default_backend kubernetes-master-nodes
backend kubernetes-master-nodes
mode tcp
balance roundrobin
option tcp-check
server master01 10.0.0.71:6443 check fall 3 rise 2
server master02 10.0.0.72:6443 check fall 3 rise 2
server master03 10.0.0.73:6443 check fall 3 rise 2
Apply the configuration:
sudo systemctl restart haproxy
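A typo in haproxy.cfg takes the load balancer down on restart, so it can help to validate the file first. The sketch below relies on HAProxy's check mode (haproxy -c -f); the function name safe_haproxy_restart is my own, not something HAProxy provides.

```shell
# Restart HAProxy only if the configuration file parses cleanly.
# $1: path to the config (defaults to /etc/haproxy/haproxy.cfg)
# safe_haproxy_restart is a hypothetical helper name.
safe_haproxy_restart() {
  cfg="${1:-/etc/haproxy/haproxy.cfg}"
  if haproxy -c -f "$cfg"; then
    sudo systemctl restart haproxy
  else
    echo "haproxy.cfg has errors, not restarting" >&2
    return 1
  fi
}
```

Call safe_haproxy_restart with no arguments to check and restart with the default config path.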
Creating a CA and generating the certificates
Create a file for the certification authority configuration ca-config.json with the following content:
{
"signing": {
"default": {
"expiry": "8760h"
},
"profiles": {
"kubernetes": {
"usages": ["signing", "key encipherment", "server auth", "client auth"],
"expiry": "8760h"
}
}
}
}
Create the certificate authority signing request file ca-csr.json with the following content:
{
"CN": "Kubernetes",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "IN",
"L": "Ukraine",
"O": "Organization",
"OU": "CA",
"ST": "State"
}
]
}
Generate the certificate authority certificate and private key:
cfssl gencert -initca ca-csr.json | cfssljson -bare ca
Check if the files appeared in the current directory.
Next, generate the certificate for the Etcd cluster. Create the certificate signing request configuration file kubernetes-csr.json with the following content:
{
"CN": "Kubernetes",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "IN",
"L": "Ukraine",
"O": "Organization",
"OU": "CA",
"ST": "State"
}
]
}
Then generate the certificate and private key. In the -hostname option, list the IPs of the HAProxy kubeapi load balancer and of the master nodes:
cfssl gencert \
-ca=ca.pem \
-ca-key=ca-key.pem \
-config=ca-config.json \
-hostname=10.0.0.90,10.0.0.71,10.0.0.72,10.0.0.73,127.0.0.1,kubernetes.default \
-profile=kubernetes kubernetes-csr.json | \
cfssljson -bare kubernetes
Check if the files appeared.
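A missing IP in the -hostname list usually only shows up later as a TLS error, so it may be worth checking the SANs right away. The helper below is a sketch: it reads the text that openssl x509 -noout -text prints and lists the expected IPs that are absent. The name missing_ip_sans is made up for this example.

```shell
# List expected IP SANs that are absent from a certificate.
# stdin: text output of `openssl x509 -noout -text`
# args:  the IP addresses that must be present
# missing_ip_sans is a hypothetical helper name.
missing_ip_sans() {
  # Split the comma-separated SAN list into one trimmed entry per line.
  sans="$(tr ',' '\n' | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')"
  for ip in "$@"; do
    # Exact whole-line match, so 10.0.0.7 does not match 10.0.0.71.
    printf '%s\n' "$sans" | grep -Fxq "IP Address:$ip" || echo "$ip"
  done
}
```

Usage: openssl x509 -in kubernetes.pem -noout -text | missing_ip_sans 10.0.0.90 10.0.0.71 10.0.0.72 10.0.0.73 127.0.0.1. No output means every listed IP is in the certificate.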
Copy certificates ca.pem, kubernetes.pem and kubernetes-key.pem to each node, for example to the /home/user directory.
Preparing master nodes and setting up required packages
On all the master nodes:
sudo apt-get remove docker docker-engine docker.io containerd runc
sudo apt-get install -y \
apt-transport-https \
ca-certificates \
curl \
gnupg-agent \
software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
sudo apt-get install -y docker-ce docker-ce-cli containerd.io
sudo usermod -aG docker ubuntu
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
sudo mkdir -p /etc/apt/keyrings/
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
sudo swapoff -a
Then remove the swap entry from /etc/fstab so that swap stays off after a reboot.
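Instead of editing /etc/fstab by hand, the swap lines can be commented out with sed. This is a sketch that assumes the usual fstab layout, where the filesystem type column reads swap. It keeps a .bak copy next to the file; the function name is my own.

```shell
# Comment out every active swap entry in an fstab-style file.
# $1: file to edit (defaults to /etc/fstab); a .bak backup is kept.
# disable_swap_in_fstab is a hypothetical helper name.
disable_swap_in_fstab() {
  fstab="${1:-/etc/fstab}"
  # Match lines that are not already comments and contain a
  # whitespace-delimited "swap" field, then prefix them with '#'.
  sed -i.bak -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab"
}
```

Run it as root (for example from a sudo -i shell), then compare /etc/fstab with /etc/fstab.bak to confirm that only the swap line changed.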
Setting up the highly available Etcd cluster
On the master nodes:
sudo mkdir /etc/etcd /var/lib/etcd
sudo mv ~/ca.pem ~/kubernetes.pem ~/kubernetes-key.pem /etc/etcd
wget https://github.com/etcd-io/etcd/releases/download/v3.4.13/etcd-v3.4.13-linux-amd64.tar.gz
tar xvzf etcd-v3.4.13-linux-amd64.tar.gz
sudo mv etcd-v3.4.13-linux-amd64/etcd* /usr/local/bin/
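Create the Etcd systemd unit file /etc/systemd/system/etcd.service. The example is written for master01 (10.0.0.71); on the other master nodes, replace 10.0.0.71 in --name and in the listen and advertise URLs with that node's own IP, but keep the --initial-cluster list unchanged: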
[Unit]
Description=etcd
Documentation=https://github.com/coreos
[Service]
ExecStart=/usr/local/bin/etcd \
--name 10.0.0.71 \
--cert-file=/etc/etcd/kubernetes.pem \
--key-file=/etc/etcd/kubernetes-key.pem \
--peer-cert-file=/etc/etcd/kubernetes.pem \
--peer-key-file=/etc/etcd/kubernetes-key.pem \
--trusted-ca-file=/etc/etcd/ca.pem \
--peer-trusted-ca-file=/etc/etcd/ca.pem \
--peer-client-cert-auth \
--client-cert-auth \
--initial-advertise-peer-urls https://10.0.0.71:2380 \
--listen-peer-urls https://10.0.0.71:2380 \
--listen-client-urls https://10.0.0.71:2379,http://127.0.0.1:2379 \
--advertise-client-urls https://10.0.0.71:2379 \
--initial-cluster-token etcd-cluster-0 \
--initial-cluster 10.0.0.71=https://10.0.0.71:2380,10.0.0.72=https://10.0.0.72:2380,10.0.0.73=https://10.0.0.73:2380 \
--initial-cluster-state new \
--data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
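The unit file above is written for 10.0.0.71, and five of its flags carry the node's own IP. To avoid copy-paste mistakes on master02 and master03, a sketch like this prints those five lines for a given IP. The function name etcd_node_flags is made up for this example, and the --initial-cluster line is deliberately left out because it is identical on every node.

```shell
# Print the etcd flags that differ per node, for the node IP given as $1.
# Each line ends with a backslash so it can be pasted into ExecStart.
# etcd_node_flags is a hypothetical helper name.
etcd_node_flags() {
  ip="$1"
  printf '  --name %s \\\n' "$ip"
  printf '  --initial-advertise-peer-urls https://%s:2380 \\\n' "$ip"
  printf '  --listen-peer-urls https://%s:2380 \\\n' "$ip"
  printf '  --listen-client-urls https://%s:2379,http://127.0.0.1:2379 \\\n' "$ip"
  printf '  --advertise-client-urls https://%s:2379 \\\n' "$ip"
}
```

For example, etcd_node_flags 10.0.0.72 prints the lines to use in the unit file on master02.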
Reload the daemon configuration, enable and start Etcd:
sudo systemctl daemon-reload
sudo systemctl enable etcd
sudo systemctl start etcd
Check it is running and clustered:
ETCDCTL_API=3 etcdctl member list
The output should look like this:
25e141cd5630aead, started, 10.0.0.73, https://10.0.0.73:2380, https://10.0.0.73:2379, false
4e7196ec0c691986, started, 10.0.0.71, https://10.0.0.71:2380, https://10.0.0.71:2379, false
a02a3223c1119bbf, started, 10.0.0.72, https://10.0.0.72:2380, https://10.0.0.72:2379, false
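The member list above goes through the plain local listener on 127.0.0.1. Since kubeadm will talk to Etcd over TLS, it may be worth checking the HTTPS endpoints with the same certificates as well. The following is a sketch: the function names are my own, and the certificate paths are the ones used in this manual.

```shell
# Join node IPs into a comma-separated list of HTTPS client endpoints.
# etcd_endpoints and etcd_health are hypothetical helper names.
etcd_endpoints() {
  out=""
  for ip in "$@"; do
    out="${out:+$out,}https://$ip:2379"
  done
  printf '%s\n' "$out"
}

# Ask every listed endpoint for its health over TLS.
etcd_health() {
  ETCDCTL_API=3 etcdctl \
    --endpoints="$(etcd_endpoints "$@")" \
    --cacert=/etc/etcd/ca.pem \
    --cert=/etc/etcd/kubernetes.pem \
    --key=/etc/etcd/kubernetes-key.pem \
    endpoint health
}
```

Usage on any master node: etcd_health 10.0.0.71 10.0.0.72 10.0.0.73.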
Initialization of the master nodes
On the master nodes create a file config.yaml with the following content:
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.29.3
controlPlaneEndpoint: "10.0.0.90:6443"
etcd:
external:
endpoints:
- https://10.0.0.71:2379
- https://10.0.0.72:2379
- https://10.0.0.73:2379
caFile: /etc/etcd/ca.pem
certFile: /etc/etcd/kubernetes.pem
keyFile: /etc/etcd/kubernetes-key.pem
networking:
podSubnet: 10.30.0.0/24
apiServer:
certSANs:
- "10.0.0.90"
extraArgs:
apiserver-count: "3"
Here 10.0.0.7x are our master nodes, 10.0.0.90 is the HAProxy kubeapi load balancer (used as the control-plane endpoint and listed in the API server certificate SANs), and 10.30.0.0/24 is the subnet for pods.
Copy the config.yaml file to the other master nodes.
Initialise kubeadm on the first master node (10.0.0.71):
sudo kubeadm init --config=config.yaml
If the preflight check fails with the following message:
[init] Using Kubernetes version: v1.29.3
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR CRI]: container runtime is not running: output: time="2024-03-25T17:44:49Z" level=fatal msg="validate service connection: validate CRI v1 runtime API for endpoint \"unix:///var/run/containerd/containerd.sock\": rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
remove /etc/containerd/config.toml and restart containerd.service:
sudo rm /etc/containerd/config.toml
sudo systemctl restart containerd
It may also fail if firewall restrictions are in place. If firewalld is in use, open the required ports:
sudo firewall-cmd --permanent --add-port=6443/tcp
sudo firewall-cmd --permanent --add-port=2379-2380/tcp
sudo firewall-cmd --permanent --add-port=10250/tcp
sudo firewall-cmd --permanent --add-port=10251/tcp
sudo firewall-cmd --permanent --add-port=10252/tcp
sudo firewall-cmd --permanent --add-port=10255/tcp
sudo firewall-cmd --reload
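On a stock Ubuntu install, firewall-cmd is usually not present and ufw is the default firewall front end. Assuming ufw is what you use, the sketch below prints (without running them) rules for the same ports as the firewall-cmd lines; the function name ufw_rules is my own.

```shell
# Print (not run) ufw rules for the same ports as the firewall-cmd lines.
# ufw writes port ranges with a colon, so 2379-2380 becomes 2379:2380.
# ufw_rules is a hypothetical helper name.
ufw_rules() {
  for port in 6443 2379:2380 10250 10251 10252 10255; do
    echo "sudo ufw allow ${port}/tcp"
  done
}
```

Run ufw_rules to review the commands, and ufw_rules | sh to apply them.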
Once all the preflight checks pass, run kubeadm init again as above; the initialization should now complete.
Then copy the certificates in /etc/kubernetes/pki from the first master node (10.0.0.71) to the other two (replace user with your SSH account on those nodes):
sudo scp -r /etc/kubernetes/pki user@10.0.0.72:~
sudo scp -r /etc/kubernetes/pki user@10.0.0.73:~
On the second (master02, 10.0.0.72) and third (master03, 10.0.0.73) master nodes remove apiserver.crt and apiserver.key:
rm ./pki/apiserver.crt ./pki/apiserver.key
and move the certificates to the /etc/kubernetes directory:
sudo mv ~/pki /etc/kubernetes/
Initialize the remaining master nodes (master02 and master03) the same way, using the config.yaml file copied earlier from the first master node (master01):
sudo kubeadm init --config=config.yaml
After each master is initialized you will see the following message:
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:
kubeadm join 10.0.0.90:6443 --token kzw57k.x5mitlly89nhobrd \
--discovery-token-ca-cert-hash sha256:d7f3b00c8b71c39f81d92227f32217c24d26f0f1e821c5151387be49ff7d1e05 \
--control-plane
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.0.0.90:6443 --token kzw57k.x5mitlly89nhobrd \
--discovery-token-ca-cert-hash sha256:d7f3b00c8b71c39f81d92227f32217c24d26f0f1e821c5151387be49ff7d1e05
Save the part that describes how to join nodes to the cluster.
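If you saved the whole kubeadm init output to a file, the worker join command can be rebuilt from it. This is a rough sketch under that assumption: the function name is made up, and the default endpoint is the load balancer address used in this manual.

```shell
# Rebuild the worker join command from saved `kubeadm init` output.
# $1: file holding the saved output
# $2: API endpoint (defaults to the load balancer used in this manual)
# join_command_from_log is a hypothetical helper name.
join_command_from_log() {
  log="$1"
  endpoint="${2:-10.0.0.90:6443}"
  # Bootstrap tokens have the form <6 chars>.<16 chars>, lowercase alphanumeric.
  token="$(grep -Eo -- '--token [a-z0-9]{6}\.[a-z0-9]{16}' "$log" | head -n 1 | cut -d' ' -f2)"
  hash="$(grep -Eo 'sha256:[a-f0-9]{64}' "$log" | head -n 1)"
  [ -n "$token" ] && [ -n "$hash" ] || return 1
  echo "kubeadm join $endpoint --token $token --discovery-token-ca-cert-hash $hash"
}
```

Bootstrap tokens expire (after 24 hours by default). If the saved one no longer works, sudo kubeadm token create --print-join-command on a master node should print a fresh join command.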
The following command should list all three masters:
kubectl get nodes
It should look like this:
NAME STATUS ROLES AGE VERSION
master01 NotReady control-plane 66m v1.29.1
master02 NotReady control-plane 46m v1.29.3
master03 NotReady control-plane 44m v1.29.3
and after some time (the nodes stay NotReady until a pod network is deployed, see the CNI section further down):
NAME STATUS ROLES AGE VERSION
master01 Ready control-plane 3h15m v1.29.1
master02 Ready control-plane 175m v1.29.3
master03 Ready control-plane 173m v1.29.3
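When scripting the setup, it can be handy to pick out the nodes that are not ready yet instead of reading the table by eye. A minimal sketch, assuming the default column layout shown above (the function name is my own):

```shell
# Print the names of nodes whose STATUS column is not exactly "Ready".
# stdin: the table printed by `kubectl get nodes`
# not_ready_nodes is a hypothetical helper name.
not_ready_nodes() {
  # Skip the header row, then compare the second column.
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}
```

Usage: kubectl get nodes | not_ready_nodes. An empty result means every node reports Ready.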
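To initialize the worker nodes, run on each of them the join command you saved from the master initialization output (the worker nodes need the same container runtime, kubelet and kubeadm packages as the master nodes):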
kubeadm join 10.0.0.90:6443 --token kzw57k.x5mitlly89nhobrd \
--discovery-token-ca-cert-hash sha256:d7f3b00c8b71c39f81d92227f32217c24d26f0f1e821c5151387be49ff7d1e05
Then check that the worker nodes appear in the output of kubectl get nodes, just as the master nodes did.
Deploying the CNI to have the overlay network
In this example we will use Calico, but you are free to use any of the existing ones.
Download and install the manifest:
curl https://calico-v3-25.netlify.app/archive/v3.25/manifests/calico.yaml -O
kubectl apply -f calico.yaml
kubectl get pods -n kube-system
Setting up the ingress controller
We will use the nginx ingress controller, but you are free to use any of the existing ones.
git clone https://github.com/nginxinc/kubernetes-ingress.git --branch v3.4.3
cd kubernetes-ingress/
kubectl apply -f deployments/common/ns-and-sa.yaml
kubectl apply -f deployments/rbac/rbac.yaml
kubectl apply -f examples/shared-examples/default-server-secret/default-server-secret.yaml
kubectl apply -f deployments/common/nginx-config.yaml
kubectl apply -f deployments/common/ingress-class.yaml
kubectl apply -f deployments/daemon-set/nginx-ingress.yaml
kubectl get pods --namespace=nginx-ingress
That’s all folks!