I just finished setting up Kubernetes for my home lab, and before I forget the process I want to record my notes.
I used Ubuntu Server 18.04, but any recent version should work. After a clean install, the first thing to do is fetch the latest updates.
sudo apt update && sudo apt upgrade
Next I installed Juju:
sudo snap install juju --classic
Then install LXD:
sudo snap install lxd
Configure LXD: migrate any existing deb-based LXD data to the snap, and add your user to the lxd group (log out and back in for the group change to take effect).
sudo /snap/bin/lxd.migrate
sudo usermod -a -G lxd $USER
Set up LXD:
sudo lxd init
These are my answers to the questions:
Would you like to use LXD clustering? no
Do you want to configure a new storage pool? yes
Name of the new storage pool: default
Name of the storage backend to use: dir
Would you like to connect to a MAAS server? no
Would you like to create a new local network bridge? yes
What should the new bridge be called? lxdbr0
What IPv4 address should be used? 10.8.8.1/24
Would you like LXD to NAT IPv4 traffic on your bridge? yes
What IPv6 address should be used? none
Would you like LXD to be available over the network? no
Would you like stale cached images to be updated automatically? yes
Would you like a YAML "lxd init" preseed to be printed? no
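As an aside, the same answers can be replayed non-interactively with lxd init --preseed. The YAML below is a sketch of what those answers correspond to (if you answer yes to the final question above, lxd init prints the authoritative version for your system, so compare against that):

```yaml
config: {}
networks:
- name: lxdbr0
  type: bridge
  config:
    ipv4.address: 10.8.8.1/24
    ipv4.nat: "true"
    ipv6.address: none
storage_pools:
- name: default
  driver: dir
  config: {}
profiles:
- name: default
  devices:
    root:
      path: /
      pool: default
      type: disk
    eth0:
      name: eth0
      nictype: bridged
      parent: lxdbr0
      type: nic
```

Feed it back in with: sudo lxd init --preseed < preseed.yaml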
Use conjure-up. If it is not installed yet, it is also available as a classic snap:
sudo snap install conjure-up --classic
conjure-up
This is the selection screen when using conjure-up. There are a couple of "spells" available; the one I selected was Charmed Distribution of Kubernetes.
The option "with nVidia GPU workers" did not allow me to deploy to LXD.
I did not want the add-ons; I can always install them later with Juju if I need them.
I had a lot of trouble with the "localhost" cloud. It was not clear to me that ZFS cannot be used as the storage backend for LXD here; you must pick dir.
It was also not clear that IPv6 must be disabled on the LXD network stack. When lxd init asks
What IPv6 address should be used?
none is the correct answer.
Picking the LXD network bridge and storage pool: as long as the storage pool backend is dir, you should be okay. I picked the default option.
At this point, most people can hit the deploy button, unless you have an nVidia GPU in your system...
Navigate down to kubernetes-master and hit enter on Configure. Open Advanced Configuration, find enable-nvidia-plugin, and give it the value false. When this is finished, hit APPLY CHANGES.
My internet connection is slow, so the deploy process took about 2 hours. For some reason, conjure-up and friends used up 28GB of memory during the deploy process and then halted on "waiting for 6 kube-system pods to start". A quick search on Google led me to a solution by someone who goes by the alias of snooksy.
To keep everything in one place, I will also document the solution here:
Once conjure-up gets stuck, open a new shell and try the following:
$ juju whoami
Controller: conjure-up-localhost-2f0
Model: conjure-charmed-kubernet-d97
User: admin
Notice the model in use: conjure-charmed-kubernet-d97.
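If you want to grab the model name programmatically (handy for scripting the later steps), something like this works. The juju whoami output is simulated here so the snippet is self-contained; on a live system, pipe the real command instead:

```shell
# Simulated output; on a live system use: juju whoami
whoami_output='Controller: conjure-up-localhost-2f0
Model: conjure-charmed-kubernet-d97
User: admin'

# Pull out the model name and build the matching LXD profile name
model=$(printf '%s\n' "$whoami_output" | awk '/^Model:/ {print $2}')
profile="juju-$model"
echo "$profile"   # juju-conjure-charmed-kubernet-d97
```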
$ lxc profile list
+---------------------------------------------------------+---------+
|                           NAME                          | USED BY |
+---------------------------------------------------------+---------+
| default                                                 | 7       |
+---------------------------------------------------------+---------+
| juju-conjure-charmed-kubernet-d97                       | 6       |
+---------------------------------------------------------+---------+
| juju-conjure-charmed-kubernet-d97-kubernetes-master-747 | 1       |
+---------------------------------------------------------+---------+
| juju-controller                                         | 1       |
+---------------------------------------------------------+---------+
There is a profile that matches the current model that juju is working with.
We want to edit this.
$ lxc profile show juju-conjure-charmed-kubernet-d97
You should see the profile YAML; in the stuck state it will be missing the aadisable2 device.
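For reference, the workaround I followed adds a device named aadisable2 that mounts /dev/kmsg into the container. Treat the fragment below as a sketch and compare it against the aadisable/aadisable1 entries already present in your profile:

```yaml
devices:
  aadisable2:
    path: /dev/kmsg
    source: /dev/kmsg
    type: unix-char
```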
$ lxc profile edit juju-conjure-charmed-kubernet-d97
Hopefully you are familiar with Vi.
Once you have made the change, just keep waiting.
There is no need to restart the LXC containers; restarting them actually ends up killing conjure-up.
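While waiting, you can watch the deployment converge from another shell (assumes juju is on your PATH, as installed above); every workload should eventually reach the active/idle state:

```shell
# Refresh `juju status` every 10 seconds, keeping color output
watch -c -n 10 juju status --color
```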
Now that the cluster is up and running, how do I access it? Copy the kubeconfig from the master (creating ~/.kube first if it does not exist):
mkdir -p ~/.kube
juju scp kubernetes-master/0:config ~/.kube/config
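With the config in place, kubectl should be able to reach the cluster. The steps above do not install kubectl; the snap is one way to get it, and then a quick sanity check looks like this:

```shell
sudo snap install kubectl --classic

# Both commands read ~/.kube/config by default
kubectl cluster-info
kubectl get nodes
```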