I just finished setting up Kubernetes for my home lab, and before I forget the process I want to record my notes.
I used Ubuntu Server 18.04, but any recent version should work. After a clean install, the first thing to do is fetch the latest updates:
$ sudo apt update && sudo apt upgrade
Next I installed Juju:
$ sudo snap install juju --classic
Next, install LXD:
$ sudo snap install lxd
Configure LXD:
$ sudo /snap/bin/lxd.migrate
$ sudo usermod -a -G lxd $USER
Set up LXD:
$ sudo lxd init
These are my answers to the questions:
Would you like to use LXD clustering? no
Do you want to configure a new storage pool? yes
Name of the new storage pool: default
Name of the storage backend to use: dir
Would you like to connect to a MAAS server? no
Would you like to create a new local network bridge? yes
What should the new bridge be called? lxdbr0
What IPv4 address should be used? 10.8.8.1/24
Would you like LXD to NAT IPv4 traffic on your bridge? yes
What IPv6 address should be used? none
Would you like LXD to be available over the network? no
Would you like stale cached images to be updated automatically? yes
Would you like a YAML "lxd init" preseed to be printed? no
Use conjure-up.
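Incidentally, those answers can also be replayed non-interactively: the last question offers to print a YAML preseed, and lxd init accepts one on stdin (lxd init --preseed). A sketch of what mine would roughly look like; the exact field names vary between LXD releases, so treat this as an assumption and prefer the preseed that your own lxd init prints:

```yaml
# Approximate preseed matching the answers above.
# Schema details differ between LXD releases; verify with
# "lxd init" answering yes to the final preseed question.
networks:
- name: lxdbr0
  type: bridge
  config:
    ipv4.address: 10.8.8.1/24
    ipv4.nat: "true"
    ipv6.address: none
storage_pools:
- name: default
  driver: dir
profiles:
- name: default
  devices:
    root:
      path: /
      pool: default
      type: disk
    eth0:
      name: eth0
      nictype: bridged
      parent: lxdbr0
      type: nic
```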
This is the selection screen when using conjure-up. There are a couple of "spells" available. The one I selected was the Charmed Distribution of Kubernetes.
The option "with nVidia GPU workers" did not allow me to deploy to LXD.
I do not want the add-ons. I can always install them in the future with juju if I want them.
I had a lot of trouble with "localhost". It was not clear to me that I could not use ZFS as the storage backend for LXD. You must pick dir.
It was also not clear that IPv6 must be disabled on the LXD network stack. When you run lxd init and it asks:
What IPv6 address should be used?
none is the correct answer.
Picking the LXD network bridge and storage pool.
As long as the storage pool backend is dir, you should be okay.
I picked the default option.
At this point, most people can hit the deploy button, unless you have an nVidia GPU in your system...
Navigate down to kubernetes-master and hit enter on configure
Open Advanced Configuration
Find enable-nvidia-plugin and give it the value false.
When this is finished, hit APPLY CHANGES.
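For what it's worth, the same charm option can also be changed from the command line once the model exists, instead of through the conjure-up UI. This assumes the application is named kubernetes-master, as it is in the CDK spell:

```shell
# Flip the nVidia plugin option on the kubernetes-master charm.
# (Assumes a deployed model with an application named kubernetes-master.)
juju config kubernetes-master enable-nvidia-plugin=false
```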
My internet connection is slow, so the deploy process took about 2 hours. For some reason, conjure-up and friends used up 28GB of memory during the deploy process and then halted on "waiting for 6 kube-system pods to start". A quick search on Google led me to a solution by someone who goes by the alias of snooksy.
To keep everything in one place, I will also document the solution here:
Once conjure-up gets stuck, open a new shell and try the following:
$ juju whoami
Controller: conjure-up-localhost-2f0
Model: conjure-charmed-kubernet-d97
User: admin
Notice the model in use: conjure-charmed-kubernet-d97.
$ lxc profile list
+---------------------------------------------------------+---------+
|                          NAME                           | USED BY |
+---------------------------------------------------------+---------+
| default                                                 | 7       |
+---------------------------------------------------------+---------+
| juju-conjure-charmed-kubernet-d97                       | 6       |
+---------------------------------------------------------+---------+
| juju-conjure-charmed-kubernet-d97-kubernetes-master-747 | 1       |
+---------------------------------------------------------+---------+
| juju-controller                                         | 1       |
+---------------------------------------------------------+---------+
There is a profile that matches the current model that juju is working with.
We want to edit this.
$ lxc profile show juju-conjure-charmed-kubernet-d97
You should see something like this, but missing aadisable2.
$ lxc profile edit juju-conjure-charmed-kubernet-d97
Hopefully you are familiar with Vi.
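For reference, the change is adding a device entry to that profile. If I recall correctly, the missing aadisable2 entry maps /dev/kmsg into the containers so kubelet can read the kernel log; treat the exact path and device type here as an assumption and double-check against snooksy's write-up and your LXD version:

```yaml
# Assumed shape of the missing device entry in the juju profile;
# verify the path and type against your LXD/CDK version.
devices:
  aadisable2:
    path: /dev/kmsg
    source: /dev/kmsg
    type: unix-char
```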
Once you have made the change, just keep waiting.
There is no need to restart the lxc containers.
Restarting them actually ends up killing conjure-up.
Now that the cluster is up and running, how do I access it?
$ juju scp kubernetes-master/0:config ~/.kube/config
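With the config copied into place, the cluster can be checked from the same machine. This assumes kubectl is installed locally (here via snap, though any install method works) and that ~/.kube/config is its default kubeconfig location:

```shell
# Install kubectl if it is not already present (assumes snap is available).
sudo snap install kubectl --classic

# ~/.kube/config is kubectl's default kubeconfig, so these
# should now talk to the new cluster.
kubectl cluster-info
kubectl get nodes
```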