Virtual Thoughts

Virtualisation, Storage and various other ramblings.

Category: Cloud (page 1 of 5)

On-prem K8s clusters with Rancher, Terraform and Ubuntu

One of the attractive characteristics of Kubernetes is how it can run pretty much anywhere – in the cloud, in the data center, on the edge, on your local machine and much more. Leveraging existing investments in datacenter resources can be logical when deciding where to place new Kubernetes clusters, and this post goes into automating this with Rancher and Terraform.


For this exercise the following is leveraged:

  • Rancher 2.3
  • vSphere 6.7
  • Ubuntu 18.04 LTS

An Ubuntu VM will be created and configured into a template to spin up Kubernetes nodes.

Step 1 – Preparing a Ubuntu Server VM

In Rancher 2.3 Node templates for vSphere can leverage either of the following:

For the purposes of this demo, "Deploy from template" will be used, given its simplicity.

To create a new VM template, we must first create a VM. Right-click an appropriate object in vCenter and select "New Virtual Machine"

Select a source:

Give it a name:

Give it a home (compute):

Give it a home (storage):

Specify the VM hardware version:

Specify the guest OS:

Configure the VM properties, ensure the Ubuntu install CD is mounted:

After this, power up the VM and walk through the install steps. After which it can be turned into a template:

Rancher doesn’t have much in the way of requirements for the VM. For this install method a VM needs to have:

  • Cloud-Init (Installed by default on Ubuntu 18.04).
  • SSH connectivity (Rancher will provide its own SSH certificates as per Cloud-Init bootstrap) – Ensure SSH server has been installed.

A Note on Cloud-Init

For Vanilla Ubuntu Server installs, it uses Cloud-Init as part of the general Installation process. As such, cloud-init can not be re-invoked on startup by default. To get around this for templating purposes, the VM must be void of the existing cloud-init configuration prior to being turned into a template. To accomplish this, run the following:

sudo rm -rf /var/lib/cloud/instances

Before shutting down the VM and converting it into a template.

Constructing the Terraform Script

Now the VM template has been created it can be leveraged by a Terraform script:

Specify the provider: (Note – insecure = "true" Is required for vCenter servers leveraging an untrusted certificate, such as self-signed.

provider "rancher2" {
  api_url    = ""
  access_key = #ommited - reference a Terraform varaible/environment variable/secret/etc
  secret_key = #ommited - reference a Terraform varaible/environment variable/secret/etc
  insecure = "true"

Specify the Cloud Credentials:

# Create a new rancher2 Cloud Credential
resource "rancher2_cloud_credential" "vsphere-terraform" {
  name = "vsphere-terraform"
  description = "Terraform Credentials"
  vsphere_credential_config {
    username = "Terraform@vsphere.local"
    password = #ommited - reference a Terraform varaible/environment variable/secret/etc
    vcenter = ""

Specify the Node Template settings:

Note we can supply extra cloud-config options to further customise the VM, including adding additional SSH keys for users.

resource "rancher2_node_template" "vSphereTestTemplate" {
  name = "vSphereTestTemplate"
  description = "Created by Terraform"
  cloud_credential_id =
   vsphere_config {
   cfgparam = ["disk.enableUUID=TRUE"]
   clone_from = "/Homelab/vm/Ubuntu1804WithCloudInit"
   cloud_config = "#cloud-config\nusers:\n  - name: demo\n    ssh-authorized-keys:\n      - ssh-rsa [SomeKey]
   cpu_count = "4"
   creation_type = "template"
   disk_size = "20000"
   memory_size = "4096"
   datastore = "/Homelab/datastore/NFS-500"
   datacenter = "/Homelab"
   pool = "/Homelab/host/MGMT/Resources"
   network = ["/Homelab/network/VDS-MGMT-DEFAULT"]

Specify the cluster settings:

resource "rancher2_cluster" "vsphere-test" {
  name = "vsphere-test"
  description = "Terraform created vSphere Cluster"
  rke_config {
    network {
      plugin = "canal"

Specify the Node Pool:

resource "rancher2_node_pool" "nodepool" {

  cluster_id =
  name = "all-in-one"
  hostname_prefix =  "vsphere-cluster-0"
  node_template_id =
  quantity = 1
  control_plane = true
  etcd = true
  worker = true

After which the script can be executed.

What’s going on?

From a high level the following activities are being executed:

  1. Rancher requests VM’s from vSphere using supplied Cloud Credentials.
  2. vSphere clones the VM Templateeverywhere with the specified configuration parameters.
  3. An ISO image is mounted to the VM, which contains certificates and configuration generated by Rancher in the cloud-init format.
  4. Cloud-Init on startup reads this ISO image and applies the configuration.
  5. Rancher builds the Kubernetes cluster by Installing Docker and pulling down the images.

After which, a shiny new cluster will be created!

Pi-Hole and K8s v2 – Now with DNS over HTTPS

In a previous post, I went through the process of configuring Pi-Hole within a Kubernetes cluster for the purpose of facilitating a network-wide adblocking. Although helpful, I wanted to augment this with DNS over HTTPS.

Complete manifests can be found here. Shout out to visibilityspots for the cloudflared image on Dockerhub


DNS, as a protocol, is insecure and can be prone to manipulation and man-in-the-middle attacks. DNS over HTTPS helps address this by encrypting the data between the DNS over HTTPS client and the DNS over HTTPS-based DNS resolver. One of which is provided by Cloudflare.

Thankfully, Pi-Hole has some documentation on how to implement this for the traditional Pi-Hole setups. But for Kubernetes-based deployments, this requires a different approach.


The DNS over HTTPS client is facilitated by a Cloudflare daemon. In a traditional Pi-Hole setup this is simply run alongside Pi-Hole itself, but in a containerised environment there are predominantly two ways to address this:

As a separate microservice

This approach leverages two different deployments, one for the Pi-Hole service, and one for cloudflared. While workable, I felt that this was a less optimal approach

Service-To-Service communication between Pi-Hole and Cloudflared

As another container within the Pi-Hole pod

Given the tight relationship between these containers, and the fact their respective services run on different ports, this seems like a more efficient approach.

Intra-pod communication between Pi-Hole and CloudflareD

As the containers share the same network interface, one pod can access the other over either the veth interface, or simply the localhost address. For Pi-Hole, we can facilitate this via a configmap change:

apiVersion: v1
kind: ConfigMap
  name: pihole-env
  namespace: pihole-system


Once the respective manifest files have been deployed and clients are pointing to pi-hole as a DNS resolver, it can be tested by accessing As per the example below, DNS over HTTPS has been identified.

Creating a highly available, cross AZ, loadbalanced ETCD cluster in AWS with Terraform

Having experimented with Terraform recently, I decided to leverage this tool by creating an etcd cluster in AWS. This blog post goes through the steps used to accomplish this. For readability, I’ve only quoted pertinent code snippets, but all of the code can be found at


I do not profess to be an Etcd, Terraform or AWS expert, therefore he be dragons in the form of implementations unlikely to be best practice or production-ready. In particular, I would like to revisit this at some point and enhance it to include:

  • Securing all Etcd communication with generated certs.
  • Implement a mechanism for rotating members in/out of the cluster.
  • Leverage auto-scaling groups for both bastion and Etcd members.
  • Locking down the security groups.
  • ….and a  lot more.



The basic principles of this implementation are as follows:

  • Etcd members will reside in private subnets and accessed via an internal load-balancer.
  • Etcd members will be joined to a cluster by leveraging the Etcd discovery URL.
  • A bastion host will be used to proxy all access to the private subnets.
  • The Terraform script creates all the components, at the VPC level and beyond.

The scripts are separated into the following files:

├── deployetcd.tpl
├── terraformec2.pem
├── terraform.tfstate
└── terraform.tfstate.backup


This is a template file for Terraform, it’s used to generate a bootstrap script to run on each of our Etcd nodes.

To begin with, download Etcd, extract and do some general housekeeping with regards to access and users. It will also output two environment variables – ETCD_HOST_IP and ETCD_NAME, which is needed for the Systemd unit file.

cd /usr/local/src
sudo wget ""
sudo tar -xvf etcd-v3.3.9-linux-amd64.tar.gz
sudo mv etcd-v3.3.9-linux-amd64/etcd* /usr/local/bin/
sudo mkdir -p /etc/etcd /var/lib/etcd
sudo groupadd -f -g 1501 etcd
sudo useradd -c "etcd user" -d /var/lib/etcd -s /bin/false -g etcd -u 1501 etcd
sudo chown -R etcd:etcd /var/lib/etcd

export ETCD_HOST_IP=$(ip addr show eth0 | grep "inet\b" | awk '{print $2}' | cut -d/ -f1)
export ETCD_NAME=$(hostname -s)

Create the Systemd unit file

sudo -E bash -c 'cat << EOF > /lib/systemd/system/etcd.service
Description=etcd service

ExecStart=/usr/local/bin/etcd \\
 --data-dir /var/lib/etcd \\
 --discovery ${discoveryURL} \\
 --initial-advertise-peer-urls http://$ETCD_HOST_IP:2380 \\
 --name $ETCD_NAME \\
 --listen-peer-urls http://$ETCD_HOST_IP:2380 \\
 --listen-client-urls http://$ETCD_HOST_IP:2379, \\
 --advertise-client-urls http://$ETCD_HOST_IP:2379 \\


${discoveryURL} is a placeholder variable. The Terraform script will replace this and generate the script file with a value for this variable.

This is where the discovery token is generated, parsed into the template file to which a new file will be generated – our complete bootstrap script. Simply performing an HTTP get request to the discovery URL will return a unique identifier URL in the response body. So we grab this response and store it as a data object. Note “size=3” denotes the explicit size of the Etcd cluster.

data "http" "etcd-join" {
  url = ""

Take this data and parse it into the template file, filling it the variable ${discoveryURL} with the unique URL generated.

data "template_file" "service_template" {
  template = "${file("./deployetcd.tpl")}"

  vars = {
    discoveryURL = "${data.http.etcd-join.body}"

Take the generated output, and store it as “” in the local directory

resource "local_file" "template" {
  content  = "${data.template_file.service_template.rendered}"
  filename = ""

The purpose of this code block is to generate one script to be executed on all three Etcd members, each joining a specific 3 node etcd cluster.

This script performs the following:

  • Create a bastion host in the public subnet with a public IP.
  • Create three etcd instanced in each AZ, and execute the generated script via the bastion host:
    connection {
    host = "${aws_instance.etc1.private_ip}"
    port = "22"
    type = "ssh"
    user = "ubuntu"
    private_key = "${file("./terraformec2.pem")}"
    timeout = "2m"
    agent = false

    bastion_host = "${aws_instance.bastion.public_ip}"
    bastion_port = "22"
    bastion_user = "ec2-user"
    bastion_private_key = "${file("./terraformec2.pem")}"

  provisioner "file" {
    source      = ""
    destination = "/tmp/"

  provisioner "remote-exec" {
    inline = [
      "chmod +x /tmp/",

  • Contains access and secret keys for AWS access

Configures the following:

  • Elastic IP for NAT gateway.
  • NAT gateway so private subnets can route out to pull etcd.
  • Three subnets in the EU-West-2 region
    • EU-West-2a
    • EU-West-2b
    • EU-West-2c
  • An ALB that spans across the aforementioned AZ’s:
    • Create a target group.
    • Create a listener on Etcd port.
    • Attach EC2 instances.
    • A health check that probes :2779/version on the Etcd EC2 instances.
  • An Internet Gateway and attach to VPC

  • Creates a default security group, allowing all (don’t do this in production)

  • Creates the VPC with a subnet of


  • Key used to SSH into the bastion host and Etcd EC2 instances.


Log on to each ec2 instance and check the service via “journalctl -u etcd.service”. In particular, we’re interested in the peering and if any errors occur.

ubuntu@ip-10-0-1-100:~$ journalctl -u etcd.service
Sep 08 15:34:22 ip-10-0-1-100 etcd[1575]: found peer b8321dfeb5729811 in the cluster
Sep 08 15:34:22 ip-10-0-1-100 etcd[1575]: found peer 7e245ce888bd3e1f in the cluster
Sep 08 15:34:22 ip-10-0-1-100 etcd[1575]: found self b4ccf12cb29f9dda in the cluster
Sep 08 15:34:22 ip-10-0-1-100 etcd[1575]: discovery URL=
 cluster 29d1ca9a6b0f3f87
Sep 08 15:34:23 ip-10-0-1-100 systemd[1]: Started etcd service.
Sep 08 15:34:23 ip-10-0-1-100 etcd[1575]: ready to serve client requests
Sep 08 15:34:23 ip-10-0-1-100 etcd[1575]: set the initial cluster version to 3.3
Sep 08 15:34:23 ip-10-0-1-100 etcd[1575]: enabled capabilities for version 3.3

We can also check they’re registered and healthy with the ALB:

Next, run a couple of commands against the ALB address:

[ec2-user@ip-10-0-4-242 ~]$ etcdctl --endpoints member list
7e245ce888bd3e1f: name=ip-10-0-3-100 peerURLs= clientURLs= isLeader=false
b4ccf12cb29f9dda: name=ip-10-0-1-100 peerURLs= clientURLs= isLeader=false
b8321dfeb5729811: name=ip-10-0-2-100 peerURLs= clientURLs= isLeader=true
[ec2-user@ip-10-0-4-242 ~]$ etcdctl --endpoints cluster-health
member 7e245ce888bd3e1f is healthy: got healthy result from
member b4ccf12cb29f9dda is healthy: got healthy result from
member b8321dfeb5729811 is healthy: got healthy result from
cluster is healthy

Splendid. Now this cluster is ready to start serving clients.

« Older posts

© 2020 Virtual Thoughts

Theme by Anders NorenUp ↑

Social media & sharing icons powered by UltimatelySocial
Visit Us