Virtual Thoughts

Virtualisation, Storage and various other ramblings.

KubeVirt on ARM64 – CDI Workaround

According to the KubeVirt documentation, CDI is not currently supported on ARM64, which is the architecture my Turing RK1 nodes use.

As a workaround, I experimented with writing an image directly to a PVC, which can then be cloned/mounted to a KubeVirt VM. This example dd's a raw Fedora Workstation image to a PVC:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fedora-workstation-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
  volumeMode: Block
---
apiVersion: batch/v1
kind: Job
metadata:
  name: upload-fedora-workstation-job
spec:
  template:
    spec:
      containers:
      - name: writer
        image: fedora:latest
        command: ["/bin/bash", "-c"]
        args:
          - |
            set -e
            echo "[1/3] Installing tools..."
            dnf install -y curl xz
            echo "[2/3] Downloading and decompressing Fedora Workstation image..."
            curl -L https://download.fedoraproject.org/pub/fedora/linux/releases/41/Workstation/aarch64/images/Fedora-Workstation-41-1.4.aarch64.raw.xz | xz -d > /tmp/disk.raw
            echo "[3/3] Writing image to PVC block device..."
            dd if=/tmp/disk.raw of=/dev/vda bs=4M status=progress conv=fsync
            echo "Done writing Fedora Workstation image to PVC!"
        volumeDevices:
        - name: disk
          devicePath: /dev/vda
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        securityContext:
          runAsUser: 0
      restartPolicy: Never
      volumes:
      - name: disk
        persistentVolumeClaim:
          claimName: fedora-workstation-pvc
      - name: tmp
        emptyDir: {}

This can then be mounted to a VM:

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: my-arm-vm
spec:
  running: true
  template:
    metadata:
      labels:
        kubevirt.io/domain: my-arm-vm
    spec:
      domain:
        cpu:
          cores: 2
        resources:
          requests:
            memory: 2Gi
        devices:
          disks:
            - name: disk0
              disk:
                bus: virtio
      volumes:
        - name: disk0
          persistentVolumeClaim:
            claimName: fedora-workstation-pvc

Customising ArgoCD ApplicationSets with Template Patches

In a recent attempt to automate my homelab cluster (ref), I now manage all of my cluster applications using ArgoCD, including Cilium. I also leverage ApplicationSet objects in ArgoCD as an app-of-apps pattern.

After a Cilium update, however, the application would fail to sync with an "annotations too long" error.

One way to address this is to add `ServerSideApply=true` to the resulting Application manifest:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: bootstrap-applications
  namespace: argocd
spec:
  goTemplate: true
  goTemplateOptions: ["missingkey=error"]
  generators:
  - git:
      repoURL: 'https://github.com/David-VTUK/turing-pi-automation.git'
      revision: HEAD
      directories:
      - path: 'argocd-apps/helm-charts/import-from-cluster-standup/*'
  template:
    metadata:
      name: '{{ .path.basename }}'
    spec:
      project: default
      source:
        repoURL: 'https://github.com/David-VTUK/turing-pi-automation.git'
        targetRevision: HEAD
        path: '{{ .path.path }}'
        helm:
          valueFiles:
            - values.yaml
      destination:
        server: 'https://kubernetes.default.svc'
        namespace: '{{ .path.basename }}'
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        syncOptions:
          - CreateNamespace=true
          - ServerSideApply=true

The downside, however, is that every application generated from this ApplicationSet will inherit this value, which is less than ideal.

Template Patches

TemplatePatch can be used in conjunction with ApplicationSet to selectively make changes to the resulting application based on specific criteria:

# Several lines omitted for brevity
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: bootstrap-applications
  namespace: argocd
spec:
  template:
  templatePatch: |
    {{ if eq .path.basename "cilium" }}
      spec:
        syncPolicy:
          syncOptions:
            - ServerSideApply=true # required to avoid the "annotations too long error"
            - CreateNamespace=true
    {{- end }}

After a resync, the previous error is resolved, but Cilium remains out of sync, reporting differences in its ServiceMonitor resources.

This behaviour is noted in a GitHub issue.

To address this, templatePatch can be extended to ignore these differences by leveraging ignoreDifferences:

# Several lines omitted for brevity
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: bootstrap-applications
  namespace: argocd
spec:
  template:
  templatePatch: |
    {{ if eq .path.basename "cilium" }}
      spec:
        ignoreDifferences:
        - group: monitoring.coreos.com
          kind: ServiceMonitor
          name: ""
          jsonPointers:
            - /spec
        syncPolicy:
          syncOptions:
            - ServerSideApply=true # required to avoid the "annotations too long error"
            - CreateNamespace=true
    {{- end }}

Kubernetes on RK1 / Turing Pi 2: Automation with Ansible, Cilium and Cert-Manager

TLDR: Take me to the Playbook

Note – This is just a high-level overview; I’ll likely follow up with a post dedicated to the Cilium/BGP configuration.

I’ve had my Turing Pi 2 board for a while now, and during that time I’ve struggled to decide which automation tooling to use to bootstrap K3s onto it. I eventually settled on Ansible. It’s not something I’m overly familiar with, but it provided a good opportunity to learn by doing.

The idea is pretty straightforward: use Ansible to prepare each RK1 node, bootstrap K3s, and then layer Cilium and Cert-Manager on top.

Each RK1 module is fairly well equipped:

  • 32GB RAM
  • 8-core CPU
  • 512GB NVMe SSD
  • Pre-installed with Ubuntu

Mikrotik Config

Prior to standing up the cluster, the Mikrotik router can be pre-configured to peer with each of the RK1 nodes, as I did:

routing/bgp/connection/ add address-families=ip as=64512 disabled=no local.role=ibgp name=srv-rk1-01 output.default-originate=always remote.address=172.16.10.221 routing-table=main

routing/bgp/connection/ add address-families=ip as=64512 disabled=no local.role=ibgp name=srv-rk1-02 output.default-originate=always remote.address=172.16.10.222 routing-table=main

routing/bgp/connection/ add address-families=ip as=64512 disabled=no local.role=ibgp name=srv-rk1-03 output.default-originate=always remote.address=172.16.10.223 routing-table=main

routing/bgp/connection/ add address-families=ip as=64512 disabled=no local.role=ibgp name=srv-rk1-04 output.default-originate=always remote.address=172.16.10.224 routing-table=main

Workflow

The following represents an overview of the steps in the code repo.

To summarise each step:

Create Partition

Each of my RK1 modules has a dedicated 512GB NVMe drive, which will be used for primary Kubernetes storage as well as container storage. The drive is presented as a raw block device and therefore needs partitioning before mounting.
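
A minimal Ansible sketch of that step, assuming the drive appears as /dev/nvme0n1 (the device name, GPT label and single-partition layout are assumptions rather than values from the playbook):

- name: Create a single partition spanning the NVMe drive
  community.general.parted:
    device: /dev/nvme0n1   # assumed device name
    label: gpt
    number: 1
    state: present
    part_end: "100%"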

Mount Partition

The created partition is mounted to /mnt/data and checked.
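
A rough Ansible sketch of this step, assuming the partition is formatted as ext4 before being mounted (filesystem type and device names are assumptions):

- name: Create an ext4 filesystem on the new partition
  community.general.filesystem:
    fstype: ext4
    dev: /dev/nvme0n1p1   # assumed partition device

- name: Mount the partition at /mnt/data
  ansible.posix.mount:
    path: /mnt/data
    src: /dev/nvme0n1p1
    fstype: ext4
    state: mounted        # mounts immediately and persists the entry in /etc/fstab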

Create Symlinks

Three directories are primarily used by K3s/containerd to store data. Symlinks are created so their contents effectively reside on the NVMe drive. These are:

/run/k3s -> /mnt/data/k3s
/var/lib/kubelet -> /mnt/data/k3s-kubelet
/var/lib/rancher -> /mnt/data/k3s-rancher
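
A hedged Ansible sketch of creating those symlinks (task names and loop structure are illustrative, not lifted from the playbook; it assumes the original directories are empty or absent):

- name: Create the backing directories on the NVMe drive
  ansible.builtin.file:
    path: "{{ item }}"
    state: directory
  loop:
    - /mnt/data/k3s
    - /mnt/data/k3s-kubelet
    - /mnt/data/k3s-rancher

- name: Symlink K3s/containerd data directories onto the NVMe drive
  ansible.builtin.file:
    src: "{{ item.target }}"
    dest: "{{ item.link }}"
    state: link
    force: true
  loop:
    - { target: /mnt/data/k3s, link: /run/k3s }
    - { target: /mnt/data/k3s-kubelet, link: /var/lib/kubelet }
    - { target: /mnt/data/k3s-rancher, link: /var/lib/rancher }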

Install K3s

To facilitate replacing both kube-proxy and the default CNI with Cilium’s equivalents, a number of flags are passed to the server install script (a sketch of the resulting task follows the list):

--flannel-backend=none
--disable-network-policy
--write-kubeconfig-mode 644
--disable servicelb
--token {{ k3s_token }}
--disable-cloud-controller
--disable local-storage
--disable-kube-proxy
--disable traefik
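
As a sketch, the install script can be invoked from Ansible with those flags along these lines (the task wrapper is an assumption; k3s_token is a playbook variable):

- name: Install K3s server with flannel, kube-proxy, servicelb and traefik disabled
  ansible.builtin.shell: |
    curl -sfL https://get.k3s.io | sh -s - server \
      --flannel-backend=none \
      --disable-network-policy \
      --write-kubeconfig-mode 644 \
      --disable servicelb \
      --token {{ k3s_token }} \
      --disable-cloud-controller \
      --disable local-storage \
      --disable-kube-proxy \
      --disable traefik
  args:
    creates: /usr/local/bin/k3s   # skip if K3s is already installed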

In addition, the Gateway API CRDs are installed:

- name: Apply gateway API CRDs
  kubernetes.core.k8s:
    state: present
    src: https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.1.0/experimental-install.yaml

Install Cilium

Cilium is customised with the following options enabled, sketched below:

  • BGP Control Plane
  • Hubble (Relay and UI)
  • GatewayAPI
  • BGP configuration to Peer with my Mikrotik router
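
As a rough sketch, the first three options map to Helm values along these lines (value names reflect recent Cilium chart versions and are not copied from my playbook):

bgpControlPlane:
  enabled: true
gatewayAPI:
  enabled: true
hubble:
  relay:
    enabled: true
  ui:
    enabled: true
kubeProxyReplacement: true   # Cilium takes over from the disabled kube-proxy

The peering itself can then be described with a CiliumBGPPeeringPolicy. The node selector and router address below are assumptions; the ASN mirrors the Mikrotik configuration above:

apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: rk1-bgp
spec:
  nodeSelector:
    matchLabels:
      kubernetes.io/os: linux        # effectively all nodes; label choice is an assumption
  virtualRouters:
    - localASN: 64512
      exportPodCIDR: false
      serviceSelector:
        matchExpressions:
          - { key: unused, operator: NotIn, values: ["never"] }   # match every LoadBalancer service
      neighbors:
        - peerAddress: "172.16.10.1/32"   # assumed router address
          peerASN: 64512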

Install Cert-Manager

Cert-Manager facilitates certificate generation for exposed services. In my environment, the API Gateway is annotated so that Cert-Manager automatically generates a TLS certificate for it, using DNS-01 challenges against Azure DNS.

This also includes the required ClusterIssuer resource, which provides the configuration and authentication details.
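
A minimal sketch of what such a ClusterIssuer could look like with cert-manager’s Azure DNS solver; every name, email and Azure identifier below is a placeholder, not a value from my environment:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-azure-dns
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-account-key
    solvers:
      - dns01:
          azureDNS:
            subscriptionID: 00000000-0000-0000-0000-000000000000
            resourceGroupName: dns-rg
            hostedZoneName: example.com
            environment: AzurePublicCloud
            clientID: 00000000-0000-0000-0000-000000000000
            tenantID: 00000000-0000-0000-0000-000000000000
            clientSecretSecretRef:
              name: azuredns-credentials
              key: client-secret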

Expose Hubble-UI

A Gateway and an HTTPRoute resource are created to expose the Hubble UI.
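
A sketch of what they might look like (the hostname, namespace, gatewayClassName and certificate secret name are assumptions):

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: hubble-gateway
  namespace: kube-system
spec:
  gatewayClassName: cilium
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      hostname: hubble.example.com
      tls:
        certificateRefs:
          - name: hubble-ui-tls      # secret expected to be populated by Cert-Manager
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: hubble-ui
  namespace: kube-system
spec:
  parentRefs:
    - name: hubble-gateway
  hostnames:
    - hubble.example.com
  rules:
    - backendRefs:
        - name: hubble-ui            # the Hubble UI Service
          port: 80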

Mikrotik BGP Peering Check

Using Winbox, the BGP peering and route propagation can be checked.

In my instance, 10.200.200.1 resolves to my API Gateway IP, with each node advertising this address.
