Oct 25, 2022 by Isaac Johnson

I did something dumb. I bumped a power cord. As I left for a camping trip, I must have inadvertently bumped the MagSafe connector on the MacBook Air that serves as the primary node on the cluster. When that host eventually ran out of power, it went down.

When it came up, K3s wouldn’t start. It would fail and fail and fail, and I was certain I had corrupted something. I tried upgrades, reinstalls, etc. I went ahead and began upgrading the whole cluster to the latest 1.24… when I realized the problem: I had neglected to remove the k3s-agent from when that MacBook Air was a worker/agent for a different cluster. It had rejoined its parent cluster and blocked port 6443.

I uninstalled the agent, but it was too late. The cluster at this point was half upgraded to 1.24.3. The Ingresses had stopped working. The lack of dockershim (yanked out in 1.24) might have been causing trouble with some charts. The end result was that my cluster was pretty much hosed. And if I couldn’t get the Ingress to serve traffic, I really couldn’t use it.

azure-vote-front                                           LoadBalancer   <pending>     80:31761/TCP                                                                                                45h
react-form                                                 LoadBalancer     <pending>     80:31267/TCP                                                                                                45h
nginx-ingress-release-nginx-ingress                        LoadBalancer   <pending>     80:30466/TCP,443:31716/TCP                                                                                  44h

So I decided I must rebuild. I did this by taking what was the “old” cluster and turning it into the new. Like some sort of poor man’s Blue Green, I began the overhaul of rebuilding the cluster, yet again.

Each time I do this, I get it a little better. So in the effort of “showing my work” and perhaps helping others build back their k3s clusters, let’s, together, build a fresh 1.23.9 K3s cluster.

Getting some backups

First, I got backups of old data. This was my oldest cluster (which will become the new)

I’ll make a brand new NFS storage location


I’ll use a more meaningful name, k3snfs77b2. I will not be using the Recycle Bin or access controls internally


Also, this time I’ll expand the CIDR from to (so I can use all my network range)

Next I did a standard Ubuntu upgrade of packages. I find bringing up k8s clusters to be a good time to upgrade the underlying OS of the nodes

We can then uninstall what is there

$ /usr/local/bin/k3s-uninstall.sh

Pro-tip: If the uninstall hangs, perhaps on broken PVCs, you can run it in the background with “&”, then use ps -ef to find the hung umount command and kill it so the script can finish. Doing this and rebooting a couple of times generally does the trick of removing the old install
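The pro-tip above could be scripted roughly as follows. This is only a sketch: `list_umount_pids` is a hypothetical helper name, and it assumes the usual `ps -ef` column layout (PID in column 2, command starting in column 8).

```shell
#!/bin/sh
# list_umount_pids: hypothetical helper. Reads `ps -ef` output on stdin and
# prints the PID of every process whose command is `umount`.
list_umount_pids() {
  awk 'NR > 1 && $8 ~ /(^|\/)umount$/ { print $2 }'
}

# Example use on a node (assumption: your user has sudo rights):
#   sudo /usr/local/bin/k3s-uninstall.sh &
#   ps -ef | list_umount_pids      # inspect the PIDs first
#   sudo kill <pid>                # then kill the hung umount
#   wait                           # let the uninstall script finish
```

Printing the PIDs instead of killing them outright is deliberate, so you can check what is hung before touching it.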

Now we can install K3s on the master. Note: I’m picking a specific version, the latest (as of writing) of 1.23 (v1.23.9+k3s1)

Quick Note: If you plan to expose this externally, add your external IP with INSTALL_K3S_EXEC="--tls-san x.x.x.x"

$ curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.23.9+k3s1 INSTALL_K3S_EXEC="server --disable traefik" sh -
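If you want both the `--tls-san` flag from the quick note and the `server --disable traefik` arguments in one install, the value of INSTALL_K3S_EXEC can be assembled like this. It is a sketch only: `k3s_server_exec` is a hypothetical helper name, and 203.0.113.10 is a placeholder documentation address, not a real host.

```shell
#!/bin/sh
# k3s_server_exec: hypothetical helper that builds the INSTALL_K3S_EXEC value.
# Each argument is an extra IP or hostname to add as a TLS SAN.
k3s_server_exec() {
  exec_args="server --disable traefik"
  for san in "$@"; do
    exec_args="$exec_args --tls-san $san"
  done
  printf '%s\n' "$exec_args"
}

# Example (203.0.113.10 is a placeholder address):
#   curl -sfL https://get.k3s.io | \
#     INSTALL_K3S_VERSION=v1.23.9+k3s1 \
#     INSTALL_K3S_EXEC="$(k3s_server_exec 203.0.113.10)" sh -
```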
We need to set up an ingress controller. Here I’ll use Nginx

You’ll want to patch the ingress class to be default

$ kubectl get ingressclass -o yaml > ingress.class.yaml
$ vi ingress.class.yaml
$ kubectl get ingressclass -o yaml > ingress.class.yaml.bak
builder@DESKTOP-QADGF36:~/Workspaces/jekyll-blog$ diff ingress.class.yaml ingress.class.yaml.bak
<       ingressclass.kubernetes.io/is-default-class: "true"
builder@DESKTOP-QADGF36:~/Workspaces/jekyll-blog$ kubectl apply -f ingress.class.yaml
Warning: resource ingressclasses/nginx is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
ingressclass.networking.k8s.io/nginx configured
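As an alternative to exporting and editing the YAML, the same annotation can probably be set in one step with `kubectl annotate`. A minimal sketch, assuming the IngressClass is named `nginx` as above; `set_default_ingressclass` and the `KUBECTL` override variable are hypothetical names used here for illustration.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example KUBECTL=echo for a dry run).
: "${KUBECTL:=kubectl}"

# set_default_ingressclass: hypothetical helper; $1 is the IngressClass name.
set_default_ingressclass() {
  "$KUBECTL" annotate ingressclass "$1" \
    ingressclass.kubernetes.io/is-default-class=true --overwrite
}

# Example:
#   set_default_ingressclass nginx
```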

Cert Manager

Apply the Cert Manager

$ kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.8.0/cert-manager.yaml

Add the Issuers

builder@DESKTOP-72D2D9T:~/Workspaces/jekyll-blog$ cat cm-issuer.yml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    email: isaac.johnson@gmail.com
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-staging
    solvers:
    - http01:
        ingress:
          class: nginx
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod-old
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: isaac.johnson@gmail.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01:
        ingress:
          class: nginx

Now apply

builder@DESKTOP-72D2D9T:~/Workspaces/jekyll-blog$ kubectl apply -f cm-issuer.yml
clusterissuer.cert-manager.io/letsencrypt-staging created
clusterissuer.cert-manager.io/letsencrypt-prod-old created

I went to test an Ingress.

I’ll add the Azure Vote sample app as a good sample test

I’ll add the ingress to Azure Vote as I did before. That one used basic Auth so first I’ll copy over the secret

builder@DESKTOP-QADGF36:~/Workspaces/kubernetes-ingress$ kubectx default
Switched to context "default".
builder@DESKTOP-QADGF36:~/Workspaces/kubernetes-ingress$ kubectl get secret basic-auth
$ cat Ingress-Azure.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    ingress.kubernetes.io/auth-realm: Authentication Required - ok
    ingress.kubernetes.io/auth-secret: basic-auth
    ingress.kubernetes.io/auth-type: basic
    nginx.ingress.kubernetes.io/auth-realm: Authentication Required - foo
    nginx.ingress.kubernetes.io/auth-secret: basic-auth
    nginx.ingress.kubernetes.io/auth-type: basic
    nginx.ingress.kubernetes.io/proxy-body-size: "0"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  labels:
    app: azurevote
    release: azurevoterelease
  name: azurevote-ingress
  namespace: default
spec:
  ingressClassName: nginx
  rules:
  - host: azurevote.freshbrewed.science
    http:
      paths:
      - backend:
          service:
            name: azure-vote-front
            port:
              number: 80
        path: /
        pathType: Prefix
  tls:
  - hosts:
    - azurevote.freshbrewed.science
    secretName: azurevote-tls

Note, I could do it without basic auth as well

builder@DESKTOP-72D2D9T:~/Workspaces/jekyll-blog$ cat Ingress-Azure.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    ingress.kubernetes.io/proxy-body-size: 2048m
    ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: 2048m
    nginx.ingress.kubernetes.io/proxy-read-timeout: "900"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "900"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.org/client-max-body-size: 2048m
  labels:
    app: azurevote
    release: azurevoterelease
  name: azurevote-ingress
  namespace: default
spec:
  ingressClassName: nginx
  rules:
  - host: azurevote.freshbrewed.science
    http:
      paths:
      - backend:
          service:
            name: azurevote-ui
            port:
              number: 80
        path: /
        pathType: Prefix
  tls:
  - hosts:
    - azurevote.freshbrewed.science
    secretName: azurevote-tls

Now Apply

$ kubectl apply -f Ingress-Azure.yaml 
ingress.networking.k8s.io/azurevote-ingress created

Note: I already set up Route53 routing on the azurevote CNAME. You can refer to prior docs on how to do that

We can check the cert

$ kubectl get cert
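Rather than re-running `kubectl get cert` by hand, you could block until the certificate reports Ready. A sketch under assumptions: cert-manager normally names the Certificate after the Ingress `secretName` (here `azurevote-tls`), so check the actual name in your cluster first. `wait_for_cert` and the `KUBECTL` override are hypothetical names.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example KUBECTL=echo for a dry run).
: "${KUBECTL:=kubectl}"

# wait_for_cert: hypothetical helper.
#   $1 = Certificate name, $2 = namespace (default: default),
#   $3 = timeout (default: 300s)
wait_for_cert() {
  "$KUBECTL" wait --for=condition=Ready "certificate/$1" \
    --namespace "${2:-default}" --timeout="${3:-300s}"
}

# Example:
#   wait_for_cert azurevote-tls
```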
Exposing the cluster

Obviously, once I cared to start testing ingress traffic, I needed to swap my exposed cluster. The new master is now the host exposed on ports 80 and 443


Testing Azure Vote

We can now see the Azure Vote app is working


Route53 credentials

To do the DNS challenge, we’ll need to set up Route53 credentials

$ cat prod-route53-credentials-secret.yaml
apiVersion: v1
data:
  secret-access-key: adsfasdfasdfasdfasdfasdfasdfasdfasdfasdfasdf==
kind: Secret
metadata:
  name: prod-route53-credentials-secret
  namespace: default
type: Opaque

builder@DESKTOP-72D2D9T:~/Workspaces/jekyll-blog$ kubectl apply -f prod-route53-credentials-secret-cert-manager.yaml -n cert-manager
secret/prod-route53-credentials-secret created
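Values under `data:` in a Secret must be base64 encoded. A small sketch of producing such a value; `encode_secret_value` is a hypothetical helper name and the input shown is a dummy string, not a real key.

```shell
#!/bin/sh
# encode_secret_value: hypothetical helper. Base64-encodes $1 without a
# trailing newline in the input and without line wrapping in the output.
encode_secret_value() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Example (dummy value):
#   encode_secret_value 'example'
```

Using `printf '%s'` rather than `echo` avoids accidentally encoding a trailing newline into the key. `kubectl create secret generic <name> --from-literal=secret-access-key=<value> -n cert-manager` is another route that handles the encoding for you.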

We can then use it for our new LE Prod ClusterIssuer

$ cat le-prod-new.yml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: isaac.johnson@gmail.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - selector:
        dnsZones:
          - "freshbrewed.science"
      dns01:
        route53:
          region: us-east-1
          accessKeyID: AKIARMVOGITWIUAKR45K
          secretAccessKeySecretRef:
            name: prod-route53-credentials-secret
            key: secret-access-key
          # you can also assume a role with these credentials
          role: arn:aws:iam::095928337644:role/MyACMERole

$ kubectl apply -f le-prod-new.yml
clusterissuer.cert-manager.io/letsencrypt-prod created


First, I tried the standard NFS provisioner

Now add the NFS SC

$ helm install stable/nfs-server-provisioner --set persistence.enabled=true,persistence.size=5Gi --set nfs.server= --set nfs.path=/volume1/k3snfs77b2 --generate-name
WARNING: This chart is deprecated
Then I patched it in as the default

$ kubectl patch storageclass local-path -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}' && kubectl patch storageclass nfs -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
storageclass.storage.k8s.io/local-path patched
storageclass.storage.k8s.io/nfs patched


However, I found it kept failing me on certain activities. While building out the pubsub demo I reworked it.

NFS that works

Two things needed to be done. First, set up the RBAC ClusterRole and ClusterRoleBinding as well as the Role and RoleBinding.

$ cat k3s-prenfs.yaml 
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-client-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-client-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    # replace with namespace where provisioner is deployed
    namespace: default
roleRef:
  kind: ClusterRole
  name: nfs-client-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    # replace with namespace where provisioner is deployed
    namespace: default
roleRef:
  kind: Role
  name: leader-locking-nfs-client-provisioner
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-nfs-storage
provisioner: fuseim.pri/ifs # or choose another name, must match deployment's env PROVISIONER_NAME'
parameters:
  archiveOnDelete: "false"
  allowVolumeExpansion: "true"
  reclaimPolicy: "Delete"
allowVolumeExpansion: true

The second was the deployment itself. I started by following my own guide from 2020, but in K8s version 1.20 and beyond, there is an issue with selfLink being deprecated.

Therefore, the manifest that worked used a new container image (gcr.io/k8s-staging-sig-storage/nfs-subdir-external-provisioner:v4.0.0)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-client-provisioner
  labels:
    app: nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: nfs-client-provisioner
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
        - name: nfs-client-provisioner
          image: gcr.io/k8s-staging-sig-storage/nfs-subdir-external-provisioner:v4.0.0
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: fuseim.pri/ifs
            - name: NFS_SERVER
            - name: NFS_PATH
              value: /volume1/k3snfs77b2
      volumes:
        - name: nfs-client-root
          nfs:
            path: /volume1/k3snfs77b2

Then I swapped SC defaults

$ kubectl patch storageclass nfs -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}' && kubectl patch storageclass local-path  -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}' && kubectl patch storageclass managed-nfs-storage -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
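The long one-liner above follows a pattern: mark every StorageClass as non-default except the one you want. A sketch of that pattern as a loop; `default_class_patch`, `swap_default_storageclass`, and the `KUBECTL` override are hypothetical names, and the class names are the ones used in this post.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example KUBECTL=echo for a dry run).
: "${KUBECTL:=kubectl}"

# default_class_patch: prints the JSON patch body for is-default-class ($1 = true|false).
default_class_patch() {
  printf '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"%s"}}}\n' "$1"
}

# swap_default_storageclass: $1 = class to make default,
# remaining args = every StorageClass to patch (include $1 in the list).
swap_default_storageclass() {
  target="$1"
  shift
  for sc in "$@"; do
    if [ "$sc" = "$target" ]; then flag=true; else flag=false; fi
    "$KUBECTL" patch storageclass "$sc" -p "$(default_class_patch "$flag")"
  done
}

# Example:
#   swap_default_storageclass managed-nfs-storage nfs local-path managed-nfs-storage
```

Running it with `KUBECTL=echo` first prints the patch commands without touching the cluster.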


I wanted to add Redis, both to test my PVCs and to have an in-cluster memory store

$ helm install my-redis-release bitnami/redis-cluster
NAME: my-redis-release
LAST DEPLOYED: Mon Jul 25 21:44:20 2022
NAMESPACE: default
STATUS: deployed
CHART NAME: redis-cluster
APP VERSION: 7.0.4

** Please be patient while the chart is being deployed **

To get your password run:
    export REDIS_PASSWORD=$(kubectl get secret --namespace "default" my-redis-release-redis-cluster -o jsonpath="{.data.redis-password}" | base64 -d)

You have deployed a Redis® Cluster accessible only from within you Kubernetes Cluster.
INFO: The Job to create the cluster will be created.
To connect to your Redis® cluster:

1. Run a Redis® pod that you can use as a client:
kubectl run --namespace default my-redis-release-redis-cluster-client --rm --tty -i --restart='Never' \
--image docker.io/bitnami/redis-cluster:7.0.4-debian-11-r1 -- bash

2. Connect using the Redis® CLI:

redis-cli -c -h my-redis-release-redis-cluster -a $REDIS_PASSWORD

We can see it used the NFS (I did this before the NFS swap)

$ kubectl get pvc
To set up Harbor again, we need to explicitly request the certs

builder@DESKTOP-72D2D9T:~/Workspaces/jekyll-blog$ cat create-secrets-harbor.yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: harbor-fb-science
  namespace: default
spec:
  commonName: harbor.freshbrewed.science
  dnsNames:
  - harbor.freshbrewed.science
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-prod
  secretName: harbor.freshbrewed.science-cert
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: notary-fb-science
  namespace: default
spec:
  commonName: notary.freshbrewed.science
  dnsNames:
  - notary.freshbrewed.science
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-prod
  secretName: notary.freshbrewed.science-cert
builder@DESKTOP-72D2D9T:~/Workspaces/jekyll-blog$ kubectl apply -f create-secrets-harbor.yaml
certificate.cert-manager.io/harbor-fb-science created
certificate.cert-manager.io/notary-fb-science created

Now applied, we can watch for when they are satisfied

Now we can use them in our deploy

$ cat harbor-registry.values.yaml
      cert-manager.io/cluster-issuer: letsencrypt-production
    className: nginx
      core: harbor.freshbrewed.science
      notary: notary.freshbrewed.science
    certSource: secret
      notarySecretName: notary.freshbrewed.science-cert
      secretName: harbor.freshbrewed.science-cert
  type: ingress
externalURL: https://harbor.freshbrewed.science
harborAdminPassword: Tm90TXlQYXNzd29yZAo=
  enabled: true
  enabled: true
secretKey: 8d10dlskeit8fhtg

$ helm upgrade --install harbor-registry harbor/harbor --values ./harbor-registry.values.yaml
Release "harbor-registry" does not exist. Installing it now.
NAME: harbor-registry
LAST DEPLOYED: Mon Jul 25 21:48:33 2022
NAMESPACE: default
STATUS: deployed
Please wait for several minutes for Harbor deployment to complete.
Then you should be able to visit the Harbor portal at https://harbor.freshbrewed.science
For more details, please visit https://github.com/goharbor/harbor

Add a K3s Worker Node (Agent)

You will want to uninstall old agents. They leave a lot of junk around and this was a step I neglected to do before

Because the uninstall script calls /usr/local/bin/k3s-killall.sh, and that was failing on PVCs, I commented out that line

To add agents, I’ll need the token from the master

isaac@isaac-MacBookAir:~$ !163
sudo cat /var/lib/rancher/k3s/server/node-token
[sudo] password for isaac:

I’ll also need the new kubeconfig, which I can grab at the same time

isaac@isaac-MacBookAir:~$ sudo cat /etc/rancher/k3s/k3s.yaml | base64 -w 0

And I can just echo it back to create a local config

echo YXBpVmVyc2lvbjogdjEKY2x1c3RlcnM6Ci0gY2x1c3RlcjoKICAgIGNlcnRpZmljYXRlLWF1dGhvcml0eS1kYXRhOiBMUzB0TFMxQ1JVZEpUaUJEUlZKVVNVWkpRMEZVUlMwdExTMHRDazFKU1VKbFJFTkRRVkl5WjBGM1NVSkJaMGxDUVVSQlMwSm5aM0ZvYT... | base64 --decode | sed 's/' > ~/.kube/New-mac77-internal-config
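For reference, the decode-and-rewrite step can be sketched as below. K3s writes its kubeconfig with `server: https://127.0.0.1:6443`, so the address has to be swapped for one reachable from your workstation. The function name `rewrite_kubeconfig_server` is hypothetical, and 192.0.2.10 is a placeholder documentation address, not the address used in this post.

```shell
#!/bin/sh
# rewrite_kubeconfig_server: hypothetical helper.
#   $1 = base64-encoded kubeconfig (as produced by `base64 -w 0` on the master)
#   $2 = address of the master that is reachable from this machine
rewrite_kubeconfig_server() {
  printf '%s' "$1" | base64 --decode | sed "s/127\.0\.0\.1/$2/g"
}

# Example (192.0.2.10 is a placeholder address):
#   rewrite_kubeconfig_server "$ENCODED" 192.0.2.10 > ~/.kube/New-mac77-internal-config
```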

For convenience, I merged that into my main kubeconfig so I could use kubectx to change clusters

builder@DESKTOP-QADGF36:~$ kubectx
builder@DESKTOP-QADGF36:~$ kubectx oldmaccluster
Switched to context "oldmaccluster".
builder@DESKTOP-QADGF36:~$ kubectl get nodes
NAME               STATUS   ROLES                  AGE   VERSION
isaac-macbookair   Ready    control-plane,master   9h    v1.23.9+k3s1
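One common way to do that merge is to point KUBECONFIG at both files and flatten the result. This is a sketch of that approach, not necessarily how it was done here; `merge_kubeconfigs` and the `KUBECTL` override are hypothetical names.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example for a dry run).
: "${KUBECTL:=kubectl}"

# merge_kubeconfigs: hypothetical helper. Prints one flattened kubeconfig
# built from the two files given as $1 and $2.
merge_kubeconfigs() {
  KUBECONFIG="$1:$2" "$KUBECTL" config view --flatten
}

# Example (write to a temp file first so the original is not truncated mid-read):
#   merge_kubeconfigs ~/.kube/config ~/.kube/New-mac77-internal-config > ~/.kube/merged
#   mv ~/.kube/merged ~/.kube/config
```

If both files use the same context, cluster, or user name (K3s names them all `default`), rename them in one file before merging, otherwise the entries collide.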

Agent steps

If you haven’t already, you’ll want to stop the existing agents (we’ve done this)

$ sudo /usr/local/bin/k3s-killall.sh
$ sudo systemctl stop k3s-agent.service
$ sudo service k3s-agent stop

To be thorough, we also run the agent uninstall script

$ sudo /usr/local/bin/k3s-agent-uninstall.sh

The key things are that k3s is gone and that no k3s processes are running

Now we can add the node. Note that I’m specifically picking the k3s version of v1.23.9+k3s1 as the latest really trashed the primary cluster.

hp@hp-HP-EliteBook-850-G2:~$ curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.23.9+k3s1 K3S_URL= K3S_TOKEN=K107d8e80976d8e1258a502cc802d2ad6c4c35cc2f16a36161e32417e87738014a8::server:581be6c9da1c56ea3d8d5d776979585a sh -
And now we see our new node

If we have no issues, the process should go fast

Now that Harbor is up, I'll log in as admin and then create a local "Isaac" admin user.



Add the Helm Repo if missing

I’ll create a fresh values file, this time calling the cluster k3s77b02

$ cat datadog.release.values.yml
targetSystem: "linux"
  # apiKey: <DATADOG_API_KEY>
  # appKey: <DATADOG_APP_KEY>
  # If not using secrets, then use apiKey and appKey instead
  apiKeyExistingSecret: dd-secret
  clusterName: k3s77b02
  tags: []
    enabled: true
  appKey: 51bbf169c11305711e4944b9e74cd918838efbb2
    enabled: true
    port: 8126
    portEnabled: true
    containerCollectAll: true
    enabled: true
    enabled: true
    enabled: true
    processCollection: true
  replicas: 2
    create: true
    serviceAccountName: default
    enabled: true
    createReaderRbac: true
    useDatadogMetrics: true
      type: ClusterIP
      port: 8443
    create: true
    serviceAccountName: default
  enabled: true
    create: true
    serviceAccountName: default
  replicas: 2

Then I just install

$ helm install my-dd-release -f datadog.release.values.yml datadog/datadog
W0726 06:45:47.415196    4816 warnings.go:70] spec.template.metadata.annotations[container.seccomp.security.alpha.kubernetes.io/system-probe]: deprecated since v1.19, non-functional in v1.25+; use the "seccompProfile" field instead
NAME: my-dd-release
LAST DEPLOYED: Tue Jul 26 06:45:46 2022
NAMESPACE: default
STATUS: deployed
Datadog agents are spinning up on each node in your cluster. After a few
minutes, you should see your agents starting in your event stream:
You disabled creation of Secret containing API key, therefore it is expected
that you create Secret named 'dd-secret' which includes a key called 'api-key' containing the API key.

The Datadog Agent is listening on port 8126 for APM service.

####               WARNING: Deprecation notice               ####

The option `datadog.apm.enabled` is deprecated, please use `datadog.apm.portEnabled` to enable TCP communication to the trace-agent.
The option `datadog.apm.socketEnabled` is enabled by default and can be used to rely on unix socket or name-pipe communication.

####   WARNING: Cluster-Agent should be deployed in high availability mode     ####

The Cluster-Agent should be in high availability mode because the following features
are enabled:
* Admission Controller
* External Metrics Provider

To run in high availability mode, our recommandation is to update the chart
configuration with:
* set `clusterAgent.replicas` value to `2` replicas .
* set `clusterAgent.createPodDisruptionBudget` to `true`.

GH Actions Runner

Much the same, let me add the GH Actions Runner charts

Create a namespace, then apply the secret

$ kubectl create ns actions-runner-system
namespace/actions-runner-system created
$ kubectl apply -f cm.ars.secret.yaml -n actions-runner-system
secret/controller-manager created

Now install

$ helm upgrade --install --namespace actions-runner-system --wait actions-runner-controller actions-runner-controller/actions-runner-controller --set authSecret.name=controller-manager
Release "actions-runner-controller" does not exist. Installing it now.
NAME: actions-runner-controller
LAST DEPLOYED: Tue Jul 26 06:51:04 2022
NAMESPACE: actions-runner-system
STATUS: deployed
1. Get the application URL by running these commands:
  export POD_NAME=$(kubectl get pods --namespace actions-runner-system -l "app.kubernetes.io/name=actions-runner-controller,app.kubernetes.io/instance=actions-runner-controller" -o jsonpath="{.items[0].metadata.name}")
  export CONTAINER_PORT=$(kubectl get pod --namespace actions-runner-system $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}")
  echo "Visit to use your application"
  kubectl --namespace actions-runner-system port-forward $POD_NAME 8080:$CONTAINER_PORT

My next step is to build and push the Dockerfile, which is our base image

builder@DESKTOP-QADGF36:~/Workspaces/jekyll-blog/ghRunnerImage$ cat Dockerfile
FROM summerwind/actions-runner:latest

RUN sudo apt update -y \
  && umask 0002 \
  && sudo apt install -y ca-certificates curl apt-transport-https lsb-release gnupg

# Install MS Key
RUN curl -sL https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/microsoft.gpg > /dev/null

# Add MS Apt repo
RUN umask 0002 && echo "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ focal main" | sudo tee /etc/apt/sources.list.d/azure-cli.list

# Install Azure CLI
RUN sudo apt update -y \
  && umask 0002 \
  && sudo apt install -y azure-cli awscli ruby-full

RUN sudo chown runner /usr/local/bin

RUN sudo chmod 777 /var/lib/gems/2.7.0

RUN sudo chown runner /var/lib/gems/2.7.0

# Install Expect and SSHPass

RUN sudo apt update -y \
  && umask 0002 \
  && sudo apt install -y sshpass expect

# save time per build
RUN umask 0002 \
  && gem install jekyll bundler

RUN sudo rm -rf /var/lib/apt/lists/*


I’ll first create the private Harbor project


I can see I’m admin there


But I’ll likely want to create a docker puller user

The password will be in that base64 string (so I needn’t modify anything else)

builder@DESKTOP-QADGF36:~$ echo eyJhdXRocyI6eyJoYXJib3IuZnJlc2hicmV3ZWQuc2NpZW5jZSI6eyJ1c2VybmFtZSI6ImltYWdlcHVsbGVyIiwicGFzc3dvcmQiOiJub3RoZXJlYWxwYXNzd29yZCIsImVtYWlsIjoiaXNhYWMuam9obnNvbkBnbWFpbC5jb20iLCJhdXRoIjoiYVdhc2RmYXNkZmFzZGZhc2ZkZENFPSJ9fX0= | base64 --decode
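To pull a single field out of that decoded JSON without reading it by eye, something like the following works for the compact, single-line form shown above. It is a sketch: `dockerconfig_field` is a hypothetical helper, and the `sed` pattern assumes no whitespace around the colons.

```shell
#!/bin/sh
# dockerconfig_field: hypothetical helper.
#   $1 = base64-encoded .dockerconfigjson value
#   $2 = field to print (for example username or password)
# Assumes compact JSON ("key":"value" with no spaces).
dockerconfig_field() {
  printf '%s' "$1" | base64 --decode | sed -n "s/.*\"$2\":\"\([^\"]*\)\".*/\1/p"
}

# Example:
#   dockerconfig_field "$ENCODED" password
```

If `jq` is installed, parsing the decoded JSON with it is the more robust choice.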

Then I create the new user


And add to the project


Unlike before, where we just created the secret manually

kubectl create secret docker-registry myharborreg --docker-server=harbor.freshbrewed.science --docker-username=imagepuller --docker-password=adsfasdfasdfsadf --docker-email=isaac.johnson@gmail.com

we can instead just copy over the harbor reg cred as we are keeping the same password

I’ll first build the Dockerfile

builder@DESKTOP-QADGF36:~/Workspaces/jekyll-blog/ghRunnerImage$ docker build -t harbor.freshbrewed.science/freshbrewedprivate/myghrunner:1.1.13 .
Ahh! The old “entity too large” error. The forever-changing annotations of Nginx.

We can see the “new” Ingress looks like this:

builder@DESKTOP-QADGF36:~$ kubectl get ingress harbor-registry-ingress -o yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
    cert-manager.io/cluster-issuer: letsencrypt-production
    ingress.kubernetes.io/proxy-body-size: "0"
    ingress.kubernetes.io/ssl-redirect: "true"
    meta.helm.sh/release-name: harbor-registry
    meta.helm.sh/release-namespace: default
    nginx.ingress.kubernetes.io/proxy-body-size: "0"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  creationTimestamp: "2022-07-26T02:48:40Z"
  generation: 1
    app: harbor
    app.kubernetes.io/managed-by: Helm
    chart: harbor
    heritage: Helm
    release: harbor-registry
  name: harbor-registry-ingress
  namespace: default
  resourceVersion: "33018"
  uid: 04392320-159c-4c39-9cf2-2fb8b388fa29

and the former (working) as

apiVersion: networking.k8s.io/v1
kind: Ingress
    cert-manager.io/cluster-issuer: letsencrypt-production
    ingress.kubernetes.io/proxy-body-size: "0"
    ingress.kubernetes.io/ssl-redirect: "true"
    kubectl.kubernetes.io/last-applied-configuration: |
    meta.helm.sh/release-name: harbor-registry
    meta.helm.sh/release-namespace: default
    nginx.ingress.kubernetes.io/proxy-body-size: "0"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.org/client-max-body-size: "0"
    nginx.org/proxy-connect-timeout: "600"
    nginx.org/proxy-read-timeout: "600"
  creationTimestamp: "2022-06-13T00:35:26Z"
  generation: 2
    app: harbor
    app.kubernetes.io/managed-by: Helm
    chart: harbor
    heritage: Helm
    release: harbor-registry
  name: harbor-registry-ingress
  namespace: default
  resourceVersion: "16992267"
  uid: f16adda2-1a47-4486-af4d-c8ab7d9c75c3

I’ll add the missing values

builder@DESKTOP-QADGF36:~$ kubectl get ingress harbor-registry-ingress -o yaml > harbor-registry-ingress.yaml
builder@DESKTOP-QADGF36:~$ kubectl get ingress harbor-registry-ingress -o yaml > harbor-registry-ingress.yaml.bak
builder@DESKTOP-QADGF36:~$ vi harbor-registry-ingress.yaml
builder@DESKTOP-QADGF36:~$ diff harbor-registry-ingress.yaml harbor-registry-ingress.yaml.bak
<     nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
<     nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
<     nginx.org/client-max-body-size: "0"
<     nginx.org/proxy-connect-timeout: "600"
<     nginx.org/proxy-read-timeout: "600"

builder@DESKTOP-QADGF36:~$ kubectl apply -f harbor-registry-ingress.yaml
Warning: resource ingresses/harbor-registry-ingress is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
ingress.networking.k8s.io/harbor-registry-ingress configured
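The same five annotations from the diff could also be added without the export/edit/apply round trip, using `kubectl annotate`. A sketch only; `add_upload_annotations` and the `KUBECTL` override are hypothetical names, and the values are the ones from the diff above.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example KUBECTL=echo for a dry run).
: "${KUBECTL:=kubectl}"

# add_upload_annotations: hypothetical helper; $1 is the Ingress name.
add_upload_annotations() {
  "$KUBECTL" annotate ingress "$1" --overwrite \
    nginx.ingress.kubernetes.io/proxy-read-timeout=600 \
    nginx.ingress.kubernetes.io/proxy-send-timeout=600 \
    nginx.org/client-max-body-size=0 \
    nginx.org/proxy-connect-timeout=600 \
    nginx.org/proxy-read-timeout=600
}

# Example:
#   add_upload_annotations harbor-registry-ingress
```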

And now the push works

We can now apply the runner deployment

builder@DESKTOP-QADGF36:~/Workspaces/jekyll-blog/ghRunnerImage$ cat newRunnerDeployment.yml
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
  name: new-jekyllrunner-deployment
  namespace: default
  replicas: 1
  selector: null
    metadata: {}
      dockerEnabled: true
      dockerdContainerResources: {}
      - name: AWS_DEFAULT_REGION
        value: us-east-1
      - name: AWS_ACCESS_KEY_ID
            key: USER_NAME
            name: awsjekyll
            key: PASSWORD
            name: awsjekyll
      - name: DATADOG_API_KEY
            key: DDAPIKEY
            name: ddjekyll
      image: harbor.freshbrewed.science/freshbrewedprivate/myghrunner:1.1.13
      imagePullPolicy: IfNotPresent
      - name: myharborreg
      #- name: alicloud
      - new-jekyllrunner-deployment
      repository: idjohnson/jekyll-blog
      resources: {}

$ kubectl apply -f newRunnerDeployment.yml
runnerdeployment.actions.summerwind.dev/new-jekyllrunner-deployment created

A reminder, the GH token used came from cm.ars.secret.yaml. So if it has expired, that is where to update

We can see it is working by looking at the running pods


and then checking the active runners in GH



We already have the latest binary from our recent blog so we can just re-install it

$ dapr -v
$ dapr init -k
⌛  Making the jump to hyperspace...
ℹ️  Note: To install Dapr using Helm, see here: https://docs.dapr.io/getting-started/install-dapr-kubernetes/#install-with-helm-advanced

ℹ️  Container images will be pulled from Docker Hub
✅  Deploying the Dapr control plane to your cluster...
✅  Success! Dapr has been installed to namespace dapr-system. To verify, run `dapr status -k' in your terminal. To get started, go here: https://aka.ms/dapr-getting-started


As with Dapr, we have the latest binary, so we can just install loft.sh

builder@DESKTOP-QADGF36:~/Workspaces/jekyll-blog/ghRunnerImage$ loft -v
loft version 2.2.0
builder@DESKTOP-QADGF36:~/Workspaces/jekyll-blog/ghRunnerImage$ loft start --host=loft.freshbrewed.science
[warn]   There is a newer version of Loft: v2.2.1. Run `loft upgrade` to upgrade to the newest version.

? Seems like you try to use 'loft start' with a different kubernetes context than before. Please choose which kubernetes context you want to use

[info]   Welcome to Loft!
[info]   This installer will help you configure and deploy Loft.

? Enter your email address to create the login for your admin user isaac.johnson@gmail.com

[info]   Executing command: helm upgrade loft loft --install --reuse-values --create-namespace --repository-config='' --kube-context oldmaccluster --namespace loft --repo https://charts.loft.sh/ --set admin.email=isaac.johnson@gmail.com --set admin.password=4d5140b3-asdf-asdf-asdf-asdfasdf --set ingress.enabled=true --set ingress.host=loft.freshbrewed.science --reuse-values

[done] √ Loft has been deployed to your cluster!
[done] √ Loft pod successfully started

? Unable to reach Loft at https://loft.freshbrewed.science. Do you want to start port-forwarding instead?
 No, please re-run the DNS check

###################################     DNS CONFIGURATION REQUIRED     ##################################

Create a DNS A-record for loft.freshbrewed.science with the EXTERNAL-IP of your nginx-ingress controller.
To find this EXTERNAL-IP, run the following command and look at the output:

> kubectl get services -n ingress-nginx
NAME                       TYPE           CLUSTER-IP | EXTERNAL-IP   |  PORT(S)                      AGE
ingress-nginx-controller   LoadBalancer | XX.XXX.XXX.XX |  80:30984/TCP,443:31758/TCP   19m

EXTERNAL-IP may be 'pending' for a while until your cloud provider has created a new load balancer.


The command will wait until loft is reachable under the host. You can also abort and use port-forwarding instead
by running 'loft start' again.

[done] √ Loft is reachable at https://loft.freshbrewed.science

##########################   LOGIN   ############################

Username: admin
Password: 4d5140b3-asdf-asdf-asdf-asdfasdfasdfasdf  # Change via UI or via: loft reset password

Login via UI:  https://loft.freshbrewed.science
Login via CLI: loft login --insecure https://loft.freshbrewed.science

!!! You must accept the untrusted certificate in your browser !!!

Follow this guide to add a valid certificate: https://loft.sh/docs/administration/ssl


Loft was successfully installed and can now be reached at: https://loft.freshbrewed.science

I did reuse the Ingress definition from the old cluster. Namely, it sets the ingress class and the cert provider:

$ kubectl get ingress -n loft -o yaml
apiVersion: v1
items:
- apiVersion: networking.k8s.io/v1
  kind: Ingress
  metadata:
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod
      ingress.kubernetes.io/proxy-body-size: "0"
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"networking.k8s.io/v1","kind":"Ingress","metadata":{"annotations":{"cert-manager.io/cluster-issuer":"letsencrypt-prod","ingress.kubernetes.io/proxy-body-size":"0","meta.helm.sh/release-name":"loft","meta.helm.sh/release-namespace":"loft","nginx.ingress.kubernetes.io/proxy-body-size":"0","nginx.ingress.kubernetes.io/proxy-buffer-size":"32k","nginx.ingress.kubernetes.io/proxy-buffers-number":"8 32k","nginx.ingress.kubernetes.io/proxy-read-timeout":"43200","nginx.ingress.kubernetes.io/proxy-send-timeout":"43200","nginx.org/websocket-services":"loft"},"creationTimestamp":"2022-06-21T17:02:53Z","generation":1,"labels":{"app":"loft","app.kubernetes.io/managed-by":"Helm","chart":"loft-2.2.0","heritage":"Helm","release":"loft"},"name":"loft-ingress","namespace":"loft","resourceVersion":"1583518","uid":"87125ded-906a-4e03-8458-eb6207258fe8"},"spec":{"ingressClassName":"nginx","rules":[{"host":"loft.freshbrewed.science","http":{"paths":[{"backend":{"service":{"name":"loft","port":{"number":80}}},"path":"/","pathType":"ImplementationSpecific"}]}}],"tls":[{"hosts":["loft.freshbrewed.science"],"secretName":"tls-loft"}]},"status":{"loadBalancer":{"ingress":[{"ip":""}]}}}
      meta.helm.sh/release-name: loft
      meta.helm.sh/release-namespace: loft
      nginx.ingress.kubernetes.io/proxy-body-size: "0"
      nginx.ingress.kubernetes.io/proxy-buffer-size: 32k
      nginx.ingress.kubernetes.io/proxy-buffers-number: 8 32k
      nginx.ingress.kubernetes.io/proxy-read-timeout: "43200"
      nginx.ingress.kubernetes.io/proxy-send-timeout: "43200"
      nginx.org/websocket-services: loft
    creationTimestamp: "2022-06-21T18:04:46Z"
    generation: 1
    labels:
      app: loft
      app.kubernetes.io/managed-by: Helm
      chart: loft-2.2.0
      heritage: Helm
      release: loft
    name: loft-ingress
    namespace: loft
    resourceVersion: "16992274"
    uid: 13b35ddf-9bf8-4313-8650-d062cfb508eb
  spec:
    ingressClassName: nginx
    rules:
    - host: loft.freshbrewed.science
      http:
        paths:
        - backend:
            service:
              name: loft
              port:
                number: 80
          path: /
          pathType: ImplementationSpecific
    tls:
    - hosts:
      - loft.freshbrewed.science
      secretName: tls-loft
  status:
    loadBalancer: {}
kind: List
metadata:
  resourceVersion: ""

Recreating the Original Cluster

Now that we have a running primary cluster (mac77), we can switch to it and check the node IPs.

builder@DESKTOP-QADGF36:~$ kubectx mac77
# Show IPs
$ kubectl get nodes -o yaml | grep 192 | grep internal-ip

Additionally, at this point, the former cluster is dead.

This means that for the mac81 cluster, which I’ll spin up fresh, I can also use a now-unused node from the old cluster, builder-macbookpro2.

I plan to try using MetalLB again with this cluster. It’s also a good time to update the OS.


builder@anna-MacBookAir:~$ uptime
 09:15:51 up 4 min,  2 users,  load average: 5.37, 5.54, 2.46

It seems the reboot proved k3s was still running. I went through many steps to remove it, until I finally rebooted and there was no k3s running.

builder@anna-MacBookAir:~$ ps -ef | grep k3s
builder     1664    1628  0 09:37 pts/0    00:00:00 grep --color=auto k3s

Now install on Primary

Now I’ll install the latest (again, this will be the “test” cluster)

builder@anna-MacBookAir:~$ curl -sfL https://get.k3s.io  | INSTALL_K3S_CHANNEL=latest K3S_KUBECONFIG_MODE="644" INSTALL_K3S_EXEC="--tls-san" sh -
[sudo] password for builder:
[INFO]  Finding release for channel latest
[INFO]  Using v1.24.4+k3s1 as release
[INFO]  Downloading hash https://github.com/k3s-io/k3s/releases/download/v1.24.4+k3s1/sha256sum-amd64.txt
[INFO]  Downloading binary https://github.com/k3s-io/k3s/releases/download/v1.24.4+k3s1/k3s
[INFO]  Verifying binary download
[INFO]  Installing k3s to /usr/local/bin/k3s
[INFO]  Skipping installation of SELinux RPM
[INFO]  Creating /usr/local/bin/kubectl symlink to k3s
[INFO]  Creating /usr/local/bin/crictl symlink to k3s
[INFO]  Creating /usr/local/bin/ctr symlink to k3s
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s.service
[INFO]  systemd: Enabling k3s unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s.service → /etc/systemd/system/k3s.service.
[INFO]  systemd: Starting k3s

I need the token to allow other hosts to join

builder@anna-MacBookAir:~$ sudo cat /var/lib/rancher/k3s/server/node-token

We now have a running “test” cluster on mac81

Lastly, to get the new kubeconfig

$ sudo cat /etc/rancher/k3s/k3s.yaml | base64 -w 0

On my workstation, I can then decode it

echo YXBpVmVyc2lvbjogdjEKY2x1c3RlcnM6Ci0gY2x1c3RlcjoKICAgIGNlcnRpZmljYXRlLWF1dGh... | base64 --decode | sed 's/' | sed 's/default/mac81/g' > ~/.kube/mac81-int

I then merged it with the main kubeconfig. We can now test it.

One thing I’ve started doing is saving my kubeconfig in Azure Key Vault (AKV) for safekeeping

builder@DESKTOP-QADGF36:~$ az keyvault secret set -n k3sremoteconfig --vault-name idjakv --file ~/.kube/config

Cert Manager

We’ll try keeping Traefik and now add the latest Cert-Manager

Then I’ll add the Secret and Cluster Issuer

$ kubectl apply -f mac77.secret.prod-route53-credentials-secret.yaml
secret/prod-route53-credentials-secret created
$ kubectl apply -f mac77.clusterissuer.yaml
clusterissuer.cert-manager.io/letsencrypt-staging created
clusterissuer.cert-manager.io/letsencrypt-prod-old created
clusterissuer.cert-manager.io/letsencrypt-prod created

Getting the Kubeconfig quickly

I’ve had to respin clusters so often, I find the process of fetching a new kubeconfig rather tiresome.

First, on your known master, set up passwordless login.

Next, since I don’t need to base64 it to bring it back, I can just cat it over SSH and fix the address to match the host

$ ssh builder@ 'cat /etc/rancher/k3s/k3s.yaml' | sed 's/' > ~/.kube/mac81-int
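As a minimal sketch of that address fix: a stock k3s.yaml points at the loopback address (https://127.0.0.1:6443), so the only change needed is to swap in an address the workstation can reach. The function name and the 192.0.2.10 address in the usage comment are placeholders of mine, not values from my cluster.

```shell
# Rewrite the loopback server address that k3s writes into k3s.yaml
# so the kubeconfig works from another machine.
# Reads a kubeconfig on stdin, writes the rewritten one to stdout.
# $1 is a caller-supplied host or IP (hypothetical; it must not contain '#' or '&').
rewrite_k3s_server() {
  sed "s#https://127\.0\.0\.1:#https://$1:#"
}

# Example (not executed here; host and output path are placeholders):
#   ssh builder@192.0.2.10 'cat /etc/rancher/k3s/k3s.yaml' \
#     | rewrite_k3s_server 192.0.2.10 > ~/.kube/mac81-int
```

Only the host portion changes; the port and the embedded certificates pass through untouched.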

Merging the files this way should work, but doesn’t for me: https://www.oueta.com/kubernetes/using-multiple-kubeconfig-files-and-how-to-merge-to-a-single/

Bring it together in Ansible

We now flash forward a month.

I’ve been using AWX in Kubernetes quite successfully. Occasionally the postgres backend loses its mind and requires a thumb to come back, but it’s temporary and the data is all there so I haven’t given it much thought.

It was during this time that I developed a comprehensive playbook to handle this. You can find this, and others, on my public playbooks repo.

We’ll first cover the scrub (needed just once) and the reload (can be used any time after).

Scrub K3s

This first playbook we’ll discuss is scrubek3s.

- name: Scrub K3s
  hosts: all

  tasks:
  - name: Uninstall previous k3s
    ansible.builtin.shell: |
      /usr/local/bin/k3s-uninstall.sh &
      /usr/local/bin/k3s-agent-uninstall.sh &
      sleep 10
    become: true
    ignore_errors: True
    args:
      chdir: /tmp

  - name: reboot
    ansible.builtin.shell: |
      reboot now
    become: true
    ignore_errors: True
    args:
      chdir: /tmp

This is pretty straightforward. It just heads out to the hosts and uninstalls things, then reboots. I found that on some hosts the uninstall would hang due to stuck PVC mounts or containerd containers that just wouldn’t die. Rather than loop through kill -9 a lot, I found the easier solution was to uninstall, wait about 10 seconds, then force a reboot. This really cleared up any stuck systems.

Also, if you have been in a mixed environment, you.. okay… I, tend to forget which were primary and which were secondary hosts. This just nukes ‘em all.

K3s Reload

The Playbook I’ve been more active in updating is my K3S reload.

Just today I worked out a Perl script that completes the last mile (and what led me to decide to finally post this article).

The reloadk3s.yaml playbook has 3 phases.

The first, which applies to all hosts, is to set up any required dependencies. This assumes Ubuntu (or another Linux with the apt package manager) and will add Python, GCSFuse, the AWS CLI and s3fs, and lastly the CIFS and iSCSI utilities needed for network-storage PVCs.

- name: Reload K3s
  hosts: all

  tasks:
  - name: Install Python3
    ansible.builtin.shell: |
      apt-get install -y python3 python3-venv python3-pip
    become: true
    args:
      chdir: /tmp

  - name: Install Cifs and iscsi
    ansible.builtin.shell: |
      apt-get install -y cifs-utils open-iscsi
    become: true
    args:
      chdir: /tmp

  - name: Install GCP Fuse
    ansible.builtin.shell: |
      export GCSFUSE_REPO=gcsfuse-`lsb_release -c -s`
      rm -f /etc/apt/sources.list.d/gcsfuse.list || true
      echo "deb http://packages.cloud.google.com/apt $GCSFUSE_REPO main" | sudo tee /etc/apt/sources.list.d/gcsfuse.list
      curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
      apt-get update
      apt-get install -y gcsfuse
    become: true
    args:
      chdir: /tmp

  - name: Add AWS CLI
    ansible.builtin.shell: |
      apt-get install -y awscli s3fs
    become: true
    args:
      chdir: /tmp

The next phase updates just the Primary host, which will serve as the k3s main. You’ll note that I pin it to a version (1.23.10). You’ll also note that for just that release, the K3s team fat-fingered the release label so there is a funny “%2B”, which is the URL-encoded (percent-escaped) form of “+”. Any other release is usually vx.xx.x+k3s1.

I’ll be using that specific version with the INSTALL_K3S_VERSION parameter here and in the last phase where we add agents.
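To show where that “%2B” comes from, here is a tiny sketch that percent-encodes the plus sign in a version label. The function name is made up for illustration; the only fact it relies on is that “+” becomes “%2B” when URL-encoded.

```shell
# Percent-encode the '+' in a k3s version label (e.g. for use in a URL).
# Only '+' is handled here; this is not a general URL encoder.
encode_k3s_version() {
  printf '%s\n' "$1" | sed 's/+/%2B/g'
}

encode_k3s_version 'v1.23.10+k3s1'
```

Labels without a plus sign pass through unchanged.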

While I could mask it, I’m not going to since I’ll just have the playbooks updated anyhow. You’ll see that my mac81 cluster serves traffic on a local address as well as on an external one.

This is important. It’s possible, but a real pain in the rump, to try and fix the TLS certs later. Adding in that extra SAN at creation with INSTALL_K3S_EXEC="--tls-san" will save you headaches later. This is also true if you expose your K3s externally via NAT to other hosts (e.g. VirtualBox, Hyper-V etc). To add multiple, I believe you just add more in the EXEC (e.g. INSTALL_K3S_EXEC="--tls-san --tls-san").
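If you have several names or addresses to cover, building the flag string by hand gets error-prone. Below is a small sketch that repeats --tls-san once per value. It assumes, as I do above, that k3s accepts the flag multiple times. The helper name, 192.0.2.10 and k3s.example.com are placeholders, not my real values.

```shell
# Build a "--tls-san A --tls-san B ..." string from the arguments given.
# Each argument is one extra Subject Alternative Name for the K3s API cert.
tls_san_args() {
  out=""
  for san in "$@"; do
    # add a separating space only when something is already in $out
    out="${out:+$out }--tls-san $san"
  done
  printf '%s\n' "$out"
}

# Example (placeholders): the result would go into INSTALL_K3S_EXEC
tls_san_args 192.0.2.10 k3s.example.com
```

The output can then be passed along as INSTALL_K3S_EXEC="$(tls_san_args ...)" on the install line.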

- name: Update Primary
  hosts: AnnaMacbook

  tasks:
  - name: Uninstall previous k3s
    ansible.builtin.shell: |
      /usr/local/bin/k3s-uninstall.sh || true
      /usr/local/bin/k3s-agent-uninstall.sh || true
    become: true
    args:
      chdir: /tmp
  - name: Install K3s
    ansible.builtin.shell: |
      curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.23.10%2Bk3s1" K3S_KUBECONFIG_MODE="644" INSTALL_K3S_EXEC="--tls-san" sh -
    become: true
    args:
      chdir: /tmp
  - name: Test host
    ansible.builtin.shell: |
      cat /var/lib/rancher/k3s/server/node-token | tr -d '\n'
    register: nodetokenresult
    become: true
    args:
      chdir: /tmp

  - name: Output Kubeconfig
    ansible.builtin.shell: |
      cat /etc/rancher/k3s/k3s.yaml | sed 's/' | base64 -w 0
    register: kubeconfig
    become: true
    args:
      chdir: /tmp
  - name: Clean prior kubeconfigs
    ansible.builtin.shell: |
      rm -f /tmp/mac81-* || true
    become: true
    args:
      chdir: /tmp

  - name: Prep Config
    ansible.builtin.shell: |
      cat /etc/rancher/k3s/k3s.yaml | sed 's/[0-9]*/' > /tmp/mac81-ext
      cat /etc/rancher/k3s/k3s.yaml | sed 's/' > /tmp/mac81-int
    register: kubeconfig
    args:
      chdir: /tmp
  - name: Set Config in Azure KV
    ansible.builtin.shell: |
      az keyvault secret set --vault-name idjakv --name mac81-int --file /tmp/mac81-int || true
      az keyvault secret set --vault-name idjakv --name mac81-ext --file /tmp/mac81-ext || true
    register: kubeconfig
    become: true
    args:
      chdir: /tmp

  - name: copy perl script
    ansible.builtin.copy:
      src: updateKConfigs.pl
      dest: /tmp/updateKConfigs.pl
      owner: builder
      mode: '0755'

  - name: Update Combined
    ansible.builtin.shell: |
      az keyvault secret show --vault-name idjakv --name k3sremoteconfig | jq -r .value > /tmp/existing.yaml
      perl /tmp/updateKConfigs.pl /tmp/existing.yaml /tmp/mac81-int /tmp/updated.yaml
      az keyvault secret set --vault-name idjakv --name k3sremoteconfig --file /tmp/updated.yaml || true
    register: kubeconfigall
    become: true
    args:
      chdir: /tmp

We should address those last three blocks as well.

At the tail end of the Primary host update, I stash the “internal” kubeconfig as mac81-int in my Azure Key Vault. This does assume that, at some point, I’ve gone on that host, installed the Azure CLI, and logged in as myself.

At a future point, adding in the Azure CLI install and automated login will be part of this script. For now, should I log out, Ansible will stop there and I’ll have to go fix that.


The last two parts leverage a quick Perl script (I had to dust off those skills).

The updateKConfigs.pl was my solution to the fact that my larger combined Kubeconfig that has auth for local Docker and other cloud-based Kubernetes is a real pain to update.

Up till today, I would complete this work, then download the ‘int’ config, open my ~/.kube/config, then replace the key bits: certificate-authority-data, client-certificate-data and client-key-data.

Since I blast this ‘mac81’ cluster fairly often, it’s a real hassle. And having that combined kubeconfig with kubectx to switch is really really handy.

Thus, this script will take in the

  1. existing (downloaded) combined kubeconfig
  2. the new “internal” config
  3. the output file to write.

For anyone else to use it, there are just a couple regexp lines you would need to update to address your own cluster. I had tried for some time to use just straight bash, but some lines come after a match (client cert and key data) and some before (cert auth data).

You can also run this on itself (e.g. perl updateKConfigs.pl /home/you/.kube/config /tmp/my-new-kubeconfig.yaml /home/you/.kube/config)


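my ($combined,$newint,$output) = @ARGV;

# read the whole combined kubeconfig up front, so the output may be the same file
open(FILEH, '<', $combined) or die "Cannot read $combined: $!";
@filec = <FILEH>;
close(FILEH);

# pull the three new credential values (each keeps its trailing newline)
my $newcad=`cat $newint | grep 'certificate-authority-data' | sed 's/^.*: //'`;
my $clientcertdata=`cat $newint | grep 'client-certificate-data' | sed 's/^.*: //'`;
my $clientkeydata=`cat $newint | grep 'client-key-data' | sed 's/^.*: //'`;

for (my $i = 0; $i < scalar(@filec); $i += 1)
{
	#	print "$i\n";

	#  certificate-authority-data (sits on the line BEFORE the matching server line)
	if (($filec[$i] =~ /server: https:\/\/||($filec[$i] =~ /server: https:\/\/
		#print $filec[($i - 1)];
		$filec[($i - 1)] =~ s/^(.*)data: .*/${1}data: /;
		chomp($filec[($i - 1)]);
		#	print $filec[($i - 1)] . $newcad;
	        $filec[($i - 1)] .= $newcad;
	}
	# client cert and key data (sit two and three lines AFTER the user name)
	if ($filec[$i] =~ /^- name: mac81/) {
	    $filec[$i+2] =~ s/^(.*)data: .*/${1}data: /;
	    chomp($filec[$i+2]);
	    $filec[$i+2] .= $clientcertdata;
	    $filec[$i+3] =~ s/^(.*)data: .*/${1}data: /;
	    chomp($filec[$i+3]);
	    $filec[$i+3] .= $clientkeydata;
	}
}

# print updated file
open(FILEO, '>', $output) or die "Cannot write $output: $!";
foreach my $line (@filec)
{
   print FILEO $line;
}
close(FILEO);

exit 0;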
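For anyone who would rather not touch Perl, the same before/after matching can be roughed out with awk from a shell. This is only a sketch under my own assumptions, not the script from my repo: the function names are invented, the server URL and user name are passed in by the caller, and it expects the usual kubeconfig layout where the CA data sits on the line just before the server line and users start with "- name:".

```shell
# Replace the CA data on the line directly BEFORE a given server line.
# $1 = server URL to look for (caller-supplied), $2 = new base64 CA data.
patch_cluster_ca() {
  awk -v server="$1" -v ca="$2" '
    NR > 1 {
      # hold each line back by one so the previous line can still be edited
      if (index($0, "server: " server) && prev ~ /certificate-authority-data:/)
        sub(/data: .*/, "data: " ca, prev)
      print prev
    }
    { prev = $0 }
    END { if (NR > 0) print prev }
  '
}

# Replace client cert and key data AFTER the matching "- name: USER" line.
# $1 = user name, $2 = new client cert data, $3 = new client key data.
patch_user_creds() {
  awk -v user="$1" -v cert="$2" -v key="$3" '
    /^- name: / { in_user = ($0 == ("- name: " user)) }
    in_user && /client-certificate-data:/ { sub(/data: .*/, "data: " cert) }
    in_user && /client-key-data:/         { sub(/data: .*/, "data: " key) }
    { print }
  '
}

# Example (not executed here; file names and values are placeholders):
#   patch_cluster_ca 'https://192.0.2.10:6443' "$NEW_CA" < existing.yaml \
#     | patch_user_creds mac81 "$NEW_CERT" "$NEW_KEY" > updated.yaml
```

Because both functions read stdin and write stdout, write to a separate output file rather than redirecting back onto the input.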

So running it in practice now takes just over 2 minutes total.


And pulling a fresh ‘combined’ kubeconfig lets me use kubectx just fine

builder@DESKTOP-QADGF36:~/Workspaces/ansible-playbooks$ az keyvault secret show --vault-name idjakv --name k3sremoteconfig | jq -r .value > ~/.kube/config
builder@DESKTOP-QADGF36:~/Workspaces/ansible-playbooks$ kubectx mac81
Switched to context "mac81".
builder@DESKTOP-QADGF36:~/Workspaces/ansible-playbooks$ kubectl get nodes
NAME                  STATUS   ROLES                  AGE   VERSION
anna-macbookair       Ready    control-plane,master   24m   v1.23.10+k3s1
builder-macbookpro2   Ready    <none>                 23m   v1.23.10+k3s1
isaac-macbookpro      Ready    <none>                 23m   v1.23.10+k3s1
builder@DESKTOP-QADGF36:~/Workspaces/ansible-playbooks$ kubectx mac77
Switched to context "mac77".
builder@DESKTOP-QADGF36:~/Workspaces/ansible-playbooks$ kubectl get nodes
NAME                          STATUS   ROLES                  AGE   VERSION
hp-hp-elitebook-850-g2        Ready    <none>                 88d   v1.23.9+k3s1
builder-hp-elitebook-850-g1   Ready    <none>                 88d   v1.23.9+k3s1
isaac-macbookair              Ready    control-plane,master   88d   v1.23.9+k3s1
builder-hp-elitebook-850-g2   Ready    <none>                 80d   v1.23.9+k3s1

The Template in AWX takes no parameters so it’s pretty simple to add



We covered backing up old data, setting up NFS, and adding an Ingress Controller. We loaded in the Cert Manager and tested with the Azure Vote App. We configured Route53 as a ClusterIssuer and added an NFS StorageClass.

We added some key applications including Redis and Harbor as our Container Registry. We added a few Worker Nodes then added Datadog as our Kubernetes monitoring suite. We added Github Runners including rebuilding and publishing our own Github runner container. We added Dapr.io and Loft.sh and setup a second test cluster.

Finally, we used Ansible on AWX to automate the full rebuilds of k3s and to save a combined kubeconfig to Azure Key Vault.


