From ea0f59bcbd02a340432549ed983468fc450bc42c Mon Sep 17 00:00:00 2001 From: Paul Buetow Date: Mon, 30 Mar 2026 22:40:07 +0300 Subject: Add f3s part 9 draft: GitOps with ArgoCD Replace old part-X draft with rewritten part 9 covering ArgoCD migration, in-cluster git server, sync waves, and updated app inventory (30 apps across 5 namespaces). Co-Authored-By: Claude Opus 4.6 --- .../DRAFT-f3s-kubernetes-with-freebsd-part-9.gmi | 677 +++++++++++++ ...RAFT-f3s-kubernetes-with-freebsd-part-9.gmi.tpl | 569 +++++++++++ ...RAFT-f3s-kubernetes-with-freebsd-part-X.gmi.tpl | 1036 -------------------- .../argocd-app-tree.png | Bin 0 -> 206524 bytes .../argocd-apps-list.png | Bin 0 -> 392311 bytes .../argocd-login.png | Bin 0 -> 771580 bytes 6 files changed, 1246 insertions(+), 1036 deletions(-) create mode 100644 gemfeed/DRAFT-f3s-kubernetes-with-freebsd-part-9.gmi create mode 100644 gemfeed/DRAFT-f3s-kubernetes-with-freebsd-part-9.gmi.tpl delete mode 100644 gemfeed/DRAFT-f3s-kubernetes-with-freebsd-part-X.gmi.tpl create mode 100644 gemfeed/f3s-kubernetes-with-freebsd-part-9/argocd-app-tree.png create mode 100644 gemfeed/f3s-kubernetes-with-freebsd-part-9/argocd-apps-list.png create mode 100644 gemfeed/f3s-kubernetes-with-freebsd-part-9/argocd-login.png diff --git a/gemfeed/DRAFT-f3s-kubernetes-with-freebsd-part-9.gmi b/gemfeed/DRAFT-f3s-kubernetes-with-freebsd-part-9.gmi new file mode 100644 index 00000000..3323bb64 --- /dev/null +++ b/gemfeed/DRAFT-f3s-kubernetes-with-freebsd-part-9.gmi @@ -0,0 +1,677 @@ +# f3s: Kubernetes with FreeBSD - Part 9: GitOps with ArgoCD + +> DRAFT - Not yet published + +This is the 9th post in the f3s series about my self-hosting home lab. f3s? The "f" stands for FreeBSD, and the "3s" stands for k3s, the Kubernetes distribution I use on FreeBSD-based physical machines. 
+ +=> ./2024-11-17-f3s-kubernetes-with-freebsd-part-1.gmi 2024-11-17 f3s: Kubernetes with FreeBSD - Part 1: Setting the stage +=> ./2024-12-03-f3s-kubernetes-with-freebsd-part-2.gmi 2024-12-03 f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation +=> ./2025-02-01-f3s-kubernetes-with-freebsd-part-3.gmi 2025-02-01 f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts +=> ./2025-04-05-f3s-kubernetes-with-freebsd-part-4.gmi 2025-04-05 f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs +=> ./2025-05-11-f3s-kubernetes-with-freebsd-part-5.gmi 2025-05-11 f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network +=> ./2025-07-14-f3s-kubernetes-with-freebsd-part-6.gmi 2025-07-14 f3s: Kubernetes with FreeBSD - Part 6: Storage +=> ./2025-10-02-f3s-kubernetes-with-freebsd-part-7.gmi 2025-10-02 f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments +=> ./2025-12-07-f3s-kubernetes-with-freebsd-part-8.gmi 2025-12-07 f3s: Kubernetes with FreeBSD - Part 8: Observability + +=> ./f3s-kubernetes-with-freebsd-part-1/f3slogo.png f3s logo + +=> ./f3s-kubernetes-with-freebsd-part-9/argocd-app-tree.png ArgoCD Application Resource Tree + +## Table of Contents + +* ⇢ f3s: Kubernetes with FreeBSD - Part 9: GitOps with ArgoCD +* ⇢ ⇢ Introduction +* ⇢ ⇢ GitOps in a Nutshell +* ⇢ ⇢ ArgoCD +* ⇢ ⇢ Why Bother for a Home Lab? 
+* ⇢ ⇢ Deploying ArgoCD +* ⇢ ⇢ ⇢ Accessing ArgoCD +* ⇢ ⇢ In-Cluster Git Server +* ⇢ ⇢ Repository Organization +* ⇢ ⇢ Migrating an App: Miniflux as Example +* ⇢ ⇢ ⇢ Migration Order +* ⇢ ⇢ Complex Migration: Prometheus Multi-Source +* ⇢ ⇢ ⇢ Sync Waves +* ⇢ ⇢ The Result +* ⇢ ⇢ What Changed Day-to-Day +* ⇢ ⇢ Challenges Along the Way +* ⇢ ⇢ ⇢ Helm Release Adoption +* ⇢ ⇢ ⇢ PersistentVolumes +* ⇢ ⇢ ⇢ Secrets +* ⇢ ⇢ ⇢ Grafana Not Reloading +* ⇢ ⇢ ⇢ Prometheus Multi-Source Ordering +* ⇢ ⇢ Future Ideas +* ⇢ ⇢ Lessons Learned +* ⇢ ⇢ Wrapping Up + +## Introduction + +In previous posts, I deployed applications to the k3s cluster using Helm charts and Justfiles--running `just install` or `just upgrade` to push changes to the cluster. That worked, but it had some drawbacks: + +* No single source of truth--cluster state depends on which commands were run and when +* Every change requires manually running commands +* No easy way to tell if the cluster drifted from the desired config +* Rolling back means re-running old Helm commands +* No audit trail for who changed what + +This post covers the migration to GitOps with ArgoCD. After this, the Git repo is the single source of truth, and ArgoCD keeps the cluster in sync automatically. + +## GitOps in a Nutshell + +The idea behind GitOps is simple: describe your entire desired state in Git, and let an agent in the cluster pull that state and reconcile it continuously. Every change goes through a commit, so you get version history, collaboration, and rollback for free. + +For Kubernetes specifically: + +* All manifests, Helm charts, and config live in a Git repo +* ArgoCD watches that repo +* Push a change, ArgoCD applies it +* If someone manually tweaks something in the cluster, ArgoCD detects the drift and reverts it + +## ArgoCD + +ArgoCD is a GitOps continuous delivery tool for Kubernetes. It runs as a controller in the cluster, continuously comparing live state against what's defined in Git. 
+ +=> https://argo-cd.readthedocs.io ArgoCD Documentation + +The features I care about most for f3s: + +* Automatic sync--monitors Git and applies changes to the cluster +* Application CRDs--each app is a Kubernetes custom resource +* Health checks--knows whether an app is healthy or degraded +* Web UI--visual overview of all applications and their sync status +* Sync waves and hooks--control deployment order and run post-deploy jobs +* Multi-source--combine upstream Helm charts with custom manifests + +## Why Bother for a Home Lab? + +Honestly, the biggest reason is disaster recovery. If the cluster dies, I can: + +* Bootstrap a fresh k3s cluster +* Install ArgoCD +* Point it at the Git repo +* Everything deploys automatically + +That's it. No "let me check my shell history to remember how I set this up." + +It's also a great way to learn. Setting up GitOps for real--even on a small cluster--teaches you things you won't pick up from tutorials alone. Debugging sync issues, figuring out sync waves, dealing with secrets management--all stuff that's directly applicable at work too. + +Beyond that: push to Git, things deploy. No SSH'ing to a workstation to run Helm commands. And if I manually tweak something while debugging and forget about it, ArgoCD reverts it back to the desired state. That's happened more than once. + +## Deploying ArgoCD + +ArgoCD manages everything else via GitOps, but ArgoCD itself needs a bootstrap. Chicken-and-egg problem. 
+ +The installation lives in the config repo: + +=> https://codeberg.org/snonux/conf/src/branch/master/f3s/argocd codeberg.org/snonux/conf/f3s/argocd + +I deployed it using Helm via a Justfile: + +```sh +$ cd conf/f3s/argocd +$ just install +helm repo add argo https://argoproj.github.io/argo-helm +helm repo update +kubectl create namespace cicd +kubectl apply -f persistent-volumes.yaml +helm install argocd argo/argo-cd --namespace cicd -f values.yaml +kubectl apply -f ingress.yaml +``` + +A few things worth noting in the `values.yaml`: + +Persistent storage for the repo-server so cloned Git repos survive pod restarts: + +```yaml +repoServer: + volumes: + - name: repo-server-data + persistentVolumeClaim: + claimName: argocd-repo-server-pvc + volumeMounts: + - name: repo-server-data + mountPath: /home/argocd/repo-cache + env: + - name: XDG_CACHE_HOME + value: /home/argocd/repo-cache +``` + +Server runs in insecure mode since TLS is terminated by the OpenBSD edge relays (same pattern as all other f3s services): + +```yaml +server: + insecure: true +configs: + params: + server.insecure: true +``` + +Dex (SSO) and notifications are disabled--overkill for a single-user home lab: + +```yaml +dex: + enabled: false +notifications: + enabled: false +``` + +The admin password is auto-generated on first install and stored in `argocd-initial-admin-secret`. 
It's preserved across Helm upgrades, so no manual secret creation needed: + +```sh +$ just get-password +# Reads from argocd-initial-admin-secret +``` + +### Accessing ArgoCD + +After deployment, ArgoCD runs in the `cicd` namespace: + +```sh +$ kubectl get pods -n cicd +NAME READY STATUS RESTARTS AGE +argocd-application-controller-0 1/1 Running 0 45d +argocd-applicationset-controller-66d6b9b8f4-vhm9k 1/1 Running 0 45d +argocd-redis-77b8d6c6d4-mz9hg 1/1 Running 0 45d +argocd-repo-server-5f98f77b97-8xtcq 1/1 Running 0 45d +argocd-server-6b9c4b4f8d-kxw7p 1/1 Running 0 45d +``` + +=> ./f3s-kubernetes-with-freebsd-part-9/argocd-login.png ArgoCD login page + +The ingress exposes both a WAN and LAN endpoint: + +```yaml +# WAN access (via OpenBSD relayd) +- host: argocd.f3s.foo.zone +# LAN access (via FreeBSD CARP VIP, with TLS) +- host: argocd.f3s.lan.foo.zone +``` + +## In-Cluster Git Server + +I didn't want ArgoCD pulling from Codeberg over the internet every time it checks for changes. If Codeberg is down (or my internet is), the cluster can't reconcile. So I set up a Git server inside the cluster itself. + +=> https://codeberg.org/snonux/conf/src/commit/190473b/f3s/git-server codeberg.org/snonux/conf/f3s/git-server (at 190473b) + +The git-server runs as a single pod in the `cicd` namespace with two containers sharing a PVC: + +* An SSH git server (Alpine + OpenSSH + git-shell) for pushing changes from my laptop +* A CGit web UI with git-http-backend (nginx + fcgiwrap) for browsing repos and HTTP clones + +ArgoCD uses the HTTP backend to clone repos. Most Application manifests point at: + +``` +http://git-server.cicd.svc.cluster.local/conf.git +``` + +For pushing, I use SSH via a NodePort (30022). The git user is locked down to git-shell--no actual shell access. SSH keys are managed through a Kubernetes Secret. + +There's a chicken-and-egg situation here. 
The git-server's own ArgoCD Application manifest points at Codeberg (not at itself), since ArgoCD needs to bootstrap the git-server before it can use it: + +```yaml +# argocd-apps/cicd/git-server.yaml +source: + repoURL: https://codeberg.org/snonux/conf.git + targetRevision: master + path: f3s/git-server/helm-chart +``` + +Once the pod is up, all other apps use the in-cluster URL. The dependency chain is: Codeberg -> git-server -> everything else. + +The repo storage lives on NFS. Initial setup was just cloning the Codeberg repo as a bare repo into the NFS volume, then pointing my laptop's git remote at the NodePort: + +```sh +$ git remote add f3s f3s-git:/repos/conf.git +$ git push f3s master +``` + +ArgoCD detects the change within a few minutes and syncs. No internet required. The whole thing is intentionally minimal--no database, no accounts, no webhooks. Just git over SSH for writes and HTTP for reads. + +## Repository Organization + +I reorganized the config repo to support GitOps. 
Application manifests are grouped by Kubernetes namespace: + +``` +/home/paul/git/conf/f3s/ +├── argocd-apps/ +│ ├── cicd/ # CI/CD tooling (2 apps) +│ │ ├── argo-rollouts.yaml +│ │ └── git-server.yaml +│ ├── infra/ # Infrastructure (4 apps) +│ │ ├── cert-manager.yaml +│ │ ├── pkgrepo.yaml +│ │ ├── registry.yaml +│ │ └── traefik-config.yaml +│ ├── monitoring/ # Observability stack (6 apps) +│ │ ├── alloy.yaml +│ │ ├── grafana-ingress.yaml +│ │ ├── loki.yaml +│ │ ├── prometheus.yaml +│ │ ├── pushgateway.yaml +│ │ └── tempo.yaml +│ ├── services/ # User-facing applications (18 apps) +│ │ ├── anki-sync-server.yaml +│ │ ├── apache.yaml +│ │ ├── audiobookshelf.yaml +│ │ ├── filebrowser.yaml +│ │ ├── immich.yaml +│ │ ├── ipv6test.yaml +│ │ ├── jellyfin.yaml +│ │ ├── keybr.yaml +│ │ ├── kobo-sync-server.yaml +│ │ ├── miniflux.yaml +│ │ ├── navidrome.yaml +│ │ ├── opodsync.yaml +│ │ ├── pihole.yaml +│ │ ├── radicale.yaml +│ │ ├── syncthing.yaml +│ │ ├── tracing-demo.yaml +│ │ ├── wallabag.yaml +│ │ └── webdav.yaml +│ └── test/ # Test/example applications +├── miniflux/ # Per-app directories (unchanged) +│ ├── helm-chart/ +│ │ ├── Chart.yaml +│ │ ├── values.yaml +│ │ └── templates/ +│ └── Justfile +├── prometheus/ +│ ├── manifests/ # Additional manifests for multi-source +│ └── Justfile +└── ... +``` + +The per-app directories (miniflux, prometheus, etc.) stayed mostly the same--ArgoCD points at the same Helm charts. The main additions are the `argocd-apps/` directory structure and `manifests/` subdirectories for complex apps. + +## Migrating an App: Miniflux as Example + +I migrated all apps incrementally, one at a time. The procedure was the same for each. Here's miniflux as a concrete example. 
+ +Before ArgoCD, the Justfile looked like this: + +```makefile +install: + kubectl apply -f helm-chart/persistent-volumes.yaml + helm install miniflux ./helm-chart --namespace services + +upgrade: + helm upgrade miniflux ./helm-chart --namespace services + +uninstall: + helm uninstall miniflux --namespace services +``` + +Workflow: edit chart, run `just upgrade`, hope you didn't forget anything. + +To migrate, I created an Application manifest telling ArgoCD where the Helm chart lives and how to sync it: + +```yaml +apiVersion: argoproj.io/v1alpha1 +kind: Application +metadata: + name: miniflux + namespace: cicd + finalizers: + - resources-finalizer.argocd.argoproj.io +spec: + project: default + source: + repoURL: http://git-server.cicd.svc.cluster.local/conf.git + targetRevision: master + path: f3s/miniflux/helm-chart + destination: + server: https://kubernetes.default.svc + namespace: services + syncPolicy: + automated: + prune: true + selfHeal: true + syncOptions: + - CreateNamespace=false + retry: + limit: 3 + backoff: + duration: 5s + factor: 2 + maxDuration: 1m +``` + +Then applied it: + +```sh +# 1. Apply the Application manifest +$ kubectl apply -f argocd-apps/services/miniflux.yaml +application.argoproj.io/miniflux created + +# 2. Verify ArgoCD adopted the existing resources +$ argocd app get miniflux +Name: miniflux +Sync Status: Synced to master (4e3c216) +Health Status: Healthy + +# 3. Test that the app still works +$ curl -I https://flux.f3s.foo.zone +HTTP/2 200 +``` + +About 10 minutes, zero downtime. ArgoCD recognized that the already-running resources matched the Helm chart in Git and adopted them without re-deploying. 
+ +After the migration, the Justfile turned into utility commands--no more install/upgrade/uninstall: + +```makefile +status: + @kubectl get pods -n services -l app=miniflux-server + @kubectl get pods -n services -l app=miniflux-postgres + @kubectl get application miniflux -n cicd \ + -o jsonpath='Sync: {.status.sync.status}, Health: {.status.health.status}' + +sync: + @kubectl annotate application miniflux -n cicd \ + argocd.argoproj.io/refresh=normal --overwrite + +logs: + kubectl logs -n services -l app=miniflux-server --tail=100 -f + +restart: + kubectl rollout restart -n services deployment/miniflux-server + +port-forward port="8080": + kubectl port-forward -n services svc/miniflux {{port}}:8080 + +psql: + kubectl exec -it -n services deployment/miniflux-postgres -- psql -U miniflux +``` + +New workflow: edit chart, commit, push. ArgoCD picks it up within a few minutes. Run `just sync` if you're impatient. + +### Migration Order + +I started with the simplest services (miniflux, wallabag, radicale, etc.)--apps with straightforward Helm charts and no complex dependencies. This let me validate the pattern before touching anything critical. + +After that: infrastructure apps (registry, cert-manager, pkgrepo, traefik-config), then the monitoring stack (tempo, loki, alloy, and finally prometheus--the most complex one), and last the CI/CD tools (git-server, argo-rollouts). + +## Complex Migration: Prometheus Multi-Source + +Prometheus was the tricky one. It combines an upstream Helm chart with a bunch of custom manifests--recording rules, dashboards, persistent volumes, and a post-sync hook to restart Grafana. 
+ +ArgoCD's multi-source feature handles this cleanly: + +```yaml +apiVersion: argoproj.io/v1alpha1 +kind: Application +metadata: + name: prometheus + namespace: cicd +spec: + sources: + # Source 1: Upstream Helm chart + - repoURL: https://prometheus-community.github.io/helm-charts + chart: kube-prometheus-stack + targetRevision: 55.5.0 + helm: + releaseName: prometheus + valuesObject: + kubeEtcd: + enabled: true + endpoints: + - 192.168.2.120 + - 192.168.2.121 + - 192.168.2.122 + # ... hundreds of lines of config + + # Source 2: Custom manifests from Git + - repoURL: http://git-server.cicd.svc.cluster.local/conf.git + targetRevision: master + path: f3s/prometheus/manifests + + syncPolicy: + automated: + prune: false # Manual pruning--too risky for the monitoring stack + selfHeal: true + syncOptions: + - ServerSideApply=true +``` + +The `prometheus/manifests/` directory has 14 files, each with a sync wave annotation to control deployment order: + +``` +f3s/prometheus/manifests/ +├── persistent-volumes.yaml # Wave 0 +├── grafana-restart-rbac.yaml # Wave 0 +├── additional-scrape-configs-secret.yaml # Wave 1 +├── grafana-datasources-configmap.yaml # Wave 1 +├── freebsd-recording-rules.yaml # Wave 3 +├── openbsd-recording-rules.yaml # Wave 3 +├── zfs-recording-rules.yaml # Wave 3 +├── argocd-application-alerts.yaml # Wave 3 +├── epimetheus-dashboard.yaml # Wave 4 +├── zfs-dashboards.yaml # Wave 4 +├── argocd-applications-dashboard.yaml # Wave 4 +├── node-resources-multi-select-dashboard.yaml # Wave 4 +├── prometheus-nodeport.yaml # Wave 4 +└── grafana-restart-hook.yaml # Wave 10 (PostSync) +``` + +### Sync Waves + +Without sync waves, every resource lands in the default wave 0, and ArgoCD applies them all in one pass, ordered only by kind and name, without waiting for anything to become healthy in between. That's fine for simple apps, but for something like Prometheus it causes failures--a PersistentVolumeClaim can't bind if the PersistentVolume doesn't exist yet, and a PrometheusRule can't be created if the CRD hasn't been registered. + +Sync waves fix this. 
You annotate each resource with a wave number: + +```yaml +annotations: + argocd.argoproj.io/sync-wave: "3" +``` + +ArgoCD deploys all wave 0 resources first, waits until they're healthy, then moves to wave 1, waits again, and so on. Resources without the annotation default to wave 0. + +For the Prometheus stack, the waves look like this: + +* Wave 0: PersistentVolumes, RBAC--infrastructure that everything else depends on +* Wave 1: Secrets, ConfigMaps--config that Prometheus and Grafana need at startup +* Wave 3: PrometheusRule resources--recording rules and alerts for FreeBSD, OpenBSD, ZFS, ArgoCD (the operator from wave 0 needs to be running first) +* Wave 4: Dashboard ConfigMaps and nodeport config +* Wave 10: PostSync hook--a Job that runs after all waves complete + +The PostSync hook is worth explaining. ArgoCD supports lifecycle hooks (`PreSync`, `Sync`, `PostSync`) that run Jobs at specific points. The Grafana restart hook is a good example--it restarts Grafana after every sync so it picks up updated datasources and dashboards: + +```yaml +apiVersion: batch/v1 +kind: Job +metadata: + name: grafana-restart-hook + namespace: monitoring + annotations: + argocd.argoproj.io/hook: PostSync + argocd.argoproj.io/hook-delete-policy: BeforeHookCreation + argocd.argoproj.io/sync-wave: "10" +spec: + template: + spec: + serviceAccountName: grafana-restart-sa + restartPolicy: OnFailure + containers: + - name: kubectl + image: bitnami/kubectl:latest + command: + - /bin/sh + - -c + - | + kubectl wait --for=condition=available --timeout=300s \ + deployment/prometheus-grafana -n monitoring || true + kubectl delete pod -n monitoring \ + -l app.kubernetes.io/name=grafana --ignore-not-found=true + backoffLimit: 2 +``` + +## The Result + +All 30 applications across 5 namespaces, synced and healthy: + +```sh +$ argocd app list +NAME CLUSTER NAMESPACE PROJECT STATUS HEALTH SYNCPOLICY +alloy https://kubernetes.default.svc monitoring default Synced Healthy Auto-Prune +anki-sync-server 
https://kubernetes.default.svc services default Synced Healthy Auto-Prune +apache https://kubernetes.default.svc services default Synced Healthy Auto-Prune +argo-rollouts https://kubernetes.default.svc cicd default Synced Healthy Auto-Prune +audiobookshelf https://kubernetes.default.svc services default Synced Healthy Auto-Prune +cert-manager https://kubernetes.default.svc infra default Synced Healthy Auto-Prune +filebrowser https://kubernetes.default.svc services default Synced Healthy Auto-Prune +git-server https://kubernetes.default.svc cicd default Synced Healthy Auto-Prune +grafana-ingress https://kubernetes.default.svc monitoring default Synced Healthy Auto-Prune +immich https://kubernetes.default.svc services default Synced Healthy Auto-Prune +ipv6test https://kubernetes.default.svc services default Synced Healthy Auto-Prune +jellyfin https://kubernetes.default.svc services default Synced Healthy Auto-Prune +keybr https://kubernetes.default.svc services default Synced Healthy Auto-Prune +kobo-sync-server https://kubernetes.default.svc services default Synced Healthy Auto-Prune +loki https://kubernetes.default.svc monitoring default Synced Healthy Auto-Prune +miniflux https://kubernetes.default.svc services default Synced Healthy Auto-Prune +navidrome https://kubernetes.default.svc services default Synced Healthy Auto-Prune +opodsync https://kubernetes.default.svc services default Synced Healthy Auto-Prune +pihole https://kubernetes.default.svc services default Synced Healthy Auto-Prune +pkgrepo https://kubernetes.default.svc infra default Synced Healthy Auto-Prune +prometheus https://kubernetes.default.svc monitoring default Synced Healthy Auto +pushgateway https://kubernetes.default.svc monitoring default Synced Healthy Auto-Prune +radicale https://kubernetes.default.svc services default Synced Healthy Auto-Prune +registry https://kubernetes.default.svc infra default Synced Healthy Auto-Prune +syncthing https://kubernetes.default.svc services default Synced 
Healthy Auto-Prune +tempo https://kubernetes.default.svc monitoring default Synced Healthy Auto-Prune +traefik-config https://kubernetes.default.svc infra default Synced Healthy Auto-Prune +tracing-demo https://kubernetes.default.svc services default Synced Healthy Auto-Prune +wallabag https://kubernetes.default.svc services default Synced Healthy Auto-Prune +webdav https://kubernetes.default.svc services default Synced Healthy Auto-Prune +``` + +=> ./f3s-kubernetes-with-freebsd-part-9/argocd-apps-list.png ArgoCD managing all 30 applications in the f3s cluster + +## What Changed Day-to-Day + +The practical difference is pretty big: + +Single source of truth. Clone the repo, look at `argocd-apps/`, and you know exactly what's running. No more `helm list` or guessing. + +Push and forget. Edit a Helm value, commit, push. ArgoCD picks it up within a few minutes. No SSH, no `just upgrade`. + +Self-healing. I've tweaked things manually for debugging, forgotten about it, and ArgoCD quietly reverted it. That's saved me from some confusing "why is this behaving differently?" moments. + +Rollback = git revert. That's it. `git revert HEAD && git push` and ArgoCD syncs back to the previous state. + +Disaster recovery. Bootstrap k3s, install ArgoCD, apply the Application manifests, wait. The cluster rebuilds itself. I haven't had to do this for real yet, but I've tested it and it works. + +Drift detection. The ArgoCD UI shows immediately if something is out of sync. Much better than running `kubectl` commands and comparing output manually. + +## Challenges Along the Way + +### Helm Release Adoption + +When ArgoCD tries to manage resources already deployed by Helm, it can get confused. The fix: make sure the Application manifest matches the current Helm values exactly. ArgoCD then recognizes the resources and adopts them without re-deploying. + +### PersistentVolumes + +PVs are cluster-scoped, and many of my Helm charts created them with `kubectl apply` outside of Helm. 
For simple apps I moved PV definitions into the Helm chart templates. For complex apps like Prometheus, I used the multi-source pattern with PVs in a separate `manifests/` directory at sync wave 0. + +### Secrets + +Secrets shouldn't live in Git as plaintext. For now, I create them manually with `kubectl create secret` and reference them from Helm charts. ArgoCD doesn't manage the secrets themselves. This works fine but isn't fully declarative--External Secrets Operator is on the list for the future. + +### Grafana Not Reloading + +After updating datasource ConfigMaps, Grafana wouldn't notice until the pod was restarted. The PostSync hook (the Grafana restart Job in sync wave 10) handles this automatically now. + +### Prometheus Multi-Source Ordering + +Without sync waves, Prometheus resources deployed in random order and things broke. PVs need to exist before PVCs, secrets before the operator, recording rules after the CRDs are registered. Adding sync wave annotations to everything in `prometheus/manifests/` fixed all the ordering issues. + +## Future Ideas + +External Secrets Operator to manage secrets declaratively without committing them to Git. + +ApplicationSets for apps with nearly identical manifests. One template could replace 10+ individual Application files: + +```yaml +apiVersion: argoproj.io/v1alpha1 +kind: ApplicationSet +metadata: + name: simple-services + namespace: cicd +spec: + generators: + - list: + elements: + - app: miniflux + - app: wallabag + - app: radicale + template: + metadata: + name: '{{app}}' + spec: + source: + repoURL: http://git-server.cicd.svc.cluster.local/conf.git + targetRevision: master + path: 'f3s/{{app}}/helm-chart' + destination: + server: https://kubernetes.default.svc + namespace: services + syncPolicy: + automated: + prune: true + selfHeal: true +``` + +App-of-Apps pattern so even the Application manifests themselves are managed by ArgoCD. One root Application watches `f3s/argocd-apps/` and deploys everything inside. 
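+
+A minimal sketch of what such a root Application might look like. Nothing here is taken from the repo: the name `root`, the `directory.recurse` setting (needed because the Application manifests sit in per-namespace subdirectories), and `prune: false` are all assumptions. It points at Codeberg rather than the in-cluster git-server, for the same bootstrap reason as the git-server Application earlier in this post:
+
+```yaml
+# root-app.yaml (hypothetical sketch, not from the repo)
+apiVersion: argoproj.io/v1alpha1
+kind: Application
+metadata:
+  name: root                 # assumed name
+  namespace: cicd
+spec:
+  project: default
+  source:
+    # Codeberg, because the in-cluster git-server does not exist yet at bootstrap
+    repoURL: https://codeberg.org/snonux/conf.git
+    targetRevision: master
+    path: f3s/argocd-apps
+    directory:
+      recurse: true          # descend into cicd/, infra/, monitoring/, services/, test/
+  destination:
+    server: https://kubernetes.default.svc
+    namespace: cicd
+  syncPolicy:
+    automated:
+      prune: false           # assumption: avoid deleting child Applications automatically
+      selfHeal: true
+```
+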
Disaster recovery then becomes a single `kubectl apply`: + +```sh +$ kubectl apply -f root-app.yaml +# ArgoCD deploys all 30 applications automatically +``` + +ArgoCD Image Updater for apps with custom Docker images--automatically update the image tag in Git when a new image is pushed to the registry. + +## Lessons Learned + +Migrate incrementally, one app at a time. It lets you validate the pattern before it affects everything. + +Start with the boring apps. Complex stuff like Prometheus should come last, after you've gotten comfortable with the basics. + +Sync waves matter for anything non-trivial. Random deployment order causes weird failures. + +Multi-source is great for combining upstream charts with custom config. Keeps things cleanly separated. + +PostSync hooks replace the manual steps you'll inevitably forget. Automate them. + +The ArgoCD Web UI is surprisingly useful. Seeing the resource tree and health status at a glance beats running `kubectl` commands. + +## Wrapping Up + +The migration took a few weeks, doing one or two apps at a time. The result: 30 applications across 5 namespaces, all managed declaratively through Git. Push a change, it deploys. Break something, `git revert`. Cluster dies, rebuild from the repo. + +All the config lives here: + +=> https://codeberg.org/snonux/conf/src/branch/master/f3s codeberg.org/snonux/conf/f3s + +ArgoCD Application manifests organized by namespace: + +=> https://codeberg.org/snonux/conf/src/branch/master/f3s/argocd-apps codeberg.org/snonux/conf/f3s/argocd-apps + +I can't imagine going back to running Helm commands manually. 
+ +Other *BSD-related posts: + +=> ./2025-12-07-f3s-kubernetes-with-freebsd-part-8.gmi 2025-12-07 f3s: Kubernetes with FreeBSD - Part 8: Observability +=> ./2025-10-02-f3s-kubernetes-with-freebsd-part-7.gmi 2025-10-02 f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments +=> ./2025-07-14-f3s-kubernetes-with-freebsd-part-6.gmi 2025-07-14 f3s: Kubernetes with FreeBSD - Part 6: Storage +=> ./2025-05-11-f3s-kubernetes-with-freebsd-part-5.gmi 2025-05-11 f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network +=> ./2025-04-05-f3s-kubernetes-with-freebsd-part-4.gmi 2025-04-05 f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs +=> ./2025-02-01-f3s-kubernetes-with-freebsd-part-3.gmi 2025-02-01 f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts +=> ./2024-12-03-f3s-kubernetes-with-freebsd-part-2.gmi 2024-12-03 f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation +=> ./2024-11-17-f3s-kubernetes-with-freebsd-part-1.gmi 2024-11-17 f3s: Kubernetes with FreeBSD - Part 1: Setting the stage +=> ./2024-04-01-KISS-high-availability-with-OpenBSD.gmi 2024-04-01 KISS high-availability with OpenBSD +=> ./2024-01-13-one-reason-why-i-love-openbsd.gmi 2024-01-13 One reason why I love OpenBSD +=> ./2022-10-30-installing-dtail-on-openbsd.gmi 2022-10-30 Installing DTail on OpenBSD +=> ./2022-07-30-lets-encrypt-with-openbsd-and-rex.gmi 2022-07-30 Let's Encrypt with OpenBSD and Rex +=> ./2016-04-09-jails-and-zfs-on-freebsd-with-puppet.gmi 2016-04-09 Jails and ZFS with Puppet on FreeBSD + +E-Mail your comments to `paul@nospam.buetow.org` :-) + +=> ../ Back to the main site diff --git a/gemfeed/DRAFT-f3s-kubernetes-with-freebsd-part-9.gmi.tpl b/gemfeed/DRAFT-f3s-kubernetes-with-freebsd-part-9.gmi.tpl new file mode 100644 index 00000000..87d6b6dd --- /dev/null +++ b/gemfeed/DRAFT-f3s-kubernetes-with-freebsd-part-9.gmi.tpl @@ -0,0 +1,569 @@ +# f3s: Kubernetes with FreeBSD - Part 9: GitOps with ArgoCD + +> DRAFT - Not yet published + 
+This is the 9th post in the f3s series about my self-hosting home lab. f3s? The "f" stands for FreeBSD, and the "3s" stands for k3s, the Kubernetes distribution I use on FreeBSD-based physical machines. + +<< template::inline::index f3s-kubernetes-with-freebsd-part + +=> ./f3s-kubernetes-with-freebsd-part-1/f3slogo.png f3s logo + +=> ./f3s-kubernetes-with-freebsd-part-9/argocd-app-tree.png ArgoCD Application Resource Tree + +<< template::inline::toc + +## Introduction + +In previous posts, I deployed applications to the k3s cluster using Helm charts and Justfiles--running `just install` or `just upgrade` to push changes to the cluster. That worked, but it had some drawbacks: + +* No single source of truth--cluster state depends on which commands were run and when +* Every change requires manually running commands +* No easy way to tell if the cluster drifted from the desired config +* Rolling back means re-running old Helm commands +* No audit trail for who changed what + +This post covers the migration to GitOps with ArgoCD. After this, the Git repo is the single source of truth, and ArgoCD keeps the cluster in sync automatically. + +## GitOps in a Nutshell + +The idea behind GitOps is simple: describe your entire desired state in Git, and let an agent in the cluster pull that state and reconcile it continuously. Every change goes through a commit, so you get version history, collaboration, and rollback for free. + +For Kubernetes specifically: + +* All manifests, Helm charts, and config live in a Git repo +* ArgoCD watches that repo +* Push a change, ArgoCD applies it +* If someone manually tweaks something in the cluster, ArgoCD detects the drift and reverts it + +## ArgoCD + +ArgoCD is a GitOps continuous delivery tool for Kubernetes. It runs as a controller in the cluster, continuously comparing live state against what's defined in Git. 
+ +=> https://argo-cd.readthedocs.io ArgoCD Documentation + +The features I care about most for f3s: + +* Automatic sync--monitors Git and applies changes to the cluster +* Application CRDs--each app is a Kubernetes custom resource +* Health checks--knows whether an app is healthy or degraded +* Web UI--visual overview of all applications and their sync status +* Sync waves and hooks--control deployment order and run post-deploy jobs +* Multi-source--combine upstream Helm charts with custom manifests + +## Why Bother for a Home Lab? + +Honestly, the biggest reason is disaster recovery. If the cluster dies, I can: + +* Bootstrap a fresh k3s cluster +* Install ArgoCD +* Point it at the Git repo +* Everything deploys automatically + +That's it. No "let me check my shell history to remember how I set this up." + +It's also a great way to learn. Setting up GitOps for real--even on a small cluster--teaches you things you won't pick up from tutorials alone. Debugging sync issues, figuring out sync waves, dealing with secrets management--all stuff that's directly applicable at work too. + +Beyond that: push to Git, things deploy. No SSH'ing to a workstation to run Helm commands. And if I manually tweak something while debugging and forget about it, ArgoCD reverts it back to the desired state. That's happened more than once. + +## Deploying ArgoCD + +ArgoCD manages everything else via GitOps, but ArgoCD itself needs a bootstrap. Chicken-and-egg problem. 
+ +The installation lives in the config repo: + +=> https://codeberg.org/snonux/conf/src/branch/master/f3s/argocd codeberg.org/snonux/conf/f3s/argocd + +I deployed it using Helm via a Justfile: + +```sh +$ cd conf/f3s/argocd +$ just install +helm repo add argo https://argoproj.github.io/argo-helm +helm repo update +kubectl create namespace cicd +kubectl apply -f persistent-volumes.yaml +helm install argocd argo/argo-cd --namespace cicd -f values.yaml +kubectl apply -f ingress.yaml +``` + +A few things worth noting in the `values.yaml`: + +Persistent storage for the repo-server so cloned Git repos survive pod restarts: + +```yaml +repoServer: + volumes: + - name: repo-server-data + persistentVolumeClaim: + claimName: argocd-repo-server-pvc + volumeMounts: + - name: repo-server-data + mountPath: /home/argocd/repo-cache + env: + - name: XDG_CACHE_HOME + value: /home/argocd/repo-cache +``` + +Server runs in insecure mode since TLS is terminated by the OpenBSD edge relays (same pattern as all other f3s services): + +```yaml +server: + insecure: true +configs: + params: + server.insecure: true +``` + +Dex (SSO) and notifications are disabled--overkill for a single-user home lab: + +```yaml +dex: + enabled: false +notifications: + enabled: false +``` + +The admin password is auto-generated on first install and stored in `argocd-initial-admin-secret`. 
It's preserved across Helm upgrades, so no manual secret creation needed: + +```sh +$ just get-password +# Reads from argocd-initial-admin-secret +``` + +### Accessing ArgoCD + +After deployment, ArgoCD runs in the `cicd` namespace: + +```sh +$ kubectl get pods -n cicd +NAME READY STATUS RESTARTS AGE +argocd-application-controller-0 1/1 Running 0 45d +argocd-applicationset-controller-66d6b9b8f4-vhm9k 1/1 Running 0 45d +argocd-redis-77b8d6c6d4-mz9hg 1/1 Running 0 45d +argocd-repo-server-5f98f77b97-8xtcq 1/1 Running 0 45d +argocd-server-6b9c4b4f8d-kxw7p 1/1 Running 0 45d +``` + +=> ./f3s-kubernetes-with-freebsd-part-9/argocd-login.png ArgoCD login page + +The ingress exposes both a WAN and LAN endpoint: + +```yaml +# WAN access (via OpenBSD relayd) +- host: argocd.f3s.foo.zone +# LAN access (via FreeBSD CARP VIP, with TLS) +- host: argocd.f3s.lan.foo.zone +``` + +## In-Cluster Git Server + +I didn't want ArgoCD pulling from Codeberg over the internet every time it checks for changes. If Codeberg is down (or my internet is), the cluster can't reconcile. So I set up a Git server inside the cluster itself. + +=> https://codeberg.org/snonux/conf/src/commit/190473b/f3s/git-server codeberg.org/snonux/conf/f3s/git-server (at 190473b) + +The git-server runs as a single pod in the `cicd` namespace with two containers sharing a PVC: + +* An SSH git server (Alpine + OpenSSH + git-shell) for pushing changes from my laptop +* A CGit web UI with git-http-backend (nginx + fcgiwrap) for browsing repos and HTTP clones + +ArgoCD uses the HTTP backend to clone repos. Most Application manifests point at: + +``` +http://git-server.cicd.svc.cluster.local/conf.git +``` + +For pushing, I use SSH via a NodePort (30022). The git user is locked down to git-shell--no actual shell access. SSH keys are managed through a Kubernetes Secret. + +There's a chicken-and-egg situation here. 
The git-server's own ArgoCD Application manifest points at Codeberg (not at itself), since ArgoCD needs to bootstrap the git-server before it can use it: + +```yaml +# argocd-apps/cicd/git-server.yaml +source: + repoURL: https://codeberg.org/snonux/conf.git + targetRevision: master + path: f3s/git-server/helm-chart +``` + +Once the pod is up, all other apps use the in-cluster URL. The dependency chain is: Codeberg -> git-server -> everything else. + +The repo storage lives on NFS. Initial setup was just cloning the Codeberg repo as a bare repo into the NFS volume, then pointing my laptop's git remote at the NodePort: + +```sh +$ git remote add f3s f3s-git:/repos/conf.git +$ git push f3s master +``` + +ArgoCD detects the change within a few minutes and syncs. No internet required. The whole thing is intentionally minimal--no database, no accounts, no webhooks. Just git over SSH for writes and HTTP for reads. + +## Repository Organization + +I reorganized the config repo to support GitOps. 
Application manifests are grouped by Kubernetes namespace: + +``` +/home/paul/git/conf/f3s/ +├── argocd-apps/ +│ ├── cicd/ # CI/CD tooling (2 apps) +│ │ ├── argo-rollouts.yaml +│ │ └── git-server.yaml +│ ├── infra/ # Infrastructure (4 apps) +│ │ ├── cert-manager.yaml +│ │ ├── pkgrepo.yaml +│ │ ├── registry.yaml +│ │ └── traefik-config.yaml +│ ├── monitoring/ # Observability stack (6 apps) +│ │ ├── alloy.yaml +│ │ ├── grafana-ingress.yaml +│ │ ├── loki.yaml +│ │ ├── prometheus.yaml +│ │ ├── pushgateway.yaml +│ │ └── tempo.yaml +│ ├── services/ # User-facing applications (18 apps) +│ │ ├── anki-sync-server.yaml +│ │ ├── apache.yaml +│ │ ├── audiobookshelf.yaml +│ │ ├── filebrowser.yaml +│ │ ├── immich.yaml +│ │ ├── ipv6test.yaml +│ │ ├── jellyfin.yaml +│ │ ├── keybr.yaml +│ │ ├── kobo-sync-server.yaml +│ │ ├── miniflux.yaml +│ │ ├── navidrome.yaml +│ │ ├── opodsync.yaml +│ │ ├── pihole.yaml +│ │ ├── radicale.yaml +│ │ ├── syncthing.yaml +│ │ ├── tracing-demo.yaml +│ │ ├── wallabag.yaml +│ │ └── webdav.yaml +│ └── test/ # Test/example applications +├── miniflux/ # Per-app directories (unchanged) +│ ├── helm-chart/ +│ │ ├── Chart.yaml +│ │ ├── values.yaml +│ │ └── templates/ +│ └── Justfile +├── prometheus/ +│ ├── manifests/ # Additional manifests for multi-source +│ └── Justfile +└── ... +``` + +The per-app directories (miniflux, prometheus, etc.) stayed mostly the same--ArgoCD points at the same Helm charts. The main additions are the `argocd-apps/` directory structure and `manifests/` subdirectories for complex apps. + +## Migrating an App: Miniflux as Example + +I migrated all apps incrementally, one at a time. The procedure was the same for each. Here's miniflux as a concrete example. 
+ +Before ArgoCD, the Justfile looked like this: + +```makefile +install: + kubectl apply -f helm-chart/persistent-volumes.yaml + helm install miniflux ./helm-chart --namespace services + +upgrade: + helm upgrade miniflux ./helm-chart --namespace services + +uninstall: + helm uninstall miniflux --namespace services +``` + +Workflow: edit chart, run `just upgrade`, hope you didn't forget anything. + +To migrate, I created an Application manifest telling ArgoCD where the Helm chart lives and how to sync it: + +```yaml +apiVersion: argoproj.io/v1alpha1 +kind: Application +metadata: + name: miniflux + namespace: cicd + finalizers: + - resources-finalizer.argocd.argoproj.io +spec: + project: default + source: + repoURL: http://git-server.cicd.svc.cluster.local/conf.git + targetRevision: master + path: f3s/miniflux/helm-chart + destination: + server: https://kubernetes.default.svc + namespace: services + syncPolicy: + automated: + prune: true + selfHeal: true + syncOptions: + - CreateNamespace=false + retry: + limit: 3 + backoff: + duration: 5s + factor: 2 + maxDuration: 1m +``` + +Then applied it: + +```sh +# 1. Apply the Application manifest +$ kubectl apply -f argocd-apps/services/miniflux.yaml +application.argoproj.io/miniflux created + +# 2. Verify ArgoCD adopted the existing resources +$ argocd app get miniflux +Name: miniflux +Sync Status: Synced to master (4e3c216) +Health Status: Healthy + +# 3. Test that the app still works +$ curl -I https://flux.f3s.foo.zone +HTTP/2 200 +``` + +About 10 minutes, zero downtime. ArgoCD recognised the already-running resources matched the Helm chart in Git and adopted them without re-deploying. 
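As an aside on the `retry` block in the Application manifest: the shell sketch below prints the wait before each retry, assuming (as I read the ArgoCD docs) that the delay starts at `duration`, is multiplied by `factor` after each failed attempt, and is capped at `maxDuration`. The `backoff_schedule` helper is hypothetical and only does the arithmetic; it does not talk to ArgoCD.

```sh
# Print the wait before each retry for an exponential backoff policy.
# Hypothetical helper for illustration; all arguments are plain integers.
# $1 = initial delay in seconds, $2 = factor,
# $3 = maximum delay in seconds, $4 = retry limit
backoff_schedule() {
    delay=$1
    factor=$2
    max_delay=$3
    limit=$4
    attempt=1
    while [ "$attempt" -le "$limit" ]; do
        # Cap the delay at the configured maximum.
        if [ "$delay" -gt "$max_delay" ]; then
            delay=$max_delay
        fi
        echo "retry $attempt: wait ${delay}s"
        delay=$((delay * factor))
        attempt=$((attempt + 1))
    done
}

# Values from the miniflux manifest: duration 5s, factor 2, maxDuration 1m, limit 3
backoff_schedule 5 2 60 3
```

With the miniflux values, the waits come out as 5s, 10s and 20s, so the 1m cap never kicks in at a limit of 3.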
+ +After the migration, the Justfile turned into utility commands--no more install/upgrade/uninstall: + +```makefile +status: + @kubectl get pods -n services -l app=miniflux-server + @kubectl get pods -n services -l app=miniflux-postgres + @kubectl get application miniflux -n cicd \ + -o jsonpath='Sync: {.status.sync.status}, Health: {.status.health.status}' + +sync: + @kubectl annotate application miniflux -n cicd \ + argocd.argoproj.io/refresh=normal --overwrite + +logs: + kubectl logs -n services -l app=miniflux-server --tail=100 -f + +restart: + kubectl rollout restart -n services deployment/miniflux-server + +port-forward port="8080": + kubectl port-forward -n services svc/miniflux {{port}}:8080 + +psql: + kubectl exec -it -n services deployment/miniflux-postgres -- psql -U miniflux +``` + +New workflow: edit chart, commit, push. ArgoCD picks it up within a few minutes. Run `just sync` if you're impatient. + +### Migration Order + +I started with the simplest services (miniflux, wallabag, radicale, etc.)--apps with straightforward Helm charts and no complex dependencies. This let me validate the pattern before touching anything critical. + +After that: infrastructure apps (registry, cert-manager, pkgrepo, traefik-config), then the monitoring stack (tempo, loki, alloy, and finally prometheus--the most complex one), and last the CI/CD tools (git-server, argo-rollouts). + +## Complex Migration: Prometheus Multi-Source + +Prometheus was the tricky one. It combines an upstream Helm chart with a bunch of custom manifests--recording rules, dashboards, persistent volumes, and a post-sync hook to restart Grafana. 
+ +ArgoCD's multi-source feature handles this cleanly: + +```yaml +apiVersion: argoproj.io/v1alpha1 +kind: Application +metadata: + name: prometheus + namespace: cicd +spec: + sources: + # Source 1: Upstream Helm chart + - repoURL: https://prometheus-community.github.io/helm-charts + chart: kube-prometheus-stack + targetRevision: 55.5.0 + helm: + releaseName: prometheus + valuesObject: + kubeEtcd: + enabled: true + endpoints: + - 192.168.2.120 + - 192.168.2.121 + - 192.168.2.122 + # ... hundreds of lines of config + + # Source 2: Custom manifests from Git + - repoURL: http://git-server.cicd.svc.cluster.local/conf.git + targetRevision: master + path: f3s/prometheus/manifests + + syncPolicy: + automated: + prune: false # Manual pruning--too risky for the monitoring stack + selfHeal: true + syncOptions: + - ServerSideApply=true +``` + +The `prometheus/manifests/` directory has 14 files, each with a sync wave annotation to control deployment order: + +``` +f3s/prometheus/manifests/ +├── persistent-volumes.yaml # Wave 0 +├── grafana-restart-rbac.yaml # Wave 0 +├── additional-scrape-configs-secret.yaml # Wave 1 +├── grafana-datasources-configmap.yaml # Wave 1 +├── freebsd-recording-rules.yaml # Wave 3 +├── openbsd-recording-rules.yaml # Wave 3 +├── zfs-recording-rules.yaml # Wave 3 +├── argocd-application-alerts.yaml # Wave 3 +├── epimetheus-dashboard.yaml # Wave 4 +├── zfs-dashboards.yaml # Wave 4 +├── argocd-applications-dashboard.yaml # Wave 4 +├── node-resources-multi-select-dashboard.yaml # Wave 4 +├── prometheus-nodeport.yaml # Wave 4 +└── grafana-restart-hook.yaml # Wave 10 (PostSync) +``` + +### Sync Waves + +Without sync waves, ArgoCD deploys all resources at once in no particular order. That's fine for simple apps, but for something like Prometheus it causes failures--a PersistentVolumeClaim can't bind if the PersistentVolume doesn't exist yet, and a PrometheusRule can't be created if the CRD hasn't been registered. + +Sync waves fix this. 
You annotate each resource with a wave number: + +```yaml +annotations: + argocd.argoproj.io/sync-wave: "3" +``` + +ArgoCD deploys all wave 0 resources first, waits until they're healthy, then moves to wave 1, waits again, and so on. Resources without the annotation default to wave 0. + +For the Prometheus stack, the waves look like this: + +* Wave 0: PersistentVolumes, RBAC--infrastructure that everything else depends on +* Wave 1: Secrets, ConfigMaps--config that Prometheus and Grafana need at startup +* Wave 3: PrometheusRule CRDs--recording rules for FreeBSD, OpenBSD, ZFS, ArgoCD (the operator from wave 0 needs to be running first) +* Wave 4: Dashboard ConfigMaps and nodeport config +* Wave 10: PostSync hook--a Job that runs after all waves complete + +The PostSync hook is worth explaining. ArgoCD supports lifecycle hooks (`PreSync`, `Sync`, `PostSync`) that run Jobs at specific points. The Grafana restart hook is a good example--it restarts Grafana after every sync so it picks up updated datasources and dashboards: + +```yaml +apiVersion: batch/v1 +kind: Job +metadata: + name: grafana-restart-hook + namespace: monitoring + annotations: + argocd.argoproj.io/hook: PostSync + argocd.argoproj.io/hook-delete-policy: BeforeHookCreation + argocd.argoproj.io/sync-wave: "10" +spec: + template: + spec: + serviceAccountName: grafana-restart-sa + restartPolicy: OnFailure + containers: + - name: kubectl + image: bitnami/kubectl:latest + command: + - /bin/sh + - -c + - | + kubectl wait --for=condition=available --timeout=300s \ + deployment/prometheus-grafana -n monitoring || true + kubectl delete pod -n monitoring \ + -l app.kubernetes.io/name=grafana --ignore-not-found=true + backoffLimit: 2 +``` + +## The Result + +All 30 applications across 5 namespaces, synced and healthy: + +```sh +$ argocd app list +NAME CLUSTER NAMESPACE PROJECT STATUS HEALTH SYNCPOLICY +alloy https://kubernetes.default.svc monitoring default Synced Healthy Auto-Prune +anki-sync-server 
https://kubernetes.default.svc services default Synced Healthy Auto-Prune +apache https://kubernetes.default.svc services default Synced Healthy Auto-Prune +argo-rollouts https://kubernetes.default.svc cicd default Synced Healthy Auto-Prune +audiobookshelf https://kubernetes.default.svc services default Synced Healthy Auto-Prune +cert-manager https://kubernetes.default.svc infra default Synced Healthy Auto-Prune +filebrowser https://kubernetes.default.svc services default Synced Healthy Auto-Prune +git-server https://kubernetes.default.svc cicd default Synced Healthy Auto-Prune +grafana-ingress https://kubernetes.default.svc monitoring default Synced Healthy Auto-Prune +immich https://kubernetes.default.svc services default Synced Healthy Auto-Prune +ipv6test https://kubernetes.default.svc services default Synced Healthy Auto-Prune +jellyfin https://kubernetes.default.svc services default Synced Healthy Auto-Prune +keybr https://kubernetes.default.svc services default Synced Healthy Auto-Prune +kobo-sync-server https://kubernetes.default.svc services default Synced Healthy Auto-Prune +loki https://kubernetes.default.svc monitoring default Synced Healthy Auto-Prune +miniflux https://kubernetes.default.svc services default Synced Healthy Auto-Prune +navidrome https://kubernetes.default.svc services default Synced Healthy Auto-Prune +opodsync https://kubernetes.default.svc services default Synced Healthy Auto-Prune +pihole https://kubernetes.default.svc services default Synced Healthy Auto-Prune +pkgrepo https://kubernetes.default.svc infra default Synced Healthy Auto-Prune +prometheus https://kubernetes.default.svc monitoring default Synced Healthy Auto +pushgateway https://kubernetes.default.svc monitoring default Synced Healthy Auto-Prune +radicale https://kubernetes.default.svc services default Synced Healthy Auto-Prune +registry https://kubernetes.default.svc infra default Synced Healthy Auto-Prune +syncthing https://kubernetes.default.svc services default Synced 
Healthy Auto-Prune +tempo https://kubernetes.default.svc monitoring default Synced Healthy Auto-Prune +traefik-config https://kubernetes.default.svc infra default Synced Healthy Auto-Prune +tracing-demo https://kubernetes.default.svc services default Synced Healthy Auto-Prune +wallabag https://kubernetes.default.svc services default Synced Healthy Auto-Prune +webdav https://kubernetes.default.svc services default Synced Healthy Auto-Prune +``` + +=> ./f3s-kubernetes-with-freebsd-part-9/argocd-apps-list.png ArgoCD managing all 30 applications in the f3s cluster + +## What Changed Day-to-Day + +The practical difference is pretty big: + +* Single source of truth--clone the repo, look at `argocd-apps/`, and you know exactly what's running. No more `helm list` or guessing. +* Push and forget--edit a Helm value, commit, push. ArgoCD picks it up within a few minutes. No SSH, no `just upgrade`. +* Self-healing--I've tweaked things manually for debugging, forgotten about it, and ArgoCD quietly reverted it. That's saved me from some confusing "why is this behaving differently?" moments. +* Rollback = git revert--`git revert HEAD && git push` and ArgoCD syncs back to the previous state. +* Disaster recovery--bootstrap k3s, install ArgoCD, apply the Application manifests, wait. The cluster rebuilds itself. I haven't had to do this for real yet, but I've tested it and it works. +* Drift detection--the ArgoCD UI shows immediately if something is out of sync. Much better than running `kubectl` commands and comparing output manually. + +## Challenges Along the Way + +### Helm Release Adoption + +When ArgoCD tries to manage resources already deployed by Helm, it can get confused. The fix: make sure the Application manifest matches the current Helm values exactly. ArgoCD then recognizes the resources and adopts them without re-deploying. + +### PersistentVolumes + +PVs are cluster-scoped, and many of my Helm charts created them with `kubectl apply` outside of Helm. 
For simple apps I moved PV definitions into the Helm chart templates. For complex apps like Prometheus, I used the multi-source pattern with PVs in a separate `manifests/` directory at sync wave 0. + +### Secrets + +Secrets shouldn't live in Git as plaintext. For now, I create them manually with `kubectl create secret` and reference them from Helm charts. ArgoCD doesn't manage the secrets themselves. This works fine but isn't fully declarative--External Secrets Operator is on the list for the future. + +### Grafana Not Reloading + +After updating datasource ConfigMaps, Grafana wouldn't notice until the pod was restarted. The PostSync hook (the Grafana restart Job in sync wave 10) handles this automatically now. + +### Prometheus Multi-Source Ordering + +Without sync waves, Prometheus resources deployed in random order and things broke. PVs need to exist before PVCs, secrets before the operator, recording rules after the CRDs are registered. Adding sync wave annotations to everything in `prometheus/manifests/` fixed all the ordering issues. + +## Wrapping Up + +The migration took a couple of days, doing one or two apps at a time. The result: 30 applications across 5 namespaces, all managed declaratively through Git. Push a change, it deploys. Break something, `git revert`. Cluster dies, rebuild from the repo. + +All the config lives here: + +=> https://codeberg.org/snonux/conf/src/branch/master/f3s codeberg.org/snonux/conf/f3s + +ArgoCD Application manifests organized by namespace: + +=> https://codeberg.org/snonux/conf/src/branch/master/f3s/argocd-apps codeberg.org/snonux/conf/f3s/argocd-apps + +I can't imagine going back to running Helm commands manually. 
+ +Other *BSD-related posts: + +<< template::inline::rindex bsd + +E-Mail your comments to `paul@nospam.buetow.org` :-) + +=> ../ Back to the main site diff --git a/gemfeed/DRAFT-f3s-kubernetes-with-freebsd-part-X.gmi.tpl b/gemfeed/DRAFT-f3s-kubernetes-with-freebsd-part-X.gmi.tpl deleted file mode 100644 index 0213616b..00000000 --- a/gemfeed/DRAFT-f3s-kubernetes-with-freebsd-part-X.gmi.tpl +++ /dev/null @@ -1,1036 +0,0 @@ -# f3s: Kubernetes with FreeBSD - Part X: GitOps with ArgoCD - -> DRAFT - Not yet published - -This is part X of the f3s series for my self-hosting demands in a home lab. f3s? The "f" stands for FreeBSD, and the "3s" stands for k3s, the Kubernetes distribution I use on FreeBSD-based physical machines. - -<< template::inline::index f3s-kubernetes-with-freebsd-part - -=> ./f3s-kubernetes-with-freebsd-part-1/f3slogo.png f3s logo - -<< template::inline::toc - -## Introduction - -In previous posts, I deployed applications to the k3s cluster using Helm charts and Justfiles—running `just install` or `just upgrade` to imperatively push changes to the cluster. Works fine, but has some drawbacks: - -* No single source of truth: The cluster state depends on which commands were run and when -* Manual synchronization: Every change requires manually running commands -* Drift detection is hard: No easy way to know if cluster state matches the desired configuration -* Rollback complexity: Rolling back means re-running old Helm commands -* No audit trail: Hard to track who changed what and when - -This post covers migrating from imperative Helm deployments to declarative GitOps using ArgoCD. After this, the Git repository becomes the single source of truth, and ArgoCD automatically ensures the cluster matches what's defined in Git. - -## What is GitOps? - -GitOps is an operational framework that applies DevOps best practices—version control, collaboration, CI/CD—to infrastructure automation. 
The core idea: the entire desired state lives in Git, and automated processes ensure the actual state matches it. - -Key principles: - -* Declarative: The system's desired state is described declaratively (YAML manifests, Helm values) -* Versioned and immutable: All changes are committed to Git, providing a complete history -* Pulled automatically: An agent in the cluster continuously pulls the desired state from Git -* Continuously reconciled: The agent ensures the actual state matches the desired state, automatically correcting drift - -For Kubernetes, this means: - -* 1. All manifests, Helm charts, and configuration live in a Git repository -* 2. A tool (ArgoCD in our case) watches the repository -* 3. When changes are pushed to Git, ArgoCD automatically applies them to the cluster -* 4. If someone manually changes resources in the cluster, ArgoCD detects the drift and can automatically revert it - -## What is ArgoCD? - -ArgoCD is a declarative, GitOps continuous delivery tool for Kubernetes. It's implemented as a Kubernetes controller that continuously monitors running applications and compares the current, live state against the desired target state defined in Git. - -=> https://argo-cd.readthedocs.io ArgoCD Documentation - -Key features: - -* Automated deployment: Monitors Git repositories and automatically syncs changes to the cluster -* Application definitions: Defines applications as CRDs (Custom Resource Definitions) -* Health assessment: Understands Kubernetes resources and can determine if an application is healthy -* Web UI and CLI: Provides both a web interface and command-line tool for managing applications -* RBAC: Role-based access control for team collaboration -* SSO integration: Can integrate with existing authentication systems -* Multi-cluster support: Can manage applications across multiple Kubernetes clusters -* Sync waves and hooks: Control the order of resource deployment and run jobs at specific lifecycle points - -## Why ArgoCD for f3s? 
- -For a home lab cluster, ArgoCD provides several benefits: - -Disaster recovery: If the entire cluster is lost, I can rebuild it by: - -* 1. Bootstrapping a new k3s cluster -* 2. Installing ArgoCD -* 3. Pointing ArgoCD at the Git repository -* 4. All applications automatically deploy to the desired state - -Experimentation safety: I can test changes in a separate Git branch without affecting the running cluster. Once validated, merge to master and ArgoCD applies the changes. - -Drift detection: If I manually change something in the cluster (for debugging), ArgoCD shows the difference and can automatically revert it. - -Declarative configuration: The Git repository documents the entire cluster configuration. No need to remember which `just` commands to run or in which order. - -Automatic sync: Push to Git, and changes deploy automatically. No need to SSH to a workstation and run Helm commands. - -## Deploying ArgoCD - -ArgoCD itself runs as a set of Kubernetes resources in the cluster. The official installation method uses `kubectl apply`, which is fitting—ArgoCD manages everything else via GitOps, but ArgoCD itself needs a bootstrap. - -### Prerequisites - -Create the `cicd` namespace where ArgoCD will run: - -```sh -$ kubectl create namespace cicd -namespace/cicd created -``` - -### Installing ArgoCD - -The ArgoCD installation lives in the configuration repository: - -=> https://codeberg.org/snonux/conf/src/branch/master/f3s/argocd codeberg.org/snonux/conf/f3s/argocd - -I deployed ArgoCD using Helm instead of the raw manifests. This provides easier upgrades and customization. The installation is managed via a Justfile: - -```sh -$ cd conf/f3s/argocd -$ just install -helm repo add argo https://argoproj.github.io/argo-helm -helm repo update -helm install argocd argo/argo-cd \ - --namespace cicd \ - --version 7.7.12 \ - -f values.yaml -NAME: argocd -LAST DEPLOYED: ... 
-NAMESPACE: cicd -STATUS: deployed -``` - -The `values.yaml` file configures several important aspects: - -Persistent storage for the repo-server: ArgoCD clones Git repositories to cache them locally. I configured a persistent volume so the cache survives pod restarts: - -```yaml -repoServer: - volumes: - - name: repo-cache - persistentVolumeClaim: - claimName: argocd-repo-cache-pvc - volumeMounts: - - name: repo-cache - mountPath: /tmp -``` - -Admin password preservation: By default, the admin password is auto-generated and stored in a secret. To ensure it persists across Helm upgrades: - -```yaml -configs: - secret: - createSecret: false -``` - -I manually created the secret before installation: - -```sh -$ ARGOCD_ADMIN_PASSWORD=$(pwgen -s 32 1) -$ BCRYPT_HASH=$(htpasswd -nbBC 10 "" "$ARGOCD_ADMIN_PASSWORD" | tr -d ':\n' | sed 's/$2y/$2a/') -$ kubectl create secret generic argocd-secret \ - --from-literal=admin.password="$BCRYPT_HASH" \ - -n cicd -$ echo "ArgoCD admin password: $ARGOCD_ADMIN_PASSWORD" -``` - -Server configuration: Enabled insecure mode since TLS is handled by the OpenBSD edge relays: - -```yaml -server: - insecure: true -``` - -### Accessing ArgoCD - -After deployment, ArgoCD runs several pods in the `cicd` namespace: - -```sh -$ kubectl get pods -n cicd -NAME READY STATUS RESTARTS AGE -argocd-application-controller-0 1/1 Running 0 45d -argocd-applicationset-controller-66d6b9b8f4-vhm9k 1/1 Running 0 45d -argocd-dex-server-7fb556b7dd-xjr2l 1/1 Running 0 45d -argocd-notifications-controller-6d8dd4c5f5-b8vwl 1/1 Running 0 45d -argocd-redis-77b8d6c6d4-mz9hg 1/1 Running 0 45d -argocd-repo-server-5f98f77b97-8xtcq 1/1 Running 0 45d -argocd-server-6b9c4b4f8d-kxw7p 1/1 Running 0 45d -``` - -I created an ingress to expose the ArgoCD web UI: - -```yaml -apiVersion: networking.k8s.io/v1 -kind: Ingress -metadata: - name: argocd-server-ingress - namespace: cicd - annotations: - spec.ingressClassName: traefik - traefik.ingress.kubernetes.io/router.entrypoints: 
web -spec: - rules: - - host: argocd.f3s.foo.zone - http: - paths: - - path: / - pathType: Prefix - backend: - service: - name: argocd-server - port: - number: 80 -``` - -Following the same pattern as other services, the OpenBSD edge relays terminate TLS and forward traffic through WireGuard to the cluster. ArgoCD is now accessible at: - -=> https://argocd.f3s.foo.zone - -The ArgoCD CLI can also be used for operations: - -```sh -$ argocd login argocd.f3s.foo.zone -$ argocd app list -``` - -## ArgoCD Application Structure - -ArgoCD uses a CRD called `Application` to define what should be deployed. Each application specifies: - -* Source: Where the manifests live (Git repo, Helm chart repository, or both) -* Destination: Which cluster and namespace to deploy to - -Here's a simple example for the miniflux application: - -```yaml -apiVersion: argoproj.io/v1alpha1 -kind: Application -metadata: - name: miniflux - namespace: cicd - finalizers: - - resources-finalizer.argocd.argoproj.io -spec: - project: default - source: - repoURL: https://codeberg.org/snonux/conf.git - targetRevision: master - path: f3s/miniflux/helm-chart - destination: - server: https://kubernetes.default.svc - namespace: services - syncPolicy: - automated: - prune: true - selfHeal: true - syncOptions: - - CreateNamespace=false - retry: - limit: 3 - backoff: - duration: 5s - factor: 2 - maxDuration: 1m -``` - -Key fields: - -* `source.path`: Points to the Helm chart directory in Git -* `destination.namespace`: Where to deploy the application -* `syncPolicy.automated.prune`: Delete resources that are removed from Git -* `syncPolicy.automated.selfHeal`: Automatically revert manual changes in the cluster -* `finalizers`: Ensures ArgoCD deletes all resources when the Application is deleted - -## Repository Organization - -I reorganized the configuration repository to support GitOps: - -``` -/home/paul/git/conf/f3s/ -├── argocd-apps/ # ArgoCD Application manifests (organized by namespace) -│ ├── README.md # Documentation of structure -│ ├── monitoring/ # Observability 
stack (6 apps) -│ │ ├── alloy.yaml -│ │ ├── grafana-ingress.yaml -│ │ ├── loki.yaml -│ │ ├── prometheus.yaml -│ │ ├── pushgateway.yaml -│ │ └── tempo.yaml -│ ├── services/ # User-facing applications (13 apps) -│ │ ├── anki-sync-server.yaml -│ │ ├── audiobookshelf.yaml -│ │ ├── filebrowser.yaml -│ │ ├── immich.yaml -│ │ ├── keybr.yaml -│ │ ├── kobo-sync-server.yaml -│ │ ├── miniflux.yaml -│ │ ├── opodsync.yaml -│ │ ├── radicale.yaml -│ │ ├── syncthing.yaml -│ │ ├── tracing-demo.yaml -│ │ ├── wallabag.yaml -│ │ └── webdav.yaml -│ ├── infra/ # Infrastructure services (1 app) -│ │ └── registry.yaml -│ └── test/ # Test/example applications (1 app) -│ └── example-apache-volume-claim.yaml -├── miniflux/ # Application directories (unchanged) -│ ├── helm-chart/ -│ │ ├── Chart.yaml -│ │ ├── values.yaml -│ │ └── templates/ -│ └── Justfile # Updated for ArgoCD -├── prometheus/ -│ ├── manifests/ # NEW: Additional manifests -│ │ ├── persistent-volumes.yaml -│ │ ├── grafana-restart-hook.yaml -│ │ ├── freebsd-recording-rules.yaml -│ │ └── ... -│ └── Justfile # Updated for ArgoCD -└── ... -``` - -The application directories (miniflux, prometheus, etc.) remained mostly unchanged—ArgoCD references the same Helm charts. The main additions: - -1. argocd-apps/: Application manifests organized by Kubernetes namespace for better clarity - -* `monitoring/`: 6 observability applications -* `services/`: 13 user-facing applications -* `infra/`: 1 infrastructure application (registry) -* `test/`: 1 test application - -2. */manifests/: Additional Kubernetes manifests for complex apps (like Prometheus) -3. Justfiles updated: Changed from `helm install/upgrade` to `argocd app sync` - -This organization makes it easy to apply all applications in a specific namespace or manage them independently. - -### Migration Phases - -These apps have straightforward Helm charts with no complex dependencies. Pattern established: - -* 1. Create Application manifest in `argocd-apps/` -* 2. 
Apply with `kubectl apply -f argocd-apps/<namespace>/<app>.yaml` -* 3. Verify sync status: `argocd app get <app>` -* 4. Update Justfile to use ArgoCD commands - -Phase 2: Infrastructure apps (3 apps) - -* registry (Docker image registry) -* pushgateway (Prometheus metrics ingestion) -* immich (photo management with complex dependencies) - -Phase 3: Monitoring stack (4 apps) -* tempo (distributed tracing) -* loki (log aggregation) -* alloy (log collection) -* prometheus (metrics and monitoring) - -Phase 4: Monitoring addons (1 app) -* grafana-ingress (separate ingress for Grafana) - -## Example Migration: Miniflux - -Let me walk through the migration of miniflux as a concrete example. - -### Before: Imperative Helm deployment - -Original Justfile: - -```makefile -NAMESPACE := "services" -APP_NAME := "miniflux" - -install: - kubectl apply -f helm-chart/persistent-volumes.yaml - helm install {{APP_NAME}} ./helm-chart --namespace {{NAMESPACE}} - -upgrade: - helm upgrade {{APP_NAME}} ./helm-chart --namespace {{NAMESPACE}} - -uninstall: - helm uninstall {{APP_NAME}} --namespace {{NAMESPACE}} - kubectl delete -f helm-chart/persistent-volumes.yaml - -status: - @kubectl get all -n {{NAMESPACE}} -l app={{APP_NAME}} -``` - -Workflow: -1. Make changes to `helm-chart/` -2. Run `just upgrade` -3. 
Helm pushes changes to cluster - -### After: Declarative GitOps with ArgoCD - -Created `argocd-apps/services/miniflux.yaml`: - -```yaml -apiVersion: argoproj.io/v1alpha1 -kind: Application -metadata: - name: miniflux - namespace: cicd - finalizers: - - resources-finalizer.argocd.argoproj.io -spec: - project: default - source: - repoURL: https://codeberg.org/snonux/conf.git - targetRevision: master - path: f3s/miniflux/helm-chart - destination: - server: https://kubernetes.default.svc - namespace: services - syncPolicy: - automated: - prune: true - selfHeal: true - syncOptions: - - CreateNamespace=false - retry: - limit: 3 - backoff: - duration: 5s - factor: 2 - maxDuration: 1m -``` - -Updated Justfile: - -```makefile -NAMESPACE := "services" -APP_NAME := "miniflux" - -status: - @echo "=== Pods ===" - @kubectl get pods -n {{NAMESPACE}} -l app={{APP_NAME}} - @echo "" - @echo "=== Services ===" - @kubectl get svc -n {{NAMESPACE}} -l app={{APP_NAME}} - @echo "" - @echo "=== ArgoCD Status ===" - @kubectl get application {{APP_NAME}} -n cicd -o jsonpath='Sync: {.status.sync.status}, Health: {.status.health.status}' 2>/dev/null && echo "" - -sync: - @echo "Triggering ArgoCD sync..." - @kubectl annotate application {{APP_NAME}} -n cicd argocd.argoproj.io/refresh=normal --overwrite - @sleep 2 - @kubectl get application {{APP_NAME}} -n cicd -o jsonpath='Sync: {.status.sync.status}, Health: {.status.health.status}' && echo "" - -argocd-status: - argocd app get {{APP_NAME}} --core - -logs: - kubectl logs -n {{NAMESPACE}} -l app={{APP_NAME}} --tail=100 -f -``` - -New workflow: -1. Make changes to `helm-chart/` -2. Commit and push to Git -3. ArgoCD automatically detects and syncs changes -4. (Optional) Run `just sync` to force immediate sync - -### Migration procedure - -1. Backup current state: -```sh -$ helm get values miniflux -n services > /tmp/miniflux-backup-values.yaml -$ kubectl get all,ingress -n services -o yaml > /tmp/miniflux-backup.yaml -``` - -2. 
Create Application manifest: -```sh -$ kubectl apply -f argocd-apps/services/miniflux.yaml -application.argoproj.io/miniflux created -``` - -3. Verify ArgoCD adopted the resources: -```sh -$ argocd app get miniflux -Name: miniflux -Project: default -Server: https://kubernetes.default.svc -Namespace: services -URL: https://argocd.f3s.foo.zone/applications/miniflux -Repo: https://codeberg.org/snonux/conf.git -Target: master -Path: f3s/miniflux/helm-chart -SyncWindow: Sync Allowed -Sync Policy: Automated (Prune) -Sync Status: Synced to master (4e3c216) -Health Status: Healthy -``` - -4. Monitor for issues: -```sh -$ kubectl get pods -n services -l app=miniflux -w -NAME READY STATUS RESTARTS AGE -miniflux-postgres-556444cb8d-xvv2p 1/1 Running 0 54d -``` - -5. Test the application: -```sh -$ curl -I https://flux.f3s.foo.zone -HTTP/2 200 -``` - -6. Update Justfile and commit changes - -Total time: 10 minutes. Zero downtime. - -## Complex Migration: Prometheus with Multi-Source - - -The Prometheus migration was more complex because it combines: -* Upstream Helm chart (kube-prometheus-stack) -* Custom manifests (PersistentVolumes, recording rules, dashboards) -* Sync hooks (PostSync job to restart Grafana) - -```yaml -apiVersion: argoproj.io/v1alpha1 -kind: Application -metadata: - name: prometheus - namespace: cicd - finalizers: - - resources-finalizer.argocd.argoproj.io -spec: - project: default - sources: - # Source 1: Upstream Helm chart from prometheus-community - - repoURL: https://prometheus-community.github.io/helm-charts - - chart: kube-prometheus-stack - targetRevision: 55.5.0 - helm: - releaseName: prometheus - valuesObject: - # Full Prometheus configuration embedded here - kubeEtcd: - enabled: true - endpoints: - - 192.168.2.120 - - 192.168.2.121 - - 192.168.2.122 - # ...
(hundreds of lines of configuration) - - # Source 2: Additional manifests from Git repository - - repoURL: https://codeberg.org/snonux/conf.git - targetRevision: master - path: f3s/prometheus/manifests - - destination: - server: https://kubernetes.default.svc - namespace: monitoring - - syncPolicy: - automated: - prune: false # Manual pruning for safety on complex stack - selfHeal: true - syncOptions: - - CreateNamespace=false - - ServerSideApply=true - retry: - limit: 3 - backoff: - duration: 10s - factor: 2 - maxDuration: 3m -``` - -The `prometheus/manifests/` directory contains: - -``` -f3s/prometheus/manifests/ -├── persistent-volumes.yaml # Sync wave 0 -├── additional-scrape-configs-secret.yaml # Sync wave 1 -├── grafana-datasources-configmap.yaml # Sync wave 1 -├── freebsd-recording-rules.yaml # Sync wave 3 -├── openbsd-recording-rules.yaml # Sync wave 3 -├── zfs-recording-rules.yaml # Sync wave 3 -├── epimetheus-dashboard.yaml # Sync wave 4 -├── zfs-dashboards.yaml # Sync wave 4 -├── grafana-restart-hook.yaml # Sync wave 10 (PostSync) -└── grafana-restart-rbac.yaml # Sync wave 0 -``` - -### Sync Waves and Hooks - -ArgoCD allows controlling the order of resource deployment using sync waves (the `argocd.argoproj.io/sync-wave` annotation): - -* Wave 0: Infrastructure (PersistentVolumes, RBAC) -* Wave 1: Configuration (Secrets, ConfigMaps) -* Wave 3: Recording rules (PrometheusRule CRDs) -* Wave 4: Dashboards (ConfigMaps with `grafana_dashboard: '1'` label) -* Wave 10: PostSync hooks (Jobs that run after everything else) - -The Grafana restart hook ensures Grafana reloads datasources after they're updated: - -```yaml -apiVersion: batch/v1 -kind: Job -metadata: - name: grafana-restart-hook - namespace: monitoring - annotations: - argocd.argoproj.io/hook: PostSync - argocd.argoproj.io/hook-delete-policy: BeforeHookCreation - argocd.argoproj.io/sync-wave: "10" -spec: - template: - spec: - serviceAccountName: grafana-restart-sa - restartPolicy: OnFailure - containers: - -
name: kubectl - image: bitnami/kubectl:latest - command: - - /bin/sh - - -c - - | - kubectl wait --for=condition=available --timeout=300s deployment/prometheus-grafana -n monitoring || true - kubectl delete pod -n monitoring -l app.kubernetes.io/name=grafana --ignore-not-found=true - backoffLimit: 2 -``` - -This replaces the manual step in the old Justfile that required running `kubectl delete pod` after every upgrade. - -## Migration Results - -After migrating all 21 applications to ArgoCD: - -```sh -$ argocd app list -NAME CLUSTER NAMESPACE PROJECT STATUS HEALTH SYNCPOLICY -alloy https://kubernetes.default.svc monitoring default Synced Healthy Auto-Prune -anki-sync-server https://kubernetes.default.svc services default Synced Healthy Auto-Prune -audiobookshelf https://kubernetes.default.svc services default Synced Healthy Auto-Prune -example-apache https://kubernetes.default.svc test default Synced Healthy Auto-Prune -example-apache-volume-... https://kubernetes.default.svc test default Synced Healthy Auto-Prune -filebrowser https://kubernetes.default.svc services default Synced Healthy Auto-Prune -freshrss https://kubernetes.default.svc services default Synced Healthy Auto-Prune -grafana-ingress https://kubernetes.default.svc monitoring default Synced Healthy Auto-Prune -immich https://kubernetes.default.svc services default Synced Healthy Auto-Prune -keybr https://kubernetes.default.svc services default Synced Healthy Auto-Prune -kobo-sync-server https://kubernetes.default.svc services default Synced Healthy Auto-Prune -loki https://kubernetes.default.svc monitoring default Synced Healthy Auto-Prune -miniflux https://kubernetes.default.svc services default Synced Healthy Auto-Prune -opodsync https://kubernetes.default.svc services default Synced Healthy Auto-Prune -prometheus https://kubernetes.default.svc monitoring default Synced Healthy Auto -pushgateway https://kubernetes.default.svc monitoring default Synced Healthy Auto-Prune -radicale https://kubernetes.default.svc
services default Synced Healthy Auto-Prune -registry https://kubernetes.default.svc infra default Synced Healthy Auto-Prune -syncthing https://kubernetes.default.svc services default Synced Healthy Auto-Prune -tempo https://kubernetes.default.svc monitoring default Synced Healthy Auto-Prune -wallabag https://kubernetes.default.svc services default Synced Healthy Auto-Prune -webdav https://kubernetes.default.svc services default Synced Healthy Auto-Prune -``` - -All 21 applications: Synced and Healthy. - -ArgoCD Web UI: - -=> ./f3s-kubernetes-with-freebsd-part-X/argocd-apps-list.png ArgoCD Applications List - -=> ./f3s-kubernetes-with-freebsd-part-X/argocd-app-tree.png ArgoCD Application Resource Tree - -## Benefits Realized - -### 1. Single Source of Truth - -The Git repository at `https://codeberg.org/snonux/conf` now contains the complete cluster configuration. Anyone can clone it and see exactly what's deployed: - -```sh -$ git clone https://codeberg.org/snonux/conf.git -$ cd conf/f3s -$ ls argocd-apps/ -alloy.yaml anki-sync-server.yaml audiobookshelf.yaml ... -``` - -### 2. Automatic Synchronization - -Push to Git, and changes deploy automatically: - -```sh -$ cd conf/f3s/miniflux/helm-chart -$ vim values.yaml # Change replica count from 1 to 2 -$ git add values.yaml -$ git commit -m "Scale miniflux to 2 replicas" -$ git push -# ArgoCD detects change within 3 minutes and syncs automatically -``` - -No need to SSH to a workstation, pull the repo, and run `just upgrade`. - -### 3. Drift Detection and Self-Healing - -If someone manually changes a resource in the cluster, ArgoCD detects it: - -```sh -$ kubectl scale deployment miniflux-server -n services --replicas=3 -deployment.apps/miniflux-server scaled - -# ArgoCD detects drift within 3 minutes -$ argocd app get miniflux -... -Sync Status: OutOfSync from master (4e3c216) -``` - -With `selfHeal: true`, ArgoCD automatically reverts the change back to 2 replicas (the value in Git). - -### 4. 
Easy Rollbacks - -To rollback a change: - -```sh -$ git revert HEAD -$ git push -# ArgoCD automatically rolls back to the previous state -``` - -Or rollback to a specific commit: - -```sh -$ argocd app rollback miniflux -``` - -### 5. Disaster Recovery - -If the entire cluster is destroyed, recovery is straightforward: - -1. Bootstrap a new k3s cluster -2. Create namespaces -3. Install ArgoCD -4. Apply all Application manifests: -```sh -$ kubectl apply -f argocd-apps/ -``` -5. ArgoCD deploys all 21 applications to their desired state - -Total recovery time: ~30 minutes (mostly waiting for pods to pull images and start). - -### 6. Documentation by Default - -The Application manifests serve as documentation: - -* Which Helm chart version is deployed? → Check `targetRevision` -* What custom values are configured? → Check `valuesObject` -* Which namespace does this deploy to? → Check `destination.namespace` -* Is auto-sync enabled? → Check `syncPolicy.automated` - -No more guessing or checking `helm list` output. - -### 7. Safe Experimentation - -Create a feature branch, make changes, and preview them: - -```sh -$ git checkout -b test-prometheus-upgrade -$ vim argocd-apps/prometheus.yaml # Bump chart version -$ git commit -am "Test Prometheus 56.0.0" -$ git push origin test-prometheus-upgrade - -# Temporarily point ArgoCD at the feature branch -$ kubectl patch application prometheus -n cicd \ - --type merge \ - -p '{"spec":{"source":{"targetRevision":"test-prometheus-upgrade"}}}' - -# Verify changes in ArgoCD Web UI -# If good: merge to master -# If bad: revert the patch -``` - -## Challenges and Solutions - -### Challenge 1: Helm Release Adoption - -When creating an Application for an existing Helm release, ArgoCD needs to "adopt" the resources. 
This failed initially with errors like: - -``` -The Helm operation failed with an error: release miniflux failed, and has been uninstalled due to atomic being set: timed out waiting for the condition -``` - -Solution: For existing Helm releases, I first ensured the Application manifest matched the current Helm values exactly. ArgoCD then recognized the resources were already in the desired state and adopted them without re-deploying. - -### Challenge 2: Persistent Volumes Not Tracked by Helm - -PersistentVolumes are cluster-scoped resources, not namespace-scoped. Many of my Helm charts created PVs using `kubectl apply -f persistent-volumes.yaml` outside of Helm. - -Solution: For simple apps, I moved the PV definitions into the Helm chart templates. For complex apps (like Prometheus), I used the multi-source pattern with PVs in the `manifests/` directory with sync wave 0. - -### Challenge 3: Secrets Management - -ArgoCD stores Application manifests in Git, but secrets shouldn't be committed in plaintext. - -Solution (current): Secrets are created manually with `kubectl create secret` and referenced by the Helm charts. The secrets themselves aren't managed by ArgoCD. - -Future enhancement: Migrate to External Secrets Operator (ESO) to manage secrets declaratively while storing the actual secrets in a separate backend (Kubernetes secrets in a separate namespace, or eventually Vault). - -### Challenge 4: Grafana Not Reloading Datasources - -After updating the Grafana datasources ConfigMap, Grafana wouldn't detect the changes until pods were manually deleted. - -Solution: Created a PostSync hook that automatically restarts Grafana pods after every ArgoCD sync. This runs as a Kubernetes Job in sync wave 10, ensuring it executes after all other resources are deployed. - -### Challenge 5: Prometheus With Multiple Sources - -Prometheus needed both the upstream Helm chart and custom manifests (recording rules, dashboards, PVs). 
- -Solution: Used ArgoCD's multi-source feature to combine: -* Helm chart from `prometheus-community.github.io/helm-charts` -* Additional manifests from `codeberg.org/snonux/conf.git` at path `f3s/prometheus/manifests` - -This keeps the upstream chart cleanly separated from custom configuration. - -### Challenge 6: Sync Ordering for Prometheus - -Prometheus resources have dependencies: -* PVs before PVCs -* Secrets before Prometheus Operator -* PrometheusRule CRDs before Prometheus Operator can process them -* Grafana must be running before the restart hook executes - -Solution: Added sync wave annotations to all resources in `prometheus/manifests/`: -* Wave 0: PVs, RBAC -* Wave 1: Secrets, ConfigMaps -* Wave 3: PrometheusRule CRDs (recording rules) -* Wave 4: Dashboard ConfigMaps -* Wave 10: PostSync hook (Grafana restart) - -ArgoCD deploys resources in wave order, ensuring correct sequencing. - -## Justfile Evolution - -The Justfiles evolved from deployment tools to utility scripts: - -Before (Helm deployment): -```makefile -install: - helm install miniflux ./helm-chart -n services - -upgrade: - helm upgrade miniflux ./helm-chart -n services - -uninstall: - helm uninstall miniflux -n services -``` - -After (ArgoCD utilities): -```makefile -status: - @kubectl get pods -n services -l app=miniflux - @kubectl get application miniflux -n cicd -o jsonpath='Sync: {.status.sync.status}, Health: {.status.health.status}' - -sync: - @kubectl annotate application miniflux -n cicd argocd.argoproj.io/refresh=normal --overwrite - -argocd-status: - argocd app get miniflux --core - -logs: - kubectl logs -n services -l app=miniflux --tail=100 -f -``` - -The Justfiles now provide: -* `status`: Quick health check -* `sync`: Force immediate ArgoCD sync (instead of waiting 3 minutes) -* `argocd-status`: Detailed ArgoCD application status -* `logs`: Tail application logs -* Application-specific utilities (e.g., `port-forward`, `restart`) - -## Lessons Learned - -1. 
Incremental migration is safer than big-bang: Migrating one app at a time allowed me to validate the pattern and fix issues before they affected all apps. - -2. Start with simple apps: The first migration (simple services) established the basic pattern. Complex apps (Prometheus) came later after the pattern was proven. - -3. Sync waves are essential for complex apps: Without sync waves, resources deployed in random order and caused failures. Proper ordering eliminated all deployment issues. - -4. Multi-source is powerful: Combining upstream Helm charts with custom manifests keeps configuration clean and maintainable. - -5. PostSync hooks replace manual steps: The Grafana restart hook eliminated a manual step that was easy to forget. - -6. Documentation in Git is better than tribal knowledge: The Application manifests document exactly what's deployed and how. No more "let me check my shell history to remember how I deployed this." - -7. Self-healing prevents configuration drift: Multiple times I've manually tweaked something for debugging, forgotten about it, and ArgoCD automatically reverted it back to the desired state. - -8. ArgoCD Web UI is invaluable: Seeing the resource tree, sync status, and health status at a glance is much better than running multiple `kubectl` commands. - -## Future Improvements - -### 1. External Secrets Operator - -Currently, secrets are manually created with `kubectl create secret`. This works but isn't declarative. Plan: - -* Deploy External Secrets Operator (ESO) -* Store actual secrets in a Kubernetes Secret in a separate `secrets` namespace -* Create ExternalSecret CRDs that reference the backend secrets -* ArgoCD manages the ExternalSecret CRDs, ESO creates the actual Secrets - -This makes secrets declarative while keeping them out of Git. - -### 2. ApplicationSet for Similar Apps - -Many apps have nearly identical Application manifests (miniflux, freshrss, wallabag, etc.). 
ArgoCD ApplicationSets can generate multiple Applications from a template: - -```yaml -apiVersion: argoproj.io/v1alpha1 -kind: ApplicationSet -metadata: - name: simple-services - namespace: cicd -spec: - generators: - - list: - elements: - - app: miniflux - - app: freshrss - - app: wallabag - template: - metadata: - name: '{{app}}' - spec: - project: default - source: - repoURL: https://codeberg.org/snonux/conf.git - targetRevision: master - path: 'f3s/{{app}}/helm-chart' - destination: - server: https://kubernetes.default.svc - namespace: services - syncPolicy: - automated: - prune: true - selfHeal: true -``` - -One ApplicationSet could replace 10+ individual Application manifests. - -### 3. App-of-Apps Pattern - -Currently, all Application manifests are applied manually with `kubectl apply -f argocd-apps/ -R`. An alternative is the "app-of-apps" pattern: - -Create a root Application that deploys all other Applications. With the namespace-organized structure, this could be done per-namespace or for the entire cluster: - -```yaml -apiVersion: argoproj.io/v1alpha1 -kind: Application -metadata: - name: root - namespace: cicd -spec: - source: - repoURL: https://codeberg.org/snonux/conf.git - targetRevision: master - path: f3s/argocd-apps - directory: - recurse: true # Recursively find all manifests in subdirectories - destination: - server: https://kubernetes.default.svc - namespace: cicd - syncPolicy: - automated: - prune: true - selfHeal: true -``` - -Or create separate root apps per namespace: - -```yaml -# root-monitoring.yaml -apiVersion: argoproj.io/v1alpha1 -kind: Application -metadata: - name: root-monitoring - namespace: cicd -spec: - source: - repoURL: https://codeberg.org/snonux/conf.git - targetRevision: master - path: f3s/argocd-apps/monitoring - destination: - server: https://kubernetes.default.svc - namespace: cicd - syncPolicy: - automated: - prune: true - selfHeal: true -``` - -Then disaster recovery becomes: -```sh -$ kubectl apply -f root-app.yaml 
-# Root app deploys all 21 applications automatically - -# Or apply by namespace -$ kubectl apply -f root-monitoring.yaml -$ kubectl apply -f root-services.yaml -$ kubectl apply -f root-infra.yaml -``` - -### 4. ArgoCD Image Updater - -For applications with custom Docker images (like the registry, tracing-demo), ArgoCD Image Updater can automatically update the image tag in Git when a new image is pushed: - -```yaml -metadata: - annotations: - argocd-image-updater.argoproj.io/image-list: | - app=registry.f3s.foo.zone/miniflux:~^v - argocd-image-updater.argoproj.io/write-back-method: git -``` - -When a new image `registry.f3s.foo.zone/miniflux:v2.1.0` is pushed, Image Updater automatically: -1. Updates the Helm values in Git -2. Commits the change -3. ArgoCD syncs the new image - -This creates a fully automated CI/CD pipeline. - -## Summary - -Migrating from imperative Helm deployments to declarative GitOps with ArgoCD transformed how I manage the f3s cluster: - -Before: -* Manual Helm commands for every change -* No visibility into cluster state -* Disaster recovery required rebuilding from memory/notes - -After: -* Git is the single source of truth -* Automatic synchronization of changes -* Complete audit trail in Git history -* Drift detection and self-healing -* Disaster recovery: deploy ArgoCD, apply Application manifests, done -* Organized by namespace for clarity - -The migration took several days spread over a few weeks, migrating one application at a time. The result is a more maintainable, reliable, and recoverable cluster.
- - -All 21 applications are now managed via GitOps, with the configuration living in: - -=> https://codeberg.org/snonux/conf/src/branch/master/f3s codeberg.org/snonux/conf/f3s - -The ArgoCD Application manifests are organized by namespace: - -=> https://codeberg.org/snonux/conf/src/branch/master/f3s/argocd-apps codeberg.org/snonux/conf/f3s/argocd-apps - -ArgoCD has become an essential part of the f3s infrastructure, and I can't imagine managing the cluster without it. - -Other *BSD-related posts: - -<< template::inline::rindex bsd - -E-Mail your comments to `paul@nospam.buetow.org` - -=> ../ Back to the main site diff --git a/gemfeed/f3s-kubernetes-with-freebsd-part-9/argocd-app-tree.png b/gemfeed/f3s-kubernetes-with-freebsd-part-9/argocd-app-tree.png new file mode 100644 index 00000000..e08bea5a Binary files /dev/null and b/gemfeed/f3s-kubernetes-with-freebsd-part-9/argocd-app-tree.png differ diff --git a/gemfeed/f3s-kubernetes-with-freebsd-part-9/argocd-apps-list.png b/gemfeed/f3s-kubernetes-with-freebsd-part-9/argocd-apps-list.png new file mode 100644 index 00000000..04697bc1 Binary files /dev/null and b/gemfeed/f3s-kubernetes-with-freebsd-part-9/argocd-apps-list.png differ diff --git a/gemfeed/f3s-kubernetes-with-freebsd-part-9/argocd-login.png b/gemfeed/f3s-kubernetes-with-freebsd-part-9/argocd-login.png new file mode 100644 index 00000000..598c4f57 Binary files /dev/null and b/gemfeed/f3s-kubernetes-with-freebsd-part-9/argocd-login.png differ -- cgit v1.2.3