| -rw-r--r-- | Rexfile | 9 |
| -rw-r--r-- | prompts/skills/f3s/SKILL.md | 47 |
| -rw-r--r-- | prompts/skills/f3s/references/freebsd-setup.md | 119 |
| -rw-r--r-- | prompts/skills/f3s/references/hardware.md | 58 |
| -rw-r--r-- | prompts/skills/f3s/references/k3s-setup.md | 281 |
| -rw-r--r-- | prompts/skills/f3s/references/observability.md | 273 |
| -rw-r--r-- | prompts/skills/f3s/references/rocky-linux-vms.md | 160 |
| -rw-r--r-- | prompts/skills/f3s/references/storage.md | 492 |
| -rw-r--r-- | prompts/skills/f3s/references/ups-power.md | 77 |
| -rw-r--r-- | prompts/skills/f3s/references/wireguard.md | 210 |
10 files changed, 1723 insertions, 3 deletions
@@ -69,8 +69,8 @@ sub opencode_config_content {
     return JSON::PP->new->ascii->pretty->canonical->encode(
         {
-            '$schema' => 'https://opencode.ai/config.json',
-            'model' => 'ollama/nemotron-3-super:latest',
+            '$schema' => 'https://opencode.ai/config.json',
+            'model' => 'ollama/qwen3-coder:30b',
             'provider' => {
                 'ollama' => {
                     'models' => {
@@ -83,6 +83,9 @@ sub opencode_config_content {
                         'nemotron-3-super:latest' => {
                             'name' => 'Nemotron 3 Super'
                         },
+                        'qwen3-coder-next' => {
+                            'name' => 'Qwen3 Coder Next'
+                        },
                         'qwen3-coder:30b' => {
                             'name' => 'Qwen3 Coder 30B'
                         },
@@ -244,7 +247,7 @@ task 'home_prompts', sub {
         my $nested = "$target/$leaf";
         if ( -l $nested && readlink($nested) eq $source ) {
             CORE::unlink($nested) or die "Could not remove nested $label symlink at $nested: $!";
-            rmdir $target or die "Could not remove legacy $label directory at $target: $!";
+            rmdir $target or die "Could not remove legacy $label directory at $target: $!";
         }
         else {
             die "Refusing to overwrite existing directory at $target while linking $label";
diff --git a/prompts/skills/f3s/SKILL.md b/prompts/skills/f3s/SKILL.md
new file mode 100644
index 0000000..2d5f806
--- /dev/null
+++ b/prompts/skills/f3s/SKILL.md
@@ -0,0 +1,47 @@
+---
+name: f3s
+description: Reference skill for the f3s homelab—three Beelink S12 Pro hosts (f0/f1/f2) running FreeBSD with Rocky Linux Bhyve VMs (r0/r1/r2) and a k3s Kubernetes cluster. Use when troubleshooting or making configuration decisions for the f3s setup.
+---
+
+# f3s Homelab Reference
+
+**f3s** = **f**reeBSD + **k3s**. Three physical Beelink S12 Pro mini-PCs (Intel N100) running FreeBSD as the base OS, each hosting a Rocky Linux 9 bhyve VM, forming a 3-node HA k3s Kubernetes cluster.
+
+## When to Use
+
+- Troubleshooting the homelab cluster
+- Making decisions about configuration, storage, networking, or workload placement
+- Answering questions about how the setup works
+
+## Reference Files
+
+Detailed reference documentation is in the `references/` subfolder:
+
+- [Hardware](references/hardware.md) — Beelink S12 Pro specs, network switch, IPs, MAC addresses, Wake-on-LAN
+- [FreeBSD Setup](references/freebsd-setup.md) — Base OS install, packages, ZFS snapshots, configuration
+- [UPS & Power](references/ups-power.md) — APC BX750MI, apcupsd config on f0/f1/f2
+- [Rocky Linux VMs](references/rocky-linux-vms.md) — Bhyve, vm-bhyve, VM config, NVMe disk fix
+- [WireGuard Mesh](references/wireguard.md) — Mesh topology, IP assignments, peer configs
+- [Storage](references/storage.md) — ZFS (zdata), CARP, NFS over stunnel, zrepl replication
+- [k3s Setup](references/k3s-setup.md) — HA k3s cluster, etcd, node IPs, kubeconfig, ArgoCD
+- [Observability](references/observability.md) — Prometheus, Grafana, Loki, Alloy, Tempo
+
+## Quick Reference: Host IPs
+
+| Host | Role | LAN IP | WireGuard IP |
+|------|------|--------|--------------|
+| f0 | FreeBSD host | 192.168.1.130 | 192.168.2.130 |
+| f1 | FreeBSD host | 192.168.1.131 | 192.168.2.131 |
+| f2 | FreeBSD host | 192.168.1.132 | 192.168.2.132 |
+| r0 | Rocky Linux VM on f0 | 192.168.1.120 | 192.168.2.120 |
+| r1 | Rocky Linux VM on f1 | 192.168.1.121 | 192.168.2.121 |
+| r2 | Rocky Linux VM on f2 | 192.168.1.122 | 192.168.2.122 |
+| blowfish | OpenBSD internet GW | — | 192.168.2.110 |
+| fishfinger | OpenBSD internet GW | — | 192.168.2.111 |
+| earth | Fedora laptop (roaming) | — | 192.168.2.200 |
+| pixel7pro | Android (roaming) | — | 192.168.2.201 |
+| f3s-storage-ha | CARP VIP (f0/f1) | 192.168.1.138 | — |
+
+## Config Repository
+
+All manifests and config: `https://codeberg.org/snonux/conf` (directory: `f3s/`)
diff --git a/prompts/skills/f3s/references/freebsd-setup.md
b/prompts/skills/f3s/references/freebsd-setup.md new file mode 100644 index 0000000..247fc1d --- /dev/null +++ b/prompts/skills/f3s/references/freebsd-setup.md @@ -0,0 +1,119 @@ +# FreeBSD Base Setup + +## Installation (Part 2) + +FreeBSD installed from boot-only ISO (`FreeBSD-14.x-RELEASE-amd64-bootonly.iso`) via USB stick, using the text installer. + +Key choices during install: +- **Guided ZFS on root** (pool: `zroot`), unencrypted (boot without manual interaction) +- **Static IP** configuration (see hardware.md for IPs) +- SSH daemon, NTP server/sync, `powerd` (CPU freq scaling) enabled at boot +- User `paul` added to the `wheel` group (for `doas`) + +## Keeping Up to Date + +Patch level updates: +```sh +doas freebsd-update fetch +doas freebsd-update install +doas reboot +``` + +Version upgrade example (14.2 → 14.3): +```sh +doas freebsd-update fetch && doas freebsd-update install && doas reboot +doas freebsd-update -r 14.3-RELEASE upgrade +doas freebsd-update install && doas reboot +doas freebsd-update install +doas pkg update && doas pkg upgrade && doas reboot +``` + +Major version upgrade (14.3 → 15.0) — run one host at a time: +```sh +# Pre-upgrade: patch 14.3, snapshot ZFS, stop bhyve VM +doas freebsd-update fetch && doas freebsd-update install +doas pkg update && doas pkg upgrade +doas zfs snapshot -r zroot@pre-15.0-upgrade +doas vm stop rocky + +# Upgrade sequence (three freebsd-update install passes required) +doas freebsd-update upgrade -r 15.0-RELEASE +doas freebsd-update install && doas reboot # installs new kernel +doas freebsd-update install # installs new userland +doas pkg upgrade # required: ABI changed +doas freebsd-update install # removes old libraries +doas reboot + +# Post-upgrade: start VM, verify k3s node rejoined +doas vm start rocky +# kubectl get nodes (from laptop — node should be Ready) +``` + +Breaking changes in 15.0 to watch for: +- **bhyve PCI BARs**: if VM fails to boot, add `pci.enable_bars='true'` to 
`/zroot/bhyve/rocky/rocky.conf` +- **NFS privileged ports**: if NFS mounts break on r0/r1/r2, add `resvport` to Rocky Linux mount options or `--no-resvport` to NFS server flags + +Current version: **FreeBSD 15.0-RELEASE** (as of Part 8, upgraded from 14.3). + +## /etc/hosts + +All three FreeBSD hosts and Rocky VMs are in `/etc/hosts` on each node: +``` +192.168.1.130 f0 f0.lan f0.lan.buetow.org +192.168.1.131 f1 f1.lan f1.lan.buetow.org +192.168.1.132 f2 f2.lan f2.lan.buetow.org +192.168.1.120 r0 r0.lan r0.lan.buetow.org +192.168.1.121 r1 r1.lan r1.lan.buetow.org +192.168.1.122 r2 r2.lan r2.lan.buetow.org +``` +WireGuard IPs are also added (see wireguard.md). + +## Packages Installed + +```sh +doas pkg install helix doas zfs-periodic uptimed +``` + +- **helix** (`hx`): preferred text editor +- **doas**: KISS `sudo` replacement from OpenBSD; config: `/usr/local/etc/doas.conf` (wheel users run as root) +- **zfs-periodic**: automatic ZFS snapshot tool +- **uptimed**: uptime tracking daemon + +Additional packages added over time: +```sh +doas pkg install vm-bhyve bhyve-firmware # Part 4 - bhyve VMs +doas pkg install wireguard-tools # Part 5 - WireGuard +doas pkg install git go # Part 4 - benchmarking +``` + +## ZFS Snapshot Policy (zfs-periodic) + +Configured in `/etc/periodic.conf` for the `zroot` pool: + +```sh +# Daily: 7 snapshots kept +daily_zfs_snapshot_enable="YES" +daily_zfs_snapshot_pools="zroot" +daily_zfs_snapshot_keep="7" + +# Weekly: 5 snapshots kept +weekly_zfs_snapshot_enable="YES" +weekly_zfs_snapshot_pools="zroot" +weekly_zfs_snapshot_keep="5" + +# Monthly: 6 snapshots kept +monthly_zfs_snapshot_enable="YES" +monthly_zfs_snapshot_pools="zroot" +monthly_zfs_snapshot_keep="6" +``` + +Note: `zdata` pool (for NFS storage) is managed by `zrepl`, not `zfs-periodic`. + +## uptimed + +Config at `/usr/local/mimecast/etc/uptimed.conf` — `LOG_MAXIMUM_ENTRIES=0` (keep all records forever). +Check with `uprecords`. 
+
+## Shell
+
+Default shell is `tcsh` (FreeBSD default). Run `rehash` after installing new packages for tcsh to find them.
diff --git a/prompts/skills/f3s/references/hardware.md b/prompts/skills/f3s/references/hardware.md
new file mode 100644
index 0000000..209f96e
--- /dev/null
+++ b/prompts/skills/f3s/references/hardware.md
@@ -0,0 +1,58 @@
+# Hardware Reference
+
+## Physical Nodes
+
+Three **Beelink S12 Pro** mini-PCs with **Intel N100** CPUs.
+
+### Specs (per node)
+
+| Component | Spec |
+|-----------|------|
+| CPU | Intel N100 (Alder Lake-N), 4 cores/4 threads, up to 3.4 GHz |
+| RAM | 16 GB DDR4 |
+| Primary SSD | 500 GB M.2 (OS) |
+| Secondary SSD | 2.5" slot (used for zdata pool on f0 and f1) |
+| Ethernet | GbE (Realtek, interface `re0`) |
+| USB | 4× USB 3.2 Gen2 |
+| Power | ~8W idle per node; ~38.8W total (3 nodes + switch) under full load |
+| Dimensions | 115×102×39 mm, 280 g |
+
+### Wake-on-LAN
+
+All three Beelinks support WoL (`WOL_MAGIC` on `re0`). The script `~/bin/wol-f3s` on the Fedora laptop (`earth`) controls power:
+
+```bash
+wol-f3s # wake all three
+wol-f3s f0 # wake only f0
+wol-f3s shutdown # graceful SSH shutdown of all three
+```
+
+MAC addresses:
+
+| Host | MAC |
+|------|-----|
+| f0 | e8:ff:1e:d7:1c:ac |
+| f1 | e8:ff:1e:d7:1e:44 |
+| f2 | e8:ff:1e:d7:1c:a0 |
+
+BIOS requirements for WoL: enable "Wake on LAN", disable "ERP Support", enable "Power on by PCI-E".
+
+### IP Addresses (LAN)
+
+| Host | LAN IP | Hostname |
+|------|--------|----------|
+| f0 | 192.168.1.130 | f0.lan.buetow.org |
+| f1 | 192.168.1.131 | f1.lan.buetow.org |
+| f2 | 192.168.1.132 | f2.lan.buetow.org |
+
+Static IPs configured at FreeBSD install time. Also in `/etc/hosts` on all nodes.
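The LAN addressing is regular enough that the `/etc/hosts` entries could be generated rather than typed. The following is a minimal sketch, assuming the numbering seen in this reference (FreeBSD hosts from `.130`, Rocky VMs from `.120`, three of each) and the `name name.lan name.lan.buetow.org` line format; the function name `f3s_hosts_lines` is hypothetical and not part of the repository.

```sh
#!/bin/sh
# Print /etc/hosts entries for the f3s LAN.
# Assumption: f0..f2 map to 192.168.1.130-132 and r0..r2 to 192.168.1.120-122.
f3s_hosts_lines() {
    for prefix_base in f:130 r:120; do
        prefix=${prefix_base%%:*}   # host name prefix, "f" or "r"
        base=${prefix_base##*:}     # last octet of host number 0
        for i in 0 1 2; do
            name="${prefix}${i}"
            printf '192.168.1.%d %s %s.lan %s.lan.buetow.org\n' \
                "$((base + i))" "$name" "$name" "$name"
        done
    done
}

# Print the six entries to stdout for review.
f3s_hosts_lines
```

The output is only printed, so it can be compared with the existing `/etc/hosts` before anything is appended by hand.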
+ +## Network + +- **Switch**: TP-Link EAP615-Wall (OpenWrt Wi-Fi hotspot with 3 Ethernet ports) +- **Uplink**: 100 Mbit/s down / 50 Mbit/s up fiber (was previously 400 Mbit/s) +- UPS also connected to the switch so Wi-Fi stays up during power outages + +## Physical Location + +All infrastructure lives behind the TV (spouse acceptance factor). UPS is on the left, 3 Beelinks stacked on the right. diff --git a/prompts/skills/f3s/references/k3s-setup.md b/prompts/skills/f3s/references/k3s-setup.md new file mode 100644 index 0000000..3351eb9 --- /dev/null +++ b/prompts/skills/f3s/references/k3s-setup.md @@ -0,0 +1,281 @@ +# k3s Setup + +## Overview + +3-node HA k3s cluster running on Rocky Linux VMs (r0, r1, r2). All nodes act as both control-plane and etcd members (no separate worker nodes). + +- k3s version: **v1.32.6+k3s1** (as of Part 7) +- etcd mode: **embedded HA** (`--cluster-init`) +- All control-plane traffic goes over **WireGuard** (192.168.2.x IPs) + +## Prerequisites + +- All Rocky Linux VMs (r0, r1, r2) updated and running +- WireGuard mesh fully configured (see wireguard.md) +- NVMe disk emulation in place (see rocky-linux-vms.md) — critical for etcd performance + +## Installation + +### Generate shared token + +```sh +# On Fedora laptop +pwgen -n 32 +# Copy output to all r nodes: +echo -n SECRET_TOKEN > ~/.k3s_token # on r0, r1, r2 +``` + +### Bootstrap first node (r0) + +```sh +[root@r0 ~]# curl -sfL https://get.k3s.io | K3S_TOKEN=$(cat ~/.k3s_token) \ + sh -s - server --cluster-init \ + --node-ip=192.168.2.120 \ + --advertise-address=192.168.2.120 \ + --tls-san=r0.wg0.wan.buetow.org +``` + +`--node-ip` and `--advertise-address` bind etcd to the WireGuard interface so all control-plane traffic is encrypted. 
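The address-related flags differ between nodes only by the node number. As a hedged sketch of that pattern (the helper name `k3s_node_flags` is hypothetical; the IP range and the `rN.wg0.wan.buetow.org` names are taken from this reference), the flags for node `rN` could be derived like this:

```sh
#!/bin/sh
# Print the address-related k3s server flags for Rocky VM r<N>.
# Assumption: r<N> has WireGuard IP 192.168.2.(120 + N).
k3s_node_flags() {
    n=$1
    ip="192.168.2.$((120 + n))"
    printf '%s %s %s\n' \
        "--node-ip=$ip" \
        "--advertise-address=$ip" \
        "--tls-san=r${n}.wg0.wan.buetow.org"
}

# Example: flags for r1
k3s_node_flags 1
```

This only prints the flags; it does not run the installer, so the result can be checked against the documented commands before use.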
+ +### Join remaining nodes (r1, r2) + +```sh +[root@r1 ~]# curl -sfL https://get.k3s.io | K3S_TOKEN=$(cat ~/.k3s_token) \ + sh -s - server --server https://r0.wg0.wan.buetow.org:6443 \ + --node-ip=192.168.2.121 \ + --advertise-address=192.168.2.121 \ + --tls-san=r1.wg0.wan.buetow.org + +[root@r2 ~]# curl -sfL https://get.k3s.io | K3S_TOKEN=$(cat ~/.k3s_token) \ + sh -s - server --server https://r0.wg0.wan.buetow.org:6443 \ + --node-ip=192.168.2.122 \ + --advertise-address=192.168.2.122 \ + --tls-san=r2.wg0.wan.buetow.org +``` + +### Verify cluster + +```sh +kubectl get nodes +# Expected: r0, r1, r2 all Ready with role control-plane,etcd,master +``` + +## kubeconfig + +```sh +# Copy from any r node to laptop +scp root@r0.lan.buetow.org:/etc/rancher/k3s/k3s.yaml ~/.kube/config +# Edit: replace server address with r0.lan.buetow.org +# (repeat with r1 or r2 if r0 is down) +``` + +## k3s config.yaml — expose etcd and controller-manager metrics + +For Prometheus to scrape etcd and controller-manager metrics, add to `/etc/rancher/k3s/config.yaml` on each r node: + +```sh +cat >> /etc/rancher/k3s/config.yaml << 'EOF' +kube-controller-manager-arg: + - bind-address=0.0.0.0 +etcd-expose-metrics: true +EOF +systemctl restart k3s +``` + +Verify: `curl -s http://127.0.0.1:2381/metrics | grep etcd_server_has_leader` + +## Built-in Components + +| Component | Purpose | +|-----------|---------| +| CoreDNS | DNS for pods | +| Traefik | Ingress controller | +| local-path-provisioner | Local PVC storage | +| metrics-server | Resource metrics | +| svclb-traefik | ServiceLB for Traefik | + +### Scale Traefik to 2 replicas (faster failover) + +```sh +kubectl -n kube-system scale deployment traefik --replicas=2 +``` + +## NFS Persistent Volumes + +Persistent volumes use `hostPath` pointing to NFS-mounted paths: + +``` +/data/nfs/k3svolumes/<app>/ +``` + +NFS is mounted on all r nodes at `/data/nfs/k3svolumes` via stunnel → CARP VIP → freeBSD NFS (see storage.md). 
+ +Example PV: + +```yaml +apiVersion: v1 +kind: PersistentVolume +metadata: + name: example-pv +spec: + capacity: + storage: 1Gi + accessModes: + - ReadWriteOnce + persistentVolumeReclaimPolicy: Retain + hostPath: + path: /data/nfs/k3svolumes/example-volume + type: Directory +``` + +Create the directory on the NFS share before deploying: `mkdir /data/nfs/k3svolumes/<app>/` + +## Deployment: GitOps with ArgoCD + +Config repository: `https://codeberg.org/snonux/conf` (directory: `f3s/`) + +ArgoCD app structure: +``` +argocd-apps/ + monitoring/ # Prometheus, Grafana, Loki, etc. + services/ # User-facing services + infra/ # Infrastructure components + test/ # Test deployments +``` + +**To view pre-ArgoCD state** (how things were in Part 7): +```sh +git clone https://codeberg.org/snonux/conf.git +cd conf && git checkout 15a86f3 # last commit before ArgoCD migration +cd f3s/ +``` + +## Node IP Summary + +| Node | LAN IP | WireGuard IP | k3s API | +|------|--------|-------------|---------| +| r0 | 192.168.1.120 | 192.168.2.120 | r0.wg0.wan.buetow.org:6443 | +| r1 | 192.168.1.121 | 192.168.2.121 | r1.wg0.wan.buetow.org:6443 | +| r2 | 192.168.1.122 | 192.168.2.122 | r2.wg0.wan.buetow.org:6443 | + +## External Connectivity: OpenBSD relayd + +Traffic flow for public access: `Internet → OpenBSD relayd (TLS, Let's Encrypt) → WireGuard → k3s Traefik :80 → Service` + +### relayd.conf on blowfish/fishfinger + +``` +table <f3s> { + 192.168.2.120 + 192.168.2.121 + 192.168.2.122 +} + +http protocol "https" { + tls keypair f3s.foo.zone + # ... all f3s service TLS keypairs ... 
+ # Non-f3s hosts explicitly forwarded to localhost: + match request header "Host" value "foo.zone" forward to <localhost> + # f3s hosts have NO match rules — use relay-level failover +} + +relay "https4" { + listen on <PUBLIC_IP> port 443 tls + protocol "https" + forward to <f3s> port 80 check tcp # primary + forward to <localhost> port 8080 # fallback when f3s down +} +``` + +When all f3s nodes are down, relayd falls back to `localhost:8080` (OpenBSD httpd serving a "Server turned off" page). + +## LAN Ingress: FreeBSD relayd on CARP VIP + +For LAN access without going through internet gateways: +`LAN → CARP VIP (192.168.1.138) → FreeBSD relayd → k3s Traefik :443 → Service` + +### FreeBSD relayd config (`/usr/local/etc/relayd.conf`) + +``` +table <k3s_nodes> { 192.168.1.120 192.168.1.121 192.168.1.122 } + +relay "lan_http" { + listen on 192.168.1.138 port 80 + forward to <k3s_nodes> port 80 check tcp +} + +relay "lan_https" { + listen on 192.168.1.138 port 443 + forward to <k3s_nodes> port 443 check tcp +} +``` + +Minimal `/etc/pf.conf` (PF required for relayd): + +``` +set skip on lo0 +pass in quick +pass out quick +``` + +```sh +doas pkg install -y relayd +doas sysrc pf_enable=YES pflog_enable=YES relayd_enable=YES +doas service pf start && doas service pflog start && doas service relayd start +``` + +Run on both f0 and f1. Only CARP MASTER responds to VIP traffic. 
+ +### cert-manager for LAN TLS + +LAN services use `*.f3s.lan.foo.zone` with a self-signed CA managed by cert-manager: + +```sh +cd conf/f3s/cert-manager && just install +# Creates: selfsigned ClusterIssuer, CA cert, wildcard cert (f3s-lan-tls) +``` + +Copy secret to service namespace: +```sh +kubectl get secret f3s-lan-tls -n cert-manager -o yaml | \ + sed 's/namespace: cert-manager/namespace: services/' | \ + kubectl apply -f - +``` + +### LAN ingress pattern + +```yaml +apiVersion: networking.k8s.io/v1 +kind: Ingress +metadata: + name: ingress-lan + namespace: services + annotations: + spec.ingressClassName: traefik + traefik.ingress.kubernetes.io/router.entrypoints: web,websecure +spec: + tls: + - hosts: + - myservice.f3s.lan.foo.zone + secretName: f3s-lan-tls + rules: + - host: myservice.f3s.lan.foo.zone + http: + paths: + - path: / + pathType: Prefix + backend: + service: + name: myservice + port: + number: 8080 +``` + +## Useful Commands + +```sh +kubectl get nodes # cluster status +kubectl get pods --all-namespaces # all running pods +kubectl get namespaces +kubectl config set-context --current --namespace=<ns> +``` diff --git a/prompts/skills/f3s/references/observability.md b/prompts/skills/f3s/references/observability.md new file mode 100644 index 0000000..0cfa7a9 --- /dev/null +++ b/prompts/skills/f3s/references/observability.md @@ -0,0 +1,273 @@ +# Observability Stack + +## Overview + +Complete observability stack deployed into the `monitoring` namespace of the k3s cluster. 
+ +Stack: **PLG + Tempo** (Prometheus, Loki, Grafana + Tempo for distributed tracing) + +## Components + +| Component | Purpose | +|-----------|---------| +| **Prometheus** | Time-series metrics, alerting rules, Alertmanager | +| **Grafana** | Visualisation and dashboarding | +| **Loki** | Log aggregation (single-binary mode) | +| **Alloy** | Telemetry collector (DaemonSet) — ships logs to Loki, traces to Tempo | +| **Tempo** | Distributed tracing backend | +| **Node Exporter** | Host-level metrics (on k3s nodes AND FreeBSD hosts) | + +## Deployment + +All components deployed via **ArgoCD** (GitOps). Manifests: +``` +https://codeberg.org/snonux/conf/src/branch/master/f3s +argocd-apps/monitoring/ +``` + +Deployment tool: `just` (Justfile in each component directory). + +### Namespaces + +```sh +kubectl create namespace monitoring +``` + +## Installing Prometheus + Grafana + +Uses `kube-prometheus-stack` Helm chart: + +```sh +helm repo add prometheus-community https://prometheus-community.github.io/helm-charts +helm repo update + +# Create NFS storage directories first +mkdir -p /data/nfs/k3svolumes/prometheus/data +mkdir -p /data/nfs/k3svolumes/grafana/data + +cd conf/f3s/prometheus && just install +``` + +### Enable etcd and controller-manager scraping + +Add to `persistence-values.yaml`: + +```yaml +kubeEtcd: + enabled: true + endpoints: [192.168.2.120, 192.168.2.121, 192.168.2.122] + service: + port: 2381 + targetPort: 2381 + +kubeControllerManager: + enabled: true + endpoints: [192.168.2.120, 192.168.2.121, 192.168.2.122] + service: + port: 10257 + targetPort: 10257 + serviceMonitor: + enabled: true + https: true + insecureSkipVerify: true +``` + +Also requires k3s config changes on each r node — see k3s-setup.md. + +### Grafana credentials + +Default: `admin` / `prom-operator` — change immediately after first login. + +Grafana accessible at `grafana.f3s.foo.zone` via Traefik ingress. 
+ +## Installing Loki + Alloy + +```sh +mkdir -p /data/nfs/k3svolumes/loki/data +cd conf/f3s/loki && just install +# installs both loki and alloy +``` + +Loki URL (internal): `http://loki.monitoring.svc.cluster.local:3100` + +Add Loki as Grafana data source: Configuration → Data Sources → Loki → URL above. + +### Alloy configuration (`alloy-values.yaml`) + +``` +discovery.kubernetes "pods" { + role = "pod" +} + +discovery.relabel "pods" { + targets = discovery.kubernetes.pods.targets + rule { source_labels = ["__meta_kubernetes_namespace"]; target_label = "namespace" } + rule { source_labels = ["__meta_kubernetes_pod_name"]; target_label = "pod" } + rule { source_labels = ["__meta_kubernetes_pod_container_name"]; target_label = "container" } + rule { source_labels = ["__meta_kubernetes_pod_label_app"]; target_label = "app" } +} + +loki.source.kubernetes "pods" { + targets = discovery.relabel.pods.output + forward_to = [loki.write.default.receiver] +} + +loki.write "default" { + endpoint { + url = "http://loki.monitoring.svc.cluster.local:3100/loki/api/v1/push" + } +} +``` + +## Installing Tempo + +```sh +mkdir -p /data/nfs/k3svolumes/tempo/data +cd conf/f3s/tempo && just install +``` + +Add Tempo as Grafana data source: Grafana → Configuration → Data Sources → Tempo. 
+ +## Monitoring FreeBSD Hosts (f0, f1, f2) + +### Install node_exporter on FreeBSD + +```sh +# On each FreeBSD host +doas pkg install -y node_exporter +doas sysrc node_exporter_enable=YES +# Bind to WireGuard interface (f0=192.168.2.130, f1=192.168.2.131, f2=192.168.2.132) +doas sysrc node_exporter_args='--web.listen-address=192.168.2.130:9100' +doas service node_exporter start +``` + +### Prometheus scrape config for FreeBSD + +`additional-scrape-configs.yaml`: + +```yaml +- job_name: 'node-exporter' + static_configs: + - targets: + - '192.168.2.130:9100' # f0 via WireGuard + - '192.168.2.131:9100' # f1 via WireGuard + - '192.168.2.132:9100' # f2 via WireGuard + labels: + os: freebsd +``` + +```sh +kubectl create secret generic additional-scrape-configs \ + --from-file=additional-scrape-configs.yaml -n monitoring +``` + +Add to `persistence-values.yaml`: + +```yaml +prometheus: + prometheusSpec: + additionalScrapeConfigsSecret: + enabled: true + name: additional-scrape-configs + key: additional-scrape-configs.yaml +``` + +### FreeBSD memory compatibility rules + +FreeBSD uses different metric names than Linux. 
PrometheusRule to create Linux-compatible metrics: + +```yaml +apiVersion: monitoring.coreos.com/v1 +kind: PrometheusRule +metadata: + name: freebsd-memory-rules + namespace: monitoring + labels: + release: prometheus +spec: + groups: + - name: freebsd-memory + rules: + - record: node_memory_MemTotal_bytes + expr: node_memory_size_bytes{os="freebsd"} + - record: node_memory_MemAvailable_bytes + expr: | + node_memory_free_bytes{os="freebsd"} + + node_memory_inactive_bytes{os="freebsd"} + + node_memory_cache_bytes{os="freebsd"} + - record: node_memory_MemFree_bytes + expr: node_memory_free_bytes{os="freebsd"} + - record: node_memory_Buffers_bytes + expr: node_memory_buffer_bytes{os="freebsd"} + - record: node_memory_Cached_bytes + expr: node_memory_cache_bytes{os="freebsd"} +``` + +Note: Disk I/O metrics (`node_disk_*`) are not available on FreeBSD — use ZFS-specific dashboards instead. + +### ZFS monitoring recording rules + +```yaml +apiVersion: monitoring.coreos.com/v1 +kind: PrometheusRule +metadata: + name: freebsd-zfs-rules + namespace: monitoring + labels: + release: prometheus +spec: + groups: + - name: freebsd-zfs-arc + interval: 30s + rules: + - record: node_zfs_arc_hit_rate_percent + expr: | + 100 * ( + rate(node_zfs_arcstats_hits_total{os="freebsd"}[5m]) / + (rate(node_zfs_arcstats_hits_total{os="freebsd"}[5m]) + + rate(node_zfs_arcstats_misses_total{os="freebsd"}[5m])) + ) + - record: node_zfs_arc_memory_usage_percent + expr: | + 100 * ( + node_zfs_arcstats_size_bytes{os="freebsd"} / + node_zfs_arcstats_c_max_bytes{os="freebsd"} + ) +``` + +## Alerting + +Prometheus → Alertmanager → **Gogios** (custom lightweight monitoring tool running on OpenBSD gateway `blowfish`/`fishfinger`). + +Gogios scrapes Alertmanager at regular intervals and sends email notifications. Reaches Alertmanager via WireGuard mesh. 
+
+## Monitoring Scope
+
+- Kubernetes workloads (pod health, resource usage)
+- Node-level metrics (CPU, memory, disk) — both k3s and FreeBSD nodes
+- ZFS ARC statistics on FreeBSD hosts
+- Application performance metrics
+- Log aggregation from all pods (via Alloy → Loki)
+- Distributed traces (via Alloy → Tempo)
+
+## Useful LogQL Queries
+
+```
+# All logs from services namespace
+{namespace="services"}
+
+# Filter by log content
+{namespace="services"} |= "error"
+
+# Parse JSON logs
+{namespace="services"} | json | level="error"
+```
+
+## NFS Storage Paths for Observability
+
+```
+/data/nfs/k3svolumes/prometheus/data
+/data/nfs/k3svolumes/grafana/data
+/data/nfs/k3svolumes/loki/data
+/data/nfs/k3svolumes/tempo/data
+```
diff --git a/prompts/skills/f3s/references/rocky-linux-vms.md b/prompts/skills/f3s/references/rocky-linux-vms.md
new file mode 100644
index 0000000..2be47c2
--- /dev/null
+++ b/prompts/skills/f3s/references/rocky-linux-vms.md
@@ -0,0 +1,160 @@
+# Rocky Linux Bhyve VMs
+
+## Why Rocky Linux 9
+
+- Long-term support: EOL 2032 — no major upgrades needed
+- RHEL-family compatible (consistent with work and Fedora laptop)
+- Supports Kubernetes (k3s), eBPF, systemd
+
+## Bhyve Setup on FreeBSD Hosts
+
+Tool: **vm-bhyve** (not built into FreeBSD, installed via pkg).
+ +### Install and initialise (run on each of f0, f1, f2) +```sh +doas pkg install vm-bhyve bhyve-firmware +doas sysrc vm_enable=YES +doas sysrc vm_dir=zfs:zroot/bhyve +doas zfs create zroot/bhyve +doas vm init +doas vm switch create public +doas vm switch add public re0 # re0 = Realtek GbE interface +doas ln -s /zroot/bhyve/ /bhyve # convenience symlink +``` + +### Verify CPU virtualisation support +```sh +dmesg | grep 'Features2=.*POPCNT' # must show POPCNT +``` + +## VM Configuration + +### Download ISO and create VM +```sh +doas vm iso https://download.rockylinux.org/pub/rocky/9/isos/x86_64/Rocky-9.5-x86_64-minimal.iso +doas vm create rocky +doas truncate -s 100G /zroot/bhyve/rocky/disk0.img # expand before install +``` + +### VM config (`/zroot/bhyve/rocky/rocky.conf`) +``` +guest="linux" +loader="uefi" +uefi_vars="yes" +cpu=4 +memory=14G +network0_type="virtio-net" +network0_switch="public" +disk0_type="nvme" # NVMe emulation (see below) +disk0_name="disk0.img" +graphics="yes" +graphics_vga=io +uuid="<unique per host>" +network0_mac="<unique per host>" +``` + +The `uuid` and `network0_mac` differ for each of the three VMs. + +### Install from ISO (interactive via VNC) +```sh +doas vm install rocky Rocky-9.5-x86_64-minimal.iso +# VNC address: vnc://f0:5900, vnc://f1:5900, vnc://f2:5900 +``` + +Use GNOME VNC client from `earth` (Fedora laptop) to complete the graphical installer. 
+ +## After Install + +### Auto-start VM on host reboot +```sh +cat <<END | doas tee -a /etc/rc.conf +vm_list="rocky" +vm_delay="5" +END +``` + +### VM IP and hostname (Rocky Linux side) +```sh +nmcli connection modify enp0s5 ipv4.address 192.168.1.120/24 +nmcli connection modify enp0s5 ipv4.gateway 192.168.1.1 +nmcli connection modify enp0s5 ipv4.DNS 192.168.1.1 +nmcli connection modify enp0s5 ipv4.method manual +nmcli connection down enp0s5 && nmcli connection up enp0s5 +hostnamectl set-hostname r0.lan.buetow.org +``` + +VM IPs: + +| VM | LAN IP | Runs on | +|----|--------|---------| +| r0 | 192.168.1.120 | f0 | +| r1 | 192.168.1.121 | f1 | +| r2 | 192.168.1.122 | f2 | + +VM names inside bhyve are all called `rocky` (one per host). + +### SSH access +```sh +# Enable root login (VMs are not internet-reachable) +# Add to /etc/ssh/sshd_config: PermitRootLogin yes + +# Copy SSH keys from laptop +for i in 0 1 2; do ssh-copy-id root@r$i.lan.buetow.org; done + +# Disable password auth after keys are in place +# Set PasswordAuthentication no in /etc/ssh/sshd_config +``` + +### Update Rocky Linux +```sh +dnf update -y && reboot +``` + +## Critical: NVMe Disk Emulation for etcd + +**Problem**: Default `virtio-blk` disk gives ~258 kB/s sync write speed, causing etcd leader elections and "apply request took too long" warnings. + +**Symptom in k3s logs**: +``` +{"level":"warn","msg":"slow fdatasync","took":"1.328469363s","expected-duration":"1s"} +``` + +**Solution**: Switch to NVMe emulation (~100x faster: 24.8 MB/s vs 258 kB/s). 
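The "~100x" figure can be checked from the two throughputs quoted above. A small sketch, assuming decimal units (24.8 MB/s taken as 24800 kB/s); the helper name `speedup` is hypothetical:

```sh
#!/bin/sh
# Ratio of two sync-write throughputs, both given in kB/s.
# Assumption: 1 MB/s = 1000 kB/s, so 24.8 MB/s is passed as 24800.
speedup() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.1f\n", fast / slow }'
}

# NVMe emulation vs. virtio-blk, using the numbers from this section
speedup 24800 258   # prints 96.1
```

So the measured gain is roughly 96x, which the text rounds to "~100x".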
+ +### Step 1: Prepare guest OS (while still on virtio-blk) +```sh +# Add NVMe drivers to initramfs +cat > /etc/dracut.conf.d/nvme.conf << EOF +add_drivers+=" nvme nvme_core " +hostonly=no +EOF + +# Allow LVM to scan all devices (device path changes from /dev/vda to /dev/nvme0n1) +sed -i 's/# use_devicesfile = 1/use_devicesfile = 0/' /etc/lvm/lvm.conf + +dracut -f +shutdown -h now +``` + +### Step 2: Update VM config on FreeBSD host +```sh +doas vm stop rocky +# Edit rocky.conf: change disk0_type="virtio-blk" to disk0_type="nvme" +doas vm configure rocky +doas vm start rocky +``` + +### Caveats +- Do NOT add `disk0_opts="nocache,direct"` with NVMe — makes performance worse +- NVMe drivers must be in initramfs before switching +- LVM `use_devicesfile` must be 0 (disabled) — device path changes + +## VM Management Commands + +```sh +doas vm list # list all VMs and state +doas vm start rocky # start VM +doas vm stop rocky # graceful stop +doas vm reset rocky # force reset +doas sockstat -4 | grep 5900 # check VNC port +``` diff --git a/prompts/skills/f3s/references/storage.md b/prompts/skills/f3s/references/storage.md new file mode 100644 index 0000000..857645c --- /dev/null +++ b/prompts/skills/f3s/references/storage.md @@ -0,0 +1,492 @@ +# Storage + +## Architecture Overview + +Persistent storage for k3s is served via **NFS over stunnel** from the FreeBSD hosts, backed by **ZFS** (`zdata` pool) with **CARP** for high availability and **zrepl** for continuous replication. + +Note: Original plan was HAST, replaced by **zrepl** (ZFS send/receive) — more reliable, avoids ZFS corruption during failover that HAST caused. 
+ +## Physical Disks + +- **f0**: 512GB M.2 (OS/zroot) + Samsung SSD 870 EVO 1TB (zdata) +- **f1**: 512GB M.2 (OS/zroot) + Crucial CT1000BX500SSD1 1TB (zdata) +- **f2**: No second drive (no zdata pool) + +## ZFS: zdata Pool Setup + +On f0 and f1, create the zdata pool on the second SSD: + +```sh +# Pool setup (f0 and f1 only) +doas zpool create zdata ada1 # ada1 = second SSD +``` + +## ZFS Encryption Keys (USB Key Storage) + +Encryption keys are stored on USB flash drives (UFS-formatted, mounted at `/keys`). + +```sh +# Format and mount USB key (on each node) +doas newfs /dev/da0 +echo '/dev/da0 /keys ufs rw 0 2' | doas tee -a /etc/fstab +doas mkdir /keys +doas mount /keys + +# Generate keys (on f0, then copy to f1 and f2) +doas openssl rand -out /keys/f0.lan.buetow.org:bhyve.key 32 +doas openssl rand -out /keys/f1.lan.buetow.org:bhyve.key 32 +doas openssl rand -out /keys/f2.lan.buetow.org:bhyve.key 32 +doas openssl rand -out /keys/f0.lan.buetow.org:zdata.key 32 +doas openssl rand -out /keys/f1.lan.buetow.org:zdata.key 32 +doas openssl rand -out /keys/f2.lan.buetow.org:zdata.key 32 +doas chown root /keys/* && doas chmod 400 /keys/* +# Copy to f1 and f2 via tarball +``` + +## ZFS Encryption Setup + +```sh +# On f0 - create encrypted zdata dataset +doas zfs create -o encryption=on -o keyformat=raw \ + -o keylocation=file:///keys/f0.lan.buetow.org:zdata.key zdata/enc + +# Create the NFS data dataset (replicated to f1) +doas zfs create zdata/enc/nfsdata +doas zfs set mountpoint=/data/nfs zdata/enc/nfsdata +doas mkdir -p /data/nfs/k3svolumes + +# Encrypt Bhyve VM dataset (zroot/bhyve) +# Stop VMs first, rename old, create new encrypted, zfs send snapshot, then destroy old +doas vm stop rocky +doas zfs rename zroot/bhyve zroot/bhyve_old +doas zfs set mountpoint=/mnt zroot/bhyve_old +doas zfs snapshot zroot/bhyve_old/rocky@hamburger +doas zfs create -o encryption=on -o keyformat=raw \ + -o keylocation=file:///keys/f0.lan.buetow.org:bhyve.key zroot/bhyve +doas zfs send 
zroot/bhyve_old/rocky@hamburger | doas zfs recv zroot/bhyve/rocky +# Copy vm-bhyve metadata: .config, .img, .templates, .iso +doas zfs destroy -R zroot/bhyve_old +``` + +### Auto-load encryption keys on boot + +```sh +# On f0 +doas sysrc zfskeys_enable=YES +doas sysrc zfskeys_datasets="zdata/enc zdata/enc/nfsdata zroot/bhyve" + +# On f1 +doas sysrc zfskeys_enable=YES +doas sysrc zfskeys_datasets="zdata/enc zroot/bhyve zdata/sink/f0/zdata/enc/nfsdata" +doas zfs set keylocation=file:///keys/f0.lan.buetow.org:zdata.key \ + zdata/sink/f0/zdata/enc/nfsdata +``` + +## zrepl: Continuous ZFS Replication (f0 → f1) + +Install on both f0 and f1: +```sh +doas pkg install -y zrepl +``` + +### f0 configuration (`/usr/local/etc/zrepl/zrepl.yml`) + +```yaml +global: + logging: + - type: stdout + level: info + format: human + +jobs: + - name: f0_to_f1_nfsdata + type: push + connect: + type: tcp + address: "192.168.2.131:8888" # f1 WireGuard IP + filesystems: + "zdata/enc/nfsdata": true + send: + encrypted: true + snapshotting: + type: periodic + prefix: zrepl_ + interval: 1m # every minute + pruning: + keep_sender: + - type: last_n + count: 10 + - type: grid + grid: 4x7d | 6x30d + regex: "^zrepl_.*" + keep_receiver: + - type: last_n + count: 10 + - type: grid + grid: 4x7d | 6x30d + regex: "^zrepl_.*" + + - name: f0_to_f1_freebsd + type: push + connect: + type: tcp + address: "192.168.2.131:8888" + filesystems: + "zroot/bhyve/freebsd": true # development FreeBSD VM + send: + encrypted: true + snapshotting: + type: periodic + prefix: zrepl_ + interval: 10m # every 10 minutes + pruning: + keep_sender: + - type: last_n + count: 10 + - type: grid + grid: 4x7d + regex: "^zrepl_.*" + keep_receiver: + - type: last_n + count: 10 + - type: grid + grid: 4x7d + regex: "^zrepl_.*" +``` + +### f1 configuration (sink) + +```sh +doas zfs create zdata/sink # receive dataset +``` + +`/usr/local/etc/zrepl/zrepl.yml`: + +```yaml +global: + logging: + - type: stdout + level: info + format: human + 
+jobs: + - name: sink + type: sink + serve: + type: tcp + listen: "192.168.2.131:8888" + clients: + "192.168.2.130": "f0" + recv: + placeholder: + encryption: inherit + root_fs: "zdata/sink" +``` + +### Enable and start + +```sh +doas sysrc zrepl_enable=YES +doas service zrepl start +doas zrepl status # monitor replication +``` + +Replicated paths: `zdata/enc/nfsdata` → `zdata/sink/f0/zdata/enc/nfsdata` + +### Mount replica on f1 (read-only standby) + +```sh +doas zfs load-key -L file:///keys/f0.lan.buetow.org:zdata.key \ + zdata/sink/f0/zdata/enc/nfsdata +doas mkdir -p /data/nfs +doas zfs set mountpoint=/data/nfs zdata/sink/f0/zdata/enc/nfsdata +doas zfs mount zdata/sink/f0/zdata/enc/nfsdata +doas zfs set readonly=on zdata/sink/f0/zdata/enc/nfsdata # prevent replication breakage +``` + +### Failover design: intentionally read-only replica + +The standby replica is read-only by design. Manual failover (not automatic) to prevent split-brain. To fix broken replication after accidental writes: `doas zfs rollback <snapshot>`. + +### zrepl troubleshooting + +```sh +# Signal manual replication +doas zrepl signal wakeup f0_to_f1_nfsdata + +# Fix "no common snapshot" — destroy and re-replicate +doas zfs destroy -r zdata/sink/f0/zdata/enc/nfsdata + +# Test network connectivity +nc -zv 192.168.2.131 8888 + +# Monitor progress +doas zrepl status --mode raw | grep BytesReplicated +``` + +## CARP: High-Availability VIP + +CARP (Common Address Redundancy Protocol) provides **VIP 192.168.1.138** that floats between f0 (primary) and f1 (standby). 
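To see which node currently holds the VIP, the CARP state can be read from `ifconfig` output. The following is a minimal sketch, not part of the original setup: the helper name `carp_state` is hypothetical, and the `carp: STATE vhid N ...` line layout is an assumption about what `ifconfig` prints on these hosts.

```sh
#!/bin/sh
# carp_state VHID
# Reads `ifconfig <iface>` output on stdin and prints the CARP state
# (e.g. MASTER or BACKUP) for the given vhid.
# Assumption: the interface output contains a line shaped like
#   carp: MASTER vhid 1 advbase 1 advskew 0
carp_state() {
    awk -v vhid="$1" '
        $1 == "carp:" {
            # Find the "vhid N" pair and report the state in field 2.
            for (i = 2; i < NF; i++)
                if ($i == "vhid" && $(i + 1) == vhid) { print $2; exit }
        }'
}

# Demonstration with canned input; on a real host this would be:
#   ifconfig re0 | carp_state 1
printf '\tcarp: MASTER vhid 1 advbase 1 advskew 0\n' | carp_state 1
```

Reading the state from stdin keeps the helper testable without a CARP interface.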
+ +### /etc/rc.conf configuration + +```sh +# On f0 (default advskew=0, wins elections) +ifconfig_re0_alias0="inet vhid 1 pass YOURPASSWORD alias 192.168.1.138/32" + +# On f1 (advskew=100, loses elections to f0) +ifconfig_re0_alias0="inet vhid 1 advskew 100 pass YOURPASSWORD alias 192.168.1.138/32" +``` + +### Load CARP module + +```sh +echo 'carp_load="YES"' | doas tee -a /boot/loader.conf +# or immediately: doas kldload carp +``` + +### /etc/hosts for CARP VIP + +``` +192.168.1.138 f3s-storage-ha f3s-storage-ha.lan f3s-storage-ha.lan.buetow.org +192.168.2.138 f3s-storage-ha.wg0 f3s-storage-ha.wg0.wan.buetow.org +``` + +### devd: CARP state change hook + +Add to `/etc/devd.conf` on f0 and f1: + +``` +notify 0 { + match "system" "CARP"; + match "subsystem" "[0-9]+@[0-9a-z.]+"; + match "type" "(MASTER|BACKUP)"; + action "/usr/local/bin/carpcontrol.sh $subsystem $type"; +}; +``` + +```sh +doas service devd restart +``` + +### carpcontrol.sh — start/stop NFS+stunnel on failover + +```sh +#!/bin/sh +HOSTNAME=`hostname` + +if [ ! -f /data/nfs/nfs.DO_NOT_REMOVE ]; then + logger '/data/nfs not mounted, mounting it now!' 
+ if [ "$HOSTNAME" = 'f0.lan.buetow.org' ]; then + zfs load-key -L file:///keys/f0.lan.buetow.org:zdata.key zdata/enc/nfsdata + zfs set mountpoint=/data/nfs zdata/enc/nfsdata + else + zfs load-key -L file:///keys/f0.lan.buetow.org:zdata.key zdata/sink/f0/zdata/enc/nfsdata + zfs set mountpoint=/data/nfs zdata/sink/f0/zdata/enc/nfsdata + zfs mount zdata/sink/f0/zdata/enc/nfsdata + zfs set readonly=on zdata/sink/f0/zdata/enc/nfsdata + fi + service nfsd stop 2>&1 + service mountd stop 2>&1 +fi + +case "$2" in + MASTER) + logger "CARP state changed to MASTER, starting services" + service rpcbind start >/dev/null 2>&1 + service mountd start >/dev/null 2>&1 + service nfsd start >/dev/null 2>&1 + service nfsuserd start >/dev/null 2>&1 + service stunnel restart >/dev/null 2>&1 + ;; + BACKUP) + logger "CARP state changed to BACKUP, stopping services" + service stunnel stop >/dev/null 2>&1 + service nfsd stop >/dev/null 2>&1 + service mountd stop >/dev/null 2>&1 + service nfsuserd stop >/dev/null 2>&1 + ;; +esac +``` + +Install: `doas chmod +x /usr/local/bin/carpcontrol.sh` (copy to f1 too) + +### CARP management script (`/usr/local/bin/carp`) + +```sh +doas carp # show current state +doas carp master # force MASTER (e.g. reclaim after maintenance) +doas carp backup # force BACKUP (trigger failover to f1) +doas carp auto-failback disable # prevent auto-failback (for maintenance) +doas carp auto-failback enable # re-enable auto-failback +``` + +### Auto-failback from f1 to f0 + +Script `/usr/local/bin/carp-auto-failback.sh` runs every minute via cron on f0. Checks: currently BACKUP? `/data/nfs` mounted? Marker file exists? Failback not blocked? If all conditions met, promotes f0 to MASTER. + +```sh +echo "* * * * * /usr/local/bin/carp-auto-failback.sh" | doas crontab - +doas touch /data/nfs/nfs.DO_NOT_REMOVE # marker file required for auto-failback +``` + +Logs to `/var/log/carp-auto-failback.log`. 
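The auto-failback checks listed above can be sketched as a single decision function. This is an illustration only, since the real `carp-auto-failback.sh` is not shown here. The function name `should_failback` and the block-file path are hypothetical, and the sketch assumes that the marker file's presence also proves `/data/nfs` is mounted.

```sh
#!/bin/sh
# should_failback STATE MARKER_FILE BLOCK_FILE
# Returns 0 only when every precondition for promoting f0 holds.
# Assumptions (not from the original script):
#   - the marker file lives on the NFS dataset, so its presence also
#     implies that /data/nfs is mounted
#   - "failback blocked" is represented by the existence of BLOCK_FILE
should_failback() {
    state="$1"; marker="$2"; block="$3"
    [ "$state" = "BACKUP" ] || return 1   # already MASTER: nothing to do
    [ -f "$marker" ] || return 1          # dataset not mounted or marker missing
    [ ! -e "$block" ] || return 1         # failback disabled for maintenance
    return 0
}

# Dry run: only report the decision, do not change CARP state.
# /var/run/carp-failback.blocked is a hypothetical path.
if should_failback BACKUP /data/nfs/nfs.DO_NOT_REMOVE /var/run/carp-failback.blocked; then
    echo "would promote f0 to MASTER"
else
    echo "conditions not met, leaving CARP state unchanged"
fi
```

Passing the state and file paths as arguments keeps the decision separate from the actions, so it can be exercised without touching the live VIP.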
+ +## NFS Server Configuration (f0 and f1) + +```sh +doas sysrc nfs_server_enable=YES +doas sysrc nfsv4_server_enable=YES +doas sysrc nfsuserd_enable=YES +doas sysrc nfsuserd_flags="-domain lan.buetow.org" +doas sysrc mountd_enable=YES +doas sysrc rpcbind_enable=YES + +doas mkdir -p /data/nfs/k3svolumes +doas chmod 755 /data/nfs/k3svolumes +``` + +`/etc/exports` (stunnel clients appear as localhost): + +``` +V4: /data/nfs -sec=sys +/data/nfs -alldirs -maproot=root -network 127.0.0.1 -mask 255.255.255.255 +``` + +Start services: + +```sh +doas service rpcbind start +doas service mountd start +doas service nfsd start +doas service nfsuserd start +``` + +## stunnel: Encrypted NFS over TLS + +stunnel binds to the CARP VIP (192.168.1.138), so only the CARP MASTER accepts connections. Uses mutual TLS with client certificate authentication. + +### Create CA and certificates (on f0) + +```sh +doas mkdir -p /usr/local/etc/stunnel/ca +cd /usr/local/etc/stunnel/ca +doas openssl genrsa -out ca-key.pem 4096 +doas openssl req -new -x509 -days 3650 -key ca-key.pem -out ca-cert.pem \ + -subj '/C=US/ST=State/L=City/O=F3S Storage/CN=F3S Stunnel CA' + +cd /usr/local/etc/stunnel +doas openssl genrsa -out server-key.pem 4096 +doas openssl req -new -key server-key.pem -out server.csr \ + -subj '/C=US/ST=State/L=City/O=F3S Storage/CN=f3s-storage-ha.lan' +doas openssl x509 -req -days 3650 -in server.csr -CA ca/ca-cert.pem \ + -CAkey ca/ca-key.pem -CAcreateserial -out server-cert.pem + +# Client certs for r0, r1, r2, earth +for client in r0 r1 r2 earth; do + openssl genrsa -out ca/${client}-key.pem 4096 + openssl req -new -key ca/${client}-key.pem -out ca/${client}.csr \ + -subj "/C=US/ST=State/L=City/O=F3S Storage/CN=${client}.lan.buetow.org" + openssl x509 -req -days 3650 -in ca/${client}.csr -CA ca/ca-cert.pem \ + -CAkey ca/ca-key.pem -CAcreateserial -out ca/${client}-cert.pem + cat ca/${client}-cert.pem ca/${client}-key.pem > ca/${client}-stunnel.pem +done +``` + +### stunnel server 
config (`/usr/local/etc/stunnel/stunnel.conf`) + +``` +cert = /usr/local/etc/stunnel/server-cert.pem +key = /usr/local/etc/stunnel/server-key.pem +setuid = stunnel +setgid = stunnel + +[nfs-tls] +accept = 192.168.1.138:2323 +connect = 127.0.0.1:2049 +CAfile = /usr/local/etc/stunnel/ca/ca-cert.pem +verify = 2 +requireCert = yes +``` + +```sh +doas pkg install -y stunnel +doas sysrc stunnel_enable=YES +doas service stunnel start +# Copy certs to f1 via tarball, configure identically +``` + +## NFS Client Configuration (Rocky Linux r0, r1, r2) + +```sh +dnf install -y stunnel nfs-utils + +# Copy client cert and CA from f0 +scp f0:/usr/local/etc/stunnel/ca/r0-stunnel.pem /etc/stunnel/ +scp f0:/usr/local/etc/stunnel/ca/ca-cert.pem /etc/stunnel/ +``` + +`/etc/stunnel/stunnel.conf` (r0 example): + +``` +cert = /etc/stunnel/r0-stunnel.pem +CAfile = /etc/stunnel/ca-cert.pem +client = yes +verify = 2 + +[nfs-ha] +accept = 127.0.0.1:2323 +connect = 192.168.1.138:2323 +``` + +```sh +systemctl enable --now stunnel +``` + +### NFSv4 user mapping + +`/etc/idmapd.conf` on r0, r1, r2: + +``` +[General] +Domain = lan.buetow.org +``` + +Fix inotify limit: + +```sh +echo 'fs.inotify.max_user_instances = 512' > /etc/sysctl.d/99-inotify.conf +sysctl -w fs.inotify.max_user_instances=512 +systemctl enable --now nfs-client.target nfs-idmapd +``` + +### Mount NFS + +```sh +mkdir -p /data/nfs/k3svolumes +mount -t nfs4 -o port=2323 127.0.0.1:/k3svolumes /data/nfs/k3svolumes +``` + +`/etc/fstab`: + +``` +127.0.0.1:/k3svolumes /data/nfs/k3svolumes nfs4 port=2323,_netdev,soft,timeo=10,retrans=2,intr 0 0 +``` + +NFS path structure on k3s nodes: `/data/nfs/k3svolumes/<app>/` + +## AWS S3 Glacier Deep Archive Backups + +Encrypted incremental ZFS snapshots from `zdata` pool backed up daily to **AWS S3 Glacier Deep Archive** via cron. Scripts adapted from FreeBSD Home NAS setup. Also performs periodic zpool scrubbing. 
+ +## Storage Summary + +| Layer | Technology | Role | +|-------|-----------|------| +| Block | M.2+2.5" SSD (f0/f1) | Physical storage | +| Filesystem | ZFS (`zdata/enc`) | Data integrity, AES-256-GCM encryption | +| Replication | `zrepl` | Continuous ZFS replication f0→f1 (1min NFS, 10min VM) | +| HA | CARP VIP 192.168.1.138 | Automatic failover for NFS/stunnel | +| Network | NFS over stunnel | Encrypted shared storage, mutual TLS auth | +| LAN access | FreeBSD relayd on CARP VIP | TCP forwarding to k3s :80/:443 | +| Backup | S3 Glacier Deep Archive | Off-site encrypted backup | diff --git a/prompts/skills/f3s/references/ups-power.md b/prompts/skills/f3s/references/ups-power.md new file mode 100644 index 0000000..019887f --- /dev/null +++ b/prompts/skills/f3s/references/ups-power.md @@ -0,0 +1,77 @@ +# UPS and Power Protection + +## Hardware + +**APC Back-UPS BX750MI** (750VA / 410W) + +- ~65 minutes runtime for the f3s cluster at idle load +- USB connectivity to `f0` for monitoring +- 4 outlets: 3× Beelinks + 1× TP-Link switch +- Silent (no noise when on mains power) +- User-replaceable batteries + +## FreeBSD: `apcupsd` on f0 (USB master) + +`f0` is directly connected to the UPS via USB. + +### Detection +``` +ugen0.2: <American Power Conversion Back-UPS BX750MI> at usbus0 +``` + +### Install +```sh +doas pkg install apcupsd +doas sysrc apcupsd_enable=YES +doas service apcupsd start +``` + +### Config (`/usr/local/etc/apcupsd/apcupsd.conf` diff from sample) +``` +UPSCABLE usb +UPSTYPE usb +DEVICE # (empty — auto-detect USB) +BATTERYLEVEL 5 # shutdown when battery < 5% +MINUTES 3 # shutdown when < 3 min runtime left +``` + +### Status check +```sh +apcaccess # full status +apcaccess -p TIMELEFT # remaining minutes +``` + +## `apcupsd` on f1 and f2 (network clients) + +`f1` and `f2` query the UPS status from `f0` over the network (port 3551). +They are configured to shut down *earlier* than `f0` to avoid losing the UPS status feed. 
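The staggered thresholds in this file (BATTERYLEVEL 5 / MINUTES 3 on `f0`, BATTERYLEVEL 10 / MINUTES 6 on `f1` and `f2`) can be checked with a small sketch. The function `should_shutdown` is hypothetical and only approximates the `apcupsd` decision. Treating the thresholds as "less than or equal" is an assumption about the boundary handling.

```sh
#!/bin/sh
# should_shutdown PERCENT MINUTES_LEFT BATTERYLEVEL MINUTES
# Succeeds when either threshold has been reached (integers only).
# Assumption: reaching a threshold exactly already counts as a trigger.
should_shutdown() {
    [ "$1" -le "$3" ] || [ "$2" -le "$4" ]
}

# Example: battery at 9% with 20 minutes of runtime left.
pct=9; mins=20
should_shutdown "$pct" "$mins" 10 6 && echo "f1/f2: shut down now"
should_shutdown "$pct" "$mins" 5 3 || echo "f0: keep running"
```

With these numbers the network clients cross their threshold while `f0` is still above its own, which is the ordering the configuration aims for.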
+ +### Config diff from sample (f1 and f2) +``` +UPSCABLE ether +UPSTYPE net +DEVICE f0.lan.buetow.org:3551 +BATTERYLEVEL 10 # higher than f0's 5% +MINUTES 6 # higher than f0's 3 min +``` + +### Enable +```sh +doas sysrc apcupsd_enable=YES +doas service apcupsd start +apcaccess | grep Percent # verify +``` + +## Shutdown Order + +On power failure, the expected graceful shutdown sequence is: +1. **f1 and f2** — shut down first (BATTERYLEVEL 10, MINUTES 6) +2. **f0** — shuts down last (BATTERYLEVEL 5, MINUTES 3) + +This ensures f1/f2 can still reach f0's apcupsd to learn the UPS status before f0 shuts down. + +## Logs + +```sh +grep apcupsd /var/log/daemon.log +``` diff --git a/prompts/skills/f3s/references/wireguard.md b/prompts/skills/f3s/references/wireguard.md new file mode 100644 index 0000000..4cebc86 --- /dev/null +++ b/prompts/skills/f3s/references/wireguard.md @@ -0,0 +1,210 @@ +# WireGuard Mesh Network + +## Topology + +Full-mesh VPN network connecting all f3s infrastructure hosts plus two roaming clients. + +**Infrastructure hosts** (full mesh — every host connects to every other): +- `f0`, `f1`, `f2` — FreeBSD physical nodes (home LAN) +- `r0`, `r1`, `r2` — Rocky Linux Bhyve VMs +- `blowfish`, `fishfinger` — OpenBSD internet gateways (OpenBSD Amsterdam and Hetzner) + +**Roaming clients** (connect only to gateways): +- `earth` — Fedora laptop (192.168.2.200) +- `pixel7pro` — Android phone (192.168.2.201) + +Even `fN <-> rN` tunnels exist (technically redundant since the VM runs on the host) to keep config uniform. 
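To get a feel for the size of the full mesh, the tunnel pairs between the eight infrastructure hosts can be enumerated. This is only a sketch with a hypothetical helper name (`mesh_pairs`). Roaming clients are left out because they connect only to the gateways.

```sh
#!/bin/sh
# mesh_pairs HOST...
# Prints every unordered pair of hosts; each pair corresponds to one
# WireGuard tunnel in a full mesh.
mesh_pairs() {
    while [ "$#" -gt 1 ]; do
        a="$1"; shift
        for b in "$@"; do
            echo "$a <-> $b"
        done
    done
}

# Eight infrastructure hosts give 8 * 7 / 2 = 28 tunnels.
mesh_pairs f0 f1 f2 r0 r1 r2 blowfish fishfinger | wc -l
```

Every host therefore carries seven `[Peer]` stanzas for the mesh, before any roaming clients are added on the gateways.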
+ +## WireGuard IP Assignments + +| Host | WireGuard IPv4 | WireGuard IPv6 | Role | +|------|----------------|----------------|------| +| f0 | 192.168.2.130 | fd42:beef:cafe:2::130 | FreeBSD host | +| f1 | 192.168.2.131 | fd42:beef:cafe:2::131 | FreeBSD host | +| f2 | 192.168.2.132 | fd42:beef:cafe:2::132 | FreeBSD host | +| r0 | 192.168.2.120 | fd42:beef:cafe:2::120 | Rocky VM (k3s node) | +| r1 | 192.168.2.121 | fd42:beef:cafe:2::121 | Rocky VM (k3s node) | +| r2 | 192.168.2.122 | fd42:beef:cafe:2::122 | Rocky VM (k3s node) | +| blowfish | 192.168.2.110 | fd42:beef:cafe:2::110 | OpenBSD internet GW | +| fishfinger | 192.168.2.111 | fd42:beef:cafe:2::111 | OpenBSD internet GW | +| earth | 192.168.2.200 | fd42:beef:cafe:2::200 | Fedora laptop (roaming) | +| pixel7pro | 192.168.2.201 | fd42:beef:cafe:2::201 | Android phone (roaming) | + +**Listen port: 56709** (all hosts) + +WireGuard hostnames: `<host>.wg0.wan.buetow.org` (e.g. `f0.wg0.wan.buetow.org`) + +## FreeBSD Setup (f0, f1, f2) + +```sh +doas pkg install wireguard-tools +doas sysrc wireguard_interfaces=wg0 +doas sysrc wireguard_enable=YES +doas mkdir -p /usr/local/etc/wireguard +doas touch /usr/local/etc/wireguard/wg0.conf +doas service wireguard start +doas wg show # check public key and listen port +``` + +## Rocky Linux Setup (r0, r1, r2) + +```sh +dnf install -y wireguard-tools +mkdir -p /etc/wireguard +touch /etc/wireguard/wg0.conf +systemctl enable wg-quick@wg0.service +systemctl start wg-quick@wg0.service +systemctl disable firewalld + +# Fix SELinux blocking WireGuard: +dnf install -y policycoreutils-python-utils +semanage permissive -a wireguard_t +reboot +``` + +## OpenBSD Setup (blowfish, fishfinger) + +```sh +doas pkg_add wireguard-tools +doas mkdir /etc/wireguard +doas touch /etc/wireguard/wg0.conf +cat <<END | doas tee /etc/hostname.wg0 +inet 192.168.2.110 255.255.255.0 NONE +up +!/usr/local/bin/wg setconf wg0 /etc/wireguard/wg0.conf +END +``` + +(Use `192.168.2.111` on fishfinger) + +### 
OpenBSD pf.conf — NAT for roaming clients + +```sh +# NAT for WireGuard clients to access internet +match out on vio0 from 192.168.2.0/24 to any nat-to (vio0) + +# Allow inbound traffic on WireGuard interface +pass in on wg0 + +# Allow all UDP traffic on WireGuard port +pass in inet proto udp from any to any port 56709 +``` + +Apply with: `doas pfctl -f /etc/pf.conf` + +## Example wg0.conf (f0) + +``` +[Interface] +# f0.wg0.wan.buetow.org +Address = 192.168.2.130 +PrivateKey = ************************** +ListenPort = 56709 + +[Peer] +# f1.lan.buetow.org as f1.wg0.wan.buetow.org +PublicKey = ************************** +PresharedKey = ************************** +AllowedIPs = 192.168.2.131/32 +Endpoint = 192.168.1.131:56709 + +[Peer] +# blowfish.buetow.org as blowfish.wg0.wan.buetow.org +PublicKey = ************************** +PresharedKey = ************************** +AllowedIPs = 192.168.2.110/32 +Endpoint = 23.88.35.144:56709 +PersistentKeepalive = 25 + +[Peer] +# fishfinger.buetow.org as fishfinger.wg0.wan.buetow.org +PublicKey = ************************** +PresharedKey = ************************** +AllowedIPs = 192.168.2.111/32 +Endpoint = 46.23.94.99:56709 +PersistentKeepalive = 25 +# ... all other mesh peers ... 
+``` + +Notes: +- `PersistentKeepalive = 25` is required for peers behind NAT (blowfish/fishfinger/roaming clients) +- Infrastructure hosts (fN, rN) do NOT need keepalive for peers on the same LAN +- A PSK (preshared key) is used per-pair for extra security + +## Roaming Client wg0.conf (pixel7pro / earth) + +``` +[Interface] +# pixel7pro.wg0.wan.buetow.org +Address = 192.168.2.201 +PrivateKey = ************************** +ListenPort = 56709 +DNS = 1.1.1.1, 8.8.8.8 + +[Peer] +# blowfish.buetow.org +PublicKey = ************************** +PresharedKey = ************************** +AllowedIPs = 0.0.0.0/0, ::/0 +Endpoint = 23.88.35.144:56709 +PersistentKeepalive = 25 + +[Peer] +# fishfinger.buetow.org +PublicKey = ************************** +PresharedKey = ************************** +AllowedIPs = 0.0.0.0/0, ::/0 +Endpoint = 46.23.94.99:56709 +PersistentKeepalive = 25 +``` + +Roaming clients route all traffic (`0.0.0.0/0`) through gateways, only connect to blowfish/fishfinger, and cannot be directly reached by LAN hosts. + +## /etc/hosts Entries for WireGuard + +Add to `/etc/hosts` on each host (FreeBSD and Rocky Linux): + +``` +192.168.2.130 f0.wg0 f0.wg0.wan.buetow.org +192.168.2.131 f1.wg0 f1.wg0.wan.buetow.org +192.168.2.132 f2.wg0 f2.wg0.wan.buetow.org +192.168.2.120 r0.wg0 r0.wg0.wan.buetow.org +192.168.2.121 r1.wg0 r1.wg0.wan.buetow.org +192.168.2.122 r2.wg0 r2.wg0.wan.buetow.org +192.168.2.110 blowfish.wg0 blowfish.wg0.wan.buetow.org +192.168.2.111 fishfinger.wg0 fishfinger.wg0.wan.buetow.org +fd42:beef:cafe:2::130 f0.wg0.wan.buetow.org +fd42:beef:cafe:2::131 f1.wg0.wan.buetow.org +fd42:beef:cafe:2::132 f2.wg0.wan.buetow.org +fd42:beef:cafe:2::120 r0.wg0.wan.buetow.org +fd42:beef:cafe:2::121 r1.wg0.wan.buetow.org +fd42:beef:cafe:2::122 r2.wg0.wan.buetow.org +fd42:beef:cafe:2::110 blowfish.wg0.wan.buetow.org +fd42:beef:cafe:2::111 fishfinger.wg0.wan.buetow.org +``` + +## WireGuard Mesh Generator + +Manually creating 8+ wg0.conf files is error-prone. 
A Ruby script automates this: + +```sh +git clone https://codeberg.org/snonux/wireguardmeshgenerator +cd wireguardmeshgenerator +bundle install +sudo dnf install -y wireguard-tools +``` + +Config file: `wireguardmeshgenerator.yaml` — defines all hosts, their LAN/WG IPs, SSH details, and excluded peers (infrastructure nodes exclude roaming clients). + +The script generates all configs and can push them via SSH. + +## Traffic Flows + +| Flow | Purpose | +|------|---------| +| fN ↔ rN | NFS storage (FreeBSD hosts serve NFS to VMs via stunnel) | +| rN ↔ blowfish/fishfinger | k3s service traffic via `relayd` | +| fN ↔ blowfish/fishfinger | Remote management | +| rN ↔ rM | k3s intra-cluster traffic | +| fN ↔ fM | zrepl storage replication | +| earth/pixel7pro ↔ gateways | Remote access (all traffic routed through VPN) | |
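The addressing scheme used in this file is regular enough that the `/etc/hosts` entries can be derived from a host name and its final address octet. The sketch below is a hand-rolled illustration with a hypothetical helper name (`wg_hosts_lines`). It is not how the mesh generator works, since its config format is not shown here.

```sh
#!/bin/sh
# wg_hosts_lines HOST SUFFIX
# Prints the IPv4 and IPv6 /etc/hosts lines for one mesh host, following
# the 192.168.2.N and fd42:beef:cafe:2::N scheme from the tables above.
wg_hosts_lines() {
    host="$1"; n="$2"
    printf '192.168.2.%s %s.wg0 %s.wg0.wan.buetow.org\n' "$n" "$host" "$host"
    printf 'fd42:beef:cafe:2::%s %s.wg0.wan.buetow.org\n' "$n" "$host"
}

# Example: the two entries for f0.
wg_hosts_lines f0 130
```

Looping this over the host table would reproduce the hosts block, which helps when a new node is added and the entries need to stay consistent.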
