| author | Paul Buetow <paul@buetow.org> | 2026-03-19 09:22:20 +0200 |
|---|---|---|
| committer | Paul Buetow <paul@buetow.org> | 2026-03-19 09:22:20 +0200 |
| commit | dbffb321ec6e03bb2eb263db94cf56ad07acdcbe (patch) | |
| tree | 2fe115a3b67ed7b2b7f30c5bfbb0b939ed11b084 | |
| parent | 53da530eeb5016ec064d231d6be7aba08bd844d7 (diff) | |
Update
| -rw-r--r-- | fish/conf.d/update.fish | 17 |
| -rw-r--r-- | prompts/skills/f3s/references/freebsd-setup.md | 2 |
| -rw-r--r-- | prompts/skills/f3s/references/k3s-setup.md | 17 |
| -rw-r--r-- | prompts/skills/f3s/references/storage.md | 44 |
4 files changed, 63 insertions, 17 deletions
diff --git a/fish/conf.d/update.fish b/fish/conf.d/update.fish
index 4f8d252..9e31e4f 100644
--- a/fish/conf.d/update.fish
+++ b/fish/conf.d/update.fish
@@ -17,30 +17,23 @@ function update::tools
     go install golang.org/x/tools/cmd/goimports@latest &
     set -a pids $last_pid

-    for prog in hexai hexai-lsp-server hexai-tmux-action hexai-tmux-edit hexai-mcp-server
+    for prog in hexai hexai-lsp-server hexai-tmux-action hexai-tmux-edit hexai-mcp-server ask
         echo "Installing/updating $prog from codeberg.org/snonux/hexai/cmd/$prog@latest"
         go install codeberg.org/snonux/hexai/cmd/$prog@latest &
         set -a pids $last_pid
     end

-    # Renamed to hexai-lsp-server
-    if test -f ~/go/bin/hexai-lsp
-        rm ~/go/bin/hexai-lsp
-    end
-    if test -f ~/scripts/taskwarriorfeeder.rb
-        rm ~/scripts/taskwarriorfeeder.rb
+    # Obsolete moved to keepass
+    if test -f ~/go/bin/foostore
+        rm ~/go/bin/foostore
     end

-    for prog in tasksamurai timesamurai perc loadbars foostore
+    for prog in tasksamurai timesamurai perc loadbars
         echo "Installing/updating $prog from codeberg.org/snonux/$prog/cmd/$prog@latest"
         go install codeberg.org/snonux/$prog/cmd/$prog@latest &
         set -a pids $last_pid
     end

-    if test -f ~/git/bin/timr
-        rm ~/git/bin/timr
-    end
-
     if test (uname) = Darwin
         echo 'Updating cursor-agent on macOS'
         cursor-agent update &
diff --git a/prompts/skills/f3s/references/freebsd-setup.md b/prompts/skills/f3s/references/freebsd-setup.md
index d55fb8f..d98e1e1 100644
--- a/prompts/skills/f3s/references/freebsd-setup.md
+++ b/prompts/skills/f3s/references/freebsd-setup.md
@@ -81,7 +81,7 @@ This was observed on `freebsd.lan` (FreeBSD bhyve VM on f3): `/etc/resolv.conf`
 ## Breaking Changes in 15.0 to Watch For

 - **bhyve PCI BARs**: if VM fails to boot, add `pci.enable_bars='true'` to `/zroot/bhyve/rocky/rocky.conf`
-- **NFS privileged ports**: FreeBSD 15.0 sets `vfs.nfsd.nfs_privport=1` by default, blocking NFS clients connecting via stunnel (unprivileged ports). Fix: add `vfs.nfsd.nfs_privport=0` to `/etc/sysctl.conf` on each f-host, then `doas sysctl vfs.nfsd.nfs_privport=0` to apply immediately, and remount NFS on r-hosts with `mount -a`.
+- **NFS privileged ports**: FreeBSD 15.0 sets `nfs_reserved_port_only=YES` in `/etc/defaults/rc.conf`, which causes the nfsd rc script to set `vfs.nfsd.nfs_privport=1` at startup — blocking NFS clients connecting via stunnel (unprivileged ports). **Important**: setting `vfs.nfsd.nfs_privport=0` in `/etc/sysctl.conf` or `/boot/loader.conf` does NOT work because the nfsd rc script overrides it. The correct fix on **each f-host**: `doas sysrc nfs_reserved_port_only=NO`
 - **WireGuard interface address**: FreeBSD 15.0 requires a prefix length when setting interface addresses. Add `/32` to IPv4 `Address` lines in `/usr/local/etc/wireguard/wg0.conf` (e.g. `Address = 192.168.2.130/32`). Without this, `service wireguard start` fails with "setting interface address without mask is no longer supported".

 Current version: **FreeBSD 15.0-RELEASE** (as of Part 8, upgraded from 14.3).
diff --git a/prompts/skills/f3s/references/k3s-setup.md b/prompts/skills/f3s/references/k3s-setup.md
index ab9eb0d..61b3bf5 100644
--- a/prompts/skills/f3s/references/k3s-setup.md
+++ b/prompts/skills/f3s/references/k3s-setup.md
@@ -130,6 +130,23 @@ spec:

 Create the directory on the NFS share before deploying: `mkdir /data/nfs/k3svolumes/<app>/`

+### NFS Mount Health Monitor (on r0, r1, r2)
+
+Each Rocky Linux node runs `/usr/local/bin/check-nfs-mount.sh` via cron (every minute) to detect and fix stale/missing NFS mounts. After a successful remount, the script also **force-deletes stuck pods** on the local node (status Unknown, Pending, or ContainerCreating) so Kubernetes reschedules them with the healthy mount.
+
+```sh
+# Cron entry (on all r-nodes, as root)
+* * * * * /usr/local/bin/check-nfs-mount.sh >> /var/log/nfs-mount-check.log 2>&1
+```
+
+The script:
+1. Checks if `/data/nfs/k3svolumes` is a mountpoint and responsive (2s timeout)
+2. If stale/missing: force-unmounts + remounts NFS
+3. After successful remount: uses `kubectl` to find and delete stuck pods on this node
+4. Uses a lock file (`/var/run/nfs-mount-check.lock`) to prevent concurrent runs
+
+**Important**: If NFS goes down cluster-wide, the root cause is usually on the FreeBSD NFS server side (f0/f1). Check CARP state, stunnel, nfsd, and `vfs.nfsd.nfs_privport` (see storage.md).
+
 ## Deployment: GitOps with ArgoCD

 Config repository: `https://codeberg.org/snonux/conf` (directory: `f3s/`)
diff --git a/prompts/skills/f3s/references/storage.md b/prompts/skills/f3s/references/storage.md
index a43c2e8..ebe749a 100644
--- a/prompts/skills/f3s/references/storage.md
+++ b/prompts/skills/f3s/references/storage.md
@@ -397,17 +397,18 @@ doas sysrc nfsuserd_enable=YES
 doas sysrc nfsuserd_flags="-domain lan.buetow.org"
 doas sysrc mountd_enable=YES
 doas sysrc rpcbind_enable=YES
+doas sysrc nfs_reserved_port_only=NO # Required for NFS over stunnel (unprivileged ports)

 doas mkdir -p /data/nfs/k3svolumes
 doas chmod 755 /data/nfs/k3svolumes
 ```

-> **FreeBSD 15.0 note**: FreeBSD 15.0 changed the default for `vfs.nfsd.nfs_privport` from `0` to `1`, requiring NFS clients to connect from privileged ports (<1024). NFS over stunnel uses unprivileged ports, so this breaks all NFS mounts on the r-hosts. Fix on **each f-host**:
+> **FreeBSD 15.0 note**: FreeBSD 15.0 sets `nfs_reserved_port_only=YES` by default in `/etc/defaults/rc.conf`. The nfsd rc script (`/etc/rc.d/nfsd`) checks this variable and explicitly runs `sysctl vfs.nfsd.nfs_privport=1` at startup, overriding any value set in `/etc/sysctl.conf` or `/boot/loader.conf`. This blocks NFS clients connecting via stunnel (unprivileged ports). Fix on **each f-host**:
 > ```sh
-> # Apply immediately
+> # The ONLY correct fix — setting sysctl.conf does NOT work
+> doas sysrc nfs_reserved_port_only=NO
+> # Apply immediately without reboot
 > doas sysctl vfs.nfsd.nfs_privport=0
-> # Persist across reboots
-> echo "vfs.nfsd.nfs_privport=0" | doas tee -a /etc/sysctl.conf
 > # Remount on each r-host
 > mount -a
 > ```
@@ -541,6 +542,41 @@ mount -t nfs4 -o port=2323 127.0.0.1:/k3svolumes /data/nfs/k3svolumes

 NFS path structure on k3s nodes: `/data/nfs/k3svolumes/<app>/`

+## NFS Troubleshooting
+
+### All r-nodes show "access denied" when mounting NFS
+
+**Most likely cause**: `vfs.nfsd.nfs_privport=1` on the CARP MASTER. This happens after f-host reboots if `nfs_reserved_port_only` is not set to `NO` in rc.conf. The nfsd rc script (`/etc/rc.d/nfsd`) explicitly sets the sysctl based on this variable, overriding `/etc/sysctl.conf`. Fix: `doas sysrc nfs_reserved_port_only=NO` on both f0 and f1.
+
+### stunnel appears not running but port 2323 is bound
+
+`carpcontrol.sh` starts stunnel on CARP MASTER transition, but doesn't write a PID file. So `service stunnel status` reports "not running" even though stunnel is actually serving connections. Check with `doas sockstat -l | grep 2323`. If there's a stale stunnel process, kill it and restart: `doas kill <pid> && doas service stunnel start`.
+
+### Pods stuck in ContainerCreating/Unknown after NFS recovery
+
+After NFS is restored on the server side, the r-nodes' cron job (`check-nfs-mount.sh`) will auto-remount within 1 minute and force-delete stuck pods. If immediate recovery is needed: `mount /data/nfs/k3svolumes` on each r-node, then delete the stuck pods manually.
+
+### Checklist for NFS outage on CARP MASTER (f0 or f1)
+
+```sh
+# 1. Check which host is CARP MASTER
+ssh paul@f0 'ifconfig re0 | grep carp'
+ssh paul@f1 'ifconfig re0 | grep carp'
+
+# 2. On the MASTER, verify:
+doas sysctl vfs.nfsd.nfs_privport # must be 0
+doas service nfsd status # must be running
+doas sockstat -l | grep 2323 # stunnel must be listening
+ls /data/nfs/nfs.DO_NOT_REMOVE # ZFS dataset must be mounted
+
+# 3. Fix if needed:
+doas sysrc nfs_reserved_port_only=NO # persist the fix
+doas sysctl vfs.nfsd.nfs_privport=0 # apply immediately
+doas service nfsd restart
+# For stunnel, kill stale process if needed, then:
+doas service stunnel start
+```
+
 ## AWS S3 Glacier Deep Archive Backups

 Encrypted incremental ZFS snapshots from `zdata` pool backed up daily to **AWS S3 Glacier Deep Archive** via cron. Scripts adapted from FreeBSD Home NAS setup. Also performs periodic zpool scrubbing.
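
The four steps that the k3s-setup.md hunk lists for the NFS mount health monitor could look roughly like the sketch below. This is not the actual `/usr/local/bin/check-nfs-mount.sh`, whose contents are not part of this commit. The mount point, lock file path, 2-second timeout and the list of stuck statuses are taken from the diff. The function names, the `NFS_CHECK_RUN` guard variable, the use of `flock`, the reliance on an `/etc/fstab` entry, and the assumption that the Kubernetes node name equals `hostname` are all hypothetical.

```sh
#!/bin/sh
# Hypothetical sketch of the health check described in k3s-setup.md.
# NOT the real check-nfs-mount.sh; helper names and NFS_CHECK_RUN are made up.

MOUNT_POINT="${MOUNT_POINT:-/data/nfs/k3svolumes}"
LOCK_FILE="${LOCK_FILE:-/var/run/nfs-mount-check.lock}"

# Step 1: healthy means "is a mountpoint" and "answers stat within 2 seconds".
mount_is_healthy() {
    mountpoint -q "$1" && timeout 2 stat "$1" >/dev/null 2>&1
}

# Step 2: force-unmount (ignoring failure if nothing is mounted), then remount.
# Assumption: an /etc/fstab entry exists for the mount point.
remount_nfs() {
    umount -f "$1" 2>/dev/null
    mount "$1"
}

# The statuses the document treats as stuck.
is_stuck_status() {
    case "$1" in
        Unknown|Pending|ContainerCreating) return 0 ;;
        *) return 1 ;;
    esac
}

# Reads default `kubectl get pods --all-namespaces --no-headers` rows on stdin
# (NAMESPACE NAME READY STATUS ...) and prints "namespace name" for stuck pods.
filter_stuck_pods() {
    while read -r ns name _ready status _rest; do
        if is_stuck_status "$status"; then
            echo "$ns $name"
        fi
    done
}

# Step 3: force-delete stuck pods scheduled on this node.
# Assumption: the Kubernetes node name matches the output of `hostname`.
delete_stuck_pods() {
    kubectl get pods --all-namespaces --no-headers \
        --field-selector "spec.nodeName=$(hostname)" |
        filter_stuck_pods |
        while read -r ns name; do
            kubectl delete pod "$name" --namespace "$ns" --force --grace-period=0
        done
}

main() {
    # Step 4: hold an exclusive lock so overlapping cron runs exit early.
    exec 9>"$LOCK_FILE"
    flock -n 9 || exit 0

    if ! mount_is_healthy "$MOUNT_POINT"; then
        echo "$(date): $MOUNT_POINT stale or missing, remounting"
        if remount_nfs "$MOUNT_POINT" && mount_is_healthy "$MOUNT_POINT"; then
            delete_stuck_pods
        fi
    fi
}

# Only act when explicitly enabled, so the file can be sourced safely.
if [ "${NFS_CHECK_RUN:-0}" = 1 ]; then
    main
fi
```

Keeping the status filter separate from the `kubectl` calls means the parsing can be checked against sample rows without a cluster. Under these assumptions, a cron entry would set `NFS_CHECK_RUN=1` before invoking the script.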
