| Age | Commit message | Author |
|
A process blocked on a failed hard NFS mount enters uninterruptible kernel
sleep (D state), which SIGKILL cannot wake, so the recovery script hangs
forever and its lockfile stays behind, silently disabling all subsequent
health checks. Switch the remount to explicit soft,timeo=50,retrans=3 so
the kernel gives up after ~15s, and detect and remove lockfiles older than
90s left behind by a SIGKILL'd predecessor.
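
The stale-lock check could look roughly like the sketch below. It is not the deployed script: the lockfile path and the function name clear_stale_lock are hypothetical, and it assumes GNU stat (-c %Y) is available on the nodes. Only the 90s threshold comes from this commit.

```shell
#!/usr/bin/env bash
# Sketch only. LOCKFILE and clear_stale_lock are hypothetical names;
# the 90-second threshold is the one named in the commit message.
LOCKFILE="${LOCKFILE:-/run/nfs-mount-monitor.lock}"
STALE_AFTER="${STALE_AFTER:-90}"

# clear_stale_lock <lockfile> <max-age-seconds>
# Returns 0 if no lock remains afterwards (absent, or stale and removed),
# 1 if a lock younger than max-age is still present.
clear_stale_lock() {
    local lock="$1" max_age="$2" now mtime
    [ -e "$lock" ] || return 0
    now=$(date +%s)
    mtime=$(stat -c %Y "$lock") || return 1   # GNU stat: mtime in epoch seconds
    if [ $(( now - mtime )) -gt "$max_age" ]; then
        rm -f -- "$lock"
        return 0
    fi
    return 1
}
```

A caller would run clear_stale_lock "$LOCKFILE" "$STALE_AFTER" before taking the lock and skip the run when it returns 1.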
|
|
node_exporter runs as uid 65534 (nobody), but mktemp creates files with
mode 0600, readable only by the owner (root here). Add chmod 644 before
the atomic mv so the node_exporter process can read
nfs_mount_monitor.prom when it scrapes.
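
A minimal sketch of the write path with the chmod in place. The helper name write_textfile_metric appears in this log, but its argument list here is an assumption, as is the "# TYPE" header line. The temp file is created next to the destination so the mv is a same-filesystem rename, and its name does not end in .prom, so the textfile collector should not pick it up half-written.

```shell
#!/usr/bin/env bash
# Sketch only: the signature (dest, metric name, value) is assumed,
# not taken from check-nfs-mount.sh.
write_textfile_metric() {
    local dest="$1" name="$2" value="$3" tmp
    # mktemp creates the file with mode 0600.
    tmp=$(mktemp "${dest}.XXXXXX") || return 1
    if ! printf '# TYPE %s gauge\n%s %s\n' "$name" "$name" "$value" > "$tmp"; then
        rm -f -- "$tmp"
        return 1
    fi
    # Make the file world-readable before it becomes visible under its final name.
    chmod 644 "$tmp" || { rm -f -- "$tmp"; return 1; }
    # rename(2) within one filesystem is atomic, so readers never see a partial file.
    mv -f -- "$tmp" "$dest"
}
```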
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- check-nfs-mount.sh: write nfs_mount_monitor_consecutive_failures gauge
  to /var/lib/node_exporter/textfile_collector/nfs_mount_monitor.prom on
  every run (via write_textfile_metric helper, called from write_fail_count
  and directly on healthy runs); atomic tmp+mv write prevents partial reads
- Rexfile: create /var/lib/node_exporter/textfile_collector dir on r-nodes
- prometheus.yaml (ArgoCD app): enable textfile_collector in node_exporter
  DaemonSet via extraArgs/extraVolumes/extraVolumeMounts; mount host path
  /var/lib/node_exporter/textfile_collector into container
- persistence-values.yaml: sync node_exporter textfile_collector config
- nfs-mount-monitor-alerts.yaml: PrometheusRule with two alerts:
    NfsMountAutoRepairWarning  (>= 3 consecutive failures, severity: warning)
    NfsMountAutoRepairCritical (>= 5 consecutive failures, severity: critical)
  wired into new 'nfs-alerts' Alertmanager receiver with 30m repeat_interval
Tested: rex deploy succeeded, .prom files present on r0/r1/r2, timer clean.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Persist a consecutive-failure counter to /var/lib/nfs-mount-monitor/fail-count.
Increment on every fix_mount failure; reset to 0 on any successful repair or
when all three probes pass cleanly. After NFS_FAIL_THRESHOLD consecutive
failures (default 5, roughly 50s at the 10-second timer interval), the node is
cordoned via kubectl and rebooted with 'systemctl reboot' so the cluster stops
routing pods to a silently broken node.
NFS_FAIL_THRESHOLD is configurable via /etc/default/nfs-mount-monitor (deployed
as EnvironmentFile in the .service unit) without touching the script.
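
The counter logic can be sketched as below. This is an illustration, not the deployed code: record_failure and record_success are hypothetical names, and write_fail_count is shown without the metric export it gains in a later commit. The file path and the default threshold of 5 are the ones stated above.

```shell
#!/usr/bin/env bash
# Sketch only. Path and default threshold come from the commit message;
# function names other than write_fail_count are hypothetical.
FAIL_COUNT_FILE="${FAIL_COUNT_FILE:-/var/lib/nfs-mount-monitor/fail-count}"
NFS_FAIL_THRESHOLD="${NFS_FAIL_THRESHOLD:-5}"

# Print the persisted counter, treating a missing or malformed file as 0.
read_fail_count() {
    local n=0
    [ -r "$FAIL_COUNT_FILE" ] && read -r n < "$FAIL_COUNT_FILE"
    case "$n" in ''|*[!0-9]*) n=0 ;; esac
    printf '%s\n' "$n"
}

write_fail_count() {
    mkdir -p -- "$(dirname -- "$FAIL_COUNT_FILE")" &&
        printf '%s\n' "$1" > "$FAIL_COUNT_FILE"
}

# Increment the counter; return 0 once the threshold is reached, 1 otherwise.
record_failure() {
    local n
    n=$(( $(read_fail_count) + 1 ))
    write_fail_count "$n" || return 2
    [ "$n" -ge "$NFS_FAIL_THRESHOLD" ]
}

record_success() { write_fail_count 0; }

# Usage in the monitor (not executed here; the node name is an assumption):
#   if record_failure; then
#       kubectl cordon "$(uname -n)"
#       systemctl reboot
#   fi
```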
Also fix Rexfile path resolution: __FILE__ inside a Rex task resolves to the
internal Rex loader path, not the Rexfile itself; use realpath($::rexfile)
instead.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Add lazy umount fallback, D-state process killer, stunnel restart, and
60-second hard deadline to prevent fix_mount from looping forever when
processes are stuck in D state on a stale NFSv4-over-stunnel mount.
Recovery sequence is now:
1. mount -o remount -f (cheap, no disruption)
2. kill_pinning_processes (SIGKILL D-state procs with nfs_ wchan)
3. umount -f (force unmount)
4. umount -l (lazy detach VFS node if -f failed)
5. systemctl restart stunnel + 2s sleep (refresh TLS transport)
6. mount (fresh mount)
The 60s deadline uses bash $SECONDS so fix_mount can never outlast its
own 10-second timer interval by an unbounded amount. Deployed to all
three r-nodes (r0/r1/r2) via rex nfs_mount_monitor.
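
One way to express the $SECONDS deadline is a guard checked before each recovery step, sketched below. The helper names deadline_reached and run_before_deadline are hypothetical, and the real fix_mount may branch between steps rather than run them linearly. The guard only bounds the time between steps; a step that itself blocks still needs its own timeout.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; 60 is the deadline from the commit.
FIX_DEADLINE="${FIX_DEADLINE:-60}"

# deadline_reached <start> <limit>
# True once at least <limit> seconds have passed since the $SECONDS snapshot <start>.
deadline_reached() {
    local start="$1" limit="$2"
    (( SECONDS - start >= limit ))
}

# run_before_deadline <limit> <step>...
# Calls each step (a function or command name) in order, checking the deadline
# before every step. Returns 1 if the deadline cut the sequence short, else 0.
run_before_deadline() {
    local limit="$1" start=$SECONDS step
    shift
    for step in "$@"; do
        if deadline_reached "$start" "$limit"; then
            echo "deadline of ${limit}s reached before step: $step" >&2
            return 1
        fi
        "$step" || true   # later steps act as fallbacks, so keep going on failure
    done
}

# Usage (step functions are placeholders for the six steps listed above):
#   run_before_deadline "$FIX_DEADLINE" try_remount kill_pinning_processes \
#       force_umount lazy_umount restart_stunnel fresh_mount
```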
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Stunnel-wrapped NFSv4 can enter a half-broken state where mountpoint(1)
returns true and stat(1) completes from cache, but ALL writes hang
indefinitely. This was observed on r2 on 2026-05-10 causing navidrome
to be unschedulable. The existing two probes passed while writes were
dead.
Add a third probe (write-probe) after the stat probe: write the shell's
PID to a per-host .healthcheck.<hostname> file and immediately remove it,
wrapped in a 5-second timeout. The per-host filename prevents r0/r1/r2
from racing on the same file. 5s gives one full NFS retransmit window
(timeo=10 deciseconds = 1s, retrans=2) plus margin without making the
10-second timer run too long.
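
A sketch of the write probe as described, assuming coreutils timeout(1) is installed. The function name write_probe and its arguments are hypothetical, and the PID written here is that of the inner sh, which may differ from what the real script records.

```shell
#!/usr/bin/env bash
# Sketch only: write_probe and its argument order are assumptions.
# write_probe <mountpoint> [timeout-seconds]
# Returns 0 if the per-host probe file could be written and removed in time.
write_probe() {
    local dir="$1" limit="${2:-5}" probe
    # Per-host filename so nodes sharing the export do not race on one file.
    probe="$dir/.healthcheck.${HOSTNAME:-$(uname -n)}"
    # timeout exits with 124 if the write or the unlink does not finish in time.
    timeout "$limit" sh -c 'echo "$$" > "$1" && rm -f -- "$1"' sh "$probe"
}
```

Any non-zero return (124 on timeout, or the sh failure status) would count as a failed probe and hand over to fix_mount.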
Deployed to r0/r1/r2 via rex nfs_mount_monitor; all three nodes
confirmed running the new script (journalctl shows clean exits).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Pull check-nfs-mount.sh, nfs-mount-monitor.service, and
nfs-mount-monitor.timer from r0/r1/r2 (confirmed identical on all
three nodes) into f3s/r-nodes/nfs-mount-monitor/. Add
f3s/r-nodes/Rexfile with an idempotent nfs_mount_monitor task that
pushes the files to all three r-nodes as root and reloads systemd when
content changes. Wire the new Rexfile into the repo root Rexfile.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|