conf - Configuration files for the automation of my personal infrastructure (servers, laptops, workstations, phones)!

Age	Commit message	Author
2025-12-28	Clean up Tempo datasource ConfigMap formatting	Paul Buetow
	Remove unnecessary quotes and comments from the Tempo datasource ConfigMap. This file is now deprecated in favor of the unified grafana-datasources-all.yaml approach, but keeping it cleaned up for historical reference.
	Changes:
	- Remove quotes from string values (datasourceUid, spanStartTimeShift, etc.)
	- Remove inline comments
	- Format tags array properly
	- Standardize YAML formatting
	Note: This ConfigMap is no longer used. Datasources are now provisioned via direct ConfigMap mounting using grafana-datasources-all.yaml.
	🤖 Generated with [Claude Code](https://claude.com/claude-code)
	Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-28	Fix distributed tracing by excluding health checks from instrumentation	Paul Buetow
	Problem:
	- Only health check traces appeared in Tempo
	- API endpoint traces (/api/process) were not visible
	- Alloy OTLP receivers were not listening (needed restart)
	Root Causes:
	1. Health check endpoints were creating massive trace volume from Kubernetes probes
	2. Batch processor (100 spans) was filling with health checks before API traces could export
	3. Alloy DaemonSet needed restart to activate OTLP receivers after configuration update
	Solution:
	1. Restarted Alloy to activate OTLP gRPC (4317) and HTTP (4318) receivers
	2. Excluded /health endpoint from Flask auto-instrumentation in all three services:
	   - frontend: FlaskInstrumentor().instrument_app(app, excluded_urls="/health")
	   - middleware: FlaskInstrumentor().instrument_app(app, excluded_urls="/health")
	   - backend: FlaskInstrumentor().instrument_app(app, excluded_urls="/health")
	Result:
	✅ Distributed traces now visible in Tempo with full span chains
	✅ Single /api/process request creates 8 spans across 3 services:
	   - Frontend: GET /api/process, frontend-process, POST (200ms)
	   - Middleware: POST /api/transform, middleware-transform, GET (180ms)
	   - Backend: GET /api/data, backend-get-data (100ms)
	✅ Complete request flow traced: frontend → middleware → backend
	✅ Node graph will now show service dependencies
	✅ Traces-to-logs and traces-to-metrics correlation enabled
	🤖 Generated with [Claude Code](https://claude.com/claude-code)
	Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
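The effect of excluding /health can be illustrated with a small standard-library sketch. It is not the OpenTelemetry implementation: it only assumes that an exclusion string is a comma-separated list of regular expressions searched against the request path. The function names `build_exclusion_matcher` and `traced_paths`, and the sample traffic, are hypothetical.

```python
import re


def build_exclusion_matcher(excluded_urls: str):
    """Compile a comma-separated list of regex fragments into one predicate.

    Assumption: a path is excluded if any fragment matches anywhere in it.
    """
    patterns = [p.strip() for p in excluded_urls.split(",") if p.strip()]
    if not patterns:
        return lambda path: False
    combined = re.compile("|".join(patterns))
    return lambda path: combined.search(path) is not None


def traced_paths(request_paths, excluded_urls="/health"):
    """Return only the request paths that would still produce a trace."""
    is_excluded = build_exclusion_matcher(excluded_urls)
    return [path for path in request_paths if not is_excluded(path)]


if __name__ == "__main__":
    # Hypothetical traffic: many probe hits, few real API calls.
    paths = ["/health"] * 250 + ["/api/process"] * 3
    kept = traced_paths(paths)
    print(f"{len(paths)} requests, {len(kept)} traced")
```

Because this sketch searches for the pattern anywhere in the path, a path such as /healthz would be excluded as well; an anchored pattern would be needed to avoid that.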
2025-12-28	Fix Grafana datasource provisioning by switching to direct ConfigMap mounting	Paul Buetow
	After extensive debugging (documented in problem.md), resolved the issue where Tempo and Loki datasources would not appear in Grafana despite correct configuration.
	Root Cause:
	- Sidecar-based provisioning with label discovery was not triggering the provisioner module
	- Multi-step indirection (sidecar → watch → write → reload) had silent failures
	Solution (following x-rag pattern):
	- Disabled sidecar datasource provisioning
	- Created unified grafana-datasources-all.yaml with all datasources
	- Mount ConfigMap directly to /etc/grafana/provisioning/datasources/
	- Grafana now reads datasources on startup via built-in provisioning
	Changes:
	- NEW: grafana-datasources-all.yaml - Unified datasource configuration (Prometheus, Alertmanager, Loki, Tempo)
	- MODIFIED: persistence-values.yaml - Disabled sidecar, added extraVolumes/extraVolumeMounts
	- MODIFIED: Justfile - Updated to use unified ConfigMap, removed patch script
	- MODIFIED: README.md - Documented new provisioning approach
	- NEW: problem.md - Complete debugging journey with 16 attempts documented
	- DEPRECATED: loki-datasource.yaml, tempo-datasource.yaml, patch-datasources.sh (kept for history)
	Result:
	✅ All datasources now successfully provision on Grafana startup
	✅ Tempo datasource (uid=tempo) appears in Grafana with traces-to-logs correlation
	✅ Loki datasource (uid=loki) appears in Grafana
	✅ Simple, maintainable approach without sidecar complexity
	🤖 Generated with [Claude Code](https://claude.com/claude-code)
	Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-28	Add Grafana Tempo distributed tracing with demo application	Paul Buetow
	- Deploy Grafana Tempo in monolithic mode for distributed tracing
	- Configure Tempo with OTLP receivers (gRPC:4317, HTTP:4318)
	- Set up 10Gi filesystem storage with 7-day retention
	- Integrate Tempo datasource in Grafana with traces-to-logs and traces-to-metrics correlation
	- Update Grafana Alloy to collect and forward traces
	- Add OTLP receiver configuration to alloy-values.yaml
	- Configure batch processor for efficient trace forwarding to Tempo
	- Patch Alloy service to expose OTLP ports 4317/4318
	- Create demo tracing application (frontend, middleware, backend)
	- Implement three-tier Python Flask application with OpenTelemetry instrumentation
	- Auto-instrument with OpenTelemetry for Flask and requests libraries
	- Push Docker images to private registry (registry.lan.buetow.org:30001)
	- Deploy via Helm chart with Traefik ingress at tracing-demo.f3s.buetow.org
	- Update Grafana configuration in prometheus/persistence-values.yaml
	- Add Tempo to additionalDataSources for automatic provisioning
	Files added:
	- tempo/values.yaml: Tempo Helm chart configuration
	- tempo/persistent-volumes.yaml: Storage configuration (10Gi PV/PVC)
	- tempo/datasource-configmap.yaml: Grafana datasource with correlations
	- tempo/Justfile: Installation automation
	- tempo/README.md: Documentation
	- tracing-demo/docker/frontend/: Python Flask frontend with OTel
	- tracing-demo/docker/middleware/: Python Flask middleware with OTel
	- tracing-demo/docker/backend/: Python Flask backend with OTel
	- tracing-demo/helm-chart/: Kubernetes deployments, services, ingress
	- tracing-demo/docker-image-Justfile: Docker build/push automation
	- tracing-demo/Justfile: Helm deployment automation
	- tracing-demo/README.md: Documentation
	🤖 Generated with [Claude Code](https://claude.com/claude-code)
	Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-28	Add comprehensive ZFS monitoring for FreeBSD servers	Paul Buetow
	Implemented complete ZFS monitoring solution including ARC cache statistics, pool health/capacity metrics, dataset usage, and I/O throughput monitoring.
	Changes:
	- Add ZFS recording rules (9 calculated metrics for ARC hit rates, memory usage, etc.)
	- Add comprehensive Grafana dashboard with 19 panels across 5 rows:
	  * Pool Overview: capacity, health, size, free space, usage trends
	  * I/O Throughput: read/write operations and bytes per second
	  * Dataset Statistics: table showing all datasets with usage details
	  * ARC Cache Statistics: hit rates, size, memory usage
	  * ARC Breakdown: data vs metadata, MRU vs MFU with pie charts
	- Update Justfile to deploy ZFS recording rules
	- Add textfile collector script on FreeBSD servers (f0, f1, f2) for pool/dataset metrics
	Metrics collected:
	- Pool: size, allocated, free, capacity %, health status
	- I/O: read/write operations and throughput (via zpool iostat)
	- Dataset: used, available, referenced space per filesystem
	- ARC: hit rate, size, memory usage, data/metadata breakdown
	Fixes:
	- Pool health panel properly displays ONLINE/DEGRADED/FAULTED status
	- All stat panels have correct options configuration
	🤖 Generated with [Claude Code](https://claude.com/claude-code)
	Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
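As a rough illustration of the kind of output a textfile collector script produces, the sketch below renders pool metrics in the Prometheus text exposition format and computes an ARC hit ratio. It is not the script from this commit: the metric names, the numeric health encoding, and the sample values are all assumptions made for the example.

```python
# Hypothetical numeric encoding so a health state can be stored as a gauge.
HEALTH_CODES = {"ONLINE": 0, "DEGRADED": 1, "FAULTED": 2}


def arc_hit_ratio(hits: int, misses: int) -> float:
    """Fraction of ARC lookups served from cache; 0.0 when there were none."""
    total = hits + misses
    return hits / total if total else 0.0


def render_pool_metrics(pool: str, size: int, allocated: int,
                        free: int, health: str) -> str:
    """Render one pool's metrics as Prometheus text exposition lines."""
    capacity = 100.0 * allocated / size if size else 0.0
    label = f'{{pool="{pool}"}}'
    lines = [
        f"zfs_pool_size_bytes{label} {size}",
        f"zfs_pool_allocated_bytes{label} {allocated}",
        f"zfs_pool_free_bytes{label} {free}",
        f"zfs_pool_capacity_percent{label} {capacity}",
        # Unknown states map to -1 rather than raising.
        f"zfs_pool_health{label} {HEALTH_CODES.get(health, -1)}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Sample values are invented; a real script would parse zpool output.
    print(render_pool_metrics("zroot", 1000, 250, 750, "ONLINE"), end="")
    print(f"arc hit ratio: {arc_hit_ratio(90, 10)}")
```

A script along these lines would typically write its output to a `.prom` file in the directory that node_exporter's textfile collector is configured to read.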
2025-12-26	add webdav	Paul Buetow

2025-12-26	move	Paul Buetow

2025-12-26	fix	Paul Buetow

2025-12-26	jo	Paul Buetow

2025-12-26	fix	Paul Buetow

2025-12-26	delete filerise	Paul Buetow

2025-12-25	observability: enable etcd metrics scraping	Paul Buetow
	- Enable etcd metrics on port 2381 - Add blog post draft documenting the changes
2025-12-25	revert: undo all observability changes from today	Paul Buetow
	Reverts hostname relabeling and etcd metrics changes
2025-12-25	observability: node-exporter hostnames + etcd metrics	Paul Buetow
	- Add relabel_configs to show hostnames for node-exporter targets - Enable etcd metrics scraping on port 2381 - Update blog post draft
2025-12-25	observability: display hostnames instead of IPs, enable etcd metrics	Paul Buetow
	- Add relabel_configs to additional-scrape-configs.yaml for FreeBSD/OpenBSD hosts - Add node name relabeling for node-exporter on k3s nodes - Enable etcd metrics scraping with hostname relabeling - Add DRAFT blog post documenting the changes Amp-Thread-ID: https://ampcode.com/threads/T-019b571c-4afc-7789-becf-bc8a3c4e1e1f Co-authored-by: Amp <amp@ampcode.com>
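To make the hostname relabeling idea concrete, here is a small simulation of a Prometheus `replace` relabel rule in plain Python. It follows the documented semantics (source labels joined with `;` by default, a fully anchored regex, `$1`-style replacement), but the rule itself and the target address `f0.example.lan:9100` are hypothetical and not taken from additional-scrape-configs.yaml.

```python
import re


def apply_replace_rule(labels: dict, rule: dict) -> dict:
    """Apply one 'replace' relabel rule and return a new label set."""
    separator = rule.get("separator", ";")
    source = separator.join(labels.get(name, "") for name in rule["source_labels"])
    # Prometheus anchors relabel regexes, so the whole value must match.
    match = re.fullmatch(rule["regex"], source)
    if match is None:
        return dict(labels)  # no match: labels stay unchanged
    # Translate $1-style references into Python's \g<1> form.
    template = re.sub(r"\$(\d+)", r"\\g<\1>", rule["replacement"])
    updated = dict(labels)
    updated[rule["target_label"]] = match.expand(template)
    return updated


# Hypothetical rule: copy the host part of the scrape address into 'instance'.
STRIP_PORT_RULE = {
    "source_labels": ["__address__"],
    "regex": r"([^:]+):\d+",
    "target_label": "instance",
    "replacement": "$1",
}

if __name__ == "__main__":
    result = apply_replace_rule({"__address__": "f0.example.lan:9100"}, STRIP_PORT_RULE)
    print(result["instance"])
```

With a rule like this, dashboards would show the host part of the target address instead of the full address with its port.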
2025-12-25	use hosts not IPs	Paul Buetow

2025-12-07	add openbsd routing rules	Paul Buetow

2025-12-06	add openbsd node exporters	Paul Buetow

2025-12-06	add more	Paul Buetow

2025-12-06	more on this	Paul Buetow

2025-12-05	Fix Loki to use NFS persistent volume	Paul Buetow

2025-12-05	Add Grafana Loki with Alloy for log collection	Paul Buetow

2025-12-05	Fix Loki URL in README	Paul Buetow

2025-12-05	Add Grafana Loki deployment	Paul Buetow

2025-12-05	Add keybr.com typing tutor deployment	Paul Buetow
	Amp-Thread-ID: https://ampcode.com/threads/T-ccf9cd44-5adf-4633-9f3d-d822f733af4d Co-authored-by: Amp <amp@ampcode.com>
2025-12-05	add keybr.com	Paul Buetow

2025-12-03	add html	Paul Buetow

2025-12-03	initial f3s fallback	Paul Buetow

2025-11-22	add filebrowser	Paul Buetow

2025-11-21	works now	Paul Buetow

2025-11-21	initial filerise	Paul Buetow

2025-11-07	Update	Paul Buetow

2025-11-04	remove fotos.buetow.org	Paul Buetow

2025-11-02	use www.* as alt name in certs	Paul Buetow

2025-10-27	use new gogios	Paul Buetow

2025-10-27	change to directory	Paul Buetow

2025-10-26	odd bug workaround	Paul Buetow

2025-10-24	fix	Paul Buetow

2025-10-24	add persistent volumes to prometheus/grafana	Paul Buetow

2025-10-22	add grafana ingress	Paul Buetow

2025-10-22	add prometheus	Paul Buetow

2025-10-18	added koreader-sync-server	Paul Buetow

2025-10-08	also redirect stderr	Paul Buetow

2025-09-27	more in this	Paul Buetow

2025-09-24	Update	Paul Buetow

2025-09-23	new fooodds	Paul Buetow

2025-09-14	fix	Paul Buetow

2025-09-14	add	Paul Buetow

2025-09-13	rename to ex	Paul Buetow

2025-09-13	add xxx mail domain	Paul Buetow