| field | value | date |
|---|---|---|
| author | Paul Buetow <paul@buetow.org> | 2026-02-14 13:54:54 +0200 |
| committer | Paul Buetow <paul@buetow.org> | 2026-02-14 13:54:54 +0200 |
| commit | 3a6e01c1abd4a68810f1d85c9aa75293af47f579 | |
| tree | 2e3c066392cf2a292e89c90f259d039ce0afcb9b /docs/reference | |
| parent | f3ea9a7a1f466b6109271c76eb58189d2a799998 | |
docs: restructure documentation and move scripts to scripts/
- Add docs/ hierarchy: guides, backends, operations, reference, design
- Slim root README; add documentation index and links to docs/
- Add missing docs: csv-format-flexibility, dns-resolution, dtail-metrics-example, magefile
- Document Prometheus/VictoriaMetrics and ClickHouse backends
- Move all helper shell scripts to scripts/; update Magefile and doc references
- Add ASCII diagrams for watch mode (CSV watcher), auto mode, and ingestion paths
- Add .gitignore
Co-authored-by: Cursor <cursoragent@cursor.com>
Diffstat (limited to 'docs/reference')
| mode | file | lines added |
|---|---|---|
| -rw-r--r-- | docs/reference/cli.md | 57 |
| -rw-r--r-- | docs/reference/example-queries.md | 66 |
| -rw-r--r-- | docs/reference/grafana-dashboard.md | 50 |
| -rw-r--r-- | docs/reference/magefile.md | 67 |
| -rw-r--r-- | docs/reference/test-metrics.md | 35 |
5 files changed, 275 insertions, 0 deletions
diff --git a/docs/reference/cli.md b/docs/reference/cli.md
new file mode 100644
index 0000000..83d02b0
--- /dev/null
+++ b/docs/reference/cli.md
@@ -0,0 +1,57 @@
+# CLI Reference
+
+All flags and defaults. Modes: `realtime`, `historic`, `backfill`, `auto`, `watch`.
+
+## Global
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `-version` | — | Print version and exit |
+| `-mode` | `realtime` | Mode: realtime, historic, backfill, auto, or watch |
+
+## Realtime
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `-pushgateway` | `http://localhost:9091` | Pushgateway URL |
+| `-job` | `example_metrics_pusher` | Job name for metrics |
+| `-continuous` | `false` | Push every 15s |
+
+## Historic
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `-prometheus` | `http://localhost:9090/api/v1/write` | Prometheus Remote Write URL |
+| `-hours-ago` | `24` | Hours in the past (single datapoint) |
+
+## Backfill
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `-prometheus` | `http://localhost:9090/api/v1/write` | Prometheus Remote Write URL |
+| `-start-hours` | `48` | Start time in hours ago |
+| `-end-hours` | `0` | End time in hours ago (0 = now) |
+| `-interval` | `1` | Interval between points in hours |
+
+## Auto
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `-file` | — | Input file path (required) |
+| `-format` | `csv` | Input format: csv or json |
+| `-pushgateway` | `http://localhost:9091` | Pushgateway URL |
+| `-prometheus` | `http://localhost:9090/api/v1/write` | Prometheus Remote Write URL |
+
+## Watch
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `-file` | — | CSV file(s) to watch (comma-separated for multiple); required |
+| `-metric-name` | — | Base metric name (e.g. myapp, food); required |
+| `-prometheus` | `http://localhost:9090/api/v1/write` | Prometheus Remote Write URL (set to empty to disable) |
+| `-clickhouse` | — | ClickHouse HTTP URL (e.g. http://localhost:8123) |
+| `-clickhouse-table` | `epimetheus_metrics` | ClickHouse table name |
+| `-job` | `example_metrics_pusher` | Job name for metrics |
+| `-resolve-ip-labels` | (ip only) | Comma-separated additional IP labels to resolve via DNS |
+
+Watch mode requires at least one of `-prometheus` or `-clickhouse`. Use `-prometheus=` to ingest only to ClickHouse.
diff --git a/docs/reference/example-queries.md b/docs/reference/example-queries.md
new file mode 100644
index 0000000..e78aaec
--- /dev/null
+++ b/docs/reference/example-queries.md
@@ -0,0 +1,66 @@
+# Example Queries
+
+PromQL and curl examples for Epimetheus test metrics. Use your Prometheus (or Prometheus-compatible) query URL; after port-forward, that is often http://localhost:9090.
+
+## Basic PromQL
+
+```promql
+# Total requests
+epimetheus_test_requests_total
+
+# Request rate (last 5 minutes)
+rate(epimetheus_test_requests_total[5m])
+
+# Active connections
+epimetheus_test_active_connections
+
+# Temperature
+epimetheus_test_temperature_celsius
+```
+
+## Histogram
+
+```promql
+# 95th percentile request duration
+histogram_quantile(0.95, rate(epimetheus_test_request_duration_seconds_bucket[5m]))
+
+# Median (50th percentile)
+histogram_quantile(0.50, rate(epimetheus_test_request_duration_seconds_bucket[5m]))
+
+# Average request duration
+rate(epimetheus_test_request_duration_seconds_sum[5m]) /
+rate(epimetheus_test_request_duration_seconds_count[5m])
+```
+
+## Labeled counter
+
+```promql
+# Failed jobs by type
+epimetheus_test_jobs_processed_total{status="failed"}
+
+# Job success rate
+rate(epimetheus_test_jobs_processed_total{status="success"}[5m]) /
+rate(epimetheus_test_jobs_processed_total[5m])
+
+# Total jobs by type
+sum by (job_type) (epimetheus_test_jobs_processed_total)
+```
+
+## Curl (HTTP API)
+
+```bash
+# Port-forward if needed
+kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 &
+
+# Total requests
+curl -s "http://localhost:9090/api/v1/query?query=epimetheus_test_requests_total" | jq .
+
+# Temperature
+curl -s "http://localhost:9090/api/v1/query?query=epimetheus_test_temperature_celsius" | jq .
+
+# Request rate
+curl -s "http://localhost:9090/api/v1/query?query=rate(epimetheus_test_requests_total[5m])" | jq .
+
+# Histogram p95
+curl -s "http://localhost:9090/api/v1/query?query=histogram_quantile(0.95,rate(epimetheus_test_request_duration_seconds_bucket[5m]))" | jq .
+```
diff --git a/docs/reference/grafana-dashboard.md b/docs/reference/grafana-dashboard.md
new file mode 100644
index 0000000..b7f2030
--- /dev/null
+++ b/docs/reference/grafana-dashboard.md
@@ -0,0 +1,50 @@
+# Grafana Dashboard
+
+A dashboard is provided that shows all Epimetheus test metrics.
+
+## Panels
+
+1. Request Rate (line graph)
+2. Total Requests (stat)
+3. Active Connections (gauge with thresholds)
+4. Temperature (gauge with thresholds)
+5. Request Duration Histogram (p50, p90, p99)
+6. Average Request Duration (stat)
+7. Jobs Processed by Type (bar gauge)
+8. Jobs Status Breakdown (table)
+
+Auto-refresh: 10 seconds. Time range: last 15 minutes (configurable). Optimized for dark theme.
+
+## Deployment
+
+### Option 1: Kubernetes ConfigMap (recommended)
+
+If you have a manifest that defines the dashboard as a ConfigMap with Grafana’s discovery label:
+
+```bash
+kubectl apply -f ../prometheus/epimetheus-dashboard.yaml
+```
+
+Grafana will pick it up automatically.
+
+### Option 2: Manual import
+
+1. Port-forward Grafana: `kubectl port-forward -n monitoring svc/prometheus-grafana 3000:80`
+2. Open http://localhost:3000
+3. Dashboards → Import → Upload `grafana-dashboard.json`
+
+### Option 3: Deploy script
+
+```bash
+./scripts/deploy-dashboard.sh
+# Or with credentials:
+GRAFANA_URL="http://localhost:3000" GRAFANA_USER="admin" GRAFANA_PASSWORD="yourpassword" ./scripts/deploy-dashboard.sh
+```
+
+## Datasource
+
+Use Prometheus (or a Prometheus-compatible backend such as VictoriaMetrics) as the datasource. Point it at the same instance Epimetheus writes to (e.g. http://localhost:9090 after port-forward).
+
+## Panel guidelines
+
+When creating or updating Grafana panels, follow the project’s [AGENT.md](../../AGENT.md) (Grafana dashboard guidelines): e.g. sort time series by last value descending, use `sort_desc()` in bar gauges, set table sort options as specified.
diff --git a/docs/reference/magefile.md b/docs/reference/magefile.md
new file mode 100644
index 0000000..0ce0b0d
--- /dev/null
+++ b/docs/reference/magefile.md
@@ -0,0 +1,67 @@
+# Magefile Reference
+
+Epimetheus uses [Mage](https://magefile.org/) for build, test, and run targets. The build logic lives in `Magefile.go` at the repo root.
+
+## Prerequisites
+
+```bash
+go install github.com/magefile/mage@latest
+```
+
+## Default Target
+
+Running `mage` with no arguments runs **Build**.
+
+## Targets
+
+| Target | Description | Example |
+|--------|-------------|---------|
+| `build` | Compile the epimetheus binary | `mage build` |
+| `install` | Install binary to `$GOPATH/bin` | `mage install` |
+| `run` | Build and run in realtime mode (continuous) | `mage run` |
+| `runHistoric` | Build and run historic mode (24h ago) | `mage runHistoric` |
+| `runAuto <file>` | Build and run auto mode with a file | `mage runAuto test-all-ages.csv` |
+| `runWatchClickHouse [file]` | Build and run watch mode with ClickHouse only | `mage runWatchClickHouse` or `mage runWatchClickHouse my.csv` |
+| `test` | Run all tests | `mage test` |
+| `testCoverage` | Run tests and open coverage report | `mage testCoverage` |
+| `testRace` | Run tests with race detector | `mage testRace` |
+| `benchmark` | Run Go benchmarks | `mage benchmark` |
+| `lint` | Run golangci-lint | `mage lint` |
+| `fmt` | Format all Go code | `mage fmt` |
+| `vet` | Run go vet | `mage vet` |
+| `tidy` | Run go mod tidy | `mage tidy` |
+| `clean` | Remove binary and coverage artifacts | `mage clean` |
+| `generate` | Run go generate | `mage generate` |
+| `version` | Build and print version | `mage version` |
+| `all` | Run fmt, vet, test, and build | `mage all` |
+| `ci` | Tidy, vet, test, and build (CI pipeline) | `mage ci` |
+| `dev` | Build, port-forward Pushgateway, run realtime mode | `mage dev` |
+| `generateTestData` | Generate test data files | `mage generateTestData` |
+| `backfill` | Run backfill for last 48 hours | `mage backfill` |
+| `benchmark100MB` | Run 100MB benchmark script | `mage benchmark100MB` |
+| `benchmark1GB` | Run 1GB benchmark script | `mage benchmark1GB` |
+| `cleanupBenchmarkData` | Clean benchmark data from Prometheus | `mage cleanupBenchmarkData` |
+| `cleanupBenchmarkMetrics` | Clean benchmark metric files | `mage cleanupBenchmarkMetrics` |
+| `deployDashboard` | Deploy Grafana dashboard via script | `mage deployDashboard` |
+| `help` | Print list of targets | `mage help` |
+
+## Examples
+
+```bash
+# Build and run realtime mode
+mage run
+
+# Run tests with coverage
+mage testCoverage
+
+# Run watch mode with ClickHouse (default test file)
+mage runWatchClickHouse
+
+# Run watch mode with your CSV
+mage runWatchClickHouse /path/to/data.csv
+
+# Full CI checks
+mage ci
+```
+
+See [Quick Start](../guides/quickstart.md) and [CLI Reference](cli.md) for more on running Epimetheus.
diff --git a/docs/reference/test-metrics.md b/docs/reference/test-metrics.md
new file mode 100644
index 0000000..a1af41e
--- /dev/null
+++ b/docs/reference/test-metrics.md
@@ -0,0 +1,35 @@
+# Test Metrics
+
+Generated metrics use the `epimetheus_test_` prefix so they are easy to identify as test data.
+
+## Counter: `epimetheus_test_requests_total`
+
+- **Type:** Counter (monotonically increasing)
+- **Description:** Total number of requests processed
+- **Use case:** Total events, requests, errors
+
+## Gauge: `epimetheus_test_active_connections`
+
+- **Type:** Gauge (can increase or decrease)
+- **Description:** Current number of active connections (0–100)
+- **Use case:** Current state, capacity
+
+## Gauge: `epimetheus_test_temperature_celsius`
+
+- **Type:** Gauge
+- **Description:** Current temperature in Celsius (0–50°C)
+- **Use case:** Environmental monitoring
+
+## Histogram: `epimetheus_test_request_duration_seconds`
+
+- **Type:** Histogram (distribution)
+- **Description:** Request duration distribution
+- **Buckets:** 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10 seconds
+- **Use case:** Latency, SLO tracking
+
+## Labeled counter: `epimetheus_test_jobs_processed_total`
+
+- **Type:** Counter with labels
+- **Description:** Jobs processed by type and status
+- **Labels:** `job_type` (email, report, backup), `status` (success, failed)
+- **Use case:** Categorized counting, multi-dimensional metrics
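
Reviewer note on the curl examples in `docs/reference/example-queries.md`: several of them place raw PromQL containing `(`, `)`, `[` and `]` directly in the query string. curl applies URL globbing to `[]` unless `-g`/`--globoff` is given, so those commands may need the expression percent-encoded. Below is a minimal Go sketch of building the same instant-query URL with the expression encoded. The helper name `buildQueryURL` is hypothetical and not part of Epimetheus; the `/api/v1/query` path and the metric name are taken from the docs in this commit.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildQueryURL is a hypothetical helper (not part of Epimetheus). It returns
// an instant-query URL for a Prometheus-compatible HTTP API with the PromQL
// expression percent-encoded. Any path already present in base is replaced.
func buildQueryURL(base, promQL string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/api/v1/query"

	q := url.Values{}
	q.Set("query", promQL)
	u.RawQuery = q.Encode() // escapes ( ) [ ] and other reserved characters

	return u.String(), nil
}

func main() {
	u, err := buildQueryURL("http://localhost:9090", "rate(epimetheus_test_requests_total[5m])")
	if err != nil {
		panic(err)
	}
	// Prints:
	// http://localhost:9090/api/v1/query?query=rate%28epimetheus_test_requests_total%5B5m%5D%29
	fmt.Println(u)
}
```

The resulting URL can be passed to curl unchanged. Staying in the shell, `curl -G --data-urlencode 'query=...'` achieves the same encoding.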
