Diffstat (limited to 'f3s/prometheus-pusher/README.md')
| -rw-r--r-- | f3s/prometheus-pusher/README.md | 602 |
1 files changed, 504 insertions, 98 deletions
diff --git a/f3s/prometheus-pusher/README.md b/f3s/prometheus-pusher/README.md
index d107299..415cdde 100644
--- a/f3s/prometheus-pusher/README.md
+++ b/f3s/prometheus-pusher/README.md
@@ -1,151 +1,557 @@
 # Prometheus Pusher
-A standalone Go binary that pushes metrics to Prometheus via Pushgateway.
+A versatile Go tool for pushing metrics to Prometheus with support for both realtime and historic data ingestion.
+
+## Overview
+
+**prometheus-pusher** is a standalone binary that:
+- **Generates** realistic example metrics simulating production applications
+- **Pushes** metrics via Pushgateway (realtime) or Remote Write API (historic)
+- **Automatically detects** timestamp age and chooses the optimal ingestion method
+- **Supports** multiple data formats (CSV, JSON) and all Prometheus metric types
+- **Provides** Grafana dashboard for visualizing test metrics
 ## Quick Start
+### 1. Deploy Pushgateway (one-time setup)
+
 ```bash
-# 1. Deploy Pushgateway (one-time - see /home/paul/git/conf/f3s/pushgateway/)
 cd /home/paul/git/conf/f3s/pushgateway/helm-chart
-helm upgrade --install pushgateway . -n monitoring
+helm upgrade --install pushgateway . -n monitoring --create-namespace
+```
-# 2. Run the binary
+### 2. Run in Realtime Mode
+
+```bash
+# Port-forward Pushgateway
+kubectl port-forward -n monitoring svc/pushgateway 9091:9091 &
+
+# Push test metrics continuously
 cd /home/paul/git/conf/f3s/prometheus-pusher
-./run.sh
+./prometheus-pusher -mode=realtime -continuous
 ```
-That's it! The binary will push metrics every 15 seconds. Press Ctrl+C to stop.
+The binary pushes metrics every 15 seconds. Press Ctrl+C to stop.
-## Overview
+### 3. View Metrics
-This project consists of:
-1. **Pushgateway** - A Kubernetes service that receives pushed metrics
-2. **prometheus-pusher** - A standalone Go binary that generates and pushes example metrics
+```bash
+# Pushgateway UI
+open http://localhost:9091
-## Metric Types Demonstrated
+# Prometheus UI
+kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 &
+open http://localhost:9090
+```
-The application pushes the following types of metrics:
+## Operating Modes
-### Counter (`app_requests_total`)
-- Monotonically increasing value
-- Example: Total number of requests processed
-- Use case: Counting events, total requests, errors, etc.
+### Realtime Mode (Default)
+Push current metrics to Pushgateway with "now" timestamp.
-### Gauge (`app_active_connections`, `app_temperature_celsius`)
-- Value that can increase or decrease
-- Examples: Active connections, temperature, memory usage
-- Use case: Current state measurements
+```bash
+./prometheus-pusher -mode=realtime -continuous
+```
-### Histogram (`app_request_duration_seconds`)
-- Samples observations and counts them in configurable buckets
-- Example: Request duration distribution
-- Use case: Latency measurements, response times
+**Options:**
+- `-pushgateway` - Pushgateway URL (default: http://localhost:9091)
+- `-job` - Job name (default: example_metrics_pusher)
+- `-continuous` - Keep pushing every 15 seconds
-### Counter with Labels (`app_jobs_processed_total`)
-- Counter with dimensional labels
-- Labels: `job_type` (email, report, backup), `status` (success, failed)
-- Use case: Categorized counting
+### ⏰ Historic Mode
+Push a single datapoint from the past using Remote Write API.
-## Project Structure
+```bash
+# Port-forward Prometheus
+kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 &
+# Push data from 24 hours ago
+./prometheus-pusher -mode=historic -hours-ago=24
 ```
-prometheus-pusher/
-├── main.go              # Go source code
-├── go.mod / go.sum      # Go dependencies
-├── prometheus-pusher    # Compiled binary (standalone executable)
-├── run.sh               # Helper script to run the binary
-├── example-metrics.txt  # Example of metrics format
-├── USAGE.md             # Detailed usage guide
-└── README.md            # This file
-Note: Pushgateway Helm chart is located at /home/paul/git/conf/f3s/pushgateway/
+**Options:**
+- `-prometheus` - Prometheus URL (default: http://localhost:9090/api/v1/write)
+- `-hours-ago` - Hours in the past (default: 24)
+
+### 📦 Backfill Mode
+Import a range of historic data points.
+
+```bash
+# Backfill last 48 hours with 1-hour intervals
+./prometheus-pusher -mode=backfill -start-hours=48 -end-hours=0 -interval=1
+
+# Backfill last week with 6-hour intervals
+./prometheus-pusher -mode=backfill -start-hours=168 -end-hours=0 -interval=6
+```
+
+**Options:**
+- `-start-hours` - Start time in hours ago
+- `-end-hours` - End time in hours ago (0 = now)
+- `-interval` - Interval between points in hours
+
+### 🤖 Auto Mode (Recommended!)
+Automatically detect timestamp age and route to the correct ingestion method.
+
+```bash
+# Generate test data
+./generate-test-data.sh
+
+# Import mixed current and historic data
+./prometheus-pusher -mode=auto -file=test-all-ages.csv
 ```
-## What It Does
+**Detection Logic:**
+- Data < 5 minutes old → Pushgateway (realtime)
+- Data ≥ 5 minutes old → Remote Write (historic)
+
+**Options:**
+- `-file` - Input file path
+- `-format` - Data format: csv or json (default: csv)
+- `-pushgateway` - Pushgateway URL
+- `-prometheus` - Prometheus Remote Write URL
+
+## Data Formats
+
+### CSV Format
+
+```csv
+# Format: metric_name,labels,value,timestamp_ms
+# Labels: key1=value1;key2=value2
+prometheus_pusher_test_requests_total,instance=web1;env=prod,100,1767125148000
+prometheus_pusher_test_temperature_celsius,instance=web2,22.5,1767038748000
+
+# Timestamp is optional (uses "now" if omitted)
+prometheus_pusher_test_active_connections,instance=web3,42,
+```
+
+### JSON Format
+
+```json
+[
+  {
+    "metric": "prometheus_pusher_test_requests_total",
+    "labels": {"instance": "web1", "env": "prod"},
+    "value": 100,
+    "timestamp_ms": 1767125148000
+  },
+  {
+    "metric": "prometheus_pusher_test_temperature_celsius",
+    "labels": {"instance": "web2"},
+    "value": 22.5,
+    "timestamp_ms": 1767038748000
+  }
+]
+```
+
+## Test Metrics
+
+All generated metrics use the `prometheus_pusher_test_` prefix to clearly identify them as test data.
+
+### Counter: `prometheus_pusher_test_requests_total`
+- **Type:** Counter (monotonically increasing)
+- **Description:** Total number of requests processed
+- **Use case:** Counting total events, requests, errors
+
+### Gauge: `prometheus_pusher_test_active_connections`
+- **Type:** Gauge (can increase or decrease)
+- **Description:** Current number of active connections (0-100)
+- **Use case:** Current state measurements, capacity
-The `prometheus-pusher` binary:
-- **Generates** realistic example metrics simulating a production application
-- **Pushes** metrics to Pushgateway every 15 seconds using HTTP POST
-- **Demonstrates** all major Prometheus metric types with practical examples
+### Gauge: `prometheus_pusher_test_temperature_celsius`
+- **Type:** Gauge
+- **Description:** Current temperature in Celsius (0-50°C)
+- **Use case:** Environmental monitoring
-The metrics flow: `Go Binary → Pushgateway → Prometheus → Grafana`
+### Histogram: `prometheus_pusher_test_request_duration_seconds`
+- **Type:** Histogram (distribution)
+- **Description:** Request duration distribution
+- **Buckets:** 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10 seconds
+- **Use case:** Latency measurements, SLO tracking
-## Example Metrics Format
+### Labeled Counter: `prometheus_pusher_test_jobs_processed_total`
+- **Type:** Counter with labels
+- **Description:** Jobs processed by type and status
+- **Labels:**
+  - `job_type`: email, report, backup
+  - `status`: success, failed
+- **Use case:** Categorized counting, multi-dimensional metrics
-The pusher sends metrics in Prometheus format to the Pushgateway. Here's what the data looks like:
+## Grafana Dashboard
+A comprehensive dashboard is available showcasing all test metrics.
+
+### Dashboard Features
+
+- **8 Panels:**
+  1. Request Rate (line graph)
+  2. Total Requests (stat panel)
+  3. Active Connections (gauge with thresholds)
+  4. Temperature (gauge with thresholds)
+  5. Request Duration Histogram (p50, p90, p99)
+  6. Average Request Duration (stat)
+  7. Jobs Processed by Type (bar gauge)
+  8. Jobs Status Breakdown (table)
+
+- **Auto-refresh:** Every 10 seconds
+- **Time range:** Last 15 minutes (customizable)
+- **Dark theme optimized**
+
+### Deploy Dashboard
+
+#### Option 1: Helm/Kubernetes ConfigMap (Recommended)
+
+```bash
+# Deploy via Kubernetes ConfigMap
+kubectl apply -f ../prometheus/prometheus-pusher-dashboard.yaml
 ```
-# HELP app_requests_total Total number of requests processed
-# TYPE app_requests_total counter
-app_requests_total{instance="example-app",job="example_metrics_pusher"} 42
-# HELP app_active_connections Number of currently active connections
-# TYPE app_active_connections gauge
-app_active_connections{instance="example-app",job="example_metrics_pusher"} 67
+The dashboard will be automatically discovered by Grafana.
-# HELP app_temperature_celsius Current temperature in Celsius
-# TYPE app_temperature_celsius gauge
-app_temperature_celsius{instance="example-app",job="example_metrics_pusher"} 23.5
+#### Option 2: Manual Import
-# HELP app_request_duration_seconds Histogram of request duration in seconds
-# TYPE app_request_duration_seconds histogram
-app_request_duration_seconds_bucket{instance="example-app",job="example_metrics_pusher",le="0.005"} 2
-app_request_duration_seconds_bucket{instance="example-app",job="example_metrics_pusher",le="0.01"} 3
-app_request_duration_seconds_bucket{instance="example-app",job="example_metrics_pusher",le="+Inf"} 10
-app_request_duration_seconds_sum{instance="example-app",job="example_metrics_pusher"} 8.5
-app_request_duration_seconds_count{instance="example-app",job="example_metrics_pusher"} 10
+```bash
+# Port-forward Grafana
+kubectl port-forward -n monitoring svc/prometheus-grafana 3000:80
-# HELP app_jobs_processed_total Total number of jobs processed by type
-# TYPE app_jobs_processed_total counter
-app_jobs_processed_total{instance="example-app",job="example_metrics_pusher",job_type="email",status="success"} 15
-app_jobs_processed_total{instance="example-app",job="example_metrics_pusher",job_type="email",status="failed"} 2
-app_jobs_processed_total{instance="example-app",job="example_metrics_pusher",job_type="report",status="success"} 8
-app_jobs_processed_total{instance="example-app",job="example_metrics_pusher",job_type="backup",status="success"} 12
+# Open Grafana
+open http://localhost:3000
+
+# Go to Dashboards → Import → Upload grafana-dashboard.json
 ```
-## Querying Metrics in Prometheus
+#### Option 3: Automated Script
-Once configured, you can query these metrics in Prometheus:
+```bash
+# Deploy via API
+./deploy-dashboard.sh
+
+# Or with custom credentials
+GRAFANA_URL="http://localhost:3000" \
+GRAFANA_USER="admin" \
+GRAFANA_PASSWORD="yourpassword" \
+./deploy-dashboard.sh
+```
+
+## Example Queries
+
+### Basic Queries
 ```promql
-# View request rate
-rate(app_requests_total[5m])
+# View total requests
+prometheus_pusher_test_requests_total
+
+# View request rate over last 5 minutes
+rate(prometheus_pusher_test_requests_total[5m])
 # View current active connections
-app_active_connections
+prometheus_pusher_test_active_connections
+
+# View current temperature
+prometheus_pusher_test_temperature_celsius
+```
+
+### Histogram Queries
+
+```promql
+# 95th percentile request duration
+histogram_quantile(0.95, rate(prometheus_pusher_test_request_duration_seconds_bucket[5m]))
+
+# 50th percentile (median)
+histogram_quantile(0.50, rate(prometheus_pusher_test_request_duration_seconds_bucket[5m]))
+
+# Average request duration
+rate(prometheus_pusher_test_request_duration_seconds_sum[5m]) /
+rate(prometheus_pusher_test_request_duration_seconds_count[5m])
+```
+
+### Labeled Counter Queries
+
+```promql
+# Failed jobs by type
+prometheus_pusher_test_jobs_processed_total{status="failed"}
+
+# Job success rate
+rate(prometheus_pusher_test_jobs_processed_total{status="success"}[5m]) /
+rate(prometheus_pusher_test_jobs_processed_total[5m])
+
+# Total jobs by type
+sum by (job_type) (prometheus_pusher_test_jobs_processed_total)
+```
+
+### Curl Examples
+
+```bash
+# Port-forward Prometheus
+kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 &
+
+# Query total requests
+curl -s "http://localhost:9090/api/v1/query?query=prometheus_pusher_test_requests_total" | jq .
+
+# Query temperature
+curl -s "http://localhost:9090/api/v1/query?query=prometheus_pusher_test_temperature_celsius" | jq .
+
+# Query request rate
+curl -s "http://localhost:9090/api/v1/query?query=rate(prometheus_pusher_test_requests_total[5m])" | jq .
+
+# Query histogram p95
+curl -s "http://localhost:9090/api/v1/query?query=histogram_quantile(0.95,rate(prometheus_pusher_test_request_duration_seconds_bucket[5m]))" | jq .
+```
+
+## Time Range Limitations
+
+### ✅ Supported Time Ranges
-# View 95th percentile request duration
-histogram_quantile(0.95, rate(app_request_duration_seconds_bucket[5m]))
+| Time Range | Status | Method |
+|------------|--------|--------|
+| Current (< 5 min) | ✅ Works | Pushgateway |
+| 1 hour old | ✅ Works | Remote Write |
+| 1 day old | ✅ Works | Remote Write |
+| 1 week old | ✅ Works | Remote Write |
+| 1 month old | ✅ Works | Remote Write |
-# View failed jobs by type
-app_jobs_processed_total{status="failed"}
+### ⚠️ Potential Issues
-# View job success rate
-rate(app_jobs_processed_total{status="success"}[5m]) / rate(app_jobs_processed_total[5m])
+- **Future timestamps:** Rejected (> 5 minutes in future)
+- **Very old data (6+ months):** May be rejected depending on Prometheus retention
+- **Years old:** Likely rejected - use `promtool tsdb create-blocks-from` instead
+- **Out-of-order samples:** Can't insert older data into existing time series (use different labels)
+
+### Prometheus Configuration
+
+Check your retention settings:
+
+```bash
+# View retention
+kubectl get prometheus -n monitoring prometheus-kube-prometheus-prometheus \
+  -o jsonpath='{.spec.retention}'
+
+# Default is typically 15 days
+```
+
+For very old data:
+- Increase retention in Prometheus config
+- Enable out-of-order ingestion (experimental)
+- Use `promtool` for direct TSDB block creation
+
+## Project Structure
+
+```
+prometheus-pusher/
+├── cmd/
+│   └── prometheus-pusher/
+│       └── main.go              # Main entry point
+├── internal/
+│   ├── config/                  # Configuration
+│   ├── metrics/                 # Metric generators
+│   ├── parser/                  # CSV/JSON parsers
+│   └── ingester/                # Pushgateway & Remote Write ingesters
+├── prometheus-pusher            # Compiled binary
+├── grafana-dashboard.json       # Grafana dashboard definition
+├── deploy-dashboard.sh          # Dashboard deployment script
+├── generate-test-data.sh        # Test data generator
+├── run.sh                       # Helper script
+└── README.md                    # This file
+```
+
+## Setup Requirements
+
+### 1. Enable Prometheus Remote Write Receiver
+
+For historic data ingestion, Prometheus needs the remote write receiver enabled:
+
+```yaml
+# In prometheus/persistence-values.yaml
+prometheus:
+  prometheusSpec:
+    enableFeatures:
+      - remote-write-receiver
+```
+
+### 2. Update Prometheus Scrape Config
+
+Ensure Pushgateway is in scrape targets:
+
+```yaml
+# additional-scrape-configs.yaml
+- job_name: 'pushgateway'
+  honor_labels: true
+  static_configs:
+    - targets:
+      - 'pushgateway.monitoring.svc.cluster.local:9091'
+```
+
+Apply the configuration:
+
+```bash
+kubectl create secret generic additional-scrape-configs \
+  --from-file=/home/paul/git/conf/f3s/prometheus/additional-scrape-configs.yaml \
+  --dry-run=client -o yaml -n monitoring | kubectl apply -f -
+```
+
+## Building from Source
+
+```bash
+# Build binary
+go build -o prometheus-pusher cmd/prometheus-pusher/main.go
+
+# Run tests
+go test ./... -v
+
+# Check test coverage
+go test ./... -cover
+```
+
+## Troubleshooting
+
+### Binary can't connect to Pushgateway
+
+```bash
+# Check port-forward is running
+ps aux | grep "port-forward.*9091"
+
+# Restart port-forward
+kubectl port-forward -n monitoring svc/pushgateway 9091:9091
+```
+
+### Metrics not appearing in Prometheus
+
+```bash
+# Check Pushgateway has metrics
+curl http://localhost:9091/metrics | grep "prometheus_pusher_test"
+
+# Check Prometheus scrape targets
+# Open http://localhost:9090/targets - look for "pushgateway" job
+
+# Check Prometheus logs
+kubectl logs -n monitoring -l app.kubernetes.io/name=prometheus
+```
+
+### "Remote write receiver not enabled" error
+
+```bash
+# Verify feature is enabled
+kubectl logs -n monitoring prometheus-prometheus-kube-prometheus-prometheus-0 | grep "remote-write-receiver"
+
+# Should see: msg="Experimental features enabled" features=[remote-write-receiver]
 ```
-## Configuration
+### "Out of order sample" error
-The pusher is configured to:
-- Push metrics every 15 seconds
-- Use job name: `example_metrics_pusher`
-- Use instance label: `example-app`
-- Connect to Pushgateway at: `http://pushgateway.monitoring.svc.cluster.local:9091`
+This occurs when trying to insert data older than existing data for the same time series.
-## How It Works
+**Solutions:**
+- Use different job labels for historic data (e.g., `job="historic_data"`)
+- Enable out-of-order ingestion in Prometheus (experimental)
+- Ensure backfill goes from oldest to newest
-1. The Go application generates random example metrics simulating a real application
-2. Metrics are pushed to the Pushgateway via HTTP POST
-3. Prometheus scrapes the Pushgateway periodically
-4. Metrics become available in Prometheus for querying and alerting
-5. Grafana can visualize these metrics
+### Dashboard not appearing in Grafana
+
+```bash
+# Check ConfigMap exists
+kubectl get configmap -n monitoring | grep prometheus-pusher
+
+# Check labels
+kubectl get configmap prometheus-pusher-dashboard -n monitoring -o yaml | grep "grafana_dashboard"
+
+# Restart Grafana to force reload
+kubectl rollout restart deployment/prometheus-grafana -n monitoring
+```
+
+## Architecture
+
+```
+┌─────────────────┐
+│   Go Binary     │
+│  (prometheus-   │──Push realtime──┐
+│   pusher)       │                 │
+└─────────────────┘                 ▼
+        │                  ┌──────────────────┐
+        │                  │   Pushgateway    │──Scrape──┐
+        │                  │   (Port 9091)    │          │
+        │                  └──────────────────┘          │
+        │                                                │
+        └──Push historic──────────────────┐              │
+                                          ▼              │
+                                 ┌─────────────────┐     │
+                                 │   Prometheus    │◄────┘
+                                 │   (Port 9090)   │
+                                 │ Remote Write API│
+                                 └─────────────────┘
+                                          │
+                                          │ Datasource
+                                          ▼
+                                 ┌─────────────────┐
+                                 │    Grafana      │
+                                 │   (Port 3000)   │
+                                 │   Dashboards    │
+                                 └─────────────────┘
+```
 ## Best Practices
-- Use Pushgateway for batch jobs, short-lived processes, or service-level metrics
-- For long-running applications, prefer exposing a `/metrics` endpoint for Prometheus to scrape
-- Include meaningful labels but avoid high-cardinality labels (e.g., user IDs, timestamps)
-- Use appropriate metric types:
-  - Counter for cumulative values
-  - Gauge for point-in-time values
-  - Histogram/Summary for distributions
+### When to Use Pushgateway vs. Remote Write
+
+**Use Pushgateway (realtime mode):**
+- Short-lived batch jobs
+- Service-level metrics
+- Jobs behind firewalls
+- Current/recent data (< 5 minutes old)
+
+**Use Remote Write (historic mode):**
+- Historic data import
+- Backfilling gaps
+- Data migration
+- Data older than 5 minutes
+
+**Use Auto Mode:**
+- Mixed current and historic data
+- Importing from files
+- Unknown timestamp ages
+- General-purpose ingestion
+
+### Metric Design
+
+- **Use appropriate metric types:**
+  - Counter for cumulative values (requests, errors)
+  - Gauge for point-in-time values (temperature, connections)
+  - Histogram for distributions (latency, sizes)
+
+- **Label cardinality:**
+  - Include meaningful labels
+  - Avoid high-cardinality labels (user IDs, timestamps)
+  - Keep label combinations reasonable (< 1000 per metric)
+
+- **Naming conventions:**
+  - Use descriptive names
+  - Include units in gauge names (\_celsius, \_bytes)
+  - Use \_total suffix for counters
+
+## Cleanup
+
+```bash
+# Stop port-forwards
+pkill -f "port-forward.*9091"
+pkill -f "port-forward.*9090"
+pkill -f "port-forward.*3000"
+
+# Delete test metrics from Pushgateway
+curl -X DELETE http://localhost:9091/metrics/job/example_metrics_pusher
+
+# Uninstall Pushgateway (if needed)
+helm uninstall pushgateway -n monitoring
+```
+
+## Additional Resources
+
+- [Prometheus Documentation](https://prometheus.io/docs/)
+- [Pushgateway Documentation](https://github.com/prometheus/pushgateway)
+- [Prometheus Remote Write Spec](https://prometheus.io/docs/concepts/remote_write_spec/)
+- [Grafana Documentation](https://grafana.com/docs/)
+
+## Version
+
+Current version: 0.0.0
+
+## License
+
+See LICENSE file for details.
