| author | Paul Buetow <paul@buetow.org> | 2025-12-30 23:16:33 +0200 |
|---|---|---|
| committer | Paul Buetow <paul@buetow.org> | 2025-12-30 23:16:33 +0200 |
| commit | 097767ed8eb3b7193c0737e2a99c23ff89074769 (patch) | |
| tree | c25617cccb4a0f1dbf7a07808d154bcb3347ce39 | |
| parent | 873957d4e818a6117836486d6ea7258c3f79e1d2 (diff) | |
Consolidate all documentation into single comprehensive README.md
- Merged content from 10 separate .md files into README.md
- Removed: ANSWER.md, AUTO-MODE.md, DASHBOARD.md, HISTORIC.md, LIMITATIONS.md,
QUERY_EXAMPLES.md, QUICK-START.md, SETUP-COMPLETE.md, SUMMARY.md, USAGE.md
- README.md now includes:
* Quick start guide
* All operating modes (realtime, historic, backfill, auto)
* Data formats (CSV, JSON)
* Test metrics documentation
* Grafana dashboard setup
* Example queries and curl commands
* Time range limitations
* Troubleshooting guide
* Architecture diagram
* Best practices
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
| -rw-r--r-- | f3s/prometheus-pusher/ANSWER.md | 201 |
| -rw-r--r-- | f3s/prometheus-pusher/AUTO-MODE.md | 297 |
| -rw-r--r-- | f3s/prometheus-pusher/DASHBOARD.md | 151 |
| -rw-r--r-- | f3s/prometheus-pusher/HISTORIC.md | 231 |
| -rw-r--r-- | f3s/prometheus-pusher/LIMITATIONS.md | 267 |
| -rw-r--r-- | f3s/prometheus-pusher/QUERY_EXAMPLES.md | 316 |
| -rw-r--r-- | f3s/prometheus-pusher/QUICK-START.md | 105 |
| -rw-r--r-- | f3s/prometheus-pusher/README.md | 602 |
| -rw-r--r-- | f3s/prometheus-pusher/SETUP-COMPLETE.md | 275 |
| -rw-r--r-- | f3s/prometheus-pusher/SUMMARY.md | 215 |
| -rw-r--r-- | f3s/prometheus-pusher/USAGE.md | 231 |
11 files changed, 504 insertions, 2387 deletions
diff --git a/f3s/prometheus-pusher/ANSWER.md b/f3s/prometheus-pusher/ANSWER.md deleted file mode 100644 index a6339cb..0000000 --- a/f3s/prometheus-pusher/ANSWER.md +++ /dev/null @@ -1,201 +0,0 @@ -# Can the Tool Import All These Time Ranges? - -## Question -Can you import current data, data 1 hour old, data 1 day old, data 1 week old, and data 1 month old with the tool? - -## Answer: YES! β
- -The tool can now import data from **ALL** these time ranges automatically! - -## How It Works - -### Before (Manual Mode) -You had to: -1. Calculate how old your data is -2. Choose the right mode manually -3. Specify `-hours-ago` for each time range -4. Run the tool multiple times for different ages - -### After (AUTO Mode) π€ -You just: -1. Provide data with timestamps -2. Run: `./prometheus-pusher -mode=auto -file=yourdata.csv` -3. **Done!** The tool automatically detects ages and routes correctly - -## Demonstration - -I've created test data for you with **all 5 time ranges**: - -```bash -cd /home/paul/git/conf/f3s/prometheus-pusher - -# View the generated test data -cat test-all-ages.csv -``` - -**Contents** (generated with actual timestamps): -```csv -# CURRENT data (< 5min old) -app_requests_total,instance=current;env=prod,100,1767125148000 -app_temperature_celsius,instance=current;zone=us-east,22.5,1767125148000 -app_active_connections,instance=current;env=prod,50,1767125148000 - -# 1 HOUR OLD data -app_requests_total,instance=1h_ago;env=prod,95,1767121548000 -app_active_connections,instance=1h_ago;env=prod,45,1767121548000 -app_temperature_celsius,instance=1h_ago;zone=us-east,21.8,1767121548000 - -# 1 DAY OLD data -app_requests_total,instance=1d_ago;env=prod,150,1767038748000 -app_temperature_celsius,instance=1d_ago;zone=eu-west,18.3,1767038748000 -app_active_connections,instance=1d_ago;env=prod,60,1767038748000 - -# 1 WEEK OLD data -app_requests_total,instance=1w_ago;env=prod,200,1766520348000 -app_jobs_processed_total,instance=1w_ago;env=prod;job_type=email;status=success,75,1766520348000 -app_temperature_celsius,instance=1w_ago;zone=asia,25.2,1766520348000 - -# 1 MONTH OLD data -app_requests_total,instance=1m_ago;env=prod,180,1764533148000 -app_active_connections,instance=1m_ago;env=prod,30,1764533148000 -app_temperature_celsius,instance=1m_ago;zone=africa,28.7,1764533148000 -``` - -## Test It Yourself - -Once Prometheus is configured with remote 
write receiver: - -```bash -# 1. Port-forward services -kubectl port-forward -n monitoring svc/pushgateway 9091:9091 & -kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 & - -# 2. Import ALL time ranges in one command! -./prometheus-pusher \ - -mode=auto \ - -file=test-all-ages.csv \ - -pushgateway=http://localhost:9091 \ - -prometheus=http://localhost:9090/api/v1/write -``` - -**Expected Output**: -``` -π€ AUTO mode: Automatically detecting timestamp age and choosing ingestion method - -π Reading metrics from: test-all-ages.csv (format: csv) -π Auto-ingest summary: - Total samples: 15 - Realtime samples (< 5min old): 3 - Historic samples (> 5min old): 12 - -π Ingesting 3 REALTIME samples via Pushgateway... - Note: Pushgateway ingestion uses current timestamp -β
Successfully ingested 3 realtime samples - -β° Ingesting 12 HISTORIC samples via Remote Write... - [1/12] app_requests_total (age: 1.0 hours) - [2/12] app_active_connections (age: 1.0 hours) - [3/12] app_temperature_celsius (age: 1.0 hours) - [4/12] app_requests_total (age: 1.0 days) - [5/12] app_temperature_celsius (age: 1.0 days) - [6/12] app_active_connections (age: 1.0 days) - [7/12] app_requests_total (age: 7.0 days) - [8/12] app_jobs_processed_total (age: 7.0 days) - [9/12] app_temperature_celsius (age: 7.0 days) - [10/12] app_requests_total (age: 30.0 days) - [11/12] app_active_connections (age: 30.0 days) - [12/12] app_temperature_celsius (age: 30.0 days) -β
Successfully ingested 12 historic samples - -π Auto-ingest complete! -``` - -## Verification - -After import, query the data in Prometheus: - -```bash -# Query current data -curl 'http://localhost:9090/api/v1/query?query={instance="current"}' - -# Query 1 hour old data -curl 'http://localhost:9090/api/v1/query?query={instance="1h_ago"}' - -# Query 1 day old data -curl 'http://localhost:9090/api/v1/query?query={instance="1d_ago"}' - -# Query 1 week old data -curl 'http://localhost:9090/api/v1/query?query={instance="1w_ago"}' - -# Query 1 month old data -curl 'http://localhost:9090/api/v1/query?query={instance="1m_ago"}' - -# See all imported data -curl 'http://localhost:9090/api/v1/query?query={env="prod"}' -``` - -## Summary Table - -| Time Range | Status | Ingestion Method | Notes | -|------------|--------|------------------|-------| -| **Current** (now) | β
YES | Pushgateway | Uses "now" timestamp |
-| **1 hour old** | ✅ YES | Remote Write | Preserves original timestamp |
-| **1 day old** | ✅ YES | Remote Write | Preserves original timestamp |
-| **1 week old** | ✅ YES | Remote Write | Preserves original timestamp |
-| **1 month old** | ✅ YES | Remote Write | Preserves original timestamp |
-
-## Key Features
-
-✅ **Automatic Detection** - Tool detects age, you don't calculate
-✅ **Smart Routing** - Chooses Pushgateway or Remote Write automatically
-✅ **Clear Logging** - See exactly what's happening for each metric
-✅ **Batch Import** - Import all ages in one go
-✅ **Format Support** - CSV and JSON formats
-✅ **No Manual Work** - Just provide timestamps, tool handles the rest
-
-## Pending Setup
-
-To use the historic data features (1h, 1d, 1w, 1m old):
-
-1. **Enable Remote Write Receiver** in Prometheus:
-   ```bash
-   cd /home/paul/git/conf/f3s/prometheus
-   helm upgrade prometheus prometheus-community/kube-prometheus-stack \
-     -n monitoring -f persistence-values.yaml
-   ```
-
-2. **Wait for Prometheus** to restart with the new flag enabled
-
-3. **Run the test** as shown above
-
-## Documentation
-
-- **AUTO-MODE.md** - Complete guide to auto mode
-- **HISTORIC.md** - Guide to historic data ingestion
-- **SETUP-COMPLETE.md** - Setup instructions
-- **test-all-ages.csv** - Ready-to-use test data
-
-## Conclusion
-
-**YES**, the tool can import data from:
-- ✅ Current time
-- ✅ 1 hour ago
-- ✅ 1 day ago
-- ✅ 1 week ago
-- ✅
1 month ago - -And it does this **automatically** - you don't need to think about it! π - -Just run: -```bash -./prometheus-pusher -mode=auto -file=your-data.csv -``` - -The tool will: -1. Read your timestamps -2. Calculate age for each metric -3. Route to appropriate ingestion method -4. Log what it's doing -5. Complete the import - -**All changes committed and pushed to git!** diff --git a/f3s/prometheus-pusher/AUTO-MODE.md b/f3s/prometheus-pusher/AUTO-MODE.md deleted file mode 100644 index 03db0b5..0000000 --- a/f3s/prometheus-pusher/AUTO-MODE.md +++ /dev/null @@ -1,297 +0,0 @@ -# Auto Mode - Automatic Timestamp Detection - -## Overview - -The **AUTO mode** is a smart ingestion mode that: -1. **Reads metrics** with timestamps from a file or stdin -2. **Automatically detects** how old each metric is -3. **Chooses the right ingestion method**: - - Realtime data (< 5 minutes old) β Pushgateway - - Historic data (> 5 minutes old) β Remote Write API -4. **Logs what it's doing** so you can see which method is used - -**No manual timestamp calculation needed!** Just provide data with timestamps. - -## Why Use Auto Mode? 
- -### Problem -Previously, you had to: -- Manually calculate how old your data is -- Choose between `--mode=realtime` or `--mode=historic` -- Specify `-hours-ago` for each datapoint - -### Solution -Now you can: -- Provide data with timestamps in any format (CSV or JSON) -- The tool automatically detects age and chooses ingestion method -- Batch import mixed data (some current, some old) - -## Usage - -### From File - -```bash -# CSV format -./prometheus-pusher -mode=auto -file=metrics.csv -format=csv - -# JSON format -./prometheus-pusher -mode=auto -file=metrics.json -format=json -``` - -### From Stdin - -```bash -# Pipe CSV data -cat metrics.csv | ./prometheus-pusher -mode=auto -format=csv - -# Interactive input -./prometheus-pusher -mode=auto -format=csv -# (then paste data and press Ctrl+D) -``` - -## Input Formats - -### CSV Format - -``` -# Format: metric_name,labels,value,timestamp_ms -# Labels: key1=value1;key2=value2 - -app_requests_total,instance=web1;env=prod,100,1767125148000 -app_temperature_celsius,instance=web2;zone=us,22.5,1767038748000 -``` - -**Fields**: -1. `metric_name`: Prometheus metric name -2. `labels`: Semicolon-separated label pairs (optional) -3. `value`: Metric value (float) -4. 
`timestamp_ms`: Unix timestamp in milliseconds (optional, defaults to now) - -**Example**: -```csv -# Current data (no timestamp = uses now) -app_requests_total,instance=web1,100, - -# 1 hour ago -app_requests_total,instance=web2,95,1767121548000 - -# 1 day ago -app_requests_total,instance=web3,150,1767038748000 -``` - -### JSON Format - -```json -[ - { - "metric": "app_requests_total", - "labels": {"instance": "web1", "env": "prod"}, - "value": 100, - "timestamp_ms": 1767125148000 - }, - { - "metric": "app_temperature_celsius", - "labels": {"instance": "web2", "zone": "us"}, - "value": 22.5, - "timestamp_ms": 1767038748000 - } -] -``` - -**Fields**: -- `metric`: Metric name (required) -- `labels`: Object with label key-value pairs (optional) -- `value`: Metric value (required) -- `timestamp_ms`: Unix timestamp in milliseconds (optional) - -## Generating Test Data - -Use the provided script to generate test data for all time ranges: - -```bash -./generate-test-data.sh -``` - -This creates `test-all-ages.csv` with: -- Current data (< 5 min old) -- 1 hour old data -- 1 day old data -- 1 week old data -- 1 month old data - -## Example: Import All Time Ranges - -```bash -# 1. Generate test data -./generate-test-data.sh - -# 2. Port-forward Prometheus (for historic data) -kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 & - -# 3. Port-forward Pushgateway (for current data) -kubectl port-forward -n monitoring svc/pushgateway 9091:9091 & - -# 4. 
Auto-import all data -./prometheus-pusher \ - -mode=auto \ - -file=test-all-ages.csv \ - -format=csv \ - -pushgateway=http://localhost:9091 \ - -prometheus=http://localhost:9090/api/v1/write -``` - -**Expected Output**: -``` -π€ AUTO mode: Automatically detecting timestamp age and choosing ingestion method - -π Reading metrics from: test-all-ages.csv (format: csv) -π Auto-ingest summary: - Total samples: 15 - Realtime samples (< 5min old): 3 - Historic samples (> 5min old): 12 - -π Ingesting 3 REALTIME samples via Pushgateway... - Note: Pushgateway ingestion uses current timestamp -β
Successfully ingested 3 realtime samples - -β° Ingesting 12 HISTORIC samples via Remote Write... - [1/12] app_requests_total (age: 1.0 hours) - [2/12] app_active_connections (age: 1.0 hours) - [3/12] app_temperature_celsius (age: 1.0 hours) - [4/12] app_requests_total (age: 1.0 days) - [5/12] app_temperature_celsius (age: 1.0 days) - [6/12] app_active_connections (age: 1.0 days) - [7/12] app_requests_total (age: 7.0 days) - [8/12] app_jobs_processed_total (age: 7.0 days) - [9/12] app_temperature_celsius (age: 7.0 days) - [10/12] app_requests_total (age: 30.0 days) - [11/12] app_active_connections (age: 30.0 days) - [12/12] app_temperature_celsius (age: 30.0 days) -β
Successfully ingested 12 historic samples - -π Auto-ingest complete! -``` - -## Detection Logic - -The tool uses a **5-minute threshold**: - -| Data Age | Ingestion Method | Reason | -|----------|------------------|---------| -| < 5 minutes | Pushgateway (realtime) | Recent enough to use "now" timestamp | -| β₯ 5 minutes | Remote Write (historic) | Too old, needs preserved timestamp | - -**Why 5 minutes?** -- Allows for clock skew and processing delays -- Prometheus scrapes Pushgateway every 15-30s -- Gives buffer for network delays - -## Query Imported Data - -After import, query in Prometheus: - -```promql -# View current data (from Pushgateway) -{instance="current"} - -# View 1 hour old data -{instance="1h_ago"} - -# View 1 day old data -{instance="1d_ago"} - -# View 1 week old data -{instance="1w_ago"} - -# View 1 month old data -{instance="1m_ago"} - -# All imported data -{env="prod"} -``` - -## Flags - -``` --mode=auto Enable auto mode --file=<path> Input file (CSV or JSON) --format=<fmt> Format: csv or json (default: csv) --pushgateway=<url> Pushgateway URL (default: http://localhost:9091) --prometheus=<url> Prometheus remote write URL (default: http://localhost:9090/api/v1/write) --job=<name> Job name for metrics (default: example_metrics_pusher) -``` - -## Supported Time Ranges - -β
**Current data** (< 5 min): Works perfectly
-✅ **1 hour old**: Works via Remote Write
-✅ **1 day old**: Works via Remote Write
-✅ **1 week old**: Works via Remote Write
-✅
**1 month old**: Works via Remote Write -β οΈ **Very old data** (months/years): May hit Prometheus limits - -For very old data (> few months), consider: -- Using `promtool tsdb create-blocks-from` instead -- Increasing Prometheus retention settings -- Using long-term storage solutions - -## Benefits - -1. **No timestamp math** - Tool calculates age automatically -2. **Mixed data** - Import both current and historic data in one go -3. **Visual feedback** - See exactly which ingestion method is used -4. **Batch import** - Process large CSV/JSON files easily -5. **Error handling** - Clear messages if ingestion fails - -## Comparison with Other Modes - -| Mode | Use Case | Timestamp Handling | -|------|----------|-------------------| -| `realtime` | Live monitoring | Always uses "now" | -| `historic` | Single old datapoint | Manually specify `-hours-ago` | -| `backfill` | Range of datapoints | Manually specify range | -| `auto` | **Any mix of data** | **Automatic detection** | - -## Advanced Example: Import from Multiple Sources - -```bash -# Generate various test data -./generate-test-data.sh - -# Import yesterday's backup -./prometheus-pusher -mode=auto -file=backup_yesterday.csv - -# Import last week's logs -./prometheus-pusher -mode=auto -file=logs_lastweek.json -format=json - -# Import current metrics -./prometheus-pusher -mode=auto -file=current_metrics.csv -``` - -All data is automatically routed to the correct ingestion method! 
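The routing that the deleted AUTO-MODE.md describes (parse each record, compute its age, compare it against the 5-minute threshold) can be sketched roughly as follows. This is a minimal illustration in Python, not the tool's actual implementation. The names `parse_csv_line`, `route`, and `REALTIME_THRESHOLD_MS` are hypothetical, and the sketch assumes the `metric_name,labels,value,timestamp_ms` CSV layout and the "5 minutes or older is historic" rule from the Detection Logic table.

```python
import time

# Assumption: samples at least this old go to Remote Write (historic path).
REALTIME_THRESHOLD_MS = 5 * 60 * 1000


def parse_csv_line(line, now_ms):
    """Parse 'metric_name,labels,value,timestamp_ms'.

    Returns None for blank lines and '#' comments. Labels are
    semicolon-separated key=value pairs. A missing timestamp
    defaults to now_ms, as the CSV format description states.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = [f.strip() for f in line.split(",")]
    fields += [""] * (4 - len(fields))  # pad optional trailing fields
    name, labels_raw, value, ts = fields[:4]
    labels = dict(
        pair.split("=", 1) for pair in labels_raw.split(";") if "=" in pair
    )
    return {
        "metric": name,
        "labels": labels,
        "value": float(value),  # raises ValueError on a malformed value
        "timestamp_ms": int(ts) if ts else now_ms,
    }


def route(sample, now_ms):
    """Return 'realtime' (Pushgateway) or 'historic' (Remote Write)."""
    age_ms = now_ms - sample["timestamp_ms"]
    return "realtime" if age_ms < REALTIME_THRESHOLD_MS else "historic"


if __name__ == "__main__":
    now = int(time.time() * 1000)
    one_hour_ago = now - 3600 * 1000
    for row in ("app_requests_total,instance=web1,100,",
                f"app_requests_total,instance=web2,95,{one_hour_ago}"):
        sample = parse_csv_line(row, now)
        print(sample["metric"], sample["labels"], route(sample, now))
```

Keeping `now_ms` as an explicit parameter makes the age check deterministic, which is convenient when testing against fixed timestamps such as those in `test-all-ages.csv`.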
- -## Troubleshooting - -### "No valid samples found" -- Check CSV/JSON format -- Ensure timestamps are in milliseconds -- Check for syntax errors in labels - -### "Remote write receiver not enabled" -- Ensure Prometheus has `--web.enable-remote-write-receiver` flag -- Check prometheus/persistence-values.yaml configuration - -### "Pushgateway connection refused" -- Verify port-forward: `kubectl port-forward -n monitoring svc/pushgateway 9091:9091` -- Check Pushgateway is running: `kubectl get pods -n monitoring | grep pushgateway` - -## Summary - -Auto mode makes importing data effortless: -- π₯ Read data from file or stdin -- π Automatically detect timestamp age -- π― Choose optimal ingestion method -- π Clear logging of what's happening -- β
Support for all time ranges (current β 1 month old) - -No more manual timestamp calculations - just provide your data! diff --git a/f3s/prometheus-pusher/DASHBOARD.md b/f3s/prometheus-pusher/DASHBOARD.md deleted file mode 100644 index 47888ff..0000000 --- a/f3s/prometheus-pusher/DASHBOARD.md +++ /dev/null @@ -1,151 +0,0 @@ -# Grafana Dashboard for Prometheus Pusher Test Metrics - -This document describes the Grafana dashboard for visualizing metrics generated by prometheus-pusher. - -## Dashboard Overview - -The dashboard displays all test metrics with the `prometheus_pusher_test_` prefix, making it clear they are generated by the prometheus-pusher testing/demo functionality. - -## Metrics Displayed - -### 1. Request Rate -- **Type**: Line graph -- **Metric**: `rate(prometheus_pusher_test_requests_total[5m])` -- **Description**: Shows the rate of requests per second over the last 5 minutes -- **Use**: Monitor request throughput - -### 2. Total Requests -- **Type**: Stat panel -- **Metric**: `prometheus_pusher_test_requests_total` -- **Description**: Counter showing total number of requests processed -- **Display**: Large number with area graph background - -### 3. Active Connections -- **Type**: Gauge -- **Metric**: `prometheus_pusher_test_active_connections` -- **Description**: Current number of active connections (0-100) -- **Thresholds**: - - Green: 0-50 - - Yellow: 50-80 - - Red: 80-100 - -### 4. Temperature -- **Type**: Gauge -- **Metric**: `prometheus_pusher_test_temperature_celsius` -- **Description**: Current temperature in Celsius (0-50Β°C) -- **Thresholds**: - - Blue: 0-20Β°C - - Green: 20-30Β°C - - Yellow: 30-35Β°C - - Red: 35-50Β°C - -### 5. 
Request Duration Histogram -- **Type**: Line graph -- **Metrics**: - - `histogram_quantile(0.50, rate(prometheus_pusher_test_request_duration_seconds_bucket[5m]))` - p50 - - `histogram_quantile(0.90, rate(prometheus_pusher_test_request_duration_seconds_bucket[5m]))` - p90 - - `histogram_quantile(0.99, rate(prometheus_pusher_test_request_duration_seconds_bucket[5m]))` - p99 -- **Description**: Shows request duration percentiles over time -- **Use**: Identify latency trends and outliers - -### 6. Average Request Duration -- **Type**: Stat panel -- **Metric**: `rate(prometheus_pusher_test_request_duration_seconds_sum[5m]) / rate(prometheus_pusher_test_request_duration_seconds_count[5m])` -- **Description**: Average request duration in seconds -- **Display**: Number with area graph, 3 decimal places - -### 7. Jobs Processed by Type -- **Type**: Bar gauge -- **Metric**: `sum by (job_type) (prometheus_pusher_test_jobs_processed_total)` -- **Description**: Total jobs processed grouped by job type (email, report, backup) -- **Display**: Horizontal gradient bars - -### 8. Jobs Status Breakdown -- **Type**: Table -- **Metric**: `prometheus_pusher_test_jobs_processed_total` -- **Description**: Detailed breakdown showing job type, status, and count -- **Columns**: Job Type, Status, Count - -## Deployment - -### Prerequisites -- Grafana instance running and accessible -- Prometheus as a data source in Grafana -- Metrics being pushed to Prometheus via prometheus-pusher - -### Deploy via Script -```bash -# Port-forward to Grafana (if running in Kubernetes) -kubectl port-forward -n monitoring svc/prometheus-grafana 3000:80 - -# Deploy dashboard (in another terminal) -./deploy-dashboard.sh -``` - -### Deploy Manually -1. Open Grafana UI -2. Go to Dashboards β Import -3. Upload `grafana-dashboard.json` -4. Select Prometheus data source -5. 
Click Import - -### Custom Deployment -```bash -# With custom Grafana URL and credentials -GRAFANA_URL="http://grafana.example.com" \ -GRAFANA_USER="admin" \ -GRAFANA_PASSWORD="secret" \ -./deploy-dashboard.sh -``` - -## Dashboard Features - -- **Auto-refresh**: Updates every 10 seconds -- **Time range**: Last 15 minutes by default -- **Refresh intervals**: 5s, 10s, 30s, 1m, 5m -- **Shared tooltips**: Hover over graphs to see all series values -- **Dark theme**: Optimized for dark mode viewing - -## Metric Naming Convention - -All metrics use the `prometheus_pusher_test_` prefix to: -- Clearly identify them as test/demo metrics -- Distinguish from production application metrics -- Make cleanup and filtering easier -- Prevent confusion with real application data - -## Testing the Dashboard - -1. **Push test metrics**: - ```bash - ./prometheus-pusher -mode realtime -pushgateway http://localhost:9091 - ``` - -2. **Continuous updates** (for live dashboard testing): - ```bash - ./prometheus-pusher -mode realtime -pushgateway http://localhost:9091 -continuous - ``` - -3. **View in Grafana**: - - Navigate to the "Prometheus Pusher Test Metrics" dashboard - - Watch metrics update in real-time - - Interact with time ranges and refresh intervals - -## Cleanup - -To remove old test metrics from Pushgateway: -```bash -# Delete all metrics for the example_metrics_pusher job -curl -X DELETE http://localhost:9091/metrics/job/example_metrics_pusher -``` - -## Customization - -The dashboard JSON file can be customized: -- Adjust panel sizes via `gridPos` -- Change colors and thresholds -- Add new panels for additional metrics -- Modify queries and aggregations -- Update refresh intervals - -After making changes, redeploy using the deployment script. 
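For the Request Duration Histogram panel described in the deleted DASHBOARD.md, it may help to see roughly what `histogram_quantile` computes. The sketch below is a simplified Python approximation of the documented behaviour (linear interpolation inside the bucket that contains the target rank), not Prometheus's own code. The function name `estimate_quantile` and the sample bucket values are made up for illustration, and the lowest bucket is assumed to start at 0.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate a quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count) sorted by upper bound,
    with the last upper bound being math.inf (the '+Inf' bucket).
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower, prev_count = 0.0, 0.0  # assumption: lowest bucket starts at 0
    for upper, count in buckets:
        if count >= rank:
            if math.isinf(upper) or count == prev_count:
                # Rank falls in the +Inf bucket (or an empty bucket):
                # fall back to the highest bound seen so far.
                return lower
            # Linear interpolation within the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return lower + (upper - lower) * fraction
        lower, prev_count = upper, count
    return lower


if __name__ == "__main__":
    # Hypothetical request-duration buckets (seconds, cumulative counts).
    demo = [(0.1, 10), (0.5, 60), (1.0, 90), (math.inf, 100)]
    for q in (0.50, 0.90, 0.99):
        print(f"p{int(q * 100)} ~= {estimate_quantile(q, demo):.3f}s")
```

Because the estimate is interpolated, its accuracy depends on how finely the bucket boundaries are chosen around the latencies of interest.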
diff --git a/f3s/prometheus-pusher/HISTORIC.md b/f3s/prometheus-pusher/HISTORIC.md deleted file mode 100644 index 22c397d..0000000 --- a/f3s/prometheus-pusher/HISTORIC.md +++ /dev/null @@ -1,231 +0,0 @@ -# Historic Data Ingestion - -This document explains how to ingest historic data into Prometheus using the prometheus-pusher tool. - -## Problem - -The standard Pushgateway approach has a limitation: it doesn't support custom timestamps. When you push metrics to Pushgateway, Prometheus scrapes them with the current timestamp. This means you cannot backfill historic data (e.g., data from yesterday or last week). - -## Solution - -Prometheus supports the **Remote Write API** which accepts timestamped samples. By enabling the `remote-write-receiver` feature flag, Prometheus can accept historic data with custom timestamps via HTTP POST. - -### Limitations - -- **Out-of-order samples**: By default, Prometheus rejects samples that are older than the most recent sample for that time series -- **Time window**: Prometheus typically accepts data within a certain time window (default: up to 1 hour in the past for new series) -- **Feature flag required**: The remote write receiver must be enabled with `--enable-feature=remote-write-receiver` - -## Setup - -### 1. Enable Remote Write Receiver - -The Prometheus instance needs to be configured with the remote write receiver feature: - -```yaml -# In prometheus/persistence-values.yaml -prometheus: - prometheusSpec: - enableFeatures: - - remote-write-receiver -``` - -This has been configured and applied to the monitoring namespace Prometheus instance. - -### 2. 
Verify Feature is Enabled - -```bash -kubectl logs -n monitoring prometheus-prometheus-kube-prometheus-prometheus-0 | grep "remote-write-receiver" -``` - -You should see: `msg="Experimental features enabled" features=[remote-write-receiver]` - -## Usage - -The `prometheus-pusher` binary supports three modes: - -### Mode 1: Realtime (Default) - -Push current metrics to Pushgateway (same as before): - -```bash -./prometheus-pusher -mode=realtime -continuous -``` - -Options: -- `-pushgateway`: Pushgateway URL (default: http://localhost:9091) -- `-job`: Job name (default: example_metrics_pusher) -- `-continuous`: Keep pushing every 15 seconds - -### Mode 2: Historic (Single Datapoint) - -Push a single datapoint from X hours ago: - -```bash -# Port-forward Prometheus -kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 & - -# Push data from 24 hours ago -./prometheus-pusher -mode=historic -hours-ago=24 - -# Push data from 3 hours ago -./prometheus-pusher -mode=historic -hours-ago=3 - -# Push data from yesterday (48 hours ago) -./prometheus-pusher -mode=historic -hours-ago=48 -``` - -Options: -- `-prometheus`: Prometheus remote write URL (default: http://localhost:9090/api/v1/write) -- `-hours-ago`: How many hours in the past (default: 24) - -### Mode 3: Backfill (Multiple Datapoints) - -Backfill a range of historic data: - -```bash -# Backfill last 48 hours with 1-hour intervals -./prometheus-pusher -mode=backfill -start-hours=48 -end-hours=0 -interval=1 - -# Backfill last week with 6-hour intervals -./prometheus-pusher -mode=backfill -start-hours=168 -end-hours=0 -interval=6 - -# Backfill specific range (24h ago to 12h ago, every 2 hours) -./prometheus-pusher -mode=backfill -start-hours=24 -end-hours=12 -interval=2 -``` - -Options: -- `-start-hours`: Start time in hours ago (e.g., 48 = 2 days ago) -- `-end-hours`: End time in hours ago (e.g., 0 = now) -- `-interval`: Interval between datapoints in hours - -## Data Format - -Historic 
data is sent using the Prometheus Remote Write protocol (Protobuf): - -1. **Protocol**: HTTP POST with Protobuf payload -2. **Encoding**: Snappy compression -3. **Headers**: - - Content-Type: application/x-protobuf - - Content-Encoding: snappy - - X-Prometheus-Remote-Write-Version: 0.1.0 - -4. **Payload**: TimeSeries with custom timestamps - -Example time series: -```protobuf -TimeSeries { - Labels: [ - {Name: "__name__", Value: "app_requests_total"}, - {Name: "instance", Value: "example-app"}, - {Name: "job", Value: "historic_data"} - ], - Samples: [ - {Value: 42, Timestamp: 1735516800000} // milliseconds since epoch - ] -} -``` - -## Example: Backfill Last 24 Hours - -```bash -#!/bin/bash - -# 1. Port-forward Prometheus -kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 & -PF_PID=$! -sleep 2 - -# 2. Backfill data for every hour in the last 24 hours -cd /home/paul/git/conf/f3s/prometheus-pusher -./prometheus-pusher \ - -mode=backfill \ - -prometheus=http://localhost:9090/api/v1/write \ - -start-hours=24 \ - -end-hours=0 \ - -interval=1 - -# 3. Clean up -kill $PF_PID -``` - -## Querying Historic Data - -Once backfilled, the historic data is queryable in Prometheus: - -```promql -# View all historic data -{job="historic_data"} - -# View specific metric from historic data -app_requests_total{job="historic_data"} - -# View data from a specific time range -app_temperature_celsius{job="historic_data"}[24h] - -# Compare realtime vs historic data -app_requests_total{job="example_metrics_pusher"} # realtime -app_requests_total{job="historic_data"} # historic -``` - -## Troubleshooting - -### Error: "remote write receiver not enabled" - -``` -Error: remote write failed with status 404: remote write receiver not enabled -``` - -Solution: Ensure Prometheus has the `remote-write-receiver` feature enabled and has restarted. 
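The backfill mode described earlier in this file (`-start-hours`, `-end-hours`, `-interval`) implies a walk over a range of timestamps. A minimal Python sketch of that walk is shown below, under stated assumptions: the function name `backfill_timestamps_ms` is hypothetical, both endpoints are assumed to be included, and timestamps are emitted oldest first so that samples arrive in order. The real tool may differ on any of these points.

```python
import time

HOUR_MS = 3600 * 1000


def backfill_timestamps_ms(now_ms, start_hours, end_hours, interval_hours):
    """Return millisecond timestamps from start_hours ago to end_hours ago.

    Timestamps are ordered oldest to newest, stepping by interval_hours.
    """
    if interval_hours <= 0:
        raise ValueError("interval_hours must be positive")
    if start_hours < end_hours:
        raise ValueError("start_hours must be >= end_hours")
    timestamps = []
    hours_ago = start_hours
    while hours_ago >= end_hours:
        timestamps.append(now_ms - hours_ago * HOUR_MS)
        hours_ago -= interval_hours
    return timestamps


if __name__ == "__main__":
    now = int(time.time() * 1000)
    # Mirrors: -mode=backfill -start-hours=24 -end-hours=12 -interval=2
    for ts in backfill_timestamps_ms(now, 24, 12, 2):
        print(ts, f"({(now - ts) / HOUR_MS:.0f}h ago)")
```

Emitting the oldest timestamp first matches the advice to backfill from oldest to newest, which avoids out-of-order rejections within a single series.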
- -### Error: "out of order sample" - -``` -Error: sample timestamp out of order -``` - -This occurs when trying to insert data older than existing data for the same time series. Solutions: -1. Use a different job label for historic data (already done: `job="historic_data"`) -2. Enable out-of-order ingestion in Prometheus (experimental) -3. Ensure backfill starts from oldest to newest - -### Error: "sample too old" - -``` -Error: sample is too old -``` - -Prometheus has limits on how old data can be. By default: -- For existing series: can't be older than the oldest block -- For new series: typically accepts data up to 1 hour old - -Solution: For very old data (weeks/months), use `promtool tsdb create-blocks-from` instead. - -## Best Practices - -1. **Use different job labels**: Historic data uses `job="historic_data"`, realtime uses `job="example_metrics_pusher"` -2. **Backfill in order**: Always backfill from oldest to newest to avoid out-of-order rejections -3. **Small batches**: Don't overwhelm Prometheus - the tool includes 100ms delays between datapoints -4. **Verify first**: Test with a single datapoint before running large backfills -5. **Monitor errors**: Check Prometheus logs if ingestion fails - -## Limitations - -- **Very old data**: For data older than a few days, consider using `promtool` for TSDB block creation -- **High cardinality**: Be careful with label combinations - they create separate time series -- **Performance**: Large backfills can impact Prometheus performance -- **Out-of-order**: By default, Prometheus rejects out-of-order samples - -## Alternative: Using Promtool - -For very large historic datasets, you can use `promtool` to create TSDB blocks: - -```bash -# 1. Generate OpenMetrics format file -./prometheus-pusher -mode=export -output=metrics.txt - -# 2. 
Create blocks from the file -promtool tsdb create-blocks-from openmetrics metrics.txt /path/to/prometheus/data -``` - -This method bypasses the API and writes directly to the TSDB, but requires filesystem access. diff --git a/f3s/prometheus-pusher/LIMITATIONS.md b/f3s/prometheus-pusher/LIMITATIONS.md deleted file mode 100644 index e3fe11a..0000000 --- a/f3s/prometheus-pusher/LIMITATIONS.md +++ /dev/null @@ -1,267 +0,0 @@ -# Prometheus Ingestion Limitations - -## Time Range Limits - -### β
What Works (Tested)
-
-| Time Range | Status | Method |
-|------------|--------|--------|
-| Current (now) | ✅ Works | Pushgateway |
-| 1 hour old | ✅ Works | Remote Write |
-| 1 day old | ✅ Works | Remote Write |
-| 1 week old | ✅ Works | Remote Write |
-| 1 month old | ✅
Works | Remote Write | - -### β οΈ Potential Issues - -#### 1. **Future Data** (timestamps in the future) - -**Limit**: Prometheus rejects samples too far in the future - -```bash -# Default: ~5 minutes into the future is allowed -# Controlled by: --storage.tsdb.allow-out-of-order-time-window -``` - -**Example that might fail**: -```csv -# 1 hour in the future - WILL BE REJECTED -app_requests_total,instance=test,100,TIMESTAMP_1H_FUTURE -``` - -**Error**: `sample is too far in the future` - -#### 2. **Very Old Data** (months/years old) - -**Limits depend on**: -- Prometheus retention period -- TSDB block structure -- `--storage.tsdb.min-block-duration` setting - -**Typical limits**: -- **Few months old**: Usually works -- **6+ months old**: May be rejected -- **Years old**: Likely rejected unless using promtool - -**Example that might fail**: -```csv -# 6 months old - MIGHT BE REJECTED -app_requests_total,instance=test,100,TIMESTAMP_6M_AGO - -# 1 year old - LIKELY REJECTED -app_requests_total,instance=test,100,TIMESTAMP_1Y_AGO -``` - -**Error**: `sample is too old` - -#### 3. **Out-of-Order Samples** (for existing time series) - -**Problem**: If a time series already has recent data, you can't insert older data - -**Example**: -```bash -# Step 1: Push current data -echo "app_requests_total,instance=test,100,$NOW" | ./prometheus-pusher -mode=auto - -# Step 2: Try to push older data for SAME time series - WILL BE REJECTED -echo "app_requests_total,instance=test,95,$ONE_HOUR_AGO" | ./prometheus-pusher -mode=auto -``` - -**Error**: `out of order sample` - -**Workaround**: Use different labels (different time series) - -#### 4. 
**Pushgateway Timestamp Limitations** - -**Problem**: Pushgateway NEVER preserves timestamps - -- All data uses "now" when Prometheus scrapes -- Cannot backfill old data via Pushgateway -- Only suitable for current/recent data - -**Example**: -```csv -# Even with old timestamp, Pushgateway uses "now" -app_requests_total,instance=current,100,TIMESTAMP_1D_AGO -# β This timestamp is IGNORED by Pushgateway -``` - -## Testing Edge Cases - -Let me create a test for various edge cases: - -```bash -#!/bin/bash -# test-limits.sh - -NOW=$(date +%s)000 - -# Test cases -cat > test-limits.csv << EOF -# Edge case tests - -# 1. CURRENT - should work -app_test_current,instance=test1,100,$NOW - -# 2. 5 minutes in future - might work -app_test_future_5m,instance=test2,100,$((NOW + 300000)) - -# 3. 1 hour in future - will likely be rejected -app_test_future_1h,instance=test3,100,$((NOW + 3600000)) - -# 4. 2 months old - might work -app_test_2m_old,instance=test4,100,$((NOW - 5184000000)) - -# 5. 6 months old - might be rejected -app_test_6m_old,instance=test5,100,$((NOW - 15552000000)) - -# 6. 1 year old - likely rejected -app_test_1y_old,instance=test6,100,$((NOW - 31536000000)) - -# 7. 2 years old - very likely rejected -app_test_2y_old,instance=test7,100,$((NOW - 63072000000)) -EOF - -echo "Testing edge cases..." -./prometheus-pusher -mode=auto -file=test-limits.csv -``` - -## Prometheus Configuration Limits - -### Default Settings - -```yaml -# Prometheus default limits ---storage.tsdb.retention.time=15d # Data older than 15 days is deleted ---storage.tsdb.min-block-duration=2h # Minimum block size ---web.enable-remote-write-receiver # Must be enabled for historic data -``` - -### What These Mean for Ingestion - -1. **Retention Time** (`--storage.tsdb.retention.time`) - - Default: 15 days - - Can't ingest data older than retention period - - Check your Prometheus config - -2. 
**Min Block Duration** (`--storage.tsdb.min-block-duration`) - - Affects how old data can be written - - Default: 2 hours - - Older data needs to align with block boundaries - -3. **Out-of-Order Time Window** - - Default: disabled - - Can be enabled with `--enable-feature=out-of-order-ingestion` - - Allows writing old data to existing series - -## Checking Your Limits - -```bash -# Check Prometheus retention -kubectl get prometheus -n monitoring prometheus-kube-prometheus-prometheus \ - -o jsonpath='{.spec.retention}' - -# Check Prometheus logs for limits -kubectl logs -n monitoring prometheus-prometheus-kube-prometheus-prometheus-0 \ - | grep -i "retention\|block\|sample.*old\|sample.*future" -``` - -## Summary Table - -| Time Range | Ingestion | Notes | -|------------|-----------|-------| -| 1 hour future | ❌ Rejected | "Too far in future" | -| 5 min future | ⚠️ Maybe | Depends on config | -| Current | ✅
Works | Both methods | -| 1 hour old | ✅
Works | Remote Write | -| 1 day old | ✅
Works | Remote Write | -| 1 week old | ✅
Works | Remote Write | -| 1 month old | ✅
Works | Remote Write | -| 2 months old | ⚠️ Maybe | Depends on retention | -| 6 months old | ⚠️ Maybe | Likely rejected | -| 1 year old | ❌ Rejected | Too old | -| 2+ years old | ❌ Rejected | Way too old | - -## Solutions for Very Old Data - -If you need to ingest data older than a few months: - -### Option 1: Use promtool (Recommended for very old data) - -```bash -# 1. Export data in OpenMetrics format -cat > old-metrics.txt << EOF -# HELP app_requests_total Total requests -# TYPE app_requests_total counter -app_requests_total{instance="old"} 100 TIMESTAMP_1Y_AGO -EOF - -# 2. Create TSDB blocks directly -promtool tsdb create-blocks-from openmetrics old-metrics.txt /prometheus/data - -# 3. Restart Prometheus to load new blocks -kubectl rollout restart statefulset/prometheus-prometheus-kube-prometheus-prometheus -n monitoring -``` - -### Option 2: Adjust Prometheus Retention - -```yaml -# Increase retention to accept older data -prometheus: - prometheusSpec: - retention: 90d # Keep data for 90 days - retentionSize: 50GB -``` - -### Option 3: Enable Out-of-Order Ingestion - -```yaml -# Allow out-of-order samples -prometheus: - prometheusSpec: - enableFeatures: - - out-of-order-ingestion - additionalArgs: - - --storage.tsdb.out-of-order-time-window=30d -``` - -## Best Practices - -1. ✅
**Current to 1 month**: Use prometheus-pusher auto mode -2. ⚠️ **1-3 months old**: Test first, may need config changes -3. ❌ **6+ months old**: Use promtool instead -4. ❌ **Years old**: Definitely use promtool - -## Testing Your Limits - -To find your exact limits: - -```bash -# Generate test data for various ages -./generate-test-data.sh - -# Try importing and watch for errors -./prometheus-pusher -mode=auto -file=test-limits.csv 2>&1 | tee import.log - -# Check what failed -grep -i "error\|rejected\|failed" import.log -``` - -## Error Messages Guide - -| Error Message | Meaning | Solution | -|---------------|---------|----------| -| `sample is too old` | Beyond retention | Use promtool or increase retention | -| `sample is too far in the future` | Timestamp in future | Check your clock/timestamps | -| `out of order sample` | Older than existing data | Use different labels or enable OOO | -| `remote write receiver not enabled` | Feature not enabled | Enable --web.enable-remote-write-receiver | -| `sample timestamp out of order` | Wrong order in batch | Sort by timestamp | - -## Conclusion - -**Practical Limits for prometheus-pusher**: -- ✅
**Safe range**: Current to 1 month old -- β οΈ **Test first**: 1-3 months old -- β **Use promtool**: 3+ months old - -The 1 month limit we tested (and works!) is a safe, practical upper bound for most use cases. diff --git a/f3s/prometheus-pusher/QUERY_EXAMPLES.md b/f3s/prometheus-pusher/QUERY_EXAMPLES.md deleted file mode 100644 index 29b4c6d..0000000 --- a/f3s/prometheus-pusher/QUERY_EXAMPLES.md +++ /dev/null @@ -1,316 +0,0 @@ -# Prometheus Query Examples - Data Ingestion Verification - -This document shows actual curl commands and their outputs querying data ingested by prometheus-pusher. - -## Data Ingested - -We ingested the following metrics using realtime mode (Pushgateway): -- Counter: `app_requests_total` -- Gauges: `app_active_connections`, `app_temperature_celsius` -- Histogram: `app_request_duration_seconds` -- Labeled Counter: `app_jobs_processed_total` - ---- - -## Query Examples - -### Query 1: Counter Metric - Total Requests - -**Command:** -```bash -curl -s "http://localhost:9090/api/v1/query?query=app_requests_total" -``` - -**Output:** -```json -{ - "status": "success", - "data": { - "resultType": "vector", - "result": [ - { - "metric": { - "__name__": "app_requests_total", - "instance": "example-app", - "job": "example_metrics_pusher" - }, - "value": [1767127978.666, "4"] - } - ] - } -} -``` - -**Explanation:** Counter showing 4 total requests processed. The timestamp `1767127978.666` is Unix epoch time (seconds since 1970-01-01). - ---- - -### Query 2: Gauge Metric - Temperature - -**Command:** -```bash -curl -s "http://localhost:9090/api/v1/query?query=app_temperature_celsius" -``` - -**Output:** -```json -{ - "status": "success", - "data": { - "resultType": "vector", - "result": [ - { - "metric": { - "__name__": "app_temperature_celsius", - "instance": "example-app", - "job": "example_metrics_pusher" - }, - "value": [1767127980.789, "30.836861056300393"] - } - ] - } -} -``` - -**Explanation:** Gauge showing current temperature of 30.84Β°C. 
- ---- - -### Query 3: Gauge Metric - Active Connections - -**Command:** -```bash -curl -s "http://localhost:9090/api/v1/query?query=app_active_connections" -``` - -**Output:** -```json -{ - "status": "success", - "data": { - "resultType": "vector", - "result": [ - { - "metric": { - "__name__": "app_active_connections", - "instance": "example-app", - "job": "example_metrics_pusher" - }, - "value": [1767127982.964, "32"] - } - ] - } -} -``` - -**Explanation:** Gauge showing 32 currently active connections. - ---- - -### Query 4: Labeled Counter - Jobs Processed by Type and Status - -**Command:** -```bash -curl -s "http://localhost:9090/api/v1/query?query=app_jobs_processed_total" -``` - -**Output:** -```json -{ - "status": "success", - "data": { - "resultType": "vector", - "result": [ - { - "metric": { - "__name__": "app_jobs_processed_total", - "instance": "example-app", - "job": "example_metrics_pusher", - "job_type": "backup", - "status": "failed" - }, - "value": [1767127993.729, "3"] - }, - { - "metric": { - "__name__": "app_jobs_processed_total", - "instance": "example-app", - "job": "example_metrics_pusher", - "job_type": "email", - "status": "success" - }, - "value": [1767127993.729, "3"] - }, - { - "metric": { - "__name__": "app_jobs_processed_total", - "instance": "example-app", - "job": "example_metrics_pusher", - "job_type": "report", - "status": "success" - }, - "value": [1767127993.729, "1"] - } - ] - } -} -``` - -**Explanation:** Labeled counter with multiple time series showing job processing by type (backup, email, report) and status (success, failed). 
- ---- - -### Query 5: Histogram - Request Duration (Buckets) - -**Command:** -```bash -curl -s "http://localhost:9090/api/v1/query?query=app_request_duration_seconds_bucket" -``` - -**Output (truncated):** -```json -{ - "status": "success", - "data": { - "resultType": "vector", - "result": [ - { - "metric": { - "__name__": "app_request_duration_seconds_bucket", - "instance": "example-app", - "job": "example_metrics_pusher", - "le": "0.005" - }, - "value": [1767127997.104, "0"] - }, - { - "metric": { - "__name__": "app_request_duration_seconds_bucket", - "instance": "example-app", - "job": "example_metrics_pusher", - "le": "0.01" - }, - "value": [1767127997.104, "0"] - }, - { - "metric": { - "__name__": "app_request_duration_seconds_bucket", - "instance": "example-app", - "job": "example_metrics_pusher", - "le": "0.025" - }, - "value": [1767127997.104, "0"] - } - // ... more buckets: 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10, +Inf - ] - } -} -``` - -**Explanation:** Histogram buckets showing cumulative counts of request durations. Used for percentile calculations. 
- ---- - -### Query 6: Histogram - Sum and Count - -**Sum Command:** -```bash -curl -s "http://localhost:9090/api/v1/query?query=app_request_duration_seconds_sum" -``` - -**Sum Output:** -```json -{ - "status": "success", - "data": { - "resultType": "vector", - "result": [ - { - "metric": { - "__name__": "app_request_duration_seconds_sum", - "instance": "example-app", - "job": "example_metrics_pusher" - }, - "value": [1767128000.778, "2.4976701293337467"] - } - ] - } -} -``` - -**Count Command:** -```bash -curl -s "http://localhost:9090/api/v1/query?query=app_request_duration_seconds_count" -``` - -**Count Output:** -```json -{ - "status": "success", - "data": { - "resultType": "vector", - "result": [ - { - "metric": { - "__name__": "app_request_duration_seconds_count", - "instance": "example-app", - "job": "example_metrics_pusher" - }, - "value": [1767128000.832, "3"] - } - ] - } -} -``` - -**Explanation:** -- Sum: Total of all request durations = 2.498 seconds -- Count: Total number of requests = 3 -- Average duration = 2.498 / 3 = 0.833 seconds per request - ---- - -## Verification Summary - -✅
**All metric types successfully ingested:** -- Counter: `app_requests_total` = 4 -- Gauge: `app_temperature_celsius` = 30.84°C -- Gauge: `app_active_connections` = 32 -- Labeled Counter: `app_jobs_processed_total` (3 series with different labels) -- Histogram: `app_request_duration_seconds` (buckets, sum, count) - -✅
**Data queryable via Prometheus API** -✅
**Timestamps preserved correctly** -✅
**Labels attached properly** -✅
**All metric types working as expected** - ---- - -## Additional Query Examples - -### Filter by Specific Label -```bash -curl -s 'http://localhost:9090/api/v1/query?query=app_jobs_processed_total{job_type="email"}' -``` - -### Range Query (Last 10 Minutes) -```bash -START=$(date -d '10 minutes ago' +%s) -END=$(date +%s) -curl -s "http://localhost:9090/api/v1/query_range?query=app_requests_total&start=${START}&end=${END}&step=60" -``` - -### Calculate Rate (Requests per Second over 5m) -```bash -curl -s 'http://localhost:9090/api/v1/query?query=rate(app_requests_total[5m])' -``` - -### Sum Aggregation -```bash -curl -s 'http://localhost:9090/api/v1/query?query=sum(app_jobs_processed_total)' -``` - ---- - -Generated: 2025-12-30 -Tool: prometheus-pusher (refactored version with 63.9% test coverage) diff --git a/f3s/prometheus-pusher/QUICK-START.md b/f3s/prometheus-pusher/QUICK-START.md deleted file mode 100644 index 46b0a29..0000000 --- a/f3s/prometheus-pusher/QUICK-START.md +++ /dev/null @@ -1,105 +0,0 @@ -# Quick Start - Single Binary, All Features - -## One Binary: `prometheus-pusher` - -All features in one tool! Choose your mode: - -### π Realtime Mode (Default) -Push current metrics to Pushgateway - -```bash -./prometheus-pusher -mode=realtime -continuous -``` - -### β° Historic Mode -Push single datapoint from the past - -```bash -./prometheus-pusher -mode=historic -hours-ago=24 -``` - -### π¦ Backfill Mode -Import range of historic data - -```bash -./prometheus-pusher -mode=backfill -start-hours=48 -end-hours=0 -interval=1 -``` - -### π€ Auto Mode (Recommended!) -Automatically detect timestamp age and route correctly - -```bash -./prometheus-pusher -mode=auto -file=data.csv -``` - -## Quick Examples - -### Import Current, 1h, 1d, 1w, 1m Old Data (All at Once!) - -```bash -# 1. Generate test data -./generate-test-data.sh - -# 2. 
Port-forward services -kubectl port-forward -n monitoring svc/pushgateway 9091:9091 & -kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 & - -# 3. Auto-import everything -./prometheus-pusher -mode=auto -file=test-all-ages.csv - -# Output: -# Auto-ingest summary: -# Realtime samples (< 5min old): 3 -# Historic samples (> 5min old): 12 -# Ingesting REALTIME via Pushgateway... -# ⏰ Ingesting HISTORIC via Remote Write... -# [1/12] app_requests_total (age: 1.0 hours) -# [4/12] app_temperature_celsius (age: 1.0 days) -# [7/12] app_requests_total (age: 7.0 days) -# [10/12] app_requests_total (age: 30.0 days) -# Auto-ingest complete! -``` - -## Data Format (CSV) - -```csv -# metric_name,labels,value,timestamp_ms -app_requests_total,instance=web1;env=prod,100,1767125148000 -app_temperature_celsius,instance=web2,22.5,1767038748000 -``` - -## All Modes in One Command - -```bash -# See all options -./prometheus-pusher -help - -# Modes: -# realtime - Push current metrics to Pushgateway -# historic - Push single historic datapoint -# backfill - Backfill range of datapoints -# auto - Automatically detect and route -``` - -## Documentation - -- `ANSWER.md` - Can it import all time ranges? YES! -- `AUTO-MODE.md` - Complete auto mode guide -- `HISTORIC.md` - Historic data ingestion details -- `README.md` - Project overview -- `USAGE.md` - Detailed usage guide - -## Summary - -✅
**One binary** - No confusion -✅
**Four modes** - All use cases covered -✅
**Auto detection** - No manual timestamp calculation -✅
**All time ranges** - Current to 1 month old -✅
**Clear logging** - See exactly what's happening - -Just run: -```bash -./prometheus-pusher -mode=auto -file=your-data.csv -``` - -Done! π diff --git a/f3s/prometheus-pusher/README.md b/f3s/prometheus-pusher/README.md index d107299..415cdde 100644 --- a/f3s/prometheus-pusher/README.md +++ b/f3s/prometheus-pusher/README.md @@ -1,151 +1,557 @@ # Prometheus Pusher -A standalone Go binary that pushes metrics to Prometheus via Pushgateway. +A versatile Go tool for pushing metrics to Prometheus with support for both realtime and historic data ingestion. + +## Overview + +**prometheus-pusher** is a standalone binary that: +- **Generates** realistic example metrics simulating production applications +- **Pushes** metrics via Pushgateway (realtime) or Remote Write API (historic) +- **Automatically detects** timestamp age and chooses the optimal ingestion method +- **Supports** multiple data formats (CSV, JSON) and all Prometheus metric types +- **Provides** Grafana dashboard for visualizing test metrics ## Quick Start +### 1. Deploy Pushgateway (one-time setup) + ```bash -# 1. Deploy Pushgateway (one-time - see /home/paul/git/conf/f3s/pushgateway/) cd /home/paul/git/conf/f3s/pushgateway/helm-chart -helm upgrade --install pushgateway . -n monitoring +helm upgrade --install pushgateway . -n monitoring --create-namespace +``` -# 2. Run the binary +### 2. Run in Realtime Mode + +```bash +# Port-forward Pushgateway +kubectl port-forward -n monitoring svc/pushgateway 9091:9091 & + +# Push test metrics continuously cd /home/paul/git/conf/f3s/prometheus-pusher -./run.sh +./prometheus-pusher -mode=realtime -continuous ``` -That's it! The binary will push metrics every 15 seconds. Press Ctrl+C to stop. +The binary pushes metrics every 15 seconds. Press Ctrl+C to stop. -## Overview +### 3. View Metrics -This project consists of: -1. **Pushgateway** - A Kubernetes service that receives pushed metrics -2. 
**prometheus-pusher** - A standalone Go binary that generates and pushes example metrics +```bash +# Pushgateway UI +open http://localhost:9091 -## Metric Types Demonstrated +# Prometheus UI +kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 & +open http://localhost:9090 +``` -The application pushes the following types of metrics: +## Operating Modes -### Counter (`app_requests_total`) -- Monotonically increasing value -- Example: Total number of requests processed -- Use case: Counting events, total requests, errors, etc. +### π Realtime Mode (Default) +Push current metrics to Pushgateway with "now" timestamp. -### Gauge (`app_active_connections`, `app_temperature_celsius`) -- Value that can increase or decrease -- Examples: Active connections, temperature, memory usage -- Use case: Current state measurements +```bash +./prometheus-pusher -mode=realtime -continuous +``` -### Histogram (`app_request_duration_seconds`) -- Samples observations and counts them in configurable buckets -- Example: Request duration distribution -- Use case: Latency measurements, response times +**Options:** +- `-pushgateway` - Pushgateway URL (default: http://localhost:9091) +- `-job` - Job name (default: example_metrics_pusher) +- `-continuous` - Keep pushing every 15 seconds -### Counter with Labels (`app_jobs_processed_total`) -- Counter with dimensional labels -- Labels: `job_type` (email, report, backup), `status` (success, failed) -- Use case: Categorized counting +### β° Historic Mode +Push a single datapoint from the past using Remote Write API. 
-## Project Structure +```bash +# Port-forward Prometheus +kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 & +# Push data from 24 hours ago +./prometheus-pusher -mode=historic -hours-ago=24 ``` -prometheus-pusher/ -βββ main.go # Go source code -βββ go.mod / go.sum # Go dependencies -βββ prometheus-pusher # Compiled binary (standalone executable) -βββ run.sh # Helper script to run the binary -βββ example-metrics.txt # Example of metrics format -βββ USAGE.md # Detailed usage guide -βββ README.md # This file -Note: Pushgateway Helm chart is located at /home/paul/git/conf/f3s/pushgateway/ +**Options:** +- `-prometheus` - Prometheus URL (default: http://localhost:9090/api/v1/write) +- `-hours-ago` - Hours in the past (default: 24) + +### π¦ Backfill Mode +Import a range of historic data points. + +```bash +# Backfill last 48 hours with 1-hour intervals +./prometheus-pusher -mode=backfill -start-hours=48 -end-hours=0 -interval=1 + +# Backfill last week with 6-hour intervals +./prometheus-pusher -mode=backfill -start-hours=168 -end-hours=0 -interval=6 +``` + +**Options:** +- `-start-hours` - Start time in hours ago +- `-end-hours` - End time in hours ago (0 = now) +- `-interval` - Interval between points in hours + +### π€ Auto Mode (Recommended!) +Automatically detect timestamp age and route to the correct ingestion method. 
+ +```bash +# Generate test data +./generate-test-data.sh + +# Import mixed current and historic data +./prometheus-pusher -mode=auto -file=test-all-ages.csv ``` -## What It Does +**Detection Logic:** +- Data < 5 minutes old → Pushgateway (realtime) +- Data ≥ 5 minutes old → Remote Write (historic) + +**Options:** +- `-file` - Input file path +- `-format` - Data format: csv or json (default: csv) +- `-pushgateway` - Pushgateway URL +- `-prometheus` - Prometheus Remote Write URL + +## Data Formats + +### CSV Format + +```csv +# Format: metric_name,labels,value,timestamp_ms +# Labels: key1=value1;key2=value2 +prometheus_pusher_test_requests_total,instance=web1;env=prod,100,1767125148000 +prometheus_pusher_test_temperature_celsius,instance=web2,22.5,1767038748000 + +# Timestamp is optional (uses "now" if omitted) +prometheus_pusher_test_active_connections,instance=web3,42, +``` + +### JSON Format + +```json +[ + { + "metric": "prometheus_pusher_test_requests_total", + "labels": {"instance": "web1", "env": "prod"}, + "value": 100, + "timestamp_ms": 1767125148000 + }, + { + "metric": "prometheus_pusher_test_temperature_celsius", + "labels": {"instance": "web2"}, + "value": 22.5, + "timestamp_ms": 1767038748000 + } +] +``` + +## Test Metrics + +All generated metrics use the `prometheus_pusher_test_` prefix to clearly identify them as test data. 
+ +### Counter: `prometheus_pusher_test_requests_total` +- **Type:** Counter (monotonically increasing) +- **Description:** Total number of requests processed +- **Use case:** Counting total events, requests, errors + +### Gauge: `prometheus_pusher_test_active_connections` +- **Type:** Gauge (can increase or decrease) +- **Description:** Current number of active connections (0-100) +- **Use case:** Current state measurements, capacity -The `prometheus-pusher` binary: -- **Generates** realistic example metrics simulating a production application -- **Pushes** metrics to Pushgateway every 15 seconds using HTTP POST -- **Demonstrates** all major Prometheus metric types with practical examples +### Gauge: `prometheus_pusher_test_temperature_celsius` +- **Type:** Gauge +- **Description:** Current temperature in Celsius (0-50°C) +- **Use case:** Environmental monitoring -The metrics flow: `Go Binary → Pushgateway → Prometheus → Grafana` +### Histogram: `prometheus_pusher_test_request_duration_seconds` +- **Type:** Histogram (distribution) +- **Description:** Request duration distribution +- **Buckets:** 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10 seconds +- **Use case:** Latency measurements, SLO tracking -## Example Metrics Format +### Labeled Counter: `prometheus_pusher_test_jobs_processed_total` +- **Type:** Counter with labels +- **Description:** Jobs processed by type and status +- **Labels:** + - `job_type`: email, report, backup + - `status`: success, failed +- **Use case:** Categorized counting, multi-dimensional metrics -The pusher sends metrics in Prometheus format to the Pushgateway. Here's what the data looks like: +## Grafana Dashboard +A comprehensive dashboard is available showcasing all test metrics. + +### Dashboard Features + +- **8 Panels:** + 1. Request Rate (line graph) + 2. Total Requests (stat panel) + 3. Active Connections (gauge with thresholds) + 4. Temperature (gauge with thresholds) + 5. 
Request Duration Histogram (p50, p90, p99) + 6. Average Request Duration (stat) + 7. Jobs Processed by Type (bar gauge) + 8. Jobs Status Breakdown (table) + +- **Auto-refresh:** Every 10 seconds +- **Time range:** Last 15 minutes (customizable) +- **Dark theme optimized** + +### Deploy Dashboard + +#### Option 1: Helm/Kubernetes ConfigMap (Recommended) + +```bash +# Deploy via Kubernetes ConfigMap +kubectl apply -f ../prometheus/prometheus-pusher-dashboard.yaml ``` -# HELP app_requests_total Total number of requests processed -# TYPE app_requests_total counter -app_requests_total{instance="example-app",job="example_metrics_pusher"} 42 -# HELP app_active_connections Number of currently active connections -# TYPE app_active_connections gauge -app_active_connections{instance="example-app",job="example_metrics_pusher"} 67 +The dashboard will be automatically discovered by Grafana. -# HELP app_temperature_celsius Current temperature in Celsius -# TYPE app_temperature_celsius gauge -app_temperature_celsius{instance="example-app",job="example_metrics_pusher"} 23.5 +#### Option 2: Manual Import -# HELP app_request_duration_seconds Histogram of request duration in seconds -# TYPE app_request_duration_seconds histogram -app_request_duration_seconds_bucket{instance="example-app",job="example_metrics_pusher",le="0.005"} 2 -app_request_duration_seconds_bucket{instance="example-app",job="example_metrics_pusher",le="0.01"} 3 -app_request_duration_seconds_bucket{instance="example-app",job="example_metrics_pusher",le="+Inf"} 10 -app_request_duration_seconds_sum{instance="example-app",job="example_metrics_pusher"} 8.5 -app_request_duration_seconds_count{instance="example-app",job="example_metrics_pusher"} 10 +```bash +# Port-forward Grafana +kubectl port-forward -n monitoring svc/prometheus-grafana 3000:80 -# HELP app_jobs_processed_total Total number of jobs processed by type -# TYPE app_jobs_processed_total counter 
-app_jobs_processed_total{instance="example-app",job="example_metrics_pusher",job_type="email",status="success"} 15 -app_jobs_processed_total{instance="example-app",job="example_metrics_pusher",job_type="email",status="failed"} 2 -app_jobs_processed_total{instance="example-app",job="example_metrics_pusher",job_type="report",status="success"} 8 -app_jobs_processed_total{instance="example-app",job="example_metrics_pusher",job_type="backup",status="success"} 12 +# Open Grafana +open http://localhost:3000 + +# Go to Dashboards β Import β Upload grafana-dashboard.json ``` -## Querying Metrics in Prometheus +#### Option 3: Automated Script -Once configured, you can query these metrics in Prometheus: +```bash +# Deploy via API +./deploy-dashboard.sh + +# Or with custom credentials +GRAFANA_URL="http://localhost:3000" \ +GRAFANA_USER="admin" \ +GRAFANA_PASSWORD="yourpassword" \ +./deploy-dashboard.sh +``` + +## Example Queries + +### Basic Queries ```promql -# View request rate -rate(app_requests_total[5m]) +# View total requests +prometheus_pusher_test_requests_total + +# View request rate over last 5 minutes +rate(prometheus_pusher_test_requests_total[5m]) # View current active connections -app_active_connections +prometheus_pusher_test_active_connections + +# View current temperature +prometheus_pusher_test_temperature_celsius +``` + +### Histogram Queries + +```promql +# 95th percentile request duration +histogram_quantile(0.95, rate(prometheus_pusher_test_request_duration_seconds_bucket[5m])) + +# 50th percentile (median) +histogram_quantile(0.50, rate(prometheus_pusher_test_request_duration_seconds_bucket[5m])) + +# Average request duration +rate(prometheus_pusher_test_request_duration_seconds_sum[5m]) / +rate(prometheus_pusher_test_request_duration_seconds_count[5m]) +``` + +### Labeled Counter Queries + +```promql +# Failed jobs by type +prometheus_pusher_test_jobs_processed_total{status="failed"} + +# Job success rate 
+rate(prometheus_pusher_test_jobs_processed_total{status="success"}[5m]) / +rate(prometheus_pusher_test_jobs_processed_total[5m]) + +# Total jobs by type +sum by (job_type) (prometheus_pusher_test_jobs_processed_total) +``` + +### Curl Examples + +```bash +# Port-forward Prometheus +kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 & + +# Query total requests +curl -s "http://localhost:9090/api/v1/query?query=prometheus_pusher_test_requests_total" | jq . + +# Query temperature +curl -s "http://localhost:9090/api/v1/query?query=prometheus_pusher_test_temperature_celsius" | jq . + +# Query request rate +curl -s "http://localhost:9090/api/v1/query?query=rate(prometheus_pusher_test_requests_total[5m])" | jq . + +# Query histogram p95 +curl -s "http://localhost:9090/api/v1/query?query=histogram_quantile(0.95,rate(prometheus_pusher_test_request_duration_seconds_bucket[5m]))" | jq . +``` + +## Time Range Limitations + +### ✅
Supported Time Ranges -# View 95th percentile request duration -histogram_quantile(0.95, rate(app_request_duration_seconds_bucket[5m])) +| Time Range | Status | Method | +|------------|--------|--------| +| Current (< 5 min) | ✅
Works | Pushgateway | +| 1 hour old | ✅
Works | Remote Write | +| 1 day old | ✅
Works | Remote Write | +| 1 week old | ✅
Works | Remote Write | +| 1 month old | ✅
Works | Remote Write | -# View failed jobs by type -app_jobs_processed_total{status="failed"} +### β οΈ Potential Issues -# View job success rate -rate(app_jobs_processed_total{status="success"}[5m]) / rate(app_jobs_processed_total[5m]) +- **Future timestamps:** Rejected (> 5 minutes in future) +- **Very old data (6+ months):** May be rejected depending on Prometheus retention +- **Years old:** Likely rejected - use `promtool tsdb create-blocks-from` instead +- **Out-of-order samples:** Can't insert older data into existing time series (use different labels) + +### Prometheus Configuration + +Check your retention settings: + +```bash +# View retention +kubectl get prometheus -n monitoring prometheus-kube-prometheus-prometheus \ + -o jsonpath='{.spec.retention}' + +# Default is typically 15 days +``` + +For very old data: +- Increase retention in Prometheus config +- Enable out-of-order ingestion (experimental) +- Use `promtool` for direct TSDB block creation + +## Project Structure + +``` +prometheus-pusher/ +βββ cmd/ +β βββ prometheus-pusher/ +β βββ main.go # Main entry point +βββ internal/ +β βββ config/ # Configuration +β βββ metrics/ # Metric generators +β βββ parser/ # CSV/JSON parsers +β βββ ingester/ # Pushgateway & Remote Write ingesters +βββ prometheus-pusher # Compiled binary +βββ grafana-dashboard.json # Grafana dashboard definition +βββ deploy-dashboard.sh # Dashboard deployment script +βββ generate-test-data.sh # Test data generator +βββ run.sh # Helper script +βββ README.md # This file +``` + +## Setup Requirements + +### 1. Enable Prometheus Remote Write Receiver + +For historic data ingestion, Prometheus needs the remote write receiver enabled: + +```yaml +# In prometheus/persistence-values.yaml +prometheus: + prometheusSpec: + enableFeatures: + - remote-write-receiver +``` + +### 2. 
Update Prometheus Scrape Config + +Ensure Pushgateway is in scrape targets: + +```yaml +# additional-scrape-configs.yaml +- job_name: 'pushgateway' + honor_labels: true + static_configs: + - targets: + - 'pushgateway.monitoring.svc.cluster.local:9091' +``` + +Apply the configuration: + +```bash +kubectl create secret generic additional-scrape-configs \ + --from-file=/home/paul/git/conf/f3s/prometheus/additional-scrape-configs.yaml \ + --dry-run=client -o yaml -n monitoring | kubectl apply -f - +``` + +## Building from Source + +```bash +# Build binary +go build -o prometheus-pusher cmd/prometheus-pusher/main.go + +# Run tests +go test ./... -v + +# Check test coverage +go test ./... -cover +``` + +## Troubleshooting + +### Binary can't connect to Pushgateway + +```bash +# Check port-forward is running +ps aux | grep "port-forward.*9091" + +# Restart port-forward +kubectl port-forward -n monitoring svc/pushgateway 9091:9091 +``` + +### Metrics not appearing in Prometheus + +```bash +# Check Pushgateway has metrics +curl http://localhost:9091/metrics | grep "prometheus_pusher_test" + +# Check Prometheus scrape targets +# Open http://localhost:9090/targets - look for "pushgateway" job + +# Check Prometheus logs +kubectl logs -n monitoring -l app.kubernetes.io/name=prometheus +``` + +### "Remote write receiver not enabled" error + +```bash +# Verify feature is enabled +kubectl logs -n monitoring prometheus-prometheus-kube-prometheus-prometheus-0 | grep "remote-write-receiver" + +# Should see: msg="Experimental features enabled" features=[remote-write-receiver] ``` -## Configuration +### "Out of order sample" error -The pusher is configured to: -- Push metrics every 15 seconds -- Use job name: `example_metrics_pusher` -- Use instance label: `example-app` -- Connect to Pushgateway at: `http://pushgateway.monitoring.svc.cluster.local:9091` +This occurs when trying to insert data older than existing data for the same time series. 
-## How It Works +**Solutions:** +- Use different job labels for historic data (e.g., `job="historic_data"`) +- Enable out-of-order ingestion in Prometheus (experimental) +- Ensure backfill goes from oldest to newest -1. The Go application generates random example metrics simulating a real application -2. Metrics are pushed to the Pushgateway via HTTP POST -3. Prometheus scrapes the Pushgateway periodically -4. Metrics become available in Prometheus for querying and alerting -5. Grafana can visualize these metrics +### Dashboard not appearing in Grafana + +```bash +# Check ConfigMap exists +kubectl get configmap -n monitoring | grep prometheus-pusher + +# Check labels +kubectl get configmap prometheus-pusher-dashboard -n monitoring -o yaml | grep "grafana_dashboard" + +# Restart Grafana to force reload +kubectl rollout restart deployment/prometheus-grafana -n monitoring +``` + +## Architecture + +``` +βββββββββββββββββββ +β Go Binary β +β (prometheus- βββPush realtimeβββ +β pusher) β β +βββββββββββββββββββ βΌ + β ββββββββββββββββββββ + β β Pushgateway ββββScrapeβββ + β β (Port 9091) β β + β ββββββββββββββββββββ β + β β + βββPush historicβββββββββββββββββββ β + βΌ β + βββββββββββββββββββ β + β Prometheus βββββββ + β (Port 9090) β + β Remote Write APIβ + βββββββββββββββββββ + β + β Datasource + βΌ + βββββββββββββββββββ + β Grafana β + β (Port 3000) β + β Dashboards β + βββββββββββββββββββ +``` ## Best Practices -- Use Pushgateway for batch jobs, short-lived processes, or service-level metrics -- For long-running applications, prefer exposing a `/metrics` endpoint for Prometheus to scrape -- Include meaningful labels but avoid high-cardinality labels (e.g., user IDs, timestamps) -- Use appropriate metric types: - - Counter for cumulative values - - Gauge for point-in-time values - - Histogram/Summary for distributions +### When to Use Pushgateway vs. 
Remote Write + +**Use Pushgateway (realtime mode):** +- Short-lived batch jobs +- Service-level metrics +- Jobs behind firewalls +- Current/recent data (< 5 minutes old) + +**Use Remote Write (historic mode):** +- Historic data import +- Backfilling gaps +- Data migration +- Data older than 5 minutes + +**Use Auto Mode:** +- Mixed current and historic data +- Importing from files +- Unknown timestamp ages +- General-purpose ingestion + +### Metric Design + +- **Use appropriate metric types:** + - Counter for cumulative values (requests, errors) + - Gauge for point-in-time values (temperature, connections) + - Histogram for distributions (latency, sizes) + +- **Label cardinality:** + - Include meaningful labels + - Avoid high-cardinality labels (user IDs, timestamps) + - Keep label combinations reasonable (< 1000 per metric) + +- **Naming conventions:** + - Use descriptive names + - Include units in gauge names (\_celsius, \_bytes) + - Use \_total suffix for counters + +## Cleanup + +```bash +# Stop port-forwards +pkill -f "port-forward.*9091" +pkill -f "port-forward.*9090" +pkill -f "port-forward.*3000" + +# Delete test metrics from Pushgateway +curl -X DELETE http://localhost:9091/metrics/job/example_metrics_pusher + +# Uninstall Pushgateway (if needed) +helm uninstall pushgateway -n monitoring +``` + +## Additional Resources + +- [Prometheus Documentation](https://prometheus.io/docs/) +- [Pushgateway Documentation](https://github.com/prometheus/pushgateway) +- [Prometheus Remote Write Spec](https://prometheus.io/docs/concepts/remote_write_spec/) +- [Grafana Documentation](https://grafana.com/docs/) + +## Version + +Current version: 0.0.0 + +## License + +See LICENSE file for details. diff --git a/f3s/prometheus-pusher/SETUP-COMPLETE.md b/f3s/prometheus-pusher/SETUP-COMPLETE.md deleted file mode 100644 index c4d6430..0000000 --- a/f3s/prometheus-pusher/SETUP-COMPLETE.md +++ /dev/null @@ -1,275 +0,0 @@ -# Historic Data Ingestion - Setup Complete - -## β
What Was Done - -### 1. Extended prometheus-pusher for Historic Data - -The application now supports three modes: - -**Mode 1: Realtime** (Original behavior) -- Pushes current metrics to Pushgateway -- Prometheus scrapes with current timestamp -- Use for ongoing monitoring - -**Mode 2: Historic** (NEW) -- Push single datapoint with custom timestamp -- Specify hours ago (e.g., 24 = yesterday) -- Uses Prometheus Remote Write API - -**Mode 3: Backfill** (NEW) -- Backfill range of historic data -- Specify start, end, and interval -- Batch ingestion for large datasets - -### 2. Code Structure - -``` -prometheus-pusher/ -βββ main.go # Main entry point with mode selection -βββ realtime.go # Original Pushgateway functionality -βββ historic.go # NEW: Remote Write with timestamps -βββ prometheus-pusher # NEW: Binary with all modes -βββ HISTORIC.md # Complete documentation -``` - -### 3. Technical Implementation - -**Remote Write Protocol**: -- Format: Protobuf (prompb.WriteRequest) -- Encoding: Snappy compression -- Headers: X-Prometheus-Remote-Write-Version: 0.1.0 -- Endpoint: /api/v1/write - -**Key Insight**: Pushgateway doesn't support timestamps, but Remote Write does! - -### 4. Prometheus Configuration Update - -Updated `/home/paul/git/conf/f3s/prometheus/persistence-values.yaml`: - -```yaml -prometheus: - prometheusSpec: - additionalArgs: - - name: web.enable-remote-write-receiver - value: "true" -``` - -This enables Prometheus to accept historic data via Remote Write API. - -## β οΈ Pending: Cluster Issue - -The Kubernetes cluster became unreachable during the final step. Once the cluster is back: - -### Complete the Setup - -```bash -# 1. Apply the Prometheus configuration -cd /home/paul/git/conf/f3s/prometheus -helm upgrade prometheus prometheus-community/kube-prometheus-stack \ - -n monitoring \ - -f persistence-values.yaml - -# 2. 
Wait for Prometheus to restart -kubectl rollout status statefulset/prometheus-prometheus-kube-prometheus-prometheus \ - -n monitoring --timeout=120s - -# 3. Verify remote write receiver is enabled -kubectl logs -n monitoring prometheus-prometheus-kube-prometheus-prometheus-0 \ - | grep "enable-remote-write-receiver" - -# Should see: level=INFO msg="Starting Prometheus" ... web.enable-remote-write-receiver=true -``` - -## 🧪 Testing Historic Data Ingestion - -Once the cluster is back and configured: - -### Test 1: Single Historic Datapoint - -```bash -cd /home/paul/git/conf/f3s/prometheus-pusher - -# Port-forward Prometheus -kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 & - -# Push data from 24 hours ago -./prometheus-pusher \ - -mode=historic \ - -hours-ago=24 \ - -prometheus=http://localhost:9090/api/v1/write - -# Expected output: -# Successfully pushed historic data for 24 hours ago -``` - -### Test 2: Query Historic Data - -```bash -# Query the historic data -curl -s 'http://localhost:9090/api/v1/query?query=app_requests_total{job="historic_data"}' \ - | python3 -m json.tool - -# Should see data with timestamp from 24 hours ago -``` - -### Test 3: Backfill Multiple Datapoints - -```bash -# Backfill last 48 hours with 2-hour intervals -./prometheus-pusher \ - -mode=backfill \ - -start-hours=48 \ - -end-hours=0 \ - -interval=2 \ - -prometheus=http://localhost:9090/api/v1/write - -# Expected output: -# Starting backfill from 48 hours ago to 0 hours ago (interval: 2 hours) -# Successfully pushed historic data for 48 hours ago... -# ... 
-# Backfill complete: 25 successful, 0 errors -``` - -### Test 4: Visualize in Prometheus UI - -```bash -# Port-forward if not already done -kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 - -# Open http://localhost:9090 -# Query: {job="historic_data"} -# Switch to Graph view to see historic data timeline -``` - -## Example Queries - -Once historic data is ingested: - -```promql -# All historic data -{job="historic_data"} - -# Compare realtime vs historic -app_requests_total{job="example_metrics_pusher"} # current -app_requests_total{job="historic_data"} # historic - -# View specific metric from past -app_temperature_celsius{job="historic_data"} - -# Rate calculation over historic data -rate(app_requests_total{job="historic_data"}[5m]) - -# Histogram percentiles from historic data -histogram_quantile(0.95, - rate(app_request_duration_seconds_bucket{job="historic_data"}[5m])) -``` - -## Use Cases - -Now you can: - -1. **Backfill missing data** during outages -2. **Import historic data** from other systems -3. **Test with specific timestamps** for debugging -4. **Migrate data** from legacy monitoring systems -5. 
**Generate sample data** for demonstrations - -## Command Reference - -### Realtime Mode -```bash -# Single push (original behavior) -./prometheus-pusher -mode=realtime - -# Continuous pushing every 15s -./prometheus-pusher -mode=realtime -continuous - -# Custom Pushgateway URL -./prometheus-pusher \ - -mode=realtime \ - -pushgateway=http://custom-pushgateway:9091 \ - -job=my_app -``` - -### Historic Mode -```bash -# Yesterday's data -./prometheus-pusher -mode=historic -hours-ago=24 - -# 3 hours ago -./prometheus-pusher -mode=historic -hours-ago=3 - -# Last week -./prometheus-pusher -mode=historic -hours-ago=168 - -# Custom Prometheus URL -./prometheus-pusher \ - -mode=historic \ - -hours-ago=24 \ - -prometheus=http://custom-prometheus:9090/api/v1/write -``` - -### Backfill Mode -```bash -# Last 24 hours, hourly -./prometheus-pusher -mode=backfill -start-hours=24 -end-hours=0 -interval=1 - -# Last week, every 6 hours -./prometheus-pusher -mode=backfill -start-hours=168 -end-hours=0 -interval=6 - -# Specific range (48h ago to 24h ago, every 2h) -./prometheus-pusher -mode=backfill -start-hours=48 -end-hours=24 -interval=2 -``` - -## Documentation - -- **HISTORIC.md**: Complete guide to historic data ingestion -- **USAGE.md**: Original realtime mode documentation -- **README.md**: Project overview -- **SUMMARY.md**: Technical architecture - -## Troubleshooting - -### "remote write receiver not enabled" -``` -Error: remote write failed with status 404: - remote write receiver needs to be enabled -``` - -**Solution**: Complete the "Pending: Cluster Issue" steps above to enable the feature. - -### "out of order sample" -``` -Error: sample timestamp out of order -``` - -**Causes**: -1. Trying to insert data older than existing data for that series -2. Backfilling in wrong order (newest to oldest) - -**Solutions**: -1. Use different job label (already done: `job="historic_data"`) -2. Backfill from oldest to newest (already implemented) -3. 
Delete existing series first if needed - -### "sample too old" -``` -Error: sample is too old -``` - -**Limitation**: Prometheus has limits on how old data can be (typically a few days). - -**Solution**: For very old data (weeks/months), consider using `promtool tsdb create-blocks-from` to write TSDB blocks directly. - -## Summary - -You now have a complete solution for: -- ✅ Realtime metrics (Pushgateway) -- ✅ Historic data ingestion (Remote Write) -- ✅ Batch backfilling (automated range) -- ✅ Flexible timestamp control -- ✅
All Prometheus metric types supported - -All code is committed and pushed to git! - -**Next Step**: Once cluster is back, run the setup commands above and start testing! diff --git a/f3s/prometheus-pusher/SUMMARY.md b/f3s/prometheus-pusher/SUMMARY.md deleted file mode 100644 index d71138c..0000000 --- a/f3s/prometheus-pusher/SUMMARY.md +++ /dev/null @@ -1,215 +0,0 @@ -# Prometheus Data Ingestion - Summary - -## What Was Created - -A complete Prometheus data ingestion solution consisting of: - -### 1. **Standalone Go Binary** (`prometheus-pusher`) -- **Size**: ~12MB standalone executable -- **Language**: Go 1.21 -- **Dependencies**: Prometheus client library -- **Function**: Generates and pushes metrics to Pushgateway every 15 seconds - -### 2. **Pushgateway Deployment** -- **Type**: Kubernetes deployment in `monitoring` namespace -- **Image**: `prom/pushgateway:v1.10.0` -- **Port**: 9091 -- **Function**: Receives metrics from the Go binary and exposes them for Prometheus to scrape - -### 3. **Prometheus Configuration** -- Updated `/home/paul/git/conf/f3s/prometheus/additional-scrape-configs.yaml` -- Added Pushgateway as a scrape target -- Prometheus automatically scrapes Pushgateway every 15-30 seconds - -## Data Format - -The binary pushes metrics in **Prometheus text format** via HTTP POST to the Pushgateway. This is the standard format for all Prometheus metrics. - -Example: -``` -# HELP app_requests_total Total number of requests processed -# TYPE app_requests_total counter -app_requests_total{instance="example-app",job="example_metrics_pusher"} 42 -``` - -## Metric Types Demonstrated - -### 1. **Counter**: `app_requests_total` -- Monotonically increasing value -- Best for: Total requests, errors, events - -### 2. **Gauge**: `app_active_connections`, `app_temperature_celsius` -- Value that can increase or decrease -- Best for: Current state (connections, temperature, memory) - -### 3. 
**Histogram**: `app_request_duration_seconds` -- Distribution of values in buckets -- Best for: Latency, response times, sizes -- Automatically provides percentile calculations - -### 4. **Counter with Labels**: `app_jobs_processed_total` -- Counter with multiple dimensions -- Labels: `job_type` (email, report, backup), `status` (success, failed) -- Best for: Categorized counting - -## Why This Format? - -The Prometheus text format was chosen because: - -1. **Standard**: Universal format understood by all Prometheus components -2. **Human-readable**: Easy to debug and understand -3. **Efficient**: Compact representation -4. **Type-safe**: Explicit metric types prevent errors -5. **Labeled**: Supports multi-dimensional data - -## How to Use - -### Quick Start -```bash -cd /home/paul/git/conf/f3s/prometheus-pusher -./run.sh -``` - -### Manual Operation -```bash -# 1. Port-forward Pushgateway -kubectl port-forward -n monitoring svc/pushgateway 9091:9091 - -# 2. Run binary (in another terminal) -./prometheus-pusher -``` - -### Query Metrics -```bash -# Port-forward Prometheus -kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 - -# Open http://localhost:9090 and query: -app_requests_total -app_active_connections -rate(app_requests_total[5m]) -histogram_quantile(0.95, rate(app_request_duration_seconds_bucket[5m])) -``` - -## Example Data - -See `example-metrics.txt` for a complete example of all metric types with sample values. - -## Testing Results - -β
**Binary compilation**: Success (12MB executable) -✅ **Pushgateway deployment**: Running in monitoring namespace -✅ **Metrics push**: Successfully pushing every 15 seconds -✅ **Prometheus scraping**: Confirmed metrics visible in Prometheus -✅
**Query testing**: All metric types queryable - -### Sample Query Results - -```json -{ - "status": "success", - "data": { - "result": [{ - "metric": { - "__name__": "app_requests_total", - "instance": "example-app", - "job": "example_metrics_pusher" - }, - "value": [1767121623.654, "10"] - }] - } -} -``` - -## Architecture - -``` -ββββββββββββββββββββββββ -β prometheus-pusher β Standalone Go binary -β (your machine) β - Generates metrics -ββββββββββββ¬ββββββββββββ - Pushes via HTTP POST - β - β HTTP POST - β :9091/metrics/job/<jobname> - βΌ -ββββββββββββββββββββββββ -β Pushgateway β Kubernetes pod -β (monitoring ns) β - Receives pushed metrics -ββββββββββββ¬ββββββββββββ - Exposes /metrics endpoint - β - β HTTP GET (scrape) - β Every 15-30s - βΌ -ββββββββββββββββββββββββ -β Prometheus β Kubernetes pod -β (monitoring ns) β - Scrapes Pushgateway -ββββββββββββ¬ββββββββββββ - Stores time-series data - β - β HTTP API - β PromQL queries - βΌ -ββββββββββββββββββββββββ -β Grafana / Users β Visualization & Alerts -β β - Query metrics -ββββββββββββββββββββββββ - Create dashboards -``` - -## Files Created - -``` -/home/paul/git/conf/f3s/prometheus-pusher/ -βββ main.go # Go source code -βββ go.mod, go.sum # Go dependencies -βββ prometheus-pusher # Compiled binary (12MB) -βββ run.sh # Helper script -βββ README.md # Project overview -βββ USAGE.md # Detailed usage guide -βββ SUMMARY.md # This file -βββ example-metrics.txt # Example metrics format -βββ Dockerfile # Docker build (optional) - -/home/paul/git/conf/f3s/pushgateway/ -βββ helm-chart/ # Kubernetes deployment - βββ Chart.yaml # Helm chart metadata - βββ values.yaml # Configuration values - βββ README.md # Chart documentation - βββ templates/ - βββ deployment.yaml # Pushgateway pod - βββ service.yaml # Pushgateway service - -/home/paul/git/conf/f3s/prometheus/ -βββ additional-scrape-configs.yaml # Prometheus config (updated) -``` - -## Next Steps - -### For Production Use - -1. 
**Modify metrics** in `main.go` to track your actual application data -2. **Adjust push interval** (currently 15 seconds) -3. **Add authentication** if Pushgateway is exposed externally -4. **Set up Grafana dashboards** to visualize the metrics -5. **Configure alerts** in Prometheus for critical thresholds - -### For Learning - -1. Experiment with different metric types -2. Try querying with different PromQL expressions -3. Create Grafana dashboards -4. Set up alerting rules -5. Compare Pushgateway approach vs. direct scraping - -## Key Concepts - -- **Push vs. Pull**: Pushgateway allows pushing metrics (vs. Prometheus scraping) -- **Labels**: Enable multi-dimensional metrics -- **Metric Types**: Different types for different use cases -- **Aggregation**: Histograms automatically calculate percentiles -- **Time Series**: Prometheus stores timestamped values - -## References - -- Prometheus text format: https://prometheus.io/docs/instrumenting/exposition_formats/ -- Prometheus client library: https://github.com/prometheus/client_golang -- Pushgateway: https://github.com/prometheus/pushgateway -- Metric types: https://prometheus.io/docs/concepts/metric_types/ diff --git a/f3s/prometheus-pusher/USAGE.md b/f3s/prometheus-pusher/USAGE.md deleted file mode 100644 index be5837f..0000000 --- a/f3s/prometheus-pusher/USAGE.md +++ /dev/null @@ -1,231 +0,0 @@ -# Prometheus Pusher - Usage Guide - -## Quick Start - -### 1. Deploy Pushgateway (One-time setup) - -```bash -cd /home/paul/git/conf/f3s/pushgateway/helm-chart -helm upgrade --install pushgateway . -n monitoring --create-namespace -``` - -### 2. 
Update Prometheus Configuration (One-time setup) - -The Prometheus scrape configuration has already been updated in `/home/paul/git/conf/f3s/prometheus/additional-scrape-configs.yaml` to include: - -```yaml -- job_name: 'pushgateway' - honor_labels: true - static_configs: - - targets: - - 'pushgateway.monitoring.svc.cluster.local:9091' -``` - -Apply it: -```bash -kubectl create secret generic additional-scrape-configs \ - --from-file=/home/paul/git/conf/f3s/prometheus/additional-scrape-configs.yaml \ - --dry-run=client -o yaml -n monitoring | kubectl apply -f - -``` - -### 3. Run the Standalone Binary - -First, port-forward the Pushgateway: -```bash -kubectl port-forward -n monitoring svc/pushgateway 9091:9091 -``` - -In another terminal, run the binary: -```bash -cd /home/paul/git/conf/f3s/prometheus-pusher -./prometheus-pusher -``` - -The binary will: -- Push metrics immediately on startup -- Continue pushing metrics every 15 seconds -- Generate random example data to simulate a real application - -## Viewing Metrics - -### View Pushgateway UI -```bash -kubectl port-forward -n monitoring svc/pushgateway 9091:9091 -# Open http://localhost:9091 -``` - -### Query Prometheus -```bash -kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 -# Open http://localhost:9090 -``` - -Example queries: -```promql -# View total requests -app_requests_total - -# View request rate over last 5 minutes -rate(app_requests_total[5m]) - -# View current active connections -app_active_connections - -# View current temperature -app_temperature_celsius - -# View 95th percentile request duration -histogram_quantile(0.95, rate(app_request_duration_seconds_bucket[5m])) - -# View failed jobs by type -app_jobs_processed_total{status="failed"} - -# View job success rate -rate(app_jobs_processed_total{status="success"}[5m]) / rate(app_jobs_processed_total[5m]) -``` - -## Metric Types Explained - -### Counter: `app_requests_total` -- **Type**: Counter -- 
**Description**: Total number of requests processed -- **Value behavior**: Only increases (monotonically increasing) -- **Use case**: Counting total events, requests, errors - -### Gauge: `app_active_connections`, `app_temperature_celsius` -- **Type**: Gauge -- **Description**: Current value that can go up or down -- **Value behavior**: Can increase or decrease -- **Use cases**: - - Active connections - - Current temperature - - Memory usage - - Queue length - -### Histogram: `app_request_duration_seconds` -- **Type**: Histogram -- **Description**: Distribution of request durations -- **Value behavior**: Samples observations into buckets -- **Use cases**: - - Request latency - - Response times - - Data sizes -- **Buckets**: .005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10 seconds - -### Counter with Labels: `app_jobs_processed_total` -- **Type**: Counter with labels -- **Description**: Jobs processed by type and status -- **Labels**: - - `job_type`: email, report, backup - - `status`: success, failed -- **Use cases**: Categorized counting, multi-dimensional metrics - -## Prometheus Format Example - -The metrics are sent in Prometheus text format: - -``` -# HELP app_requests_total Total number of requests processed -# TYPE app_requests_total counter -app_requests_total{instance="example-app",job="example_metrics_pusher"} 42 - -# HELP app_active_connections Number of currently active connections -# TYPE app_active_connections gauge -app_active_connections{instance="example-app",job="example_metrics_pusher"} 67 - -# HELP app_temperature_celsius Current temperature in Celsius -# TYPE app_temperature_celsius gauge -app_temperature_celsius{instance="example-app",job="example_metrics_pusher"} 23.5 - -# HELP app_jobs_processed_total Total number of jobs processed by type -# TYPE app_jobs_processed_total counter -app_jobs_processed_total{instance="example-app",job="example_metrics_pusher",job_type="email",status="success"} 15 
-app_jobs_processed_total{instance="example-app",job="example_metrics_pusher",job_type="email",status="failed"} 2 -``` - -## Customizing the Binary - -Edit `main.go` to: -1. Change the Pushgateway URL -2. Modify the push interval (currently 15 seconds) -3. Add your own metrics -4. Change label values - -Then rebuild: -```bash -go build -o prometheus-pusher main.go -``` - -## Architecture - -``` -βββββββββββββββββββ -β Go Binary β -β (prometheus- βββPush metricsβββ -β pusher) β β -βββββββββββββββββββ β - βΌ - ββββββββββββββββββββ - β Pushgateway ββββScrapeβββ - β (Port 9091) β β - ββββββββββββββββββββ β - β - βββββββββββββββββββ - β Prometheus β - β (Port 9090) β - βββββββββββββββββββ -``` - -## When to Use Pushgateway vs. Scraping - -**Use Pushgateway (what we're doing) for:** -- Batch jobs -- Short-lived processes -- Service-level metrics -- Jobs behind firewalls - -**Use Prometheus scraping (alternative approach) for:** -- Long-running applications -- Services with consistent endpoints -- Applications that can expose `/metrics` endpoint - -## Troubleshooting - -### Binary can't connect to Pushgateway -```bash -# Check port-forward is running -ps aux | grep "port-forward.*9091" - -# Restart port-forward -kubectl port-forward -n monitoring svc/pushgateway 9091:9091 -``` - -### Metrics not appearing in Prometheus -```bash -# Check Pushgateway has metrics -curl http://localhost:9091/metrics | grep "app_" - -# Check Prometheus scrape targets -# Open http://localhost:9090/targets -# Look for "pushgateway" job - -# Check Prometheus logs -kubectl logs -n monitoring -l app.kubernetes.io/name=prometheus -``` - -### Reload Prometheus config manually -```bash -# The Prometheus Operator should auto-reload, but if needed: -kubectl delete pod -n monitoring -l app.kubernetes.io/name=prometheus -``` - -## Clean Up - -```bash -# Stop port-forwards -pkill -f "port-forward.*9091" -pkill -f "port-forward.*9090" - -# Remove deployment (if you want to uninstall) -helm uninstall 
pushgateway-only -n monitoring -``` |
