| author | Paul Buetow <paul@buetow.org> | 2026-01-15 21:01:35 +0200 |
|---|---|---|
| committer | Paul Buetow <paul@buetow.org> | 2026-01-15 21:01:43 +0200 |
| commit | 8e0cf186a1e1dba042e4dd4eb6727889d2b22bbd (patch) | |
| tree | 5f1c70195961a5bbf3cc8765f8cec87ae474f2f7 | |
| parent | 617a9f741bad57640e06f03b67b8ea2983c5904e (diff) | |
feat: add Argo Rollouts controller and tracing-demo canary rollout demo
| -rw-r--r-- | f3s/ARGO-ROLLOUTS-SUMMARY.md | 248 |
| -rw-r--r-- | f3s/README-ROLLOUTS.md | 229 |
| -rw-r--r-- | f3s/ROLLOUTS-CHECKLIST.md | 189 |
| -rw-r--r-- | f3s/ROLLOUTS-FILE-TREE.txt | 183 |
| -rw-r--r-- | f3s/ROLLOUTS-SETUP.md | 429 |
| -rw-r--r-- | f3s/argo-rollouts/Justfile | 33 |
| -rw-r--r-- | f3s/argo-rollouts/README.md | 85 |
| -rw-r--r-- | f3s/argo-rollouts/values.yaml | 28 |
| -rw-r--r-- | f3s/argocd-apps/cicd/argo-rollouts.yaml | 28 |
| -rw-r--r-- | f3s/argocd-apps/services/tracing-demo.yaml | 2 |
| -rw-r--r-- | f3s/tracing-demo/Justfile | 37 |
| -rw-r--r-- | f3s/tracing-demo/ROLLOUTS-DEMO.md | 317 |
| -rw-r--r-- | f3s/tracing-demo/helm-chart/templates/frontend-rollout.yaml | 75 |
| -rwxr-xr-x | f3s/tracing-demo/rollout-demo.sh | 57 |
14 files changed, 1940 insertions, 0 deletions
diff --git a/f3s/ARGO-ROLLOUTS-SUMMARY.md b/f3s/ARGO-ROLLOUTS-SUMMARY.md new file mode 100644 index 0000000..2c0372e --- /dev/null +++ b/f3s/ARGO-ROLLOUTS-SUMMARY.md @@ -0,0 +1,248 @@ +# Argo Rollouts Implementation Summary + +## What Was Created + +### 1. Argo Rollouts Controller Installation +**Location**: `/home/paul/git/conf/f3s/argo-rollouts/` + +Files: +- `Justfile` - Installation automation +- `values.yaml` - Helm configuration +- `README.md` - Installation guide + +Deployment: +```bash +cd /home/paul/git/conf/f3s/argo-rollouts +just install +``` + +Also registered in ArgoCD: `/home/paul/git/conf/f3s/argocd-apps/cicd/argo-rollouts.yaml` + +### 2. Frontend Rollout Manifest +**Location**: `/home/paul/git/conf/f3s/tracing-demo/helm-chart/templates/frontend-rollout.yaml` + +**Replaces**: `frontend-deployment.yaml` (kept for reference) + +**Strategy**: Canary with 2-minute observation window +``` +Step 1: 50% traffic to new version +Step 2: Pause 2 minutes (observation period) +Step 3: 100% traffic to new version (auto-promote) +``` + +**Why Frontend?** +- Has 2 replicas (good for canary demo) +- User-facing (can observe behavior easily) +- Generates traces (can monitor impact) +- Non-critical for cluster health + +### 3. Demo Documentation + +**`/home/paul/git/conf/f3s/tracing-demo/ROLLOUTS-DEMO.md`** +- Comprehensive walkthrough +- Real-time monitoring commands +- Troubleshooting guide +- Advanced patterns + +**`/home/paul/git/conf/f3s/ROLLOUTS-SETUP.md`** +- Quick setup instructions +- 5 demo scenarios (basic, manual, abort, prometheus, gitops) +- Expected output and timings +- Monitoring dashboard examples + +**`/home/paul/git/conf/f3s/tracing-demo/rollout-demo.sh`** +- Automated demo starter script +- Checks prerequisites +- Provides instructions + +### 4. 
Enhanced Justfile Commands +**Location**: `/home/paul/git/conf/f3s/tracing-demo/Justfile` + +New commands: +```bash +just rollout-watch # Watch progress in real-time +just rollout-status # Check current status +just rollout-info # Detailed information +just rollout-promote # Skip waiting, promote to 100% +just rollout-abort # Abort current rollout +just rollout-history # View past rollouts +just rollout-demo # Start demo script +``` + +### 5. Updated ArgoCD Application +**Location**: `/home/paul/git/conf/f3s/argocd-apps/services/tracing-demo.yaml` + +Added sync option: `RespectIgnoreDifferences=true` to gracefully handle migration from Deployment to Rollout. + +## Architecture + +``` +┌─────────────────────────────────────────┐ +│ Kubernetes Cluster │ +├─────────────────────────────────────────┤ +│ │ +│ ┌──────────────────┐ │ +│ │ ArgoCD (cicd) │ │ +│ └────────┬─────────┘ │ +│ │ │ +│ └──→ Git Repository │ +│ (conf.git) │ +│ │ +│ ┌──────────────────────────────────┐ │ +│ │ Argo Rollouts Controller (cicd) │ │ +│ │ - Manages Rollout resources │ │ +│ │ - Orchestrates canary │ │ +│ │ - Monitors replica sets │ │ +│ └──────────────────────────────────┘ │ +│ ▲ │ +│ │ watches │ +│ │ │ +│ ┌────────────────────────────────────┐ │ +│ │ tracing-demo-frontend Rollout │ │ +│ │ ┌──────────────┐ ┌──────────────┐│ │ +│ │ │ Stable RS │ │ Canary RS ││ │ +│ │ │ 2 replicas │ │ 1-2 replicas ││ │ +│ │ └──────────────┘ └──────────────┘│ │ +│ │ │ │ +│ │ Endpoints: frontend-service │ │ +│ │ - Selects both RS (proportional) │ │ +│ │ - Routes traffic to 50%/100% │ │ +│ └────────────────────────────────────┘ │ +│ │ +│ ┌──────────────────┐ │ +│ │ Middleware │ ┌──────────────┐│ +│ │ Backend │ │ Deployment ││ +│ │ (unchanged) │ │ (unchanged) ││ +│ └──────────────────┘ └──────────────┘│ +│ │ +└─────────────────────────────────────────┘ + Monitoring (Prometheus/Grafana) +``` + +## Key Differences: Deployment vs Rollout + +| Aspect | Deployment | Rollout | +|--------|------------|---------| +| 
**Update Strategy** | RollingUpdate or Recreate (no staged steps) | Canary, Blue-Green | +| **Traffic Split** | No built-in support | Approximate, by replica ratio (exact weights need a traffic router) | +| **Pause/Resume** | Manual only (`kubectl rollout pause`) | Yes (at canary steps) | +| **Automatic Rollback** | No (manual `rollout undo`) | Yes (automatic abort when an analysis run fails) | +| **Visibility** | kubectl rollout status | kubectl argo rollouts get --watch | +| **Observability** | Basic pod counts | Detailed step information | + +## How It Works + +### Normal Deployment (Traditional) +``` +kubectl apply → Pods replaced by rolling update, with no pause between steps +Old pods: 2 → 0 +New pods: 0 → 2 +Users affected: ~5 seconds of traffic loss risk +``` + +### Canary Rollout (New) +``` +Git commit → ArgoCD detects → Argo Rollouts orchestrates + +Step 1 (50% traffic): + Stable: 2 pods → 1 pod (old version) + Canary: 0 pods → 1 pod (new version) + Users see: 50% old, 50% new for 0-2 minutes + +Step 2 (Pause): + Stable: 1 pod (old) + Canary: 1 pod (new) + Observe metrics, logs, error rates for 2 minutes + +Step 3 (100% traffic): + Stable: 1 → 0 pods (old version terminated) + Canary: 1 → 2 pods (new version scales up) + Users see: 100% new version + + Complete: Canary promoted to stable +``` + +## Demo Quick Start + +### 1. Install Everything +```bash +cd /home/paul/git/conf/f3s +# Sync with ArgoCD (auto or manual) +argocd app sync argo-rollouts +argocd app sync tracing-demo +``` + +### 2. Verify Setup +```bash +cd /home/paul/git/conf/f3s/tracing-demo +just rollout-status +# Should show: Rollout is healthy +``` + +### 3. Run Demo +```bash +# Terminal 1: Watch rollout +just rollout-watch + +# Terminal 2: Trigger rollout (modify git or patch) +kubectl patch rollout tracing-demo-frontend -n services \ + --type='json' \ + -p='[{"op":"replace","path":"/spec/template/spec/containers/0/image","value":"registry.lan.buetow.org:30001/tracing-demo-frontend:latest"}]' +``` + +### 4. 
Observe +- See canary step progress in Terminal 1 +- Optional: `just load-test` to generate traffic during rollout +- After ~4 minutes: Rollout complete, 100% traffic to new version + +## Files Summary + +| Path | Purpose | +|------|---------| +| `argo-rollouts/Justfile` | Install/upgrade/check Argo Rollouts | +| `argo-rollouts/values.yaml` | Helm configuration for controller | +| `argo-rollouts/README.md` | Installation and basic usage | +| `tracing-demo/helm-chart/templates/frontend-rollout.yaml` | Canary rollout definition | +| `tracing-demo/Justfile` | Added `just rollout-*` commands | +| `tracing-demo/ROLLOUTS-DEMO.md` | Detailed walkthrough | +| `tracing-demo/rollout-demo.sh` | Demo starter script | +| `argocd-apps/cicd/argo-rollouts.yaml` | ArgoCD Application for controller | +| `argocd-apps/services/tracing-demo.yaml` | Updated to work with Rollout | +| `ROLLOUTS-SETUP.md` | Complete setup guide with scenarios | +| `ARGO-ROLLOUTS-SUMMARY.md` | This file | + +## Next Steps + +1. **Install controller**: `cd argo-rollouts && just install` +2. **Wait for ArgoCD sync** or manually sync `argo-rollouts` and `tracing-demo` apps +3. **Verify**: `just rollout-status` shows healthy +4. **Run demo**: `just rollout-watch` + trigger in another terminal +5. 
**Explore**: Try abort, promote, or different canary durations + +## Important Notes + +- **No service mesh required**: Uses native Kubernetes service-based routing +- **Traffic splitting**: Proportional to pod counts (1 old, 1 new = 50/50) +- **Auto-promotion**: After 2 minutes, canary automatically promotes to 100% +- **Graceful**: ArgoCD correctly handles transition from Deployment → Rollout +- **Reversible**: Can abort and keep old version running + +## Limitations & Future Work + +**Current (Basic Canary)**: +- Simple replica-based traffic splitting +- No header-based routing +- No advanced health checks + +**To Add** (Optional): +- **Istio integration**: For precise % traffic splitting, header-based routing +- **Flagger**: Automated canary analysis with Prometheus thresholds +- **Linkerd**: For distributed tracing and observability +- **Longer observation**: Change `pause: duration: 2m` to `5m` or `10m` + +## Questions? + +See: +- `/home/paul/git/conf/f3s/ROLLOUTS-SETUP.md` - Complete setup & scenarios +- `/home/paul/git/conf/f3s/tracing-demo/ROLLOUTS-DEMO.md` - Detailed walkthrough +- `/home/paul/git/conf/f3s/argo-rollouts/README.md` - Controller-specific info diff --git a/f3s/README-ROLLOUTS.md b/f3s/README-ROLLOUTS.md new file mode 100644 index 0000000..60ec9b6 --- /dev/null +++ b/f3s/README-ROLLOUTS.md @@ -0,0 +1,229 @@ +# Argo Rollouts - Quick Reference + +Progressive delivery (canary deployments) for the f3s cluster. + +## TL;DR - Get Started in 5 Minutes + +```bash +# 1. Install controller +cd /home/paul/git/conf/f3s/argo-rollouts +just install + +# 2. Wait for ArgoCD sync (or force) +argocd app sync argo-rollouts +argocd app sync tracing-demo + +# 3. Verify setup +cd /home/paul/git/conf/f3s/tracing-demo +just rollout-status + +# 4. Run a demo (Terminal 1) +just rollout-watch + +# 5. 
Trigger in another terminal (Terminal 2) +kubectl patch rollout tracing-demo-frontend -n services \ + --type='json' \ + -p='[{"op":"replace","path":"/spec/template/spec/containers/0/image","value":"registry.lan.buetow.org:30001/tracing-demo-frontend:latest"}]' + +# 6. Watch progress in Terminal 1 (~4 minutes total) +``` + +Expected flow: +- 0-2 min: **50% traffic** to new version (canary phase 1) +- 2-4 min: **Wait** for confirmation (canary phase 2) +- 4+ min: **100% traffic** to new version (auto-promoted) + +## Files Created + +### Setup & Installation +- `argo-rollouts/Justfile` - Install/manage controller +- `argo-rollouts/values.yaml` - Helm config +- `argocd-apps/cicd/argo-rollouts.yaml` - ArgoCD app + +### Demo App Configuration +- `tracing-demo/helm-chart/templates/frontend-rollout.yaml` - Canary definition +- `tracing-demo/Justfile` - New `just rollout-*` commands +- `tracing-demo/rollout-demo.sh` - Demo automation script + +### Documentation +- `ARGO-ROLLOUTS-SUMMARY.md` - **START HERE** - Full overview +- `ROLLOUTS-SETUP.md` - **DETAILED GUIDE** - 5 demo scenarios +- `ROLLOUTS-CHECKLIST.md` - **DEPLOYMENT CHECKLIST** - Step-by-step +- `tracing-demo/ROLLOUTS-DEMO.md` - Technical walkthrough +- `README-ROLLOUTS.md` - This file + +## Why Canary Deployments? 
+ +**Old way (Deployment)**: +- 2 old pods → removed +- 2 new pods → created +- ~5 seconds of potential traffic loss +- No way to validate before 100% rollout + +**New way (Rollout with Canary)**: +- 2 old pods → 1 old + 1 new (50/50 traffic) +- Observe for 2 minutes +- If healthy → automatically promote to 2 new pods +- If unhealthy → abort, revert to 2 old pods +- Zero downtime, validated before full rollout + +## Common Commands + +```bash +cd /home/paul/git/conf/f3s/tracing-demo + +# Watch rollout progress (real-time) +just rollout-watch + +# Check current status +just rollout-status + +# Detailed info +just rollout-info + +# Skip waiting, promote now +just rollout-promote + +# Abort and rollback +just rollout-abort + +# View history +just rollout-history + +# Generate load during rollout +just load-test +``` + +## What Happens During Canary + +### Step 1: 50% Traffic (0-2 minutes) +``` +Frontend Service +├── Stable ReplicaSet (old version): 1 pod → receives 50% traffic +└── Canary ReplicaSet (new version): 1 pod → receives 50% traffic +``` + +Monitor during this phase: +- Error rates +- Response latency +- Logs and traces +- Prometheus metrics + +### Step 2: Pause (2 minutes) +``` +Service pauses traffic shift, waiting for: +- Manual promotion via: kubectl argo rollouts promote ... +- Auto-promotion after 2 minutes +- Or abort: kubectl argo rollouts abort ... 
+``` + +### Step 3: 100% Traffic (4+ minutes) +``` +Frontend Service +├── Stable ReplicaSet (new version): 2 pods → receives 100% traffic +└── Canary ReplicaSet (old version): 0 pods → terminated +``` + +## Architecture + +``` +Git Commit (new image) + ↓ +Git Server (conf.git) + ↓ +ArgoCD detects change + ↓ +Updates Rollout resource + ↓ +Argo Rollouts Controller + ↓ + ├─→ Scales Canary ReplicaSet (1 new pod) + ├─→ Frontend Service routes 50/50 traffic + ├─→ Monitors health/metrics for 2 minutes + └─→ Auto-promotes or waits for manual action + ├─→ If healthy: Scale to 2 new, remove old + └─→ If abort: Remove canary, keep old +``` + +## Demo Scenarios + +See `ROLLOUTS-SETUP.md` for complete walkthrough of: + +1. **Basic Canary** - Watch 50% → 100% progression +2. **Manual Promotion** - Skip waiting with `just rollout-promote` +3. **Abort/Rollback** - Fail canary and revert +4. **Prometheus Monitoring** - Track metrics during rollout +5. **GitOps Flow** - Commit code, watch auto-rollout + +## Monitoring + +### Command-line +```bash +# Real-time watch +kubectl argo rollouts get rollout tracing-demo-frontend -n services --watch + +# Check metrics +kubectl top pods -n services -l app=tracing-demo-frontend +``` + +### Grafana +https://grafana.f3s.buetow.org + +1. Explore → Tempo +2. Query: `{ resource.service.name = "frontend" }` +3. 
See traces from old and new versions + +### Prometheus +```bash +# Port-forward +kubectl port-forward -n monitoring svc/prometheus 9090:9090 +# Open http://localhost:9090 + +# Query pod status +kube_pod_status_phase{namespace="services", pod=~".*frontend.*"} +``` + +## Troubleshooting + +**Controller not running?** +```bash +kubectl get pods -n cicd -l app.kubernetes.io/name=argo-rollouts +kubectl logs -n cicd -l app.kubernetes.io/name=argo-rollouts +``` + +**Rollout stuck?** +```bash +kubectl describe rollout tracing-demo-frontend -n services +kubectl get pods -n services -l app=tracing-demo-frontend +``` + +**Need plugin?** +```bash +curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64 +sudo install -m 755 kubectl-argo-rollouts-linux-amd64 /usr/local/bin/kubectl-argo-rollouts +``` + +## Next Steps + +1. Complete setup using `ROLLOUTS-CHECKLIST.md` +2. Run demo scenarios from `ROLLOUTS-SETUP.md` +3. Share with team +4. Optional: Add Istio for advanced traffic routing +5. Optional: Deploy Flagger for automated analysis +6. 
Migrate other services to Rollout + +## Key Resources + +| File | Purpose | +|------|---------| +| `ARGO-ROLLOUTS-SUMMARY.md` | Architecture & what was created | +| `ROLLOUTS-SETUP.md` | Complete setup & 5 demo scenarios | +| `ROLLOUTS-CHECKLIST.md` | Step-by-step deployment | +| `tracing-demo/ROLLOUTS-DEMO.md` | Technical details & troubleshooting | +| `argo-rollouts/README.md` | Controller installation guide | + +## Support + +- Argo Rollouts Docs: https://argoproj.github.io/argo-rollouts/ +- Canary Strategy: https://argoproj.github.io/argo-rollouts/features/canary/ +- Kubectl Plugin: https://argoproj.github.io/argo-rollouts/getting-started/#using-kubectl-with-argo-rollouts diff --git a/f3s/ROLLOUTS-CHECKLIST.md b/f3s/ROLLOUTS-CHECKLIST.md new file mode 100644 index 0000000..b32f1ac --- /dev/null +++ b/f3s/ROLLOUTS-CHECKLIST.md @@ -0,0 +1,189 @@ +# Argo Rollouts Deployment Checklist + +## Pre-Deployment Setup + +- [ ] Read `ARGO-ROLLOUTS-SUMMARY.md` to understand what was created +- [ ] Ensure kubectl access to f3s cluster +- [ ] Ensure ArgoCD is running and accessible +- [ ] Git repository (conf.git) synced to git-server + +## Installation + +- [ ] Navigate to `/home/paul/git/conf/f3s/argo-rollouts` +- [ ] Run `just install` to deploy controller +- [ ] Verify controller running: `kubectl get pods -n cicd -l app.kubernetes.io/name=argo-rollouts` +- [ ] Verify CRD installed: `kubectl get crd | grep rollout` + +## Optional: Install kubectl Plugin + +- [ ] Download kubectl-argo-rollouts: + ```bash + curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64 + chmod +x kubectl-argo-rollouts-linux-amd64 + sudo install -m 755 kubectl-argo-rollouts-linux-amd64 /usr/local/bin/kubectl-argo-rollouts + ``` +- [ ] Verify: `kubectl argo rollouts version` + +## ArgoCD Syncing + +- [ ] Create/push `argocd-apps/cicd/argo-rollouts.yaml` to git +- [ ] Create/push `argocd-apps/services/tracing-demo.yaml` updates to git +- [ ] Force 
ArgoCD sync (wait 3 min or manual): + ```bash + argocd app sync argo-rollouts + argocd app sync tracing-demo + ``` +- [ ] Verify tracing-demo application status: `argocd app get tracing-demo` + +## Rollout Verification + +- [ ] Check frontend rollout deployed: `kubectl get rollout tracing-demo-frontend -n services` +- [ ] Verify status: `kubectl describe rollout tracing-demo-frontend -n services` +- [ ] Expected: `Status: Healthy` with `2/2 replicas` in stable state +- [ ] Check pods running: `kubectl get pods -n services -l app=tracing-demo-frontend` + +## Basic Demo (First Time) + +### Terminal 1: Watch Rollout +```bash +cd /home/paul/git/conf/f3s/tracing-demo +just rollout-watch +``` +- [ ] Command running and connected + +### Terminal 2: Generate Load (Optional) +```bash +cd /home/paul/git/conf/f3s/tracing-demo +just load-test & +``` +- [ ] Requests being sent to frontend + +### Terminal 3: Trigger Rollout +Choose one method: + +**Method A: Kubectl Patch (Fastest)** +```bash +kubectl patch rollout tracing-demo-frontend -n services \ + --type='json' \ + -p='[{"op":"replace","path":"/spec/template/spec/containers/0/image","value":"registry.lan.buetow.org:30001/tracing-demo-frontend:latest"}]' +``` +- [ ] Executed successfully + +**Method B: Git + ArgoCD (Most GitOps)** +```bash +cd /home/paul/git/conf/f3s +# Edit tracing-demo/helm-chart/templates/frontend-rollout.yaml (change image tag) +git add -A +git commit -m "chore: update frontend image for demo" +git remote add r0 ssh://git@r0:30022/repos/conf.git 2>/dev/null || true +git push r0 master +kubectl annotate application tracing-demo -n cicd argocd.argoproj.io/refresh=normal --overwrite +``` +- [ ] Git push successful +- [ ] ArgoCD syncing (check web UI or CLI) + +## Demo Observation + +- [ ] Terminal 1 shows: "Progressing" → "canary step 1/3" +- [ ] After ~30 sec: New canary pod appears +- [ ] After ~2 min: "canary step 2/3" (pause) +- [ ] After ~4 min: "canary step 3/3" (100% traffic) +- [ ] After ~4:20 min: 
Status shows "Healthy" +- [ ] Old pods terminated, 2 new pods in stable state + +## Monitoring (Optional) + +- [ ] Check logs: `just logs-frontend` +- [ ] Check Grafana Tempo for traces: https://grafana.f3s.buetow.org + - [ ] Navigate to Explore → Tempo + - [ ] Query: `{ resource.service.name = "frontend" }` + - [ ] See traces from old and new versions +- [ ] Check Prometheus metrics: Port-forward and query + +## Advanced Scenarios + +### Scenario 1: Manual Promotion +- [ ] Trigger rollout (step above) +- [ ] After step 1 (30 sec), run: + ```bash + just rollout-promote + ``` +- [ ] Watch rollout skip step 2, immediately promote to 100% +- [ ] Verify: `just rollout-status` shows "Healthy" + +### Scenario 2: Abort/Rollback +- [ ] Trigger rollout +- [ ] While progressing, run: + ```bash + just rollout-abort + ``` +- [ ] Watch canary pods terminate +- [ ] Old version continues running +- [ ] Verify: `just rollout-status` shows "Aborted" + +### Scenario 3: Check History +- [ ] After any rollout: + ```bash + just rollout-history + ``` +- [ ] See previous revisions and their status + +## Integration with CI/CD + +- [ ] Image builds automatically on git push (or configured pipeline) +- [ ] New image pushed to registry: `registry.lan.buetow.org:30001/tracing-demo-frontend:NEWTAG` +- [ ] Git updated with new image tag +- [ ] ArgoCD detects change +- [ ] Rollout automatically triggered +- [ ] Canary strategy executes + +## Post-Deployment + +- [ ] Share documentation: + - [ ] `ROLLOUTS-SETUP.md` - Complete setup guide + - [ ] `tracing-demo/ROLLOUTS-DEMO.md` - Detailed walkthrough + - [ ] `ARGO-ROLLOUTS-SUMMARY.md` - Architecture overview +- [ ] Add team to `kubectl argo rollouts` usage +- [ ] Consider next steps: + - [ ] Deploy Istio for advanced traffic management + - [ ] Add Flagger for automated analysis + - [ ] Extend to other services (middleware, backend) + - [ ] Create monitoring dashboards + +## Troubleshooting Checklist + +### Controller not running +- [ ] Check pod: 
`kubectl get pods -n cicd -l app.kubernetes.io/name=argo-rollouts` +- [ ] Check logs: `kubectl logs -n cicd -l app.kubernetes.io/name=argo-rollouts` +- [ ] Check CRD: `kubectl get crd | grep rollout` + +### Rollout not deploying +- [ ] Check ArgoCD sync: `argocd app get tracing-demo` +- [ ] Check git changes pushed: `git log --oneline | head -5` +- [ ] Force sync: `argocd app sync tracing-demo --prune` + +### Canary pods not starting +- [ ] Check pod status: `kubectl describe pod -n services <pod-name>` +- [ ] Check logs: `kubectl logs -n services <pod-name>` +- [ ] Check resource limits: `kubectl top pods -n services` +- [ ] Check image: `kubectl get pods -n services -o jsonpath='{.items[*].spec.containers[0].image}'` + +### Rollout stuck in Progressing +- [ ] Check health probes: `kubectl get rollout tracing-demo-frontend -n services -o yaml | grep -i -A 10 probe` +- [ ] Check replica status: `kubectl get rs -n services -l app=tracing-demo-frontend -o wide` +- [ ] Check controller logs: `kubectl logs -n cicd -l app.kubernetes.io/name=argo-rollouts --tail=50` + +## Cleanup (If Needed) + +- [ ] Stop rollout: `kubectl argo rollouts abort tracing-demo-frontend -n services` +- [ ] Rollback to previous: `kubectl argo rollouts undo tracing-demo-frontend -n services` (if needed) +- [ ] Uninstall Argo Rollouts: `cd argo-rollouts && just uninstall` + +--- + +**Setup complete when:** +- ✅ Argo Rollouts controller running in `cicd` namespace +- ✅ Frontend rollout deployed in `services` namespace +- ✅ ArgoCD recognizes rollout resource +- ✅ One demo run successful (git trigger or kubectl patch) +- ✅ Team can watch and manage rollouts diff --git a/f3s/ROLLOUTS-FILE-TREE.txt b/f3s/ROLLOUTS-FILE-TREE.txt new file mode 100644 index 0000000..6c85754 --- /dev/null +++ b/f3s/ROLLOUTS-FILE-TREE.txt @@ -0,0 +1,183 @@ +/home/paul/git/conf/f3s/ +├── README-ROLLOUTS.md ← ENTRY POINT (quick reference) +├── ARGO-ROLLOUTS-SUMMARY.md ← Full architecture & overview +├── ROLLOUTS-SETUP.md ← 
Detailed setup + 5 scenarios +├── ROLLOUTS-CHECKLIST.md ← Step-by-step deployment +├── ROLLOUTS-FILE-TREE.txt ← This file +│ +├── argo-rollouts/ ← NEW: Argo Rollouts Controller +│ ├── Justfile ← Install/upgrade/uninstall +│ ├── values.yaml ← Helm configuration +│ └── README.md ← Controller-specific guide +│ +├── argocd-apps/ +│ ├── cicd/ +│ │ ├── git-server.yaml +│ │ └── argo-rollouts.yaml ← NEW: Controller app +│ │ +│ └── services/ +│ ├── tracing-demo.yaml ← UPDATED: Deployment → Rollout +│ └── ... (other apps) +│ +├── tracing-demo/ +│ ├── README.md +│ ├── Justfile ← UPDATED: Added rollout commands +│ ├── ROLLOUTS-DEMO.md ← NEW: Technical walkthrough +│ ├── rollout-demo.sh ← NEW: Demo automation +│ │ +│ └── helm-chart/ +│ ├── Chart.yaml +│ └── templates/ +│ ├── frontend-rollout.yaml ← NEW: Canary rollout definition +│ ├── frontend-deployment.yaml ← KEPT: For reference +│ ├── middleware-deployment.yaml ← (unchanged) +│ ├── backend-deployment.yaml ← (unchanged) +│ ├── frontend-service.yaml +│ ├── middleware-service.yaml +│ ├── backend-service.yaml +│ └── ingress.yaml +│ +└── ... 
(other apps unchanged) + + +═══════════════════════════════════════════════════════════════════════════ + +INSTALLATION SUMMARY +═══════════════════════════════════════════════════════════════════════════ + +Step 1: Install Controller + cd /home/paul/git/conf/f3s/argo-rollouts + just install + +Step 2: Verify ArgoCD + argocd app sync argo-rollouts + argocd app sync tracing-demo + +Step 3: Watch Demo + cd /home/paul/git/conf/f3s/tracing-demo + just rollout-watch + +Step 4: Trigger Rollout (in another terminal) + kubectl patch rollout tracing-demo-frontend -n services \ + --type='json' \ + -p='[{"op":"replace","path":"/spec/template/spec/containers/0/image","value":"registry.lan.buetow.org:30001/tracing-demo-frontend:latest"}]' + +═══════════════════════════════════════════════════════════════════════════ + +DOCUMENTATION ROADMAP +═══════════════════════════════════════════════════════════════════════════ + +NEW TO ARGO ROLLOUTS? + 1. Read: README-ROLLOUTS.md (3 min) + 2. Read: ARGO-ROLLOUTS-SUMMARY.md (10 min) + 3. Follow: ROLLOUTS-CHECKLIST.md (step-by-step) + +WANT DETAILED GUIDE? + → ROLLOUTS-SETUP.md + - Complete setup instructions + - 5 demo scenarios with expected output + - Monitoring dashboards + - Advanced patterns + +DOING THE DEPLOYMENT? + → ROLLOUTS-CHECKLIST.md + - Pre-deployment checks + - Installation steps + - Verification + - Troubleshooting + +TROUBLESHOOTING? 
+ → ROLLOUTS-SETUP.md → Troubleshooting section + → argo-rollouts/README.md + → tracing-demo/ROLLOUTS-DEMO.md + +═══════════════════════════════════════════════════════════════════════════ + +KEY FILES EXPLAINED +═══════════════════════════════════════════════════════════════════════════ + +argo-rollouts/Justfile + - Automates installation of Argo Rollouts controller + - Commands: install, upgrade, uninstall, status, logs + - Deploys to: cicd namespace + +argo-rollouts/values.yaml + - Helm chart configuration for Argo Rollouts + - Sets resource limits, metrics, replicas + +argocd-apps/cicd/argo-rollouts.yaml + - ArgoCD Application resource + - Manages controller installation via GitOps + - Auto-syncs when argo-rollouts/ changes in git + +tracing-demo/helm-chart/templates/frontend-rollout.yaml + - Replaces frontend-deployment.yaml + - Defines canary strategy: + * Step 1: 50% traffic + * Step 2: 2-minute pause + * Step 3: 100% promotion + - Keeps same pods, volumes, env vars as Deployment + +tracing-demo/Justfile (updated) + - New commands for rollout management + - just rollout-watch + - just rollout-status + - just rollout-promote + - just rollout-abort + - just rollout-history + +tracing-demo/rollout-demo.sh + - Automation script for demo + - Checks prerequisites + - Guides through demo workflow + - Can be extended for CI/CD + +═══════════════════════════════════════════════════════════════════════════ + +WHAT CHANGED IN EXISTING FILES +═══════════════════════════════════════════════════════════════════════════ + +tracing-demo/Justfile + [+] 8 new rollout commands + [-] No breaking changes to existing commands + +tracing-demo/helm-chart/templates/frontend-deployment.yaml + [~] Still exists (for reference, not deployed) + [→] Replaced by frontend-rollout.yaml in deployment + +argocd-apps/services/tracing-demo.yaml + [+] RespectIgnoreDifferences=true sync option + [-] No other changes (points to same Helm chart) + 
+═══════════════════════════════════════════════════════════════════════════ + +WHAT DID NOT CHANGE +═══════════════════════════════════════════════════════════════════════════ + +✓ Middleware & Backend services remain Deployments +✓ All service definitions (frontend, middleware, backend services) +✓ Ingress configuration +✓ All other apps (audiobookshelf, miniflux, etc.) +✓ ArgoCD configuration & installation +✓ Prometheus/Grafana setup + +═══════════════════════════════════════════════════════════════════════════ + +HOW TO NAVIGATE THIS +═══════════════════════════════════════════════════════════════════════════ + +If you want to... See... +──────────────────────────────────────────────────────────────────────────── +Understand what was created ARGO-ROLLOUTS-SUMMARY.md +Get started quickly README-ROLLOUTS.md +Deploy step-by-step ROLLOUTS-CHECKLIST.md +See detailed scenarios & examples ROLLOUTS-SETUP.md +Troubleshoot issues ROLLOUTS-SETUP.md (Troubleshooting section) +Learn technical details tracing-demo/ROLLOUTS-DEMO.md +Install the controller argo-rollouts/Justfile + argo-rollouts/README.md +See the rollout definition tracing-demo/helm-chart/templates/frontend-rollout.yaml +Run a demo tracing-demo/rollout-demo.sh or just rollout-watch +Monitor during rollout Prometheus/Grafana (see ROLLOUTS-SETUP.md) +Integrate with CI/CD See ROLLOUTS-SETUP.md section "GitOps Flow" + +═══════════════════════════════════════════════════════════════════════════ diff --git a/f3s/ROLLOUTS-SETUP.md b/f3s/ROLLOUTS-SETUP.md new file mode 100644 index 0000000..b7ebb55 --- /dev/null +++ b/f3s/ROLLOUTS-SETUP.md @@ -0,0 +1,429 @@ +# Argo Rollouts Setup and Demo Guide + +This guide covers the complete setup and demonstration of Argo Rollouts with the tracing-demo application. + +## Quick Setup + +### 1. 
Install Argo Rollouts Controller + +```bash +cd /home/paul/git/conf/f3s/argo-rollouts +just install +``` + +Verify installation: +```bash +kubectl get pods -n cicd -l app.kubernetes.io/name=argo-rollouts +kubectl get crd | grep rollout +``` + +### 2. Install kubectl Plugin (Optional but Recommended) + +```bash +curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64 +chmod +x kubectl-argo-rollouts-linux-amd64 +sudo install -m 755 kubectl-argo-rollouts-linux-amd64 /usr/local/bin/kubectl-argo-rollouts +``` + +Verify: +```bash +kubectl argo rollouts version +``` + +### 3. Sync ArgoCD with New Applications + +The following ArgoCD Applications will be auto-synced: + +- **argo-rollouts.yaml** - Installs Argo Rollouts controller +- **tracing-demo.yaml** - Now uses Rollout (frontend) + Deployments (middleware, backend) + +Force ArgoCD to sync: +```bash +argocd app sync argo-rollouts +argocd app sync tracing-demo +``` + +Or wait for auto-sync (default: 3 minutes). + +### 4. Verify Rollout is Deployed + +```bash +kubectl get rollout tracing-demo-frontend -n services +kubectl describe rollout tracing-demo-frontend -n services +``` + +Expected status: `Healthy` with `2/2 replicas`. + +## Demo Scenarios + +### Scenario 1: Basic Canary Rollout (Guided) + +**Duration**: ~5-10 minutes + +**Objective**: Observe frontend rollout from 50% → 100% traffic with auto-promotion. 
+ +#### Step 1: Prepare Terminals + +Terminal 1 - Watch rollout progress: +```bash +cd /home/paul/git/conf/f3s/tracing-demo +just rollout-watch +``` + +Terminal 2 - Generate load: +```bash +cd /home/paul/git/conf/f3s/tracing-demo +just load-test & +``` + +Terminal 3 - Trigger rollout: +```bash +# Will use this in next step +``` + +#### Step 2: Trigger Rollout (Terminal 3) + +Simulate updating the frontend image: + +```bash +kubectl patch rollout tracing-demo-frontend -n services \ + --type='json' \ + -p='[{"op":"replace","path":"/spec/template/spec/containers/0/image","value":"registry.lan.buetow.org:30001/tracing-demo-frontend:latest"}]' +``` + +Or via git (more GitOps-like): + +```bash +cd /home/paul/git/conf/f3s +# Edit tracing-demo/helm-chart/templates/frontend-rollout.yaml (change image tag) +git add -A +git commit -m "chore: update frontend image for demo" +git remote add r0 ssh://git@r0:30022/repos/conf.git 2>/dev/null || true +git push r0 master + +# Trigger ArgoCD sync +kubectl annotate application tracing-demo -n cicd argocd.argoproj.io/refresh=normal --overwrite +``` + +#### Step 3: Observe Rollout (Terminal 1) + +Watch the output: + +``` +NAME KIND STATUS AGE INFO +tracing-demo-frontend Rollout Progressing 0s canary step 1/3 +tracing-demo-frontend-abc123 ReplicaSet ✓ canary 5s 1/1 replicas +tracing-demo-frontend-xyz789 ReplicaSet ✓ stable 5m 2/2 replicas + +NAME KIND STATUS AGE INFO +tracing-demo-frontend Rollout Progressing 2m5s canary step 2/3 +tracing-demo-frontend-abc123 ReplicaSet ✓ canary 2m 1/1 replicas (ready) +tracing-demo-frontend-xyz789 ReplicaSet ✓ stable 5m 2/2 replicas + +NAME KIND STATUS AGE INFO +tracing-demo-frontend Rollout Progressing 4m10s canary step 3/3 +tracing-demo-frontend-abc123 ReplicaSet ✓ canary 4m 2/2 replicas (ready, updated) +tracing-demo-frontend-xyz789 ReplicaSet ✓ stable 5m 0/2 replicas (pending termination) + +NAME KIND STATUS AGE INFO +tracing-demo-frontend Rollout ✓ Healthy 4m20s +tracing-demo-frontend-abc123 
ReplicaSet ✓ stable 4m 2/2 replicas +``` + +**Timeline:** +- **0-2 min**: Step 1 (setWeight: 50) - 1 canary pod, 2 stable pods, 50/50 traffic +- **2-4 min**: Step 2 (pause: 2m) - Waiting for user or auto-promotion +- **4+ min**: Step 3 (setWeight: 100) - All 2 canary pods promoted, old pods terminated +- **4:20 min**: Complete - New version fully deployed + +#### Step 4: Observe Behavior (Optional) + +Check request latency/errors during rollout: + +```bash +# View logs from both old and new pods +kubectl logs -n services -l app=tracing-demo-frontend --timestamps=true | tail -20 + +# Check if any requests failed during transition +grep -i "error\|exception" <(kubectl logs -n services -l app=tracing-demo-frontend) +``` + +View traces in Grafana: +1. Navigate to https://grafana.f3s.buetow.org +2. Explore → Tempo +3. Query: `{ resource.service.name = "frontend" }` +4. See traces from both old and new versions + +### Scenario 2: Manual Promotion (Skip Waiting) + +**Duration**: ~2 minutes + +**Objective**: Demonstrate manual control - don't wait for auto-promotion. + +#### Setup + +Trigger rollout (same as Scenario 1): +```bash +kubectl patch rollout tracing-demo-frontend -n services \ + --type='json' \ + -p='[{"op":"replace","path":"/spec/template/spec/containers/0/image","value":"registry.lan.buetow.org:30001/tracing-demo-frontend:latest"}]' +``` + +Watch: +```bash +just rollout-watch +``` + +#### Promote Early + +After canary looks healthy (step 1 complete, ~30 seconds): + +```bash +cd /home/paul/git/conf/f3s/tracing-demo +just rollout-promote +``` + +This skips the 2-minute pause and immediately promotes to 100%. + +### Scenario 3: Abort/Rollback + +**Duration**: ~3 minutes + +**Objective**: Demonstrate rollback if canary fails. + +#### Setup & Trigger + +Same as Scenario 1. 
+ +#### Simulate Failure + +While at canary step 1 (50% traffic), introduce a failure: + +```bash +# Pick the last listed frontend pod (not guaranteed to be a canary pod) +CANARY_POD=$(kubectl get pods -n services -l app=tracing-demo-frontend -o name | tail -1) + +# Kill it to simulate crash +kubectl delete $CANARY_POD -n services +``` + +Watch in Terminal 1 - the rollout may stall or fail health checks. + +#### Abort + +```bash +cd /home/paul/git/conf/f3s/tracing-demo +just rollout-abort +``` + +This: +- Stops the rollout +- Terminates canary replicas +- Restores stable version with 2 pods +- Allows investigation + +Verify: +```bash +just rollout-status +``` + +Expected: `Rollout has been aborted. Stable ReplicaSet: 2/2 replicas` + +### Scenario 4: Observability - Prometheus Metrics + +**Duration**: ~5 minutes (during any rollout) + +**Objective**: Monitor rollout via Prometheus metrics. + +During a running rollout: + +```bash +# Port-forward Prometheus +kubectl port-forward -n monitoring svc/prometheus 9090:9090 & + +# Open browser: http://localhost:9090 +``` + +Query useful metrics: + +```promql +# ReplicaSet replica counts (a Rollout manages ReplicaSets, not StatefulSets) +kube_replicaset_spec_replicas{replicaset=~".*frontend.*"} +kube_replicaset_created{replicaset=~".*frontend.*"} + +# Pod status during rollout +kube_pod_status_phase{namespace="services", pod=~".*frontend.*"} + +# Request rate (if your app exports metrics) +rate(http_requests_total{job="frontend"}[5m]) + +# Error rate +rate(http_requests_total{job="frontend", status=~"5.."}[5m]) +``` + +### Scenario 5: GitOps Flow (Realistic) + +**Duration**: ~10 minutes + +**Objective**: Demonstrate GitOps workflow - git commit triggers rollout via ArgoCD. 
+ +#### Step 1: Modify Frontend Code + +```bash +cd /home/paul/git/conf/f3s/tracing-demo/docker/frontend +# Edit app.py (e.g., change response message) +# Commit and push +git add -A +git commit -m "feat: update frontend message" +git push origin master +``` + +#### Step 2: Rebuild and Push Image + +```bash +cd /home/paul/git/conf/f3s/tracing-demo +just build-push +``` + +This creates new Docker image tagged with latest commit hash or timestamp. + +#### Step 3: Update Helm Chart + +```bash +# Edit frontend-rollout.yaml with new image tag +nano /home/paul/git/conf/f3s/tracing-demo/helm-chart/templates/frontend-rollout.yaml +# Change image: registry.lan.buetow.org:30001/tracing-demo-frontend:NEWTAG + +git add -A +git commit -m "chore: update frontend rollout image to latest" +git remote add r0 ssh://git@r0:30022/repos/conf.git 2>/dev/null || true +git push r0 master +``` + +#### Step 4: ArgoCD Syncs Automatically + +Wait 3 minutes or force sync: +```bash +argocd app sync tracing-demo --prune +``` + +ArgoCD detects the new image in git and updates the rollout. + +#### Step 5: Watch Rollout Progress + +```bash +just rollout-watch +``` + +The canary strategy executes: 50% → wait 2min → 100%. + +## Monitoring Dashboard + +Create a Grafana dashboard to visualize rollout progress: + +1. Open Grafana: https://grafana.f3s.buetow.org +2. Dashboards → New → Create +3. 
Add panels: + +**Panel 1: Rollout Status** +```promql +# Exported by the Argo Rollouts controller; requires its metrics endpoint to be scraped +rollout_info{name="tracing-demo-frontend", namespace="services"} +``` + +**Panel 2: Replica Counts** +```promql +topk(2, kube_replicaset_status_replicas{replicaset=~"tracing-demo-frontend.*"}) +``` + +**Panel 3: Pod Age** +```promql +time() - kube_pod_created{namespace="services", pod=~"tracing-demo-frontend.*"} +``` + +**Panel 4: Request Rate** +```promql +rate(http_requests_total{job="tracing-demo-frontend"}[1m]) +``` + +## Advanced: Custom Analysis + +To add automated health checks during canary (e.g., error rate thresholds), integrate with **Flagger**: + +```yaml +apiVersion: flagger.app/v1beta1 +kind: Canary +metadata: + name: tracing-demo-frontend +spec: + targetRef: + apiVersion: argoproj.io/v1alpha1 + kind: Rollout + name: tracing-demo-frontend + progressDeadlineSeconds: 300 + service: + port: 5000 + analysis: + interval: 1m + threshold: 2 + maxWeight: 50 + stepWeight: 10 + metrics: + - name: error_rate + thresholdRange: + max: 1 # Max 1% error rate +``` + +This requires installing **Flagger** and a service mesh (Istio/Linkerd). Note that Flagger's documented `targetRef` kinds are Deployment and DaemonSet, so the `kind: Rollout` reference above is unlikely to work as written; the Argo Rollouts-native mechanism for automated checks is an `AnalysisTemplate` referenced from an `analysis` step. 
+ +## Troubleshooting + +### Rollout Stuck in Progressing + +```bash +kubectl describe rollout tracing-demo-frontend -n services +``` + +Check for: +- Pod failures (CrashLoopBackOff) +- Image pull errors +- Resource exhaustion +- Health probe failures + +### Canary Pods Not Becoming Ready + +```bash +kubectl get pods -n services -l app=tracing-demo-frontend -o wide +kubectl logs -n services -l app=tracing-demo-frontend --tail=50 +``` + +### ArgoCD Not Syncing Rollout Changes + +```bash +kubectl get application tracing-demo -n cicd -o jsonpath='{.status.sync.status}' +argocd app sync tracing-demo +``` + +### kubectl argo rollouts Plugin Issues + +```bash +kubectl argo rollouts version + +# If not installed or outdated: +curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64 +sudo install -m 755 kubectl-argo-rollouts-linux-amd64 /usr/local/bin/kubectl-argo-rollouts +``` + +## Next Steps + +1. **Try all scenarios** to understand rollout behavior +2. **Deploy Istio** for advanced traffic management (weighted routing, header-based) +3. **Add Prometheus queries** to monitor rollout metrics +4. **Implement Flagger** for automated analysis and rollback +5. 
**Migrate other services** to Rollout (start with low-risk apps) + +## References + +- [Argo Rollouts Canary Strategy](https://argoproj.github.io/argo-rollouts/features/canary/) +- [Argo Rollouts Blue-Green Strategy](https://argoproj.github.io/argo-rollouts/features/bluegreen/) +- [Flagger Documentation](https://flagger.app/) +- [Istio VirtualService](https://istio.io/latest/docs/reference/config/networking/virtual-service/) diff --git a/f3s/argo-rollouts/Justfile b/f3s/argo-rollouts/Justfile new file mode 100644 index 0000000..03ab061 --- /dev/null +++ b/f3s/argo-rollouts/Justfile @@ -0,0 +1,33 @@ +# Argo Rollouts deployment automation +# Deploys Argo Rollouts controller to the 'cicd' namespace + +NAMESPACE := "cicd" +RELEASE_NAME := "argo-rollouts" + +install: + helm repo add argo https://argoproj.github.io/argo-helm || true + helm repo update + kubectl apply -f - <<'EOF' +apiVersion: v1 +kind: Namespace +metadata: + name: {{NAMESPACE}} +EOF + helm install {{RELEASE_NAME}} argo/argo-rollouts --namespace {{NAMESPACE}} -f values.yaml + @echo "" + @echo "Argo Rollouts deployed successfully!" + @echo "Controller installed in {{NAMESPACE}} namespace" + @echo "" + +upgrade: + helm upgrade {{RELEASE_NAME}} argo/argo-rollouts --namespace {{NAMESPACE}} -f values.yaml + +uninstall: + helm uninstall {{RELEASE_NAME}} --namespace {{NAMESPACE}} || true + +status: + kubectl get pods -n {{NAMESPACE}} -l app.kubernetes.io/name=argo-rollouts + kubectl get crd | grep rollout + +logs: + kubectl logs -n {{NAMESPACE}} -l app.kubernetes.io/name=argo-rollouts --tail=100 -f diff --git a/f3s/argo-rollouts/README.md b/f3s/argo-rollouts/README.md new file mode 100644 index 0000000..6f9e86e --- /dev/null +++ b/f3s/argo-rollouts/README.md @@ -0,0 +1,85 @@ +# Argo Rollouts Deployment for f3s Cluster + +Argo Rollouts is a Kubernetes controller for progressive delivery strategies including canary, blue-green, and A/B testing deployments. 
+ +## Overview + +- **Namespace**: `cicd` (alongside ArgoCD) +- **Deployment Mode**: Single instance +- **CRD**: Rollout custom resource for progressive deployments +- **Integration**: Works with ArgoCD for GitOps-based rollouts + +## Installation + +```bash +just install +``` + +## Verification + +```bash +just status +``` + +Check that the rollouts controller pod is running: +```bash +kubectl get pods -n cicd -l app.kubernetes.io/name=argo-rollouts +``` + +Check that the CRD is installed: +```bash +kubectl get crd | grep rollout +``` + +## Demo: Tracing-Demo Frontend Rollout + +The frontend service uses a Canary strategy: + +1. Deploy new version +2. Send 50% traffic to new version +3. Monitor for 2 minutes +4. If successful, shift 100% traffic +5. If failures are detected, abort to roll back (no automated analysis is configured, so this is a manual step) + +### Watch Rollout Progress + +```bash +# Real-time status +kubectl argo rollouts get rollout tracing-demo-frontend -n services --watch + +# Full rollout status +kubectl argo rollouts status tracing-demo-frontend -n services + +# Describe rollout details +kubectl describe rollout tracing-demo-frontend -n services +``` + +### Trigger a New Rollout + +Update the frontend image tag in git (or use kubectl): + +```bash +# Patch to trigger new rollout +kubectl patch rollout tracing-demo-frontend -n services \ + --type json -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value":"registry.lan.buetow.org:30001/tracing-demo-frontend:v2"}]' +``` + +Or via git commit and ArgoCD sync. 
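The JSON patch above is easy to mistype, so it can be generated from the image tag instead. This is a minimal sketch, assuming the same registry path and container index as the patch command above; `rollout_image_patch` and `IMAGE_REPO` are hypothetical names that do not exist in the repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): build the JSON patch
# that sets the first container's image to a given tag.
IMAGE_REPO="registry.lan.buetow.org:30001/tracing-demo-frontend"

# rollout_image_patch TAG -> JSON patch document on stdout
rollout_image_patch() {
  local tag=$1
  printf '[{"op":"replace","path":"/spec/template/spec/containers/0/image","value":"%s:%s"}]\n' \
    "$IMAGE_REPO" "$tag"
}

# Print the patch for inspection. To apply it, pass it to kubectl:
#   kubectl patch rollout tracing-demo-frontend -n services \
#     --type json -p "$(rollout_image_patch v2)"
rollout_image_patch v2
```

Printing the patch before applying it makes it possible to review the exact image reference that will trigger the rollout.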
+ +### Manual Promotion (Skip Canary Steps) + +```bash +kubectl argo rollouts promote tracing-demo-frontend -n services +``` + +### Abort/Rollback + +```bash +kubectl argo rollouts abort tracing-demo-frontend -n services +``` + +## References + +- [Argo Rollouts Documentation](https://argoproj.github.io/argo-rollouts/) +- [Canary Strategy Guide](https://argoproj.github.io/argo-rollouts/features/canary/) +- [ArgoCD Integration](https://argoproj.github.io/argo-rollouts/generated/notification-services/argocd/) diff --git a/f3s/argo-rollouts/values.yaml b/f3s/argo-rollouts/values.yaml new file mode 100644 index 0000000..ed77670 --- /dev/null +++ b/f3s/argo-rollouts/values.yaml @@ -0,0 +1,28 @@ +# Argo Rollouts Helm Chart Values Override +# Following f3s cluster patterns: single instance deployment + +# Controller configuration +controller: + replicas: 1 + # Enable metrics for Prometheus integration + metrics: + enabled: true + serviceMonitor: + enabled: false # Will enable if Prometheus available + # Resource limits + resources: + limits: + cpu: 500m + memory: 512Mi + requests: + cpu: 250m + memory: 256Mi + +# Notification controller - disabled +notifications: + enabled: false + +# CRD installation +crds: + install: true + keep: true diff --git a/f3s/argocd-apps/cicd/argo-rollouts.yaml b/f3s/argocd-apps/cicd/argo-rollouts.yaml new file mode 100644 index 0000000..4437bee --- /dev/null +++ b/f3s/argocd-apps/cicd/argo-rollouts.yaml @@ -0,0 +1,28 @@ +apiVersion: argoproj.io/v1alpha1 +kind: Application +metadata: + name: argo-rollouts + namespace: cicd + finalizers: + - resources-finalizer.argocd.argoproj.io +spec: + project: default + source: + repoURL: http://git-server.cicd.svc.cluster.local/conf.git + targetRevision: master + path: f3s/argo-rollouts + destination: + server: https://kubernetes.default.svc + namespace: cicd + syncPolicy: + automated: + prune: true + selfHeal: true + syncOptions: + - CreateNamespace=true + retry: + limit: 3 + backoff: + duration: 5s + 
factor: 2 + maxDuration: 1m diff --git a/f3s/argocd-apps/services/tracing-demo.yaml b/f3s/argocd-apps/services/tracing-demo.yaml index 61a0a9c..c9283a6 100644 --- a/f3s/argocd-apps/services/tracing-demo.yaml +++ b/f3s/argocd-apps/services/tracing-demo.yaml @@ -20,6 +20,8 @@ spec: selfHeal: true syncOptions: - CreateNamespace=false + # Ignore old Deployment resources (using Rollout instead) + - RespectIgnoreDifferences=true retry: limit: 3 backoff: diff --git a/f3s/tracing-demo/Justfile b/f3s/tracing-demo/Justfile index 79d2cdd..b3b1eae 100644 --- a/f3s/tracing-demo/Justfile +++ b/f3s/tracing-demo/Justfile @@ -104,3 +104,40 @@ port-forward-backend: check-traces: @echo "Check Grafana Tempo for traces with:" @echo " { resource.service.namespace = \"tracing-demo\" }" + +# === Argo Rollouts Commands === + +# Watch frontend rollout progress +rollout-watch: + @if ! command -v kubectl-argo-rollouts &> /dev/null; then \ + echo "ERROR: kubectl argo rollouts plugin not installed"; \ + echo "Install with: curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64 && sudo install -m 755 kubectl-argo-rollouts-linux-amd64 /usr/local/bin/kubectl-argo-rollouts"; \ + exit 1; \ + fi + kubectl argo rollouts get rollout tracing-demo-frontend -n {{NAMESPACE}} --watch + +# Check rollout status +rollout-status: + kubectl argo rollouts status tracing-demo-frontend -n {{NAMESPACE}} + +# Describe rollout +rollout-info: + kubectl argo rollouts get rollout tracing-demo-frontend -n {{NAMESPACE}} + +# Promote canary to 100% (skip waiting) +rollout-promote: + kubectl argo rollouts promote tracing-demo-frontend -n {{NAMESPACE}} + +# Abort current rollout +rollout-abort: + kubectl argo rollouts abort tracing-demo-frontend -n {{NAMESPACE}} + +# View rollout history +rollout-history: + kubectl argo rollouts history tracing-demo-frontend -n {{NAMESPACE}} + +# Start demo (requires manual rollout trigger in another terminal) +rollout-demo: + @echo 
"Starting Argo Rollouts demo..." + @echo "" + ./rollout-demo.sh diff --git a/f3s/tracing-demo/ROLLOUTS-DEMO.md b/f3s/tracing-demo/ROLLOUTS-DEMO.md new file mode 100644 index 0000000..53a43b3 --- /dev/null +++ b/f3s/tracing-demo/ROLLOUTS-DEMO.md @@ -0,0 +1,317 @@ +# Argo Rollouts Demo Guide for Tracing-Demo + +This guide demonstrates progressive delivery using Argo Rollouts with the tracing-demo frontend service. + +## Prerequisites + +- Argo Rollouts installed in `cicd` namespace +- ArgoCD synced with the latest conf.git +- tracing-demo-frontend rollout deployed +- kubectl argo rollouts plugin installed + +### Install kubectl argo rollouts plugin + +```bash +curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64 +chmod +x kubectl-argo-rollouts-linux-amd64 +sudo mv kubectl-argo-rollouts-linux-amd64 /usr/local/bin/kubectl-argo-rollouts +``` + +## Demo Workflow + +### 1. Verify Current State + +```bash +# Check frontend rollout +kubectl get rollout tracing-demo-frontend -n services + +# Get detailed status +kubectl argo rollouts status tracing-demo-frontend -n services +kubectl argo rollouts get rollout tracing-demo-frontend -n services +``` + +Expected output shows 2 stable replicas, 0 canary. + +### 2. Generate Load (Optional but Recommended) + +In a separate terminal, generate traffic to the frontend: + +```bash +# Port-forward frontend +kubectl port-forward -n services svc/frontend-service 5000:5000 & + +# Send requests in a loop +while true; do + curl http://localhost:5000/api/process -s | jq . + sleep 1 +done +``` + +Or use the load test: + +```bash +cd /home/paul/git/conf/f3s/tracing-demo +just load-test & +``` + +### 3. 
Trigger a New Rollout + +Simulate updating the frontend image (e.g., new version): + +```bash +# Method 1: Patch the rollout to trigger new image +kubectl patch rollout tracing-demo-frontend -n services \ + --type='json' \ + -p='[{"op":"replace","path":"/spec/template/spec/containers/0/image","value":"registry.lan.buetow.org:30001/tracing-demo-frontend:latest"}]' +``` + +Or, more realistically, push a new image tag and update git: + +```bash +cd /home/paul/git/conf/f3s/tracing-demo/helm-chart/templates +# Edit frontend-rollout.yaml and change image tag +# Then commit and push to git-server +git commit -am "chore: update frontend image" +git push origin master +``` + +ArgoCD will auto-sync and apply the new image, triggering the rollout. + +### 4. Watch the Rollout Progress + +**Terminal 1: Real-time rollout status (refreshes)** + +```bash +kubectl argo rollouts get rollout tracing-demo-frontend -n services --watch +``` + +You'll see: +``` +NAME KIND STATUS AGE INFO +tracing-demo-frontend Rollout Progressing 2m canary step 1/3 +tracing-demo-frontend-abc123 ReplicaSet ✓ canary 2m 2/2 replicas +tracing-demo-frontend-xyz789 ReplicaSet ✓ stable 5m 2/2 replicas +``` + +**Terminal 2: Pod status** + +```bash +kubectl get pods -n services -l app=tracing-demo-frontend -w +``` + +Shows new canary pods being created, old stable pods remaining. + +**Terminal 3: Service endpoints (traffic split)** + +```bash +watch -n 1 'kubectl get endpoints -n services frontend-service' +``` + +During canary, both old and new endpoints visible (50/50 traffic). + +### 5. 
Key Rollout States + +**Progressing (Canary Step 1: 50% Traffic)** +- Duration: 0-2 minutes +- New canary replicas (1 out of 2) serve traffic +- Old stable replicas (1 out of 2) serve traffic +- Health checks and error rates monitored + +**Paused (Canary Step 2: Hold)** +- Duration: 2 minutes +- Allows observing new version behavior +- Watch metrics in Grafana/Prometheus +- Can manually promote or abort + +**Full Promotion (Canary Step 3: 100% Traffic)** +- After 2 minutes, auto-promotes to 100% +- New replicas become stable +- Old replicas terminated + +### 6. Monitor Canary Behavior + +While rollout is progressing, check application health: + +**Check logs of new canary pods:** + +```bash +# Get the pod template hash of the newest (canary) ReplicaSet +CANARY_HASH=$(kubectl get rs -n services -l app=tracing-demo-frontend --sort-by='.metadata.creationTimestamp' -o jsonpath='{.items[-1:].metadata.labels.rollouts-pod-template-hash}') + +# View logs of the pods that belong to that ReplicaSet +kubectl logs -n services -l app=tracing-demo-frontend,rollouts-pod-template-hash=$CANARY_HASH --tail=50 -f +``` + +**Check Prometheus metrics:** + +```bash +# Query frontend endpoint availability (-G with --data-urlencode avoids curl URL globbing on {} and []) +curl -s -G 'http://localhost:9090/api/v1/query' --data-urlencode 'query=up{job="tracing-demo-frontend"}' | jq + +# Query request error rate +curl -s -G 'http://localhost:9090/api/v1/query' --data-urlencode 'query=rate(http_requests_total{job="tracing-demo-frontend",status=~"5.."}[5m])' | jq +``` + +**Check traces in Grafana Tempo:** + +Navigate to Grafana Explore → Tempo, query: +``` +{ resource.service.name = "frontend" } +``` + +Watch traces from both old and new versions. + +### 7. Manual Promotion (Skip Waiting) + +If canary looks healthy, skip the 2-minute wait: + +```bash +kubectl argo rollouts promote tracing-demo-frontend -n services +``` + +Immediately promotes to 100% traffic, completes rollout. + +### 8. 
Abort/Rollback + +If canary is unhealthy, abort the rollout: + +```bash +kubectl argo rollouts abort tracing-demo-frontend -n services +``` + +This: +- Stops the rollout progression +- Terminates canary replicas +- Keeps previous stable version running +- Allows investigation and retry + +### 9. View Rollout History + +```bash +kubectl argo rollouts history tracing-demo-frontend -n services + +# Get details of specific revision +kubectl argo rollouts history tracing-demo-frontend -n services --revision=2 +``` + +## Demo Variations + +### A. Inject a Failure + +Simulate unhealthy canary by making requests fail: + +```bash +# Get canary pod name +CANARY_POD=$(kubectl get pods -n services -l app=tracing-demo-frontend -o jsonpath='{.items[1].metadata.name}') + +# Inject failure (e.g., kill process) +kubectl exec -n services $CANARY_POD -- sh -c 'kill 1' & + +# Watch error rate spike in terminal watching rollout +# Rollout will stall at canary step, waiting for stable metrics +``` + +### B. Compare Old vs New via Load Testing + +```bash +# Terminal 1: Watch rollout at 50% traffic +kubectl argo rollouts get rollout tracing-demo-frontend -n services --watch + +# Terminal 2: Run load test +cd /home/paul/git/conf/f3s/tracing-demo && just load-test + +# Terminal 3: Check if latency differs between old/new +kubectl logs -n services -l app=tracing-demo-frontend --tail=20 --timestamps=true +``` + +### C. Long-running Canary + +Edit frontend-rollout.yaml to increase pause duration: + +```yaml +- pause: + duration: 10m # Observe for 10 minutes +``` + +Allows extended monitoring and confidence building. + +## Architecture Notes + +### No Service Mesh Required + +This demo uses **native Kubernetes service routing** (simple round-robin). 
Traffic splitting happens at the pod replica level: + +- Stable ReplicaSet: 2 pods (or 1 out of 2) +- Canary ReplicaSet: 0 pods (or 1 out of 2) +- Service selects both ReplicaSets +- K8s load-balancer distributes traffic proportionally + +**To get more sophisticated traffic splitting (header-based, percentage-based), install:** +- **Istio** with VirtualService/DestinationRule +- **Linkerd** with Rollout extension +- **Flagger** for automated canary analysis + +### Advanced Rollout Strategies + +Once comfortable with canary, try: + +**Blue-Green** (instant switch, easy rollback): +```yaml +strategy: + blueGreen: + activeSlotSelector: stable + autoPromotionEnabled: true + autoPromotionSeconds: 120 +``` + +**A/B Testing** (route by header): +```yaml +strategy: + canary: + trafficRouting: + istio: + virtualService: + name: frontend + routes: + - name: primary # 95% traffic + - name: canary # 5% traffic, routed by header +``` + +## Troubleshooting + +### Rollout Stuck in Progressing + +```bash +# Check rollout conditions +kubectl describe rollout tracing-demo-frontend -n services + +# Check controller logs +kubectl logs -n cicd -l app.kubernetes.io/name=argo-rollouts --tail=50 +``` + +Common causes: +- ReplicaSet not becoming ready (image pull error, resource limits) +- Health probe failing +- ArgoCD out of sync + +### Traffic Not Splitting 50/50 + +Native K8s balancing may not be exactly 50/50 due to: +- Connection pooling by clients +- Load balancer algorithm +- Pod restart timing + +For precise traffic splitting, use Istio or Linkerd. + +### View Rollout in ArgoCD UI + +1. Open ArgoCD: https://argocd.f3s.buetow.org +2. Click tracing-demo application +3. Expand frontend-rollout resource +4. 
See real-time status and sync history + +## References + +- [Argo Rollouts Canary Guide](https://argoproj.github.io/argo-rollouts/features/canary/) +- [Argo Rollouts Kubectl Plugin](https://argoproj.github.io/argo-rollouts/getting-started/#using-kubectl-with-argo-rollouts) +- [Progressive Delivery Patterns](https://argoproj.github.io/argo-rollouts/concepts/) diff --git a/f3s/tracing-demo/helm-chart/templates/frontend-rollout.yaml b/f3s/tracing-demo/helm-chart/templates/frontend-rollout.yaml new file mode 100644 index 0000000..156fd0a --- /dev/null +++ b/f3s/tracing-demo/helm-chart/templates/frontend-rollout.yaml @@ -0,0 +1,75 @@ +# Frontend Service Rollout (Progressive Delivery) +# Replaces frontend-deployment.yaml with canary strategy +apiVersion: argoproj.io/v1alpha1 +kind: Rollout +metadata: + name: tracing-demo-frontend + namespace: services + labels: + app: tracing-demo-frontend + component: frontend +spec: + replicas: 2 + strategy: + canary: + # Canary strategy configuration + steps: + # Step 1: Send 50% of traffic to new version + - setWeight: 50 + # Step 2: Wait 2 minutes before proceeding + - pause: + duration: 2m + # Step 3: Promote to 100% traffic + - setWeight: 100 + + # Traffic management via service weight + trafficRouting: + # Simple service-based traffic splitting (native K8s round-robin) + # For more advanced traffic splitting, install Istio or Linkerd + {} + + # Rollout revision history + revisionHistoryLimit: 3 + + # Pod template specification (same as Deployment) + selector: + matchLabels: + app: tracing-demo-frontend + template: + metadata: + labels: + app: tracing-demo-frontend + component: frontend + spec: + containers: + - name: frontend + image: registry.lan.buetow.org:30001/tracing-demo-frontend:latest + imagePullPolicy: IfNotPresent + ports: + - containerPort: 5000 + name: http + protocol: TCP + env: + - name: MIDDLEWARE_URL + value: "http://middleware-service.services.svc.cluster.local:5001" + - name: OTEL_EXPORTER_OTLP_ENDPOINT + value: 
"http://alloy.monitoring.svc.cluster.local:4317" + resources: + limits: + cpu: 200m + memory: 256Mi + requests: + cpu: 100m + memory: 128Mi + livenessProbe: + httpGet: + path: /health + port: 5000 + initialDelaySeconds: 10 + periodSeconds: 10 + readinessProbe: + httpGet: + path: /health + port: 5000 + initialDelaySeconds: 5 + periodSeconds: 5 diff --git a/f3s/tracing-demo/rollout-demo.sh b/f3s/tracing-demo/rollout-demo.sh new file mode 100755 index 0000000..78587be --- /dev/null +++ b/f3s/tracing-demo/rollout-demo.sh @@ -0,0 +1,57 @@ +#!/bin/bash +# Quick Argo Rollouts demo script for tracing-demo frontend +# This script automates the demo workflow + +set -e + +NAMESPACE="services" +ROLLOUT_NAME="tracing-demo-frontend" +KUBE_CTX="$(kubectl config current-context)" + +echo "===============================================" +echo "Argo Rollouts Demo for tracing-demo Frontend" +echo "===============================================" +echo "" +echo "Cluster: $KUBE_CTX" +echo "Namespace: $NAMESPACE" +echo "Rollout: $ROLLOUT_NAME" +echo "" + +# Check if rollout exists +if ! kubectl get rollout "$ROLLOUT_NAME" -n "$NAMESPACE" &>/dev/null; then + echo "ERROR: Rollout $ROLLOUT_NAME not found in $NAMESPACE namespace" + echo "Make sure:" + echo " 1. Argo Rollouts controller is installed (kubectl get pods -n cicd | grep argo-rollouts)" + echo " 2. tracing-demo is deployed (kubectl get rollout -n $NAMESPACE)" + exit 1 +fi + +# Check if kubectl argo rollouts plugin is installed +if ! 
kubectl argo rollouts version &>/dev/null; then + echo "WARNING: kubectl argo rollouts plugin not installed" + echo "Install it with:" + echo " curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64" + echo " sudo install -m 755 kubectl-argo-rollouts-linux-amd64 /usr/local/bin/kubectl-argo-rollouts" + echo "" +fi + +echo "Step 1: Display current rollout status" +echo "=======================================" +kubectl argo rollouts status "$ROLLOUT_NAME" -n "$NAMESPACE" +echo "" + +echo "Step 2: Start watching rollout (Press Ctrl+C to stop)" +echo "======================================================" +echo "This will show real-time rollout progress..." +echo "" +echo "In another terminal, run:" +echo " kubectl patch rollout $ROLLOUT_NAME -n $NAMESPACE --type='json' -p='[{\"op\":\"replace\",\"path\":\"/spec/template/spec/containers/0/image\",\"value\":\"registry.lan.buetow.org:30001/tracing-demo-frontend:latest\"}]'" +echo "" +echo "Or commit and push a change to git to trigger via ArgoCD" +echo "" + +kubectl argo rollouts get rollout "$ROLLOUT_NAME" -n "$NAMESPACE" --watch + +echo "" +echo "Demo Complete!" +echo "" |
