author: Paul Buetow <paul@buetow.org> 2026-01-15 21:15:12 +0200
committer: Paul Buetow <paul@buetow.org> 2026-01-15 21:15:12 +0200
commit: 6af78382365a83ba1a5b3c786179fac6080bc179
tree: f6461e30034416f77882c05a332399af732a218c /f3s/ROLLOUTS-CHECKLIST.md
parent: f1e8230fa0b5b7569f592e266051adf77b733c6b
docs: update all ROLLOUT*.md files with 1-min 33% canary details
Diffstat (limited to 'f3s/ROLLOUTS-CHECKLIST.md')
-rw-r--r-- f3s/ROLLOUTS-CHECKLIST.md 283
1 files changed, 158 insertions, 125 deletions
diff --git a/f3s/ROLLOUTS-CHECKLIST.md b/f3s/ROLLOUTS-CHECKLIST.md
index b32f1ac..b475f2d 100644
--- a/f3s/ROLLOUTS-CHECKLIST.md
+++ b/f3s/ROLLOUTS-CHECKLIST.md
@@ -1,189 +1,222 @@
# Argo Rollouts Deployment Checklist
-## Pre-Deployment Setup
-
-- [ ] Read `ARGO-ROLLOUTS-SUMMARY.md` to understand what was created
-- [ ] Ensure kubectl access to f3s cluster
-- [ ] Ensure ArgoCD is running and accessible
-- [ ] Git repository (conf.git) synced to git-server
+Quick checklist for deploying and testing Argo Rollouts with canary demo.
## Installation
+- [ ] Read `ARGO-ROLLOUTS-SUMMARY.md` - understand what was created
+- [ ] Ensure kubectl access to f3s cluster
+- [ ] Ensure ArgoCD is running
- [ ] Navigate to `/home/paul/git/conf/f3s/argo-rollouts`
-- [ ] Run `just install` to deploy controller
-- [ ] Verify controller running: `kubectl get pods -n cicd -l app.kubernetes.io/name=argo-rollouts`
-- [ ] Verify CRD installed: `kubectl get crd | grep rollout`
-
-## Optional: Install kubectl Plugin
-
-- [ ] Download kubectl-argo-rollouts:
+- [ ] Run `just install`
+- [ ] Verify controller: `kubectl get pods -n cicd -l app.kubernetes.io/name=argo-rollouts`
+- [ ] Verify CRD: `kubectl get crd | grep rollout`
+- [ ] (Optional) Install plugin:
```bash
curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64
chmod +x kubectl-argo-rollouts-linux-amd64
sudo install -m 755 kubectl-argo-rollouts-linux-amd64 /usr/local/bin/kubectl-argo-rollouts
+ kubectl argo rollouts version
```
-- [ ] Verify: `kubectl argo rollouts version`
-## ArgoCD Syncing
+## ArgoCD Integration
-- [ ] Create/push `argocd-apps/cicd/argo-rollouts.yaml` to git
-- [ ] Create/push `argocd-apps/services/tracing-demo.yaml` updates to git
-- [ ] Force ArgoCD sync (wait 3 min or manual):
+- [ ] Push changes to git-server:
+ ```bash
+ cd /home/paul/git/conf/f3s
+ git add -A && git commit -m "feat: add Argo Rollouts"
+ git push r0 master
+ ```
+- [ ] Verify ArgoCD app:
```bash
- argocd app sync argo-rollouts
- argocd app sync tracing-demo
+ kubectl get application argo-rollouts -n cicd
+ argocd app get argo-rollouts
+ ```
+- [ ] Verify tracing-demo app:
+ ```bash
+ kubectl get application tracing-demo -n cicd
+ argocd app get tracing-demo
```
-- [ ] Verify tracing-demo application status: `argocd app get tracing-demo`
## Rollout Verification
-- [ ] Check frontend rollout deployed: `kubectl get rollout tracing-demo-frontend -n services`
+- [ ] Check rollout exists: `kubectl get rollout tracing-demo-frontend -n services`
- [ ] Verify status: `kubectl describe rollout tracing-demo-frontend -n services`
-- [ ] Expected: `Status: Healthy` with `2/2 replicas` in stable state
-- [ ] Check pods running: `kubectl get pods -n services -l app=tracing-demo-frontend`
+- [ ] Expected: `Status: Healthy` with `3/3 replicas` in stable state
+- [ ] Check pods: `kubectl get pods -n services -l app=tracing-demo-frontend`
+- [ ] All 3 pods should be `Running`
+
+## Demo: Basic Canary Rollout
-## Basic Demo (First Time)
+**Expected: 0-15s: canary starting, 15-60s: observing, 60-90s: promoting**
### Terminal 1: Watch Rollout
```bash
cd /home/paul/git/conf/f3s/tracing-demo
just rollout-watch
```
-- [ ] Command running and connected
+- [ ] Command runs and connects to cluster
+- [ ] Waiting for rollout to start
-### Terminal 2: Generate Load (Optional)
+### Terminal 2: Trigger Rollout
```bash
-cd /home/paul/git/conf/f3s/tracing-demo
-just load-test &
+kubectl patch rollout tracing-demo-frontend -n services \
+ --type='json' \
+ -p='[{"op":"add","path":"/spec/template/spec/containers/0/env/-","value":{"name":"ROLLOUT_V","value":"'$(date +%s)'"}}]'
```
-- [ ] Requests being sent to frontend
+- [ ] Patch command successful
+- [ ] Terminal 1 shows change immediately
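The trigger command above splices `$(date +%s)` into a single-quoted JSON string, which is easy to mistype. A minimal sketch of a helper that builds the same payload follows; the function name `rollout_patch_json` is hypothetical and not part of the repository. A JSON Patch `add` on `/env/-` appends to an existing array, so this assumes the first container already defines an `env` list.

```bash
# Hypothetical helper: print a JSON Patch that appends one env var to the
# first container of the pod template. Name and value must not contain
# double quotes or backslashes (no JSON escaping is attempted here).
rollout_patch_json() {
  local name="$1" value="$2"
  printf '[{"op":"add","path":"/spec/template/spec/containers/0/env/-","value":{"name":"%s","value":"%s"}}]\n' \
    "$name" "$value"
}

# Usage (intended to be equivalent to the patch command above):
#   kubectl patch rollout tracing-demo-frontend -n services \
#     --type='json' -p="$(rollout_patch_json ROLLOUT_V "$(date +%s)")"
```

Keeping the payload in one `printf` format string means only the variable name and timestamp change between demo runs.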
-### Terminal 3: Trigger Rollout
-Choose one method:
+### Terminal 1: Observe Progress
+- [ ] See `Step: 0/3, SetWeight: 33`
+- [ ] 1 canary pod becoming ready
+- [ ] 3 stable pods still running
+- [ ] After ~15 sec: canary pod ready
+- [ ] After ~60 sec: auto-promotion starts
+- [ ] After ~90 sec: all 3 pods running new version
+- [ ] Status shows `Healthy`
+
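The timing in the "Expected" line of this demo can be turned into a small lookup for use while watching Terminal 1. This is only a sketch: `expected_phase` is a hypothetical helper, and the boundaries are the approximate ones from this checklist, not values read from the Rollout spec.

```bash
# Hypothetical helper: map seconds since the patch was applied to the phase
# this checklist expects (0-15s, 15-60s, 60-90s, then Healthy).
expected_phase() {
  local elapsed="$1"
  if [ "$elapsed" -lt 15 ]; then
    echo "canary starting"
  elif [ "$elapsed" -lt 60 ]; then
    echo "observing"
  elif [ "$elapsed" -lt 90 ]; then
    echo "promoting"
  else
    echo "healthy"
  fi
}

# Usage: record start=$(date +%s) right after the patch, then run
#   expected_phase $(( $(date +%s) - start ))
```

If the phase shown by `just rollout-watch` lags well behind what this prints, the troubleshooting section is the place to look next.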
+## Demo: Abort/Rollback
+
+**Expected: Stop rollout and keep old version running**
-**Method A: Kubectl Patch (Fastest)**
+### Terminal 1: Watch Rollout
+```bash
+just rollout-watch
+```
+
+### Terminal 2: Trigger Rollout
```bash
kubectl patch rollout tracing-demo-frontend -n services \
--type='json' \
- -p='[{"op":"replace","path":"/spec/template/spec/containers/0/image","value":"registry.lan.buetow.org:30001/tracing-demo-frontend:latest"}]'
+ -p='[{"op":"add","path":"/spec/template/spec/containers/0/env/-","value":{"name":"ROLLOUT_V2","value":"'$(date +%s)'"}}]'
+```
+
+### Terminal 3: Abort at Canary Step (after 20 seconds)
+```bash
+cd /home/paul/git/conf/f3s/tracing-demo
+just rollout-abort
```
-- [ ] Executed successfully
+- [ ] Abort command accepted
+- [ ] Terminal 1 shows `Status: Aborted`
+- [ ] Canary pods terminate
+- [ ] Old 3 pods continue running
+- [ ] Verify with: `just rollout-status`
-**Method B: Git + ArgoCD (Most GitOps)**
+## Demo: Load Testing
+
+**Expected: Generate traffic while rollout happens**
+
+### Terminal 1: Watch Rollout
+```bash
+just rollout-watch
+```
+
+### Terminal 2: Start Load Test
```bash
-cd /home/paul/git/conf/f3s
-# Edit tracing-demo/helm-chart/templates/frontend-rollout.yaml (change image tag)
-git add -A
-git commit -m "chore: update frontend image for demo"
-git remote add r0 ssh://git@r0:30022/repos/conf.git 2>/dev/null || true
-git push r0 master
-kubectl annotate application tracing-demo -n cicd argocd.argoproj.io/refresh=normal --overwrite
+just load-test &
+```
+- [ ] Requests being sent
+
+### Terminal 3: Trigger Rollout
+```bash
+kubectl patch rollout tracing-demo-frontend -n services \
+ --type='json' \
+ -p='[{"op":"add","path":"/spec/template/spec/containers/0/env/-","value":{"name":"ROLLOUT_V3","value":"'$(date +%s)'"}}]'
```
-- [ ] Git push successful
-- [ ] ArgoCD syncing (check web UI or CLI)
+- [ ] Rollout progresses with active traffic
+- [ ] Both old and new pods serve requests during canary phase
-## Demo Observation
+## Monitoring
-- [ ] Terminal 1 shows: "Progressing" → "canary step 1/3"
-- [ ] After ~30 sec: New canary pod appears
-- [ ] After ~2 min: "canary step 2/3" (pause)
-- [ ] After ~4 min: "canary step 3/3" (100% traffic)
-- [ ] After ~4:20 min: Status shows "Healthy"
-- [ ] Old pods terminated, 2 new pods in stable state
+- [ ] Check status: `kubectl argo rollouts status tracing-demo-frontend -n services`
+- [ ] Detailed info: `kubectl argo rollouts describe rollout tracing-demo-frontend -n services`
+- [ ] Pod details: `kubectl get pods -n services -l app=tracing-demo-frontend -o wide`
+- [ ] View logs: `just logs-frontend`
+- [ ] View history: `just rollout-history`
-## Monitoring (Optional)
+## Grafana (Optional)
-- [ ] Check logs: `just logs-frontend`
-- [ ] Check Grafana Tempo for traces: https://grafana.f3s.buetow.org
- - [ ] Navigate to Explore → Tempo
- - [ ] Query: `{ resource.service.name = "frontend" }`
- - [ ] See traces from old and new versions
-- [ ] Check Prometheus metrics: Port-forward and query
+- [ ] Open Grafana: https://grafana.f3s.buetow.org
+- [ ] Navigate to Explore → Tempo datasource
+- [ ] Query: `{ resource.service.name = "frontend" }`
+- [ ] See traces from old and new versions during rollout
-## Advanced Scenarios
+## Integration with Git (GitOps)
-### Scenario 1: Manual Promotion
-- [ ] Trigger rollout (step above)
-- [ ] After step 1 (30 sec), run:
+- [ ] Edit rollout config:
```bash
- just rollout-promote
+ nano /home/paul/git/conf/f3s/tracing-demo/helm-chart/templates/frontend-rollout.yaml
```
-- [ ] Watch rollout skip step 2, immediately promote to 100%
-- [ ] Verify: `just rollout-status` shows "Healthy"
-
-### Scenario 2: Abort/Rollback
-- [ ] Trigger rollout
-- [ ] While progressing, run:
+- [ ] Change any settings (e.g., duration, setWeight)
+- [ ] Commit and push:
```bash
- just rollout-abort
+ git add -A && git commit -m "chore: adjust canary settings"
+ git push r0 master
```
-- [ ] Watch canary pods terminate
-- [ ] Old version continues running
-- [ ] Verify: `just rollout-status` shows "Aborted"
-
-### Scenario 3: Check History
-- [ ] After any rollout:
+- [ ] ArgoCD auto-syncs within 3 minutes (or force):
```bash
- just rollout-history
+ kubectl annotate application tracing-demo -n cicd argocd.argoproj.io/refresh=normal --overwrite
```
-- [ ] See previous revisions and their status
-
-## Integration with CI/CD
+- [ ] New settings take effect on next rollout trigger
-- [ ] Image builds automatically on git push (or configured pipeline)
-- [ ] New image pushed to registry: `registry.lan.buetow.org:30001/tracing-demo-frontend:NEWTAG`
-- [ ] Git updated with new image tag
-- [ ] ArgoCD detects change
-- [ ] Rollout automatically triggered
-- [ ] Canary strategy executes
+## Post-Demo
-## Post-Deployment
+- [ ] Abort any stuck rollouts: `just rollout-abort`
+- [ ] Verify stable state: `just rollout-status` shows `Healthy`
+- [ ] Review documentation:
+ - [ ] `ARGO-ROLLOUTS-SUMMARY.md` - architecture
+ - [ ] `ROLLOUTS-SETUP.md` - detailed scenarios
+ - [ ] `README-ROLLOUTS.md` - quick reference
+ - [ ] `tracing-demo/ROLLOUTS-DEMO.md` - technical details
-- [ ] Share documentation:
- - [ ] `ROLLOUTS-SETUP.md` - Complete setup guide
- - [ ] `tracing-demo/ROLLOUTS-DEMO.md` - Detailed walkthrough
- - [ ] `ARGO-ROLLOUTS-SUMMARY.md` - Architecture overview
-- [ ] Add team to `kubectl argo rollouts` usage
-- [ ] Consider next steps:
- - [ ] Deploy Istio for advanced traffic management
- - [ ] Add Flagger for automated analysis
- - [ ] Extend to other services (middleware, backend)
- - [ ] Create monitoring dashboards
-
-## Troubleshooting Checklist
+## Troubleshooting
### Controller not running
-- [ ] Check pod: `kubectl get pods -n cicd -l app.kubernetes.io/name=argo-rollouts`
-- [ ] Check logs: `kubectl logs -n cicd -l app.kubernetes.io/name=argo-rollouts`
-- [ ] Check CRD: `kubectl get crd | grep rollout`
+```bash
+kubectl get pods -n cicd -l app.kubernetes.io/name=argo-rollouts
+kubectl logs -n cicd -l app.kubernetes.io/name=argo-rollouts
+```
+- [ ] Pod running and ready
-### Rollout not deploying
-- [ ] Check ArgoCD sync: `argocd app get tracing-demo`
-- [ ] Check git changes pushed: `git log --oneline | head -5`
-- [ ] Force sync: `argocd app sync tracing-demo --prune`
+### Rollout not deployed
+```bash
+kubectl get rollout tracing-demo-frontend -n services
+kubectl describe rollout tracing-demo-frontend -n services
+```
+- [ ] Check events section for errors
-### Canary pods not starting
-- [ ] Check pod status: `kubectl describe pod -n services <pod-name>`
-- [ ] Check logs: `kubectl logs -n services <pod-name>`
-- [ ] Check resource limits: `kubectl top pods -n services`
-- [ ] Check image: `kubectl get pods -n services -o jsonpath='{.items[*].spec.containers[0].image}'`
+### Canary pods in ImagePullBackOff
+- [ ] Use env var patch instead (don't change image tag):
+ ```bash
+ kubectl patch rollout tracing-demo-frontend -n services \
+ --type='json' \
+ -p='[{"op":"add","path":"/spec/template/spec/containers/0/env/-","value":{"name":"ROLLOUT_V","value":"'$(date +%s)'"}}]'
+ ```
### Rollout stuck in Progressing
-- [ ] Check health probes: `kubectl get rollout tracing-demo-frontend -n services -o yaml | grep -A 10 health`
-- [ ] Check replica status: `kubectl get rs -n services -l app=tracing-demo-frontend -o wide`
-- [ ] Check controller logs: `kubectl logs -n cicd -l app.kubernetes.io/name=argo-rollouts --tail=50`
+```bash
+kubectl describe rollout tracing-demo-frontend -n services
+kubectl get pods -n services -l app=tracing-demo-frontend
+```
+- [ ] Check pod readiness probes
+- [ ] Check pod resource requests/limits
+- [ ] Check controller logs
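When a rollout is stuck, the pods that are not ready are usually the ones worth describing first. A minimal sketch of a filter over the `kubectl get pods` table output; `not_ready_pods` is a hypothetical helper, and it assumes the default column layout with a header row (do not combine it with `--no-headers`).

```bash
# Hypothetical helper: read `kubectl get pods` table output on stdin and
# print the names of pods whose READY column is not N/N.
not_ready_pods() {
  awk 'NR > 1 { split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}

# Usage:
#   kubectl get pods -n services -l app=tracing-demo-frontend | not_ready_pods
```

Each printed name can then be passed to `kubectl describe pod -n services <pod-name>` to inspect its readiness probe events.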
-## Cleanup (If Needed)
+## Next Steps
-- [ ] Stop rollout: `kubectl argo rollouts abort tracing-demo-frontend -n services`
-- [ ] Rollback to previous: `kubectl rollout undo deployment/tracing-demo-frontend -n services` (if needed)
-- [ ] Uninstall Argo Rollouts: `cd argo-rollouts && just uninstall`
+- [ ] Run through all demo scenarios multiple times
+- [ ] Modify rollout settings and observe behavior
+- [ ] Monitor with Prometheus/Grafana
+- [ ] Extend to other services (middleware, backend)
+- [ ] Optional: Install Istio for advanced traffic routing
+- [ ] Optional: Deploy Flagger for automated analysis
---
-**Setup complete when:**
-- ✅ Argo Rollouts controller running in `cicd` namespace
-- ✅ Frontend rollout deployed in `services` namespace
-- ✅ ArgoCD recognizes rollout resource
-- ✅ One demo run successful (git trigger or kubectl patch)
-- ✅ Team can watch and manage rollouts
+**Setup Complete When:**
+- ✅ Controller running in `cicd` namespace
+- ✅ Rollout deployed in `services` namespace
+- ✅ One full demo executed (0-90 seconds)
+- ✅ Can abort and retry
+- ✅ Team trained on canary deployments