author: Paul Buetow <paul@buetow.org>, 2026-05-06 09:35:55 +0300
committer: Paul Buetow <paul@buetow.org>, 2026-05-06 09:35:55 +0300
commit: fbb7c9a9ad8d03d5d095ac441a58b37537e0ab8d
tree: 2ccb042e90ca3ed99e13d9e7bf36948e7e362936 /docs
parent: 3b20f2c4d16c7b7f583e9ab2b51213e9ddc94fd5
add Dockerfile and Rocky Linux 9 build docs

Introduces a Docker-based build path so ior can be compiled on any Linux host
without a native Rocky 9 toolchain setup:

- Dockerfile: Rocky 9 minimal image with Go (version from ARG, default from
  go.mod), static libelf/libzstd built from source, libbpfgo at
  v0.9.2-libbpf-1.5.1, and mage; CMD runs mage generate + mage all against the
  repo root mounted as a volume.
- scripts/build-with-docker.sh: reads GO_VERSION from go.mod, passes it as
  --build-arg to docker build, mounts tracefs and BTF into the container,
  writes the binary to the repo root.
- Magefile.go: adds BuildDocker target that wraps the script.
- README.md: simplified to the two build paths (Docker + native) with links to
  docs/; removed GOTOOLCHAIN=auto throughout.
- docs/build-rocky-linux-9.md: full manual Rocky 9 steps, libbpfgo toolchain
  setup/rollback, compile-once-run-everywhere explanation, and timing
  semantics.
- docs/tui-reference.md: complete TUI hotkey reference, recording mode
  details, and the .ior.zst vs Parquet trade-off table.
- AGENTS.md: removed GOTOOLCHAIN=auto from all build commands.
- internal/c/generated_tracepoints.c: regenerated against the host kernel.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Diffstat (limited to 'docs')
-rw-r--r-- docs/build-rocky-linux-9.md | 173
-rw-r--r-- docs/tui-reference.md | 175
2 files changed, 348 insertions, 0 deletions
diff --git a/docs/build-rocky-linux-9.md b/docs/build-rocky-linux-9.md
new file mode 100644
index 0000000..424b78e
--- /dev/null
+++ b/docs/build-rocky-linux-9.md
@@ -0,0 +1,173 @@
+# Building ior on Rocky Linux 9
+
+Verified on a fresh Rocky Linux 9.7 install (kernel `5.14.0-611.5.1.el9_7` or
+newer). Runs on the **stock RHEL 9 kernel** — no kernel upgrade needed.
+
+One build-time caveat: Rocky 9 ships neither `libelf.a` nor `libzstd.a` (there
+are no `*-static` packages for either). Both must be built from source.
+
+> Historical note. Earlier versions of `ior` typed BPF tracepoint context as
+> `struct trace_event_raw_sys_enter`/`_exit` (the BTF-emitted alias). RHEL 9
+> backports an `rt`-tree patch that adds `preempt_lazy_count` to `struct
+> trace_entry`, which widens those aliases by 8 bytes and shifts the `args`/`ret`
+> offsets — but the actual context the kernel hands the program is still
+> `struct syscall_trace_enter`/`_exit`, where the offsets did not move. The
+> verifier saw the program reading past `max_ctx_offset` and rejected the
+> attach with `EACCES`. `ior` now uses `syscall_trace_*` directly (matching
+> the [bcc fix](https://github.com/iovisor/bcc/pull/4920) and inspektor-gadget),
+> so the stock kernel works with no workaround.
+
+## Docker build (no Rocky 9 host required)
+
+This is the easiest path: the whole build runs inside a container, so any
+Docker-capable Linux host will do.
+
+```shell
+mage buildDocker
+# or directly:
+./scripts/build-with-docker.sh
+# skip image rebuild on subsequent runs:
+./scripts/build-with-docker.sh --run
+```
+
+`mage buildDocker` builds an `ior-builder:rocky9` image on first run (~15–20 min),
+then runs it with the repo root mounted as a volume so the resulting static
+binary lands at `./ior`.
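+
+As a rough sketch of that flow (read the Go version from `go.mod`, pass it as
+a build argument, then run the image with the repo, BTF, and tracefs mounted),
+the functions below print the two Docker commands instead of running them.
+The function names, the `/src` mount point, and the tracefs path are
+assumptions for illustration; `scripts/build-with-docker.sh` remains the
+authoritative version.
+
+```shell
+# Hypothetical helper: print the version named by the "go" directive in a go.mod file.
+go_version_from_mod() {
+    awk '$1 == "go" { print $2; exit }' "$1"
+}
+
+# Hypothetical dry run: print the docker commands rather than executing them.
+print_docker_commands() {
+    repo_root=$1
+    go_version=$(go_version_from_mod "$repo_root/go.mod")
+    echo "docker build --build-arg GO_VERSION=$go_version -t ior-builder:rocky9 $repo_root"
+    # The mount targets below are assumed, not taken from the real script.
+    echo "docker run --rm -v $repo_root:/src" \
+        "-v /sys/kernel/btf:/sys/kernel/btf:ro" \
+        "-v /sys/kernel/tracing:/sys/kernel/tracing" \
+        "ior-builder:rocky9"
+}
+```
+
+Calling `print_docker_commands "$PWD"` from the repo root shows what would be
+run without touching Docker.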
+
+## Manual build on a Rocky Linux 9 host
+
+```shell
+# 1) Enable repos and install build dependencies (CRB provides the glibc-static and zlib-static packages).
+sudo dnf config-manager --set-enabled crb
+sudo dnf install -y epel-release
+sudo dnf install -y gcc clang bpftool elfutils-libelf-devel zlib-static \
+ glibc-static libzstd-devel git make cmake wget rpmdevtools strace bpftrace
+sudo dnf builddep -y elfutils
+
+# 2) Install Go 1.26 from go.dev (Rocky 9 ships only Go 1.25; ior needs 1.26+).
+cd /tmp
+wget -q https://go.dev/dl/go1.26.2.linux-amd64.tar.gz
+sudo tar -C /usr/local -xf go1.26.2.linux-amd64.tar.gz
+echo 'export PATH=/usr/local/go/bin:$HOME/go/bin:$PATH' | sudo tee /etc/profile.d/go.sh
+source /etc/profile.d/go.sh
+
+# 3) Build libelf.a from elfutils source.
+mkdir -p ~/src && cd ~
+dnf download --source elfutils-libelf
+rpm -ivh elfutils-*.src.rpm
+tar -C ~/src -xjf rpmbuild/SOURCES/elfutils-*.tar.bz2
+cd ~/src/elfutils-*
+./configure --enable-deterministic-archives --disable-debuginfod --disable-libdebuginfod
+make -C lib -j$(nproc)
+make -C libelf -j$(nproc)
+sudo cp -v libelf/libelf.a /usr/lib64/
+
+# 4) Build libzstd.a from upstream (libzstd-devel does not ship the static archive).
+cd /tmp
+wget -q https://github.com/facebook/zstd/releases/download/v1.5.5/zstd-1.5.5.tar.gz
+tar xzf zstd-1.5.5.tar.gz
+make -C zstd-1.5.5/lib -j$(nproc) libzstd.a
+sudo cp -v zstd-1.5.5/lib/libzstd.a /usr/lib64/
+
+# 5) Clone ior + libbpfgo, pin libbpfgo, build the static archive, install mage.
+mkdir -p ~/git
+git clone https://codeberg.org/snonux/ior ~/git/ior
+git clone https://github.com/aquasecurity/libbpfgo ~/git/libbpfgo
+git -C ~/git/libbpfgo checkout v0.9.2-libbpf-1.5.1
+git -C ~/git/libbpfgo submodule update --init --recursive
+make -C ~/git/libbpfgo libbpfgo-static
+go install github.com/magefile/mage@latest
+
+# 6) Generate against the live kernel and build.
+# IOR_FORCE_GENERATE=1 skips the strict diff against the committed audit
+# (generated on a different kernel build).
+cd ~/git/ior
+IOR_FORCE_GENERATE=1 mage generate
+mage all
+
+# 7) Smoke test.
+sudo ./ior -plain -duration 5
+```
+
+If `./ior -plain -duration 5` prints `Probing for 5s` and a stream of CSV rows,
+the install is good.
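+
+If the link step fails instead, one likely cause is a static archive from
+steps 3 and 4 that never reached `/usr/lib64`. A minimal check, assuming the
+install locations used above (the function name is illustrative):
+
+```shell
+# Print the name of each required static archive missing from the given directory.
+missing_static_archives() {
+    dir=${1:-/usr/lib64}
+    for lib in libelf.a libzstd.a; do
+        [ -f "$dir/$lib" ] || echo "$lib"
+    done
+}
+```
+
+Empty output from `missing_static_archives` means both archives are in place.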
+
+## libbpfgo toolchain
+
+`ior` links against a locally built `libbpfgo` checkout. By default
+`Magefile.go` expects that checkout at `../libbpfgo` relative to this repo; set
+`LIBBPFGO=/absolute/path/to/libbpfgo` to override.
+
+Pin that checkout to `v0.9.2-libbpf-1.5.1` and rebuild the static artifacts
+before running `mage` targets:
+
+```shell
+git -C ../libbpfgo checkout v0.9.2-libbpf-1.5.1
+git -C ../libbpfgo submodule update --init --recursive
+make -C ../libbpfgo libbpfgo-static
+```
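+
+To confirm which checkout and pin a build will see, something like the
+following can help. It is a sketch under two assumptions: the `LIBBPFGO`
+fallback mirrors the default described above, and `HEAD` sits exactly on the
+tag (so `git describe --tags --exact-match` reports it). Both function names
+are hypothetical.
+
+```shell
+# Resolve the libbpfgo checkout: LIBBPFGO if set, else ../libbpfgo.
+libbpfgo_dir() {
+    echo "${LIBBPFGO:-../libbpfgo}"
+}
+
+# Compare the checked-out tag with the expected pin; run from the ior repo root.
+check_libbpfgo_pin() {
+    want=${1:-v0.9.2-libbpf-1.5.1}
+    have=$(git -C "$(libbpfgo_dir)" describe --tags --exact-match 2>/dev/null) \
+        || have="(no exact tag)"
+    if [ "$have" = "$want" ]; then
+        echo "libbpfgo pinned at $want"
+    else
+        echo "libbpfgo is at $have, expected $want" >&2
+        return 1
+    fi
+}
+```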
+
+Validated commands for this pin:
+
+```shell
+mage world
+mage integrationTest
+```
+
+Troubleshooting and rollback:
+
+- If builds fail with `bpf/bpf.h` missing, re-run the checkout, submodule
+ update, and `make libbpfgo-static` commands above, then retry
+ `mage world`.
+- Prefer Mage targets over raw `go test` for packages that import `libbpfgo`;
+ Mage injects the required `CGO_CFLAGS`, `CGO_LDFLAGS`, and `LIBBPFGO` values.
+- To roll back to the previous pin, check out commit `90dbffffbdab`
+ (`v0.6.0-libbpf-1.3.0.20240111220235-90dbffffbdab`) and rebuild:
+
+```shell
+git -C ../libbpfgo checkout 90dbffffbdab
+git -C ../libbpfgo submodule update --init --recursive
+make -C ../libbpfgo libbpfgo-static
+```
+
+## Compile once, run everywhere
+
+The full build dance above only has to happen on **one** machine. The resulting
+`ior` binary is portable across Linux hosts: `scp ior other-host:/usr/local/bin/`
+and run it there.
+
+Two reasons it works:
+
+- The Go binary is compiled with `-extldflags "-static"` and links libbpf,
+ libelf, libzstd, and zlib as static archives. There is no runtime dependency
+ on the build host's library versions (a couple of glibc resolver functions —
+ `getpwnam_r` and friends — fall back to the target's libc, which is fine on
+ any reasonable distro).
+- The BPF object inside the binary is built with libbpf's CO-RE
+ (Compile-Once, Run-Everywhere) machinery. Field offsets are not baked into
+ the bytecode; libbpf reads the target kernel's BTF (`/sys/kernel/btf/vmlinux`)
+ at load time and patches the program for that kernel. As long as the target
+ ships BTF (true of current Debian, Ubuntu, Fedora, Arch, RHEL, and ELRepo
+ `kernel-ml` kernels at the time of writing), the same `ior` binary runs
+ without recompilation.
+
+Pick one Rocky 9 / Fedora box, do the build dance once, then distribute the
+23 MB binary to wherever you want to trace. The build host needs all the dev
+tooling; the trace hosts need only a BTF-enabled kernel and `sudo`.
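+
+Before copying the binary to a trace host, a quick preflight can confirm the
+BTF requirement. This is a minimal sketch; the function names are
+illustrative, and the only thing checked is that the BTF file named above is
+readable.
+
+```shell
+# Succeed if kernel BTF is readable at the given path (default: the standard sysfs location).
+has_btf() {
+    [ -r "${1:-/sys/kernel/btf/vmlinux}" ]
+}
+
+# Hypothetical preflight for a trace host.
+preflight() {
+    if has_btf "$@"; then
+        echo "BTF found"
+    else
+        echo "no BTF: libbpf has nothing to relocate against on this kernel" >&2
+        return 1
+    fi
+}
+```
+
+Run `preflight` with no arguments on the target; a non-zero exit status means
+the kernel was built without BTF.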
+
+## Timing semantics
+
+Each reported event pair has two timing counters:
+
+- `durationNs`: syscall runtime on the same thread (`exit(current) - enter(current)`).
+- `durationToPrevNs`: inter-syscall gap on the same thread (`enter(current) - exit(previous)`).
+
+Important details:
+
+- `durationToPrevNs` is tracked per `tid` (thread), not globally across all threads.
+- The first observed syscall pair for a thread has `durationToPrevNs = 0` because
+ there is no prior exit timestamp.
+- `durationToPrevNs` is attributed to the current syscall pair (the one whose
+ `enter` closes the gap).
+- There is no separate "idle" pseudo-event bucket; use the `durationToPrev` count
+ field when aggregated flamegraph output should emphasize inter-syscall time.
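+
+A worked example of the two formulas, using made-up nanosecond timestamps for
+two consecutive syscalls on one thread. The function is only an arithmetic
+illustration, not how `ior` computes the values internally.
+
+```shell
+# Print both counters for one syscall pair.
+# Arguments: previous exit timestamp (empty for the thread's first pair),
+# then the current enter and exit timestamps, all in nanoseconds.
+syscall_timings() {
+    prev_exit=$1 enter=$2 exit_ts=$3
+    duration=$((exit_ts - enter))
+    if [ -n "$prev_exit" ]; then
+        gap=$((enter - prev_exit))
+    else
+        gap=0   # no prior exit timestamp on this thread
+    fi
+    echo "durationNs=$duration durationToPrevNs=$gap"
+}
+
+syscall_timings ""   1500 1800   # first pair on the thread
+syscall_timings 1800 2400 2450   # second pair on the same thread
+```
+
+The first call prints `durationNs=300 durationToPrevNs=0`; the second prints
+`durationNs=50 durationToPrevNs=600`, with the 600 ns gap attributed to the
+second pair.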
diff --git a/docs/tui-reference.md b/docs/tui-reference.md
new file mode 100644
index 0000000..d6a7266
--- /dev/null
+++ b/docs/tui-reference.md
@@ -0,0 +1,175 @@
+# TUI Reference
+
+## TUI Flamegraphs
+
+Flamegraphs are available only inside the TUI dashboard.
+Use `-fields` to change the stack order and `-count` to choose the metric.
+The default stack order is `comm,path,tracepoint` (bottom to top).
+
+## Recording Modes
+
+`ior` has four distinct output flows:
+
+| Mode | How to use it | What it writes | Filter behavior |
+| --- | --- | --- | --- |
+| TUI dashboard | default startup | nothing continuously; data stays in memory unless you export | current TUI/global filters drive what you see |
+| TUI CSV snapshot export | press `e` in the dashboard | one `ior-stream-<timestamp>.csv` snapshot of the current filtered stream view | exports only the currently filtered in-memory rows |
+| Headless `.ior.zst` export | start with `-flamegraph -name <name>` | one aggregated native trace artifact written at shutdown | no TUI filter stack; this is the native trace/integration workflow |
+| Parquet recording | press `R` in the TUI, or start with `-parquet <file>` | a streaming Parquet file of traced syscall rows | TUI mode records rows that pass the active TUI filter; headless `-parquet` records all traced rows |
+
+Important distinction:
+
+- `.ior.zst` output is an aggregated native artifact, not a row-by-row event log.
+- CSV export is a point-in-time snapshot of the ring buffer.
+- Parquet recording is a streaming capture from start to stop.
+- The ring buffer is capped, so CSV export is not a replacement for Parquet recording or `.ior.zst` output.
+
+### Headless Native `.ior.zst` Output
+
+Use `-flamegraph` when you want the native `ior` trace artifact instead of a streaming row log:
+
+```shell
+sudo ./ior -flamegraph -name trace-run -duration 60
+```
+
+Native `.ior.zst` behavior:
+
+- writes one `*.ior.zst` file when the run ends
+- stores aggregated counters for repeated syscall/path/process combinations
+- is intended for `ior`'s native flamegraph and integration-style workflows
+- does not preserve one output row per traced syscall
+
+### TUI Parquet Recording
+
+Start a recording from the dashboard with `R`.
+
+- First `R`: open a filename prompt (`ior-recording-<timestamp>.parquet` by default).
+- `Enter`: start recording to that file.
+- Second `R`: stop and finalize the active Parquet file.
+- Recording stops automatically when you quit the TUI or reselect PID/TID/session scope.
+
+Lifecycle details:
+
+- TUI recording uses the active TUI global filter at emission time.
+- If a filter change restarts tracing, the recorder stays alive and continues writing matching rows after the restart.
+- The dashboard footer shows the active recording path or the last recording error.
+
+### Headless Parquet Recording
+
+Use `-parquet` to skip the TUI and stream traced syscall rows directly to a Parquet file:
+
+```shell
+sudo ./ior -parquet trace.parquet -duration 60
+```
+
+Headless Parquet mode behavior:
+
+- skips the TUI completely
+- records all traced rows
+- rejects content filters such as `-comm`, `-path`, `-pid`, and `-tid`
+- cannot be combined with `-plain`, `-flamegraph`, `--testflames`, or `--testliveflames`
+
+Use headless mode when you want a full recording, and TUI mode when you want interactive filtering plus optional start/stop recording from the dashboard.
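+
+A wrapper script can catch the conflicting flags listed above before invoking
+`ior`. The helper below is a hypothetical pre-check that encodes only that
+list; `ior` itself performs the authoritative validation.
+
+```shell
+# Succeed only if no argument conflicts with headless -parquet mode.
+parquet_args_ok() {
+    for arg in "$@"; do
+        case $arg in
+            -*) name=${arg#-}; name=${name#-}; name=${name%%=*} ;;   # strip dashes and any =value
+            *) continue ;;                                           # flag values are not checked
+        esac
+        case $name in
+            comm|path|pid|tid|plain|flamegraph|testflames|testliveflames)
+                echo "conflicts with -parquet: $arg" >&2
+                return 1 ;;
+        esac
+    done
+    return 0
+}
+```
+
+For example, `parquet_args_ok -parquet trace.parquet -duration 60` succeeds,
+while adding `-comm nginx` makes it fail.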
+
+### Choosing Between `.ior.zst` and Parquet
+
+| Question | Native `.ior.zst` | Parquet |
+| --- | --- | --- |
+| Data shape | aggregated counters | one row per traced syscall |
+| Write pattern | collect in memory, write one compressed artifact at the end | stream rows continuously while recording |
+| Best for | `ior`-native trace artifacts, flamegraph workflows, integration assertions | offline analysis in other tools, long captures, preserving per-event detail |
+| Relative write cost | usually lower because repeated events are folded together before file write | usually higher because each traced row is serialized |
+| Detail retained | loses original row order and per-event granularity | keeps per-event timing and syscall fields |
+
+Rule of thumb:
+
+- choose `.ior.zst` when you want the native `ior` artifact and do not need every traced syscall row preserved
+- choose Parquet when you want a full event stream for downstream analysis outside `ior`
+
+## TUI Navigation
+
+The TUI has an in-screen help panel (toggle with **H**) that lists all available
+keys. Use it to discover shortcuts without consulting this document.
+
+Dashboard tabs:
+
+- **tab** / **shift+tab** — next / previous tab
+- **1** — Overview
+- **2** — Syscalls
+- **3** — Files
+- **4** — Processes
+- **5** — Latency+Gaps
+- **6** — Stream
+
+The TUI has two key scopes:
+
+- Global hotkeys: available from any dashboard screen.
+- Dashboard hotkeys: available only in specific tabs (especially `6:Stream`).
+
+### Global Hotkeys
+
+- `tab` / `shift+tab`: cycle tabs.
+- `1`–`6`: jump to tab by number (`7` is an alias for `6`).
+- `e`: export filtered stream rows to CSV (`ior-stream-<timestamp>.csv`).
+- `R`: start or stop Parquet recording.
+- `p`: re-open process selector (PID selection flow).
+- `t`: open TID selector flow.
+- `o`: open probe selection/toggling dialog.
+- `r`: refresh dashboard snapshot.
+- `H`: toggle bottom help sections on/off.
+- `q` or `ctrl+c`: quit.
+
+### Dashboard / Tab-Specific Hotkeys
+
+- `d` in `3:Files`: toggle directory-grouped files view.
+- `s` in sortable tabs (`2:Syscalls`, `3:Files`, `4:Processes`): sort by selected column.
+- `S` in sortable tabs: reverse-sort by selected column.
+- `j/k` or `up/down` in list tabs: scroll list.
+
+`left/right` and `h/l` do not switch tabs. In `6:Stream` pause mode they move the selected column.
+
+### 6:Stream Hotkeys and Behavior
+
+`6:Stream` has two modes:
+
+- Live mode (`paused=false`): rows update continuously.
+- Pause mode (`paused=true`): selection/cell/filter/search/export workflows are enabled.
+
+Core controls:
+
+- `space`: toggle live/pause.
+- `g`/`G`: jump to top/tail.
+- `c`: clear stream filters.
+- `f`: open advanced filter modal.
+- `j/k` or `up/down`: move selected row (pause) or scroll (live).
+- `left/right` or `h/l`: move selected column in pause mode.
+
+#### Enter-Based Filter Stack (Pause Mode)
+
+In pause mode, `enter` on the selected cell pushes a filter onto a stack and
+immediately re-filters the current ring buffer snapshot. Filters are stackable.
+
+- String columns use case-insensitive substring match:
+ - `Comm` → `comm~<value>`
+ - `Syscall` → `syscall~<value>`
+ - `File` → `file~<value>`
+- Numeric exact match: `PID`, `TID`, `FD`, `Ret`, `Bytes`
+- Numeric threshold (`>=`): `Latency` → `latency>=selected_value`, `Gap` → `gap>=selected_value`
+
+`esc` in pause mode pops the most recent filter (LIFO); repeated `esc` undoes
+all stacked filters.
+
+#### Regex Search (Pause Mode)
+
+- `/`: search forward; `?`: search backward.
+- Search checks all stream columns and wraps around the ring buffer.
+- `n` / `N`: next / previous match.
+
+#### Stream CSV Export (Pause Mode)
+
+- `x`: quick export filtered stream rows to CSV.
+- `X`: export with filename prompt.
+- `E`: open last stream-exported CSV in foreground editor (`EDITOR` → `VISUAL` → `SUDO_EDITOR` → `hx` → `vi`).
+
+`e` (global) exports a fresh filtered snapshot even outside pause mode; `x`/`X`
+export the exact paused view.
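+
+The editor fallback order used by `E` can be reproduced in a shell to see
+which editor would open on a given host. This is an approximation: the
+function name is hypothetical, and whether `ior` checks that `hx` is on
+`PATH` before falling back to `vi` is an assumption.
+
+```shell
+# Print the editor chosen by the order EDITOR, VISUAL, SUDO_EDITOR, hx, vi.
+pick_editor() {
+    for candidate in "${EDITOR:-}" "${VISUAL:-}" "${SUDO_EDITOR:-}"; do
+        if [ -n "$candidate" ]; then
+            echo "$candidate"
+            return 0
+        fi
+    done
+    if command -v hx >/dev/null 2>&1; then
+        echo hx
+    else
+        echo vi
+    fi
+}
+```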