| author | Paul Buetow <paul@buetow.org> | 2026-05-06 09:35:55 +0300 |
|---|---|---|
| committer | Paul Buetow <paul@buetow.org> | 2026-05-06 09:35:55 +0300 |
| commit | fbb7c9a9ad8d03d5d095ac441a58b37537e0ab8d (patch) | |
| tree | 2ccb042e90ca3ed99e13d9e7bf36948e7e362936 /docs | |
| parent | 3b20f2c4d16c7b7f583e9ab2b51213e9ddc94fd5 (diff) | |
add Dockerfile and Rocky Linux 9 build docs

Introduces a Docker-based build path so ior can be compiled on any
Linux host without a native Rocky 9 toolchain setup:

- Dockerfile: Rocky 9 minimal image with Go (version from ARG, default
  from go.mod), static libelf/libzstd built from source, libbpfgo at
  v0.9.2-libbpf-1.5.1, and mage; CMD runs mage generate + mage all
  against the repo root mounted as a volume.
- scripts/build-with-docker.sh: reads GO_VERSION from go.mod, passes it
  as --build-arg to docker build, mounts tracefs and BTF into the
  container, writes the binary to the repo root.
- Magefile.go: adds BuildDocker target that wraps the script.
- README.md: simplified to the two build paths (Docker + native) with
  links to docs/; removed GOTOOLCHAIN=auto throughout.
- docs/build-rocky-linux-9.md: full manual Rocky 9 steps, libbpfgo
  toolchain setup/rollback, compile-once-run-everywhere explanation,
  and timing semantics.
- docs/tui-reference.md: complete TUI hotkey reference, recording mode
  details, and the .ior.zst vs Parquet trade-off table.
- AGENTS.md: removed GOTOOLCHAIN=auto from all build commands.
- internal/c/generated_tracepoints.c: regenerated against the host kernel.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Diffstat (limited to 'docs')
| -rw-r--r-- | docs/build-rocky-linux-9.md | 173 |
| -rw-r--r-- | docs/tui-reference.md | 175 |
2 files changed, 348 insertions, 0 deletions
diff --git a/docs/build-rocky-linux-9.md b/docs/build-rocky-linux-9.md
new file mode 100644
index 0000000..424b78e
--- /dev/null
+++ b/docs/build-rocky-linux-9.md
@@ -0,0 +1,173 @@
+# Building ior on Rocky Linux 9
+
+Verified on a fresh Rocky Linux 9.7 install (kernel `5.14.0-611.5.1.el9_7` or
+newer). Runs on the **stock RHEL 9 kernel** — no kernel upgrade needed.
+
+One build-time caveat: Rocky 9 ships neither `libelf.a` nor `libzstd.a` (no
+`*-static` packages). Both must be built from source.
+
+> Historical note. Earlier versions of `ior` typed BPF tracepoint context as
+> `struct trace_event_raw_sys_enter`/`_exit` (the BTF-emitted alias). RHEL 9
+> backports an `rt`-tree patch that adds `preempt_lazy_count` to `struct
+> trace_entry`, which widens those aliases by 8 bytes and shifts the `args`/`ret`
+> offsets — but the actual context the kernel hands the program is still
+> `struct syscall_trace_enter`/`_exit`, where the offsets did not move. The
+> verifier saw the program reading past `max_ctx_offset` and rejected the
+> attach with `EACCES`. `ior` now uses `syscall_trace_*` directly (matching
+> the [bcc fix](https://github.com/iovisor/bcc/pull/4920) and inspektor-gadget),
+> so the stock kernel works with no workaround.
+
+## Docker build (no Rocky 9 host required)
+
+The easiest path — builds entirely inside a container from any Docker-capable
+Linux host:
+
+```shell
+mage buildDocker
+# or directly:
+./scripts/build-with-docker.sh
+# skip image rebuild on subsequent runs:
+./scripts/build-with-docker.sh --run
+```
+
+`mage buildDocker` builds a `ior-builder:rocky9` image on first run (~15–20 min),
+then runs it with the repo root mounted as a volume so the resulting static
+binary lands at `./ior`.
+
+## Manual build on a Rocky Linux 9 host
+
+```shell
+# 1) Enable repos and install build dependencies (CRB ships static libs).
+sudo dnf config-manager --set-enabled crb
+sudo dnf install -y epel-release
+sudo dnf install -y gcc clang bpftool elfutils-libelf-devel zlib-static \
+    glibc-static libzstd-devel git make cmake wget rpmdevtools strace bpftrace
+sudo dnf builddep -y elfutils
+
+# 2) Install Go 1.26 from go.dev (Rocky 9 ships only Go 1.25; ior needs 1.26+).
+cd /tmp
+wget -q https://go.dev/dl/go1.26.2.linux-amd64.tar.gz
+sudo tar -C /usr/local -xf go1.26.2.linux-amd64.tar.gz
+echo 'export PATH=/usr/local/go/bin:$HOME/go/bin:$PATH' | sudo tee /etc/profile.d/go.sh
+source /etc/profile.d/go.sh
+
+# 3) Build libelf.a from elfutils source.
+mkdir -p ~/src && cd ~
+dnf download --source elfutils-libelf
+rpm -ivh elfutils-*.src.rpm
+tar -C ~/src -xjf rpmbuild/SOURCES/elfutils-*.tar.bz2
+cd ~/src/elfutils-*
+./configure --enable-deterministic-archives --disable-debuginfod --disable-libdebuginfod
+make -C lib -j$(nproc)
+make -C libelf -j$(nproc)
+sudo cp -v libelf/libelf.a /usr/lib64/
+
+# 4) Build libzstd.a from upstream (libzstd-devel does not ship the static archive).
+cd /tmp
+wget -q https://github.com/facebook/zstd/releases/download/v1.5.5/zstd-1.5.5.tar.gz
+tar xzf zstd-1.5.5.tar.gz
+make -C zstd-1.5.5/lib -j$(nproc) libzstd.a
+sudo cp -v zstd-1.5.5/lib/libzstd.a /usr/lib64/
+
+# 5) Clone ior + libbpfgo, pin libbpfgo, build the static archive, install mage.
+mkdir -p ~/git
+git clone https://codeberg.org/snonux/ior ~/git/ior
+git clone https://github.com/aquasecurity/libbpfgo ~/git/libbpfgo
+git -C ~/git/libbpfgo checkout v0.9.2-libbpf-1.5.1
+git -C ~/git/libbpfgo submodule update --init --recursive
+make -C ~/git/libbpfgo libbpfgo-static
+go install github.com/magefile/mage@latest
+
+# 6) Generate against the live kernel and build.
+# IOR_FORCE_GENERATE=1 skips the strict diff against the committed audit
+# (generated on a different kernel build).
+cd ~/git/ior
+IOR_FORCE_GENERATE=1 mage generate
+mage all
+
+# 7) Smoke test.
+sudo ./ior -plain -duration 5
+```
+
+If `./ior -plain -duration 5` prints `Probing for 5s` and a stream of CSV rows,
+the install is good.
+
+## libbpfgo toolchain
+
+`ior` links against a locally built `libbpfgo` checkout. By default
+`Magefile.go` expects that checkout at `../libbpfgo` relative to this repo; set
+`LIBBPFGO=/absolute/path/to/libbpfgo` to override.
+
+Pin that checkout to `v0.9.2-libbpf-1.5.1` and rebuild the static artifacts
+before running `mage` targets:
+
+```shell
+git -C ../libbpfgo checkout v0.9.2-libbpf-1.5.1
+git -C ../libbpfgo submodule update --init --recursive
+make -C ../libbpfgo libbpfgo-static
+```
+
+Validated commands for this pin:
+
+```shell
+mage world
+mage integrationTest
+```
+
+Troubleshooting and rollback:
+
+- If builds fail with `bpf/bpf.h` missing, re-run the checkout, submodule
+  sync, and `make libbpfgo-static` commands above, then retry
+  `mage world`.
+- Prefer Mage targets over raw `go test` for packages that import `libbpfgo`;
+  Mage injects the required `CGO_CFLAGS`, `CGO_LDFLAGS`, and `LIBBPFGO` values.
+- To roll back to the previous pin, reset to commit `90dbffffbdab`
+  (`v0.6.0-libbpf-1.3.0.20240111220235-90dbffffbdab`) and rebuild:
+
+```shell
+git -C ../libbpfgo checkout 90dbffffbdab
+git -C ../libbpfgo submodule update --init --recursive
+make -C ../libbpfgo libbpfgo-static
+```
+
+## Compile once, run everywhere
+
+The full build dance above only has to happen on **one** machine. The resulting
+`ior` binary is portable across Linux hosts: `scp ior other-host:/usr/local/bin/`
+and run it there.
+
+Two reasons it works:
+
+- The Go binary is compiled with `-extldflags "-static"` and links libbpf,
+  libelf, libzstd, and zlib as static archives. There is no runtime dependency
+  on the build host's library versions (a couple of glibc resolver functions —
+  `getpwnam_r` and friends — fall back to the target's libc, which is fine on
+  any reasonable distro).
+- The BPF object inside the binary is built with libbpf's CO-RE
+  (Compile-Once, Run-Everywhere) machinery. Field offsets are not baked into
+  the bytecode; libbpf reads the target kernel's BTF (`/sys/kernel/btf/vmlinux`)
+  at load time and patches the program for that kernel. As long as the target
+  ships BTF — true on every Debian, Ubuntu, Fedora, Arch, RHEL, and ElRepo
+  `kernel-ml` build at the time of writing — the same `ior` binary runs without
+  recompilation.
+
+Pick one Rocky 9 / Fedora box, do the build dance once, then distribute the
+23 MB binary to wherever you want to trace. The build host needs all the dev
+tooling; the trace hosts need only a BTF-enabled kernel and `sudo`.
+
+## Timing semantics
+
+Each reported event pair has two timing counters:
+
+- `durationNs`: syscall runtime on the same thread (`exit(current) - enter(current)`).
+- `durationToPrevNs`: inter-syscall gap on the same thread (`enter(current) - exit(previous)`).
+
+Important details:
+
+- `durationToPrevNs` is tracked per `tid` (thread), not globally across all threads.
+- The first observed syscall pair for a thread has `durationToPrevNs = 0` because
+  there is no prior exit timestamp.
+- `durationToPrevNs` is attributed to the current syscall pair (the one whose
+  `enter` closes the gap).
+- There is no separate "idle" pseudo-event bucket; use the `durationToPrev` count
+  field when aggregated flamegraph output should emphasize inter-syscall time.
diff --git a/docs/tui-reference.md b/docs/tui-reference.md
new file mode 100644
index 0000000..d6a7266
--- /dev/null
+++ b/docs/tui-reference.md
@@ -0,0 +1,175 @@
+# TUI Reference
+
+## TUI Flamegraphs
+
+Flamegraphs are available only inside the TUI dashboard.
+Use `-fields` to change the stack order and `-count` to choose the metric.
+The default stack order is `comm,path,tracepoint` (bottom to top).
+
+## Recording Modes
+
+`ior` has four distinct output flows:
+
+| Mode | How to use it | What it writes | Filter behavior |
+| --- | --- | --- | --- |
+| TUI dashboard | default startup | nothing continuously; data stays in memory unless you export | current TUI/global filters drive what you see |
+| TUI CSV snapshot export | press `e` in the dashboard | one `ior-stream-<timestamp>.csv` snapshot of the current filtered stream view | exports only the currently filtered in-memory rows |
+| Headless `.ior.zst` export | start with `-flamegraph -name <name>` | one aggregated native trace artifact written at shutdown | no TUI filter stack; this is the native trace/integration workflow |
+| Parquet recording | press `R` in the TUI, or start with `-parquet <file>` | a streaming Parquet file of traced syscall rows | TUI mode records rows that pass the active TUI filter; headless `-parquet` records all traced rows |
+
+Important distinction:
+
+- `.ior.zst` output is an aggregated native artifact, not a row-by-row event log.
+- CSV export is a point-in-time snapshot of the ring buffer.
+- Parquet recording is a streaming capture from start to stop.
+- The ring buffer is capped, so CSV export is not a replacement for Parquet recording or `.ior.zst` output.
+
+### Headless Native `.ior.zst` Output
+
+Use `-flamegraph` when you want the native `ior` trace artifact instead of a streaming row log:
+
+```shell
+sudo ./ior -flamegraph -name trace-run -duration 60
+```
+
+Native `.ior.zst` behavior:
+
+- writes one `*.ior.zst` file when the run ends
+- stores aggregated counters for repeated syscall/path/process combinations
+- is intended for `ior`'s native flamegraph and integration-style workflows
+- does not preserve one output row per traced syscall
+
+### TUI Parquet Recording
+
+Start a recording from the dashboard with `R`.
+
+- First `R`: open a filename prompt (`ior-recording-<timestamp>.parquet` by default).
+- `Enter`: start recording to that file.
+- Second `R`: stop and finalize the active Parquet file.
+- Recording stops automatically when you quit the TUI or reselect PID/TID/session scope.
+
+Lifecycle details:
+
+- TUI recording uses the active TUI global filter at emission time.
+- If a filter change restarts tracing, the recorder stays alive and continues writing matching rows after the restart.
+- The dashboard footer shows the active recording path or the last recording error.
+
+### Headless Parquet Recording
+
+Use `-parquet` to skip the TUI and stream traced syscall rows directly to a Parquet file:
+
+```shell
+sudo ./ior -parquet trace.parquet -duration 60
+```
+
+Headless Parquet mode behavior:
+
+- skips the TUI completely
+- records all traced rows
+- rejects content filters such as `-comm`, `-path`, `-pid`, and `-tid`
+- cannot be combined with `-plain`, `-flamegraph`, `--testflames`, or `--testliveflames`
+
+Use headless mode when you want a full recording, and TUI mode when you want interactive filtering plus optional start/stop recording from the dashboard.
+
+### Choosing Between `.ior.zst` and Parquet
+
+| Question | Native `.ior.zst` | Parquet |
+| --- | --- | --- |
+| Data shape | aggregated counters | one row per traced syscall |
+| Write pattern | collect in memory, write one compressed artifact at the end | stream rows continuously while recording |
+| Best for | `ior`-native trace artifacts, flamegraph workflows, integration assertions | offline analysis in other tools, long captures, preserving per-event detail |
+| Relative write cost | usually lower because repeated events are folded together before file write | usually higher because each traced row is serialized |
+| Detail retained | loses original row order and per-event granularity | keeps per-event timing and syscall fields |
+
+Rule of thumb:
+
+- choose `.ior.zst` when you want the native `ior` artifact and do not need every traced syscall row preserved
+- choose Parquet when you want a full event stream for downstream analysis outside `ior`
+
+## TUI Navigation
+
+The TUI has an in-screen help panel (toggle with **H**) that lists all available
+keys. Use it to discover shortcuts without consulting this document.
+
+Dashboard tabs:
+
+- **tab** / **shift+tab** — next / previous tab
+- **1** — Overview
+- **2** — Syscalls
+- **3** — Files
+- **4** — Processes
+- **5** — Latency+Gaps
+- **6** — Stream
+
+The TUI has two key scopes:
+
+- Global hotkeys: available from any dashboard screen.
+- Dashboard hotkeys: behavior that depends on the active tab (especially `6:Stream`).
+
+### Global Hotkeys
+
+- `tab` / `shift+tab`: cycle tabs.
+- `1`–`6`: jump to tab by number (`7` is an alias for `6`).
+- `e`: export filtered stream rows to CSV (`ior-stream-<timestamp>.csv`).
+- `R`: start or stop Parquet recording.
+- `p`: re-open process selector (PID selection flow).
+- `t`: open TID selector flow.
+- `o`: open probe selection/toggling dialog.
+- `r`: refresh dashboard snapshot.
+- `H`: toggle bottom help sections on/off.
+- `q` or `ctrl+c`: quit.
+
+### Dashboard / Tab-Specific Hotkeys
+
+- `d` in `3:Files`: toggle directory-grouped files view.
+- `s` in sortable tabs (`2:Syscalls`, `3:Files`, `4:Processes`): sort by selected column.
+- `S` in sortable tabs: reverse-sort by selected column.
+- `j/k` or `up/down` in list tabs: scroll list.
+
+`left/right` and `h/l` do not switch tabs. In `6:Stream` paused mode they move the selected column.
+
+### 6:Stream Hotkeys and Behavior
+
+`6:Stream` has two modes:
+
+- Live mode (`paused=false`): rows update continuously.
+- Pause mode (`paused=true`): selection/cell/filter/search/export workflows are enabled.
+
+Core controls:
+
+- `space`: toggle live/pause.
+- `g`/`G`: jump to top/tail.
+- `c`: clear stream filters.
+- `f`: open advanced filter modal.
+- `j/k` or `up/down`: move selected row (pause) or scroll (live).
+- `left/right` or `h/l`: move selected column in pause mode.
+
+#### Enter-Based Filter Stack (Pause Mode)
+
+In pause mode, `enter` on the selected cell pushes a filter onto a stack and
+immediately re-filters the current ring buffer snapshot. Filters are stackable.
+
+- String columns use case-insensitive substring match:
+  - `Comm` → `comm~<value>`
+  - `Syscall` → `syscall~<value>`
+  - `File` → `file~<value>`
+- Numeric exact match: `PID`, `TID`, `FD`, `Ret`, `Bytes`
+- Numeric threshold (`>=`): `Latency` → `latency>=selected_value`, `Gap` → `gap>=selected_value`
+
+`esc` in pause mode pops the most recent filter (LIFO); repeated `esc` undoes
+all stacked filters.
+
+#### Regex Search (Pause Mode)
+
+- `/`: search forward; `?`: search backward.
+- Search checks all stream columns and wraps around the ring buffer.
+- `n` / `N`: next / previous match.
+
+#### Stream CSV Export (Pause Mode)
+
+- `x`: quick export filtered stream rows to CSV.
+- `X`: export with filename prompt.
+- `E`: open last stream-exported CSV in foreground editor (`EDITOR` → `VISUAL` → `SUDO_EDITOR` → `hx` → `vi`).
+
+`e` (global) exports a fresh filtered snapshot even outside paused mode; `x`/`X`
+export the exact paused view.
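
The "Timing semantics" section of `docs/build-rocky-linux-9.md` in the diff above defines `durationNs` and `durationToPrevNs` per thread. The following is a minimal Go sketch of that bookkeeping and is not taken from the `ior` source: `Pair`, `GapTracker`, `NewGapTracker`, and `Observe` are hypothetical names, and the sketch assumes monotonic nanosecond timestamps (each exit is at or after its enter, and each enter is at or after the previous exit on the same thread).

```go
package main

import "fmt"

// Pair holds the two timing counters for one enter/exit syscall pair.
// The type and field names are illustrative, not ior's real types.
type Pair struct {
	TID              uint32
	DurationNs       uint64 // exit(current) - enter(current)
	DurationToPrevNs uint64 // enter(current) - exit(previous); 0 for a thread's first pair
}

// GapTracker remembers the last exit timestamp seen for each thread ID.
type GapTracker struct {
	lastExit map[uint32]uint64
}

// NewGapTracker returns a tracker with no history for any thread.
func NewGapTracker() *GapTracker {
	return &GapTracker{lastExit: make(map[uint32]uint64)}
}

// Observe computes both counters for one syscall pair and then records
// its exit time, so the next pair on the same thread can measure its gap.
// Assumes enterNs <= exitNs and enterNs >= the thread's previous exit.
func (g *GapTracker) Observe(tid uint32, enterNs, exitNs uint64) Pair {
	p := Pair{TID: tid, DurationNs: exitNs - enterNs}
	if prev, ok := g.lastExit[tid]; ok {
		// The gap is attributed to the current pair, whose enter closes it.
		p.DurationToPrevNs = enterNs - prev
	}
	g.lastExit[tid] = exitNs
	return p
}

func main() {
	g := NewGapTracker()
	fmt.Println(g.Observe(1, 100, 150)) // first pair on tid 1: gap stays 0
	fmt.Println(g.Observe(2, 120, 130)) // tid 2 has its own, independent history
	fmt.Println(g.Observe(1, 170, 200)) // gap measured from tid 1's exit at 150
}
```

Keying the map by `tid` is what keeps the gap per thread instead of global: the pair on thread 2 does not affect the gap later reported for thread 1.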
