ior - I/O Riot NG is an experiment with BPF.

Age	Commit message	Author
7 days	docs(follow-forks): add process-tree-following plan + filter.c reference	Paul Buetow
	Document the planned opt-in "follow forks" mode that would let ior trace a target PID and all its descendants (needed for the landlock_restrict_self integration case, task ci0, and for tracing forking workloads as a tree). The plan covers the BPF descendant-set map, sched_process_fork/exit hooks, the FOLLOW_FORK gate in filter(), userland flag/seeding/assertion changes, and explicitly requires syscall-count aggregation to roll up across the followed tree. Add a reference comment above filter() pointing to the plan. Plan only — not implemented. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
13 days	fix(classify): capture timerfd_gettime/settime + splice/tee fd, not KindNull	Paul Buetow
	Root cause: the generic field matcher classifyByField only maps an arg literally named "fd" to KindFd. Several syscalls operate on an EXISTING fd whose tracepoint arg0 is named something else, so they fell through to KindNull -> null_event, capturing NO descriptor and dropping the fd they act on:
	- timerfd_gettime / timerfd_settime: arg0 is "int ufd" (the timerfd)
	- splice: arg0 is "int fd_in" (source fd of an in-kernel transfer)
	- tee: arg0 is "int fdin" (source fd of an in-kernel transfer)
	Fix: add explicit KindFd overrides for these four sys_enter_* keys to nameOnlyKindsTable so the enter handler captures arg0, mirroring the established epoll_wait(epfd) / mq_*(mqdes) / sendfile64(out_fd) / copy_file_range(fd_in) precedent. splice/tee were surfaced by a systemic sweep of tracepoint formats for fd-typed arg0 named other than "fd" that currently classify to null; they are TransferClassified siblings of sendfile64/copy_file_range and clearly fd-operating. The *at() family (dfd arg0) is intentionally untouched: it is path-classified, and timerfd_create remains the KindEventfd fd CREATOR.
	Regenerated artifacts (mage generate): the four enter handlers now emit fd_event capturing ctx->args[0] instead of null_event; exit handlers stay UNCLASSIFIED. Updated the generated kind maps, the golden result.txt, the classify_test expectations, and docs/syscall-tracing-plan.md (moved the four from kind "null" to kind "fd"; families IPC/Network unchanged).
	Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
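	The override-before-fallback lookup described in this commit can be sketched roughly as follows. This is a simplified illustration, not the project's code: the Kind type, the map name nameOnlyKinds, the classify wrapper, and the argument-name slices are assumptions made for the example; only the four tracepoint keys and the "literally named fd" rule come from the commit message.

```go
package main

import "fmt"

// Kind is a hypothetical stand-in for the classifier's event kind.
type Kind int

const (
	KindNull Kind = iota
	KindFd
)

// nameOnlyKinds imitates the idea of nameOnlyKindsTable: explicit
// per-tracepoint overrides that are consulted before the generic matcher.
// The four keys are the ones named in the commit message.
var nameOnlyKinds = map[string]Kind{
	"sys_enter_timerfd_gettime": KindFd,
	"sys_enter_timerfd_settime": KindFd,
	"sys_enter_splice":          KindFd,
	"sys_enter_tee":             KindFd,
}

// classifyByField is the generic fallback (simplified): only an argument
// literally named "fd" yields KindFd, everything else is KindNull.
func classifyByField(argNames []string) Kind {
	for _, name := range argNames {
		if name == "fd" {
			return KindFd
		}
	}
	return KindNull
}

// classify checks the override table first, then the generic matcher.
func classify(tracepoint string, argNames []string) Kind {
	if kind, ok := nameOnlyKinds[tracepoint]; ok {
		return kind
	}
	return classifyByField(argNames)
}

func main() {
	teeArgs := []string{"fdin", "fdout", "len", "flags"}
	fmt.Println("generic matcher finds fd:", classifyByField(teeArgs) == KindFd)
	fmt.Println("with override finds fd:", classify("sys_enter_tee", teeArgs) == KindFd)
}
```

	The point of the sketch is only the lookup order: without the table entry, an arg0 named "fdin" or "ufd" never matches the generic rule.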
2026-06-01	fix(classify): mq_timedsend returns status, not bytes — make ret UNCLASSIFIED	Paul Buetow
	mq_timedsend(2)/mq_send(3) return 0 on success or -1 on error; the payload size msg_len is an INPUT argument, never the return value. It was wrongly listed in retClassifications as WriteClassified, which made bytesFromRet attribute its 0 return as "bytes written" (the stats engine WriteClassified path). Remove it so its return stays UNCLASSIFIED, consistent with its POSIX mq sibling mq_timedreceive (which legitimately stays ReadClassified because it returns the received byte count). This is the exact same defect just fixed for SysV msgsnd (5057bd9) and mirrors the msgrcv/msgsnd asymmetry. Regenerated tracepoints/docs accordingly and updated the pre-existing classify unit test and the TestPosixMqBasic integration assertion: the mq_timedsend send no longer asserts a write byte count (now expects 0), while mq_timedreceive keeps its received-byte-count assertion. Verified: mage generate idempotent, mage build OK, internal/generate tests pass. TestPosixMqBasic skips in this sandbox (mq_open: permission denied) but compiles with the corrected assertions. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
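	A minimal sketch of why the table entry matters, assuming a lookup shaped like the one the message describes. The RetClass type, the map contents, and the signature of bytesFromRet here are hypothetical; the commit only tells us that a retClassifications table and a bytesFromRet function exist and that mq_timedsend was removed from the former.

```go
package main

import "fmt"

// RetClass is a hypothetical stand-in for the return-value classification.
type RetClass int

const (
	Unclassified RetClass = iota
	ReadClassified
	WriteClassified
)

// retClassifications lists only syscalls whose return value is a byte count.
// mq_timedsend and msgsnd are deliberately absent: they return 0 or -1.
var retClassifications = map[string]RetClass{
	"mq_timedreceive": ReadClassified,
	"msgrcv":          ReadClassified,
	"write":           WriteClassified,
}

// bytesFromRet reports how a return value should be accounted. A missing
// table entry yields the zero value Unclassified, so the return is treated
// as a status code and contributes no bytes.
func bytesFromRet(syscall string, ret int64) (RetClass, uint64) {
	class := retClassifications[syscall]
	if class == Unclassified || ret < 0 {
		return class, 0
	}
	return class, uint64(ret)
}

func main() {
	class, n := bytesFromRet("mq_timedsend", 0)
	fmt.Println(class == Unclassified, n)
	class, n = bytesFromRet("mq_timedreceive", 64)
	fmt.Println(class == ReadClassified, n)
}
```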
2026-06-01	fix(classify): msgsnd returns status, not bytes — make ret UNCLASSIFIED	Paul Buetow
	msgsnd(2) returns 0 on success or -1 on error; the payload size msgsz is an INPUT argument, never the return value. It was wrongly listed in retClassifications as WriteClassified, which made the stats engine treat its 0 return as "bytes written". Remove it so its return stays UNCLASSIFIED, consistent with its SysV IPC siblings (msgrcv legitimately stays ReadClassified because it returns a received byte count). Regenerated tracepoints/docs accordingly. Verified: mage generate idempotent, mage build OK, internal/generate tests pass, and the TestSysVMsgBasic integration test (added in task 7i0) still passes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31	listxattrat: READ-classify return for xattr-list family consistency	Paul Buetow
	listxattrat(2) (Linux 6.13+) returns the size in bytes of the list of extended attribute names, exactly like listxattr/llistxattr/flistxattr, but its exit was classified UNCLASSIFIED, so its read bytes were dropped from I/O totals. Classify it as ReadClassified and regenerate the BPF handler (ret_type now READ_CLASSIFIED). This mirrors the getxattrat fix (task ku, commit c3177bd) and completes xattr-family consistency: get-family and list-family are READ_CLASSIFIED while set-family and remove-family stay UNCLASSIFIED (they return 0/-1). Update the docs ReadClassified list and the retclassify expectation, and add an ioworkload scenario plus integration test: the workload sets a user xattr then lists names via the raw listxattrat(2) syscall with AT_FDCWD, and the test asserts enter_listxattrat captures the file path and accounts the returned name-list size as read bytes. Task: r20 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31	getxattrat: READ-classify return for xattr-get family consistency	Paul Buetow
	getxattrat(2) (Linux 6.13+) returns the xattr value size in bytes, exactly like getxattr/lgetxattr/fgetxattr, but its exit was classified UNCLASSIFIED, so its read bytes were dropped from I/O totals. Classify it as ReadClassified and regenerate the BPF handler (ret_type now READ_CLASSIFIED). Path extraction (args[1], after the dirfd) and the name-not-captured-as-path behaviour were already correct. Update the docs ReadClassified list and the retclassify expectation, and add the first xattr integration coverage: an ioworkload scenario that sets then getxattrat-reads a user xattr on tmpfs, plus a test that asserts enter_getxattrat captures the file path (not the xattr name) and accounts the returned value size as read bytes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30	generate: treat rt_sigreturn as noreturn (suppress dead exit handler)	Paul Buetow
	rt_sigreturn(2) restores the pre-signal execution context off the signal stack frame and resumes the interrupted instruction; it never returns to the instruction after the syscall. man sigreturn(2) states plainly that "sigreturn() never returns", and tracing against /sys/kernel/tracing confirms it: sys_enter_rt_sigreturn fires once per signal-handler return while sys_exit_rt_sigreturn never fires. The generator previously emitted a dead handle_sys_exit_rt_sigreturn (it can never run) and recorded a per-tid syscall_enter_state_map entry on the enter path that nothing would ever delete (no exit fires), leaking entries in the bounded map on every signal-handler return. Add rt_sigreturn to noreturnSyscalls so codegen suppresses the dead exit handler and routes the enter handler through ior_on_noreturn_syscall_enter (sampling decision only, no map write), exactly like exit/exit_group. The enter null_event is still emitted, and the FamilySignals/KindNull classification is unchanged. Regenerated the C/Go artifacts and the result baseline accordingly, and generalized the related comments. Lock-in tests: TestRtSigreturnIsNoreturn asserts rt_sigreturn is noreturn; TestRtSigSiblingsAreNotNoreturn guards that the returning rt_sig* siblings are not; TestGenerateExitNoreturnHandlers now also covers rt_sigreturn. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
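	The codegen decision described here can be pictured as below. This is a sketch under assumptions: the map literal, the prefix-stripping in syscallName, and the helper names enterHook and emitsExitHandler are invented for illustration. The identifiers noreturnSyscalls, isNoreturnSyscall, syscallName, ior_on_syscall_enter, and ior_on_noreturn_syscall_enter appear in the commit messages, but their real definitions may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// noreturnSyscalls holds syscalls whose sys_exit tracepoint never fires.
var noreturnSyscalls = map[string]bool{
	"exit":         true,
	"exit_group":   true,
	"rt_sigreturn": true,
}

func isNoreturnSyscall(name string) bool { return noreturnSyscalls[name] }

// syscallName strips the tracepoint prefix (assumed naming scheme).
func syscallName(tracepoint string) string {
	name := strings.TrimPrefix(tracepoint, "sys_enter_")
	return strings.TrimPrefix(name, "sys_exit_")
}

// enterHook picks the C hook the generated enter handler should call.
// Noreturn syscalls get the sampling-only hook, so no per-tid enter state
// is written that an exit handler would have to delete.
func enterHook(tracepoint string) string {
	if isNoreturnSyscall(syscallName(tracepoint)) {
		return "ior_on_noreturn_syscall_enter"
	}
	return "ior_on_syscall_enter"
}

// emitsExitHandler reports whether an exit handler should be generated.
func emitsExitHandler(tracepoint string) bool {
	return !isNoreturnSyscall(syscallName(tracepoint))
}

func main() {
	for _, tp := range []string{"sys_enter_rt_sigreturn", "sys_enter_rt_sigaction"} {
		fmt.Println(tp, enterHook(tp), emitsExitHandler(tp))
	}
}
```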
2026-05-30	sendfile64: capture out_fd instead of dropping both fds	Paul Buetow
	sendfile64(out_fd, in_fd, offset, count) transfers bytes between two file descriptors in the kernel and returns the number of bytes written to out_fd. Its tracepoint fields carry no field literally named "fd", so it fell through to KindNull and captured no descriptor at all - inconsistent with its sibling copy_file_range (KindFd) and the read/write/sendto/recvfrom families. Add an explicit sys_enter_sendfile64 -> KindFd override that captures out_fd (args[0], the destination the bytes are written to), matching the single-fd KindFd convention. The return value stays TransferClassified, consistent with copy_file_range/splice/tee/vmsplice. Family stays Network (sendfile is historically socket-oriented; copy_file_range=FS is pure file-to-file). Update docs/syscall-tracing-plan.md (move sendfile64 from null to fd kind), regenerate C/Go artifacts, fix the phase-A classify assertion, and add TestClassifySendfile64CapturesOutFd as a lock-in + negative test. The existing TestRetbytesPhaseA integration test still passes with the runtime change. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30	fix(sleep): record sentinel for TIMER_ABSTIME clock_nanosleep (a20)	Paul Buetow
	clock_nanosleep with the TIMER_ABSTIME flag passes an ABSOLUTE wakeup time in the request timespec, not a relative duration. The generated BPF sleep handler computed requested_ns = tv_sec*1e9 + tv_nsec unconditionally, so absolute sleeps exported a bogus multi-decade "sleep duration" in CSV/parquet/stream.
	generateExtraSleep now carries an optional flags-argument expression per sleep syscall. For clock_nanosleep the generated handler checks args[1] & TIMER_ABSTIME (value 1) and only computes the relative duration when the flag is clear; absolute sleeps keep the existing -1 sentinel (same value used for null/unreadable timespec pointers). nanosleep is always relative and stays unconditional (no flags arg).
	- Regenerated internal/c/generated_tracepoints.c (mage generate idempotent).
	- Added codegen tests asserting the TIMER_ABSTIME guard for clock_nanosleep and its absence for nanosleep.
	- Extended the ioworkload sleep scenario to issue an absolute clock_nanosleep and the sleep parquet integration test to assert it is reported as -1.
	Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
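	The guard is implemented in generated BPF C, but its arithmetic can be illustrated in Go. The function name requestedSleepNs and its parameters are hypothetical; the TIMER_ABSTIME value of 1, the -1 sentinel, and the tv_sec*1e9 + tv_nsec formula are taken from the commit message.

```go
package main

import "fmt"

const (
	timerAbstime        = 1  // TIMER_ABSTIME, as stated in the commit message
	sleepSentinel int64 = -1 // used for absolute sleeps and unreadable timespecs
)

// requestedSleepNs returns the requested relative sleep in nanoseconds, or
// the sentinel when the timespec is unreadable or holds an absolute time.
// For nanosleep, which has no flags argument, callers would pass flags = 0.
func requestedSleepNs(flags uint64, tvSec, tvNsec int64, readable bool) int64 {
	if !readable || flags&timerAbstime != 0 {
		return sleepSentinel
	}
	return tvSec*1_000_000_000 + tvNsec
}

func main() {
	// Relative 1.5 s sleep.
	fmt.Println(requestedSleepNs(0, 1, 500_000_000, true))
	// Absolute wakeup time: tv_sec is a clock reading, not a duration.
	fmt.Println(requestedSleepNs(timerAbstime, 1_780_000_000, 0, true))
}
```

	Without the flag check, the second call would report 1.78e18 ns, which is the multi-decade duration the commit describes.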
2026-05-30	fix(z10): skip enter-state write for noreturn syscalls	Paul Buetow
	After p10 suppressed the sys_exit_exit/sys_exit_exit_group handlers, the enter handlers for exit/exit_group still called ior_on_syscall_enter, which writes a per-tid entry into syscall_enter_state_map. With the exit handler gone, nothing ever bpf_map_delete_elem'd that entry, so stale per-tid state accumulated in the bounded (32768) map on hosts churning many distinct tids and could starve legitimate inserts. Add ior_on_noreturn_syscall_enter in internal/c/filter.c: it only makes the sampling decision (ior_should_emit_trace) and deliberately does NOT record enter-state. The code generator now emits this hook for noreturn enter handlers (detected via isNoreturnSyscall(syscallName(name))) so the enter null_event is still emitted while the dead, unreclaimable map write is skipped. Regenerated generated_tracepoints.c accordingly. Extend TestGenerateExitNoreturnHandlers with a negative assertion (no ior_on_syscall_enter for noreturn) and add TestGenerateReturningSyscallEnterRecordsState as a positive contrast. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-29	codegen: suppress unreachable sys_exit handlers for noreturn syscalls	Paul Buetow
	exit and exit_group never return to userspace, so their sys_exit tracepoints can never fire. The generator previously emitted matching EXIT_RET_EVENT handlers anyway, producing dead code in the generated BPF program. classifySyscall now skips exit-handler emission for noreturn syscalls via isNoreturnSyscall, and the regenerated artifacts drop the sys_exit_exit / sys_exit_exit_group handlers (enter handlers are kept).
	Tests updated to match the new reality:
	- TestGenerateExitNoreturnHandlers asserts no exit handler is emitted.
	- TestClassifySyscallPairEmitsAllFamilies exempts noreturn syscalls from the exit-handler-required assertion while staying strict for all others.
	Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-28	close_range: honor last bound and CLOSE_RANGE_CLOEXEC flag	Paul Buetow
	close_range was captured as a single-fd fd_event carrying only first, so the runtime evicted every tracked fd >= first, ignoring the last upper bound and the flags. Bounded calls wrongly dropped still-open higher fds, and CLOSE_RANGE_CLOEXEC (which keeps fds open) was treated as a full close. Reclassify close_range to the two_fd_event kind, mapping fd_a/fd_b/extra to first/last/flags. The runtime now closes only the inclusive [first, last] range (a negative last from ~0U means unbounded) and skips eviction when CLOSE_RANGE_CLOEXEC is set or the syscall fails. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
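	The corrected eviction rule might look like the following in userland. This is a sketch only: the function evictClosedRange, the tracked-fd map, and its value type are assumptions. The CLOSE_RANGE_CLOEXEC value (1 << 2) is the Linux UAPI constant; the inclusive range, the negative-last convention, and the skip conditions follow the commit message.

```go
package main

import "fmt"

const closeRangeCloexec = 1 << 2 // CLOSE_RANGE_CLOEXEC from linux/close_range.h

// evictClosedRange drops tracked fds in the inclusive range [first, last].
// A negative last (from ~0U read as a signed value) means "no upper bound".
// Nothing is evicted when the syscall failed or when CLOSE_RANGE_CLOEXEC is
// set, because that flag only marks the fds close-on-exec and keeps them open.
func evictClosedRange(tracked map[int32]string, first, last int32, flags uint32, ret int64) {
	if ret < 0 || flags&closeRangeCloexec != 0 {
		return
	}
	for fd := range tracked {
		if fd >= first && (last < 0 || fd <= last) {
			delete(tracked, fd) // deleting during range is safe in Go
		}
	}
}

func main() {
	tracked := map[int32]string{3: "a.log", 4: "b.log", 5: "c.log", 10: "d.log"}
	evictClosedRange(tracked, 4, 5, 0, 0)
	fmt.Println(len(tracked)) // fds 3 and 10 remain
}
```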
2026-05-23	2c fix epoll_create and pidfd_open flags in BPF codegen	Paul Buetow
	epoll_create(size) was recording size (args[0]) as flags — hardcode to 0 since the syscall has no flags argument. pidfd_open(pid, flags) was recording pid (args[0]) as flags — use args[1] instead. Add test fixtures and codegen tests that verify the correct argument indexes and reject the old wrong ones. Regenerate generated_tracepoints.c. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23	5c remove tracepoint ID adjacency dependency from aggregate pairing	Paul Buetow
	Generated exit handlers now pass the explicit enter trace ID (SYS_ENTER_X) to ior_on_syscall_exit instead of relying on the implicit enter_id == exit_id + 1 arithmetic invariant. filter.c compares directly against the passed enter ID. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
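	As a rough illustration of the change, the pairing check can be written so that the exit side names the enter ID it expects instead of deriving it arithmetically. Everything here is hypothetical (the enterState struct, the ID values, onSyscallExit, and the choice to drop state on a mismatch); the real check lives in filter.c and is written in C.

```go
package main

import "fmt"

// enterState is a hypothetical per-tid record written by the enter handler.
type enterState struct {
	enterID uint32
	startNs uint64
}

// Illustrative trace IDs; note that nothing below assumes they are adjacent
// to the IDs of their exit tracepoints.
const (
	sysEnterRead  uint32 = 101
	sysEnterWrite uint32 = 205
)

// onSyscallExit pairs an exit with the recorded enter for the same tid by
// comparing against the explicitly passed enter ID. It returns the syscall
// duration and whether the pair matched. In this sketch the state is
// removed in either case.
func onSyscallExit(states map[uint32]enterState, tid, wantEnterID uint32, nowNs uint64) (uint64, bool) {
	st, ok := states[tid]
	if !ok {
		return 0, false
	}
	delete(states, tid)
	if st.enterID != wantEnterID {
		return 0, false
	}
	return nowNs - st.startNs, true
}

func main() {
	states := map[uint32]enterState{42: {enterID: sysEnterRead, startNs: 1000}}
	dur, ok := onSyscallExit(states, 42, sysEnterRead, 1500)
	fmt.Println(dur, ok)
}
```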
2026-05-22	xb make syscall aggregates per-cpu deltas	Paul Buetow

2026-05-21	o7 classify landlock add-rule and restrict-self as fd	Paul Buetow

2026-05-21	m7 classify time and posix timer syscalls	Paul Buetow

2026-05-21	k7 classify process control and prctl syscalls	Paul Buetow

2026-05-21	j7 add futex kind and aggregate-only defaults	Paul Buetow

2026-05-21	l7 classify numa and process memory syscalls	Paul Buetow

2026-05-21	h7 classify additional memory syscalls	Paul Buetow

2026-05-21	57 add bpf syscall kind classification	Paul Buetow

2026-05-21	37 classify clone family as proc kind	Paul Buetow

2026-05-21	b7 classify sysv ipc ids and ops	Paul Buetow

2026-05-21	e7 classify acct pathname and misc null syscalls	Paul Buetow

2026-05-21	67 add seccomp and module trace kinds	Paul Buetow

2026-05-21	i7 classify memory P3 syscalls as mem kind	Paul Buetow

2026-05-21	g7 classify fd-from-air eventfd users	Paul Buetow

2026-05-21	n7 classify pidfd and misc tail syscalls	Paul Buetow

2026-05-21	f7 wire eventfd kind for fd-from-air IPC syscalls	Paul Buetow

2026-05-21	task-47: fix execveat dirfd codegen fallback	Paul Buetow

2026-05-20	task-47: add KindExec for execve paths	Paul Buetow

2026-05-20	feat: add keyctl ptrace perf_event_open tracing (task 77)	Paul Buetow

2026-05-20	d7: add POSIX mq syscall kind/classification and coverage	Paul Buetow

2026-05-20	feat: add mount/fs management syscall tracing for c7	Paul Buetow

2026-05-20	task 27: add KindSleep and requested sleep metric	Paul Buetow

2026-05-20	feat: add syscall aggregate sampling infrastructure (task 17)	Paul Buetow

2026-05-20	task 07: add KindMem and separate address-space byte accounting	Paul Buetow

2026-05-19	z6: add KindPoll wiring for poll/select ready counts	Paul Buetow

2026-05-19	y6: add epoll ctl/wait tracing and ready-count coverage	Paul Buetow

2026-05-19	x6: add pipe/eventfd fd-from-air syscall support	Paul Buetow

2026-05-19	v6: add KindAccept and wire accept/accept4	Paul Buetow

2026-05-19	u6: fix socketpair exit fd capture and socket filtering	Paul Buetow

2026-05-19	u6: add socket/socketpair kind scaffolding and wiring	Paul Buetow

2026-05-18	j6: defer mmsg byte classification	Paul Buetow

2026-05-18	k6: emit tracepoints for all syscall families	Paul Buetow

2026-05-06	add Dockerfile and Rocky Linux 9 build docs	Paul Buetow
	Introduces a Docker-based build path so ior can be compiled on any Linux host without a native Rocky 9 toolchain setup:
	- Dockerfile: Rocky 9 minimal image with Go (version from ARG, default from go.mod), static libelf/libzstd built from source, libbpfgo at v0.9.2-libbpf-1.5.1, and mage; CMD runs mage generate + mage all against the repo root mounted as a volume.
	- scripts/build-with-docker.sh: reads GO_VERSION from go.mod, passes it as --build-arg to docker build, mounts tracefs and BTF into the container, writes the binary to the repo root.
	- Magefile.go: adds BuildDocker target that wraps the script.
	- README.md: simplified to the two build paths (Docker + native) with links to docs/; removed GOTOOLCHAIN=auto throughout.
	- docs/build-rocky-linux-9.md: full manual Rocky 9 steps, libbpfgo toolchain setup/rollback, compile-once-run-everywhere explanation, and timing semantics.
	- docs/tui-reference.md: complete TUI hotkey reference, recording mode details, and the .ior.zst vs Parquet trade-off table.
	- AGENTS.md: removed GOTOOLCHAIN=auto from all build commands.
	- internal/c/generated_tracepoints.c: regenerated against the host kernel.
	Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-02	fix BPF tracepoint context type for RHEL 9 stock kernel	Paul Buetow
	The BPF handler generator emitted struct trace_event_raw_sys_enter/trace_event_raw_sys_exit (the BTF-blessed aliases). RHEL 9 carries an rt-tree backport that adds preempt_lazy_count to struct trace_entry, which widens those aliases by 8 bytes and shifts args/ret. The actual tracepoint context the kernel hands the program is still syscall_trace_enter / syscall_trace_exit, where the offsets did not move. Programs typed against the wider alias read past max_ctx_offset and the verifier rejects the attach with EACCES.
	Switching the generator to emit syscall_trace_enter/exit lines up with the real context on RHEL 9 (and is identical on every other distro, since the two structs only diverge there). Same fix bcc shipped in iovisor/bcc#4920 and inspektor-gadget did in inspektor-gadget#2546. Field accesses (ctx->args[N], ctx->ret) are unchanged.
	Verified end-to-end on Rocky Linux 9.7 stock 5.14.0-611.5.1.el9_7 (no kernel-ml needed) and Fedora 6.19. README rewritten accordingly: drops the elrepo kernel-ml step and the trailing 'permission denied' troubleshooting paragraph; adds a historical note explaining why the old workaround existed.
	Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-03-18	cleanup	Paul Buetow

2026-02-25	Fix initial TUI sizing and align two-row sparklines	Paul Buetow