path: root/internal/generate/bpfhandler.go
Age | Commit message | Author
2026-05-30 | generate: treat rt_sigreturn as noreturn (suppress dead exit handler) | Paul Buetow

    rt_sigreturn(2) restores the pre-signal execution context off the signal stack frame and resumes the interrupted instruction; it never returns to the instruction after the syscall. man sigreturn(2) states plainly that "sigreturn() never returns", and tracing against /sys/kernel/tracing confirms it: sys_enter_rt_sigreturn fires once per signal-handler return while sys_exit_rt_sigreturn never fires.

    The generator previously emitted a dead handle_sys_exit_rt_sigreturn (it can never run) and recorded a per-tid syscall_enter_state_map entry on the enter path that nothing would ever delete (no exit fires), leaking entries in the bounded map on every signal-handler return.

    Add rt_sigreturn to noreturnSyscalls so codegen suppresses the dead exit handler and routes the enter handler through ior_on_noreturn_syscall_enter (sampling decision only, no map write), exactly like exit/exit_group. The enter null_event is still emitted, and the FamilySignals/KindNull classification is unchanged. Regenerated the C/Go artifacts and the result baseline accordingly, and generalized the related comments.

    Lock-in tests: TestRtSigreturnIsNoreturn asserts rt_sigreturn is noreturn; TestRtSigSiblingsAreNotNoreturn guards that the returning rt_sig* siblings are not; TestGenerateExitNoreturnHandlers now also covers rt_sigreturn.

    Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
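The classification described in this commit can be sketched in Go as follows. This is a minimal illustration, not the repository's code: the names noreturnSyscalls, isNoreturnSyscall, ior_on_noreturn_syscall_enter and ior_on_syscall_enter come from the commit messages, while the map representation and the helpers enterHook and emitsExitHandler are assumptions made for the example.

```go
package main

import "fmt"

// noreturnSyscalls lists the syscalls named in the commit messages as never
// returning. Storing them in a map is an assumption of this sketch.
var noreturnSyscalls = map[string]bool{
	"exit":         true,
	"exit_group":   true,
	"rt_sigreturn": true,
}

// isNoreturnSyscall reports whether the kernel never fires sys_exit_* for name.
func isNoreturnSyscall(name string) bool {
	return noreturnSyscalls[name]
}

// enterHook (hypothetical helper) picks the C hook an enter handler would call.
func enterHook(name string) string {
	if isNoreturnSyscall(name) {
		return "ior_on_noreturn_syscall_enter" // sampling decision only, no map write
	}
	return "ior_on_syscall_enter" // records per-tid enter state
}

// emitsExitHandler (hypothetical helper) reports whether an exit handler is generated.
func emitsExitHandler(name string) bool {
	return !isNoreturnSyscall(name)
}

func main() {
	for _, name := range []string{"rt_sigreturn", "rt_sigaction", "exit_group", "read"} {
		fmt.Printf("%-14s enter=%s exit_handler=%v\n", name, enterHook(name), emitsExitHandler(name))
	}
}
```

Note that the returning rt_sig* siblings such as rt_sigaction stay on the regular path, which is what TestRtSigSiblingsAreNotNoreturn is described as guarding.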
2026-05-30 | fix(sleep): record sentinel for TIMER_ABSTIME clock_nanosleep (a20) | Paul Buetow

    clock_nanosleep with the TIMER_ABSTIME flag passes an ABSOLUTE wakeup time in the request timespec, not a relative duration. The generated BPF sleep handler computed requested_ns = tv_sec*1e9 + tv_nsec unconditionally, so absolute sleeps exported a bogus multi-decade "sleep duration" in CSV/parquet/stream.

    generateExtraSleep now carries an optional flags-argument expression per sleep syscall. For clock_nanosleep the generated handler checks args[1] & TIMER_ABSTIME (value 1) and only computes the relative duration when the flag is clear; absolute sleeps keep the existing -1 sentinel (same value used for null/unreadable timespec pointers). nanosleep is always relative and stays unconditional (no flags arg).

    - Regenerated internal/c/generated_tracepoints.c (mage generate idempotent).
    - Added codegen tests asserting the TIMER_ABSTIME guard for clock_nanosleep and its absence for nanosleep.
    - Extended the ioworkload sleep scenario to issue an absolute clock_nanosleep and the sleep parquet integration test to assert it is reported as -1.

    Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
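The guard described above can be modelled in plain Go. The sketch below mirrors the logic of the generated handler rather than reproducing it: the function requestedNs, the timespec type, and the hasFlags parameter are assumptions introduced for illustration; only the formula, the TIMER_ABSTIME value of 1, and the -1 sentinel come from the commit message.

```go
package main

import "fmt"

const (
	timerAbstime = 1             // TIMER_ABSTIME, value 1 as stated in the commit message
	sentinelNs   = -1            // reported for absolute sleeps and unreadable timespecs
	nsPerSecond  = 1_000_000_000 // nanoseconds per second
)

// timespec is a stand-in for the kernel's struct timespec.
type timespec struct {
	sec  int64
	nsec int64
}

// requestedNs (hypothetical) returns the relative sleep duration in nanoseconds.
// A nil ts models a null or unreadable pointer. hasFlags is false for syscalls
// without a flags argument (nanosleep), in which case flags is ignored.
func requestedNs(ts *timespec, hasFlags bool, flags uint64) int64 {
	if ts == nil {
		return sentinelNs
	}
	if hasFlags && flags&timerAbstime != 0 {
		return sentinelNs // request holds an absolute wakeup time, not a duration
	}
	return ts.sec*nsPerSecond + ts.nsec
}

func main() {
	relative := &timespec{sec: 1, nsec: 500_000_000}
	absolute := &timespec{sec: 1_780_000_000, nsec: 0}

	fmt.Println("clock_nanosleep relative:", requestedNs(relative, true, 0))
	fmt.Println("clock_nanosleep absolute:", requestedNs(absolute, true, timerAbstime))
	fmt.Println("nanosleep:               ", requestedNs(relative, false, 0))
}
```

Without the guard, the absolute request in this example would be exported as 1,780,000,000 seconds, which is the multi-decade duration the commit describes.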
2026-05-30 | fix(z10): skip enter-state write for noreturn syscalls | Paul Buetow

    After p10 suppressed the sys_exit_exit/sys_exit_exit_group handlers, the enter handlers for exit/exit_group still called ior_on_syscall_enter, which writes a per-tid entry into syscall_enter_state_map. With the exit handler gone, nothing ever bpf_map_delete_elem'd that entry, so stale per-tid state accumulated in the bounded (32768) map on hosts churning many distinct tids and could starve legitimate inserts.

    Add ior_on_noreturn_syscall_enter in internal/c/filter.c: it only makes the sampling decision (ior_should_emit_trace) and deliberately does NOT record enter-state. The code generator now emits this hook for noreturn enter handlers (detected via isNoreturnSyscall(syscallName(name))) so the enter null_event is still emitted while the dead, unreclaimable map write is skipped. Regenerated generated_tracepoints.c accordingly.

    Extend TestGenerateExitNoreturnHandlers with a negative assertion (no ior_on_syscall_enter for noreturn) and add TestGenerateReturningSyscallEnterRecordsState as a positive contrast.

    Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
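The leak this commit fixes can be reproduced with a toy model of a bounded per-tid map. Everything in the sketch is an assumption made for illustration (the enterStateMap type, its methods, the capacity of 4, and the always-true sampling decision); the real map is a BPF hash map with 32768 entries managed from C.

```go
package main

import "fmt"

// enterStateMap is a toy stand-in for the bounded per-tid BPF hash map.
type enterStateMap struct {
	capacity int
	entries  map[uint32]string
}

func newEnterStateMap(capacity int) *enterStateMap {
	return &enterStateMap{capacity: capacity, entries: make(map[uint32]string)}
}

// onSyscallEnter records enter state and reports false when the map is full.
func (m *enterStateMap) onSyscallEnter(tid uint32, name string) bool {
	if _, exists := m.entries[tid]; !exists && len(m.entries) >= m.capacity {
		return false
	}
	m.entries[tid] = name
	return true
}

// onSyscallExit reclaims the entry; it only runs for syscalls that return.
func (m *enterStateMap) onSyscallExit(tid uint32) {
	delete(m.entries, tid)
}

// onNoreturnSyscallEnter makes only the sampling decision (always true in this
// sketch) and deliberately writes nothing to the map.
func (m *enterStateMap) onNoreturnSyscallEnter(tid uint32) bool {
	return true
}

// simulate lets four threads call exit(2), then checks whether a later read(2)
// on a fresh tid can still record its enter state.
func simulate(useNoreturnHook bool) (leaked int, readRecorded bool) {
	m := newEnterStateMap(4)
	for tid := uint32(1); tid <= 4; tid++ {
		if useNoreturnHook {
			m.onNoreturnSyscallEnter(tid)
		} else {
			m.onSyscallEnter(tid, "exit")
		}
		// No exit event ever fires for exit(2), so onSyscallExit is never called.
	}
	return len(m.entries), m.onSyscallEnter(100, "read")
}

func main() {
	leaked, ok := simulate(false)
	fmt.Printf("regular hook:  leaked=%d read recorded=%v\n", leaked, ok)
	leaked, ok = simulate(true)
	fmt.Printf("noreturn hook: leaked=%d read recorded=%v\n", leaked, ok)
}
```

With the regular hook the four exiting threads fill the map and the later insert is starved; with the noreturn hook the map stays empty.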
2026-05-23 | 2c fix epoll_create and pidfd_open flags in BPF codegen | Paul Buetow

    epoll_create(size) was recording size (args[0]) as flags: hardcode to 0 since the syscall has no flags argument. pidfd_open(pid, flags) was recording pid (args[0]) as flags: use args[1] instead. Add test fixtures and codegen tests that verify the correct argument indexes and reject the old wrong ones. Regenerate generated_tracepoints.c.

    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 | 5c remove tracepoint ID adjacency dependency from aggregate pairing | Paul Buetow

    Generated exit handlers now pass the explicit enter trace ID (SYS_ENTER_X) to ior_on_syscall_exit instead of relying on the implicit enter_id == exit_id + 1 arithmetic invariant. filter.c compares directly against the passed enter ID.

    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
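The difference between the two pairing rules can be shown with a small Go model. The tracepoint ID values, the enterState type, and both function names are hypothetical; only the rule enter_id == exit_id + 1 and the idea of passing the enter ID explicitly are taken from the commit message.

```go
package main

import "fmt"

// Hypothetical tracepoint IDs, deliberately not adjacent.
const (
	sysEnterOpenat uint32 = 701
	sysExitOpenat  uint32 = 650
)

// enterState models the per-tid record written on syscall enter.
type enterState struct {
	traceID uint32 // ID of the enter tracepoint that created the record
}

// pairsByAdjacency models the old implicit rule: enter_id == exit_id + 1.
func pairsByAdjacency(st enterState, exitID uint32) bool {
	return st.traceID == exitID+1
}

// pairsByExplicitID models the new rule: the exit handler passes the enter ID
// it expects, and the comparison uses that value directly.
func pairsByExplicitID(st enterState, expectedEnterID uint32) bool {
	return st.traceID == expectedEnterID
}

func main() {
	st := enterState{traceID: sysEnterOpenat}
	fmt.Println("adjacency rule pairs:", pairsByAdjacency(st, sysExitOpenat))
	fmt.Println("explicit ID pairs:   ", pairsByExplicitID(st, sysEnterOpenat))
}
```

The explicit comparison keeps working however the IDs happen to be numbered, which is the dependency the commit removes.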
2026-05-23 | ac table-drive BPF extra-code generation away from switches | Paul Buetow

    Replace the large switch in generateExtra with an extraEmitters registry (map[TracepointKind]extraEmitter) and convert six inner switch-on-name helpers to table-driven lookups:

    - generateExtraMem -> memFieldOverrides table
    - generateExtraEventfd -> eventfdFlagsExpr table
    - generateExtraTwoFd -> twoFdOverrides + twoFdDefault
    - generateExtraPoll -> pollOverrides + pollTimeoutBody(style)
    - generateExtraSleep -> sleepTimespecPtr table
    - generateExtraKeyctl -> keyctlOverrides table

    Adding a new syscall kind or variant now requires only a table entry instead of editing switch arms with raw C string literals. Generated BPF C output is behaviorally equivalent; all existing tests pass unchanged.

    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
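A registry of this shape might look like the sketch below. The identifiers extraEmitters, extraEmitter, TracepointKind, generateExtra, generateExtraSleep and sleepTimespecPtr appear in the commit message; their signatures, the table contents, and the emitted C line are assumptions. The argument indexes follow the syscall prototypes (nanosleep takes the request timespec as its first argument, clock_nanosleep as its third).

```go
package main

import (
	"fmt"
	"strings"
)

// TracepointKind identifies a class of syscalls; the values here are illustrative.
type TracepointKind int

const (
	KindNull TracepointKind = iota
	KindSleep
)

// extraEmitter writes kind-specific C code for one syscall (assumed signature).
type extraEmitter func(sb *strings.Builder, name string)

// sleepTimespecPtr maps each sleep syscall to the C expression holding its
// request timespec pointer. It stands in for a former switch on the name.
var sleepTimespecPtr = map[string]string{
	"nanosleep":       "ctx->args[0]",
	"clock_nanosleep": "ctx->args[2]",
}

// extraEmitters is the registry that stands in for the former switch on kind.
var extraEmitters = map[TracepointKind]extraEmitter{
	KindSleep: generateExtraSleep,
}

// generateExtraSleep emits a hypothetical C line for known sleep syscalls.
func generateExtraSleep(sb *strings.Builder, name string) {
	ptr, ok := sleepTimespecPtr[name]
	if !ok {
		return
	}
	fmt.Fprintf(sb, "    const void *req = (const void *)%s;\n", ptr)
}

// generateExtra dispatches through the registry; kinds without an emitter
// produce no extra code.
func generateExtra(kind TracepointKind, name string) string {
	var sb strings.Builder
	if emit, ok := extraEmitters[kind]; ok {
		emit(&sb, name)
	}
	return sb.String()
}

func main() {
	fmt.Print(generateExtra(KindSleep, "nanosleep"))
	fmt.Print(generateExtra(KindSleep, "clock_nanosleep"))
}
```

Supporting another sleep variant in this sketch means adding one entry to sleepTimespecPtr; supporting another kind means adding one entry to extraEmitters.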
2026-05-21 | h7 classify additional memory syscalls | Paul Buetow
2026-05-21 | i7 classify memory P3 syscalls as mem kind | Paul Buetow
2026-05-21 | g7 classify fd-from-air eventfd users | Paul Buetow
2026-05-21 | n7 classify pidfd and misc tail syscalls | Paul Buetow
2026-05-21 | f7 wire eventfd kind for fd-from-air IPC syscalls | Paul Buetow
2026-05-21 | p7 add attach-time trace dimension gating | Paul Buetow
2026-05-21 | task-47: fix execveat dirfd codegen fallback | Paul Buetow
2026-05-20 | task-47: add KindExec for execve paths | Paul Buetow
2026-05-20 | feat: add keyctl ptrace perf_event_open tracing (task 77) | Paul Buetow
2026-05-20 | d7: add POSIX mq syscall kind/classification and coverage | Paul Buetow
2026-05-20 | feat: add mount/fs management syscall tracing for c7 | Paul Buetow
2026-05-20 | task 27: add KindSleep and requested sleep metric | Paul Buetow
2026-05-20 | feat: add syscall aggregate sampling infrastructure (task 17) | Paul Buetow
2026-05-20 | task 07: add KindMem and separate address-space byte accounting | Paul Buetow
2026-05-19 | z6: add KindPoll wiring for poll/select ready counts | Paul Buetow
2026-05-19 | y6: add epoll ctl/wait tracing and ready-count coverage | Paul Buetow
2026-05-19 | x6: add pipe/eventfd fd-from-air syscall support | Paul Buetow
2026-05-19 | v6: add KindAccept and wire accept/accept4 | Paul Buetow
2026-05-19 | u6: fix socketpair exit fd capture and socket filtering | Paul Buetow
2026-05-19 | u6: add socket/socketpair kind scaffolding and wiring | Paul Buetow
2026-05-13 | refactor: break down functions exceeding 50 lines into smaller helpers | Paul Buetow

    Split 22 production files across the codebase (event loop, TUI models, probe manager, dashboard, export, flag parsing, code generation, and ioworkload scenarios) so that no function body exceeds 50 lines. Each extracted helper carries its own comment explaining its role.

    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 | refactor(generate): replace classifySyscall switches with kindRegistry (OCP) | Paul Buetow

    Introduce kindregistry.go with a kindMeta struct (structName, enterAccepted) and a kindRegistry map keyed by TracepointKind. Replace the switch in isEnterRejected (codegen.go) and the switch in eventStructName (bpfhandler.go) with lookupKind registry lookups. Adding a new TracepointKind now only requires a single registry entry; no switch statements need to be touched.

    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
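As a rough sketch of the registry described above: kindMeta, its two fields, kindRegistry, lookupKind, eventStructName and isEnterRejected are named in the commit message, but the signatures, the registry entries, the struct names other than null_event, and the fallback behavior for unregistered kinds are all assumptions of this example.

```go
package main

import "fmt"

// TracepointKind identifies a class of syscalls; the values here are illustrative.
type TracepointKind int

const (
	KindNull TracepointKind = iota
	KindSleep
	KindUnregistered // hypothetical kind with no registry entry
)

// kindMeta carries the per-kind facts that used to live in switch arms.
type kindMeta struct {
	structName    string // C event struct emitted for this kind
	enterAccepted bool   // whether enter tracepoints of this kind are handled
}

// kindRegistry holds one entry per kind. The values are made up for the sketch.
var kindRegistry = map[TracepointKind]kindMeta{
	KindNull:  {structName: "null_event", enterAccepted: true},
	KindSleep: {structName: "sleep_event", enterAccepted: true},
}

// lookupKind returns the metadata for kind and whether it is registered.
func lookupKind(kind TracepointKind) (kindMeta, bool) {
	meta, ok := kindRegistry[kind]
	return meta, ok
}

// eventStructName returns the C struct name, or "" for unregistered kinds
// (the fallback is an assumption).
func eventStructName(kind TracepointKind) string {
	meta, _ := lookupKind(kind)
	return meta.structName
}

// isEnterRejected treats unregistered kinds as rejected (also an assumption).
func isEnterRejected(kind TracepointKind) bool {
	meta, ok := lookupKind(kind)
	return !ok || !meta.enterAccepted
}

func main() {
	for _, kind := range []TracepointKind{KindNull, KindSleep, KindUnregistered} {
		fmt.Printf("kind=%d struct=%q enterRejected=%v\n", kind, eventStructName(kind), isEnterRejected(kind))
	}
}
```

Both callers read from the same table, so a new kind is described in exactly one place.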
2026-05-02 | fix BPF tracepoint context type for RHEL 9 stock kernel | Paul Buetow

    The BPF handler generator emitted struct trace_event_raw_sys_enter/trace_event_raw_sys_exit (the BTF-blessed aliases). RHEL 9 carries an rt-tree backport that adds preempt_lazy_count to struct trace_entry, which widens those aliases by 8 bytes and shifts args/ret. The actual tracepoint context the kernel hands the program is still syscall_trace_enter / syscall_trace_exit, where the offsets did not move. Programs typed against the wider alias read past max_ctx_offset and the verifier rejects the attach with EACCES.

    Switching the generator to emit syscall_trace_enter/exit lines up with the real context on RHEL 9 (and is identical on every other distro, since the two structs only diverge there). Same fix bcc shipped in iovisor/bcc#4920 and inspektor-gadget did in inspektor-gadget#2546. Field accesses (ctx->args[N], ctx->ret) are unchanged.

    Verified end-to-end on Rocky Linux 9.7 stock 5.14.0-611.5.1.el9_7 (no kernel-ml needed) and Fedora 6.19.

    README rewritten accordingly: drops the elrepo kernel-ml step and the trailing 'permission denied' troubleshooting paragraph; adds a historical note explaining why the old workaround existed.

    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
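The offset shift can be illustrated by mirroring the C layouts as Go structs and asking the compiler for field offsets. This is only a model under stated assumptions: the Go type names are invented, the upstream trace_entry fields follow the mainline kernel definition, the single-byte type chosen for preempt_lazy_count is a guess about the backport, and the numbers hold only on a 64-bit Go target where 8-byte fields are 8-byte aligned.

```go
package main

import (
	"fmt"
	"unsafe"
)

// traceEntry models upstream struct trace_entry (8 bytes).
type traceEntry struct {
	typ          uint16
	flags        uint8
	preemptCount uint8
	pid          int32
}

// traceEntryRT models the backported variant. Treating preempt_lazy_count as
// a single byte is an assumption of this sketch.
type traceEntryRT struct {
	typ              uint16
	flags            uint8
	preemptCount     uint8
	pid              int32
	preemptLazyCount uint8
}

// rawSysEnter models trace_event_raw_sys_enter on an upstream kernel.
type rawSysEnter struct {
	ent  traceEntry
	id   int64
	args [6]uint64
}

// rawSysEnterRT models the same alias built on the wider trace_entry.
type rawSysEnterRT struct {
	ent  traceEntryRT
	id   int64
	args [6]uint64
}

// syscallTraceEnterRT models syscall_trace_enter with the wider trace_entry.
// Its 4-byte nr field fits into the padding that the 8-byte id cannot use.
type syscallTraceEnterRT struct {
	ent  traceEntryRT
	nr   int32
	args [6]uint64
}

// argsOffsets returns the byte offset of args in each of the three layouts.
func argsOffsets() (rawUpstream, rawRT, ctxRT uintptr) {
	var a rawSysEnter
	var b rawSysEnterRT
	var c syscallTraceEnterRT
	return unsafe.Offsetof(a.args), unsafe.Offsetof(b.args), unsafe.Offsetof(c.args)
}

func main() {
	rawUpstream, rawRT, ctxRT := argsOffsets()
	fmt.Println("trace_event_raw_sys_enter (upstream) args offset:", rawUpstream)
	fmt.Println("trace_event_raw_sys_enter (widened)  args offset:", rawRT)
	fmt.Println("syscall_trace_enter       (widened)  args offset:", ctxRT)
}
```

Under these assumptions the widened alias places args 8 bytes later than the upstream layout, while the syscall_trace_enter model keeps args at the upstream offset, consistent with the commit's description.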
2026-02-23 | Add baseline pidfd_getfd tracepoint support | Paul Buetow
2026-02-23 | Fix integration trace expectations and fd/open event handling | Paul Buetow
2026-02-21 | Migrate make targets to mage | Paul Buetow

    Amp-Thread-ID: https://ampcode.com/threads/T-019c7f4e-cc5f-76f1-aaf0-dd7cbaabbb18
    Co-authored-by: Amp <amp@ampcode.com>