From 7031211501884555139351bb676fc0592c9df14c Mon Sep 17 00:00:00 2001
From: Paul Buetow
Date: Tue, 9 Jun 2026 22:18:42 +0300
Subject: feat(parquet): surface epoll_ctl op/target-fd/events metadata

epoll_ctl's BPF handler already decodes the operation (args[1]), target
descriptor (args[2]), and requested event mask (args[3]->events) into an
EpollCtlEvent, but the single resolved-epfd `fd` column was the only
epoll detail reaching the output schema. Consumers could not see which
descriptor was registered nor the operation performed.

Surface the metadata as three additive, backward-compatible columns,
mirroring the existing dedicated optional-column convention used by
requested_sleep_ns and address_space_bytes:

- epoll_op (String): ADD/MOD/DEL, or the raw decimal for unknown ops;
  empty for non-epoll_ctl rows.
- epoll_target_fd (Int32): registered descriptor (args[2]); 0 otherwise.
- epoll_events (UInt32): requested event mask; 0 otherwise.

Data flows EpollCtlEvent -> event.Pair (new EpollCtl/HasEpoll fields,
populated in handleEpollCtlExit) -> streamrow.Row -> parquet.Record.
The op-to-string mapping lives on event.EpollCtl.OpName.

Docs (docs/parquet-querying.md) and the Magefile parquetValidate column
list updated in lockstep (also adding the previously-undocumented
address_space_bytes/requested_sleep_ns columns).

The polling parquet integration test now asserts epoll_ctl rows carry a
decoded op and a valid target fd, and that other syscalls leave epoll_op
empty.

Co-Authored-By: Claude Opus 4.8
---
 docs/parquet-querying.md | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)
 (limited to 'docs')

diff --git a/docs/parquet-querying.md b/docs/parquet-querying.md
index 4c31474..2ebf16e 100644
--- a/docs/parquet-querying.md
+++ b/docs/parquet-querying.md
@@ -30,9 +30,14 @@ state, no installation needed beyond Docker.
 | `fd` | Int32 | File descriptor |
 | `ret` | Int64 | Return value (negative = errno) |
 | `bytes` | UInt64 | Bytes transferred (0 if not applicable) |
+| `address_space_bytes` | UInt64 | Memory-region extent for memory syscalls (e.g. `munmap`/`mremap`); 0 otherwise |
+| `requested_sleep_ns` | Int64 | Requested sleep duration for nanosleep-style syscalls; 0 otherwise |
 | `file` | String | File path (empty if not resolved) |
 | `is_error` | Bool | True when `ret` is a negative errno |
 | `filter_epoch` | UInt64 | Filter generation at capture time |
+| `epoll_op` | String | `epoll_ctl` operation (`ADD`/`MOD`/`DEL`); empty for other syscalls |
+| `epoll_target_fd` | Int32 | `epoll_ctl` target descriptor being registered (args[2]); 0 for other syscalls |
+| `epoll_events` | UInt32 | `epoll_ctl` requested event mask (args[3]->events); 0 for other syscalls |
 
 ---
 
@@ -78,12 +83,17 @@ pid UInt32
 tid UInt32
 syscall String
 family String
-fd Int32
-ret Int64
-bytes UInt64
-file String
-is_error Bool
-filter_epoch UInt64
+fd Int32
+ret Int64
+bytes UInt64
+address_space_bytes UInt64
+requested_sleep_ns Int64
+file String
+is_error Bool
+filter_epoch UInt64
+epoll_op String
+epoll_target_fd Int32
+epoll_events UInt32
 ```
 
 ### Row count
@@ -220,6 +230,6 @@ PARQUET_FILE=ior-recording-20260313-170234.parquet env GOTOOLCHAIN=auto mage par
 ```
 
 It checks:
-1. All 14 expected columns are present
+1. All 20 expected columns are present
 2. Row count > 0
 3. `seq` is monotonically ordered and `time_ns` is non-zero
-- 
cgit v1.2.3
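
The commit message describes `epoll_op` as ADD/MOD/DEL with a raw-decimal fallback, and `epoll_events` as a raw mask. A minimal Go sketch of that decoding follows. It is not the repository's `event.EpollCtl.OpName` (which this patch does not show); `opName`, `eventNames`, and `epollFlags` are hypothetical names, and the numeric values are the Linux `<sys/epoll.h>` constants, with only a subset of the event flags listed.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// epoll_ctl operation codes as defined in Linux <sys/epoll.h>.
const (
	epollCtlAdd = 1
	epollCtlDel = 2
	epollCtlMod = 3
)

// opName is a hypothetical stand-in for the op-to-string mapping:
// known operations map to ADD/MOD/DEL, anything else to its decimal value.
func opName(op int32) string {
	switch op {
	case epollCtlAdd:
		return "ADD"
	case epollCtlDel:
		return "DEL"
	case epollCtlMod:
		return "MOD"
	default:
		return strconv.FormatInt(int64(op), 10)
	}
}

// epollFlags lists a subset of the event bits from <sys/epoll.h>.
var epollFlags = []struct {
	bit  uint32
	name string
}{
	{0x001, "EPOLLIN"},
	{0x002, "EPOLLPRI"},
	{0x004, "EPOLLOUT"},
	{0x008, "EPOLLERR"},
	{0x010, "EPOLLHUP"},
	{0x2000, "EPOLLRDHUP"},
	{1 << 30, "EPOLLONESHOT"},
	{1 << 31, "EPOLLET"},
}

// eventNames renders an epoll_events mask as "A|B|..."; bits not in
// epollFlags are appended as a single hex remainder.
func eventNames(mask uint32) string {
	var parts []string
	for _, f := range epollFlags {
		if mask&f.bit != 0 {
			parts = append(parts, f.name)
			mask &^= f.bit
		}
	}
	if mask != 0 {
		parts = append(parts, fmt.Sprintf("0x%x", mask))
	}
	return strings.Join(parts, "|")
}

func main() {
	fmt.Println(opName(1), opName(3), opName(9))
	fmt.Println(eventNames(0x80000001))
}
```

Consumers reading the Parquet file would apply something like `eventNames` to the `epoll_events` column themselves, since the schema stores the mask as a plain UInt32.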