Zachary D. Rowitsch 38e6dcc34a chore: archive v1.0 phase directories to milestones/v1.0-phases/
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 01:33:15 -04:00


Phase 1: Data Pipeline - Research

Researched: 2026-03-21
Domain: Linux eBPF kernel tracing, TCP/UDP connection monitoring, Aya Rust eBPF framework
Confidence: HIGH

Summary

Phase 1 implements the core data pipeline: eBPF programs in the Linux kernel capture TCP and UDP events (connections, bytes, packets, state changes, RTT), push them through a ring buffer to userspace, where a Rust application aggregates per-connection statistics and streams them to stdout. This is a greenfield Aya eBPF project following the standard workspace template (ebpf crate, common crate, userspace crate).

The eBPF hook strategy combines kprobes for byte/packet counting (tcp_sendmsg, tcp_recvmsg, udp_sendmsg, udp_recvmsg) with the sock:inet_sock_set_state tracepoint for TCP state transitions and reading srtt_us from the sock struct for RTT. Pre-existing connections are bootstrapped from /proc/net/tcp and /proc/net/udp on startup. All kernel-to-userspace communication uses a single RingBuf map (requires kernel 5.8+).

The platform abstraction trait (NetworkCollector) is defined in this phase to enable the macOS backend in Phase 4, but only the Linux/eBPF implementation is built. The proof-of-life output is streaming lines to stdout -- no TUI, no cursor manipulation.

Primary recommendation: Use the Aya template workspace structure with aya-build in build.rs (not xtask). Attach to 4 kprobes + 1 tracepoint, use a single RingBuf for all events, and keep the common crate's event enum as the single shared type between kernel and userspace.

<user_constraints>

User Constraints (from CONTEXT.md)

Locked Decisions

  • D-01: Streaming lines to stdout -- each connection update prints a new line (like tail -f). No screen clearing or cursor manipulation.
  • D-02: Human-readable sizes by default (1.2 MB, 340 KB/s). Show raw value alongside when it fits cleanly without noise (e.g., 1258291 (1.2M)). If too noisy, drop to human-readable only.
  • D-03: Full detail per connection -- all fields: local/remote addr+port, PID, process name, TCP state, bytes in/out, packets in/out, RTT, bandwidth rate.
  • D-04: Output is assumed throwaway scaffolding. Build it simple; decide whether to keep as --batch mode when TUI lands in Phase 2.
  • D-05: Full 4-tuple grouping -- (src IP, src port, dst IP, dst port) = one flow for UDP.
  • D-06: 5-second idle timeout -- UDP flows disappear 5s after last packet. Tunable via flag deferred.
  • D-07: No synthesized state for UDP -- show - or UDP in the state column.
  • D-08: Flat bidirectional flow tracking -- count bytes/packets in each direction, no request/response inference.
  • D-09: Minimal error message on missing privileges: error: tcptop requires root privileges. Run with sudo.
  • D-10: If Linux capabilities (CAP_BPF, CAP_PERFMON) are present, proceed without root. Don't suggest capabilities in error message.
  • D-11: Exit code 77 on privilege failure.
  • D-12: Closed TCP connections linger for one display/refresh cycle, then are removed.
  • D-13: Visual distinction for new/closing connections -- exact styling deferred to Phase 2.
  • D-14: Connection close events print [CLOSED] 192.168.1.1:443 -> ... in streaming output.
  • D-15: Pre-existing connections sourced from /proc/net/tcp on startup, marked as partial.

Claude's Discretion

  • eBPF hook point selection (kprobes vs tracepoints)
  • Platform abstraction trait design and boundaries
  • Ring buffer vs perf event array for kernel-to-userspace transport
  • RTT estimation implementation approach
  • Exact format/layout of streaming output lines
  • /proc/net/tcp parsing strategy for pre-existing connections
  • Internal data structures and concurrency model

Deferred Ideas (OUT OF SCOPE)

  • UDP flow idle timeout user-configurable via CLI flag -- Phase 2 or 3
  • Protocol hint column for well-known UDP ports -- future enhancement
  • Toggle between human-readable and raw byte display via keypress -- Phase 2 TUI
  • --batch or --once mode -- evaluate after Phase 2

</user_constraints>

<phase_requirements>

Phase Requirements

| ID | Description | Research Support |
|---|---|---|
| DATA-01 | Per-connection byte counts (sent/received) in real time | kprobes on tcp_sendmsg/tcp_recvmsg/udp_sendmsg/udp_recvmsg capture byte counts from msg length args |
| DATA-02 | Per-connection packet counts (sent/received) in real time | Same kprobes increment packet counters per event |
| DATA-03 | TCP connection state (ESTABLISHED, LISTEN, TIME_WAIT, etc.) | sock:inet_sock_set_state tracepoint fires on every state transition with old+new state |
| DATA-04 | Correlate connection to owning process (PID, process name) | bpf_get_current_pid_tgid() in kprobes captures PID; process name via bpf_get_current_comm() or userspace sysinfo/procfs |
| DATA-05 | Per-connection TCP RTT estimate | Read srtt_us from struct tcp_sock in kprobe context (shifted right by 3 for microseconds) |
| DATA-06 | Bandwidth rates (KB/s or MB/s) per connection | Userspace calculation: delta bytes / delta time between refresh cycles |
| DATA-07 | Track TCP and UDP (UDP flows synthesized from 4-tuple with idle timeout) | UDP kprobes + userspace HashMap keyed by 4-tuple with 5s idle expiry |
| PLAT-01 | Works on Linux kernel 5.8+ using eBPF | RingBuf requires 5.8+; all hook points available in 5.8+; Aya handles BTF portability |
| PLAT-03 | Platform abstraction allows different backends | NetworkCollector trait with async Stream of connection events; Linux impl in this phase, macOS impl in Phase 4 |
| OPS-01 | Detect missing root/elevated privileges with clear error | Check geteuid() == 0 or CAP_BPF+CAP_PERFMON capabilities before eBPF load |
| OPS-02 | Low overhead on host (no heavy polling, eBPF ring buffer based) | RingBuf is event-driven (no polling); kprobes fire only on actual network calls |
</phase_requirements>

Standard Stack

Core (Phase 1 Dependencies)

| Library | Version | Purpose | Verified |
|---|---|---|---|
| aya | 0.13.1 | eBPF loader, map access, program attachment | crates.io 2025-11-17 |
| aya-ebpf | 0.1.1 | Kernel-side eBPF macros (#[kprobe], #[tracepoint]) | crates.io 2025-11-17 |
| aya-log | 0.2.1 | Userspace eBPF log receiver | crates.io 2024-10-09 |
| aya-log-ebpf | 0.1.0 | Kernel-side eBPF logging (info!, debug!) | crates.io 2024-10-09 |
| aya-build | 0.1.3 | Build script for eBPF compilation (replaces xtask) | crates.io 2025-11-17 |
| tokio | 1.50.0 | Async runtime (eBPF event loop, timers) | crates.io 2026-03-03 |
| clap | 4.6.0 | CLI argument parsing | crates.io 2026-03-12 |
| anyhow | 1.0.102 | Application error handling | crates.io 2026-02-20 |
| thiserror | 2.0.18 | Typed errors for library/trait code | crates.io 2026-01-18 |
| nix | 0.31.2 | Unix syscalls (privilege check, geteuid) | crates.io 2026-02-28 |
| procfs | 0.18.0 | /proc/net/tcp and /proc/net/udp parsing | crates.io 2025-08-30 |
| serde | 1.0.228 | Serialization for shared types | crates.io 2025-09-27 |
| log + env_logger | 0.4.x / 0.11.x | Internal diagnostics logging | Stable, widely used |
| signal-hook | 0.4.3 | Graceful SIGINT/SIGTERM shutdown | crates.io 2026-01-24 |

eBPF Crate Dependencies (tcptop-ebpf)

| Library | Version | Purpose |
|---|---|---|
| aya-ebpf | 0.1.1 | eBPF program macros and helpers |
| aya-log-ebpf | 0.1.0 | Kernel-side logging |

Build Tools

| Tool | Purpose | Notes |
|---|---|---|
| Rust nightly | eBPF compilation target | Pin in rust-toolchain.toml with the rust-src component; bpfel-unknown-none is built via build-std, not installed by rustup |
| bpf-linker | Links eBPF object files | cargo install bpf-linker (on macOS: --no-default-features + LLVM 21+) |
| aya-build | build.rs integration | Compiles the eBPF crate during cargo build -- no xtask needed |

Not Needed in Phase 1

| Library | Why Deferred |
|---|---|
| ratatui / crossterm | Phase 2 (TUI) |
| csv | Phase 3 (CSV logging) |
| pcap | Phase 4 (macOS backend) |
| sysinfo | Not needed if bpf_get_current_comm() suffices for process names in kernel; procfs covers PID-to-name fallback |

Architecture Patterns

tcptop/
├── Cargo.toml                    # Workspace root
├── rust-toolchain.toml           # Pin nightly + bpfel-unknown-none target
├── .cargo/config.toml            # Build flags for eBPF target
├── tcptop/                       # Userspace binary crate
│   ├── Cargo.toml
│   ├── build.rs                  # Uses aya-build to compile eBPF crate
│   └── src/
│       ├── main.rs               # Entry point: privilege check, CLI, event loop
│       ├── collector/
│       │   ├── mod.rs            # NetworkCollector trait definition
│       │   └── linux.rs          # Linux/eBPF implementation
│       ├── model.rs              # ConnectionRecord, ConnectionKey, ConnectionStats
│       ├── aggregator.rs         # Connection state aggregation, bandwidth calc, UDP timeout
│       ├── output.rs             # Streaming stdout formatter
│       ├── privilege.rs          # Root/capability check logic
│       └── proc_bootstrap.rs     # /proc/net/tcp+udp parser for pre-existing connections
├── tcptop-ebpf/                  # eBPF kernel programs (no_std)
│   ├── Cargo.toml
│   └── src/
│       └── main.rs               # All eBPF programs: kprobes + tracepoint
└── tcptop-common/                # Shared types (no_std compatible)
    ├── Cargo.toml
    └── src/
        └── lib.rs                # Event enum, ConnectionKey, #[repr(C)] structs

Pattern 1: Event-Driven eBPF Pipeline

What: Kernel eBPF programs emit events into a RingBuf; userspace consumes them asynchronously via tokio and updates an in-memory connection table.

When to use: Always -- this is the core architecture.

Flow:

Kernel: kprobe/tracepoint fires
  -> eBPF program extracts fields from sock/skb
  -> Writes TcptopEvent to RingBuf
  -> Userspace tokio task polls RingBuf via AsyncFd
  -> Deserializes event, updates ConnectionTable HashMap
  -> Periodic tick formats and prints to stdout

eBPF side (kernel):

// tcptop-common/src/lib.rs
#![no_std]

#[repr(C)]
#[derive(Clone, Copy)]
pub enum TcptopEvent {
    TcpSend(DataEvent),
    TcpRecv(DataEvent),
    UdpSend(DataEvent),
    UdpRecv(DataEvent),
    TcpStateChange(StateEvent),
}

#[repr(C)]
#[derive(Clone, Copy)]
pub struct DataEvent {
    pub pid: u32,
    pub comm: [u8; 16],
    pub saddr: u32,       // IPv4 for now; extend to support IPv6
    pub daddr: u32,
    pub sport: u16,
    pub dport: u16,
    pub bytes: u32,
    pub srtt_us: u32,     // only meaningful for TCP
}

#[repr(C)]
#[derive(Clone, Copy)]
pub struct StateEvent {
    pub pid: u32,
    pub saddr: u32,
    pub daddr: u32,
    pub sport: u16,
    pub dport: u16,
    pub old_state: u32,
    pub new_state: u32,
}

eBPF program (kernel):

// tcptop-ebpf/src/main.rs
#![no_std]
#![no_main]

use aya_ebpf::{macros::{kprobe, tracepoint, map}, maps::RingBuf, programs::{ProbeContext, TracePointContext}};
use tcptop_common::{TcptopEvent, DataEvent};

#[map]
static EVENTS: RingBuf = RingBuf::with_byte_size(256 * 1024, 0); // 256 KB

#[kprobe]
pub fn tcp_sendmsg(ctx: ProbeContext) -> u32 {
    match try_tcp_sendmsg(ctx) {
        Ok(ret) => ret,
        Err(_) => 0,
    }
}

fn try_tcp_sendmsg(ctx: ProbeContext) -> Result<u32, i64> {
    // arg0: struct sock*, arg1: struct msghdr*, arg2: size_t (bytes)
    let _sock: *const core::ffi::c_void = ctx.arg(0).ok_or(1)?;
    let _size: usize = ctx.arg(2).ok_or(1)?;
    // Extract the connection tuple from sock->__sk_common (via
    // bpf_probe_read_kernel), then publish the event:
    if let Some(mut entry) = EVENTS.reserve::<TcptopEvent>(0) {
        // entry.write(TcptopEvent::TcpSend(...)) once fields are populated
        entry.submit(0);
    }
    Ok(0)
}

Userspace consumer:

// Userspace: read from the RingBuf, using AsyncFd for readiness notification
use std::os::fd::AsRawFd;

use aya::maps::RingBuf;
use tokio::io::unix::AsyncFd;

let mut ring_buf = RingBuf::try_from(bpf.take_map("EVENTS").unwrap())?;
let async_fd = AsyncFd::new(ring_buf.as_raw_fd())?;

loop {
    let mut guard = async_fd.readable().await?;
    while let Some(item) = ring_buf.next() {
        // Safety: the kernel side wrote a #[repr(C)] TcptopEvent (see Pitfall 1).
        let event: &TcptopEvent = unsafe { &*(item.as_ptr() as *const TcptopEvent) };
        connection_table.update(event);
    }
    guard.clear_ready();
}

Pattern 2: Platform Abstraction Trait

What: A trait that abstracts the data collection backend, enabling Linux (eBPF) and macOS (pcap) implementations behind a common interface.

Design:

// collector/mod.rs
use anyhow::Result;
use tokio::sync::mpsc;

use crate::model::ConnectionRecord;
use tcptop_common::{DataEvent, StateEvent};

pub enum CollectorEvent {
    Data(DataEvent),
    StateChange(StateEvent),
}

// async fn in a dyn-safe trait needs the async-trait crate (add it to the
// userspace dependencies alongside tokio).
#[async_trait::async_trait]
pub trait NetworkCollector: Send {
    /// Start collecting. Sends events to the provided channel.
    async fn start(&mut self, tx: mpsc::Sender<CollectorEvent>) -> Result<()>;

    /// Stop collecting and clean up kernel resources.
    async fn stop(&mut self) -> Result<()>;

    /// Bootstrap pre-existing connections (e.g., from /proc/net/tcp).
    fn bootstrap_existing(&self) -> Result<Vec<ConnectionRecord>>;
}

Why mpsc channel: Decouples the collector from the aggregator. The eBPF ring buffer reader pushes events into the channel; the aggregator consumes them on its own schedule. This also naturally supports the macOS backend in Phase 4 pushing pcap-derived events into the same channel.

Pattern 3: Connection Table with Bandwidth Calculation

What: In-memory HashMap keyed by connection tuple, with periodic tick for rate calculation and UDP idle timeout.

use std::collections::HashMap;
use std::time::{Duration, Instant};

pub struct ConnectionTable {
    connections: HashMap<ConnectionKey, ConnectionRecord>,
    last_tick: Instant,
}
}

impl ConnectionTable {
    pub fn tick(&mut self) {
        let now = Instant::now();
        let dt = now.duration_since(self.last_tick);

        for record in self.connections.values_mut() {
            record.rate_in = (record.bytes_in - record.prev_bytes_in) as f64 / dt.as_secs_f64();
            record.rate_out = (record.bytes_out - record.prev_bytes_out) as f64 / dt.as_secs_f64();
            record.prev_bytes_in = record.bytes_in;
            record.prev_bytes_out = record.bytes_out;
        }

        // Expire UDP flows idle > 5 seconds
        self.connections.retain(|k, r| {
            if k.protocol == Protocol::Udp {
                r.last_seen.elapsed() < Duration::from_secs(5)
            } else {
                true // TCP lifecycle managed by state events
            }
        });

        self.last_tick = now;
    }
}
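As a sanity check, the delta-bytes / delta-time step in tick() can be exercised in isolation. A minimal, self-contained sketch -- Counters and update_rate are simplified stand-ins, not the real model.rs types:

```rust
use std::time::Duration;

// Simplified stand-in for ConnectionRecord's counter fields.
struct Counters {
    bytes_in: u64,
    prev_bytes_in: u64,
    rate_in: f64, // bytes per second over the last tick interval
}

// One direction of the tick() rate step: delta bytes / delta time,
// then roll the counter forward for the next interval.
fn update_rate(c: &mut Counters, dt: Duration) {
    let secs = dt.as_secs_f64();
    if secs > 0.0 {
        c.rate_in = (c.bytes_in - c.prev_bytes_in) as f64 / secs;
    }
    c.prev_bytes_in = c.bytes_in;
}

fn main() {
    let mut c = Counters { bytes_in: 1_048_576, prev_bytes_in: 0, rate_in: 0.0 };
    update_rate(&mut c, Duration::from_secs(2));
    println!("{} B/s", c.rate_in); // 1 MiB over 2 s = 524288 B/s
}
```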

Anti-Patterns to Avoid

  • Polling /proc/net/tcp in a loop: Misses short-lived connections, high overhead, racy PID mapping. Use eBPF for event-driven capture; /proc only at startup for bootstrapping.
  • PerfEventArray instead of RingBuf for this use case: PerfEventArray is per-CPU with no ordering guarantees and requires per-CPU buffer management. RingBuf provides global ordering (important for state change events), shared buffer (simpler), and notification via fd. Since kernel 5.8 is the minimum anyway, always prefer RingBuf.
  • Separate RingBuf per program type: Unnecessary complexity. A single RingBuf with a tagged enum (TcptopEvent) handles all event types cleanly.
  • Blocking the eBPF event loop with stdout writes: The RingBuf consumer and the stdout printer must be decoupled. If stdout blocks (pipe full), it must not cause RingBuf overflow. Use a channel between consumer and printer.
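That last decoupling can be sketched with plain std channels (the real pipeline would use tokio mpsc; the names here are illustrative): a bounded channel absorbs bursts, and try_send lets the consumer count drops instead of ever blocking the drain loop.

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    // Bounded channel between the ring-buffer consumer and the printer.
    let (tx, rx) = mpsc::sync_channel::<String>(1024);

    // Printer side: drains the channel; if stdout stalls, only this
    // thread stalls -- the channel, not the RingBuf, absorbs the burst.
    let printer = thread::spawn(move || {
        for line in rx {
            println!("{line}");
        }
    });

    // Consumer side: never block. try_send fails fast when the channel
    // is full, so we count the drop and keep draining the ring buffer.
    let mut dropped = 0usize;
    for i in 0..3 {
        if tx.try_send(format!("event {i}")).is_err() {
            dropped += 1;
        }
    }

    drop(tx); // closing the channel lets the printer thread exit
    printer.join().unwrap();
    assert_eq!(dropped, 0); // capacity 1024 easily absorbs 3 events
}
```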

Don't Hand-Roll

| Problem | Don't Build | Use Instead | Why |
|---|---|---|---|
| /proc/net/tcp parsing | Custom line parser | procfs crate (0.18.0) | Handles IPv4/IPv6, hex parsing, socket states, inode mapping. Battle-tested. |
| CLI argument parsing | Manual std::env::args | clap 4.6.0 derive macros | Handles --help, validation, error messages automatically |
| eBPF program compilation | Custom build scripts | aya-build 0.1.3 in build.rs | Handles target selection, cross-compilation, artifact embedding |
| Async eBPF event reading | Raw epoll/poll loop | tokio::io::unix::AsyncFd wrapping the RingBuf fd | Integrates with the tokio runtime, handles wakeups correctly |
| Privilege checking | Raw syscall | nix::unistd::geteuid() + caps crate or manual /proc/self/status CapEff parse | Correct cross-distro behavior |
| Human-readable byte formatting | Pulling in a formatting crate | Simple utility function (too small for a crate) | Only needs KB/MB/GB and rate formatting; ~20 lines of code |

Key insight: The eBPF build pipeline is the most complex "don't hand-roll" item. Aya's template structure and aya-build handle the nightmare of cross-compiling no_std Rust to BPF bytecode, linking, and embedding the result in the userspace binary.
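For the byte-formatting helper mentioned above, a sketch -- binary units (1 KB = 1024 B) to match the 1258291 -> 1.2M example in D-02; thresholds and rounding are a matter of taste:

```rust
// Human-readable byte formatting per D-02: "340B", "1.2MB", etc.
fn human_bytes(n: u64) -> String {
    const UNITS: [&str; 5] = ["B", "KB", "MB", "GB", "TB"];
    let mut value = n as f64;
    let mut unit = 0;
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }
    if unit == 0 {
        format!("{n}B") // exact below 1024; no decimal-point noise
    } else {
        format!("{value:.1}{}", UNITS[unit])
    }
}

fn main() {
    println!("{}", human_bytes(340));       // 340B
    println!("{}", human_bytes(1_258_291)); // 1.2MB
}
```

A rate variant only needs to append "/s" to the same output.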

Common Pitfalls

Pitfall 1: RingBuf Entry Alignment

What goes wrong: Shared types between eBPF and userspace have mismatched memory layout, causing corrupted data.
Why it happens: eBPF enforces 8-byte max alignment. Rust may add padding differently than C.
How to avoid: All shared types in tcptop-common MUST use #[repr(C)] and have no field with alignment > 8. Use only primitive types (u8, u16, u32, u64, i32, arrays of primitives). No String, Vec, Option, or enums with non-trivial discriminants (a #[repr(C)] payload enum like TcptopEvent is fine -- repr(C) gives it a defined tag-plus-union layout).
Warning signs: Garbled IP addresses, impossible PID values, random byte counts.
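One cheap guard is a compile-time layout assertion on the shared types. A sketch using a copy of DataEvent from the common crate above (the expected 40-byte size is specific to this exact field set):

```rust
use std::mem::{align_of, size_of};

// Cut-down copy of the shared DataEvent for illustration.
#[repr(C)]
#[derive(Clone, Copy)]
struct DataEvent {
    pid: u32,
    comm: [u8; 16],
    saddr: u32,
    daddr: u32,
    sport: u16,
    dport: u16,
    bytes: u32,
    srtt_us: u32,
}

// Fails the *build* (not a runtime test) if the layout drifts past
// what eBPF tolerates: alignment must stay <= 8 bytes.
const _: () = assert!(align_of::<DataEvent>() <= 8);
const _: () = assert!(size_of::<DataEvent>() == 40);

fn main() {
    println!("align = {}, size = {}", align_of::<DataEvent>(), size_of::<DataEvent>());
}
```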

Pitfall 2: eBPF Verifier Rejection

What goes wrong: Kernel verifier rejects the eBPF program at load time with cryptic errors.
Why it happens: Unbounded loops, stack size exceeded (512 bytes max), uninitialized memory reads, pointer arithmetic the verifier can't prove safe.
How to avoid: Keep eBPF functions small and linear. No loops (or use bounded for i in 0..MAX). Minimize stack usage -- read kernel structs field-by-field, don't copy whole structs. Use bpf_probe_read_kernel for every kernel pointer dereference.
Warning signs: "R1 invalid mem access", "back-edge in control flow", "unreachable insn".

Pitfall 3: Missing Process Context in Tracepoint

What goes wrong: sock:inet_sock_set_state fires in interrupt/softirq context where bpf_get_current_pid_tgid() returns 0 (kernel context, not the process).
Why it happens: TCP state changes can be triggered by incoming packets processed in softirq, not in the context of the owning process.
How to avoid: Capture PID in the kprobes (tcp_sendmsg/tcp_recvmsg), which DO run in process context, and store it in a HashMap keyed by sock pointer. When inet_sock_set_state fires, look up the sock pointer in the map to get the PID. If not found, PID is 0/unknown -- enrich from userspace via procfs.
Warning signs: Many connections showing PID 0 or "unknown" process name.
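The map logic itself is small. A userspace analog of the eBPF HashMap -- a std HashMap stands in, and the pointer/PID values are made up for illustration:

```rust
use std::collections::HashMap;

// Userspace model of the eBPF-side map: sock pointer value -> owning PID.
struct SockPidMap(HashMap<u64, u32>);

impl SockPidMap {
    // kprobe path (tcp_sendmsg/tcp_recvmsg, process context): record the owner.
    fn record(&mut self, sock_ptr: u64, pid: u32) {
        self.0.insert(sock_ptr, pid);
    }

    // tracepoint path (inet_sock_set_state, possibly softirq context):
    // 0 means "unknown -- enrich later from /proc".
    fn lookup(&self, sock_ptr: u64) -> u32 {
        self.0.get(&sock_ptr).copied().unwrap_or(0)
    }
}

fn main() {
    let mut map = SockPidMap(HashMap::new());
    map.record(0xffff_8881_0000_1000, 1234);
    assert_eq!(map.lookup(0xffff_8881_0000_1000), 1234);
    assert_eq!(map.lookup(0xdead_beef), 0); // socket never seen in process context
    println!("ok");
}
```

The real eBPF map also needs entries deleted when the socket reaches TCP_CLOSE, or it leaks slots.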

Pitfall 4: IPv6 Support Forgotten

What goes wrong: Tool only handles IPv4, crashes or silently drops IPv6 connections.
Why it happens: Easy to prototype with u32 for IP addresses and forget about IPv6.
How to avoid: Use [u8; 16] for all IP address fields from the start. In the common crate, include an af_family: u16 field. Format IPv4-mapped-IPv6 addresses correctly. The sock.__sk_common.skc_family field tells you AF_INET vs AF_INET6.
Warning signs: Missing connections on dual-stack hosts, especially localhost (::1).

Pitfall 5: srtt_us Field Interpretation

What goes wrong: RTT values are 8x too high.
Why it happens: The kernel stores srtt_us as "smoothed RTT << 3" (shifted left by 3). Must shift right by 3 to get actual microseconds.
How to avoid: Always srtt_us >> 3 when reading from struct tcp_sock. Document this in the code comment referencing include/net/tcp.h.
Warning signs: RTT values of 800us for localhost connections (should be ~100us).
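A sketch of the conversion, isolated so the shift is documented exactly once (the raw values below are hypothetical):

```rust
// The kernel stores smoothed RTT in struct tcp_sock as (microseconds << 3),
// i.e. 8 * srtt. Shift right by 3 to recover microseconds.
fn srtt_raw_to_us(raw: u32) -> u32 {
    raw >> 3
}

fn main() {
    // A hypothetical raw reading of 226_400 is 28_300 us = 28.3 ms.
    println!("{} us", srtt_raw_to_us(226_400));
    // The pitfall's localhost example: a raw 800 is really 100 us.
    println!("{} us", srtt_raw_to_us(800));
}
```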

Pitfall 6: Pre-existing Connection Byte Counts

What goes wrong: Connections that existed before tcptop started show misleadingly low byte counts.
Why it happens: eBPF hooks only capture bytes from the moment they're attached. Historical byte counts are not available.
How to avoid: Per D-15, mark pre-existing connections as partial. The bootstrap from /proc/net/tcp provides the connection tuple and state but NOT byte counts. Start byte counters at 0 and indicate to the user these are "since monitoring started."
Warning signs: Long-lived connections (like SSH) showing 0 bytes initially.
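The bootstrap parsing itself should go through procfs per Don't Hand-Roll, but for reference, here is a sketch of the address decoding it performs: /proc/net/tcp prints each IPv4 address as 8 hex digits in host byte order, so on a little-endian host (assumed here) the bytes must be swapped to get network order.

```rust
use std::net::Ipv4Addr;

// Decode one "ADDR:PORT" field from /proc/net/tcp, e.g. "0100007F:0016".
// "0100007F" is the u32 for 127.0.0.1 printed in little-endian host order;
// swap_bytes restores network (big-endian) order for Ipv4Addr.
fn parse_proc_addr(field: &str) -> Option<(Ipv4Addr, u16)> {
    let (addr_hex, port_hex) = field.split_once(':')?;
    let raw = u32::from_str_radix(addr_hex, 16).ok()?;
    let port = u16::from_str_radix(port_hex, 16).ok()?;
    Some((Ipv4Addr::from(raw.swap_bytes()), port))
}

fn main() {
    let (ip, port) = parse_proc_addr("0100007F:0016").unwrap();
    println!("{ip}:{port}"); // 127.0.0.1:22
}
```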

Pitfall 7: RingBuf Overflow Under Load

What goes wrong: High-traffic hosts generate more events than userspace can consume; events are silently dropped.
Why it happens: RingBuf has a fixed size. Unlike PerfEventArray, dropped events are reported to the eBPF side (reserve returns None), not to userspace.
How to avoid: Size the ring buffer appropriately (start with 256KB, tune up if needed). In eBPF code, handle reserve returning None gracefully -- increment a counter in a separate eBPF array map so userspace can detect and report drops. Consider rate-limiting per-connection updates in the eBPF program.
Warning signs: Missing events, gaps in connection tracking, drop counter incrementing.

Code Examples

Privilege Check (D-09, D-10, D-11)

// privilege.rs
use nix::unistd::geteuid;
use std::process;

pub fn check_privileges() {
    if geteuid().is_root() {
        return;
    }

    // Check for CAP_BPF and CAP_PERFMON via /proc/self/status
    if has_required_capabilities() {
        return;
    }

    eprintln!("error: tcptop requires root privileges. Run with sudo.");
    process::exit(77);
}

fn has_required_capabilities() -> bool {
    // Parse /proc/self/status for CapEff line
    // Check bits for CAP_BPF (39) and CAP_PERFMON (38)
    let status = std::fs::read_to_string("/proc/self/status").ok();
    if let Some(status) = status {
        for line in status.lines() {
            if line.starts_with("CapEff:") {
                let hex = line.trim_start_matches("CapEff:").trim();
                if let Ok(caps) = u64::from_str_radix(hex, 16) {
                    let cap_perfmon = 1u64 << 38;
                    let cap_bpf = 1u64 << 39;
                    return (caps & cap_perfmon != 0) && (caps & cap_bpf != 0);
                }
            }
        }
    }
    false
}

Streaming Output Format (D-01, D-02, D-03, D-14)

// output.rs -- example line format
// PROTO  LOCAL                REMOTE               PID    PROCESS     STATE        BYTES_IN    BYTES_OUT   PKTS_IN  PKTS_OUT   RTT       RATE_IN     RATE_OUT
// TCP    192.168.1.10:54321   93.184.216.34:443    1234   curl        ESTABLISHED  1258291 (1.2M)  340 (340B)    892     12     28.3ms   420.1 KB/s  113B/s
// UDP    0.0.0.0:53           8.8.8.8:53           567    systemd-r   UDP          4096 (4.0K)     128 (128B)    32      1      -         1.3 KB/s   42B/s
// [CLOSED] TCP 192.168.1.10:54321 -> 93.184.216.34:443 (curl, PID 1234)

rust-toolchain.toml

[toolchain]
channel = "nightly-2026-01-15"  # Pin a specific nightly
components = ["rust-src", "rustfmt", "clippy"]

# Note: bpfel-unknown-none is NOT listed under targets -- rustup cannot
# install it. The eBPF crate is built from source via -Z build-std (hence
# rust-src above); aya-build supplies the necessary flags.

Workspace Cargo.toml

[workspace]
members = ["tcptop", "tcptop-common", "tcptop-ebpf"]
resolver = "2"

[workspace.dependencies]
aya = { version = "0.13.1", features = ["async_tokio"] }
aya-log = "0.2"
tokio = { version = "1", features = ["full"] }
anyhow = "1"
thiserror = "2"
clap = { version = "4.6", features = ["derive"] }
log = "0.4"
env_logger = "0.11"

build.rs (Userspace Crate)

// tcptop/build.rs
//
// aya-build's API surface is sparsely documented (see Open Questions); this
// follows the aya-template pattern: locate the eBPF package via cargo_metadata
// and hand it to aya_build::build_ebpf. Verify against a freshly generated
// aya-template project, as the signature has shifted between 0.1.x releases.
use anyhow::{anyhow, Context as _};
use aya_build::cargo_metadata;

fn main() -> anyhow::Result<()> {
    let cargo_metadata::Metadata { packages, .. } = cargo_metadata::MetadataCommand::new()
        .no_deps()
        .exec()
        .context("MetadataCommand::exec")?;
    let ebpf_package = packages
        .into_iter()
        .find(|cargo_metadata::Package { name, .. }| name.as_str() == "tcptop-ebpf")
        .ok_or_else(|| anyhow!("tcptop-ebpf package not found"))?;
    // Compiles the eBPF bytecode; userspace then embeds it, e.g. with
    // aya::include_bytes_aligned!(concat!(env!("OUT_DIR"), "/tcptop-ebpf")).
    aya_build::build_ebpf([ebpf_package])
}

eBPF Hook Strategy (Claude's Discretion Recommendation)

| Hook | Type | Captures | Why This Over Alternatives |
|---|---|---|---|
| tcp_sendmsg | kprobe | Bytes/packets out, PID, comm, connection tuple, srtt_us | Process context available; msg size in arg2 |
| tcp_recvmsg | kprobe | Bytes/packets in, PID, comm, connection tuple, srtt_us | Process context available; return value has bytes read |
| udp_sendmsg | kprobe | UDP bytes/packets out, PID, comm, 4-tuple | Only way to track UDP sends with PID attribution |
| udp_recvmsg | kprobe | UDP bytes/packets in, PID, comm, 4-tuple | Only way to track UDP receives with PID attribution |
| sock:inet_sock_set_state | tracepoint | TCP state transitions (old_state, new_state, tuple) | Stable kernel API; fires on every TCP state change |

Why Kprobes Over Tracepoints for Data Hooks

A tcp:tcp_probe tracepoint exists but doesn't carry per-call byte counts in a usable form. The tcp_sendmsg / tcp_recvmsg kernel functions take the byte size as a parameter, making kprobes the natural choice. The tradeoff is that kprobes are less stable across kernel versions, but the tcp_sendmsg signature has been stable for many kernel releases.

RTT Strategy

Read srtt_us from struct tcp_sock during tcp_sendmsg/tcp_recvmsg kprobes. The sock pointer (arg0) can be cast to tcp_sock* to access the smoothed RTT field. Shift right by 3 to get microseconds. This piggybacks on existing hooks -- no additional hook needed.

Ring Buffer vs PerfEventArray

Use RingBuf. Rationale:

  • Single shared buffer across all CPUs (simpler code)
  • Strong event ordering (important for state change events being processed after corresponding data events)
  • Precise wakeup notifications (no sampling/watermark tuning)
  • Dropped events visible to eBPF side (can count and report)
  • Kernel 5.8 is already the minimum requirement

State of the Art

| Old Approach | Current Approach | When Changed | Impact |
|---|---|---|---|
| xtask build pattern | aya-build in build.rs | aya 0.13+ (2025) | Simpler builds, standard cargo build works |
| PerfEventArray for all events | RingBuf (kernel 5.8+) | Linux 5.8 (2020) | Better ordering, shared buffer, less userspace code |
| BCC Python + C | Aya pure Rust | 2022+ | Single language, better safety, no runtime dependency on BCC/LLVM |
| Manual eBPF bytecode loading | Aya auto-BTF relocation | aya 0.12+ | Portable across kernel versions without recompilation |

Deprecated/outdated:

  • RedBPF: Abandoned, superseded by Aya
  • tui-rs: Deprecated in favor of ratatui (relevant for Phase 2)
  • PerfEventArray for ordered event streams: RingBuf is strictly better when kernel >= 5.8

Open Questions

  1. tcp_recvmsg Return Value for Byte Count

    • What we know: tcp_sendmsg has byte count as arg2. tcp_recvmsg may need a kretprobe to capture actual bytes received from the return value.
    • What's unclear: Whether kretprobe on tcp_recvmsg is reliable for byte counts or if we need to read from the msghdr.
    • Recommendation: Start with kretprobe on tcp_recvmsg to capture return value. If unreliable, fall back to reading msg_iter length from the msghdr argument.
  2. IPv6 Struct Layout in eBPF

    • What we know: IPv4 addresses are in __sk_common.skc_rcv_saddr and __sk_common.skc_daddr. IPv6 uses skc_v6_rcv_saddr and skc_v6_daddr.
    • What's unclear: Whether aya-tool generates correct bindings for these fields across kernel versions, or if manual offset calculation is needed.
    • Recommendation: Start with IPv4 only for initial proof-of-life, add IPv6 in a follow-up task within Phase 1. Use aya-tool generate to get correct struct offsets.
  3. Exact aya-build Usage

    • What we know: aya-build 0.1.3 exists and replaces xtask. It's used in build.rs.
    • What's unclear: Exact API surface -- documentation is sparse.
    • Recommendation: Generate a project from aya-template first, then adapt the generated build.rs. The template will show current best practices.


Metadata

Confidence breakdown:

  • Standard stack: HIGH - All versions verified on crates.io; Aya is the only viable Rust eBPF library
  • Architecture: HIGH - Follows established Aya patterns (template structure, RingBuf, kprobe+tracepoint combo); validated against multiple sources
  • Hook strategy: MEDIUM-HIGH - Kprobe targets well-established; srtt_us reading is proven technique; tcp_recvmsg byte capture needs runtime validation
  • Pitfalls: HIGH - Based on documented kernel behaviors and community-known issues

Research date: 2026-03-21
Valid until: 2026-04-21 (30 days -- Aya ecosystem is stable at 0.13.x)