Phase 1: Data Pipeline - Research
Researched: 2026-03-21
Domain: Linux eBPF kernel tracing, TCP/UDP connection monitoring, Aya Rust eBPF framework
Confidence: HIGH
Summary
Phase 1 implements the core data pipeline: eBPF programs in the Linux kernel capture TCP and UDP events (connections, bytes, packets, state changes, RTT), push them through a ring buffer to userspace, where a Rust application aggregates per-connection statistics and streams them to stdout. This is a greenfield Aya eBPF project following the standard workspace template (ebpf crate, common crate, userspace crate).
The eBPF hook strategy combines kprobes for byte/packet counting (tcp_sendmsg, tcp_recvmsg, udp_sendmsg, udp_recvmsg) with the sock:inet_sock_set_state tracepoint for TCP state transitions and reading srtt_us from the sock struct for RTT. Pre-existing connections are bootstrapped from /proc/net/tcp and /proc/net/udp on startup. All kernel-to-userspace communication uses a single RingBuf map (requires kernel 5.8+).
The platform abstraction trait (NetworkCollector) is defined in this phase to enable the macOS backend in Phase 4, but only the Linux/eBPF implementation is built. The proof-of-life output is streaming lines to stdout -- no TUI, no cursor manipulation.
Primary recommendation: Use the Aya template workspace structure with aya-build in build.rs (not xtask). Attach to 4 kprobes + 1 tracepoint, use a single RingBuf for all events, and keep the common crate's event enum as the single shared type between kernel and userspace.
<user_constraints>
User Constraints (from CONTEXT.md)
Locked Decisions
- D-01: Streaming lines to stdout -- each connection update prints a new line (like `tail -f`). No screen clearing or cursor manipulation.
- D-02: Human-readable sizes by default (`1.2 MB`, `340 KB/s`). Show raw value alongside when it fits cleanly without noise (e.g., `1258291 (1.2M)`). If too noisy, drop to human-readable only.
- D-03: Full detail per connection -- all fields: local/remote addr+port, PID, process name, TCP state, bytes in/out, packets in/out, RTT, bandwidth rate.
- D-04: Output is assumed throwaway scaffolding. Build it simple; decide whether to keep as `--batch` mode when TUI lands in Phase 2.
- D-05: Full 4-tuple grouping -- (src IP, src port, dst IP, dst port) = one flow for UDP.
- D-06: 5-second idle timeout -- UDP flows disappear 5s after last packet. Tunable via flag deferred.
- D-07: No synthesized state for UDP -- show `-` or `UDP` in the state column.
- D-08: Flat bidirectional flow tracking -- count bytes/packets in each direction, no request/response inference.
- D-09: Minimal error message on missing privileges: `error: tcptop requires root privileges. Run with sudo.`
- D-10: If Linux capabilities (`CAP_BPF`, `CAP_PERFMON`) are present, proceed without root. Don't suggest capabilities in error message.
- D-11: Exit code 77 on privilege failure.
- D-12: Closed TCP connections linger for one display/refresh cycle, then are removed.
- D-13: Visual distinction for new/closing connections -- exact styling deferred to Phase 2.
- D-14: Connection close events print `[CLOSED] 192.168.1.1:443 -> ...` in streaming output.
- D-15: Pre-existing connections sourced from `/proc/net/tcp` on startup, marked as partial.
Claude's Discretion
- eBPF hook point selection (kprobes vs tracepoints)
- Platform abstraction trait design and boundaries
- Ring buffer vs perf event array for kernel-to-userspace transport
- RTT estimation implementation approach
- Exact format/layout of streaming output lines
- `/proc/net/tcp` parsing strategy for pre-existing connections
- Internal data structures and concurrency model
Deferred Ideas (OUT OF SCOPE)
- UDP flow idle timeout user-configurable via CLI flag -- Phase 2 or 3
- Protocol hint column for well-known UDP ports -- future enhancement
- Toggle between human-readable and raw byte display via keypress -- Phase 2 TUI
- `--batch` or `--once` mode -- evaluate after Phase 2
</user_constraints>
<phase_requirements>
Phase Requirements
| ID | Description | Research Support |
|---|---|---|
| DATA-01 | Per-connection byte counts (sent/received) in real time | kprobes on tcp_sendmsg/tcp_recvmsg/udp_sendmsg/udp_recvmsg capture byte counts from msg length args |
| DATA-02 | Per-connection packet counts (sent/received) in real time | Same kprobes increment packet counters per event |
| DATA-03 | TCP connection state (ESTABLISHED, LISTEN, TIME_WAIT, etc.) | sock:inet_sock_set_state tracepoint fires on every state transition with old+new state |
| DATA-04 | Correlate connection to owning process (PID, process name) | bpf_get_current_pid_tgid() in kprobes captures PID; process name via bpf_get_current_comm() or userspace sysinfo/procfs |
| DATA-05 | Per-connection TCP RTT estimate | Read srtt_us from struct tcp_sock in kprobe context (shifted right by 3 for microseconds) |
| DATA-06 | Bandwidth rates (KB/s or MB/s) per connection | Userspace calculation: delta bytes / delta time between refresh cycles |
| DATA-07 | Track TCP and UDP (UDP flows synthesized from 4-tuple with idle timeout) | UDP kprobes + userspace HashMap keyed by 4-tuple with 5s idle expiry |
| PLAT-01 | Works on Linux kernel 5.8+ using eBPF | RingBuf requires 5.8+; all hook points available in 5.8+; Aya handles BTF portability |
| PLAT-03 | Platform abstraction allows different backends | NetworkCollector trait with async Stream of connection events; Linux impl in this phase, macOS impl in Phase 4 |
| OPS-01 | Detect missing root/elevated privileges with clear error | Check geteuid() == 0 or CAP_BPF+CAP_PERFMON capabilities before eBPF load |
| OPS-02 | Low overhead on host (no heavy polling, eBPF ring buffer based) | RingBuf is event-driven (no polling); kprobes fire only on actual network calls |
</phase_requirements>
Standard Stack
Core (Phase 1 Dependencies)
| Library | Version | Purpose | Verified |
|---|---|---|---|
| aya | 0.13.1 | eBPF loader, map access, program attachment | crates.io 2025-11-17 |
| aya-ebpf | 0.1.1 | Kernel-side eBPF macros (#[kprobe], #[tracepoint]) | crates.io 2025-11-17 |
| aya-log | 0.2.1 | Userspace eBPF log receiver | crates.io 2024-10-09 |
| aya-log-ebpf | 0.1.0 | Kernel-side eBPF logging (info!, debug!) | crates.io 2024-10-09 |
| aya-build | 0.1.3 | Build script for eBPF compilation (replaces xtask) | crates.io 2025-11-17 |
| tokio | 1.50.0 | Async runtime (eBPF event loop, timers) | crates.io 2026-03-03 |
| clap | 4.6.0 | CLI argument parsing | crates.io 2026-03-12 |
| anyhow | 1.0.102 | Application error handling | crates.io 2026-02-20 |
| thiserror | 2.0.18 | Typed errors for library/trait code | crates.io 2026-01-18 |
| nix | 0.31.2 | Unix syscalls (privilege check, geteuid) | crates.io 2026-02-28 |
| procfs | 0.18.0 | /proc/net/tcp and /proc/net/udp parsing | crates.io 2025-08-30 |
| serde | 1.0.228 | Serialization for shared types | crates.io 2025-09-27 |
| log + env_logger | 0.4.x / 0.11.x | Internal diagnostics logging | Stable, widely used |
| signal-hook | 0.4.3 | Graceful SIGINT/SIGTERM shutdown | crates.io 2026-01-24 |
eBPF Crate Dependencies (tcptop-ebpf)
| Library | Version | Purpose |
|---|---|---|
| aya-ebpf | 0.1.1 | eBPF program macros and helpers |
| aya-log-ebpf | 0.1.0 | Kernel-side logging |
Build Tools
| Tool | Purpose | Notes |
|---|---|---|
| Rust nightly | eBPF compilation target | Pin in rust-toolchain.toml with bpfel-unknown-none target |
| bpf-linker | Links eBPF object files | cargo install bpf-linker (on macOS: --no-default-features + LLVM 21+) |
| aya-build | Build.rs integration | Compiles eBPF crate during cargo build -- no xtask needed |
Not Needed in Phase 1
| Library | Why Deferred |
|---|---|
| ratatui / crossterm | Phase 2 (TUI) |
| csv | Phase 3 (CSV logging) |
| pcap | Phase 4 (macOS backend) |
| sysinfo | Not needed if bpf_get_current_comm() suffices for process names in kernel; procfs covers PID-to-name fallback |
Architecture Patterns
Recommended Project Structure
tcptop/
├── Cargo.toml # Workspace root
├── rust-toolchain.toml # Pin nightly + bpfel-unknown-none target
├── .cargo/config.toml # Build flags for eBPF target
├── tcptop/ # Userspace binary crate
│ ├── Cargo.toml
│ ├── build.rs # Uses aya-build to compile eBPF crate
│ └── src/
│ ├── main.rs # Entry point: privilege check, CLI, event loop
│ ├── collector/
│ │ ├── mod.rs # NetworkCollector trait definition
│ │ └── linux.rs # Linux/eBPF implementation
│ ├── model.rs # ConnectionRecord, ConnectionKey, ConnectionStats
│ ├── aggregator.rs # Connection state aggregation, bandwidth calc, UDP timeout
│ ├── output.rs # Streaming stdout formatter
│ ├── privilege.rs # Root/capability check logic
│ └── proc_bootstrap.rs # /proc/net/tcp+udp parser for pre-existing connections
├── tcptop-ebpf/ # eBPF kernel programs (no_std)
│ ├── Cargo.toml
│ └── src/
│ └── main.rs # All eBPF programs: kprobes + tracepoint
└── tcptop-common/ # Shared types (no_std compatible)
├── Cargo.toml
└── src/
└── lib.rs # Event enum, ConnectionKey, #[repr(C)] structs
Pattern 1: Event-Driven eBPF Pipeline
What: Kernel eBPF programs emit events into a RingBuf; userspace consumes them asynchronously via tokio and updates an in-memory connection table.
When to use: Always -- this is the core architecture.
Flow:
Kernel: kprobe/tracepoint fires
-> eBPF program extracts fields from sock/skb
-> Writes TcptopEvent to RingBuf
-> Userspace tokio task polls RingBuf via AsyncFd
-> Deserializes event, updates ConnectionTable HashMap
-> Periodic tick formats and prints to stdout
eBPF side (kernel):
// tcptop-common/src/lib.rs
#![no_std]

#[repr(C)]
#[derive(Clone, Copy)]
pub enum TcptopEvent {
    TcpSend(DataEvent),
    TcpRecv(DataEvent),
    UdpSend(DataEvent),
    UdpRecv(DataEvent),
    TcpStateChange(StateEvent),
}

#[repr(C)]
#[derive(Clone, Copy)]
pub struct DataEvent {
    pub pid: u32,
    pub comm: [u8; 16],
    pub saddr: u32, // IPv4 for now; extend to support IPv6
    pub daddr: u32,
    pub sport: u16,
    pub dport: u16,
    pub bytes: u32,
    pub srtt_us: u32, // only meaningful for TCP
}

#[repr(C)]
#[derive(Clone, Copy)]
pub struct StateEvent {
    pub pid: u32,
    pub saddr: u32,
    pub daddr: u32,
    pub sport: u16,
    pub dport: u16,
    pub old_state: u32,
    pub new_state: u32,
}
eBPF program (kernel):
// tcptop-ebpf/src/main.rs
#![no_std]
#![no_main]
use aya_ebpf::{
    macros::{kprobe, map, tracepoint},
    maps::RingBuf,
    programs::{ProbeContext, TracePointContext},
};
use tcptop_common::{DataEvent, TcptopEvent};

#[map]
static EVENTS: RingBuf = RingBuf::with_byte_size(256 * 1024, 0); // 256 KiB

#[kprobe]
pub fn tcp_sendmsg(ctx: ProbeContext) -> u32 {
    match try_tcp_sendmsg(ctx) {
        Ok(ret) => ret,
        Err(_) => 0,
    }
}

fn try_tcp_sendmsg(ctx: ProbeContext) -> Result<u32, i64> {
    // arg0: struct sock*, arg1: struct msghdr*, arg2: size_t (bytes)
    let sock: *const core::ffi::c_void = ctx.arg(0).ok_or(1)?;
    let size: usize = ctx.arg(2).ok_or(1)?;
    // Extract connection tuple from sock->__sk_common,
    // then write TcptopEvent::TcpSend to the EVENTS ring buffer.
    if let Some(mut entry) = EVENTS.reserve::<TcptopEvent>(0) {
        // populate entry...
        entry.submit(0);
    }
    Ok(0)
}
Userspace consumer:
// Userspace: read from RingBuf with AsyncFd for async notification.
// Must run inside an async (tokio) context.
use std::os::fd::AsRawFd;
use aya::maps::RingBuf;
use tokio::io::unix::AsyncFd;

let mut ring_buf = RingBuf::try_from(bpf.map_mut("EVENTS").unwrap())?;
let async_fd = AsyncFd::new(ring_buf.as_raw_fd())?;
loop {
    let mut guard = async_fd.readable().await?;
    while let Some(item) = ring_buf.next() {
        // SAFETY: the eBPF side only submits #[repr(C)] TcptopEvent values
        let event: &TcptopEvent = unsafe { &*(item.as_ptr() as *const TcptopEvent) };
        connection_table.update(event);
    }
    guard.clear_ready();
}
Pattern 2: Platform Abstraction Trait
What: A trait that abstracts the data collection backend, enabling Linux (eBPF) and macOS (pcap) implementations behind a common interface.
Design:
// collector/mod.rs
use anyhow::Result;
use tokio::sync::mpsc;
use tcptop_common::{DataEvent, StateEvent};
use crate::model::ConnectionRecord;

pub enum CollectorEvent {
    Data(DataEvent),
    StateChange(StateEvent),
}

// async_trait keeps the trait object-safe; native async-in-trait
// (Rust 1.75+) also works if trait objects aren't needed.
#[async_trait::async_trait]
pub trait NetworkCollector: Send {
/// Start collecting. Sends events to the provided channel.
async fn start(&mut self, tx: mpsc::Sender<CollectorEvent>) -> Result<()>;
/// Stop collecting and clean up kernel resources.
async fn stop(&mut self) -> Result<()>;
/// Bootstrap pre-existing connections (e.g., from /proc/net/tcp).
fn bootstrap_existing(&self) -> Result<Vec<ConnectionRecord>>;
}
Why mpsc channel: Decouples the collector from the aggregator. The eBPF ring buffer reader pushes events into the channel; the aggregator consumes them on its own schedule. This also naturally supports the macOS backend in Phase 4 pushing pcap-derived events into the same channel.
Pattern 3: Connection Table with Bandwidth Calculation
What: In-memory HashMap keyed by connection tuple, with periodic tick for rate calculation and UDP idle timeout.
pub struct ConnectionTable {
    connections: HashMap<ConnectionKey, ConnectionRecord>,
    last_tick: Instant,
}

impl ConnectionTable {
    pub fn tick(&mut self) {
        let now = Instant::now();
        let dt = now.duration_since(self.last_tick);
        for record in self.connections.values_mut() {
            record.rate_in = (record.bytes_in - record.prev_bytes_in) as f64 / dt.as_secs_f64();
            record.rate_out = (record.bytes_out - record.prev_bytes_out) as f64 / dt.as_secs_f64();
            record.prev_bytes_in = record.bytes_in;
            record.prev_bytes_out = record.bytes_out;
        }
        // Expire UDP flows idle > 5 seconds (D-06)
        self.connections.retain(|k, r| {
            if k.protocol == Protocol::Udp {
                r.last_seen.elapsed() < Duration::from_secs(5)
            } else {
                true // TCP lifecycle managed by state events
            }
        });
        self.last_tick = now;
    }
}
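The `tick` sketch references `ConnectionKey` and `ConnectionRecord` fields (`k.protocol`, `r.last_seen`) that are not defined anywhere in this document. A minimal std-only sketch of those model types follows; the field names are illustrative, not prescriptive.

```rust
use std::time::Instant;

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
pub enum Protocol {
    Tcp,
    Udp,
}

// 4-tuple + protocol identifies one flow (D-05).
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
pub struct ConnectionKey {
    pub protocol: Protocol,
    pub saddr: [u8; 16], // IPv4-mapped or native IPv6 (see Pitfall 4)
    pub daddr: [u8; 16],
    pub sport: u16,
    pub dport: u16,
}

pub struct ConnectionRecord {
    pub bytes_in: u64,
    pub bytes_out: u64,
    pub prev_bytes_in: u64,
    pub prev_bytes_out: u64,
    pub rate_in: f64,  // bytes/sec, recomputed each tick
    pub rate_out: f64,
    pub last_seen: Instant, // drives the 5s UDP idle expiry (D-06)
    pub partial: bool,      // true for /proc-bootstrapped connections (D-15)
}
```

Deriving `Hash` and `Eq` on the key (and keeping it `Copy`) is what lets the table be a plain `HashMap<ConnectionKey, ConnectionRecord>`.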
Anti-Patterns to Avoid
- Polling /proc/net/tcp in a loop: Misses short-lived connections, high overhead, racy PID mapping. Use eBPF for event-driven capture; `/proc` only at startup for bootstrapping.
- PerfEventArray instead of RingBuf for this use case: PerfEventArray is per-CPU with no ordering guarantees and requires per-CPU buffer management. RingBuf provides global ordering (important for state change events), a shared buffer (simpler), and notification via fd. Since kernel 5.8 is the minimum anyway, always prefer RingBuf.
- Separate RingBuf per program type: Unnecessary complexity. A single RingBuf with a tagged enum (`TcptopEvent`) handles all event types cleanly.
- Blocking the eBPF event loop with stdout writes: The RingBuf consumer and the stdout printer must be decoupled. If stdout blocks (pipe full), it must not cause RingBuf overflow. Use a channel between consumer and printer.
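The last anti-pattern can be made concrete with a small std-only sketch (the real pipeline would use tokio's `mpsc` with `try_send`, which has the same non-blocking semantics): a bounded queue sits between the RingBuf consumer and the printer, and a full queue drops the line and bumps a counter instead of blocking.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

// Bounded queue between the RingBuf consumer and the stdout printer.
fn printer_channel(capacity: usize) -> (SyncSender<String>, Receiver<String>) {
    sync_channel(capacity)
}

// Consumer side: never blocks. Returns true if the line was queued,
// false if the printer queue was full (or gone) and the line was dropped.
fn enqueue_line(tx: &SyncSender<String>, line: String, dropped: &mut u64) -> bool {
    match tx.try_send(line) {
        Ok(()) => true,
        Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => {
            *dropped += 1;
            false
        }
    }
}
```

With the capacity sized for roughly one display cycle, a stalled stdout (e.g. a full pipe) costs only dropped output lines, never RingBuf overflow.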
Don't Hand-Roll
| Problem | Don't Build | Use Instead | Why |
|---|---|---|---|
| /proc/net/tcp parsing | Custom line parser | procfs crate (0.18.0) | Handles IPv4/IPv6, hex parsing, socket states, inode mapping. Battle-tested. |
| CLI argument parsing | Manual std::env::args | clap 4.6.0 derive macros | Handles --help, validation, error messages automatically |
| eBPF program compilation | Custom build scripts | aya-build 0.1.3 in build.rs | Handles target selection, cross-compilation, artifact embedding |
| Async eBPF event reading | Raw epoll/poll loop | tokio::io::unix::AsyncFd wrapping RingBuf fd | Integrates with tokio runtime, handles wakeups correctly |
| Privilege checking | Raw syscall | nix::unistd::geteuid() + caps crate or manual /proc/self/status CapEff parse | Correct cross-distro behavior |
| Human-readable byte formatting | Format function | Simple utility function is fine here (too small for a crate) | Only needs KB/MB/GB and rate formatting; 20 lines of code |
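The "20 lines of code" claim can be sketched directly. This is one plausible shape for D-02's `1258291 (1.2M)` style; the exact thresholds and rounding are discretionary, not locked decisions.

```rust
// Human-readable size with a short suffix (D-02).
fn human_bytes(n: u64) -> String {
    const UNITS: [&str; 5] = ["B", "K", "M", "G", "T"];
    let mut value = n as f64;
    let mut unit = 0;
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }
    if unit == 0 {
        format!("{n}B")
    } else {
        format!("{value:.1}{}", UNITS[unit])
    }
}

// `1258291 (1.2M)` style: raw value first, human-readable alongside (D-02).
fn raw_and_human(n: u64) -> String {
    format!("{n} ({})", human_bytes(n))
}
```

Keeping the raw count first leaves columns greppable; D-02 allows dropping to human-readable only if this proves too noisy.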
Key insight: The eBPF build pipeline is the most complex "don't hand-roll" item. Aya's template structure and aya-build handle the nightmare of cross-compiling no_std Rust to BPF bytecode, linking, and embedding the result in the userspace binary.
Common Pitfalls
Pitfall 1: RingBuf Entry Alignment
What goes wrong: Shared types between eBPF and userspace have mismatched memory layout, causing corrupted data.
Why it happens: eBPF enforces 8-byte max alignment. Rust may add padding differently than C.
How to avoid: All shared types in tcptop-common MUST use #[repr(C)] and have no field with alignment > 8. Use only primitive types (u8, u16, u32, u64, i32, arrays of primitives). No String, Vec, Option, or enums with non-trivial discriminants.
Warning signs: Garbled IP addresses, impossible PID values, random byte counts.
Pitfall 2: eBPF Verifier Rejection
What goes wrong: Kernel verifier rejects the eBPF program at load time with cryptic errors.
Why it happens: Unbounded loops, stack size exceeded (512 bytes max), uninitialized memory reads, pointer arithmetic verifier can't prove safe.
How to avoid: Keep eBPF functions small and linear. No loops (or use bounded for i in 0..MAX). Minimize stack usage -- read kernel structs field-by-field, don't copy whole structs. Use bpf_probe_read_kernel for every kernel pointer dereference.
Warning signs: "R1 invalid mem access", "back-edge in control flow", "unreachable insn".
Pitfall 3: Missing Process Context in Tracepoint
What goes wrong: sock:inet_sock_set_state fires in interrupt/softirq context where bpf_get_current_pid_tgid() returns 0 (kernel context, not the process).
Why it happens: TCP state changes can be triggered by incoming packets processed in softirq, not in the context of the owning process.
How to avoid: Capture PID in the kprobes (tcp_sendmsg/tcp_recvmsg) which DO run in process context, store in a HashMap keyed by sock pointer. When inet_sock_set_state fires, look up the sock pointer in the map to get the PID. If not found, PID is 0/unknown -- enrich from userspace via procfs.
Warning signs: Many connections showing PID 0 or "unknown" process name.
Pitfall 4: IPv6 Support Forgotten
What goes wrong: Tool only handles IPv4, crashes or silently drops IPv6 connections.
Why it happens: Easy to prototype with u32 for IP addresses and forget about IPv6.
How to avoid: Use [u8; 16] for all IP address fields from the start. In the common crate, include an af_family: u16 field. Format IPv4-mapped-IPv6 addresses correctly. The sock.__sk_common.skc_family field tells you AF_INET vs AF_INET6.
Warning signs: Missing connections on dual-stack hosts, especially localhost (::1).
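A std-only sketch of the recommended representation -- `[u8; 16]` plus `af_family` -- and its userspace decoding. AF_INET = 2 and AF_INET6 = 10 on Linux; the function and constant names here are illustrative.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

// Linux address-family constants.
const AF_INET: u16 = 2;
const AF_INET6: u16 = 10;

// Decode the [u8; 16] address field from the shared event struct.
// For AF_INET only the first 4 bytes are meaningful.
fn decode_addr(af_family: u16, raw: [u8; 16]) -> Option<IpAddr> {
    match af_family {
        AF_INET => Some(IpAddr::V4(Ipv4Addr::new(raw[0], raw[1], raw[2], raw[3]))),
        AF_INET6 => {
            let v6 = Ipv6Addr::from(raw);
            // Render IPv4-mapped addresses (::ffff:a.b.c.d) as plain IPv4.
            match v6.to_ipv4_mapped() {
                Some(v4) => Some(IpAddr::V4(v4)),
                None => Some(IpAddr::V6(v6)),
            }
        }
        _ => None,
    }
}
```

`to_ipv4_mapped` (stable std) handles the dual-stack case where an IPv4 peer shows up on an AF_INET6 socket.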
Pitfall 5: srtt_us Field Interpretation
What goes wrong: RTT values are 8x too high.
Why it happens: The kernel stores srtt_us as "smoothed RTT << 3" (shifted left by 3). Must shift right by 3 to get actual microseconds.
How to avoid: Always srtt_us >> 3 when reading from struct tcp_sock. Document this in the code comment referencing include/net/tcp.h.
Warning signs: RTT values of 800us for localhost connections (should be ~100us).
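The conversion is tiny but worth centralizing so the shift is applied exactly once; a sketch (helper name illustrative):

```rust
// struct tcp_sock stores srtt_us as (smoothed RTT in microseconds) << 3;
// see include/net/tcp.h. Shift right by 3 before display.
fn srtt_raw_to_ms(raw_srtt_us: u32) -> f64 {
    (raw_srtt_us >> 3) as f64 / 1000.0
}
```

A raw reading of 800 for a localhost connection is really 100 us (0.1 ms) -- the warning sign above.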
Pitfall 6: Pre-existing Connection Byte Counts
What goes wrong: Connections that existed before tcptop started show misleadingly low byte counts.
Why it happens: eBPF hooks only capture bytes from the moment they're attached. Historical byte counts are not available.
How to avoid: Per D-15, mark pre-existing connections as partial. The bootstrap from /proc/net/tcp provides the connection tuple and state but NOT byte counts. Start byte counters at 0 and indicate to the user these are "since monitoring started."
Warning signs: Long-lived connections (like SSH) showing 0 bytes initially.
Pitfall 7: RingBuf Overflow Under Load
What goes wrong: High-traffic hosts generate more events than userspace can consume; events are silently dropped.
Why it happens: RingBuf has fixed size. Unlike PerfEventArray, dropped events are reported to the eBPF side (reserve returns None), not to userspace.
How to avoid: Size the ring buffer appropriately (start with 256KB, tune up if needed). In eBPF code, handle the reserve returning None gracefully -- increment a counter in a separate eBPF array map so userspace can detect and report drops. Consider rate-limiting per-connection updates in the eBPF program.
Warning signs: Missing events, gaps in connection tracking, drop counter incrementing.
Code Examples
Privilege Check (D-09, D-10, D-11)
// privilege.rs
use nix::unistd::geteuid;
use std::process;

pub fn check_privileges() {
    if geteuid().is_root() {
        return;
    }
    // Check for CAP_BPF and CAP_PERFMON via /proc/self/status (D-10)
    if has_required_capabilities() {
        return;
    }
    eprintln!("error: tcptop requires root privileges. Run with sudo.");
    process::exit(77); // D-11
}

fn has_required_capabilities() -> bool {
    // Parse /proc/self/status for the CapEff line and check the bits
    // for CAP_BPF (39) and CAP_PERFMON (38).
    let status = std::fs::read_to_string("/proc/self/status").ok();
    if let Some(status) = status {
        for line in status.lines() {
            if line.starts_with("CapEff:") {
                let hex = line.trim_start_matches("CapEff:").trim();
                if let Ok(caps) = u64::from_str_radix(hex, 16) {
                    let cap_perfmon = 1u64 << 38;
                    let cap_bpf = 1u64 << 39;
                    return (caps & cap_perfmon != 0) && (caps & cap_bpf != 0);
                }
            }
        }
    }
    false
}
Streaming Output Format (D-01, D-02, D-03, D-14)
// output.rs -- example line format
// PROTO LOCAL REMOTE PID PROCESS STATE BYTES_IN BYTES_OUT PKTS_IN PKTS_OUT RTT RATE_IN RATE_OUT
// TCP 192.168.1.10:54321 93.184.216.34:443 1234 curl ESTABLISHED 1258291 (1.2M) 340 (340B) 892 12 28.3ms 420.1 KB/s 113B/s
// UDP 0.0.0.0:53 8.8.8.8:53 567 systemd-r UDP 4096 (4.0K) 128 (128B) 32 1 - 1.3 KB/s 42B/s
// [CLOSED] TCP 192.168.1.10:54321 -> 93.184.216.34:443 (curl, PID 1234)
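The `[CLOSED]` line (D-14) is simple enough to pin down with a formatter sketch matching the sample above; the signature is illustrative.

```rust
// D-14: closed-connection line, e.g.
// [CLOSED] TCP 192.168.1.10:54321 -> 93.184.216.34:443 (curl, PID 1234)
fn closed_line(proto: &str, local: &str, remote: &str, comm: &str, pid: u32) -> String {
    format!("[CLOSED] {proto} {local} -> {remote} ({comm}, PID {pid})")
}
```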
rust-toolchain.toml
[toolchain]
channel = "nightly-2026-01-15" # Pin a specific nightly
components = ["rust-src", "rustfmt", "clippy"]
# No `targets` entry for bpfel-unknown-none: it is a Tier 3 target that rustup
# cannot install; aya-build compiles the eBPF crate from rust-src via build-std.
Workspace Cargo.toml
[workspace]
members = ["tcptop", "tcptop-common", "tcptop-ebpf"]
resolver = "2"
[workspace.dependencies]
aya = { version = "0.13.1", features = ["async_tokio"] }
aya-log = "0.2"
tokio = { version = "1", features = ["full"] }
anyhow = "1"
thiserror = "2"
clap = { version = "4.6", features = ["derive"] }
log = "0.4"
env_logger = "0.11"
build.rs (Userspace Crate)
// tcptop/build.rs -- adapted from the aya-template build.rs; verify against a
// freshly generated template (see Open Questions: aya-build's API docs are sparse).
use anyhow::{anyhow, Context as _};
use aya_build::cargo_metadata;

fn main() -> anyhow::Result<()> {
    let cargo_metadata::Metadata { packages, .. } = cargo_metadata::MetadataCommand::new()
        .no_deps()
        .exec()
        .context("MetadataCommand::exec")?;
    let ebpf_package = packages
        .into_iter()
        .find(|cargo_metadata::Package { name, .. }| name == "tcptop-ebpf")
        .ok_or_else(|| anyhow!("tcptop-ebpf package not found"))?;
    // Compiles the eBPF crate and exposes its bytecode for inclusion in the
    // userspace binary. Some aya-build versions take an extra Toolchain argument.
    aya_build::build_ebpf([ebpf_package])
}
eBPF Hook Strategy (Claude's Discretion Recommendation)
Recommended Hook Points
| Hook | Type | Captures | Why This Over Alternatives |
|---|---|---|---|
| tcp_sendmsg | kprobe | Bytes/packets out, PID, comm, connection tuple, srtt_us | Process context available; msg size in arg2 |
| tcp_recvmsg | kprobe | Bytes/packets in, PID, comm, connection tuple, srtt_us | Process context available; return value has bytes read |
| udp_sendmsg | kprobe | UDP bytes/packets out, PID, comm, 4-tuple | Only way to track UDP sends with PID attribution |
| udp_recvmsg | kprobe | UDP bytes/packets in, PID, comm, 4-tuple | Only way to track UDP receives with PID attribution |
| sock:inet_sock_set_state | tracepoint | TCP state transitions (old_state, new_state, tuple) | Stable kernel API; fires on every TCP state change |
Why Kprobes Over Tracepoints for Data Hooks
Tracepoints such as tcp:tcp_probe exist but don't carry per-call byte counts. The tcp_sendmsg / tcp_recvmsg kernel functions take the byte size as a parameter, making kprobes the natural choice. The tradeoff is that kprobes are less stable across kernel versions than tracepoints, but the tcp_sendmsg signature has been stable for many releases.
RTT Strategy
Read srtt_us from struct tcp_sock during tcp_sendmsg/tcp_recvmsg kprobes. The sock pointer (arg0) can be cast to tcp_sock* to access the smoothed RTT field. Shift right by 3 to get microseconds. This piggybacks on existing hooks -- no additional hook needed.
Ring Buffer vs PerfEventArray
Use RingBuf. Rationale:
- Single shared buffer across all CPUs (simpler code)
- Strong event ordering (important for state change events being processed after corresponding data events)
- Precise wakeup notifications (no sampling/watermark tuning)
- Dropped events visible to eBPF side (can count and report)
- Kernel 5.8 is already the minimum requirement
State of the Art
| Old Approach | Current Approach | When Changed | Impact |
|---|---|---|---|
| xtask build pattern | aya-build in build.rs | aya 0.13+ (2025) | Simpler builds, standard cargo build works |
| PerfEventArray for all events | RingBuf (kernel 5.8+) | Linux 5.8 (2020) | Better ordering, shared buffer, less userspace code |
| BCC Python + C | Aya pure Rust | 2022+ | Single language, better safety, no runtime dependency on BCC/LLVM |
| Manual eBPF bytecode loading | Aya auto-BTF relocation | aya 0.12+ | Portable across kernel versions without recompilation |
Deprecated/outdated:
- RedBPF: Abandoned, superseded by Aya
- tui-rs: Deprecated in favor of ratatui (relevant for Phase 2)
- PerfEventArray for ordered event streams: RingBuf is strictly better when kernel >= 5.8
Open Questions
- tcp_recvmsg Return Value for Byte Count
  - What we know: tcp_sendmsg has byte count as arg2. tcp_recvmsg may need a kretprobe to capture actual bytes received from the return value.
  - What's unclear: Whether kretprobe on tcp_recvmsg is reliable for byte counts or if we need to read from the msghdr.
  - Recommendation: Start with a kretprobe on tcp_recvmsg to capture the return value. If unreliable, fall back to reading msg_iter length from the msghdr argument.
- IPv6 Struct Layout in eBPF
  - What we know: IPv4 addresses are in `__sk_common.skc_rcv_saddr` and `__sk_common.skc_daddr`. IPv6 uses `skc_v6_rcv_saddr` and `skc_v6_daddr`.
  - What's unclear: Whether aya-tool generates correct bindings for these fields across kernel versions, or if manual offset calculation is needed.
  - Recommendation: Start with IPv4 only for the initial proof-of-life; add IPv6 in a follow-up task within Phase 1. Use `aya-tool generate` to get correct struct offsets.
- Exact aya-build Usage
  - What we know: aya-build 0.1.3 exists and replaces xtask. It's used in build.rs.
  - What's unclear: Exact API surface -- documentation is sparse.
  - Recommendation: Generate a project from aya-template first, then adapt the generated build.rs. The template will show current best practices.
Sources
Primary (HIGH confidence)
- Aya docs.rs - RingBuf - RingBuf API, AsyncFd integration
- Aya docs.rs - aya-ebpf RingBuf - Kernel-side RingBuf reserve/submit API
- Aya Book - Probes - Kprobe definition, attachment, context access
- Aya Template - Project structure, workspace layout
- crates.io version checks (2026-03-21) - All version numbers verified
Secondary (MEDIUM confidence)
- Brendan Gregg - TCP Tracepoints - TCP tracepoint list, inet_sock_set_state fields
- Red Hat - TCP RTT with eBPF - srtt_us field reading, fentry approach
- eunomia eBPF Tutorial 14 - TCP state tracking with inet_sock_set_state
- Yuki Nakamura - Aya Tracepoint - Tracepoint struct generation, context reading
- eBPF Docs - RingBuf Map Type - Ring buffer kernel semantics
Tertiary (LOW confidence)
- OneUptime - eBPF with Rust Aya - General patterns, build pipeline (blog, not official)
- Deepfence - Aya companion - Map sharing patterns (blog)
Metadata
Confidence breakdown:
- Standard stack: HIGH - All versions verified on crates.io; Aya is the only viable Rust eBPF library
- Architecture: HIGH - Follows established Aya patterns (template structure, RingBuf, kprobe+tracepoint combo); validated against multiple sources
- Hook strategy: MEDIUM-HIGH - Kprobe targets well-established; srtt_us reading is proven technique; tcp_recvmsg byte capture needs runtime validation
- Pitfalls: HIGH - Based on documented kernel behaviors and community-known issues
Research date: 2026-03-21 Valid until: 2026-04-21 (30 days -- Aya ecosystem is stable at 0.13.x)