# Phase 3: Output & Distribution - Research

**Researched:** 2026-03-22
**Domain:** CSV logging, documentation (man page), Linux packaging (.deb/.rpm), test coverage
**Confidence:** HIGH

## Summary

Phase 3 transforms tcptop from a working Linux tool into a distributable, documented one. The work spans four distinct domains: (1) CSV logging via the `csv` crate with serde serialization of connection data, (2) a hand-written troff man page, (3) Linux package generation via `cargo-deb` and `cargo-generate-rpm`, and (4) unit/integration test expansion for CSV output and core data processing.

The existing codebase is well-structured for this phase. `ConnectionRecord` in `model.rs` already has all fields needed for the CSV columns. Because it also carries non-serializable and internal fields, a flat `CsvRow` conversion (Pattern 1) is recommended over deriving `Serialize` on it directly. The event loop in `main.rs` has a clear branching point for headless mode. The `pipeline_test.rs` establishes test patterns that can be extended. No significant architectural changes are needed; this is primarily additive work.

**Primary recommendation:** Add `csv`, `serde` (with derive), and `chrono` dependencies. Create a `csv_writer` module that takes `&[&ConnectionRecord]` and writes rows. Branch in `main.rs` before TUI init: if `--log` is set, skip TUI setup entirely and run a simplified event loop that writes CSV on each tick.

<user_constraints>
## User Constraints (from CONTEXT.md)

### Locked Decisions
- **D-01:** `--log <path>` runs in headless mode -- no TUI, just CSV output to file. TUI and CSV are mutually exclusive modes.
- **D-02:** All ConnectionRecord fields included in CSV rows: proto, local addr, local port, remote addr, remote port, PID, process name, state, bytes in, bytes out, packets in, packets out, rate in, rate out, RTT.
- **D-03:** Each snapshot writes one row per active connection at the `--interval` cadence (default 1s). Same tick rate as TUI display.
- **D-04:** Overwrite existing file if path already exists (no append mode).
- **D-05:** Each row includes a timestamp column (snapshot time) so rows can be correlated to their snapshot cycle.
- **D-06:** CSV includes a header row with column names (per OUTP-02).
- **D-07:** `--help` uses clap's auto-generated output from existing `#[arg]` attributes. No additional customization needed.
- **D-08:** Man page is hand-written (not auto-generated from clap) to allow richer prose, examples, and notes on eBPF requirements.
- **D-09:** Man page includes a detailed EXAMPLES section with common use cases.
- **D-10:** Man page is installed and accessible via `man tcptop` when installed through any supported method.
- **D-11:** `cargo install tcptop` is the developer install path -- expects nightly toolchain + bpf-linker already set up.
- **D-12:** `.deb` and `.rpm` packages ship prebuilt binaries via `cargo-deb` and `cargo-generate-rpm`.
- **D-13:** Packages distributed as bare files (GitHub releases). No APT/YUM repository.
- **D-14:** Packages include the man page so `man tcptop` works after install.
- **D-15:** No declared kernel version dependency in packages -- fail at runtime.
- **D-16:** Homebrew/macOS packaging deferred to Phase 4.
- **D-17:** Focus on CSV writer tests and expanded data processing coverage.
- **D-18:** Skip TUI rendering tests -- verified manually.
- **D-19:** Skip eBPF/collector layer tests -- verified manually on Linux VM.
- **D-20:** Cover critical paths, no specific coverage target.

### Claude's Discretion
- CSV timestamp format (ISO 8601 recommended)
- Man page structure and exact prose
- Which aggregator/filtering edge cases to test
- `cargo-deb` / `cargo-generate-rpm` configuration details
- How headless mode reuses the existing event loop vs having its own
- Build automation for `.deb`/`.rpm` (xtask, Makefile, or CI)

### Deferred Ideas (OUT OF SCOPE)
- Homebrew formula for macOS -- Phase 4
- `--batch` or `--once` mode for stdout output without TUI
- JSON Lines output format (OUTP-V2-01)
- Headless mode without file (stdout CSV)
- APT/YUM repository hosting
- UDP flow idle timeout configurable via CLI flag
</user_constraints>
<phase_requirements>
## Phase Requirements

| ID | Description | Research Support |
|----|-------------|------------------|
| OUTP-01 | User can log connection data to CSV file via `--log <path>` flag | csv 1.4.0 + serde Serialize on a flat `CsvRow`; CsvWriter module pattern below |
| OUTP-02 | CSV output includes header row and periodic snapshots of all connections | csv::Writer auto-writes header from struct field names when using serialize(); timestamp column per D-05 |
| OUTP-03 | Tool provides `--help` with clear usage documentation | Already satisfied by clap derive -- just add `#[arg]` for `--log`; verify existing help text |
| OUTP-04 | Tool ships with a man page | Hand-written troff file at `doc/tcptop.1`; installed via package assets |
| OPS-03 | Tool is installable via `cargo install` | Requires `[package]` metadata (description, license, repository) in tcptop/Cargo.toml |
| OPS-05 | Tool has test coverage for core data processing and display logic | Expand pipeline_test.rs with CSV writer tests + aggregator edge cases |
</phase_requirements>

## Standard Stack

### Core (New Dependencies for Phase 3)
| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| csv | 1.4.0 | CSV file writing with serde integration | BurntSushi's crate. 129M+ downloads. Only sensible choice for CSV in Rust. Already in CLAUDE.md stack. |
| serde | 1.x (latest) | Serialization derive macros | Required by csv crate for struct-to-row mapping. Features: `derive`. |
| chrono | 0.4.44 | ISO 8601 timestamp generation | Standard Rust datetime library. `Utc::now().to_rfc3339()` for CSV timestamp column. Lightweight alternative: `std::time::SystemTime` with manual formatting, but chrono is cleaner and already battle-tested. |

### Build Tools (Install, Don't Add as Dependencies)
| Tool | Version | Purpose | Install Command |
|------|---------|---------|-----------------|
| cargo-deb | 3.6.3 | Generate `.deb` packages from Cargo metadata | `cargo install cargo-deb` |
| cargo-generate-rpm | 0.20.0 | Generate `.rpm` packages from Cargo metadata | `cargo install cargo-generate-rpm` |

### Already Present (No Changes)
| Library | Version | Relevant Use |
|---------|---------|-------------|
| clap | 4.6.x | `--log <path>` flag addition; `--help` auto-generation |
| tokio | 1.x | Headless event loop reuses same async runtime |
| anyhow | 1.x | Error handling in CSV writer |

### Alternatives Considered
| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| chrono | `std::time::SystemTime` + manual format | Avoids a dependency but ISO 8601 formatting is tedious and error-prone. chrono is 1 line. |
| Hand-written man page | clap_mangen (auto-generate from clap) | Auto-generated man pages lack prose quality, examples section, and eBPF-specific notes. D-08 explicitly requires hand-written. |
| cargo-deb | dpkg-deb directly | Manual .deb creation is complex (control files, directory layout). cargo-deb handles it all from Cargo.toml metadata. |

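To make the `SystemTime` tradeoff in the table above concrete, here is a rough, std-only sketch of what manual ISO 8601 formatting involves. `civil_from_days`, `format_utc`, and `now_utc` are hypothetical names for illustration, not part of tcptop; the date conversion follows the widely used days-to-civil-date algorithm. Note that it emits a `Z` suffix and whole seconds, whereas chrono's `to_rfc3339()` writes the UTC offset as `+00:00`.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Convert days since 1970-01-01 into (year, month, day) in the
/// proleptic Gregorian calendar.
fn civil_from_days(days: i64) -> (i64, u32, u32) {
    let z = days + 719_468; // shift epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // 0..=399
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // March-based month, 0..=11
    let day = (doy - (153 * mp + 2) / 5 + 1) as u32;
    let month = (if mp < 10 { mp + 3 } else { mp - 9 }) as u32;
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Format a Unix timestamp (whole seconds) as an ISO 8601 UTC string.
pub fn format_utc(unix_secs: u64) -> String {
    let days = (unix_secs / 86_400) as i64;
    let rem = unix_secs % 86_400;
    let (y, m, d) = civil_from_days(days);
    format!(
        "{:04}-{:02}-{:02}T{:02}:{:02}:{:02}Z",
        y,
        m,
        d,
        rem / 3_600,
        (rem % 3_600) / 60,
        rem % 60
    )
}

/// Current time as an ISO 8601 UTC string; falls back to the epoch if the
/// system clock reports a time before 1970.
pub fn now_utc() -> String {
    let secs = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);
    format_utc(secs)
}
```

Roughly thirty lines of calendar arithmetic replace a single chrono call, which is the cost the table refers to.
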
**Installation (workspace Cargo.toml additions):**
```toml
[workspace.dependencies]
serde = { version = "1", features = ["derive"] }
csv = "1.4"
chrono = { version = "0.4", default-features = false, features = ["clock"] }
```

**tcptop/Cargo.toml additions:**
```toml
[dependencies]
serde = { workspace = true }
csv = { workspace = true }
chrono = { workspace = true }
```

## Architecture Patterns

### Recommended Project Structure (additions)
```
tcptop/
├── src/
│   ├── csv_writer.rs        # New: CsvWriter struct, CsvRow serde struct
│   ├── main.rs              # Modified: branch on --log before TUI init
│   ├── model.rs             # Modified: derive Serialize on relevant types
│   └── ...
├── tests/
│   ├── pipeline_test.rs     # Expanded: CSV writer + aggregator edge cases
│   └── csv_test.rs          # New: dedicated CSV output tests
├── Cargo.toml               # Modified: add deps, package metadata, deb/rpm config
doc/
└── tcptop.1                 # New: hand-written man page (workspace root, not inside tcptop/)
```

### Pattern 1: Flat CSV Row Struct (Serde Serialization)
**What:** Create a dedicated `CsvRow` struct that flattens `ConnectionRecord` fields for CSV output, rather than trying to serialize `ConnectionRecord` directly.
**When to use:** When the source struct has nested types (like `ConnectionKey`, `Option<TcpState>`, `Instant`) that don't serialize cleanly to CSV columns.
**Why:** `ConnectionRecord` contains `ConnectionKey` (nested), `Instant` (not serializable), `prev_bytes_*` (internal state not wanted in CSV), and `is_partial`/`is_closed` (internal flags). A flat `CsvRow` with only the D-02 columns is cleaner than fighting serde attributes.

```rust
use crate::model::{ConnectionRecord, Protocol};
use serde::Serialize;

// Field order here is the CSV column order; field names become the header row.
#[derive(Serialize)]
pub struct CsvRow {
    pub timestamp: String,
    pub protocol: String,
    pub local_addr: String,
    pub local_port: u16,
    pub remote_addr: String,
    pub remote_port: u16,
    pub pid: u32,
    pub process_name: String,
    pub state: String,
    pub bytes_in: u64,
    pub bytes_out: u64,
    pub packets_in: u64,
    pub packets_out: u64,
    pub rate_in_bytes_sec: f64,
    pub rate_out_bytes_sec: f64,
    pub rtt_us: String, // "N/A" for UDP, microseconds for TCP
}

impl CsvRow {
    pub fn from_record(record: &ConnectionRecord, timestamp: &str) -> Self {
        CsvRow {
            timestamp: timestamp.to_string(),
            protocol: match record.key.protocol {
                Protocol::Tcp => "TCP".to_string(),
                Protocol::Udp => "UDP".to_string(),
            },
            local_addr: record.key.local_addr.to_string(),
            local_port: record.key.local_port,
            remote_addr: record.key.remote_addr.to_string(),
            remote_port: record.key.remote_port,
            pid: record.pid,
            process_name: record.process_name.clone(),
            // No TCP state means the record is a UDP flow.
            state: record.tcp_state.map_or("UDP".to_string(), |s| s.as_str().to_string()),
            bytes_in: record.bytes_in,
            bytes_out: record.bytes_out,
            packets_in: record.packets_in,
            packets_out: record.packets_out,
            rate_in_bytes_sec: record.rate_in,
            rate_out_bytes_sec: record.rate_out,
            rtt_us: record.rtt_us.map_or("N/A".to_string(), |v| v.to_string()),
        }
    }
}
```

### Pattern 2: Headless Mode Branching
**What:** Branch in `main.rs` after privilege check but BEFORE `ratatui::init()`. Headless mode must never touch the terminal.
**When to use:** When `--log` flag is present.
**Why:** `ratatui::init()` puts the terminal in raw/alternate screen mode. If CSV headless mode initializes the terminal, it breaks stdout and requires cleanup. The branch must happen before any terminal manipulation.

```rust
// In main.rs, inside the #[cfg(target_os = "linux")] block:
let cli = Cli::parse();

if let Some(ref log_path) = cli.log {
    // Headless CSV mode -- no terminal manipulation
    run_headless(&cli, log_path).await?;
} else {
    // TUI mode
    let mut terminal = ratatui::init();
    let result = run_linux(&mut terminal, &cli).await;
    ratatui::restore();
    result?;
}
```

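One way to keep the two modes mutually exclusive (D-01) is to resolve the mode into a value before anything else runs. The following is a minimal sketch under that assumption; `RunMode` and `select_mode` are hypothetical names that do not exist in the codebase, and the `--log` argument is modelled as an `Option<&Path>`.

```rust
use std::path::{Path, PathBuf};

/// The two mutually exclusive run modes from D-01.
#[derive(Debug, PartialEq)]
pub enum RunMode {
    /// `--log <path>` was given: write CSV to this path, never touch the terminal.
    Headless(PathBuf),
    /// No `--log`: run the interactive TUI.
    Tui,
}

/// Decide the run mode from the optional `--log` argument.
pub fn select_mode(log: Option<&Path>) -> RunMode {
    match log {
        Some(path) => RunMode::Headless(path.to_path_buf()),
        None => RunMode::Tui,
    }
}
```

Matching on such a value in `main` makes it hard to initialize the terminal on the headless path by accident, since only the `Tui` arm would contain the `ratatui::init()` call.
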
### Pattern 3: Headless Event Loop
**What:** Simplified event loop without TUI rendering, keyboard input, or crossterm EventStream. Same collector + aggregator pipeline, CSV write on tick instead of terminal draw.
**Why:** Reuses the collector/aggregator infrastructure (same data quality) but strips all TUI overhead.

```rust
use anyhow::Result;
use chrono::Utc;
use std::time::Duration;
use tokio::signal::unix::SignalKind;
use tokio::time::interval;

async fn run_headless(cli: &Cli, log_path: &str) -> Result<()> {
    let mut collector = LinuxCollector::new()?;
    let mut table = ConnectionTable::new();
    // ... bootstrap, channel setup, spawn collector (same as TUI mode)

    let mut csv_writer = csv::Writer::from_path(log_path)?; // Overwrites per D-04
    let mut tick = interval(Duration::from_secs(cli.interval));

    // Signal handlers for graceful shutdown
    let mut sigint = tokio::signal::unix::signal(SignalKind::interrupt())?;
    let mut sigterm = tokio::signal::unix::signal(SignalKind::terminate())?;

    loop {
        tokio::select! {
            Some(event) = rx.recv() => { table.update(event); }
            _ = tick.tick() => {
                let (active, _closed) = table.tick();
                // One timestamp per tick so all rows of a snapshot match (Pitfall 6)
                let timestamp = Utc::now().to_rfc3339();
                for record in &active {
                    csv_writer.serialize(CsvRow::from_record(record, &timestamp))?;
                }
                csv_writer.flush()?; // Ensure data is written each tick
            }
            _ = sigint.recv() => break,
            _ = sigterm.recv() => break,
        }
    }
    Ok(())
}
```

### Pattern 4: cargo-deb Configuration in Cargo.toml
**What:** Package metadata in `tcptop/Cargo.toml` for `.deb` generation.

```toml
[package]
name = "tcptop"
version = "0.1.0"
edition = "2021"
description = "Real-time per-connection network monitor using eBPF"
license = "MIT"
repository = "https://github.com/OWNER/tcptop"
readme = "../README.md"

[package.metadata.deb]
maintainer = "Your Name <email@example.com>"
section = "net"
priority = "optional"
depends = "$auto"
assets = [
    ["target/release/tcptop", "usr/bin/", "755"],
    ["../doc/tcptop.1", "usr/share/man/man1/tcptop.1", "644"],
]
```

### Pattern 5: cargo-generate-rpm Configuration in Cargo.toml

```toml
[package.metadata.generate-rpm]
assets = [
    { source = "target/release/tcptop", dest = "/usr/bin/tcptop", mode = "755" },
    { source = "../doc/tcptop.1", dest = "/usr/share/man/man1/tcptop.1", mode = "644", doc = true },
]
```

### Anti-Patterns to Avoid
- **Deriving Serialize directly on ConnectionRecord:** It has `Instant` (not serializable), nested `ConnectionKey`, and internal fields (`prev_bytes_*`, `is_partial`, `is_closed`) that should not appear in CSV. Use a flat `CsvRow` conversion instead.
- **Initializing ratatui before checking --log:** Terminal raw mode will interfere with headless operation. Branch first.
- **Writing CSV rows without flushing:** The csv crate buffers internally. Call `flush()` after each tick so that data reaches the file and the log survives a crash.
- **Pairing `csv::Writer::from_writer` with a separate `File::create` call:** `csv::Writer::from_path()` handles file creation and overwrite in one call. Simpler and correct.

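The overwrite behaviour required by D-04 comes from create-or-truncate file semantics. The sketch below shows those semantics with the standard library's `File::create` only; it assumes (without having verified it against the csv crate's source here) that `csv::Writer::from_path` opens the file the same way. `open_fresh` and `demo_overwrite` are hypothetical helper names.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Open `path` for a fresh log: creates the file, or truncates it to
/// zero length if it already exists.
pub fn open_fresh(path: &Path) -> io::Result<File> {
    File::create(path)
}

/// Write stale content, reopen the file with `open_fresh`, write a new
/// header, and return what is on disk afterwards.
pub fn demo_overwrite(path: &Path) -> io::Result<String> {
    fs::write(path, "old,data\n")?;
    let mut file = open_fresh(path)?; // previous contents are discarded here
    file.write_all(b"timestamp,protocol\n")?;
    drop(file); // close the handle before reading the file back
    fs::read_to_string(path)
}
```

After `demo_overwrite` returns, only the new header line remains in the file, which is the behaviour D-04 asks for.
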
## Don't Hand-Roll

| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| CSV serialization | Custom string formatting with commas | `csv` crate + serde `Serialize` | Escaping commas in process names, quoting rules, header generation -- all handled. |
| ISO 8601 timestamps | `format!("{}-{}-{}T{}:{}:{}", ...)` | `chrono::Utc::now().to_rfc3339()` | Timezone handling, leap seconds, formatting edge cases. |
| .deb package creation | Manual dpkg-deb with control files | `cargo-deb` | Debian package format is surprisingly complex (control, data archives, compression). |
| .rpm package creation | rpmbuild with spec files | `cargo-generate-rpm` | RPM spec files are arcane. The crate reads Cargo.toml metadata directly. |
| Man page formatting | HTML-to-man conversion | Hand-written troff | Man pages in troff format are straightforward. The format is simple for single-page docs. |

**Key insight:** CSV seems simple but has real edge cases (embedded commas in process names, newlines, quoting). The csv crate handles RFC 4180 compliance. Never format CSV by hand.

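The edge case behind the key insight can be shown with a few lines of std-only Rust. This is an illustrative sketch, not code from tcptop: `naive_csv_line` and `quote_field` are hypothetical helpers, and `quote_field` only approximates the RFC 4180-style quoting that the `csv` crate already performs.

```rust
/// Naive CSV formatting: joins fields with commas and does no quoting.
pub fn naive_csv_line(fields: &[&str]) -> String {
    fields.join(",")
}

/// Minimal RFC 4180-style quoting, shown only to illustrate the rule:
/// a field containing a comma, a double quote, or a line break is wrapped
/// in double quotes, and embedded double quotes are doubled.
pub fn quote_field(field: &str) -> String {
    let needs_quotes = field.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r'));
    if needs_quotes {
        format!("\"{}\"", field.replace('"', "\"\""))
    } else {
        field.to_string()
    }
}
```

With a process name such as `my,proc`, the naive line for three fields splits back into four, so every column after the process name shifts by one. The quoted form keeps it a single field.
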
## Common Pitfalls

### Pitfall 1: CSV Writer Not Flushed Before Process Exit
**What goes wrong:** User sends SIGINT, csv::Writer's internal buffer is not flushed, last tick's data is lost.
**Why it happens:** csv::Writer buffers writes. Its Drop impl flushes, but only if the writer is dropped cleanly (not when the process is terminated by a signal before the drop runs).
**How to avoid:** Call `csv_writer.flush()` after every tick write. This limits data loss on abrupt termination to at most one tick's worth.
**Warning signs:** CSV file is truncated or missing the last few rows.

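The buffering effect described in Pitfall 1 can be reproduced with the standard library alone. The sketch below uses `std::io::BufWriter` as a stand-in for the csv writer's internal buffer (an analogy, not the csv crate's actual implementation); `visible_before_and_after_flush` is a hypothetical helper name.

```rust
use std::fs::{self, File};
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Write one short row through a buffered writer and report how many bytes
/// are visible in the file before and after an explicit flush.
pub fn visible_before_and_after_flush(path: &Path) -> io::Result<(u64, u64)> {
    let mut writer = BufWriter::new(File::create(path)?);
    // 30 bytes: far below the default buffer size, so nothing reaches the file yet.
    writer.write_all(b"2026-03-22T00:00:00+00:00,TCP\n")?;
    let before = fs::metadata(path)?.len();
    writer.flush()?; // push the buffered bytes to the file
    let after = fs::metadata(path)?.len();
    Ok((before, after))
}
```

Until `flush()` runs, the row exists only in process memory, which is why a process killed between ticks can lose everything written since the last flush.
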
### Pitfall 2: Headless Mode Still Tries to Use Terminal
**What goes wrong:** Importing crossterm EventStream or calling ratatui functions in headless path causes terminal corruption.
**Why it happens:** Code sharing between TUI and headless paths pulls in terminal setup.
**How to avoid:** Headless function must be completely separate from TUI function. No shared event stream, no terminal init/restore calls.
**Warning signs:** Garbled terminal output when using `--log`.

### Pitfall 3: Man Page Not Found After Package Install
**What goes wrong:** `man tcptop` returns "No manual entry" after installing .deb/.rpm.
**Why it happens:** Man page installed to wrong path, wrong permissions, or man-db cache not updated.
**How to avoid:** Install to `/usr/share/man/man1/tcptop.1` (not `.1.gz` -- cargo-deb handles compression). Verify with `dpkg -L tcptop | grep man` after installing the .deb, or `rpm -ql tcptop | grep man` after installing the .rpm.
**Warning signs:** File exists but `man` can't find it.

### Pitfall 4: cargo install Fails Due to Missing Package Metadata
**What goes wrong:** `cargo publish` is rejected for a missing description or license, so `cargo install tcptop` (from crates.io) has nothing to install.
**Why it happens:** crates.io requires `description` and `license` (or `license-file`) in `[package]`.
**How to avoid:** Add all required fields before attempting `cargo publish`. Check with `cargo publish --dry-run`; `cargo package --list` shows which files would be included.
**Warning signs:** `cargo publish --dry-run` errors.

### Pitfall 5: Rate Values as Raw Floats in CSV
**What goes wrong:** CSV contains values like `1234.567890123456` with excessive precision, making files large and hard to read.
**Why it happens:** f64 default serialization has full precision.
**How to avoid:** Round rate values to reasonable precision (2 decimal places) in `CsvRow::from_record()`. Keep byte counts raw (u64, exact) and round only the rates.
**Warning signs:** CSV files are unexpectedly large; rate columns show a varying number of digits in spreadsheet tools.

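Two ways of limiting rate precision are sketched below, as an illustration of Pitfall 5 rather than a prescribed implementation. `round_rate` keeps the column numeric (the approach the CSV writer example in this document takes), while `format_rate` is an alternative that renders a fixed two-decimal string; both names are hypothetical.

```rust
/// Round a rate to two decimal places, keeping it an `f64` so the CSV
/// column stays numeric.
pub fn round_rate(bytes_per_sec: f64) -> f64 {
    (bytes_per_sec * 100.0).round() / 100.0
}

/// Alternative: render the rate as a string with exactly two decimal
/// places, so every cell has the same number of fractional digits.
pub fn format_rate(bytes_per_sec: f64) -> String {
    format!("{:.2}", bytes_per_sec)
}
```

The string variant gives uniformly formatted cells at the cost of changing the field type in `CsvRow` from `f64` to `String`; the rounded `f64` leaves the struct unchanged.
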
### Pitfall 6: Timestamp Granularity Mismatch
**What goes wrong:** All rows in one snapshot have slightly different timestamps because timestamp is generated per-row.
**Why it happens:** Calling `Utc::now()` inside the row loop instead of once per tick.
**How to avoid:** Generate timestamp ONCE per tick, pass to all `CsvRow::from_record()` calls.
**Warning signs:** Rows that should be from the same snapshot have different timestamps.

## Code Examples

### CSV Writer Module
```rust
// src/csv_writer.rs
use crate::model::{ConnectionRecord, Protocol};
use anyhow::Result;
use serde::Serialize;
use std::path::Path;

/// One CSV row: the snapshot timestamp plus the D-02 columns.
/// Field names become the header row written by `csv::Writer::serialize`.
#[derive(Serialize)]
pub struct CsvRow {
    pub timestamp: String,
    pub protocol: &'static str,
    pub local_addr: String,
    pub local_port: u16,
    pub remote_addr: String,
    pub remote_port: u16,
    pub pid: u32,
    pub process_name: String,
    pub state: String,
    pub bytes_in: u64,
    pub bytes_out: u64,
    pub packets_in: u64,
    pub packets_out: u64,
    pub rate_in_bytes_sec: f64,
    pub rate_out_bytes_sec: f64,
    pub rtt_us: String,
}

impl CsvRow {
    pub fn from_record(record: &ConnectionRecord, timestamp: &str) -> Self {
        CsvRow {
            timestamp: timestamp.to_string(),
            protocol: match record.key.protocol {
                Protocol::Tcp => "TCP",
                Protocol::Udp => "UDP",
            },
            local_addr: record.key.local_addr.to_string(),
            local_port: record.key.local_port,
            remote_addr: record.key.remote_addr.to_string(),
            remote_port: record.key.remote_port,
            pid: record.pid,
            process_name: record.process_name.clone(),
            state: record.tcp_state
                .map_or("UDP".to_string(), |s| s.as_str().to_string()),
            bytes_in: record.bytes_in,
            bytes_out: record.bytes_out,
            packets_in: record.packets_in,
            packets_out: record.packets_out,
            // Round rates to 2 decimal places (Pitfall 5)
            rate_in_bytes_sec: (record.rate_in * 100.0).round() / 100.0,
            rate_out_bytes_sec: (record.rate_out * 100.0).round() / 100.0,
            rtt_us: record.rtt_us
                .map_or("N/A".to_string(), |v| v.to_string()),
        }
    }
}

pub struct CsvLogger {
    writer: csv::Writer<std::fs::File>,
}

impl CsvLogger {
    pub fn new(path: &Path) -> Result<Self> {
        let writer = csv::Writer::from_path(path)?; // Creates/overwrites per D-04
        Ok(CsvLogger { writer })
    }

    /// Write one row per record, all sharing the same snapshot timestamp.
    pub fn write_snapshot(&mut self, records: &[&ConnectionRecord], timestamp: &str) -> Result<()> {
        for record in records {
            self.writer.serialize(CsvRow::from_record(record, timestamp))?;
        }
        self.writer.flush()?; // Flush per Pitfall 1
        Ok(())
    }
}
```

### Man Page Format (troff)
```troff
.TH TCPTOP 1 "2026-03-22" "tcptop 0.1.0" "User Commands"
.SH NAME
tcptop \- real-time per-connection network monitor
.SH SYNOPSIS
.B tcptop
[\fIOPTIONS\fR]
.SH DESCRIPTION
.B tcptop
displays live per-connection TCP and UDP statistics in a sortable terminal
table. It uses eBPF to trace kernel network events with minimal overhead,
providing real-time visibility into bytes, packets, bandwidth rates, RTT,
and connection state per process.
.PP
Requires root privileges (or CAP_BPF + CAP_PERFMON capabilities) to attach
eBPF programs to kernel probes. Exits with code 77 if insufficient privileges.
.SH OPTIONS
.TP
.BR \-\-port " " \fIPORT\fR
Filter by port number (matches source or destination).
.TP
.BR \-\-pid " " \fIPID\fR
Filter by process ID.
.TP
.BR \-\-process " " \fINAME\fR
Filter by process name (substring match).
.TP
.BR \-i ", " \-\-interface " " \fIIFACE\fR
Network interface to monitor.
.TP
.BR \-\-tcp
Show only TCP connections.
.TP
.BR \-\-udp
Show only UDP connections.
.TP
.BR \-\-interval " " \fISECS\fR
Refresh interval in seconds (default: 1).
.TP
.BR \-\-log " " \fIPATH\fR
Log connection data to CSV file. Runs in headless mode (no TUI).
Overwrites existing file. Snapshots all connections at \-\-interval cadence.
.SH INTERACTIVE KEYS
.TP
.B q, Ctrl-C
Quit.
.TP
.B Up/Down
Scroll connection table.
.TP
.B Left/Right
Change sort column.
.TP
.B /
Enter filter mode (filter by IP, port, or process name).
.TP
.B Esc
Clear filter.
.SH EXAMPLES
Monitor all connections:
.PP
.RS
.B sudo tcptop
.RE
.PP
Monitor only TCP connections to port 443:
.PP
.RS
.B sudo tcptop \-\-tcp \-\-port 443
.RE
.PP
Monitor a specific process:
.PP
.RS
.B sudo tcptop \-\-pid 1234
.RE
.PP
Log to CSV with 2-second intervals:
.PP
.RS
.B sudo tcptop \-\-port 443 \-\-log capture.csv \-\-interval 2
.RE
.SH EXIT STATUS
.TP
.B 0
Normal exit.
.TP
.B 77
Insufficient privileges (not root and missing required capabilities).
.SH REQUIREMENTS
Linux kernel 5.8+ with eBPF support. The tool attaches kprobes and
tracepoints to kernel functions (tcp_sendmsg, tcp_recvmsg, udp_sendmsg,
udp_recvmsg, inet_sock_set_state).
.SH SEE ALSO
.BR top (1),
.BR ss (8),
.BR netstat (8),
.BR bpftool (8)
.SH AUTHORS
tcptop contributors.
```

### Test Pattern for CSV Writer
```rust
// tests/csv_test.rs
// Requires `tempfile` under [dev-dependencies].
// `create_test_record()` / `create_test_record_2()` are helpers (not shown)
// that build synthetic records, following the pattern in pipeline_test.rs.
use tempfile::NamedTempFile;

#[test]
fn test_csv_header_and_fields() {
    let tmp = NamedTempFile::new().unwrap();
    let path = tmp.path().to_path_buf();

    let mut logger = tcptop::csv_writer::CsvLogger::new(&path).unwrap();

    // Create a synthetic record (same pattern as pipeline_test.rs)
    let record = create_test_record(); // helper
    logger.write_snapshot(&[&record], "2026-03-22T00:00:00+00:00").unwrap();
    drop(logger); // Ensure flush

    let contents = std::fs::read_to_string(&path).unwrap();
    let lines: Vec<&str> = contents.lines().collect();

    // Header row present (OUTP-02)
    assert!(lines[0].contains("timestamp"));
    assert!(lines[0].contains("protocol"));
    assert!(lines[0].contains("pid"));

    // Data row present with correct field count
    assert_eq!(lines.len(), 2); // header + 1 data row
    // Plain split is fine only because the test record contains no commas or quotes
    let fields: Vec<&str> = lines[1].split(',').collect();
    assert_eq!(fields.len(), 16); // timestamp + all D-02 fields
}

#[test]
fn test_csv_overwrite_existing_file() {
    let tmp = NamedTempFile::new().unwrap();
    let path = tmp.path().to_path_buf();

    // Write initial content
    std::fs::write(&path, "old,data\n").unwrap();

    // CsvLogger should overwrite (D-04)
    let logger = tcptop::csv_writer::CsvLogger::new(&path).unwrap();
    drop(logger);

    let contents = std::fs::read_to_string(&path).unwrap();
    assert!(!contents.contains("old,data"));
}

#[test]
fn test_csv_timestamp_consistency() {
    // All rows in one snapshot should have the same timestamp (Pitfall 6)
    let tmp = NamedTempFile::new().unwrap();
    let path = tmp.path().to_path_buf();

    let mut logger = tcptop::csv_writer::CsvLogger::new(&path).unwrap();
    let records = vec![create_test_record(), create_test_record_2()];
    let refs: Vec<&_> = records.iter().collect();

    let ts = "2026-03-22T12:00:00+00:00";
    logger.write_snapshot(&refs, ts).unwrap();
    drop(logger);

    let contents = std::fs::read_to_string(&path).unwrap();
    for line in contents.lines().skip(1) { // skip header
        assert!(line.starts_with(ts));
    }
}
```

## State of the Art

| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| `csv` 1.3.x | `csv` 1.4.0 | 2024 | Minor: improved error messages, no API changes |
| `cargo-deb` 1.x-2.x | `cargo-deb` 3.6.3 | 2024-2025 | Major rewrite: better cross-compilation support, faster builds |
| `cargo-generate-rpm` 0.14.x | `cargo-generate-rpm` 0.20.0 | 2025 | Enhanced asset handling, relative path support |
| `chrono` 0.4.31 | `chrono` 0.4.44 | 2024-2025 | Security fixes, `default-features = false` recommended to avoid old `time` crate |

**Deprecated/outdated:**
- `cargo-rpm` (separate crate): Largely abandoned, `cargo-generate-rpm` is the maintained alternative.
- `time` crate for formatting: Not deprecated. The `time` vs `chrono` debate has settled with both viable; `chrono` is chosen here because it has more ecosystem momentum for ISO 8601 use cases.

## Open Questions

1. **Man page location: `doc/tcptop.1` vs `man/tcptop.1`**
   - What we know: Both conventions exist in the Rust ecosystem. cargo-deb references source path in assets config.
   - What's unclear: No project convention established yet.
   - Recommendation: Use `doc/tcptop.1` at workspace root -- conventional and referenced in CONTEXT.md.

2. **chrono vs manual SystemTime formatting**
   - What we know: `chrono` adds a dependency. `SystemTime` with manual formatting avoids it but is verbose.
   - What's unclear: Whether the dependency size matters for this project.
   - Recommendation: Use `chrono` with `default-features = false, features = ["clock"]` -- minimal footprint, clean API, standard choice. The project already has 20+ dependencies; one more for correct timestamps is warranted.

3. **Build automation for .deb/.rpm**
   - What we know: No xtask directory exists. cargo-deb and cargo-generate-rpm are CLI tools run manually.
   - What's unclear: Whether to add xtask commands or just document the build commands.
   - Recommendation: Document commands in a Makefile or README. xtask is overkill for two commands. Phase 3 scope should be minimal -- just configure Cargo.toml metadata and document `cargo deb` / `cargo generate-rpm`.

## Sources

### Primary (HIGH confidence)
- [csv crate on crates.io](https://crates.io/crates/csv) - v1.4.0 verified current
- [cargo-deb GitHub](https://github.com/kornelski/cargo-deb) - v3.6.3, Cargo.toml configuration format
- [cargo-generate-rpm GitHub](https://github.com/cat-in-136/cargo-generate-rpm) - v0.20.0, asset configuration format
- Existing codebase: `model.rs`, `output.rs`, `main.rs`, `aggregator.rs`, `pipeline_test.rs` -- read directly

### Secondary (MEDIUM confidence)
- [chrono crate on crates.io](https://crates.io/crates/chrono) - v0.4.44 verified current
- Man page troff format -- stable specification, well-documented in `man 7 groff_man`

### Tertiary (LOW confidence)
- None -- all critical claims verified against official sources or existing codebase.

## Metadata

**Confidence breakdown:**
- Standard stack: HIGH - csv, serde, chrono are undisputed choices; versions verified against crates.io
- Architecture: HIGH - existing codebase is well-understood; headless branching pattern is straightforward
- Pitfalls: HIGH - common CSV/packaging issues are well-documented in the ecosystem
- Packaging config: MEDIUM - cargo-deb/cargo-generate-rpm config syntax verified from GitHub READMEs but not tested against this specific project structure

**Research date:** 2026-03-22
**Valid until:** 2026-04-22 (stable domain, slow-moving dependencies)