Zachary D. Rowitsch 38e6dcc34a chore: archive v1.0 phase directories to milestones/v1.0-phases/
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 01:33:15 -04:00

# Phase 1: Data Pipeline - Context
**Gathered:** 2026-03-21
**Status:** Ready for planning
<domain>
## Phase Boundary
eBPF-based kernel data collection with platform abstraction, delivering real per-connection stats on Linux. Running `tcptop` prints streaming per-connection data to stdout as proof-of-life. The interactive TUI, CSV logging, and macOS backend are separate phases.
</domain>
<decisions>
## Implementation Decisions
### Proof-of-life output format
- **D-01:** Streaming lines to stdout — each connection update prints a new line (like `tail -f`). No screen clearing or cursor manipulation.
- **D-02:** Human-readable sizes by default (`1.2 MB`, `340 KB/s`). Show raw value alongside when it fits cleanly without noise (e.g., `1258291 (1.2M)`). If too noisy, drop to human-readable only.
- **D-03:** Full detail per connection — all fields: local/remote addr+port, PID, process name, TCP state, bytes in/out, packets in/out, RTT, bandwidth rate.
- **D-04:** Treat this output as throwaway scaffolding. Build it simple; decide whether to keep it as a `--batch` mode when the TUI lands in Phase 2.
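
The size formatting in D-02 could be sketched roughly as below. This is a minimal illustration, not a settled design: the function names (`human_bytes`, `human_rate`, `raw_with_human`) are hypothetical, and it assumes binary multiples (1 KB = 1024 B), which the decisions above do not specify. Keeping formatting in small pure functions also leaves room for the raw/human toggle mentioned under Specific Ideas.

```rust
/// Unit suffixes. Assumption: binary multiples (1 KB = 1024 B).
const UNITS: [&str; 5] = ["B", "KB", "MB", "GB", "TB"];

/// Format a byte count as a short human-readable string, e.g. `1.2 MB`.
/// (Hypothetical helper, not part of any existing code.)
pub fn human_bytes(bytes: u64) -> String {
    let mut value = bytes as f64;
    let mut unit = 0;
    // Divide down until the value fits the largest sensible unit.
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }
    if unit == 0 {
        // Plain bytes are always shown as an exact integer.
        format!("{} B", bytes)
    } else if value < 10.0 {
        // One decimal place for small values such as 1.2 MB.
        format!("{:.1} {}", value, UNITS[unit])
    } else {
        // No decimals for larger values such as 340 KB.
        format!("{:.0} {}", value, UNITS[unit])
    }
}

/// Format a rate given in bytes per second, e.g. `340 KB/s`.
pub fn human_rate(bytes_per_sec: u64) -> String {
    format!("{}/s", human_bytes(bytes_per_sec))
}

/// Raw value with the human-readable form alongside (the D-02 variant).
pub fn raw_with_human(bytes: u64) -> String {
    format!("{} ({})", bytes, human_bytes(bytes))
}
```

With these helpers, `human_bytes(1258291)` yields `1.2 MB` and `human_rate(348160)` yields `340 KB/s`. If the combined form proves too noisy, callers can simply switch from `raw_with_human` to `human_bytes`.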
### UDP flow definition
- **D-05:** Full 4-tuple grouping — (src IP, src port, dst IP, dst port) = one flow. Most granular.
- **D-06:** 5-second idle timeout — UDP flows disappear 5s after last packet. Should be tunable via flag in the future (not Phase 1).
- **D-07:** No synthesized state for UDP — show `-` or `UDP` in the state column. Don't pretend UDP has connection states.
- **D-08:** Flat bidirectional flow tracking — count bytes/packets in each direction, no request/response inference.
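
One way to picture D-05 through D-08 is a map keyed by the 4-tuple with an idle sweep, sketched below. All type and method names here are hypothetical. The sketch also assumes the caller has already normalized each packet to a local/remote orientation so that both directions land on the same entry; the decisions above do not say how that normalization happens.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// Full 4-tuple identifying one UDP flow (D-05).
/// Assumption: normalized to local/remote so both directions share a key.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct FlowKey {
    pub local_ip: IpAddr,
    pub local_port: u16,
    pub remote_ip: IpAddr,
    pub remote_port: u16,
}

/// Direction of a packet relative to the local host.
#[derive(Debug, Clone, Copy)]
pub enum Direction {
    In,
    Out,
}

/// Flat per-direction counters (D-08). No connection state is kept (D-07).
#[derive(Debug, Clone, Copy)]
pub struct FlowStats {
    pub bytes_in: u64,
    pub bytes_out: u64,
    pub packets_in: u64,
    pub packets_out: u64,
    pub last_seen: Instant,
}

pub struct UdpFlowTable {
    idle_timeout: Duration,
    flows: HashMap<FlowKey, FlowStats>,
}

impl UdpFlowTable {
    pub fn new(idle_timeout: Duration) -> Self {
        Self { idle_timeout, flows: HashMap::new() }
    }

    /// Count one packet and refresh the flow's last-seen time.
    pub fn record(&mut self, key: FlowKey, dir: Direction, bytes: u64, now: Instant) {
        let stats = self.flows.entry(key).or_insert(FlowStats {
            bytes_in: 0,
            bytes_out: 0,
            packets_in: 0,
            packets_out: 0,
            last_seen: now,
        });
        match dir {
            Direction::In => {
                stats.bytes_in += bytes;
                stats.packets_in += 1;
            }
            Direction::Out => {
                stats.bytes_out += bytes;
                stats.packets_out += 1;
            }
        }
        stats.last_seen = now;
    }

    pub fn get(&self, key: &FlowKey) -> Option<&FlowStats> {
        self.flows.get(key)
    }

    pub fn len(&self) -> usize {
        self.flows.len()
    }

    /// Remove flows idle for at least the timeout (D-06) and return their keys.
    pub fn expire_idle(&mut self, now: Instant) -> Vec<FlowKey> {
        let timeout = self.idle_timeout;
        let mut expired = Vec::new();
        self.flows.retain(|key, stats| {
            let idle = now.saturating_duration_since(stats.last_seen);
            if idle >= timeout {
                expired.push(*key);
                false
            } else {
                true
            }
        });
        expired
    }
}
```

Passing the timeout to `UdpFlowTable::new` (for Phase 1, `Duration::from_secs(5)`) rather than hard-coding it keeps the later CLI flag a small change. Taking `now` as a parameter makes the expiry logic testable without sleeping.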
### Privilege error experience
- **D-09:** Minimal error message on missing privileges: `error: tcptop requires root privileges. Run with sudo.` No explanations, no offers to re-exec.
- **D-10:** If Linux capabilities (`CAP_BPF`, `CAP_PERFMON`) are present, proceed without root. But don't suggest or document capability setup in the error message.
- **D-11:** Exit code 77 on privilege failure (distinguishable from generic errors for scripting).
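
The privilege check in D-09 through D-11 might look something like the sketch below, which reads the effective UID and the `CapEff` mask from `/proc/self/status` using only the standard library. The function names are hypothetical. The capability bit numbers (38 for `CAP_PERFMON`, 39 for `CAP_BPF`) come from the Linux capability header; whether those two capabilities are sufficient for every eBPF program the tool ends up loading is an assumption to verify, not something established here.

```rust
/// Exit code used on privilege failure (D-11).
pub const EXIT_NO_PRIVILEGE: i32 = 77;

/// Error text from D-09.
pub const PRIVILEGE_ERROR: &str = "error: tcptop requires root privileges. Run with sudo.";

/// Capability bit numbers as defined in linux/capability.h.
const CAP_PERFMON: u32 = 38;
const CAP_BPF: u32 = 39;

/// Look up a `Key:<whitespace>value` line in /proc/<pid>/status text.
fn status_field<'a>(status: &'a str, key: &str) -> Option<&'a str> {
    status.lines().find_map(|line| {
        let (k, v) = line.split_once(':')?;
        if k == key {
            Some(v.trim())
        } else {
            None
        }
    })
}

/// Decide from status text whether the process is root or holds both
/// CAP_BPF and CAP_PERFMON in its effective set (D-10).
pub fn is_privileged(status: &str) -> bool {
    // The Uid line lists real, effective, saved, and filesystem UIDs.
    let euid_is_root = status_field(status, "Uid")
        .and_then(|v| v.split_whitespace().nth(1))
        .map_or(false, |euid| euid == "0");

    // CapEff is a hexadecimal bitmask of effective capabilities.
    let cap_eff = status_field(status, "CapEff")
        .and_then(|v| u64::from_str_radix(v, 16).ok())
        .unwrap_or(0);
    let needed = (1u64 << CAP_BPF) | (1u64 << CAP_PERFMON);

    euid_is_root || (cap_eff & needed) == needed
}

/// Print the minimal error and return the exit code on failure.
pub fn check_privileges() -> Result<(), i32> {
    let status = std::fs::read_to_string("/proc/self/status").unwrap_or_default();
    if is_privileged(&status) {
        Ok(())
    } else {
        eprintln!("{}", PRIVILEGE_ERROR);
        Err(EXIT_NO_PRIVILEGE)
    }
}
```

A caller would run `check_privileges()` first in `main` and pass any returned code to `std::process::exit`. The error message deliberately says nothing about capabilities, matching D-10.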
### Connection lifecycle
- **D-12:** Closed TCP connections linger for one display/refresh cycle, then are removed. Duration scales with refresh rate.
- **D-13:** Visual distinction for new and closing connections: new connections get a brief highlight or marker, and closing connections are shown differently (for example dimmed or in a different color). Exact styling is deferred to Phase 2.
- **D-14:** Connection close events print a line in streaming output: `[CLOSED] 192.168.1.1:443 → ...` so the full lifecycle is visible in the stream.
- **D-15:** Pre-existing connections (started before tcptop) are shown, sourced from `/proc/net/tcp` on startup. They are marked so the user knows their byte/packet counts are partial: counting starts from zero when tcptop launches, so earlier traffic is missed.
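
For D-15, reading pre-existing connections from `/proc/net/tcp` could start from a parser like the sketch below. The parsing strategy is left open above, so this is only one possible approach with hypothetical names. It handles IPv4 only (IPv6 sockets live in `/proc/net/tcp6`) and assumes a little-endian host, where the kernel prints each address with its bytes reversed.

```rust
use std::net::{Ipv4Addr, SocketAddrV4};

/// One parsed row of /proc/net/tcp (hypothetical type, IPv4 only).
#[derive(Debug, PartialEq, Eq)]
pub struct ProcTcpEntry {
    pub local: SocketAddrV4,
    pub remote: SocketAddrV4,
    pub state: u8,
}

/// Parse a field such as `0100007F:1F90` into an address and port.
fn parse_addr(field: &str) -> Option<SocketAddrV4> {
    let (ip_hex, port_hex) = field.split_once(':')?;
    let ip_raw = u32::from_str_radix(ip_hex, 16).ok()?;
    let port = u16::from_str_radix(port_hex, 16).ok()?;
    // Assumption: little-endian host, so the printed bytes are reversed.
    Some(SocketAddrV4::new(Ipv4Addr::from(ip_raw.swap_bytes()), port))
}

/// Parse one data row; returns None for the header or malformed rows.
pub fn parse_proc_net_tcp_line(line: &str) -> Option<ProcTcpEntry> {
    let mut fields = line.split_whitespace();
    let _slot = fields.next()?; // row index such as "0:"
    let local = parse_addr(fields.next()?)?;
    let remote = parse_addr(fields.next()?)?;
    let state = u8::from_str_radix(fields.next()?, 16).ok()?;
    Some(ProcTcpEntry { local, remote, state })
}

/// Parse the whole file; the header row is dropped because it fails to parse.
pub fn parse_proc_net_tcp(contents: &str) -> Vec<ProcTcpEntry> {
    contents.lines().filter_map(parse_proc_net_tcp_line).collect()
}

/// Map the kernel's numeric TCP state to its conventional name.
pub fn state_name(state: u8) -> &'static str {
    match state {
        0x01 => "ESTABLISHED",
        0x02 => "SYN_SENT",
        0x03 => "SYN_RECV",
        0x04 => "FIN_WAIT1",
        0x05 => "FIN_WAIT2",
        0x06 => "TIME_WAIT",
        0x07 => "CLOSE",
        0x08 => "CLOSE_WAIT",
        0x09 => "LAST_ACK",
        0x0A => "LISTEN",
        0x0B => "CLOSING",
        _ => "UNKNOWN",
    }
}
```

At startup the result of `parse_proc_net_tcp(&std::fs::read_to_string("/proc/net/tcp")?)` could seed the connection table, with each seeded entry flagged as partial so the output can mark it.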
### Claude's Discretion
- eBPF hook point selection (kprobes vs tracepoints)
- Platform abstraction trait design and boundaries
- Ring buffer vs perf event array for kernel-to-userspace transport
- RTT estimation implementation approach
- Exact format/layout of streaming output lines
- `/proc/net/tcp` parsing strategy for pre-existing connections
- Internal data structures and concurrency model
</decisions>
<specifics>
## Specific Ideas
- Human-readable vs raw byte display should eventually be toggleable (keypress or flag) — not required in Phase 1, but keep the formatting logic separable so it's easy to add later.
- The streaming output should feel like `tail -f` for network connections — familiar to anyone who monitors logs.
</specifics>
<canonical_refs>
## Canonical References
No external specs — requirements are fully captured in decisions above and in:
- `.planning/REQUIREMENTS.md` — DATA-01 through DATA-07, PLAT-01, PLAT-03, OPS-01, OPS-02
- `.planning/ROADMAP.md` §Phase 1 — success criteria and plan structure
- `CLAUDE.md` §Technology Stack — Aya, tokio, recommended stack and version pins
</canonical_refs>
<code_context>
## Existing Code Insights
### Reusable Assets
- None — greenfield project, no code exists yet
### Established Patterns
- None yet — Phase 1 establishes the foundational patterns (workspace structure, eBPF build pipeline, platform abstraction trait)
### Integration Points
- Phase 2 (TUI) will consume the data structures and collection trait defined here
- Phase 4 (macOS) will implement the platform abstraction trait defined here
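
To make these integration points concrete, the shared trait might take a shape like the sketch below. The trait design is explicitly left open in this phase, so every name here (`Collector`, `ConnectionSample`, `MockCollector`) is hypothetical, and the synchronous polling style is an assumption made for brevity; the real design may well be async.

```rust
use std::net::SocketAddr;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Protocol {
    Tcp,
    Udp,
}

/// Platform-neutral snapshot of one connection (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ConnectionSample {
    pub local: SocketAddr,
    pub remote: SocketAddr,
    pub protocol: Protocol,
    pub pid: u32,
    pub bytes_in: u64,
    pub bytes_out: u64,
}

/// What a platform backend would provide to the output layer.
pub trait Collector {
    type Error: std::fmt::Debug;

    /// Attach probes or otherwise begin collecting.
    fn start(&mut self) -> Result<(), Self::Error>;

    /// Return the samples gathered since the previous call.
    fn poll(&mut self) -> Result<Vec<ConnectionSample>, Self::Error>;
}

/// In-memory stand-in, useful for exercising consumers without a kernel.
pub struct MockCollector {
    pending: Vec<ConnectionSample>,
    started: bool,
}

impl MockCollector {
    pub fn new(pending: Vec<ConnectionSample>) -> Self {
        Self { pending, started: false }
    }
}

impl Collector for MockCollector {
    type Error = String;

    fn start(&mut self) -> Result<(), Self::Error> {
        self.started = true;
        Ok(())
    }

    fn poll(&mut self) -> Result<Vec<ConnectionSample>, Self::Error> {
        if !self.started {
            return Err("collector not started".to_string());
        }
        // Hand over everything queued so far and leave the queue empty.
        Ok(std::mem::take(&mut self.pending))
    }
}
```

A Linux eBPF backend and a later macOS backend would each implement the same trait, while the streaming output and the Phase 2 TUI depend only on `Collector` and `ConnectionSample`.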
</code_context>
<deferred>
## Deferred Ideas
- UDP flow idle timeout should be user-configurable via CLI flag — capture for Phase 2 or 3
- Protocol hint column for well-known UDP ports (DNS, NTP, etc.) — future enhancement, not v1
- Toggle between human-readable and raw byte display via keypress — Phase 2 TUI feature
- `--batch` or `--once` mode if proof-of-life output proves useful — evaluate after Phase 2
</deferred>
---
*Phase: 01-data-pipeline*
*Context gathered: 2026-03-21*