The bytecode pipeline (§3.1) now handles all program-level execution. Gate execute_program with #[cfg(test)], add deprecation docs to the AST-walking statements/expressions modules, and update architecture docs to reflect the bytecode VM as primary. Also adds story 3-2 (generators) as ready-for-dev. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
217 lines
12 KiB
Markdown
# Architecture — rust_browser

**Generated:** 2026-03-05 | **Scan Level:** Deep

## Executive Summary

rust_browser is a from-scratch web browser engine built as a Rust Cargo workspace with 22 crates organized in a strict 4-layer hierarchy. The architecture follows a deterministic, single-threaded rendering pipeline (HTML → DOM → CSS → Style → Layout → Display List → Render) with arena-based data flow between phases. Worker threads handle only network I/O and image decoding. The project prioritizes correctness, testability, and clear separation of concerns over early performance optimization.

## Architecture Overview

```
┌────────────────────────────────────────────────────────────────┐
│  Layer 3: app_browser                                          │
│  Desktop shell, CLI, event loop, pipeline orchestration        │
├────────────────────────────────────────────────────────────────┤
│  Layer 2: browser_runtime                                      │
│  Tab management, navigation lifecycle, history, browsing ctx   │
├────────────────────────────────────────────────────────────────┤
│  Layer 1: Engine Crates                                        │
│                                                                │
│  ┌─── JS Engine ────┐  ┌──── Rendering Pipeline ─────────────┐ │
│  │ js_parser (AST)  │  │ html → dom → css → selectors        │ │
│  │ js_vm (interp)   │  │ → style → layout → display_list     │ │
│  │ js (facade)      │  │ → render → graphics                 │ │
│  └──────────────────┘  └─────────────────────────────────────┘ │
│                                                                │
│  ┌─── Infrastructure ───┐  ┌── Web APIs ──┐                    │
│  │ net (HTTP/file)      │  │ web_api      │                    │
│  │ image (decode)       │  │ (JS↔DOM      │                    │
│  │ fonts (rasterize)    │  │ bridge)      │                    │
│  │ storage (placeholder)│  └──────────────┘                    │
│  │ platform (windowing) │                                      │
│  └──────────────────────┘                                      │
├────────────────────────────────────────────────────────────────┤
│  Layer 0: shared                                               │
│  Common types: Point, Size, Rect, Color, NodeId, StyleId,      │
│  LayoutId, BrowserUrl, errors                                  │
└────────────────────────────────────────────────────────────────┘
```

## Rendering Pipeline

The pipeline is intentionally staged and deterministic. Each phase produces a stable intermediate representation:

```
1. Input         → HTML source string
2. html crate    → DOM Tree (arena-based, NodeId indices)
3. css crate     → Parsed Stylesheets (Rules, Selectors, Declarations)
4. selectors     → Matched Rules per node (with Specificity)
5. style crate   → StyledDocument (NodeId → ComputedStyles)
6. layout crate  → Layout Tree (LayoutNode, box geometry, positioning)
7. display_list  → Display List (Vec<DisplayItem>: rects, text, borders, images)
8. render crate  → Pixel Buffer (RGBA8 pixels)
9. platform      → On-screen presentation (via softbuffer + winit)
```

In `--render` mode, steps 2-7 produce artifact dumps (`.layout.txt` and `.dl.txt`) for golden regression testing.

## Crate Dependency Graph

### Layer Rules (enforced by `scripts/check_deps.sh`)
- **No upward dependencies** — Layer 1 cannot depend on Layer 2 or 3
- **Lateral dependencies within Layer 1** are allowed where necessary (e.g., `style` depends on `css`, `dom`, `selectors`)
- **Layer 0 (`shared`)** has no internal dependencies

### Key Dependencies

| Crate | Depends On |
|---|---|
| `app_browser` (L3) | browser_runtime, platform, shared, net, html, dom, css, style, layout, display_list, render, image |
| `browser_runtime` (L2) | web_api, dom, net, storage, shared |
| `web_api` (L1) | dom, html, js, js_parser, shared |
| `style` (L1) | css, dom, selectors, shared |
| `layout` (L1) | dom, style, image, fonts, shared |
| `display_list` (L1) | layout, style, shared |
| `render` (L1) | display_list, fonts, graphics, image, shared |
| `selectors` (L1) | dom, shared |
| `js_vm` (L1) | js_parser |
| `js` (L1) | js_parser, js_vm |

## Data Flow Patterns

### Arena-Based Storage
Each crate uses arena-based storage with integer IDs for cross-crate references:
- **`NodeId`** — DOM node identity (used by dom, html, style, layout, display_list)
- **`StyleId`** — Style identity
- **`LayoutId`** — Layout node identity
- **`ImageId`** — Decoded image identity (used by image, render)

This avoids lifetime-based references across crate boundaries, simplifying the API surface.

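The arena-plus-ID pattern described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `Arena`, `Node`, and the `NodeId(u32)` layout are assumptions, and the real `NodeId` in `shared` may be defined differently.

```rust
/// Hypothetical ID type; the real `NodeId` lives in `shared` and may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct NodeId(u32);

/// Hypothetical node payload, for illustration only.
#[derive(Debug)]
pub struct Node {
    pub tag: String,
    pub parent: Option<NodeId>,
    pub children: Vec<NodeId>,
}

/// A simple arena: nodes live in one Vec and are addressed by index.
#[derive(Debug, Default)]
pub struct Arena {
    nodes: Vec<Node>,
}

impl Arena {
    /// Allocates a node, links it to its parent, and returns its ID.
    pub fn alloc(&mut self, tag: &str, parent: Option<NodeId>) -> NodeId {
        let id = NodeId(self.nodes.len() as u32);
        self.nodes.push(Node {
            tag: tag.to_string(),
            parent,
            children: Vec::new(),
        });
        if let Some(p) = parent {
            self.nodes[p.0 as usize].children.push(id);
        }
        id
    }

    /// Looks a node up by ID. IDs are plain data, so no lifetimes leak out.
    pub fn get(&self, id: NodeId) -> Option<&Node> {
        self.nodes.get(id.0 as usize)
    }
}
```

Because an ID is `Copy` and carries no borrow, a downstream structure can store it freely and resolve it against the arena only when it needs the node.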
### Phase-Based Mutation
The pipeline follows a strict phase model:
1. **Parse phase** — HTML parser builds DOM tree (mutable)
2. **Style phase** — Style engine reads DOM, produces `StyledDocument` (DOM is immutable)
3. **Layout phase** — Layout engine reads styled DOM, produces layout tree
4. **Paint phase** — Display list builder reads layout tree, produces draw commands
5. **Render phase** — Rasterizer reads display list, produces pixels

Each phase reads the previous phase's output and produces new, stable output.

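One way to express the phase model in Rust is to have each phase borrow its inputs immutably and return a fresh value. The sketch below assumes toy stand-ins (`Dom`, `StyledDocument`, `LayoutTree`, and the three functions are hypothetical, with trivial logic) and shows only the first three phases.

```rust
// All types and functions here are hypothetical stand-ins for the real crates.
#[derive(Debug)]
pub struct Dom {
    pub tags: Vec<String>,
}

#[derive(Debug)]
pub struct StyledDocument {
    pub display: Vec<&'static str>,
}

#[derive(Debug)]
pub struct LayoutTree {
    pub heights: Vec<u32>,
}

/// Parse phase: the only phase that builds the DOM.
pub fn parse(source: &str) -> Dom {
    let tags = source.split_whitespace().map(str::to_string).collect();
    Dom { tags }
}

/// Style phase: borrows the DOM immutably and returns a new structure.
pub fn compute_styles(dom: &Dom) -> StyledDocument {
    let display = dom
        .tags
        .iter()
        .map(|t| if t == "span" { "inline" } else { "block" })
        .collect();
    StyledDocument { display }
}

/// Layout phase: reads the DOM and styles, produces a layout tree.
pub fn layout(dom: &Dom, styles: &StyledDocument) -> LayoutTree {
    let heights = dom
        .tags
        .iter()
        .zip(&styles.display)
        .map(|(_, d)| if *d == "block" { 20 } else { 0 })
        .collect();
    LayoutTree { heights }
}
```

With shared borrows as inputs, the compiler rejects any attempt by a later phase to mutate an earlier phase's output, which is the property the phase model relies on.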
## JavaScript Engine Architecture

```
┌─────────────────────────────────────────────────┐
│  js crate (facade)                              │
│    JsEngine: combines parser + VM               │
├─────────────────────────────────────────────────┤
│  js_vm crate                                    │
│    JsVm: runtime state machine                  │
│    bytecode::Compiler: AST → bytecode (Chunk)   │
│    bytecode_exec: bytecode execution loop       │
│    Environment: variable scoping                │
│    JsValue: value representation                │
│    HostEnvironment: host object binding         │
│    Runtime: builtins, coercion, property access │
│    [deprecated] AST interpreter: statements.rs, │
│      expressions/ (see note below)              │
├─────────────────────────────────────────────────┤
│  js_parser crate (parsing)                      │
│    JsParser: source → AST                       │
│    Tokenizer: source → tokens                   │
│    AST types: Program, Statement, Expression    │
└─────────────────────────────────────────────────┘
```

- **Bytecode pipeline** — `JsParser` produces an AST, `bytecode::Compiler` compiles it
  to a `Chunk` (bytecode + constant pool), and `bytecode_exec` runs the bytecode.
  The old AST-walking interpreter (`execute_program`, `execute_stmt`, `eval_expr`)
  is deprecated. It still exists because `call_function` (used by the bytecode
  executor for function invocations) delegates to `execute_function_body` →
  `execute_stmt`. Once function calls are fully compiled to bytecode, the AST
  interpreter can be removed.
- **Configurable limits** — `VmConfig` with `max_statements`, `max_call_depth`
- **Host bindings** — `HostEnvironment` trait for injecting browser APIs
- **Regex** — ECMAScript-compatible via `regress` crate

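To make the AST → `Chunk` → execution-loop flow concrete, here is a deliberately tiny sketch for arithmetic expressions. `Expr`, `Op`, `compile`, and `execute` are hypothetical and far simpler than the real `bytecode::Compiler` and `bytecode_exec`; the `max_steps` parameter only stands in for a limit such as `max_statements`.

```rust
/// Hypothetical miniature AST; the real one lives in `js_parser`.
pub enum Expr {
    Num(f64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Hypothetical opcodes. `Const` indexes into the chunk's constant pool.
#[derive(Debug, Clone, Copy)]
pub enum Op {
    Const(usize),
    Add,
    Mul,
}

/// Bytecode plus constant pool, mirroring the `Chunk` idea in the text.
#[derive(Debug, Default)]
pub struct Chunk {
    pub code: Vec<Op>,
    pub constants: Vec<f64>,
}

/// Compiles in post-order so operands are emitted before their operator.
pub fn compile(expr: &Expr, chunk: &mut Chunk) {
    match expr {
        Expr::Num(n) => {
            chunk.constants.push(*n);
            let index = chunk.constants.len() - 1;
            chunk.code.push(Op::Const(index));
        }
        Expr::Add(l, r) => {
            compile(l, chunk);
            compile(r, chunk);
            chunk.code.push(Op::Add);
        }
        Expr::Mul(l, r) => {
            compile(l, chunk);
            compile(r, chunk);
            chunk.code.push(Op::Mul);
        }
    }
}

/// Pops the right and left operands of a binary operator.
fn pop2(stack: &mut Vec<f64>) -> Result<(f64, f64), String> {
    let r = stack.pop().ok_or("stack underflow")?;
    let l = stack.pop().ok_or("stack underflow")?;
    Ok((l, r))
}

/// Runs the chunk on a value stack, failing once `max_steps` is exceeded.
pub fn execute(chunk: &Chunk, max_steps: usize) -> Result<f64, String> {
    let mut stack: Vec<f64> = Vec::new();
    for (step, op) in chunk.code.iter().enumerate() {
        if step >= max_steps {
            return Err("step limit exceeded".to_string());
        }
        match *op {
            Op::Const(i) => stack.push(chunk.constants[i]),
            Op::Add => {
                let (l, r) = pop2(&mut stack)?;
                stack.push(l + r);
            }
            Op::Mul => {
                let (l, r) = pop2(&mut stack)?;
                stack.push(l * r);
            }
        }
    }
    stack.pop().ok_or_else(|| "empty stack".to_string())
}
```

Compiling `(1 + 2) * 3` with this sketch yields five instructions (`Const`, `Const`, `Add`, `Const`, `Mul`), and running them leaves `9.0` on the stack.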
## Web API Bridge

The `web_api` crate connects the JS engine to the DOM:

- **DOM Host Objects** — `window`, `document` exposed as JS host objects
- **Event System** — `EventTarget`, `EventListenerRegistry`, `DispatchResult`
- **Scheduling** — `TaskQueue` (setTimeout/setInterval), `MicrotaskQueue` (Promise callbacks)
- **Promises** — `PromiseRegistry` for async operation tracking
- **Script Execution** — `WebApiFacade::execute_script()` as the unified entry point

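The relationship between the two queues can be sketched as below. This assumes the usual web event-loop ordering (drain all microtasks after each task); whether `TaskQueue` and `MicrotaskQueue` in `web_api` behave exactly this way is not stated here, and `Scheduler` and its methods are hypothetical names.

```rust
use std::collections::VecDeque;

/// Hypothetical callback type: it may log and may queue further work.
type Callback = Box<dyn FnOnce(&mut Scheduler)>;

/// Hypothetical scheduler combining a task queue and a microtask queue.
#[derive(Default)]
pub struct Scheduler {
    tasks: VecDeque<Callback>,      // setTimeout-style tasks
    microtasks: VecDeque<Callback>, // Promise-reaction-style microtasks
    pub log: Vec<&'static str>,
}

impl Scheduler {
    pub fn queue_task(&mut self, f: impl FnOnce(&mut Scheduler) + 'static) {
        self.tasks.push_back(Box::new(f));
    }

    pub fn queue_microtask(&mut self, f: impl FnOnce(&mut Scheduler) + 'static) {
        self.microtasks.push_back(Box::new(f));
    }

    /// Runs microtasks until the queue is empty, including newly queued ones.
    fn drain_microtasks(&mut self) {
        while let Some(micro) = self.microtasks.pop_front() {
            micro(self);
        }
    }

    /// Runs one task at a time, draining all microtasks after each task.
    pub fn run(&mut self) {
        self.drain_microtasks();
        while let Some(task) = self.tasks.pop_front() {
            task(self);
            self.drain_microtasks();
        }
    }
}
```

Under this ordering, a microtask queued by the first task runs before the second task starts, even though the second task was queued earlier.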
## Threading Model

```
Main Thread (single-threaded):
  ├── JavaScript execution (bytecode VM)
  ├── DOM manipulation
  ├── Style computation
  ├── Layout
  ├── Display list generation
  ├── CPU rasterization
  └── Event dispatch

Worker Threads (I/O only):
  ├── Network requests (HTTP via ureq)
  └── Image decoding
```

The single-threaded model ensures determinism and simplifies reasoning about state. Worker threads are used only for operations that would block the main thread.

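A common way to keep workers I/O-only is to hand results back over a channel so that every state change still happens on the main thread. The sketch below uses only the standard library; `WorkerResult`, `spawn_fetch`, and `wait_for_result` are hypothetical, and the "fetch" is simulated rather than a real `ureq` request.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical message a worker sends back to the main thread.
#[derive(Debug, PartialEq)]
pub enum WorkerResult {
    Fetched { url: String, bytes: usize },
}

/// Spawns a worker for a blocking job and returns the receiving end.
/// The fetch is simulated; a real worker would perform network I/O here.
pub fn spawn_fetch(url: &str) -> mpsc::Receiver<WorkerResult> {
    let (tx, rx) = mpsc::channel();
    let url = url.to_string();
    thread::spawn(move || {
        let bytes = url.len(); // placeholder for the response size
        // Ignore send errors: the main thread may have dropped the receiver.
        let _ = tx.send(WorkerResult::Fetched { url, bytes });
    });
    rx
}

/// Main-thread side: blocks until the worker reports back (or hangs up).
pub fn wait_for_result(rx: &mpsc::Receiver<WorkerResult>) -> Option<WorkerResult> {
    rx.recv().ok()
}
```

An interactive event loop would more likely poll with `try_recv` on each iteration instead of blocking in `recv`, so that input handling and repaint are never stalled by a slow request.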
## State Management

### Navigation Lifecycle
```
BrowserRuntime
  └── BrowsingContext
        ├── NavigationState: Loading → ReceivingData → Complete | Failed
        ├── Current URL
        └── WebApiFacade
              ├── JS Engine
              ├── DOM Document
              ├── Event Listeners
              ├── Task Queue
              └── Microtask Queue
```

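The `NavigationState` progression in the diagram reads naturally as a small state machine. In the sketch below the four state names come from the diagram, while `NavEvent`, `on_event`, and the exact transition rules are assumptions made for illustration.

```rust
/// States taken from the lifecycle diagram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NavigationState {
    Loading,
    ReceivingData,
    Complete,
    Failed,
}

/// Hypothetical events that drive the lifecycle.
#[derive(Debug, Clone, Copy)]
pub enum NavEvent {
    ResponseStarted,
    BodyFinished,
    Error,
}

impl NavigationState {
    /// Returns the next state, or `None` if the event is invalid here.
    pub fn on_event(self, event: NavEvent) -> Option<NavigationState> {
        use NavigationState::*;
        match (self, event) {
            (Loading, NavEvent::ResponseStarted) => Some(ReceivingData),
            (ReceivingData, NavEvent::BodyFinished) => Some(Complete),
            (Loading | ReceivingData, NavEvent::Error) => Some(Failed),
            // Complete and Failed are treated as terminal in this sketch.
            _ => None,
        }
    }
}
```

Returning `Option` makes invalid transitions explicit at the call site instead of silently ignoring them.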
### Application State (app_browser)
- `AppState` — mutable browser state (current page, scroll position, input focus)
- `BrowserChrome` — UI chrome overlay (URL bar, status)
- `HitTest` — Click target resolution from display list
- `FocusOutline` — Keyboard focus visualization
- `FormState` — Form input handling

## Platform Abstraction

Platform-specific code is isolated in `crates/platform/`:
- **Windowing** — `winit` for cross-platform window creation
- **Pixel presentation** — `softbuffer` for CPU-rendered pixel blitting
- **Event mapping** — OS events → `WindowEvent` enum
- **Text input** — IME state management

The `platform` and `graphics` crates are the only ones allowed to use `unsafe` code.

## Testing Architecture

| Layer | Type | Location | Purpose |
|---|---|---|---|
| Unit | `#[cfg(test)] mod tests` | Inline in crate source | Private API testing |
| Integration | `tests/*.rs` | Root `tests/` directory | Cross-crate behavior |
| Golden | `tests/goldens.rs` | Golden fixtures | Layout/display list regression |
| JS Conformance | `tests/js262_harness.rs` | Test262 manifests | ECMAScript spec compliance |
| Web Platform | `tests/wpt_harness.rs` | WPT fixtures | CSS/HTML spec compliance |

## Safety Guarantees

1. **`unsafe` forbidden globally** — workspace lint `unsafe_code = "forbid"`
2. **Exceptions**: `platform/` and `graphics/` crates only (via per-crate lint override)
3. **Enforced by CI** — `scripts/check_unsafe.sh` audits unsafe usage
4. **Dependency layering** — `scripts/check_deps.sh` prevents architecture violations
5. **License policy** — `deny.toml` prevents license/advisory issues