Zachary D. Rowitsch 7e318179a6 Mark AST-walking JS interpreter as deprecated in favor of bytecode VM

The bytecode pipeline (§3.1) now handles all program-level execution.
Gate execute_program with #[cfg(test)], add deprecation docs to the
AST-walking statements/expressions modules, and update architecture
docs to reflect the bytecode VM as primary. Also adds story 3-2
(generators) as ready-for-dev.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-15 10:47:46 -04:00


Architecture — rust_browser

Generated: 2026-03-05 | Scan Level: Deep

Executive Summary

rust_browser is a from-scratch web browser engine built as a Rust Cargo workspace with 22 crates organized in a strict 4-layer hierarchy. The architecture follows a deterministic, single-threaded rendering pipeline (HTML → DOM → CSS → Style → Layout → Display List → Render) with arena-based data flow between phases. Worker threads handle only network I/O and image decoding. The project prioritizes correctness, testability, and clear separation of concerns over early performance optimization.

Architecture Overview

┌────────────────────────────────────────────────────────────────┐
│ Layer 3: app_browser                                           │
│   Desktop shell, CLI, event loop, pipeline orchestration       │
├────────────────────────────────────────────────────────────────┤
│ Layer 2: browser_runtime                                       │
│   Tab management, navigation lifecycle, history, browsing ctx  │
├────────────────────────────────────────────────────────────────┤
│ Layer 1: Engine Crates                                         │
│                                                                │
│  ┌─── JS Engine ───┐  ┌──── Rendering Pipeline ─────────────┐ │
│  │ js_parser (AST)  │  │ html → dom → css → selectors        │ │
│  │ js_vm (interp)   │  │   → style → layout → display_list   │ │
│  │ js (facade)      │  │   → render → graphics                │ │
│  └──────────────────┘  └─────────────────────────────────────┘ │
│                                                                │
│  ┌─── Infrastructure ──┐  ┌── Web APIs ──┐                    │
│  │ net (HTTP/file)      │  │ web_api      │                    │
│  │ image (decode)       │  │ (JS↔DOM      │                    │
│  │ fonts (rasterize)    │  │  bridge)     │                    │
│  │ storage (placeholder)│  └──────────────┘                    │
│  │ platform (windowing) │                                      │
│  └──────────────────────┘                                      │
├────────────────────────────────────────────────────────────────┤
│ Layer 0: shared                                                │
│   Common types: Point, Size, Rect, Color, NodeId, StyleId,    │
│   LayoutId, BrowserUrl, errors                                 │
└────────────────────────────────────────────────────────────────┘

Rendering Pipeline

The pipeline is intentionally staged and deterministic. Each phase produces a stable intermediate representation:

1. Input          → HTML source string
2. html crate     → DOM Tree (arena-based, NodeId indices)
3. css crate      → Parsed Stylesheets (Rules, Selectors, Declarations)
4. selectors      → Matched Rules per node (with Specificity)
5. style crate    → StyledDocument (NodeId → ComputedStyles)
6. layout crate   → Layout Tree (LayoutNode, box geometry, positioning)
7. display_list   → Display List (Vec<DisplayItem>: rects, text, borders, images)
8. render crate   → Pixel Buffer (RGBA8 pixels)
9. platform       → On-screen presentation (via softbuffer + winit)

In --render mode, steps 2-7 produce two artifact dumps for golden regression testing: the layout tree (.layout.txt) and the display list (.dl.txt).
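As a rough illustration of why a deterministic pipeline suits golden testing, the sketch below serializes a display list to stable text and compares it with a stored golden string. The `DisplayItem` variants, their fields, the text format, and the `dump_display_list` and `matches_golden` helpers are all assumptions made for this example; the real .dl.txt format is not described here.

```rust
use std::fmt::Write;

/// Hypothetical display items; the real `DisplayItem` enum may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum DisplayItem {
    Rect { x: i32, y: i32, w: i32, h: i32, rgba: [u8; 4] },
    Text { x: i32, y: i32, content: String },
}

/// Serialize items one per line, in paint order, so identical input
/// always yields byte-identical output.
pub fn dump_display_list(items: &[DisplayItem]) -> String {
    let mut out = String::new();
    for item in items {
        match item {
            DisplayItem::Rect { x, y, w, h, rgba } => {
                let [r, g, b, a] = *rgba;
                // Writing to a String cannot fail, so unwrap is safe here.
                writeln!(
                    out,
                    "rect {},{} {}x{} #{:02x}{:02x}{:02x}{:02x}",
                    x, y, w, h, r, g, b, a
                )
                .unwrap();
            }
            DisplayItem::Text { x, y, content } => {
                writeln!(out, "text {},{} {:?}", x, y, content).unwrap();
            }
        }
    }
    out
}

/// Compare a fresh dump against the stored golden text.
pub fn matches_golden(items: &[DisplayItem], golden: &str) -> bool {
    dump_display_list(items) == golden
}
```

A golden test would then load the stored text for a fixture and fail when `matches_golden` returns false.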

Crate Dependency Graph

Layer Rules (enforced by `scripts/check_deps.sh`)

No upward dependencies — Layer 1 cannot depend on Layer 2 or 3
Lateral dependencies within Layer 1 are allowed where necessary (e.g., style depends on css, dom, selectors)
Layer 0 (shared) has no internal dependencies
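
The layer rules reduce to a simple check on dependency edges. The real enforcement lives in scripts/check_deps.sh, whose contents are not shown here; the Rust sketch below only illustrates the rule, and `upward_violations` and its inputs are hypothetical.

```rust
use std::collections::HashMap;

/// Return every dependency edge that points from a lower layer to a higher one.
/// `layers` maps crate name to layer number; `deps` lists (crate, dependency) pairs.
pub fn upward_violations<'a>(
    layers: &HashMap<&'a str, u8>,
    deps: &[(&'a str, &'a str)],
) -> Vec<(&'a str, &'a str)> {
    deps.iter()
        .copied()
        .filter(|(krate, dep)| match (layers.get(krate), layers.get(dep)) {
            // Same-layer (lateral) and downward edges are allowed.
            (Some(from), Some(to)) => to > from,
            // Crates outside the workspace are ignored in this sketch.
            _ => false,
        })
        .collect()
}
```

Because `shared` sits at layer 0, any edge from it to another workspace crate is reported, which matches the rule that Layer 0 has no internal dependencies.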

Key Dependencies

Crate	Depends On
`app_browser` (L3)	browser_runtime, platform, shared, net, html, dom, css, style, layout, display_list, render, image
`browser_runtime` (L2)	web_api, dom, net, storage, shared
`web_api` (L1)	dom, html, js, js_parser, shared
`style` (L1)	css, dom, selectors, shared
`layout` (L1)	dom, style, image, fonts, shared
`display_list` (L1)	layout, style, shared
`render` (L1)	display_list, fonts, graphics, image, shared
`selectors` (L1)	dom, shared
`js_vm` (L1)	js_parser
`js` (L1)	js_parser, js_vm

Data Flow Patterns

Arena-Based Storage

Each crate uses arena-based storage with integer IDs for cross-crate references:

NodeId — DOM node identity (used by dom, html, style, layout, display_list)
StyleId — Style identity
LayoutId — Layout node identity
ImageId — Decoded image identity (used by image, render)

This avoids lifetime-based references across crate boundaries, simplifying the API surface.
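
A minimal sketch of the arena pattern, assuming a `u32` newtype for `NodeId` and a `Vec`-backed arena. The `Node` fields and the `Dom` type are illustrative only; the real definitions in the shared and dom crates may differ.

```rust
/// Hypothetical stand-in for the shared `NodeId`; the real definition may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct NodeId(pub u32);

#[derive(Debug)]
pub struct Node {
    pub tag: String,
    pub parent: Option<NodeId>,
    pub children: Vec<NodeId>,
}

/// Arena that owns every node; other crates hold only `NodeId` values.
#[derive(Debug, Default)]
pub struct Dom {
    nodes: Vec<Node>,
}

impl Dom {
    /// Append a node and, if a parent is given, register it as a child.
    /// Panics if `parent` does not belong to this arena.
    pub fn add(&mut self, tag: &str, parent: Option<NodeId>) -> NodeId {
        let id = NodeId(self.nodes.len() as u32);
        self.nodes.push(Node { tag: tag.to_string(), parent, children: Vec::new() });
        if let Some(p) = parent {
            self.nodes[p.0 as usize].children.push(id);
        }
        id
    }

    /// Look a node up by ID; returns `None` for an unknown ID.
    pub fn get(&self, id: NodeId) -> Option<&Node> {
        self.nodes.get(id.0 as usize)
    }
}
```

Since `NodeId` is `Copy` and carries no lifetime, a downstream phase can store it in its own tables without borrowing from the arena.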

Phase-Based Mutation

The pipeline follows a strict phase model:

Parse phase — HTML parser builds DOM tree (mutable)
Style phase — Style engine reads DOM, produces StyledDocument (DOM is immutable)
Layout phase — Layout engine reads styled DOM, produces layout tree
Paint phase — Display list builder reads layout tree, produces draw commands
Render phase — Rasterizer reads display list, produces pixels

Each phase reads the previous phase's output and produces new, stable output.
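
One way to express the phase model in Rust is to give each phase a function that borrows the earlier output immutably and returns a new owned value. The sketch below is a toy under that assumption: `DomTree`, the field names, the whitespace "parser", and the fixed font size are all invented for illustration and do not reflect the real crates.

```rust
/// Simplified stand-ins for the real phase outputs.
pub struct DomTree {
    pub text_nodes: Vec<String>,
}
pub struct StyledDocument {
    pub font_px: Vec<u32>,
}
pub struct LayoutTree {
    pub line_heights: Vec<u32>,
}

/// Parse phase: the only phase that builds (mutates) the DOM.
/// Splitting on whitespace is a placeholder for real HTML parsing.
pub fn parse(source: &str) -> DomTree {
    let text_nodes = source.split_whitespace().map(str::to_string).collect();
    DomTree { text_nodes }
}

/// Style phase: takes `&DomTree`, so the borrow checker rules out DOM mutation here.
pub fn style(dom: &DomTree) -> StyledDocument {
    StyledDocument { font_px: dom.text_nodes.iter().map(|_| 16).collect() }
}

/// Layout phase: reads the DOM and styled output, produces new geometry.
pub fn layout(dom: &DomTree, styled: &StyledDocument) -> LayoutTree {
    debug_assert_eq!(dom.text_nodes.len(), styled.font_px.len());
    // Toy rule: line height is 1.25 times the font size.
    LayoutTree { line_heights: styled.font_px.iter().map(|px| px * 5 / 4).collect() }
}
```

Calling `layout(&dom, &style(&dom))` leaves `dom` untouched, so every later phase sees the same input.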

JavaScript Engine Architecture

┌─────────────────────────────────────────────────┐
│ js crate (facade)                               │
│   JsEngine: combines parser + VM                │
├─────────────────────────────────────────────────┤
│ js_vm crate                                     │
│   JsVm: runtime state machine                   │
│   bytecode::Compiler: AST → bytecode (Chunk)    │
│   bytecode_exec: bytecode execution loop         │
│   Environment: variable scoping                  │
│   JsValue: value representation                  │
│   HostEnvironment: host object binding           │
│   Runtime: builtins, coercion, property access   │
│   [deprecated] AST interpreter: statements.rs,  │
│     expressions/ (see note below)                │
├─────────────────────────────────────────────────┤
│ js_parser crate (parsing)                       │
│   JsParser: source → AST                         │
│   Tokenizer: source → tokens                     │
│   AST types: Program, Statement, Expression      │
└─────────────────────────────────────────────────┘

Bytecode pipeline — JsParser produces an AST, bytecode::Compiler compiles it to a Chunk (bytecode + constant pool), bytecode_exec runs the bytecode. The old AST-walking interpreter (execute_program, execute_stmt, eval_expr) is deprecated. It still exists because call_function (used by the bytecode executor for function invocations) delegates to execute_function_body → execute_stmt. Once function calls are fully compiled to bytecode, the AST interpreter can be removed.
Configurable limits — VmConfig with max_statements, max_call_depth
Host bindings — HostEnvironment trait for injecting browser APIs
Regex — ECMAScript-compatible via regress crate
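
The AST → Chunk → execution flow can be sketched with a toy stack machine. Everything below is an assumption made for illustration: the `Expr` and `Op` types, the layout of `Chunk`, and the `max_steps` parameter (a loose stand-in for a VmConfig-style limit) are not taken from js_vm.

```rust
/// Toy AST; the real `Expression` type in js_parser is far richer.
pub enum Expr {
    Num(f64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Op {
    Const(usize), // push constants[index]
    Add,
    Mul,
}

/// Bytecode plus constant pool.
#[derive(Debug, Default)]
pub struct Chunk {
    pub code: Vec<Op>,
    pub constants: Vec<f64>,
}

/// Compile in post-order so operands are on the stack before their operator.
pub fn compile(expr: &Expr, chunk: &mut Chunk) {
    match expr {
        Expr::Num(n) => {
            chunk.constants.push(*n);
            chunk.code.push(Op::Const(chunk.constants.len() - 1));
        }
        Expr::Add(l, r) => {
            compile(l, chunk);
            compile(r, chunk);
            chunk.code.push(Op::Add);
        }
        Expr::Mul(l, r) => {
            compile(l, chunk);
            compile(r, chunk);
            chunk.code.push(Op::Mul);
        }
    }
}

/// Run the chunk on a value stack, refusing to execute more than `max_steps` ops.
pub fn execute(chunk: &Chunk, max_steps: usize) -> Result<f64, String> {
    let mut stack: Vec<f64> = Vec::new();
    for (step, op) in chunk.code.iter().enumerate() {
        if step >= max_steps {
            return Err("step limit exceeded".to_string());
        }
        match *op {
            Op::Const(i) => stack.push(chunk.constants[i]),
            Op::Add | Op::Mul => {
                let r = stack.pop().ok_or("stack underflow")?;
                let l = stack.pop().ok_or("stack underflow")?;
                stack.push(if *op == Op::Add { l + r } else { l * r });
            }
        }
    }
    stack.pop().ok_or_else(|| "empty stack".to_string())
}
```

Compiling `(1 + 2) * 3` yields two constant pushes, an add, a third push, and a multiply. The executor never touches the AST again, which is the property that lets an AST-walking path be retired once every construct is compiled.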

Web API Bridge

The web_api crate connects the JS engine to the DOM:

DOM Host Objects — window, document exposed as JS host objects
Event System — EventTarget, EventListenerRegistry, DispatchResult
Scheduling — TaskQueue (setTimeout/setInterval), MicrotaskQueue (Promise callbacks)
Promises — PromiseRegistry for async operation tracking
Script Execution — WebApiFacade::execute_script() as the unified entry point
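
The interaction between the two queues can be sketched as follows, assuming the usual web-platform ordering in which all pending microtasks are drained after each task. Whether TaskQueue and MicrotaskQueue in web_api follow exactly this shape is an assumption; `EventLoop`, `Task`, and the string labels are hypothetical.

```rust
use std::collections::VecDeque;

/// A task modelled as a label plus the microtasks it queues while running.
/// A real queue would hold JS function values instead of labels.
pub struct Task {
    pub name: &'static str,
    pub queues_microtasks: Vec<&'static str>,
}

#[derive(Default)]
pub struct EventLoop {
    pub tasks: VecDeque<Task>,
    pub microtasks: VecDeque<&'static str>,
    /// Execution order, recorded for inspection.
    pub log: Vec<&'static str>,
}

impl EventLoop {
    /// Run one task, then drain all microtasks before the next task.
    /// Returns false when no task was available.
    pub fn run_one_task(&mut self) -> bool {
        let task = match self.tasks.pop_front() {
            Some(t) => t,
            None => return false,
        };
        self.log.push(task.name);
        self.microtasks.extend(task.queues_microtasks);
        while let Some(job) = self.microtasks.pop_front() {
            self.log.push(job);
        }
        true
    }
}
```

With a script task that queues a promise reaction followed by a timer task, the recorded order is script, promise reaction, timer.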

Threading Model

Main Thread (single-threaded):
  ├── JavaScript execution (bytecode VM)
  ├── DOM manipulation
  ├── Style computation
  ├── Layout
  ├── Display list generation
  ├── CPU rasterization
  └── Event dispatch

Worker Threads (I/O only):
  ├── Network requests (HTTP via ureq)
  └── Image decoding

The single-threaded model ensures determinism and simplifies reasoning about state. Worker threads are used only for operations that would block the main thread.
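
A minimal sketch of this split using only the standard library: workers perform blocking I/O and send results over a channel, and only the main thread applies them to browser state. The `Fetched` type, `fetch_all`, and the injected `fetch` function are assumptions for illustration; they are not the net crate's API, and no ureq calls are shown.

```rust
use std::sync::mpsc;
use std::thread;

/// Result a worker sends back; the payload shape is an illustrative assumption.
pub struct Fetched {
    pub url: String,
    pub bytes: Vec<u8>,
}

/// Spawn one worker per URL. `fetch` stands in for blocking I/O such as an
/// HTTP request or an image decode.
pub fn fetch_all(urls: Vec<String>, fetch: fn(&str) -> Vec<u8>) -> mpsc::Receiver<Fetched> {
    let (tx, rx) = mpsc::channel();
    for url in urls {
        let tx = tx.clone();
        thread::spawn(move || {
            let bytes = fetch(&url);
            // A send error only means the main thread dropped the receiver.
            let _ = tx.send(Fetched { url, bytes });
        });
    }
    // The original `tx` drops here, so `rx` ends once all workers finish.
    rx
}
```

Results may arrive in any order, so the main thread decides when and in what order to consume them; workers never touch the DOM or layout state.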

State Management

BrowserRuntime
  └── BrowsingContext
        ├── NavigationState: Loading → ReceivingData → Complete | Failed
        ├── Current URL
        └── WebApiFacade
              ├── JS Engine
              ├── DOM Document
              ├── Event Listeners
              ├── Task Queue
              └── Microtask Queue
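
The NavigationState transitions in the tree above can be read as a small state machine. The sketch below assumes that Failed is reachable from both Loading and ReceivingData and that Complete and Failed are terminal; the `NavEvent` enum and the `on` method are hypothetical names, not the browser_runtime API.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NavigationState {
    Loading,
    ReceivingData,
    Complete,
    Failed,
}

/// Hypothetical events that drive a navigation forward.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NavEvent {
    FirstByte,
    Finished,
    Error,
}

impl NavigationState {
    /// Returns the next state, or `None` if the event is not valid in this state.
    pub fn on(self, event: NavEvent) -> Option<NavigationState> {
        use NavigationState::*;
        match (self, event) {
            (Loading, NavEvent::FirstByte) => Some(ReceivingData),
            (ReceivingData, NavEvent::Finished) => Some(Complete),
            (Loading, NavEvent::Error) | (ReceivingData, NavEvent::Error) => Some(Failed),
            // Complete and Failed are terminal in this sketch.
            _ => None,
        }
    }
}
```

Returning `Option` keeps invalid transitions explicit, so a caller can log or assert on them instead of silently changing state.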

Application State (app_browser)

AppState — mutable browser state (current page, scroll position, input focus)
BrowserChrome — UI chrome overlay (URL bar, status)
HitTest — Click target resolution from display list
FocusOutline — Keyboard focus visualization
FormState — Form input handling
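
Click target resolution from a display list can be sketched as a reverse scan over painted boxes. This assumes items are stored back to front and each carries the ID of the node that produced it; the local `Rect`, `PaintedBox`, and `hit_test` definitions are illustrative and not the app_browser types.

```rust
/// Local stand-in for a rectangle type; the shared crate's `Rect` may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rect {
    pub x: f32,
    pub y: f32,
    pub w: f32,
    pub h: f32,
}

impl Rect {
    /// Half-open containment: the left and top edges are inside,
    /// the right and bottom edges are outside.
    pub fn contains(&self, px: f32, py: f32) -> bool {
        px >= self.x && px < self.x + self.w && py >= self.y && py < self.y + self.h
    }
}

/// One painted box tagged with the DOM node that produced it (hypothetical shape).
pub struct PaintedBox {
    pub node: u32,
    pub bounds: Rect,
}

/// Items are in paint order (back to front), so the last match is the topmost.
pub fn hit_test(items: &[PaintedBox], px: f32, py: f32) -> Option<u32> {
    items.iter().rev().find(|b| b.bounds.contains(px, py)).map(|b| b.node)
}
```

A click inside a nested box resolves to the inner node because it was painted later; a click outside every box yields `None`.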

Platform Abstraction

Platform-specific code is isolated in crates/platform/:

Windowing — winit for cross-platform window creation
Pixel presentation — softbuffer for CPU-rendered pixel blitting
Event mapping — OS events → WindowEvent enum
Text input — IME state management

The platform and graphics crates are the only ones allowed to use unsafe code.

Testing Architecture

Layer	Type	Location	Purpose
Unit	`#[cfg(test)] mod tests`	Inline in crate source	Private API testing
Integration	`tests/*.rs`	Root `tests/` directory	Cross-crate behavior
Golden	`tests/goldens.rs`	Golden fixtures	Layout/display list regression
JS Conformance	`tests/js262_harness.rs`	Test262 manifests	ECMAScript spec compliance
Web Platform	`tests/wpt_harness.rs`	WPT fixtures	CSS/HTML spec compliance

Safety Guarantees

unsafe forbidden globally — workspace lint unsafe_code = "forbid"
Exceptions: platform/ and graphics/ crates only (via per-crate lint override)
Enforced by CI — scripts/check_unsafe.sh audits unsafe usage
Dependency layering — scripts/check_deps.sh prevents architecture violations
License policy — deny.toml prevents license/advisory issues
