rust_browser/docs/browser_testing_strategy_high_level_to_detailed.md
2026-01-29 00:43:50 -05:00

Browser Testing Strategy: High-Level to Detailed

Purpose of this Document

This document defines the testing philosophy, structure, and concrete practices for a new desktop web browser with a custom JavaScript engine.

The intent is to:

  • Prevent correctness regressions
  • Catch crashes, undefined behavior, and security issues early
  • Enable confident refactoring and future performance work
  • Scale testing effort as features grow

Core Testing Principles

  1. Correctness before performance
  2. Automation over manual testing
  3. Determinism over flakiness
  4. Regression tests for every bug fixed
  5. Sanitizers and fuzzing as first-class tools

Testing Layers Overview

Testing is organized in layers, from smallest and fastest to largest and slowest:

  1. Unit tests
  2. Component and integration tests
  3. Conformance test suites
  4. Fuzzing and property-based testing
  5. Rendering regression tests
  6. Soak, stability, and performance tests

Each layer has a distinct purpose and failure signal.

JavaScript Engine Testing

Language Conformance

  • Test262 is the primary language conformance suite
  • Run in tiers:
    • Small shard per pull request
    • Larger shard nightly
    • Full or near-full runs periodically
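One way to carve Test262 into the tiers above is to assign each test file to a shard by hashing its path, so that a given file always lands in the same shard on every machine. The following is a minimal sketch under that assumption; the function names and the example paths are hypothetical and not part of any existing harness.

```rust
/// FNV-1a (64-bit). Hand-rolled because the standard library's default
/// hasher is not guaranteed to produce the same values across Rust releases,
/// and shard assignment should stay stable over time.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Returns the 0-based shard that a test file belongs to.
fn shard_of(test_path: &str, shard_count: u64) -> u64 {
    assert!(shard_count > 0, "shard_count must be non-zero");
    fnv1a64(test_path.as_bytes()) % shard_count
}

/// Keeps only the tests assigned to `shard_index` (hypothetical helper).
fn select_shard<'a>(paths: &[&'a str], shard_index: u64, shard_count: u64) -> Vec<&'a str> {
    paths
        .iter()
        .copied()
        .filter(|path| shard_of(path, shard_count) == shard_index)
        .collect()
}
```

With this scheme a pull request could run a single shard out of many, a nightly job several shards, and a periodic job all of them. Every file belongs to exactly one shard, so the union of all shards is the full suite.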

Differential Testing

  • Execute randomly generated JavaScript programs in:

    • The custom interpreter
    • A reference engine
  • Compare:

    • Output
    • Exception vs non-exception
    • Error class (TypeError, RangeError, etc.)
    • Numeric edge cases (NaN, -0, BigInt)

Any divergence from the reference engine flags a likely semantic bug, without anyone having to write an expected result by hand.
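The comparison step can be sketched as follows. This is an illustration only: the `Outcome` and `Verdict` types are hypothetical, and how each engine is invoked and its result captured is left out. The `same_number` helper shows why numeric edge cases need care: plain `==` on floats treats NaN as unequal to itself and treats +0 and -0 as equal, which is the opposite of what a differential check wants.

```rust
/// What one engine did with a generated program (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
enum Outcome {
    /// Ran to completion; `output` is everything the program printed.
    Completed { output: String },
    /// Threw; `error_class` is e.g. "TypeError" or "RangeError".
    Threw { error_class: String },
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Agree,
    OutputMismatch,
    CompletionMismatch,
    ErrorClassMismatch,
}

/// Compares the custom interpreter's outcome against the reference engine's.
fn compare(ours: &Outcome, reference: &Outcome) -> Verdict {
    match (ours, reference) {
        (Outcome::Completed { output: a }, Outcome::Completed { output: b }) => {
            if a == b { Verdict::Agree } else { Verdict::OutputMismatch }
        }
        (Outcome::Threw { error_class: a }, Outcome::Threw { error_class: b }) => {
            if a == b { Verdict::Agree } else { Verdict::ErrorClassMismatch }
        }
        // One engine threw and the other did not.
        _ => Verdict::CompletionMismatch,
    }
}

/// Number equality for differential checks: any NaN matches any NaN
/// (payload bits may differ), while +0 and -0 are told apart by bit pattern.
fn same_number(a: f64, b: f64) -> bool {
    (a.is_nan() && b.is_nan()) || a.to_bits() == b.to_bits()
}
```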

Fuzzing Targets (JS Engine)

  • Tokenizer and parser
  • Scope and name resolution
  • Bytecode or IR generation
  • GC interaction under allocation pressure

All fuzzing runs under sanitizers.

Web Platform & DOM Testing

Unit and Integration Tests

  • DOM tree manipulation
  • Event dispatch and propagation
  • WebIDL type conversions
  • Exception ordering and types
  • Promise and microtask semantics

Web Platform Tests (WPT)

  • Long-term conformance target

  • Introduced incrementally:

    • URL and encoding
    • DOM and events
    • HTML parsing
    • Fetch (basic)
    • CSS parsing and selectors
  • Known failures tracked explicitly

  • Policy: no new failures introduced
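The "known failures tracked explicitly" and "no new failures" rules amount to a set comparison between a checked-in expectation list and the failures of the current run. A minimal sketch, assuming failures are identified by test path; the type and function names are hypothetical:

```rust
use std::collections::BTreeSet;

/// Result of comparing a run against the tracked expectations.
#[derive(Debug, PartialEq)]
struct ExpectationReport {
    /// Failing now, but not in the known-failure list: blocks the merge.
    new_failures: Vec<String>,
    /// In the known-failure list, but not failing now: the list can shrink.
    /// (Assumes every listed test was actually executed in this run.)
    unexpected_passes: Vec<String>,
}

fn check_expectations(
    known_failures: &BTreeSet<String>,
    current_failures: &BTreeSet<String>,
) -> ExpectationReport {
    ExpectationReport {
        new_failures: current_failures.difference(known_failures).cloned().collect(),
        unexpected_passes: known_failures.difference(current_failures).cloned().collect(),
    }
}

/// The gate for the "no new failures" policy.
fn gate_passes(report: &ExpectationReport) -> bool {
    report.new_failures.is_empty()
}
```

Reporting unexpected passes alongside new failures keeps the expectation list from going stale as conformance improves.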

Rendering and Layout Testing

Structural Regression Tests

Preferred early strategy:

  • Dump layout tree metrics (box sizes and positions)
  • Dump normalized display lists
  • Compare text output in CI

Advantages:

  • Highly stable across platforms
  • Easy to review and debug

Pixel-Based Snapshot Tests

Used selectively for:

  • Stacking contexts and z-ordering
  • Transforms and opacity
  • Clipping and compositing

The pixel snapshot set is kept small and curated because rendered output varies across platforms.

Fuzzing Strategy (Browser-Wide)

High-Value Fuzz Targets

  • HTML tokenizer and parser
  • CSS tokenizer and parser
  • URL, header, and cookie parsing
  • Image and font decoders
  • Networking protocol state machines

Goals

  • No crashes
  • No hangs
  • No unbounded memory growth
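To make the "no crashes" goal concrete, here is a std-only smoke harness that feeds pseudo-random byte strings to a parser entry point and reports the first input that panics. It is a stand-in for illustration, not a coverage-guided fuzzer, and it does not detect hangs or memory growth. The `fuzz_smoke` name and the `fn(&[u8])` target signature are assumptions.

```rust
use std::panic;

/// Small deterministic xorshift generator, so a failing seed can be replayed.
struct XorShift64(u64);

impl XorShift64 {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Runs `target` on `iterations` random inputs of up to `max_len` bytes.
/// Returns the first input that made the target panic, if any.
fn fuzz_smoke(
    target: fn(&[u8]),
    seed: u64,
    iterations: usize,
    max_len: usize,
) -> Option<Vec<u8>> {
    // xorshift must not start from an all-zero state.
    let mut rng = XorShift64(seed.max(1));
    for _ in 0..iterations {
        let len = (rng.next() as usize) % (max_len + 1);
        let input: Vec<u8> = (0..len).map(|_| rng.next() as u8).collect();
        // A panic in the target is caught and turned into a reported input.
        if panic::catch_unwind(|| target(&input)).is_err() {
            return Some(input);
        }
    }
    None
}
```

The returned bytes can be saved as a minimal repro file and turned into a permanent regression test once the bug is fixed.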

Sanitizers and Hardening

Always-On in CI

  • AddressSanitizer (ASan)
  • UndefinedBehaviorSanitizer (UBSan)
  • LeakSanitizer (via ASan)

Conditional / Periodic

  • ThreadSanitizer (TSan), enabled once the browser makes significant use of threads

Merges are blocked on new sanitizer findings.

Determinism and Flake Control

A dedicated test mode provides:

  • Fake or controllable clocks
  • Deterministic task and microtask ordering
  • Deterministic RNG seeds
  • Explicit garbage collection triggers
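A controllable clock is typically introduced by having code depend on a small time abstraction rather than reading the system clock directly. The sketch below assumes a hypothetical `Clock` trait; `FakeClock` and `Timer` are illustrative names, not existing types in this project.

```rust
use std::cell::Cell;

/// Time source that production code depends on (hypothetical trait).
trait Clock {
    fn now_ms(&self) -> u64;
}

/// Test-mode clock: time only moves when the test says so.
struct FakeClock {
    now: Cell<u64>,
}

impl FakeClock {
    fn new(start_ms: u64) -> Self {
        FakeClock { now: Cell::new(start_ms) }
    }

    fn advance(&self, delta_ms: u64) {
        self.now.set(self.now.get() + delta_ms);
    }
}

impl Clock for FakeClock {
    fn now_ms(&self) -> u64 {
        self.now.get()
    }
}

/// Example consumer: a timer that is due once the clock reaches its deadline.
struct Timer {
    deadline_ms: u64,
}

impl Timer {
    fn is_due(&self, clock: &dyn Clock) -> bool {
        clock.now_ms() >= self.deadline_ms
    }
}
```

A test can then create `FakeClock::new(1000)`, check that a timer with a deadline of 1500 is not due, call `advance(500)`, and check that it now is, all without sleeping.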

Key test suites are re-run multiple times to measure flake rate.
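Measuring flake rate can be as simple as running the same test repeatedly and counting failures. A minimal sketch, assuming a test is modelled as a closure returning `true` on pass; `measure_flakes` and `FlakeReport` are hypothetical names:

```rust
/// Outcome of re-running one test several times.
struct FlakeReport {
    runs: u32,
    failures: u32,
}

impl FlakeReport {
    /// Fraction of runs that failed, in the range 0.0..=1.0.
    fn flake_rate(&self) -> f64 {
        if self.runs == 0 {
            0.0
        } else {
            f64::from(self.failures) / f64::from(self.runs)
        }
    }
}

/// Runs `test` exactly `runs` times and counts how often it fails.
fn measure_flakes<F: FnMut() -> bool>(runs: u32, mut test: F) -> FlakeReport {
    let mut failures = 0;
    for _ in 0..runs {
        if !test() {
            failures += 1;
        }
    }
    FlakeReport { runs, failures }
}
```

A rate strictly between 0 and 1 indicates nondeterminism; a rate of 1.0 is a plain failure rather than a flake.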

Stability and Soak Testing

Automated Browse Bots

  • Open and close tabs
  • Navigate, scroll, zoom
  • Execute JS-heavy pages

Metrics Collected

  • Crash frequency
  • Memory growth over time
  • CPU usage while idle
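For the memory-growth metric, one simple signal is the least-squares slope of memory usage against time over a soak run. The sketch below assumes samples of (seconds since start, resident bytes); the function name and the units are assumptions, and the threshold at which a slope counts as a leak would have to be chosen per workload.

```rust
/// Least-squares slope of memory over time, in bytes per second.
/// Returns None when there are fewer than two samples or when all
/// samples share the same timestamp.
fn growth_slope(samples: &[(f64, f64)]) -> Option<f64> {
    if samples.len() < 2 {
        return None;
    }
    let n = samples.len() as f64;
    let mean_t = samples.iter().map(|s| s.0).sum::<f64>() / n;
    let mean_m = samples.iter().map(|s| s.1).sum::<f64>() / n;

    let mut covariance = 0.0;
    let mut variance = 0.0;
    for &(t, m) in samples {
        covariance += (t - mean_t) * (m - mean_m);
        variance += (t - mean_t).powi(2);
    }

    if variance == 0.0 {
        None
    } else {
        Some(covariance / variance)
    }
}
```

A slope that stays near zero after warm-up suggests stable memory use; a persistently positive slope is worth investigating.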

Continuous Integration Structure

Per Pull Request

  1. Build + unit tests with ASan/UBSan
  2. JS engine Test262 smoke shard
  3. JS parser fuzzer (short run)
  4. Determinism smoke tests

Nightly / Periodic

  1. Larger Test262 runs
  2. Differential JS fuzzing
  3. Heap and GC stress tests
  4. Soak tests under sanitizers

Regression Discipline

  • Every fixed bug adds a permanent test
  • Security and crash bugs receive minimal repro tests
  • A failing test is investigated and understood before the code or the test is changed to make it pass

Test Placement Conventions (Rust Workspace)

This project is a Rust workspace with multiple crates. Tests are placed according to the following conventions:

Unit Tests → Inline in Crate Source Files

crates/<crate>/src/lib.rs (or other .rs files)
└── #[cfg(test)]
    mod tests {
        // Unit tests go here
    }
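For instance, a file following this layout might look like the sketch below. The `is_ident_start` helper is hypothetical and deliberately simplified to ASCII; it exists only to show that the inline test module can reach a private function through `use super::*`.

```rust
// Hypothetical contents of a crate source file.

/// Private helper: not exported from the crate, but visible to the
/// inline test module below. ASCII-only simplification for illustration.
fn is_ident_start(c: char) -> bool {
    c.is_ascii_alphabetic() || c == '_' || c == '$'
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn accepts_letters_underscore_and_dollar() {
        assert!(is_ident_start('a'));
        assert!(is_ident_start('Z'));
        assert!(is_ident_start('_'));
        assert!(is_ident_start('$'));
    }

    #[test]
    fn rejects_digits_and_punctuation() {
        assert!(!is_ident_start('7'));
        assert!(!is_ident_start('-'));
    }
}
```

Because the module is gated by `#[cfg(test)]`, it is compiled only for `cargo test` and adds nothing to release builds.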

Use for:

  • Testing internal/private functions
  • Edge cases for parsers, tokenizers, etc.
  • Module-specific behavior
  • Fast, isolated tests

Integration Tests → Root tests/ Directory

tests/
├── goldens.rs         # End-to-end rendering tests
└── main_cli_tests.rs  # CLI binary tests

Use for:

  • Testing the compiled CLI binary
  • Golden/snapshot tests for full pipeline
  • Cross-crate integration requiring the full workspace

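Important: Any new file in tests/ requires a [[test]] entry in the root Cargo.toml. Note that Cargo's default auto-discovery already picks up top-level .rs files in tests/; the explicit entry matters when auto-discovery is disabled (autotests = false) or a target needs custom settings.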
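As a sketch, an entry for one of the files listed above could look like the following fragment. The `name` value is an assumption chosen to match the file name; the project's actual manifest may use different names or additional keys.

```toml
# Root Cargo.toml (fragment; values are illustrative)
[[test]]
name = "goldens"
path = "tests/goldens.rs"
```

With such an entry in place, `cargo test --test goldens` runs only that integration test target.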

Why This Matters

  • Inline tests run faster and can access private APIs
  • Inline tests stay close to the code they verify
  • Integration tests in tests/ require full compilation of all dependencies
  • Misplaced tests create confusion and maintenance burden

Evolving the Strategy

This testing strategy is expected to grow alongside the browser. As performance features (JIT, compositor threads, GPU execution) are introduced, new testing layers will be added, but existing ones remain mandatory.

Correctness and stability are treated as compounding investments.