Browser Testing Strategy – High Level to Detailed
Purpose of this Document
This document defines the testing philosophy, structure, and concrete practices for a new desktop web browser with a custom JavaScript engine.
The intent is to:
- Prevent correctness regressions
- Catch crashes, undefined behavior, and security issues early
- Enable confident refactoring and future performance work
- Scale testing effort as features grow
Core Testing Principles
- Correctness before performance
- Automation over manual testing
- Determinism over flakiness
- Regression tests for every bug fixed
- Sanitizers and fuzzing as first-class tools
Testing Layers Overview
Testing is organized in layers, from smallest and fastest to largest and slowest:
- Unit tests
- Component and integration tests
- Conformance test suites
- Fuzzing and property-based testing
- Rendering regression tests
- Soak, stability, and performance tests
Each layer has a distinct purpose and failure signal.
JavaScript Engine Testing
Language Conformance
- Test262 is the primary language conformance suite
- Run in tiers:
- Small shard per pull request
- Larger shard nightly
- Full or near-full runs periodically
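The tiered shards above need a stable way to decide which tests belong to which shard, so the same subset runs on every machine. A minimal sketch, using an inline FNV-1a hash (the function names and shard count are illustrative, not the project's actual harness):

```rust
/// FNV-1a: a small, stable hash so shard assignment never changes
/// across Rust versions or platforms.
fn fnv1a(s: &str) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in s.bytes() {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Deterministically assign a test file to one of `shard_count` shards.
fn shard_for(test_path: &str, shard_count: u64) -> u64 {
    fnv1a(test_path) % shard_count
}

/// Select the subset of tests belonging to the given shard.
fn select_shard<'a>(tests: &'a [&'a str], shard: u64, shard_count: u64) -> Vec<&'a str> {
    tests
        .iter()
        .copied()
        .filter(|t| shard_for(t, shard_count) == shard)
        .collect()
}

fn main() {
    let tests = [
        "test262/language/types/number/S8.5_A1.js",
        "test262/built-ins/Array/prototype/map/callback.js",
        "test262/language/statements/for-of/iterator.js",
    ];
    // A per-PR run might execute only one shard of, say, 16.
    let smoke = select_shard(&tests, 0, 16);
    println!("shard 0 runs {} of {} tests", smoke.len(), tests.len());
}
```

Hashing the path (rather than slicing a sorted list) keeps shard membership stable as tests are added or removed.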
Differential Testing
- Execute randomly generated JavaScript programs in:
- The custom interpreter
- A reference engine
- Compare:
- Output
- Exception vs non-exception
- Error class (TypeError, RangeError, etc.)
- Numeric edge cases (NaN, -0, BigInt)
This catches subtle semantic bugs quickly.
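The comparison step can be sketched as follows. The `Outcome` type and the stand-in engines are hypothetical; a real harness would invoke the custom interpreter and the reference engine as subprocesses:

```rust
/// Coarse-grained outcome of evaluating a program, used for comparison.
#[derive(Debug, PartialEq)]
enum Outcome {
    Value(String), // normal completion, printed output
    Error(String), // error class name, e.g. "TypeError"
}

/// Run one generated program through both engines; report any divergence.
fn differ<F, G>(src: &str, ours: F, reference: G) -> Option<(Outcome, Outcome)>
where
    F: Fn(&str) -> Outcome,
    G: Fn(&str) -> Outcome,
{
    let (a, b) = (ours(src), reference(src));
    if a == b { None } else { Some((a, b)) }
}

fn main() {
    // Stand-in engines that disagree on -0 formatting — the kind of
    // numeric edge case differential testing is meant to surface.
    let ours = |_: &str| Outcome::Value("0".into());
    let reference = |_: &str| Outcome::Value("-0".into());
    if let Some((a, b)) = differ("print(-0)", ours, reference) {
        println!("divergence: ours={:?} reference={:?}", a, b);
    }

    // Matching error classes count as agreement, not divergence.
    let throws = |_: &str| Outcome::Error("TypeError".into());
    assert!(differ("null.x", &throws, &throws).is_none());
}
```

Comparing a coarse outcome (value, or error class) rather than full output avoids false positives from benign formatting differences between engines.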
Fuzzing Targets (JS Engine)
- Tokenizer and parser
- Scope and name resolution
- Bytecode or IR generation
- GC interaction under allocation pressure
All fuzzing runs under sanitizers.
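The shape of a tokenizer fuzz target can be sketched like this. In a real setup the body of `fuzz_one` would live inside a `cargo-fuzz` / libFuzzer `fuzz_target!` macro; here a seeded xorshift generator stands in for the fuzzer's mutation engine, and `tokenize` is a trivial placeholder for the engine's actual entry point:

```rust
use std::panic;

/// Stand-in for the real tokenizer entry point (hypothetical).
fn tokenize(input: &str) -> Vec<String> {
    input.split_whitespace().map(str::to_string).collect()
}

/// One fuzz iteration: arbitrary bytes in, no panic out.
fn fuzz_one(data: &[u8]) {
    // Parsers must tolerate invalid UTF-8 at the boundary.
    let input = String::from_utf8_lossy(data);
    let _ = tokenize(&input);
}

/// Tiny deterministic driver standing in for the fuzzer itself.
fn main() {
    let mut state: u64 = 0x9e3779b97f4a7c15; // fixed seed for reproducibility
    for _ in 0..1_000 {
        let len = (state % 64) as usize;
        let bytes: Vec<u8> = (0..len)
            .map(|_| {
                state ^= state << 13;
                state ^= state >> 7;
                state ^= state << 17;
                (state & 0xff) as u8
            })
            .collect();
        let ok = panic::catch_unwind(|| fuzz_one(&bytes)).is_ok();
        assert!(ok, "tokenizer panicked on input {:?}", bytes);
    }
    println!("1000 iterations, no panics");
}
```

The "no crashes" goal translates directly into the panic check; hang and memory-growth goals are enforced by the fuzzer's timeout and RSS limits rather than in the target itself.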
Web Platform & DOM Testing
Unit and Integration Tests
- DOM tree manipulation
- Event dispatch and propagation
- WebIDL type conversions
- Exception ordering and types
- Promise and microtask semantics
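As an illustration of the DOM tree manipulation tests above, a unit test can pin down edge-case behavior on a node type. The `Node` here is a minimal hypothetical stand-in, not the project's real DOM type:

```rust
/// Minimal stand-in DOM node (hypothetical; the real type lives in a DOM crate).
#[derive(Default)]
struct Node {
    children: Vec<Node>,
}

impl Node {
    fn append_child(&mut self, child: Node) {
        self.children.push(child);
    }

    /// Out-of-range removal fails cleanly instead of panicking —
    /// exactly the kind of edge case these unit tests pin down.
    fn remove_child(&mut self, index: usize) -> Option<Node> {
        if index < self.children.len() {
            Some(self.children.remove(index))
        } else {
            None
        }
    }
}

fn main() {
    let mut parent = Node::default();
    parent.append_child(Node::default());
    parent.append_child(Node::default());
    assert_eq!(parent.children.len(), 2);

    assert!(parent.remove_child(5).is_none());
    assert!(parent.remove_child(0).is_some());
    assert_eq!(parent.children.len(), 1);
    println!("ok");
}
```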
Web Platform Tests (WPT)
- Long-term conformance target
- Introduced incrementally:
- URL and encoding
- DOM and events
- HTML parsing
- Fetch (basic)
- CSS parsing and selectors
- Known failures tracked explicitly
- Policy: no new failures introduced
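Enforcing the no-new-failures policy reduces to a set difference between the current failure list and the checked-in baseline. A sketch (the test paths are hypothetical examples):

```rust
use std::collections::HashSet;

/// Return failures in `current` that are not in the checked-in `baseline`.
/// CI fails the build if this list is non-empty.
fn new_failures<'a>(baseline: &[&str], current: &'a [&'a str]) -> Vec<&'a str> {
    let known: HashSet<&str> = baseline.iter().copied().collect();
    current
        .iter()
        .copied()
        .filter(|t| !known.contains(t))
        .collect()
}

fn main() {
    // Hypothetical expectation-file contents.
    let baseline = [
        "dom/nodes/Node-appendChild.html",
        "url/url-constructor.any.js",
    ];
    let current = [
        "url/url-constructor.any.js",
        "fetch/api/basic/request.any.js",
    ];
    let fresh = new_failures(&baseline, &current);
    println!("new failures: {:?}", fresh);
}
```

Note the asymmetry: a test that starts passing is a baseline update, not a failure, so only the `current` minus `baseline` direction blocks merges.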
Rendering and Layout Testing
Structural Regression Tests
Preferred early strategy:
- Dump layout tree metrics (box sizes and positions)
- Dump normalized display lists
- Compare text output in CI
Advantages:
- Highly stable across platforms
- Easy to review and debug
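The dump-and-compare approach can be sketched as a recursive text serializer over the layout tree. The `LayoutBox` fields here are a hypothetical minimum; the real tree carries far more state:

```rust
/// Minimal layout box (hypothetical); the real tree has many more fields.
struct LayoutBox {
    name: &'static str,
    x: f32,
    y: f32,
    w: f32,
    h: f32,
    children: Vec<LayoutBox>,
}

/// Serialize the tree into a stable, line-oriented text form that can be
/// diffed against a checked-in golden file in CI.
fn dump(node: &LayoutBox, depth: usize, out: &mut String) {
    out.push_str(&format!(
        "{}{} at ({},{}) size {}x{}\n",
        "  ".repeat(depth),
        node.name, node.x, node.y, node.w, node.h
    ));
    for child in &node.children {
        dump(child, depth + 1, out);
    }
}

fn main() {
    let tree = LayoutBox {
        name: "body", x: 0.0, y: 0.0, w: 800.0, h: 600.0,
        children: vec![LayoutBox {
            name: "div", x: 8.0, y: 8.0, w: 784.0, h: 100.0,
            children: vec![],
        }],
    };
    let mut out = String::new();
    dump(&tree, 0, &mut out);
    print!("{out}");
}
```

Because the output is plain text, a regression shows up as an ordinary line diff, which is what makes these tests easy to review and stable across platforms.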
Pixel-Based Snapshot Tests
Used selectively for:
- Stacking contexts and z-ordering
- Transforms and opacity
- Clipping and compositing
Kept small and curated due to cross-platform variance.
Fuzzing Strategy (Browser-Wide)
High-Value Fuzz Targets
- HTML tokenizer and parser
- CSS tokenizer and parser
- URL, header, and cookie parsing
- Image and font decoders
- Networking protocol state machines
Goals
- No crashes
- No hangs
- No unbounded memory growth
Sanitizers and Hardening
Always-On in CI
- AddressSanitizer (ASan)
- UndefinedBehaviorSanitizer (UBSan)
- LeakSanitizer (via ASan)
Conditional / Periodic
- ThreadSanitizer (TSan) once threading increases
Merges are blocked on new sanitizer findings.
Determinism and Flake Control
A dedicated test mode provides:
- Fake or controllable clocks
- Deterministic task and microtask ordering
- Deterministic RNG seeds
- Explicit garbage collection triggers
Key test suites are re-run multiple times to measure flake rate.
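A fake clock, the first item above, can be sketched like this. The type and its injection point are hypothetical; the real browser would hide it behind a clock abstraction shared with the production time source:

```rust
use std::cell::Cell;
use std::time::Duration;

/// A clock that tests advance explicitly, so time-dependent code
/// (timers, animations, timeouts) becomes fully deterministic.
struct FakeClock {
    now: Cell<Duration>,
}

impl FakeClock {
    fn new() -> Self {
        FakeClock { now: Cell::new(Duration::ZERO) }
    }

    fn now(&self) -> Duration {
        self.now.get()
    }

    fn advance(&self, by: Duration) {
        self.now.set(self.now.get() + by);
    }
}

fn main() {
    let clock = FakeClock::new();
    let deadline = Duration::from_millis(100);

    // A timer armed for 100 ms has not fired yet...
    assert!(clock.now() < deadline);

    // ...until the test advances time past the deadline explicitly.
    clock.advance(Duration::from_millis(150));
    assert!(clock.now() >= deadline);
    println!("timer fired at {:?}", clock.now());
}
```

The same injection pattern extends to the other items: a seeded RNG, a single-threaded task queue with fixed ordering, and an explicit GC trigger all replace their nondeterministic counterparts in test mode.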
Stability and Soak Testing
Automated Browse Bots
- Open and close tabs
- Navigate, scroll, zoom
- Execute JS-heavy pages
Metrics Collected
- Crash frequency
- Memory growth over time
- CPU usage while idle
Continuous Integration Structure
Per Pull Request
- Build + unit tests with ASan/UBSan
- JS engine Test262 smoke shard
- JS parser fuzzer (short run)
- Determinism smoke tests
Nightly / Periodic
- Larger Test262 runs
- Differential JS fuzzing
- Heap and GC stress tests
- Soak tests under sanitizers
Regression Discipline
- Every fixed bug adds a permanent test
- Security and crash bugs receive minimal repro tests
- Test failures are investigated before code changes, not after
Test Placement Conventions (Rust Workspace)
This project is a Rust workspace with multiple crates. Tests must be placed correctly:
Unit Tests → Inline in Crate Source Files
In crates/<crate>/src/lib.rs (or other .rs files):

#[cfg(test)]
mod tests {
    // Unit tests go here
}
Use for:
- Testing internal/private functions
- Edge cases for parsers, tokenizers, etc.
- Module-specific behavior
- Fast, isolated tests
Integration Tests → Root tests/ Directory
tests/
├── goldens.rs # End-to-end rendering tests
└── main_cli_tests.rs # CLI binary tests
Use for:
- Testing the compiled CLI binary
- Golden/snapshot tests for full pipeline
- Cross-crate integration requiring the full workspace
Important: Any new files in tests/ require a [[test]] entry in the root Cargo.toml.
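As a sketch, registering an integration test file in the root Cargo.toml looks like this (assuming test auto-discovery is disabled for the root package; names and paths follow the layout above):

```toml
# Root Cargo.toml — one entry per file in tests/
[[test]]
name = "goldens"
path = "tests/goldens.rs"
```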
Why This Matters
- Inline tests run faster and can access private APIs
- Inline tests stay close to the code they verify
- Integration tests in tests/ require full compilation of all dependencies
- Misplaced tests create confusion and maintenance burden
Evolving the Strategy
This testing strategy is expected to grow alongside the browser. As performance features (JIT, compositor threads, GPU execution) are introduced, new testing layers will be added, but existing ones remain mandatory.
Correctness and stability are treated as compounding investments.