# WPT Known-Fail Test Analysis

Last updated: 2026-02-25

## Current State

- **1,287 pass**, **1,626 known_fail**, **1 skip** (2,914 total)
- All known_fail tests are reftests that fail pixel comparison — they require CSS features or layout modes the engine doesn't yet support
- Test suite runs in ~12 seconds with parallel execution

## Completed Fixes

### Fix 1: Node ID Normalization in Reftest Comparison (Done)

In reftests, the test HTML and reference HTML have different `<head>` content (different numbers of `<link>`, `<meta>`, `<title>` elements), causing DOM node IDs to differ. The layout trees are structurally identical in content/dimensions, but `node=#23` vs `node=#15` caused string comparison to fail.

**Fix:** `tests/wpt_harness/runner.rs` — `normalize_node_ids()` replaces `node=#N` with sequential IDs based on order of appearance before comparing layout and display list dumps.
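The renumbering described above can be sketched roughly as follows. This is an illustration only: the function name comes from the text, but the signature, the string-scanning approach, and the choice to start numbering at 1 are assumptions, not the harness's actual code.

```rust
use std::collections::HashMap;

/// Rewrites every `node=#N` token so that IDs are numbered 1, 2, 3, ...
/// in order of first appearance. Hypothetical sketch; the real
/// `normalize_node_ids()` in the harness may differ.
pub fn normalize_node_ids(dump: &str) -> String {
    const MARKER: &str = "node=#";
    let mut seen: HashMap<String, usize> = HashMap::new();
    let mut out = String::with_capacity(dump.len());
    let mut rest = dump;

    while let Some(pos) = rest.find(MARKER) {
        // Copy everything up to and including the marker unchanged.
        let after_marker = pos + MARKER.len();
        out.push_str(&rest[..after_marker]);
        rest = &rest[after_marker..];

        // The original ID is the run of ASCII digits after the marker.
        let digits_len = rest.bytes().take_while(|b| b.is_ascii_digit()).count();
        if digits_len == 0 {
            continue; // Not a real ID; leave the text as-is.
        }
        let original = &rest[..digits_len];

        // Reuse the replacement for a repeated ID, otherwise assign the next one.
        let next_id = seen.len() + 1;
        let id = *seen.entry(original.to_string()).or_insert(next_id);
        out.push_str(&id.to_string());
        rest = &rest[digits_len..];
    }
    out.push_str(rest);
    out
}
```

With this shape, two dumps that differ only in their absolute node IDs normalize to the same string, so a plain string comparison is enough afterwards.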

### Fix 2: Strip CDATA Markers from Style Text (Done)

Many WPT reference HTML files (originally XHTML) wrap CSS in `<![CDATA[...]]>` inside `<style>` tags. The HTML parser extracted the raw text, CDATA markers included, so the CSS parser received `<![CDATA[ div { color: red; } ]]>` and failed to parse any rules.

**Fix:** `crates/style/src/context.rs` — strip `<![CDATA[` and `]]>` from CSS text in `extract_stylesheets()` before parsing.
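A minimal sketch of the stripping step, assuming it is a plain substring removal applied to the raw `<style>` text. The helper name `strip_cdata_markers` is hypothetical; the source only says the stripping happens inside `extract_stylesheets()`.

```rust
/// Removes XHTML-style CDATA markers from raw `<style>` text so the CSS
/// parser only sees the rules. Hypothetical helper, not the engine's code.
pub fn strip_cdata_markers(css: &str) -> String {
    // Both markers are removed wherever they occur. A `]]>` inside a CSS
    // string literal would also be removed; this sketch accepts that.
    css.replace("<![CDATA[", "").replace("]]>", "")
}
```

Text without markers passes through unchanged, so the helper is safe to apply to every stylesheet.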

### Fix 3: CSS Unit Conversion — cm, mm, in, pt, pc (Done)

The CSS parser only recognized `px`, `em`, `rem`, `%`. Other units (`cm`, `mm`, `in`, `pt`, `pc`) fell through to a default of `Px`, treating `2.54cm` as `2.54px` instead of `96px`.

**Fix:**
1. `crates/css/src/types.rs` — Added `Cm`, `Mm`, `In`, `Pt`, `Pc` variants with `to_px()` conversions
2. `crates/css/src/parser.rs` — Added unit string matching in dimension parsing and grid template parsing
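The conversions follow from the CSS definition of 1in = 96px = 2.54cm = 72pt = 6pc. The sketch below shows one way the variants and `to_px()` could look; the type name `AbsoluteLength` and the parser function are assumptions for illustration, and the parser deliberately returns `None` for an unknown unit instead of falling back to `Px`.

```rust
/// Absolute CSS lengths. Variant names mirror those listed in the fix;
/// the enum name and layout are assumed.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum AbsoluteLength {
    Px(f32),
    Cm(f32),
    Mm(f32),
    In(f32),
    Pt(f32),
    Pc(f32),
}

impl AbsoluteLength {
    /// Converts to CSS pixels using 1in = 96px.
    pub fn to_px(self) -> f32 {
        match self {
            AbsoluteLength::Px(v) => v,
            AbsoluteLength::In(v) => v * 96.0,
            AbsoluteLength::Cm(v) => v * 96.0 / 2.54,
            AbsoluteLength::Mm(v) => v * 96.0 / 25.4,
            AbsoluteLength::Pt(v) => v * 96.0 / 72.0,
            AbsoluteLength::Pc(v) => v * 16.0,
        }
    }
}

/// Parses strings such as `2.54cm`. Unitless numbers, scientific notation,
/// and relative units are out of scope for this sketch and yield `None`.
pub fn parse_absolute_length(input: &str) -> Option<AbsoluteLength> {
    let input = input.trim();
    let split = input.find(|c: char| c.is_ascii_alphabetic())?;
    let (number, unit) = input.split_at(split);
    let value: f32 = number.parse().ok()?;
    match unit.to_ascii_lowercase().as_str() {
        "px" => Some(AbsoluteLength::Px(value)),
        "cm" => Some(AbsoluteLength::Cm(value)),
        "mm" => Some(AbsoluteLength::Mm(value)),
        "in" => Some(AbsoluteLength::In(value)),
        "pt" => Some(AbsoluteLength::Pt(value)),
        "pc" => Some(AbsoluteLength::Pc(value)),
        _ => None, // Unknown unit: report it rather than guess `Px`.
    }
}
```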

### Results

Fixes 1–3 combined promoted **120 tests** from known_fail to pass (36 → 156 total).

The original estimates (~485 cumulative) were too optimistic — most tests affected by CDATA or unit issues also have other layout differences that prevent passing. The fixes were necessary prerequisites but not sufficient alone for those tests.

### Fix 4: `display: contents` + Implicit `<head>` Insertion (Done)

Two changes combined:

1. **`display: contents`** — Added `Display::Contents` variant. Elements with this value generate no box; their children are promoted into the parent's layout. Implemented via `build_children_into()` in `crates/layout/src/engine/mod.rs`.

2. **Implicit `<head>` insertion** — The HTML parser now creates an implicit `<head>` when encountering head-only elements (`title`, `style`, `link`, `meta`, `script`) without an existing `<head>`. Previously, `<title>` text was rendered visibly in the body for documents lacking explicit `<head>` tags, causing reftest mismatches since test and reference files have different titles. This was the larger win.

**Fix:**
1. `crates/style/src/types.rs` — Added `Display::Contents` variant and keyword parsing
2. `crates/layout/src/engine/mod.rs` — `build_children_into()` flattens display:contents children
3. `crates/html/src/lib.rs` — Implicit `<head>` creation and proper head-closing on non-head elements
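The flattening behaviour of `display: contents` can be illustrated on a toy tree. The function name `build_children_into` is taken from the text above, but its signature and the `StyledNode` and `LayoutBox` types here are simplified assumptions, not the engine's real data structures.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Display {
    Block,
    Contents,
}

/// Simplified styled DOM node (hypothetical type).
pub struct StyledNode {
    pub name: &'static str,
    pub display: Display,
    pub children: Vec<StyledNode>,
}

/// Simplified layout box (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct LayoutBox {
    pub name: &'static str,
    pub children: Vec<LayoutBox>,
}

/// Appends boxes for `children` to `parent`. A `display: contents` node
/// produces no box of its own; its children are built directly into `parent`.
pub fn build_children_into(parent: &mut LayoutBox, children: &[StyledNode]) {
    for child in children {
        if child.display == Display::Contents {
            // Skip the element's own box and promote its children.
            build_children_into(parent, &child.children);
        } else {
            let mut layout_box = LayoutBox { name: child.name, children: Vec::new() };
            build_children_into(&mut layout_box, &child.children);
            parent.children.push(layout_box);
        }
    }
}
```

Because the recursion reuses the same `parent`, nested `display: contents` wrappers collapse as well, and document order of the promoted children is preserved.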

**Results:** 53 tests promoted, 5 false-pass tests demoted (were only passing because both sides had content hidden in `<head>`). Net: **+48 tests** (156 → 204 total).

### Fix 5: Pixel-Based Reftest Comparison + Parallel Test Execution (Done)

Most WPT reftests use different CSS techniques (borders vs backgrounds, etc.) to achieve the same visual result, making layout-tree text comparison fundamentally unable to match. The fix adds pixel-based comparison as a fallback.

**Changes:**
1. `tests/wpt_harness/runner.rs` — Added `rasterize_html()` and `compare_pixels()` functions. Reftests now try layout-tree comparison first (fast path), then fall back to pixel comparison by rasterizing both test and reference HTML to 800x600 pixel buffers and comparing per-pixel with a channel tolerance of 2.
2. `tests/wpt_harness.rs` — Parallelized test execution using `std::thread::scope`, with progress reporting every 100 tests. Skips artifact writing for known_fail tests to reduce I/O.
3. `crates/style/src/context.rs` — Added CDATA stripping to `extract_stylesheet_sources()` (was already in `extract_stylesheets()` but missing from the Pipeline code path).
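The per-pixel comparison in change 1 might look something like the sketch below. It assumes tightly packed RGBA8 buffers and that "channel tolerance of 2" means each channel may differ by at most 2; the name `compare_pixels` comes from the text, while the signature and the helper `count_mismatched_pixels` are hypothetical.

```rust
/// Counts pixels where any RGBA channel differs by more than `tolerance`.
/// Returns `None` if the buffers differ in size or are not whole RGBA pixels.
pub fn count_mismatched_pixels(a: &[u8], b: &[u8], tolerance: u8) -> Option<usize> {
    if a.len() != b.len() || a.len() % 4 != 0 {
        return None;
    }
    let mismatched = a
        .chunks_exact(4)
        .zip(b.chunks_exact(4))
        .filter(|(pa, pb)| {
            pa.iter()
                .zip(pb.iter())
                .any(|(&x, &y)| x.abs_diff(y) > tolerance)
        })
        .count();
    Some(mismatched)
}

/// Two renderings match when no pixel exceeds a channel tolerance of 2.
pub fn compare_pixels(test: &[u8], reference: &[u8]) -> bool {
    count_mismatched_pixels(test, reference, 2) == Some(0)
}
```

Returning the mismatch count separately from the pass/fail decision makes it easy to log how far off a failing test is, which can help when triaging near-misses.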

**Results:** 955 tests promoted from known_fail to pass (204 → 1,159 total). Test suite runs in ~11 seconds with parallel execution.
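For reference, parallel execution with `std::thread::scope` and progress reporting every 100 tests could be structured along these lines. The work-queue design (a shared atomic index), the function name `run_parallel`, and its parameters are assumptions made for this sketch; the harness may split work differently.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Runs `run_one` over all tests on `workers` scoped threads and returns
/// the number of tests that passed. Hypothetical sketch.
pub fn run_parallel<T, F>(tests: &[T], workers: usize, run_one: F) -> usize
where
    T: Sync,
    F: Fn(&T) -> bool + Sync,
{
    let next = AtomicUsize::new(0); // index of the next unclaimed test
    let done = AtomicUsize::new(0); // tests finished so far
    let passed = AtomicUsize::new(0);

    thread::scope(|scope| {
        for _ in 0..workers.max(1) {
            scope.spawn(|| loop {
                // Claim the next test; stop when the list is exhausted.
                let index = next.fetch_add(1, Ordering::Relaxed);
                let Some(test) = tests.get(index) else { break };

                if run_one(test) {
                    passed.fetch_add(1, Ordering::Relaxed);
                }
                let finished = done.fetch_add(1, Ordering::Relaxed) + 1;
                if finished % 100 == 0 {
                    eprintln!("{finished}/{} tests run", tests.len());
                }
            });
        }
    });

    passed.load(Ordering::Relaxed)
}
```

Scoped threads let the workers borrow the test list and counters directly, without `Arc` or cloning, because the scope guarantees every thread has joined before the function returns.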

### Fix 6: Incremental Engine Improvements (Done — Feb 14 → Feb 25)

Multiple engine improvements collectively promoted **128 tests** (1,159 → 1,287). Key changes:

- **Canvas background propagation** (CSS 2.1 §14.2) — body background paints at viewport level
- **Border shorthand fix** — omitted sub-properties now properly reset
- **Extended border styles** — `inset`, `outset`, `groove`, `ridge`, `double`, `hidden`
- **Table cell sizing** — height as minimum (CSS 2.1 §17.5.3), column width overflow fix
- **Linear gradients** — `linear-gradient()` through the full rendering pipeline
- **Background-position/repeat** — CSS sprite support
- **Float sizing fixes** — intrinsic aspect ratio, shrink-to-fit width for block children
- **Block-in-inline splitting** (CSS 2.1 §9.2.1.1) — anonymous block generation
- **Flex-column fix** — `min-height` with `flex-grow` distribution
- **CSS cascade ordering** — per CSS Cascading Level 4
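As an illustration of the border shorthand item in the list above, the sketch below always starts from the initial values, so any sub-property the shorthand omits is reset rather than inherited from an earlier declaration. The `Border` type, the `px`-only width parsing, and the function name are assumptions; the engine's actual parser is certainly more complete.

```rust
/// Specified border values for one side (hypothetical, simplified type).
#[derive(Debug, Clone, PartialEq)]
pub struct Border {
    pub width_px: f32,
    pub style: String,
    pub color: String,
}

impl Border {
    /// CSS initial values: `medium` width (taken as 3px), style `none`,
    /// color `currentcolor`.
    pub fn initial() -> Self {
        Border {
            width_px: 3.0,
            style: "none".to_string(),
            color: "currentcolor".to_string(),
        }
    }
}

const STYLES: [&str; 10] = [
    "none", "hidden", "dotted", "dashed", "solid",
    "double", "groove", "ridge", "inset", "outset",
];

/// Applies a `border` shorthand value such as `2px solid red`.
/// Only `px` widths are recognized; any other token is treated as a color.
pub fn apply_border_shorthand(value: &str) -> Border {
    // Starting from the initial values is what resets omitted sub-properties.
    let mut border = Border::initial();
    for token in value.split_whitespace() {
        if let Some(px) = token.strip_suffix("px").and_then(|n| n.parse::<f32>().ok()) {
            border.width_px = px;
        } else if STYLES.contains(&token) {
            border.style = token.to_string();
        } else {
            border.color = token.to_string();
        }
    }
    border
}
```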

## Remaining Known-Fail Analysis (1,626 tests)

### Breakdown by CSS Specification Area

| Category | Count | Key Missing Features |
|----------|------:|------|
| **css-flexbox** | 569 | Shrink/grow algorithm, alignment, wrap, gap, writing-modes, intrinsic sizing |
| **css-text** | 268 | `break-spaces`, `keep-all`, `hyphens`, text-transform tailoring, `tab-size` |
| **css2-margin-padding** | 154 | Full margin collapsing, margin/padding on table-internal elements |
| **css-tables** | 91 | `border-collapse` painting, height distribution, `visibility: collapse` |
| **css2-positioning** | 89 | `top`/`left` on table-internal elements, abspos overflow, relpos edge cases |
| **css-backgrounds** | 75 | `background-clip`, `border-radius`, `border-image`, border keyword widths |
| **css2-normal-flow** | 70 | Inline-table, inline-block baseline, `min-width`/`max-width` edge cases |
| **css2-floats** | 67 | BFC-float exclusion, clearance computation, float+margin collapsing |
| **css-display** | 59 | `display: run-in` (41), `display: contents` edge cases (15), `flow-root` (3) |
| **css-position** | 58 | Relative pos on table elements (27), abspos static position in flex (14) |
| **css-box** | 38 | `margin-trim` (all 38 — unimplemented CSS4 property) |
| **css2-box-display** | 24 | Block-in-inline edge cases, containing block determination |
| **css-inline** | 6 | Phantom line boxes |
| **pseudo-elements** | 1 | `::before` edge case (feature is implemented) |

### Detailed Feature Gap Analysis

#### 1. Flexbox (569 tests) — Largest Gap

The engine has a functional flexbox implementation (~1,764 lines in `crates/layout/src/engine/flex.rs`) with `gap`, `justify-content`, `align-items`, `align-content`, `flex-wrap`, and `flex-direction` support. The failing tests exercise:

| Sub-feature | ~Count | What's Missing |
|---|---:|---|
| Flex sizing algorithm edge cases | 111 | `flex: initial`/`none`/`auto` shorthand resolution, shrink below min-content |
| Alignment (`align-items`, `align-self`, `align-content`) | 60 | `baseline` alignment, `stretch` with cross-axis constraints |
| `justify-content` edge cases | 30 | `space-evenly`, interaction with `margin: auto` |
| Writing modes (`writing-mode`, `direction: rtl`) | 13 | Flex axis mapping with vertical/RTL writing modes |
| Gap with writing modes | 33 | Gap in non-default writing modes |
| Intrinsic sizing | 21 | `min-content`/`max-content` width of flex containers |
| Table as flex item | 19 | Tables inside flex containers |
| Baseline alignment | 12 | First/last baseline computation for flex items |
| Percentage height resolution | 10 | Definite-size propagation through flex items |
| `aspect-ratio` interaction | 5 | Aspect ratio with flex sizing |
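For the shorthand resolution mentioned in the first table row, CSS Flexbox Level 1 defines `flex: initial` as `0 1 auto`, `flex: auto` as `1 1 auto`, `flex: none` as `0 0 auto`, and a single non-negative number `n` as `n 1 0`. A minimal sketch of that mapping follows; the `Flex` and `FlexBasis` types and the function name are hypothetical, and multi-value forms such as `flex: 1 0 100px` are not handled here.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FlexBasis {
    Auto,
    Px(f32),
}

/// Resolved longhands of the `flex` shorthand (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Flex {
    pub grow: f32,
    pub shrink: f32,
    pub basis: FlexBasis,
}

/// Resolves the single-token forms of `flex`. Returns `None` for anything
/// this sketch does not cover.
pub fn resolve_flex_keyword(value: &str) -> Option<Flex> {
    match value.trim() {
        "initial" => Some(Flex { grow: 0.0, shrink: 1.0, basis: FlexBasis::Auto }),
        "auto" => Some(Flex { grow: 1.0, shrink: 1.0, basis: FlexBasis::Auto }),
        "none" => Some(Flex { grow: 0.0, shrink: 0.0, basis: FlexBasis::Auto }),
        other => {
            // A lone number sets flex-grow, with shrink 1 and a zero basis.
            let grow: f32 = other.parse().ok()?;
            if !grow.is_finite() || grow < 0.0 {
                return None;
            }
            Some(Flex { grow, shrink: 1.0, basis: FlexBasis::Px(0.0) })
        }
    }
}
```

The distinction that tends to matter in tests is `none` versus `initial`: both refuse to grow, but only `initial` allows the item to shrink.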

#### 2. CSS Text (268 tests)

The engine has basic text rendering with `letter-spacing` support. Missing:

| Sub-feature | ~Count | Notes |
|---|---:|---|
| `white-space: break-spaces` | 79 | Preserved spaces that wrap |
| `hyphens: auto/manual` | 40 | Language-dependent auto-hyphenation |
| `word-break: keep-all` | 25 | CJK-aware word breaking |
| `word-spacing` | 21 | Word spacing with bidi/writing-modes |
| `text-transform` tailoring | 18 | Language-sensitive capitalization (Dutch IJ, Turkish i) |
| Line breaking (CJK) | 14 | `line-break: strict/loose/anywhere` |
| `letter-spacing` bidi | 11 | Letter-spacing after bidi reordering |
| `text-align-last` | 11 | Last-line alignment, `match-parent`, `justify` |
| `overflow-wrap: anywhere` | 10 | Wrapping anywhere vs break-word |
| `text-autospace` | 8 | CJK↔Latin auto-spacing |
| `tab-size` | 7 | Tab character width |
| `hanging-punctuation` | 3 | Punctuation outside content box |

#### 3. Margin Collapsing & Table-Internal Margins (154 tests)

The engine has basic margin collapsing (`collapse_margins()` in `block.rs`). Missing:

- **Parent-child through-flow** collapsing (margins pass through empty blocks)
- **Negative margin** collapsing rules (most negative + most positive)
- **`min-height` interaction** — doesn't prevent bottom margin adjacency
- **Clearance interaction** — clear changes which margins are adjoining
- **"Does not apply" rules** — margins/padding on `table-row-group`, `table-row`, `table-column`, etc. should be ignored (~48 tests)
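The negative-margin rule in the list above comes from CSS 2.1 §8.3.1: the collapsed margin is the largest positive adjoining margin plus the most negative one (each taken as zero when absent). A small sketch of just that arithmetic is shown below; the function name is hypothetical, and deciding which margins are adjoining (parent-child, clearance, `min-height`) is the hard part and is not shown.

```rust
/// Collapses a set of adjoining margins into one used margin:
/// max(positive margins, 0) + min(negative margins, 0).
pub fn collapse_adjoining_margins(margins: &[f32]) -> f32 {
    let largest_positive = margins
        .iter()
        .copied()
        .filter(|m| *m > 0.0)
        .fold(0.0, f32::max);
    let most_negative = margins
        .iter()
        .copied()
        .filter(|m| *m < 0.0)
        .fold(0.0, f32::min);
    largest_positive + most_negative
}
```

For example, margins of 20px and -5px collapse to 15px, while -10px and -5px collapse to -10px.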

#### 4. Tables (91 tests)

The engine has basic table layout with a collapsed borders module (~1,081 lines). Missing:

- Collapsed border **paint ordering** (borders paint in background phase)
- **Height distribution** to row groups (extra height allocation)
- Abspos inside table cells
- `visibility: collapse` on rows/columns
- `box-sizing` interaction with `display: table`

#### 5. Positioning (89 css2 + 58 css-position = 147 tests)

The engine has `position: absolute/relative/sticky/fixed`. Missing:

- `top`/`left`/`right`/`bottom` **application rules for table-internal elements** (~51 tests)
- Abspos **containing block** for inline-level ancestors
- Abspos **overflow** handling
- **Relative positioning of table-internal elements** (td, tr, thead, etc.)
- Static position of **inline-level abspos in block-level context** (14 tests)

#### 6. Backgrounds & Borders (75 tests)

The engine has `background-color`, `background-image`, `background-position`, `background-repeat`, `linear-gradient()`, `box-shadow`, and extended border styles. Missing:

- `background-clip: content-box/padding-box/text` (17 tests)
- `border-image` (5 tests)
- `border-radius` and rounded-corner clipping (3 tests)
- Border width keywords `thin`/`medium`/`thick` = 1/3/5px (9 tests — may be partially working)
- Sub-pixel border snapping
- `background-attachment: fixed/local` (3 tests)
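For the box-based `background-clip` values, the clip rectangle can be derived by insetting the border box: by the border widths for `padding-box`, and additionally by the padding for `content-box`. The sketch below assumes axis-aligned rectangles without rounded corners; all type and function names are hypothetical, and `background-clip: text` is left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rect {
    pub x: f32,
    pub y: f32,
    pub width: f32,
    pub height: f32,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Edges {
    pub top: f32,
    pub right: f32,
    pub bottom: f32,
    pub left: f32,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BackgroundClip {
    BorderBox,
    PaddingBox,
    ContentBox,
}

impl Rect {
    /// Shrinks the rectangle by the given edges, clamping at zero size.
    pub fn inset(self, e: Edges) -> Rect {
        Rect {
            x: self.x + e.left,
            y: self.y + e.top,
            width: (self.width - e.left - e.right).max(0.0),
            height: (self.height - e.top - e.bottom).max(0.0),
        }
    }
}

/// Returns the rectangle the background should be clipped to.
pub fn background_clip_rect(
    border_box: Rect,
    border: Edges,
    padding: Edges,
    clip: BackgroundClip,
) -> Rect {
    match clip {
        BackgroundClip::BorderBox => border_box,
        BackgroundClip::PaddingBox => border_box.inset(border),
        BackgroundClip::ContentBox => border_box.inset(border).inset(padding),
    }
}
```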

#### 7. Normal Flow (70 tests)

Block-in-inline splitting is now implemented. Remaining:

- `display: inline-table` (11 tests)
- Inline-block **baseline** computation (9 tests)
- `min-width`/`max-width`/`min-height`/`max-height` edge cases (17 tests)
- Inline replaced element sizing (3 tests)

#### 8. Floats (67 tests)

The engine has float layout with BFC avoidance. Missing:

- BFC border boxes must not overlap float margin boxes (29 tests)
- Complex clearance computation with margin collapsing (16 tests)
- Float + table BFC interaction
- Float suppression on abspos elements
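The first item in the list above can be pictured as computing the horizontal band left free by floats over the vertical range a BFC-establishing box would occupy. The sketch below assumes float margin boxes are already placed and that the candidate box's vertical extent is known; all names are hypothetical, and a real engine typically has to iterate, since the box's height depends on the width it is given.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FloatSide {
    Left,
    Right,
}

/// Margin box of an already-placed float (hypothetical type).
#[derive(Debug, Clone, Copy)]
pub struct FloatBox {
    pub side: FloatSide,
    pub left: f32,
    pub right: f32,
    pub top: f32,
    pub bottom: f32,
}

/// Returns the `(left, right)` edges available to a box spanning
/// `top..bottom` so that its border box overlaps no float margin box.
pub fn available_band(
    container_left: f32,
    container_right: f32,
    top: f32,
    bottom: f32,
    floats: &[FloatBox],
) -> (f32, f32) {
    let mut left = container_left;
    let mut right = container_right;
    for float in floats {
        // Only floats that intersect the vertical range constrain the band.
        let overlaps_vertically = float.top < bottom && float.bottom > top;
        if !overlaps_vertically {
            continue;
        }
        match float.side {
            FloatSide::Left => left = left.max(float.right),
            FloatSide::Right => right = right.min(float.left),
        }
    }
    // Never report a negative-width band.
    (left, right.max(left))
}
```

If the resulting band is narrower than the box needs, the usual response is to move the box down past the float and try again.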

#### 9. Display (59 tests)

- **`display: run-in`** — 41 tests. Run-in boxes merge into the following block as inline content. This is a rarely-used CSS2 feature; most browsers dropped support. Low priority.
- **`display: contents` edge cases** — 15 tests. Feature is implemented but fails for: `::first-letter`/`::first-line` interaction, `<fieldset>`/`<button>`/`<details>` special behavior, and flex/table-cell containers.
- **`display: flow-root`** — 3 tests. Not yet parsed.

#### 10. CSS Box Model (38 tests)

All 38 tests are for **`margin-trim`** — a CSS4 property that trims child margins at container edges. Not yet implemented. Low priority (newer spec, limited browser support).

### Cross-Cutting Themes

1. **Writing modes** (`writing-mode: vertical-lr/rl`, `direction: rtl`) — affects flexbox, text, gap, positioning. No writing-mode support exists; ~50+ tests across categories.

2. **Table-internal element rules** — margins, padding, and position offsets on `table-row-group`, `table-row`, `table-column`, etc. should be ignored per spec. ~75+ tests across margin-padding and positioning categories.

3. **Intrinsic sizing** (`min-content`/`max-content`) — affects flexbox intrinsic sizing (21), normal flow `min-width`/`max-width` (17). Partial support exists but edge cases fail.

4. **BFC establishment effects** — BFC blocks avoiding float overlap (29), height computation, margin collapsing with clearance (~18).

## Priority Recommendations

### High-Impact (most tests per effort)

1. **Table-internal "does not apply" rules** (~75 tests) — Relatively straightforward: skip margin/padding/position-offset for elements with `display: table-row-group`, `table-row`, `table-column`, `table-column-group`, `table-header-group`, `table-footer-group`.

2. **Margin collapsing completeness** (~154 tests, including the ~48 table-internal tests that item 1 also covers) — The full algorithm (CSS 2.1 §8.3.1) handles parent-child, negative margins, `min-height` interaction, and clearance. Complex but high payoff.

3. **`background-clip: content-box/padding-box`** (17 tests) — Clip background to content or padding area. Moderate implementation effort.

4. **Border width keywords** (9 tests) — Map `thin`→1px, `medium`→3px, `thick`→5px. Trivial fix.

5. **`display: flow-root`** (3 tests) — Parse as a BFC-establishing block. Trivial.
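Item 1 above amounts to a predicate on the `display` value that is checked before margins and padding are used. The sketch below follows the CSS 2.1 "Applies to" lines for `margin` and `padding`; the `Display` enum and function names are assumptions, and position offsets are not covered. Note that `table-cell` is handled separately because it accepts padding but not margins.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Display {
    Block,
    Table,
    TableCaption,
    TableCell,
    TableRow,
    TableRowGroup,
    TableHeaderGroup,
    TableFooterGroup,
    TableColumn,
    TableColumnGroup,
}

/// Row, row-group, column, and column-group boxes: the six display types
/// named in the recommendation above.
fn is_row_or_column_box(display: Display) -> bool {
    matches!(
        display,
        Display::TableRow
            | Display::TableRowGroup
            | Display::TableHeaderGroup
            | Display::TableFooterGroup
            | Display::TableColumn
            | Display::TableColumnGroup
    )
}

/// Padding applies to everything except row/column boxes.
pub fn padding_applies(display: Display) -> bool {
    !is_row_or_column_box(display)
}

/// Margins additionally do not apply to table cells.
pub fn margin_applies(display: Display) -> bool {
    !is_row_or_column_box(display) && display != Display::TableCell
}

/// Returns the value layout should use: the specified one, or zero when
/// the property does not apply to this box.
pub fn used_edge(specified: f32, applies: bool) -> f32 {
    if applies { specified } else { 0.0 }
}
```

Keeping the check in one place means block layout, table layout, and positioning code can all ask the same question instead of each special-casing table-internal boxes.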

### Medium-Impact

6. **Flexbox algorithm refinements** (569 tests) — Incremental: fix `flex: initial`/`none`, stretch alignment, baseline, then writing-modes. Each sub-fix could promote 10-50 tests.

7. **Float/BFC exclusion** (29+ tests) — BFC blocks must not overlap float margins.

8. **Collapsed border paint order** (18 tests) — Borders paint in background phase.

### Low Priority

9. **`display: run-in`** (41 tests) — Dropped by most browsers. Skip.
10. **`margin-trim`** (38 tests) — CSS4, limited browser support.
11. **Writing modes** (50+ tests) — Pervasive impact but massive implementation effort.

## Summary Table

| Fix | Tests Promoted | Status | Cumulative Pass |
|-----|---------------|--------|-----------------|
| 1. Node ID normalization | (combined) | Done | — |
| 2. CDATA stripping | (combined) | Done | — |
| 3. CSS unit conversion | (combined) | Done | 156 pass |
| 4. display:contents + implicit head | +48 net | Done | 204 pass |
| 5. Pixel-based comparison + parallel | +955 | Done | 1,159 pass |
| 6. Incremental engine improvements | +128 | Done | 1,287 pass |
| — | — | Remaining | 1,626 known_fail |