Re-evaluate the full WPT test suite breakdown with current numbers (+128 tests promoted since Feb 14), add detailed per-category feature gap analysis with sub-feature counts, cross-cutting themes, and prioritized recommendations for highest-impact fixes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
WPT Known-Fail Test Analysis
Last updated: 2026-02-25
Current State
- 1,287 pass, 1,626 known_fail, 1 skip (2,914 total)
- All known_fail tests are reftests that fail pixel comparison — they require CSS features or layout modes the engine doesn't yet support
- Test suite runs in ~12 seconds with parallel execution
Completed Fixes
Fix 1: Node ID Normalization in Reftest Comparison (Done)
In reftests, the test HTML and reference HTML have different <head> content (different numbers of <link>, <meta>, <title> elements), causing DOM node IDs to differ. The layout trees are structurally identical in content/dimensions, but node=#23 vs node=#15 caused string comparison to fail.
Fix: tests/wpt_harness/runner.rs — normalize_node_ids() replaces node=#N with sequential IDs based on order of appearance before comparing layout and display list dumps.
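The normalization step can be sketched roughly as follows. This is a minimal illustration, not the code in runner.rs: the function body, the zero-based numbering, and the assumption that IDs always appear as `node=#` followed by ASCII digits are all hypothetical.

```rust
use std::collections::HashMap;

/// Rewrites every `node=#N` token so IDs are numbered by first appearance.
/// Hypothetical sketch; the real `normalize_node_ids()` may differ.
fn normalize_node_ids(dump: &str) -> String {
    const MARKER: &str = "node=#";
    let mut ids: HashMap<String, usize> = HashMap::new();
    let mut out = String::with_capacity(dump.len());
    let mut rest = dump;

    while let Some(pos) = rest.find(MARKER) {
        // Copy everything up to and including the marker unchanged.
        let after = pos + MARKER.len();
        out.push_str(&rest[..after]);
        rest = &rest[after..];

        // Take the run of ASCII digits that follows the marker.
        let digits_len = rest.bytes().take_while(|b| b.is_ascii_digit()).count();
        let original = &rest[..digits_len];
        rest = &rest[digits_len..];
        if original.is_empty() {
            continue; // marker without a number: leave it untouched
        }

        // The same original ID always maps to the same sequential ID.
        let next = ids.len();
        let id = *ids.entry(original.to_string()).or_insert(next);
        out.push_str(&id.to_string());
    }
    out.push_str(rest);
    out
}
```

Two dumps that differ only in absolute node IDs (for example `node=#23` in the test and `node=#15` in the reference) normalize to the same string, so plain string comparison works again.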
Fix 2: Strip CDATA Markers from Style Text (Done)
Many WPT reference HTML files (originally XHTML) wrap CSS in <![CDATA[...]]> inside <style> tags. The HTML parser extracts raw text including CDATA markers, and the CSS parser received <![CDATA[ div { color: red; } ]]> and failed to parse any rules.
Fix: crates/style/src/context.rs — strip <![CDATA[ and ]]> from CSS text in extract_stylesheets() before parsing.
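A minimal sketch of the stripping step, assuming the markers can simply be deleted wherever they occur in the style text. The helper name `strip_cdata_markers` is hypothetical; the real logic lives inside `extract_stylesheets()`.

```rust
/// Removes XHTML-era CDATA wrappers from raw <style> text before CSS parsing.
/// Hypothetical helper; assumes the markers never occur inside valid CSS.
fn strip_cdata_markers(css: &str) -> String {
    css.replace("<![CDATA[", "").replace("]]>", "")
}
```

After stripping, the parser sees ordinary CSS such as `div { color: red; }` surrounded only by whitespace.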
Fix 3: CSS Unit Conversion — cm, mm, in, pt, pc (Done)
The CSS parser only recognized px, em, rem, %. Other units (cm, mm, in, pt, pc) fell through to a default of Px, treating 2.54cm as 2.54px instead of 96px.
Fix:
- `crates/css/src/types.rs`: Added `Cm`, `Mm`, `In`, `Pt`, `Pc` variants with `to_px()` conversions
- `crates/css/src/parser.rs`: Added unit string matching in dimension parsing and grid template parsing
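The conversions follow the fixed CSS ratios (1in = 96px = 2.54cm, 1pt = 1/72in, 1pc = 12pt). A sketch under the assumption that the unit is a standalone enum; the `LengthUnit` type and `from_suffix()` are hypothetical names, and only the variant names and `to_px()` come from the notes above.

```rust
/// Absolute CSS length units. Hypothetical stand-in for the type in types.rs.
#[derive(Debug, Clone, Copy, PartialEq)]
enum LengthUnit {
    Px,
    Cm,
    Mm,
    In,
    Pt,
    Pc,
}

impl LengthUnit {
    /// Maps a unit suffix to a variant. Returns None for unknown units
    /// instead of silently defaulting to Px.
    fn from_suffix(suffix: &str) -> Option<Self> {
        match suffix.to_ascii_lowercase().as_str() {
            "px" => Some(Self::Px),
            "cm" => Some(Self::Cm),
            "mm" => Some(Self::Mm),
            "in" => Some(Self::In),
            "pt" => Some(Self::Pt),
            "pc" => Some(Self::Pc),
            _ => None,
        }
    }

    /// Converts a value in this unit to CSS pixels.
    fn to_px(self, value: f32) -> f32 {
        match self {
            Self::Px => value,
            Self::In => value * 96.0,
            Self::Cm => value * 96.0 / 2.54,
            Self::Mm => value * 96.0 / 25.4,
            Self::Pt => value * 96.0 / 72.0,
            Self::Pc => value * 16.0,
        }
    }
}
```

Returning `None` for unrecognized suffixes is a design choice in this sketch: it avoids the earlier failure mode where an unknown unit quietly became `Px`.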
Results
Fixes 1-3 combined promoted 120 tests from known_fail to pass (36 → 156 total).
The original estimates (~485 cumulative) were too optimistic — most tests affected by CDATA or unit issues also have other layout differences that prevent passing. The fixes were necessary prerequisites but not sufficient alone for those tests.
Fix 4: display: contents + Implicit <head> Insertion (Done)
Two changes combined:
- `display: contents`: Added a `Display::Contents` variant. Elements with this value generate no box; their children are promoted into the parent's layout. Implemented via `build_children_into()` in `crates/layout/src/engine/mod.rs`.
- Implicit `<head>` insertion: The HTML parser now creates an implicit `<head>` when encountering head-only elements (`title`, `style`, `link`, `meta`, `script`) without an existing `<head>`. Previously, `<title>` text was rendered visibly in the body for documents lacking explicit `<head>` tags, causing reftest mismatches since test and reference files have different titles. This was the larger win.
Fix:
- `crates/style/src/types.rs`: Added `Display::Contents` variant and keyword parsing
- `crates/layout/src/engine/mod.rs`: `build_children_into()` flattens `display: contents` children
- `crates/html/src/lib.rs`: Implicit `<head>` creation and proper head-closing on non-head elements
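The flattening behaviour can be illustrated with a small recursive sketch. The `StyledNode` and `LayoutBox` types here are simplified, hypothetical stand-ins, and the real `build_children_into()` signature is not shown in these notes.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Display {
    Block,
    Contents,
}

/// Simplified styled-tree node (hypothetical).
struct StyledNode {
    name: &'static str,
    display: Display,
    children: Vec<StyledNode>,
}

/// Simplified layout box (hypothetical).
#[derive(Debug, PartialEq)]
struct LayoutBox {
    name: &'static str,
    children: Vec<LayoutBox>,
}

/// Appends boxes for `children` to `parent`. A `display: contents` element
/// generates no box of its own; its children are hoisted into `parent`.
fn build_children_into(parent: &mut LayoutBox, children: &[StyledNode]) {
    for child in children {
        if child.display == Display::Contents {
            build_children_into(parent, &child.children);
        } else {
            let mut layout_box = LayoutBox { name: child.name, children: Vec::new() };
            build_children_into(&mut layout_box, &child.children);
            parent.children.push(layout_box);
        }
    }
}
```

Because the recursion passes the same `parent` down, nested `display: contents` wrappers collapse at any depth and document order is preserved.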
Results: 53 tests promoted, 5 false-pass tests demoted (were only passing because both sides had content hidden in <head>). Net: +48 tests (156 → 204 total).
Fix 5: Pixel-Based Reftest Comparison + Parallel Test Execution (Done)
Most WPT reftests use different CSS techniques (borders vs backgrounds, etc.) to achieve the same visual result, making layout-tree text comparison fundamentally unable to match. The fix adds pixel-based comparison as a fallback.
Changes:
- `tests/wpt_harness/runner.rs`: Added `rasterize_html()` and `compare_pixels()` functions. Reftests now try layout-tree comparison first (fast path), then fall back to pixel comparison by rasterizing both test and reference HTML to 800x600 pixel buffers and comparing per-pixel with a channel tolerance of 2.
- `tests/wpt_harness.rs`: Parallelized test execution using `std::thread::scope`, with progress reporting every 100 tests. Skips artifact writing for known_fail tests to reduce I/O.
- `crates/style/src/context.rs`: Added CDATA stripping to `extract_stylesheet_sources()` (was already in `extract_stylesheets()` but missing from the Pipeline code path).
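A sketch of the tolerance check, assuming both buffers are tightly packed RGBA8 (4 bytes per pixel). The buffer layout, the helper `count_mismatched_pixels()`, and this `compare_pixels()` signature are assumptions for illustration only.

```rust
/// Counts pixels where any channel differs by more than `tolerance`.
/// Returns None if the buffers are not comparable (different sizes, or a
/// length that is not a whole number of RGBA pixels).
fn count_mismatched_pixels(test: &[u8], reference: &[u8], tolerance: u8) -> Option<usize> {
    if test.len() != reference.len() || test.len() % 4 != 0 {
        return None;
    }
    let mismatched = test
        .chunks_exact(4)
        .zip(reference.chunks_exact(4))
        .filter(|(t, r)| t.iter().zip(r.iter()).any(|(a, b)| a.abs_diff(*b) > tolerance))
        .count();
    Some(mismatched)
}

/// True when every channel of every pixel is within `tolerance`.
fn compare_pixels(test: &[u8], reference: &[u8], tolerance: u8) -> bool {
    count_mismatched_pixels(test, reference, tolerance) == Some(0)
}
```

With a tolerance of 2, small anti-aliasing or rounding differences pass while a genuinely different pixel fails the reftest.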
Results: 955 tests promoted from known_fail to pass (204 → 1,159 total). Test suite runs in ~11 seconds with parallel execution.
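The parallel runner can be approximated with `std::thread::scope` and a shared atomic work index. This is a sketch under assumptions: `run_parallel()`, its parameters, and the work-stealing-by-counter scheme are hypothetical, not the harness's actual structure.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Runs `run_one` over all tests on `workers` scoped threads and returns
/// the number of passing tests. Hypothetical sketch of the harness loop.
fn run_parallel<T, F>(tests: &[T], workers: usize, run_one: F) -> usize
where
    T: Sync,
    F: Fn(&T) -> bool + Sync,
{
    let next = AtomicUsize::new(0);
    let passed = AtomicUsize::new(0);

    thread::scope(|s| {
        for _ in 0..workers.max(1) {
            s.spawn(|| loop {
                // Each worker claims the next unclaimed test index.
                let i = next.fetch_add(1, Ordering::Relaxed);
                let Some(test) = tests.get(i) else { break };
                if run_one(test) {
                    passed.fetch_add(1, Ordering::Relaxed);
                }
                // Progress report every 100 claimed tests.
                if (i + 1) % 100 == 0 {
                    eprintln!("claimed {} / {} tests", i + 1, tests.len());
                }
            });
        }
    });

    passed.load(Ordering::Relaxed)
}
```

Scoped threads let the workers borrow `tests` and the counters directly, so no `Arc` or cloning of the test list is needed.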
Fix 6: Incremental Engine Improvements (Done — Feb 14 → Feb 25)
Multiple engine improvements collectively promoted 128 tests (1,159 → 1,287). Key changes:
- Canvas background propagation (CSS 2.1 §14.2): body background paints at viewport level
- Border shorthand fix: omitted sub-properties now properly reset
- Extended border styles: `inset`, `outset`, `groove`, `ridge`, `double`, `hidden`
- Table cell sizing: height as minimum (CSS 2.1 §17.5.3), column width overflow fix
- Linear gradients: `linear-gradient()` through the full rendering pipeline
- Background-position/repeat: CSS sprite support
- Float sizing fixes: intrinsic aspect ratio, shrink-to-fit width for block children
- Block-in-inline splitting (CSS 2.1 §9.2.1.1): anonymous block generation
- Flex-column fix: `min-height` with `flex-grow` distribution
- CSS cascade ordering: per CSS Cascading Level 4
Remaining Known-Fail Analysis (1,626 tests)
Breakdown by CSS Specification Area
| Category | Count | Key Missing Features |
|---|---|---|
| css-flexbox | 569 | Shrink/grow algorithm, alignment, wrap, gap, writing-modes, intrinsic sizing |
| css-text | 268 | break-spaces, keep-all, hyphens, text-transform tailoring, tab-size |
| css2-margin-padding | 154 | Full margin collapsing, margin/padding on table-internal elements |
| css-tables | 91 | border-collapse painting, height distribution, visibility: collapse |
| css2-positioning | 89 | top/left on table-internal elements, abspos overflow, relpos edge cases |
| css-backgrounds | 75 | background-clip, border-radius, border-image, border keyword widths |
| css2-normal-flow | 70 | Inline-table, inline-block baseline, min-width/max-width edge cases |
| css2-floats | 67 | BFC-float exclusion, clearance computation, float+margin collapsing |
| css-display | 59 | display: run-in (41), display: contents edge cases (15), flow-root (3) |
| css-position | 58 | Relative pos on table elements (27), abspos static position in flex (14) |
| css-box | 38 | margin-trim (all 38 — unimplemented CSS4 property) |
| css2-box-display | 24 | Block-in-inline edge cases, containing block determination |
| css-inline | 6 | Phantom line boxes |
| pseudo-elements | 1 | ::before edge case (feature is implemented) |
Detailed Feature Gap Analysis
1. Flexbox (569 tests) — Largest Gap
The engine has a functional flexbox implementation (~1,764 lines in crates/layout/src/engine/flex.rs) with gap, justify-content, align-items, align-content, flex-wrap, and flex-direction support. The failing tests exercise:
| Sub-feature | ~Count | What's Missing |
|---|---|---|
| Flex sizing algorithm edge cases | 111 | flex: initial/none/auto shorthand resolution, shrink below min-content |
| Alignment (`align-items`, `align-self`, `align-content`) | 60 | baseline alignment, stretch with cross-axis constraints |
| `justify-content` edge cases | 30 | space-evenly, interaction with margin: auto |
| Writing modes (`writing-mode`, `direction: rtl`) | 13 | Flex axis mapping with vertical/RTL writing modes |
| Gap with writing modes | 33 | Gap in non-default writing modes |
| Intrinsic sizing | 21 | min-content/max-content width of flex containers |
| Table as flex item | 19 | Tables inside flex containers |
| Baseline alignment | 12 | First/last baseline computation for flex items |
| Percentage height resolution | 10 | Definite-size propagation through flex items |
| `aspect-ratio` interaction | 5 | Aspect ratio with flex sizing |
2. CSS Text (268 tests)
The engine has basic text rendering with letter-spacing support. Missing:
| Sub-feature | ~Count | Notes |
|---|---|---|
| `white-space: break-spaces` | 79 | Preserved spaces that wrap |
| `hyphens: auto/manual` | 40 | Language-dependent auto-hyphenation |
| `word-break: keep-all` | 25 | CJK-aware word breaking |
| `word-spacing` | 21 | Word spacing with bidi/writing-modes |
| `text-transform` tailoring | 18 | Language-sensitive capitalization (Dutch IJ, Turkish i) |
| Line breaking (CJK) | 14 | line-break: strict/loose/anywhere |
| `letter-spacing` bidi | 11 | Letter-spacing after bidi reordering |
| `text-align-last` | 11 | Last-line alignment, match-parent, justify |
| `overflow-wrap: anywhere` | 10 | Wrapping anywhere vs break-word |
| `text-autospace` | 8 | CJK↔Latin auto-spacing |
| `tab-size` | 7 | Tab character width |
| `hanging-punctuation` | 3 | Punctuation outside content box |
3. Margin Collapsing & Table-Internal Margins (154 tests)
The engine has basic margin collapsing (collapse_margins() in block.rs). Missing:
- Parent-child through-flow collapsing (margins pass through empty blocks)
- Negative margin collapsing rules (most negative + most positive)
- `min-height` interaction: doesn't prevent bottom margin adjacency
- Clearance interaction: `clear` changes which margins are adjoining
- "Does not apply" rules: margins/padding on `table-row-group`, `table-row`, `table-column`, etc. should be ignored (~48 tests)
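The negative-margin rule from CSS 2.1 §8.3.1 reduces to a small formula once the set of adjoining margins is known: the largest positive margin plus the most negative margin. A sketch of just that arithmetic, with a hypothetical function name (it is not the existing `collapse_margins()` in block.rs, and deciding which margins are adjoining is the hard part it leaves out):

```rust
/// Collapses a set of adjoining margins into one used margin.
/// All positive: the maximum wins. All negative: the most negative wins.
/// Mixed: largest positive plus most negative.
fn collapse_adjoining_margins(margins: &[f32]) -> f32 {
    let max_positive = margins
        .iter()
        .copied()
        .filter(|m| *m > 0.0)
        .fold(0.0, f32::max);
    let most_negative = margins
        .iter()
        .copied()
        .filter(|m| *m < 0.0)
        .fold(0.0, f32::min);
    max_positive + most_negative
}
```

For example, adjoining margins of 20px, 10px, and -5px collapse to 15px, while -10px and -20px collapse to -20px.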
4. Tables (91 tests)
The engine has basic table layout with a collapsed borders module (~1,081 lines). Missing:
- Collapsed border paint ordering (borders paint in background phase)
- Height distribution to row groups (extra height allocation)
- Abspos inside table cells
- `visibility: collapse` on rows/columns
- `box-sizing` interaction with `display: table`
5. Positioning (89 css2 + 58 css-position = 147 tests)
The engine has position: absolute/relative/sticky/fixed. Missing:
- `top`/`left`/`right`/`bottom` application rules for table-internal elements (~51 tests)
- Abspos containing block for inline-level ancestors
- Abspos overflow handling
- Relative positioning of table-internal elements (td, tr, thead, etc.)
- Static position of inline-level abspos in block-level context (14 tests)
6. Backgrounds & Borders (75 tests)
The engine has background-color, background-image, background-position, background-repeat, linear-gradient(), box-shadow, and extended border styles. Missing:
- `background-clip: content-box/padding-box/text` (17 tests)
- `border-image` (5 tests)
- `border-radius` and rounded-corner clipping (3 tests)
- Border width keywords `thin`/`medium`/`thick` = 1/3/5px (9 tests; may be partially working)
- Sub-pixel border snapping
- `background-attachment: fixed/local` (3 tests)
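The border width keyword mapping is small enough to sketch in full. The function below is hypothetical and deliberately narrow: it assumes the only other accepted form is a non-negative `px` length, whereas the real parser would route other units through its normal dimension parsing.

```rust
/// Parses a border-width token into CSS pixels.
/// Handles the thin/medium/thick keywords (1/3/5px) and plain `px` lengths.
/// Hypothetical sketch; other units are out of scope here.
fn parse_border_width(token: &str) -> Option<f32> {
    let token = token.trim().to_ascii_lowercase();
    match token.as_str() {
        "thin" => Some(1.0),
        "medium" => Some(3.0),
        "thick" => Some(5.0),
        other => other
            .strip_suffix("px")?
            .parse::<f32>()
            .ok()
            // Border widths must be finite and non-negative.
            .filter(|v| v.is_finite() && *v >= 0.0),
    }
}
```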
7. Normal Flow (70 tests)
Block-in-inline splitting is now implemented. Remaining:
- `display: inline-table` (11 tests)
- Inline-block baseline computation (9 tests)
- `min-width`/`max-width`/`min-height`/`max-height` edge cases (17 tests)
- Inline replaced element sizing (3 tests)
8. Floats (67 tests)
The engine has float layout with BFC avoidance. Missing:
- BFC border boxes must not overlap float margin boxes (29 tests)
- Complex clearance computation with margin collapsing (16 tests)
- Float + table BFC interaction
- Float suppression on abspos elements
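The non-overlap rule can be pictured as computing the horizontal band left free by float margin boxes over a given vertical range. The sketch below covers only that step, with hypothetical types and names; a fuller implementation would also move the box down when the band is narrower than the box needs.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Side {
    Left,
    Right,
}

/// Margin box of a placed float, in container coordinates (hypothetical).
#[derive(Debug, Clone, Copy)]
struct FloatMarginBox {
    side: Side,
    x_start: f32,
    x_end: f32,
    y_top: f32,
    y_bottom: f32,
}

/// Returns the (left, right) x-range in which a BFC border box spanning
/// `y_top..y_bottom` can sit without overlapping any float margin box.
fn available_band(
    container_width: f32,
    floats: &[FloatMarginBox],
    y_top: f32,
    y_bottom: f32,
) -> (f32, f32) {
    let mut left = 0.0_f32;
    let mut right = container_width;
    for f in floats {
        // Only floats that intersect the vertical range constrain the band.
        let overlaps = f.y_top < y_bottom && f.y_bottom > y_top;
        if !overlaps {
            continue;
        }
        match f.side {
            Side::Left => left = left.max(f.x_end),
            Side::Right => right = right.min(f.x_start),
        }
    }
    (left, right)
}
```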
9. Display (59 tests)
- `display: run-in` (41 tests): Run-in boxes merge into the following block as inline content. This is a rarely-used CSS2 feature; most browsers dropped support. Low priority.
- `display: contents` edge cases (15 tests): Feature is implemented but fails for `::first-letter`/`::first-line` interaction, `<fieldset>`/`<button>`/`<details>` special behavior, and flex/table-cell containers.
- `display: flow-root` (3 tests): Not yet parsed.
10. CSS Box Model (38 tests)
All 38 tests are for margin-trim — a CSS4 property that trims child margins at container edges. Not yet implemented. Low priority (newer spec, limited browser support).
Cross-Cutting Themes
- Writing modes (`writing-mode: vertical-lr/rl`, `direction: rtl`): affects flexbox, text, gap, positioning. No writing-mode support exists; ~50+ tests across categories.
- Table-internal element rules: margins, padding, and position offsets on `table-row-group`, `table-row`, `table-column`, etc. should be ignored per spec. ~75+ tests across margin-padding and positioning categories.
- Intrinsic sizing (`min-content`/`max-content`): affects flexbox intrinsic sizing (21), normal flow `min-width`/`max-width` (17). Partial support exists but edge cases fail.
- BFC establishment effects: BFC blocks avoiding float overlap (29), height computation, margin collapsing with clearance (~18).
Priority Recommendations
High-Impact (most tests per effort)
- Table-internal "does not apply" rules (~75 tests): Relatively straightforward. Skip margin/padding/position-offset for elements with `display: table-row-group`, `table-row`, `table-column`, `table-column-group`, `table-header-group`, `table-footer-group`.
- Margin collapsing completeness (~154 tests): The full algorithm (CSS 2.1 §8.3.1) handles parent-child, negative margins, `min-height` interaction, and clearance. Complex but high payoff.
- `background-clip: content-box/padding-box` (17 tests): Clip background to content or padding area. Moderate implementation effort.
- Border width keywords (9 tests): Map `thin`→1px, `medium`→3px, `thick`→5px. Trivial fix.
- `display: flow-root` (3 tests): Parse as a BFC-establishing block. Trivial.
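For the table-internal "does not apply" rules, one possible shape is a predicate on the display type applied when resolving used values. The `Display` and `Edges` types below are hypothetical stand-ins, and the list of display values is the one given in the recommendation above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Display {
    Block,
    Table,
    TableCell,
    TableRowGroup,
    TableHeaderGroup,
    TableFooterGroup,
    TableRow,
    TableColumnGroup,
    TableColumn,
}

/// Per-side lengths in CSS pixels (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Default)]
struct Edges {
    top: f32,
    right: f32,
    bottom: f32,
    left: f32,
}

/// True for the table-internal display types named in the recommendation.
fn is_table_internal(display: Display) -> bool {
    matches!(
        display,
        Display::TableRowGroup
            | Display::TableHeaderGroup
            | Display::TableFooterGroup
            | Display::TableRow
            | Display::TableColumnGroup
            | Display::TableColumn
    )
}

/// Used value for a margin, padding, or offset set: zeroed when the
/// property does not apply to the element's display type.
fn used_edges(display: Display, specified: Edges) -> Edges {
    if is_table_internal(display) {
        Edges::default()
    } else {
        specified
    }
}
```

Applying the check at used-value time keeps the specified values intact for the cascade and inheritance while removing their effect on layout.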
Medium-Impact
-
Flexbox algorithm refinements (569 tests) — Incremental: fix
flex: initial/none, stretch alignment, baseline, then writing-modes. Each sub-fix could promote 10-50 tests. -
Float/BFC exclusion (29+ tests) — BFC blocks must not overlap float margins.
-
Collapsed border paint order (18 tests) — Borders paint in background phase.
Low Priority
- `display: run-in` (41 tests): Dropped by most browsers. Skip.
- `margin-trim` (38 tests): CSS4, limited browser support.
- Writing modes (50+ tests): Pervasive impact but massive implementation effort.
Summary Table
| Fix | Tests Promoted | Status | Cumulative Pass |
|---|---|---|---|
| 1. Node ID normalization | (combined) | Done | — |
| 2. CDATA stripping | (combined) | Done | — |
| 3. CSS unit conversion | (combined) | Done | 156 pass |
| 4. display:contents + implicit head | +48 net | Done | 204 pass |
| 5. Pixel-based comparison + parallel | +955 | Done | 1,159 pass |
| 6. Incremental engine improvements | +128 | Done | 1,287 pass |
| — | — | Remaining | 1,626 known_fail |