rust_browser/docs/HTML5_Implementation_Checklist.md
Zachary D. Rowitsch 56453a5677
Add <col>/<colgroup> column width hints and span support (CSS 2.1 §17)
Extract width and span metadata from <col>/<colgroup> DOM elements and
apply them as column width hints during table layout. This enables
authors to declaratively set column widths via HTML attributes or CSS
without relying on cell content.

Key changes:
- UA stylesheet: col/colgroup get display:none (metadata-only, no boxes)
- HTML attrs: dedicated width-only extraction for col/colgroup elements
- ColumnInfo struct on LayoutBox stores per-column width hints
- DOM walker extracts col/colgroup metadata with span expansion and
  colgroup-to-col width inheritance (CSS 2.1 §17.3)
- Width hints integrated into both intrinsic and available-width table
  sizing algorithms with correct fixed-before-percentage ordering
- HTML parser: implicit colgroup closing by tr/tbody/thead/tfoot

Deferred: column backgrounds, visibility:collapse, column borders in
collapsed border model.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 10:57:39 -05:00


Full HTML5 Support Checklist (Browser Engine)

Checked items in this document mean end-to-end support in the current tree, not parser-only or DOM-only support. If parsing exists but styling, layout, rendering, runtime behavior, or web-exposed APIs are missing, leave the item unchecked and note the missing stages.

Phase 0: Spec Targets & Test Harness

  • Define “HTML5” target: WHATWG HTML Living Standard (recommended) vs W3C HTML5 snapshot
  • Add Web Platform Tests (WPT) runner integration for HTML (parse + DOM + rendering)
  • Add layout/reftest harness for HTML rendering features (with known-fail list)
  • Add conformance reporting: pass rate by area (parsing/DOM/forms/media/etc.)
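
As an illustration of the per-area conformance reporting item above, the following is a minimal sketch, not code from this repository. The `TestResult` type, the area labels, and both function names are hypothetical; a real runner would fill the results from WPT output.

```rust
use std::collections::BTreeMap;

/// Outcome of one test. `area` is a hypothetical label such as "parsing" or "dom".
pub struct TestResult<'a> {
    pub area: &'a str,
    pub passed: bool,
}

/// Groups results by area and returns (passed, total) per area, sorted by area name.
pub fn pass_counts_by_area(results: &[TestResult<'_>]) -> BTreeMap<String, (usize, usize)> {
    let mut counts: BTreeMap<String, (usize, usize)> = BTreeMap::new();
    for r in results {
        let entry = counts.entry(r.area.to_string()).or_insert((0, 0));
        if r.passed {
            entry.0 += 1;
        }
        entry.1 += 1;
    }
    counts
}

/// Pass rate as a percentage; an empty area reports 0.0 rather than dividing by zero.
pub fn pass_rate(passed: usize, total: usize) -> f64 {
    if total == 0 {
        0.0
    } else {
        passed as f64 * 100.0 / total as f64
    }
}
```

Using a `BTreeMap` keeps the report order stable between runs, which makes diffs of successive conformance reports easier to read.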

Phase 1: HTML Parsing (Tokenizer + Tree Builder)

  • Implement HTML tokenization (all states, incl. script/rawtext/rcdata/plaintext)
    • Current status: simplified tokenizer exists, including comments, entities, attributes, and raw-text handling for script/style.
    • Missing pipeline steps: full HTML5 tokenizer state machine, including RCDATA (title/textarea), plaintext, and script-data escape states.
  • Implement tree builder insertion modes:
    • Current status: simplified implicit html/head/body creation plus limited in body and table heuristics.
    • initial / before html / before head / in head / in head noscript
    • after head / in body / text
    • in table / in table text / in caption / in column group / in table body / in row / in cell
    • in select / in select in table
    • in template / after body / in frameset / after frameset / after after body / after after frameset
  • Foster parenting rules for tables
  • Adoption agency algorithm (mis-nested formatting elements)
  • Handling of void elements (no end tags)
  • Optional end tags rules
    • Current status: implemented for p and a subset of table tags (td/th/tr/tbody/thead/tfoot).
    • Missing pipeline steps: li, dt, dd, option, optgroup, body, html, and other optional-end-tag cases.
  • Parse-error behavior matches spec (recover, don't crash)
    • Current status: parser is resilient and has input/token/nesting limits.
    • Missing pipeline steps: spec-accurate parse-error handling and recovery behavior.
  • Fragment parsing for:
    • innerHTML
      • Current status: Element.innerHTML getter/setter is wired to fragment parse/serialize.
      • Missing pipeline steps: spec-accurate context-sensitive fragment parsing.
    • Range/createContextualFragment
    • template contents
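
The void-element and optional-end-tag items above can be sketched as two lookup helpers. This is an illustrative sketch only: the function names are hypothetical, the void list reflects the WHATWG HTML Living Standard as understood at the time of writing, and the list of tags that close an open `p` is deliberately a non-exhaustive subset.

```rust
/// Void elements: they never have an end tag and never have children.
const VOID_ELEMENTS: &[&str] = &[
    "area", "base", "br", "col", "embed", "hr", "img", "input", "link", "meta", "source",
    "track", "wbr",
];

/// Returns true if `tag` names a void element (ASCII case-insensitive).
pub fn is_void_element(tag: &str) -> bool {
    VOID_ELEMENTS.iter().any(|v| v.eq_ignore_ascii_case(tag))
}

/// Non-exhaustive subset (assumption) of start tags that implicitly close an open <p>.
const CLOSES_P: &[&str] = &[
    "address", "article", "aside", "blockquote", "div", "dl", "fieldset", "footer", "form",
    "h1", "h2", "h3", "h4", "h5", "h6", "header", "hr", "main", "nav", "ol", "p", "pre",
    "section", "table", "ul",
];

/// Returns true if seeing this start tag should close a currently open <p>.
pub fn start_tag_closes_p(tag: &str) -> bool {
    CLOSES_P.iter().any(|t| t.eq_ignore_ascii_case(tag))
}
```

A tree builder would consult `is_void_element` before pushing onto the open-element stack, and `start_tag_closes_p` before inserting a new block-level element while a `p` is open.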

Phase 2: DOM Core + HTML DOM Integration

  • DOM tree: Node/Element/Document/DocumentFragment/Text/Comment
    • Current status: Document, Element, Text, and Comment exist.
    • Missing pipeline steps: DocumentFragment.
  • DOM mutation algorithms (append/insert/remove/replace)
    • Current status: append/remove/text replacement exist.
    • Missing pipeline steps: insertBefore/replaceChild plumbing in DOM and JS host APIs.
  • Live NodeLists/HTMLCollections where required
  • DOMTokenList (classList, relList, etc.)
  • Attributes: NamedNodeMap semantics + namespace-aware APIs
    • Current status: basic get/set attribute storage exists.
    • Missing pipeline steps: NamedNodeMap, namespace-aware APIs, and removal/reflection surface.
  • Custom element name validation plumbing (even if CE is later)
  • Document/Element query APIs:
    • getElementById/getElementsByTagName/getElementsByClassName
      • Current status: core DOM supports these, but only document.getElementById is exposed to JS.
    • querySelector/querySelectorAll
    • matches/closest
  • HTML document quirks:
    • quirks mode / limited-quirks / standards mode from doctype
    • case-insensitive tag/attr behavior in HTML documents
      • Current status: parser lowercases HTML tag/attribute names.
      • Missing pipeline steps: full HTML DOM case-insensitive attribute/query behavior.
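
For the DOMTokenList item above, the core of `classList`-style parsing is an ordered set of whitespace-separated tokens. The sketch below is an assumption about how that step could look, with a hypothetical function name; it does not cover the mutation or live-update parts of the interface.

```rust
/// Splits an attribute value on ASCII whitespace and removes duplicates,
/// keeping the first occurrence of each token in document order.
pub fn parse_token_set(value: &str) -> Vec<&str> {
    let mut tokens: Vec<&str> = Vec::new();
    for tok in value.split_ascii_whitespace() {
        if !tokens.contains(&tok) {
            tokens.push(tok);
        }
    }
    tokens
}
```

The linear `contains` check is adequate for typical class lists, which are short; a larger token set might justify a hash set alongside the vector.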

Phase 3: Events (Needed for HTML)

  • EventTarget + add/removeEventListener
  • Event dispatch (capture/target/bubble)
    • Current status: target + bubble are implemented for click dispatch.
    • Missing pipeline steps: capture phase.
  • Default actions + preventDefault/cancelable rules
    • Current status: preventDefault() works and app-level default actions are suppressed.
    • Missing pipeline steps: broader DOM-native default-action coverage beyond app-browser click handling.
  • Keyboard/mouse basic events used by forms/links/focus
    • Current status: click is implemented.
    • Missing pipeline steps: keyboard, broader mouse, and focus event families.
  • Focus management:
    • activeElement
    • focus/blur events
    • tab navigation ordering basics
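
To make the missing capture phase concrete, the following sketch computes the order in which nodes would be visited for capture, target, and bubble. It is a simplified model under stated assumptions: nodes are plain `u32` ids, the path is supplied root-first, and the `Phase` enum and `dispatch_order` function are hypothetical names, not this engine's API.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Capturing,
    AtTarget,
    Bubbling,
}

/// `path_root_to_target` lists node ids from the root down to the target, inclusive.
/// Returns the visit order: ancestors top-down, then the target, then (if the
/// event bubbles) ancestors bottom-up.
pub fn dispatch_order(path_root_to_target: &[u32], bubbles: bool) -> Vec<(u32, Phase)> {
    let (target, ancestors) = match path_root_to_target.split_last() {
        Some((t, rest)) => (*t, rest),
        None => return Vec::new(),
    };
    let mut order = Vec::with_capacity(ancestors.len() * 2 + 1);
    for &node in ancestors {
        order.push((node, Phase::Capturing));
    }
    order.push((target, Phase::AtTarget));
    if bubbles {
        for &node in ancestors.iter().rev() {
            order.push((node, Phase::Bubbling));
        }
    }
    order
}
```

Computing the path once before invoking any listener matches the usual expectation that DOM mutations made by listeners do not change the propagation path of the event already in flight.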

Phase 4: Core Document Lifecycle

  • Navigation primitives: load URL → fetch → parse → commit Document
  • Document readiness:
    • DOMContentLoaded
    • load event
    • readyState transitions
  • Resource loading for:
    • <script src>
  • Base URL resolution:
    • document.baseURI
      • Current status: URL resolution uses the page URL for scripts, styles, images, and forms.
      • Missing pipeline steps: <base href> integration and document.baseURI.
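
The readiness items above (DOMContentLoaded, load, readyState transitions) can be modeled as a three-state machine. This is a sketch with hypothetical names; it only encodes the ordering loading → interactive → complete and which lifecycle event follows each transition, and leaves out `readystatechange` dispatch and deferred-script handling.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReadyState {
    Loading,
    Interactive,
    Complete,
}

impl ReadyState {
    /// The string a page would observe via document.readyState.
    pub fn as_str(self) -> &'static str {
        match self {
            ReadyState::Loading => "loading",
            ReadyState::Interactive => "interactive",
            ReadyState::Complete => "complete",
        }
    }

    /// The next state, or None once the document is complete.
    pub fn next(self) -> Option<ReadyState> {
        match self {
            ReadyState::Loading => Some(ReadyState::Interactive),
            ReadyState::Interactive => Some(ReadyState::Complete),
            ReadyState::Complete => None,
        }
    }

    /// The lifecycle event fired after entering this state, if any.
    pub fn follow_up_event(self) -> Option<&'static str> {
        match self {
            ReadyState::Loading => None,
            ReadyState::Interactive => Some("DOMContentLoaded"),
            ReadyState::Complete => Some("load"),
        }
    }
}
```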

Phase 5: Scripting & Script Loading

  • <script> inline execution
  • <script src> external execution
  • async scripts
  • defer scripts
  • module scripts (optional if strictly “HTML5-era” only; recommended for modern web)
  • document.write() (including parser insertion + blocking semantics)
  • noscript behavior when scripting disabled

Phase 6: HTML Elements (Parsing + DOM + Default Behaviors)

6.1 Document metadata / structure

  • html, head, body
  • title, base, link, meta, style
    • Current status: link and style are integrated into stylesheet loading; style is hidden in rendering.
    • Missing pipeline steps: title document integration, <base> behavior, and metadata APIs.
  • template (contents DocumentFragment + inert parsing)

6.2 Sectioning / semantics

  • header, footer, main, nav, section, article, aside
  • h1-h6, hgroup (legacy), address
  • div, span

6.3 Text content

  • p, pre, blockquote
  • ol, ul, li
  • dl, dt, dd
  • figure, figcaption
  • hr

6.4 Inline text semantics

  • a (linking + navigation)
  • em, strong, small, s
  • cite, q
    • Current status: generic inline rendering exists; cite gets UA italics.
    • Missing pipeline steps: q quotation/default behavior.
  • dfn, abbr
  • data, time
  • code, var, samp, kbd
  • sub, sup
  • i, b, u
  • mark
  • bdi, bdo, ruby, rt, rp
  • br, wbr
    • Current status: br is parsed as a void element.
    • Missing pipeline steps: verified line-break behavior for br and soft-wrap behavior for wbr.

6.5 Edits

  • ins, del

6.6 Embedded content

  • img (+ srcset/sizes optional but expected today)
  • picture (optional but strongly recommended)
  • iframe
  • embed, object, param
    • Current status: object[data] can render image payloads and otherwise falls back to children.
    • Missing pipeline steps: embed, param, and non-image object behavior.
  • video, audio, source, track (see Phase 10)
  • canvas (see Phase 11)
  • svg integration points (optional but expected)
  • math integration points (optional)

6.7 Tabular data

  • table, caption, colgroup, col, tbody, thead, tfoot, tr, td, th
    • Current status: table, caption, colgroup, col, tbody, thead, tfoot, tr, td, and th are end-to-end through layout and paint.
    • colgroup/col support: width hints and span attributes. Deferred: column backgrounds, visibility: collapse, column borders in collapsed model.
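
The span expansion and colgroup-to-col width inheritance described above can be sketched as follows. The types here (`WidthHint`, `Col`, `ColGroup`) and the function `expand_columns` are hypothetical and are not the `ColumnInfo` struct from the commit; the sketch assumes that a `colgroup` with `col` children ignores its own `span`, and that a `col` without a width falls back to its group's width.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum WidthHint {
    Auto,
    Px(f32),
    Percent(f32),
}

pub struct Col {
    pub span: u32,
    pub width: WidthHint,
}

pub struct ColGroup {
    pub span: u32,
    pub width: WidthHint,
    pub cols: Vec<Col>,
}

/// Expands groups into one width hint per table column.
pub fn expand_columns(groups: &[ColGroup]) -> Vec<WidthHint> {
    let mut out = Vec::new();
    for group in groups {
        if group.cols.is_empty() {
            // No <col> children: the group's own span decides the column count.
            for _ in 0..group.span.max(1) {
                out.push(group.width);
            }
        } else {
            for col in &group.cols {
                // A <col> without its own width inherits the group's width.
                let width = if col.width == WidthHint::Auto {
                    group.width
                } else {
                    col.width
                };
                for _ in 0..col.span.max(1) {
                    out.push(width);
                }
            }
        }
    }
    out
}
```

Table layout could then treat entry `i` of the result as the hint for column `i`, applying fixed hints before percentage hints as the commit message describes.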

6.8 Forms content (see Phase 8/9)

  • form, label, input, button, select, datalist, optgroup, option, textarea
    • Current status: form, label, input, button, select, optgroup, option, and textarea have partial parse/layout/runtime support.
    • Missing pipeline steps: datalist, full control behavior, full submission semantics, and validation plumbing.
  • fieldset, legend, output, progress, meter
    • Current status: fieldset/legend have UA styling and generic rendering.
    • Missing pipeline steps: output, progress, and meter behavior/rendering.

6.9 Interactive

  • details, summary
  • dialog (optional but common)
  • menu (largely obsolete; safe parsing with minimal support)

Phase 7: Attributes, Reflecting, and “Content Attributes”

  • Global attributes: id, class, style, title, lang, dir, hidden, tabindex, contenteditable, draggable, spellcheck
    • Current status: id, class, and style are wired; lang/dir participate in selector matching; contenteditable is selector-only.
    • Missing pipeline steps: hidden, tabindex, draggable, spellcheck, and web-exposed behavior/reflection.
  • URL attributes resolution (href/src/action/poster/etc.)
    • Current status: href/src/action resolve against the page URL in navigation/resource-loading paths.
    • Missing pipeline steps: <base> integration and broader DOM/IDL URL reflection.
  • Boolean attribute semantics (present/absent)
  • Reflecting IDL attributes for major elements (e.g., HTMLAnchorElement.href)
  • dataset support (data-* attributes)
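
Two of the items above are small enough to sketch directly: boolean attributes are true whenever the attribute is present, regardless of its value, and `dataset` keys are derived from `data-*` names by turning each hyphen followed by a lowercase ASCII letter into that letter in uppercase. The function names and the `(name, value)` slice representation are assumptions made for illustration.

```rust
/// A boolean attribute is "set" if it is present at all; its value is irrelevant,
/// so disabled="false" still counts as disabled.
pub fn is_boolean_attr_set(attrs: &[(&str, &str)], name: &str) -> bool {
    attrs.iter().any(|(k, _)| k.eq_ignore_ascii_case(name))
}

/// Maps a content attribute name such as "data-foo-bar" to its dataset key "fooBar".
/// Returns None for names that do not start with "data-".
pub fn dataset_key(attr_name: &str) -> Option<String> {
    let rest = attr_name.strip_prefix("data-")?;
    let mut out = String::with_capacity(rest.len());
    let mut chars = rest.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '-' {
            if let Some(&next) = chars.peek() {
                if next.is_ascii_lowercase() {
                    // Drop the hyphen and uppercase the following letter.
                    out.push(next.to_ascii_uppercase());
                    chars.next();
                    continue;
                }
            }
        }
        out.push(c);
    }
    Some(out)
}
```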

Phase 8: Forms (Structure + Submission)

  • Form ownership rules (including form attribute)
  • Successful controls rules
  • Form submission:
    • GET
    • POST (application/x-www-form-urlencoded)
    • multipart/form-data (file inputs)
    • submit event + preventDefault
      • Current status: button/input clicks can trigger GET/POST submission, and click preventDefault() suppresses default submission.
      • Missing pipeline steps: actual submit event dispatch and full successful-controls processing.
  • Form-associated custom elements plumbing (optional if CE implemented)
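
For the `application/x-www-form-urlencoded` item above, the following is a minimal serializer sketch. It assumes the caller has already collected the successful controls into name/value pairs, keeps ASCII alphanumerics and `*`, `-`, `.`, `_` unescaped, writes spaces as `+`, and percent-encodes every other UTF-8 byte. Newline normalization and charset selection are left out, and the function name is hypothetical.

```rust
/// Serializes name/value pairs as application/x-www-form-urlencoded.
pub fn urlencode_form(pairs: &[(&str, &str)]) -> String {
    fn encode_into(out: &mut String, s: &str) {
        for &b in s.as_bytes() {
            match b {
                b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'*' | b'-' | b'.' | b'_' => {
                    out.push(b as char)
                }
                b' ' => out.push('+'),
                // Everything else, including non-ASCII UTF-8 bytes, is percent-encoded.
                _ => out.push_str(&format!("%{:02X}", b)),
            }
        }
    }

    let mut out = String::new();
    for (i, (name, value)) in pairs.iter().enumerate() {
        if i > 0 {
            out.push('&');
        }
        encode_into(&mut out, name);
        out.push('=');
        encode_into(&mut out, value);
    }
    out
}
```

For a GET submission the result would become the query string of the action URL; for a POST it would be the request body.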

Phase 9: Constraint Validation & Input Types

  • constraint validation API (checkValidity/reportValidity/validity)
  • required/disabled/readonly handling
  • pattern/min/max/step/minlength/maxlength
  • Input types:
    • text, search, tel, url, email, password
    • checkbox, radio
    • submit, reset, button
    • number, range
    • date, time, datetime-local, month, week
    • color
    • file
    • hidden
  • select/option default selection rules
    • Current status: collapsed <select> rendering picks the selected option or the first option.
    • Missing pipeline steps: full DOM/value/selection behavior.
  • textarea value/selection APIs
  • datalist suggestions (optional UI, but DOM behavior required)

Phase 10: Media Elements (Audio/Video)

  • audio/video element core
  • Media resource selection algorithm (source elements + type sniffing)
  • Playback state machine (paused/seeking/ended)
  • Media events (play/pause/timeupdate/canplay/etc.)
  • Controls attribute (basic UI optional; API required)
  • track element parsing + TextTrack plumbing (can start minimal)

Phase 11: Canvas

  • canvas element sizing + fallback content
  • 2D context:
    • path APIs (beginPath/moveTo/lineTo/arc/rect/etc.)
    • fill/stroke styles
    • transforms
    • text drawing (fillText/strokeText/measureText)
    • image drawing (drawImage)
    • pixel APIs (getImageData/putImageData)
  • toDataURL / toBlob

Phase 12: Navigation, History, and Location

  • Location API (href/assign/replace/reload)
  • History API:
    • back/forward/go
      • Current status: internal browsing-context history exists for the app shell.
      • Missing pipeline steps: web-exposed window.history API.
    • pushState/replaceState
    • popstate event
  • Fragment navigation (#hash) + scroll to element
  • target=_blank/window browsing context basics (can be single-window at first, but model should exist)

Phase 13: Loading, Preload Scanner, and Fetch Integration (HTML-facing)

  • Resource prioritization basics (parser-blocking vs async)
    • Current status: classic scripts are loaded/executed synchronously after parsing.
    • Missing pipeline steps: explicit parser-blocking/async/defer scheduling model.
  • preload scanner for <script> (optional early, important later)
  • CORS mode plumbing for script/img/media where applicable (even if enforcement is minimal initially)
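
The missing parser-blocking/async/defer scheduling model could start from a classification step like the one below. This is a sketch for parser-inserted scripts only, with hypothetical type and function names: inline classic scripts ignore `async` and `defer`, external classic scripts block the parser unless one of those attributes is present, and module scripts are deferred unless marked `async`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ScriptSchedule {
    /// Fetch (if external) and execute before the parser continues.
    ParserBlocking,
    /// Execute in document order after parsing finishes.
    Defer,
    /// Execute as soon as the script is available, in no guaranteed order.
    Async,
}

pub struct ScriptAttrs {
    pub has_src: bool,
    pub is_async: bool,
    pub is_defer: bool,
    pub is_module: bool,
}

pub fn classify(attrs: &ScriptAttrs) -> ScriptSchedule {
    if attrs.is_module {
        // Module scripts never block the parser.
        if attrs.is_async {
            ScriptSchedule::Async
        } else {
            ScriptSchedule::Defer
        }
    } else if !attrs.has_src {
        // async/defer have no effect on inline classic scripts.
        ScriptSchedule::ParserBlocking
    } else if attrs.is_async {
        ScriptSchedule::Async
    } else if attrs.is_defer {
        ScriptSchedule::Defer
    } else {
        ScriptSchedule::ParserBlocking
    }
}
```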

Phase 14: Security Model Basics (HTML-level)

  • Same-origin policy hooks for:
    • iframe access
    • window.opener relationships
  • sandbox attribute parsing + enforcement hooks (partial ok early, full later)
  • Content Security Policy integration hooks (optional but common)
  • Referrer policy (optional but common)
    • Current status: dangerous URL schemes are blocked for links/forms.
    • Missing pipeline steps: same-origin, sandbox, CSP, and referrer-policy model.

Phase 15: Internationalization & Text Direction

  • lang/dir propagation
    • Current status: :lang() and :dir() selector matching works via inherited attribute lookup.
    • Missing pipeline steps: actual layout/rendering text-direction propagation.
  • dir=auto behavior (optional but helpful)
  • basic bidi text rendering integration (with CSS direction/unicode-bidi)

Phase 16: Editing APIs (Optional but often expected)

  • contenteditable basics (DOM behavior)
    • Current status: only selector matching/plumbing exists.
    • Missing pipeline steps: editing behavior and DOM interaction model.
  • execCommand is legacy (safe to ignore), but ensure pages don't crash
  • selection + ranges (see below)

Phase 17: Selection and Ranges (Needed for many pages)

  • Range API (createRange/setStart/setEnd/cloneContents/etc.)
  • Selection API (window.getSelection, ranges, basic editing interactions)
  • caret browsing basics (optional)

Phase 18: Web Components (Not “HTML5 classic”, but modern “full HTML” expectation)

  • Custom Elements v1 (define/upgrade/lifecycle)
  • Shadow DOM (attachShadow, slots)
  • HTML templates + cloning integration
  • Scoped event retargeting + composed paths

Phase 19: Storage & Offline (Commonly expected)

  • Web Storage: localStorage/sessionStorage
  • IndexedDB (bigger; optional but strongly expected for modern web)
  • Service Workers (large; optional unless targeting modern “full web”)

Phase 20: Workers & Messaging (Commonly expected)

  • postMessage between windows/frames
  • Web Workers (DedicatedWorker)
  • MessageChannel/MessagePort

Phase 21: Final “Full HTML5” Exit Criteria

  • HTML parser passes WPT parsing tests at target threshold
  • DOM + Events pass core WPT suites at target threshold
  • Forms + validation pass WPT at target threshold
  • Media/Canvas pass WPT at target threshold (or documented exclusions)
  • No-crash guarantee on malformed HTML/DOM operations
  • Publish a conformance report with pass rates + remaining gaps