rust_browser/_bmad-output/implementation-artifacts/2-5-dom-query-apis-and-live-collections.md
Zachary D. Rowitsch 48313ea109 Implement DOM query APIs and live collections with code review fixes (§4.4, §4.2.6)
Add querySelector, querySelectorAll, getElementsByTagName, and getElementsByClassName
on Document, Element, and DocumentFragment. Live HTMLCollections re-evaluate on every
access. Code review fixed: collection persistence across script invocations, multi-class
getElementsByClassName matching, DocumentFragment query support, and added 17 integration tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 23:53:40 -04:00

Story 2.5: DOM Query APIs & Live Collections

Status: done

Story

As a web developer using JavaScript, I want to query the DOM by CSS selectors and access live element collections, so that scripts can efficiently find and track elements in the document.

Acceptance Criteria

  1. document.querySelector(selector) with a valid CSS selector returns the first matching element, or null if no match exists.

  2. document.querySelectorAll(selector) returns a static NodeList of all matching elements in document order.

  3. document.getElementsByTagName(name) and document.getElementsByClassName(name) return live HTMLCollections that automatically reflect DOM changes (elements added/removed).

  4. Complex CSS selectors (combinators, pseudo-classes like :first-child, :nth-child, attribute selectors) work correctly with querySelector/querySelectorAll, using the existing selector matching engine.

  5. Integration tests verify query APIs and live collection behavior. DOM checklist updated. just ci passes.

Tasks / Subtasks

  • Task 1: Wire querySelector/querySelectorAll to JavaScript (AC: #1, #2, #4)

    • 1.1 Add selectors crate as a dependency of web_api (if not already) — the SelectorEngine with query() and query_all() methods already exists and works
    • 1.2 Wire "querySelector" in host_environment.rs call_method for "Document" type:
      • Extract selector string from args
      • Parse with Selector::parse(selector_text)
      • Call SelectorEngine::query(doc, &selector) → returns Option<NodeId>
      • Return JsValue::HostObject or JsValue::Null
    • 1.3 Wire "querySelectorAll" in call_method for "Document":
      • Parse selector, call SelectorEngine::query_all(doc, &selector) → returns Vec<NodeId>
      • Return JsValue::Array of JsValue::HostObject elements (static snapshot — spec says querySelectorAll returns a static NodeList)
    • 1.4 Wire "querySelector" and "querySelectorAll" on "Element" type too — scope the query to descendants of that element (filter query_all results to descendants, or iterate element's subtree)
    • 1.5 Handle invalid selectors: if Selector::parse() returns None, return Err(RuntimeError) with SyntaxError message
    • 1.6 Unit tests: querySelector for tag, class, id, combinators, pseudo-classes; querySelectorAll returning multiple results; scoped element queries; invalid selector error
  • Task 2: Implement HTMLCollection for live collections (AC: #3)

    • 2.1 Create a LiveCollection type (in web_api or dom crate) that stores the query parameters:
      enum LiveCollectionQuery {
          TagName(String),
          ClassName(String),
      }
      struct LiveCollection {
          root: NodeId,  // scoping root (document or element)
          query: LiveCollectionQuery,
      }
      
    • 2.2 Implement LiveCollection::evaluate(&self, doc: &Document) -> Vec<NodeId> — re-queries the document on each call using existing get_elements_by_tag_name() / get_elements_by_class_name()
    • 2.3 Store live collections in DomHost with a registry: collections: Vec<LiveCollection> indexed by collection ID
    • 2.4 Wire "getElementsByTagName" in call_method for "Document":
      • Create LiveCollection with TagName(tag) query
      • Store in collections registry
      • Return JsValue::HostObject { id: collection_id, type_name: "HTMLCollection" }
    • 2.5 Wire "getElementsByClassName" similarly
    • 2.6 Handle "HTMLCollection" in get_property:
      • "length" → evaluate collection, return count as JsValue::Number
      • Numeric index (e.g., "0", "1") → evaluate collection, return element at index or JsValue::Undefined
    • 2.7 Handle "HTMLCollection" in call_method:
      • "item" → evaluate collection, return element at index arg or JsValue::Null
    • 2.8 Also wire getElementsByTagName/getElementsByClassName on "Element" type — scope to descendants
  • Task 3: Expose existing query methods not yet in JS (AC: #3)

    • 3.1 Verify "getElementById" is already wired (it is — just confirm still works)
    • 3.2 Add "Element" support for querySelector/querySelectorAll if not done in Task 1.4
    • 3.3 Ensure getElementsByTagName("*") returns all elements (wildcard per spec)
  • Task 4: Tests and documentation (AC: #5)

    • 4.1 JS binding tests for querySelector: simple selectors, combinators, pseudo-classes, no-match returns null
    • 4.2 JS binding tests for querySelectorAll: multiple matches in document order, empty results
    • 4.3 JS binding tests for live collections:
      • Get collection, check length
      • Add element to DOM, re-check length (should increase)
      • Remove element from DOM, re-check length (should decrease)
    • 4.4 Integration test: JS script that queries DOM, modifies it, verifies live collection updates
    • 4.5 Golden test if query results affect rendering (unlikely but check)
    • 4.6 Update docs/HTML5_Implementation_Checklist.md — check off querySelector, querySelectorAll, live collection items
    • 4.7 Run just ci and ensure all tests pass

Dev Notes

The Selector Engine Already Exists

The selectors crate (crates/selectors/src/) provides a complete, battle-tested CSS selector matching engine. Do NOT reimplement selector matching. The engine already supports:

  • All simple selectors: tag, class, id, universal, all attribute selector variants
  • All combinators: descendant (space), child (>), adjacent sibling (+), general sibling (~)
  • 30+ pseudo-classes: :first-child, :last-child, :nth-child(An+B), :not(), :is(), :where(), :has(), :empty, :checked, :disabled, :enabled, :root, :lang(), :dir(), and more
  • Pseudo-elements: ::before, ::after, ::first-line, ::first-letter

Key API in crates/selectors/src/engine.rs:

// Signatures only; method bodies omitted.
impl SelectorEngine {
    pub fn query(&self, doc: &Document, selector: &Selector) -> Option<NodeId>
    pub fn query_all(&self, doc: &Document, selector: &Selector) -> Vec<NodeId>
    pub fn matches(&self, doc: &Document, node_id: NodeId, selector: &Selector) -> bool
}

Selector parsing in crates/selectors/src/selector.rs:

// Signature only; returns None when the input is not a valid selector.
impl Selector {
    pub fn parse(input: &str) -> Option<Selector>
}
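Because parse returns an Option, the binding layer has to turn None into the SyntaxError required by Task 1.5. The sketch below shows one way to do that. The `Selector` here is a toy stand-in that only accepts bare tag names, and both `RuntimeError` with a single `message` field and the helper `parse_or_syntax_error` are assumptions made for illustration, not the crate's actual definitions.

```rust
/// Hypothetical stand-in for the real `Selector`; it only accepts a bare tag name.
#[derive(Debug, PartialEq)]
struct Selector(String);

impl Selector {
    /// Mirrors the `Option`-returning shape of `Selector::parse`.
    fn parse(input: &str) -> Option<Selector> {
        let trimmed = input.trim();
        let valid = !trimmed.is_empty()
            && trimmed.chars().all(|c| c.is_ascii_alphanumeric() || c == '-');
        if valid {
            Some(Selector(trimmed.to_string()))
        } else {
            None
        }
    }
}

/// Hypothetical error type; the real `RuntimeError` may carry more fields.
#[derive(Debug, PartialEq)]
struct RuntimeError {
    message: String,
}

/// Converts a parse failure into an error whose message names `SyntaxError`.
fn parse_or_syntax_error(selector_text: &str) -> Result<Selector, RuntimeError> {
    Selector::parse(selector_text).ok_or_else(|| RuntimeError {
        message: format!("SyntaxError: '{}' is not a valid selector", selector_text),
    })
}
```

With this shape, each query method can call the helper with `?` and let the error propagate to the script as a thrown exception.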

Element-Scoped Queries

querySelector and querySelectorAll on an Element should only return descendants of that element. Two approaches:

Option A (simple): Use SelectorEngine::query_all() on the full document, then filter to descendants of the element. Requires an is_descendant_of(node, ancestor, doc) helper.

Option B (efficient): Implement a subtree iterator on Document and use selector.matches() against each descendant. Better for large documents.

Choose Option B if performance matters, Option A for simplicity. The existing iter_tree() iterates the full document — a scoped variant iterating from a specific root would be ideal.
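Option B can be sketched on a toy arena DOM as follows. This is an illustration under stated assumptions: `Node`, `Document`, `subtree`, and `query_all_scoped` are hypothetical names, and a plain closure stands in for the selector engine's `matches()` call.

```rust
/// Toy arena DOM; `NodeId` is an index into `Document::nodes`.
type NodeId = usize;

struct Node {
    tag: String,
    children: Vec<NodeId>,
}

struct Document {
    nodes: Vec<Node>,
}

impl Document {
    fn new() -> Self {
        Document { nodes: Vec::new() }
    }

    /// Creates a node and, if a parent is given, appends it as the last child.
    fn add(&mut self, tag: &str, parent: Option<NodeId>) -> NodeId {
        let id = self.nodes.len();
        self.nodes.push(Node { tag: tag.to_string(), children: Vec::new() });
        if let Some(p) = parent {
            self.nodes[p].children.push(id);
        }
        id
    }

    /// Descendants of `root` in document (pre-order) order, excluding `root` itself.
    fn subtree(&self, root: NodeId) -> Vec<NodeId> {
        let mut out = Vec::new();
        // Push children in reverse so the first child is popped first.
        let mut stack: Vec<NodeId> = self.nodes[root].children.iter().rev().copied().collect();
        while let Some(id) = stack.pop() {
            out.push(id);
            stack.extend(self.nodes[id].children.iter().rev().copied());
        }
        out
    }
}

/// Option B: test each descendant against a predicate standing in for `matches()`.
fn query_all_scoped(
    doc: &Document,
    root: NodeId,
    matches: impl Fn(&Node) -> bool,
) -> Vec<NodeId> {
    doc.subtree(root)
        .into_iter()
        .filter(|&id| matches(&doc.nodes[id]))
        .collect()
}
```

The scoping root is excluded from the results, which matches the requirement that element queries return descendants only. A scoped querySelector could take the first item of the same traversal instead of collecting all matches.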

Live Collection Design

Live collections must always reflect the current state of the DOM, so the simplest correct implementation re-evaluates the query on every access:

// On each .length or .item(n) access:
// (Document-wide sketch: an element-scoped collection must also restrict
// the results to descendants of self.root.)
fn evaluate(&self, doc: &Document) -> Vec<NodeId> {
    match &self.query {
        LiveCollectionQuery::TagName(tag) => doc.get_elements_by_tag_name(tag),
        LiveCollectionQuery::ClassName(cls) => doc.get_elements_by_class_name(cls),
    }
}

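This is correct and matches the observable behavior of browsers. Performance optimization (caching with dirty-flag invalidation) is unnecessary for now. Browser engines generally do cache live collections and invalidate them on DOM mutation, but what scripts observe is the same as re-evaluating on each access.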
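To make the re-evaluation concrete, here is a minimal sketch on a toy flat document. The `Document` below is a hypothetical stand-in (a list of tag names with an "attached" flag), only the TagName variant is modeled, and the `length` and `item` helpers are assumed names for what the property and method bindings would call.

```rust
type NodeId = usize;

/// Toy flat document: each entry is (tag name, attached to the tree?).
struct Document {
    elements: Vec<(String, bool)>,
}

impl Document {
    /// Stand-in for the real `get_elements_by_tag_name`; "*" matches every element.
    fn get_elements_by_tag_name(&self, tag: &str) -> Vec<NodeId> {
        self.elements
            .iter()
            .enumerate()
            .filter(|(_, (t, attached))| *attached && (tag == "*" || t.as_str() == tag))
            .map(|(id, _)| id)
            .collect()
    }
}

enum LiveCollectionQuery {
    TagName(String),
}

struct LiveCollection {
    query: LiveCollectionQuery,
}

impl LiveCollection {
    /// Re-runs the stored query; nothing is cached between calls.
    fn evaluate(&self, doc: &Document) -> Vec<NodeId> {
        match &self.query {
            LiveCollectionQuery::TagName(tag) => doc.get_elements_by_tag_name(tag),
        }
    }

    fn length(&self, doc: &Document) -> usize {
        self.evaluate(doc).len()
    }

    fn item(&self, doc: &Document, index: usize) -> Option<NodeId> {
        self.evaluate(doc).get(index).copied()
    }
}
```

Since the collection stores only the query, adding or detaching an element changes the next `length` or `item` result without any notification mechanism.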

Collection storage: The DomHost struct needs a Vec<LiveCollection> to store collections by ID. Collection IDs must NOT conflict with NodeId values — use a separate ID space (e.g., offset by u64::MAX / 2 or use a separate HostObject type discriminated by type_name).
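One way to keep collection IDs out of the NodeId range is the offset approach mentioned above. The sketch below assumes a hypothetical `COLLECTION_ID_BASE` constant and `CollectionRegistry` type, and it assumes node IDs always stay below the base. The `LiveCollection` here is reduced to a single field for brevity.

```rust
/// Hypothetical base for collection IDs; node IDs are assumed to stay below it.
const COLLECTION_ID_BASE: u64 = u64::MAX / 2;

/// Reduced stand-in for the real collection type.
struct LiveCollection {
    tag: String,
}

#[derive(Default)]
struct CollectionRegistry {
    collections: Vec<LiveCollection>,
}

impl CollectionRegistry {
    /// Stores a collection and returns its host-object ID in the upper half of the u64 range.
    fn register(&mut self, collection: LiveCollection) -> u64 {
        let index = self.collections.len() as u64;
        self.collections.push(collection);
        COLLECTION_ID_BASE + index
    }

    /// Maps a host-object ID back to its collection; IDs below the base resolve to `None`.
    fn get(&self, id: u64) -> Option<&LiveCollection> {
        let index = id.checked_sub(COLLECTION_ID_BASE)?;
        self.collections.get(index as usize)
    }
}
```

Discriminating by type_name instead would avoid the reserved range entirely, at the cost of checking the type string on every lookup.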

Static vs Live Return Types

Per spec:

  • querySelectorAll() → static NodeList (snapshot, doesn't update)
  • getElementsByTagName() → live HTMLCollection (updates automatically)
  • getElementsByClassName() → live HTMLCollection (updates automatically)
  • querySelector() → single Element or null (no collection)

For querySelectorAll, returning a JsValue::Array of elements is sufficient — it's a snapshot by definition.

Architecture Constraints

  • Layer 1: selectors crate is Layer 1, same as dom — cross-dependency is allowed
  • web_api already depends on dom — adding selectors dependency is fine (both Layer 1)
  • No unsafe — enforced by CI
  • Spec citations: // DOM §4.4 for querySelector, // HTML §4.10 for getElementsBy* (the latter was corrected to DOM §4.4 during code review, see fix L2)

What NOT to Change

  • Do NOT modify the selector matching engine — it's complete and working
  • Do NOT add matches() JS method on Element — not in scope (could be added later)
  • Do NOT implement NodeList as a separate type — for querySelectorAll, a JS Array suffices
  • Do NOT add closest(), contains(), compareDocumentPosition() — future work

Files to Modify

  • crates/web_api/src/dom_host/host_environment.rs — wire querySelector, querySelectorAll, getElementsByTagName, getElementsByClassName to JS; handle HTMLCollection type
  • crates/web_api/src/dom_host/ — possibly new file live_collection.rs for LiveCollection type
  • crates/web_api/Cargo.toml — add selectors dependency if not present
  • crates/web_api/src/dom_host/tests/dom_tests.rs — JS binding tests
  • docs/HTML5_Implementation_Checklist.md — update checked items

Previous Story Intelligence

From Story 2.4:

  • JS binding pattern established: call_method dispatch on obj_type + method name
  • HostObject pattern: { id: u64, type_name: String }
  • DocumentFragment type added — similar pattern for HTMLCollection (new HostObject type)
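The dispatch pattern, applied to the new HTMLCollection host type, might look roughly like the sketch below. It is a simplified illustration: this `JsValue` has only four variants, `Host` holds the collection's current members directly instead of evaluating a query, and the error type is a plain String. The real signatures in host_environment.rs may differ.

```rust
#[derive(Debug, Clone, PartialEq)]
enum JsValue {
    Null,
    Undefined,
    Number(f64),
    HostObject { id: u64, type_name: String },
}

/// Toy host: one HTMLCollection whose current members are given directly.
struct Host {
    collection_members: Vec<u64>,
}

impl Host {
    fn element(id: u64) -> JsValue {
        JsValue::HostObject { id, type_name: "Element".to_string() }
    }

    /// Property reads: `length` and numeric indices; an out-of-range index yields Undefined.
    fn get_property(&self, obj_type: &str, name: &str) -> JsValue {
        match (obj_type, name) {
            ("HTMLCollection", "length") => {
                JsValue::Number(self.collection_members.len() as f64)
            }
            ("HTMLCollection", index) => index
                .parse::<usize>()
                .ok()
                .and_then(|i| self.collection_members.get(i))
                .map(|&id| Host::element(id))
                .unwrap_or(JsValue::Undefined),
            _ => JsValue::Undefined,
        }
    }

    /// Method calls: `item(n)` yields Null when out of range.
    /// Sketch simplification: a missing or negative argument also yields Null.
    fn call_method(
        &self,
        obj_type: &str,
        method: &str,
        args: &[JsValue],
    ) -> Result<JsValue, String> {
        match (obj_type, method) {
            ("HTMLCollection", "item") => {
                let member = match args.first() {
                    Some(JsValue::Number(n)) if *n >= 0.0 => {
                        self.collection_members.get(*n as usize)
                    }
                    _ => None,
                };
                Ok(member.map(|&id| Host::element(id)).unwrap_or(JsValue::Null))
            }
            _ => Err(format!("TypeError: {}.{} is not a function", obj_type, method)),
        }
    }
}
```

The difference between indexed access (Undefined when out of range) and item() (Null when out of range) follows Tasks 2.6 and 2.7.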

From Epic 1:

  • Code review catches edge cases
  • Always update checklists at the end

Testing Strategy

  • JS binding tests in crates/web_api/src/dom_host/tests/dom_tests.rs
  • Key test scenarios:
    • querySelector("div.active") → first matching element
    • querySelector("#nonexistent") → null
    • querySelectorAll("p") → array of all <p> elements
    • querySelectorAll(".cls") → empty array if no matches
    • element.querySelector("span") → scoped to element descendants
    • getElementsByTagName("div") → collection; add div → length increases
    • getElementsByClassName("active") → collection; remove class → length decreases
    • getElementsByTagName("*") → all elements
    • Invalid selector → SyntaxError thrown

References

  • DOM Living Standard §4.4 — Node interface (querySelector/querySelectorAll)
  • DOM Living Standard §4.2.6 — HTMLCollection
  • [Source: crates/selectors/src/engine.rs] — SelectorEngine with query/query_all
  • [Source: crates/selectors/src/selector.rs] — Selector parsing and matching (~56KB)
  • [Source: crates/selectors/src/types.rs] — SimpleSelector, Combinator, PseudoClass enums
  • [Source: crates/dom/src/document.rs] — existing get_element_by_id, get_elements_by_tag_name, etc.
  • [Source: crates/web_api/src/dom_host/host_environment.rs] — JS binding dispatch
  • [Source: docs/HTML5_Implementation_Checklist.md] — checklist to update

Dev Agent Record

Agent Model Used

Claude Opus 4.6 (1M context)

Debug Log References

N/A — clean implementation, no debug issues encountered.

Completion Notes List

  • Implemented querySelector and querySelectorAll on both Document and Element, leveraging existing selectors crate SelectorEngine
  • Element-scoped queries use new iter_subtree/iter_subtree_elements methods on Document
  • Created LiveCollection type with LiveCollectionQuery enum supporting TagName and ClassName variants
  • Live collections re-evaluate on every access (.length, index, .item()), matching browser behavior
  • getElementsByTagName("*") wildcard correctly returns all elements
  • querySelectorAll returns static JS arrays (snapshot), getElementsBy* return live HTMLCollections
  • Invalid selectors produce SyntaxError
  • Added JsObject re-export from js crate for array construction
  • Updated two integration tests that previously tested querySelector/querySelectorAll as "unknown methods"
  • Updated HTML5 Implementation Checklist with new query API and live collection support
  • All 19 new unit tests pass; all existing tests pass; just ci passes

Code Review Fixes (2026-03-14)

  • H1 Fixed: Added 17 integration tests in tests/js_dom_tests.rs covering querySelector, querySelectorAll, getElementsByTagName, getElementsByClassName, live collection add/remove, multi-class queries, element-scoped queries, cross-script collection persistence, and error cases
  • H2 Fixed: Live collection registry (Vec<LiveCollection>) moved from DomHost (recreated per script) to WebApiFacade (persisted) — collections now survive across execute_script / tick / event dispatch calls
  • M1 Fixed: getElementsByClassName("foo bar") now correctly matches elements with ALL specified classes (was incorrectly treating space-separated names as a single class)
  • M2 Fixed: DocumentFragment now supports querySelector/querySelectorAll via delegation to Element handling (ParentNode mixin)
  • M3 Fixed: js_array_from uses into_iter() instead of iter().clone() to avoid unnecessary allocations
  • L1: SelectorEngine::new() creation per-call noted but not changed — constructor is trivial
  • L2 Fixed: Corrected spec comments from HTML §4.10 to DOM §4.4 for getElementsByTagName/getElementsByClassName
  • L3 Fixed: Added debug_assert! guard to prevent collection ID overflow into promise ID space
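The multi-class rule from fix M1 can be illustrated with a small standalone function. This is a sketch, not the code in live_collection.rs: `matches_all_classes` is a hypothetical name, it works on raw class attribute strings, and it compares class names case-sensitively (quirks-mode documents would need ASCII case-insensitive comparison).

```rust
/// Returns true when `element_classes` (an element's class attribute value)
/// contains every token of `query`. Tokens are split on ASCII whitespace.
fn matches_all_classes(element_classes: &str, query: &str) -> bool {
    let mut wanted = query.split_ascii_whitespace().peekable();
    // An empty or whitespace-only query matches nothing.
    if wanted.peek().is_none() {
        return false;
    }
    wanted.all(|token| {
        element_classes
            .split_ascii_whitespace()
            .any(|class| class == token)
    })
}
```

Under this rule, a query of "foo bar" matches an element with class "bar foo baz" but not one with class "foo" alone, and "foo" never matches the single class "foobar".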

Change Log

  • 2026-03-14: Story 2.5 implemented — DOM query APIs (querySelector, querySelectorAll, getElementsByTagName, getElementsByClassName) and live HTMLCollection support
  • 2026-03-14: Code review fixes: 8 issues addressed (2 HIGH, 3 MEDIUM, 3 LOW); 7 were fixed, and L1 was noted and left unchanged

File List

  • crates/web_api/Cargo.toml (modified — added selectors dependency)
  • crates/web_api/src/dom_host/mod.rs (modified — collections field changed from owned Vec to &mut reference; added debug_assert guard; added clippy allow)
  • crates/web_api/src/dom_host/host_environment.rs (modified — wired querySelector, querySelectorAll, getElementsByTagName, getElementsByClassName on Document, Element, and DocumentFragment; fixed js_array_from cloning; fixed spec comments)
  • crates/web_api/src/dom_host/live_collection.rs (new — LiveCollection and LiveCollectionQuery types with evaluate method; fixed multi-class matching)
  • crates/web_api/src/lib.rs (modified — added collections field to WebApiFacade; threaded through all DomHost creation sites)
  • crates/web_api/src/event_dispatch.rs (modified — threaded collections parameter through dispatch_event and invoke_listeners)
  • crates/dom/src/document.rs (modified — added iter_subtree and iter_subtree_elements methods; fixed get_elements_by_class_name multi-class support)
  • crates/js/src/lib.rs (modified — added JsObject re-export)
  • crates/web_api/src/dom_host/tests/dom_tests.rs (modified — 19 new unit tests for query APIs and live collections)
  • crates/web_api/src/dom_host/tests/mod.rs (modified — updated test helper with collections parameter)
  • crates/web_api/src/dom_host/tests/event_tests.rs (modified — updated for collections parameter)
  • crates/web_api/src/dom_host/tests/promise_tests.rs (modified — updated for collections parameter)
  • crates/web_api/src/dom_host/tests/scheduling_tests.rs (modified — updated for collections parameter)
  • crates/web_api/src/dom_host/tests/window_tests.rs (modified — updated for collections parameter)
  • tests/js_dom_tests.rs (modified — 17 new integration tests for query APIs and live collections; updated 2 existing tests)
  • docs/HTML5_Implementation_Checklist.md (modified — checked off query API and live collection items)