Files
rust_browser/_bmad-output/implementation-artifacts/2-5-dom-query-apis-and-live-collections.md
Zachary D. Rowitsch 48313ea109 Implement DOM query APIs and live collections with code review fixes (§4.4, §4.2.6)
Add querySelector, querySelectorAll, getElementsByTagName, and getElementsByClassName
on Document, Element, and DocumentFragment. Live HTMLCollections re-evaluate on every
access. Code review fixed: collection persistence across script invocations, multi-class
getElementsByClassName matching, DocumentFragment query support, and added 17 integration tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 23:53:40 -04:00

257 lines
16 KiB
Markdown

# Story 2.5: DOM Query APIs & Live Collections
Status: done
## Story
As a web developer using JavaScript,
I want to query the DOM by CSS selectors and access live element collections,
So that scripts can efficiently find and track elements in the document.
## Acceptance Criteria
1. **`document.querySelector(selector)`** with a valid CSS selector returns the first matching element, or `null` if no match exists.
2. **`document.querySelectorAll(selector)`** returns a static NodeList of all matching elements in document order.
3. **`document.getElementsByTagName(name)` and `document.getElementsByClassName(name)`** return live HTMLCollections that automatically reflect DOM changes (elements added/removed).
4. **Complex CSS selectors** (combinators, pseudo-classes like `:first-child`, `:nth-child`, attribute selectors) work correctly with querySelector/querySelectorAll, using the existing selector matching engine.
5. **Integration tests** verify query APIs and live collection behavior. DOM checklist updated. `just ci` passes.
## Tasks / Subtasks
- [x] Task 1: Wire querySelector/querySelectorAll to JavaScript (AC: #1, #2, #4)
- [x] 1.1 Add `selectors` crate as a dependency of `web_api` (if not already) — the `SelectorEngine` with `query()` and `query_all()` methods already exists and works
- [x] 1.2 Wire `"querySelector"` in `host_environment.rs` `call_method` for `"Document"` type:
- Extract selector string from args
- Parse with `Selector::parse(selector_text)`
- Call `SelectorEngine::query(doc, &selector)` → returns `Option<NodeId>`
- Return `JsValue::HostObject` or `JsValue::Null`
- [x] 1.3 Wire `"querySelectorAll"` in `call_method` for `"Document"`:
- Parse selector, call `SelectorEngine::query_all(doc, &selector)` → returns `Vec<NodeId>`
- Return `JsValue::Array` of `JsValue::HostObject` elements (static snapshot — spec says querySelectorAll returns a static NodeList)
- [x] 1.4 Wire `"querySelector"` and `"querySelectorAll"` on `"Element"` type too — scope the query to descendants of that element (filter `query_all` results to descendants, or iterate element's subtree)
- [x] 1.5 Handle invalid selectors: if `Selector::parse()` returns `None`, return `Err(RuntimeError)` with `SyntaxError` message
- [x] 1.6 Unit tests: querySelector for tag, class, id, combinators, pseudo-classes; querySelectorAll returning multiple results; scoped element queries; invalid selector error
- [x] Task 2: Implement HTMLCollection for live collections (AC: #3)
- [x] 2.1 Create a `LiveCollection` type (in `web_api` or `dom` crate) that stores the query parameters:
```rust
enum LiveCollectionQuery {
TagName(String),
ClassName(String),
}
struct LiveCollection {
root: NodeId, // scoping root (document or element)
query: LiveCollectionQuery,
}
```
- [x] 2.2 Implement `LiveCollection::evaluate(&self, doc: &Document) -> Vec<NodeId>` — re-queries the document on each call using existing `get_elements_by_tag_name()` / `get_elements_by_class_name()`
- [x] 2.3 Store live collections in `DomHost` with a registry: `collections: Vec<LiveCollection>` indexed by collection ID
- [x] 2.4 Wire `"getElementsByTagName"` in `call_method` for `"Document"`:
- Create `LiveCollection` with `TagName(tag)` query
- Store in collections registry
- Return `JsValue::HostObject { id: collection_id, type_name: "HTMLCollection" }`
- [x] 2.5 Wire `"getElementsByClassName"` similarly
- [x] 2.6 Handle `"HTMLCollection"` in `get_property`:
- `"length"` → evaluate collection, return count as `JsValue::Number`
- Numeric index (e.g., `"0"`, `"1"`) → evaluate collection, return element at index or `JsValue::Undefined`
- [x] 2.7 Handle `"HTMLCollection"` in `call_method`:
- `"item"` → evaluate collection, return element at index arg or `JsValue::Null`
- [x] 2.8 Also wire `getElementsByTagName`/`getElementsByClassName` on `"Element"` type — scope to descendants
- [x] Task 3: Expose existing query methods not yet in JS (AC: #3)
- [x] 3.1 Verify `"getElementById"` is already wired (it is — just confirm still works)
- [x] 3.2 Add `"Element"` support for `querySelector`/`querySelectorAll` if not done in Task 1.4
- [x] 3.3 Ensure `getElementsByTagName("*")` returns all elements (wildcard per spec)
- [x] Task 4: Tests and documentation (AC: #5)
- [x] 4.1 JS binding tests for querySelector: simple selectors, combinators, pseudo-classes, no-match returns null
- [x] 4.2 JS binding tests for querySelectorAll: multiple matches in document order, empty results
- [x] 4.3 JS binding tests for live collections:
- Get collection, check length
- Add element to DOM, re-check length (should increase)
- Remove element from DOM, re-check length (should decrease)
- [x] 4.4 Integration test: JS script that queries DOM, modifies it, verifies live collection updates
- [x] 4.5 Golden test if query results affect rendering (unlikely but check)
- [x] 4.6 Update `docs/HTML5_Implementation_Checklist.md` — check off querySelector, querySelectorAll, live collection items
- [x] 4.7 Run `just ci` and ensure all tests pass
## Dev Notes
### The Selector Engine Already Exists
The `selectors` crate (`crates/selectors/src/`) provides a complete, battle-tested CSS selector matching engine. **Do NOT reimplement selector matching.** The engine already supports:
- **All simple selectors:** tag, class, id, universal, all attribute selector variants
- **All combinators:** descendant (space), child (`>`), adjacent sibling (`+`), general sibling (`~`)
- **30+ pseudo-classes:** `:first-child`, `:last-child`, `:nth-child(An+B)`, `:not()`, `:is()`, `:where()`, `:has()`, `:empty`, `:checked`, `:disabled`, `:enabled`, `:root`, `:lang()`, `:dir()`, and more
- **Pseudo-elements:** `::before`, `::after`, `::first-line`, `::first-letter`
**Key API in `crates/selectors/src/engine.rs`:**
```rust
impl SelectorEngine {
pub fn query(&self, doc: &Document, selector: &Selector) -> Option<NodeId>
pub fn query_all(&self, doc: &Document, selector: &Selector) -> Vec<NodeId>
pub fn matches(&self, doc: &Document, node_id: NodeId, selector: &Selector) -> bool
}
```
**Selector parsing in `crates/selectors/src/selector.rs`:**
```rust
impl Selector {
pub fn parse(input: &str) -> Option<Selector>
}
```
### Element-Scoped Queries
`querySelector` and `querySelectorAll` on an Element should only return descendants of that element. Two approaches:
**Option A (simple):** Use `SelectorEngine::query_all()` on the full document, then filter to descendants of the element. Requires a `is_descendant_of(node, ancestor, doc)` helper.
**Option B (efficient):** Implement a subtree iterator on Document and use `selector.matches()` against each descendant. Better for large documents.
Choose Option B if performance matters, Option A for simplicity. The existing `iter_tree()` iterates the full document — a scoped variant iterating from a specific root would be ideal.
### Live Collection Design
Live collections in browsers re-evaluate on every access. The simplest correct implementation:
```rust
// On each .length or .item(n) access:
fn evaluate(&self, doc: &Document) -> Vec<NodeId> {
match &self.query {
LiveCollectionQuery::TagName(tag) => doc.get_elements_by_tag_name(tag),
LiveCollectionQuery::ClassName(cls) => doc.get_elements_by_class_name(cls),
}
}
```
This is correct and matches browser behavior. Performance optimization (caching with dirty-flag invalidation) is unnecessary for now — real browsers also re-evaluate on access for correctness.
**Collection storage:** The `DomHost` struct needs a `Vec<LiveCollection>` to store collections by ID. Collection IDs must NOT conflict with NodeId values — use a separate ID space (e.g., offset by `u64::MAX / 2` or use a separate HostObject type discriminated by `type_name`).
### Static vs Live Return Types
Per spec:
- **`querySelectorAll()`** → static `NodeList` (snapshot, doesn't update)
- **`getElementsByTagName()`** → live `HTMLCollection` (updates automatically)
- **`getElementsByClassName()`** → live `HTMLCollection` (updates automatically)
- **`querySelector()`** → single Element or null (no collection)
For `querySelectorAll`, returning a `JsValue::Array` of elements is sufficient — it's a snapshot by definition.
### Architecture Constraints
- **Layer 1:** `selectors` crate is Layer 1, same as `dom` — cross-dependency is allowed
- **`web_api` already depends on `dom`** — adding `selectors` dependency is fine (both Layer 1)
- **No unsafe** — enforced by CI
- **Spec citations** — `// DOM §4.4` for querySelector, `// HTML §4.10` for getElementsBy*
### What NOT to Change
- **Do NOT modify the selector matching engine** — it's complete and working
- **Do NOT add `matches()` JS method on Element** — not in scope (could be added later)
- **Do NOT implement `NodeList` as a separate type** — for querySelectorAll, a JS Array suffices
- **Do NOT add `closest()`, `contains()`, `compareDocumentPosition()`** — future work
### Files to Modify
- `crates/web_api/src/dom_host/host_environment.rs` — wire querySelector, querySelectorAll, getElementsByTagName, getElementsByClassName to JS; handle HTMLCollection type
- `crates/web_api/src/dom_host/` — possibly new file `live_collection.rs` for LiveCollection type
- `crates/web_api/Cargo.toml` — add `selectors` dependency if not present
- `crates/web_api/src/dom_host/tests/dom_tests.rs` — JS binding tests
- `docs/HTML5_Implementation_Checklist.md` — update checked items
### Previous Story Intelligence
**From Story 2.4:**
- JS binding pattern established: `call_method` dispatch on `obj_type` + `method` name
- HostObject pattern: `{ id: u64, type_name: String }`
- DocumentFragment type added — similar pattern for HTMLCollection (new HostObject type)
**From Epic 1:**
- Code review catches edge cases
- Always update checklists at the end
### Testing Strategy
- **JS binding tests** in `crates/web_api/src/dom_host/tests/dom_tests.rs`
- **Key test scenarios:**
- `querySelector("div.active")` → first matching element
- `querySelector("#nonexistent")` → null
- `querySelectorAll("p")` → array of all `<p>` elements
- `querySelectorAll(".cls")` → empty array if no matches
- `element.querySelector("span")` → scoped to element descendants
- `getElementsByTagName("div")` → collection; add div → length increases
- `getElementsByClassName("active")` → collection; remove class → length decreases
- `getElementsByTagName("*")` → all elements
- Invalid selector → SyntaxError thrown
### References
- [DOM Living Standard §4.4 — Node interface (querySelector/querySelectorAll)](https://dom.spec.whatwg.org/#interface-node)
- [DOM Living Standard §4.2.6 — HTMLCollection](https://dom.spec.whatwg.org/#interface-htmlcollection)
- [Source: crates/selectors/src/engine.rs] — SelectorEngine with query/query_all
- [Source: crates/selectors/src/selector.rs] — Selector parsing and matching (~56KB)
- [Source: crates/selectors/src/types.rs] — SimpleSelector, Combinator, PseudoClass enums
- [Source: crates/dom/src/document.rs] — existing get_element_by_id, get_elements_by_tag_name, etc.
- [Source: crates/web_api/src/dom_host/host_environment.rs] — JS binding dispatch
- [Source: docs/HTML5_Implementation_Checklist.md] — checklist to update
## Dev Agent Record
### Agent Model Used
Claude Opus 4.6 (1M context)
### Debug Log References
N/A — clean implementation, no debug issues encountered.
### Completion Notes List
- Implemented `querySelector` and `querySelectorAll` on both Document and Element, leveraging existing `selectors` crate `SelectorEngine`
- Element-scoped queries use new `iter_subtree`/`iter_subtree_elements` methods on Document
- Created `LiveCollection` type with `LiveCollectionQuery` enum supporting `TagName` and `ClassName` variants
- Live collections re-evaluate on every access (`.length`, index, `.item()`), matching browser behavior
- `getElementsByTagName("*")` wildcard correctly returns all elements
- `querySelectorAll` returns static JS arrays (snapshot), `getElementsBy*` return live HTMLCollections
- Invalid selectors produce `SyntaxError`
- Added `JsObject` re-export from `js` crate for array construction
- Updated two integration tests that previously tested querySelector/querySelectorAll as "unknown methods"
- Updated HTML5 Implementation Checklist with new query API and live collection support
- All 19 new unit tests pass; all existing tests pass; `just ci` passes
### Code Review Fixes (2026-03-14)
- **H1 Fixed**: Added 17 integration tests in `tests/js_dom_tests.rs` covering querySelector, querySelectorAll, getElementsByTagName, getElementsByClassName, live collection add/remove, multi-class queries, element-scoped queries, cross-script collection persistence, and error cases
- **H2 Fixed**: Live collection registry (`Vec<LiveCollection>`) moved from `DomHost` (recreated per script) to `WebApiFacade` (persisted) — collections now survive across `execute_script` / `tick` / event dispatch calls
- **M1 Fixed**: `getElementsByClassName("foo bar")` now correctly matches elements with ALL specified classes (was incorrectly treating space-separated names as a single class)
- **M2 Fixed**: `DocumentFragment` now supports `querySelector`/`querySelectorAll` via delegation to Element handling (ParentNode mixin)
- **M3 Fixed**: `js_array_from` uses `into_iter()` instead of `iter().clone()` to avoid unnecessary allocations
- **L1**: `SelectorEngine::new()` creation per-call noted but not changed — constructor is trivial
- **L2 Fixed**: Corrected spec comments from `HTML §4.10` to `DOM §4.4` for getElementsByTagName/getElementsByClassName
- **L3 Fixed**: Added `debug_assert!` guard to prevent collection ID overflow into promise ID space
### Change Log
- 2026-03-14: Story 2.5 implemented — DOM query APIs (querySelector, querySelectorAll, getElementsByTagName, getElementsByClassName) and live HTMLCollection support
- 2026-03-14: Code review fixes — 8 issues resolved (2 HIGH, 3 MEDIUM, 3 LOW)
### File List
- crates/web_api/Cargo.toml (modified — added `selectors` dependency)
- crates/web_api/src/dom_host/mod.rs (modified — collections field changed from owned Vec to `&mut` reference; added debug_assert guard; added clippy allow)
- crates/web_api/src/dom_host/host_environment.rs (modified — wired querySelector, querySelectorAll, getElementsByTagName, getElementsByClassName on Document, Element, and DocumentFragment; fixed js_array_from cloning; fixed spec comments)
- crates/web_api/src/dom_host/live_collection.rs (new — LiveCollection and LiveCollectionQuery types with evaluate method; fixed multi-class matching)
- crates/web_api/src/lib.rs (modified — added collections field to WebApiFacade; threaded through all DomHost creation sites)
- crates/web_api/src/event_dispatch.rs (modified — threaded collections parameter through dispatch_event and invoke_listeners)
- crates/dom/src/document.rs (modified — added iter_subtree and iter_subtree_elements methods; fixed get_elements_by_class_name multi-class support)
- crates/js/src/lib.rs (modified — added JsObject re-export)
- crates/web_api/src/dom_host/tests/dom_tests.rs (modified — 19 new unit tests for query APIs and live collections)
- crates/web_api/src/dom_host/tests/mod.rs (modified — updated test helper with collections parameter)
- crates/web_api/src/dom_host/tests/event_tests.rs (modified — updated for collections parameter)
- crates/web_api/src/dom_host/tests/promise_tests.rs (modified — updated for collections parameter)
- crates/web_api/src/dom_host/tests/scheduling_tests.rs (modified — updated for collections parameter)
- crates/web_api/src/dom_host/tests/window_tests.rs (modified — updated for collections parameter)
- tests/js_dom_tests.rs (modified — 17 new integration tests for query APIs and live collections; updated 2 existing tests)
- docs/HTML5_Implementation_Checklist.md (modified — checked off query API and live collection items)