#300: filtered H3 clusters at world zoom (don't force point mode for broad facets) #302
Merged
…er filtered clusters

The explorer will aggregate filtered H3 clusters on the fly off samples_map_lite (GROUP BY the res-appropriate h3 column + the isamplesorg#293 mask predicate) so a broad facet filter at world zoom renders as fast filtered clusters instead of capped raw points.

samp_geo already computes h3_res4/h3_res6/h3_res8; lite carried only res8 (point-mode cell lookups). Add res4/res6 (UBIGINT); they dictionary-compress well, so the size delta is small.

- build_samples_map_lite: emit h3_res4, h3_res6 alongside h3_res8
- validate: map_lite re-derivation now covers res4/res6
- header doc + corruption-test schema updated to match
- fixture tests: 23/23

Republish of the 202608 lite to R2 follows as a separate data step; the browser feature gates on the columns being present (falls back to today's point-mode behavior otherwise), so this is safe to ship before the republish.
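The on-the-fly aggregation described above can be sketched as a small SQL builder. This is an illustrative sketch only: the function name `sketchFilteredClusterSQL`, the `maskPredicate` parameter, and the `latitude`/`longitude` column names are assumptions, not the explorer's actual code; only `samples_map_lite` and the `h3_res4`/`h3_res6`/`h3_res8` columns come from the commit message.

```javascript
// Hypothetical sketch. Only samples_map_lite and the h3_res* column names are
// taken from the commit message; everything else is assumed for illustration.
const H3_COLUMNS = { 4: "h3_res4", 6: "h3_res6", 8: "h3_res8" };

function sketchFilteredClusterSQL(res, maskPredicate) {
  const h3col = H3_COLUMNS[res];
  if (!h3col) throw new Error(`unsupported H3 resolution: ${res}`);
  // maskPredicate is assumed to be trusted SQL produced by the app itself
  // (for example a facet-mask test), never raw user input.
  return [
    `SELECT ${h3col} AS h3_cell,`,
    `       CAST(COUNT(*) AS INTEGER) AS sample_count,`,
    `       AVG(latitude) AS center_lat,`,   // assumed column name
    `       AVG(longitude) AS center_lng`,   // assumed column name
    `FROM samples_map_lite`,
    `WHERE ${maskPredicate}`,
    `GROUP BY ${h3col}`,
  ].join("\n");
}
```

Picking the column from a fixed lookup table, rather than interpolating the resolution directly, keeps an unexpected resolution from producing a query against a column that does not exist.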
…t + cluster sig

Dormant infrastructure for filtered clusters. No behavior change yet, because computeTargetMode still forces point mode when a facet is active (relaxed in C2).

- filteredClustersReady preflight cell: probes samples_map_lite for h3_res4/res6; sets window.__filteredClustersReady. Hard requirement is only the columns (masks readiness is orthogonal: facetFilterSQL self-falls-back to membership). Safe before the lite republish: flag false → today's point-mode behavior.
- Top-level helpers: wantFilteredClusters(), desiredClusterSig() (semantic, not SQL text: kind + sources + tree selections), filteredClusterSQL(res) (masks-backed lite aggregation; INTEGER casts; same columns/grain as build_h3_summary).
- loadRes: when wantFilteredClusters(), query filtered lite instead of the summary parquet. Snapshot the sig BEFORE the await; discard on `gen !== loadResGen || sig !== desiredClusterSig()` (filters toggled mid-query). Stamp viewer._clusterFilterSig on success.
- phase1: seed viewer._clusterFilterSig from the initial summary load.

Render OK. Per Codex design review (P0.1 casts, P0.2/P1.3 snapshot signature).
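The snapshot-then-discard rule in the loadRes bullet can be sketched as follows. This is a minimal sketch under stated assumptions: `state`, `desiredSig`, and `loadClusters` are hypothetical stand-ins for the page's real `loadResGen`, `desiredClusterSig()`, and `loadRes`, and the signature is modelled as a JSON string of kind, sources, and selections.

```javascript
// Hypothetical stand-ins for the page's state; names are illustrative only.
const state = {
  loadResGen: 0,
  filters: { kind: "summary", sources: [], selections: [] },
  applied: null,
};

// Semantic signature: built from what is selected, not from SQL text.
function desiredSig() {
  const f = state.filters;
  return JSON.stringify({
    kind: f.kind,
    sources: [...f.sources].sort(),
    selections: [...f.selections].sort(),
  });
}

async function loadClusters(runQuery) {
  const gen = ++state.loadResGen;
  const sig = desiredSig(); // snapshot BEFORE the await
  const rows = await runQuery();
  // Discard if a newer load started or the filters changed mid-query.
  if (gen !== state.loadResGen || sig !== desiredSig()) return false;
  state.applied = { sig, rows }; // stamp the signature on success
  return true;
}
```

Comparing the snapshot against a freshly computed signature after the await is what catches a filter toggle that happened while the query was running, even if no newer load was started.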
…orld zoom

Relax the isamplesorg#267 force-point rule: with a facet active above EXIT_POINT_ALT (and filtered clusters ready), the map now shows FILTERED h3 clusters instead of capped raw points. Zoom-in still drops to individual dots. Search stays point-latched (out of scope).

- computeTargetMode(alt, latch=getMode()): search→point; facet&&!ready→point (pre-republish fallback); else ENTER/EXIT altitude hysteresis. latch param lets a URL restore resolve the band against the saved mode (Codex P1.7).
- reconcileGlobeForFilters(): shared transition for both filter-change handlers. Point branch invalidates in-flight cluster loads (loadResGen++) so a stale loadRes can't paint under point mode (P1.4). Cluster branch loads filtered clusters into the hidden layer FIRST, then exits point only if applied (no stale-cluster flash on a failed/superseded load, P1.10), then chases tryEnterPointModeIfNeeded() (supersession invariant, P1.4).
- handleFacetFilterChange / applySearchFilterChange: route through the reconcile.
- camera.changed cluster branches: reload when resolution OR filter signature changed (a facet toggle in cluster mode refreshes the filtered cells); point→cluster uses load-first-then-exit (P1.5/P1.10).
- moveEnd gate (was: exit only when no filter): now computeTargetMode-driven, so a sub-10% zoom-out with a facet active loads filtered clusters; listener made async (return value unused) (P1.5).
- Readiness→reconcile hook (window.__onFilteredClustersReady) for late preflight / republished lite (P1.9).

Render OK. C3 (deep-link/boot restore, filtered click hydration, facet note) next.
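The decision order in the computeTargetMode bullet can be sketched as a pure function. This is a hedged sketch, not the page's implementation: `sketchTargetMode` and the flags object are hypothetical, and the two threshold values are placeholders, since the commit does not state the real numbers. It assumes point mode is entered below ENTER_POINT_ALT and exited above EXIT_POINT_ALT, with the latched mode kept inside the band.

```javascript
// Placeholder thresholds: the real values are not given in the commit message.
const ENTER_POINT_ALT = 120000; // below this altitude: enter point mode
const EXIT_POINT_ALT = 180000;  // above this altitude: return to clusters

// latch is the mode to keep while inside the hysteresis band
// (the current mode, or the saved mode during a URL restore).
function sketchTargetMode(alt, latch, flags) {
  const { searchActive, facetActive, filteredClustersReady } = flags;
  if (searchActive) return "point";                          // search stays point-latched
  if (facetActive && !filteredClustersReady) return "point"; // pre-republish fallback
  if (alt < ENTER_POINT_ALT) return "point";
  if (alt > EXIT_POINT_ALT) return "cluster";
  return latch; // inside the band: no mode flip
}
```

Passing the latch explicitly, instead of reading the live mode inside the function, is what lets a restored URL resolve the band against the mode it was saved with.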
…k hydration, facet note

- Deep-link/back-forward restore (hashchange): resolve mode via computeTargetMode(restoredAlt, latch=s.mode) so a facet at world zoom restores to FILTERED clusters, not point. Load clusters first then exit point; isStale() before mutation and after the load await; suppress-hash released first as before (Codex P1.6/P1.7).
- Cold boot mode hydration: same computeTargetMode(latch) treatment; when a facet is active at cluster altitude, reload the FILTERED clusters over phase1's unfiltered summary load (P1.7). Still enters point for the isamplesorg#203 alt<ENTER loophole.
- fetchClusterByH3: filter-aware single-cell aggregation off lite (count / center / dominant_source over the filtered subset, same tie-break as filteredClusterSQL) so a clicked or deep-linked filtered cell shows filtered numbers and a filter-excluded cell resolves to null (P1.8).
- handleFacetFilterChange: revalidate the selected cluster card after a facet change (clear if the cell emptied, else re-hydrate), guarded by a freshness token; mirrors the source-filter handler (P1.8).
- syncFacetNote: hide the "filter only at neighborhood zoom" apology once filtered clusters are ready, since cluster mode is now filter-aware (P1.10).

Render OK. Completes the isamplesorg#300 browser implementation (C1+C2+C3).
- P0: delete the dedicated _urlHasFacets boot force-point block. It ran AFTER the new filtered-cluster boot load and switched straight back to points, negating isamplesorg#300 for cold-boot facet deep links. computeTargetMode (via bootTarget) now owns that decision.
- P1.2: invalidateClusterLoads() = ++loadResGen + loading=false. A bare loadResGen++ left `loading` stuck true (the superseded load's finally only clears it when its gen is current), wedging every later reload guard. Used in reconcile's point branch and at hashchange entry.
- P1.3: clusterSig(kind); phase1 labels its load clusterSig('summary') explicitly so a reconcile can't mistake facet-blind summary clusters for filtered data.
- P1.4: readiness fallback reconcile also checks _clusterFilterSig (at world zoom the mode is already 'cluster' though the layer is still summary).
- P1.5: boot 'point' uses direct enterPointMode for forced/saved cases (search / facet-not-ready / explicit mode=point) since tryEnterPointModeIfNeeded refuses at alt >= ENTER_POINT_ALT; gentle helper only for altitude-driven entry.
- P1.6: moveEnd chases tryEnterPointModeIfNeeded after exitPointMode (an overlapping settle can drop below ENTER during the load await).
- P1.7: hashchange invalidates cluster loads at ENTRY so a prior restore callback's loadRes discards instead of replacing data before the late isStale().
- P1.8: handleFacetFilterChange captures the freshness token at entry (a second toggle during the reconcile await must invalidate the first's revalidation).

Codex verified correct: integer casts, post-await sig TOCTOU, load-first ordering, filtered fetchClusterByH3, build tests 23/23. Render OK.
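The P1.2 wedge can be reproduced in a few lines. This is an illustrative sketch, assuming a loader shaped like the one the bullet describes: `loader` and `guardedLoad` are hypothetical names, and only `invalidateClusterLoads` (bump the generation and clear `loading`) mirrors the commit text.

```javascript
// Hypothetical loader state; only the invalidate rule mirrors the commit.
const loader = { gen: 0, loading: false };

// Bump the generation AND clear the flag. Bumping alone would leave
// `loading` stuck true, because the superseded load's finally block
// only clears the flag when its own generation is still current.
function invalidateClusterLoads() {
  loader.gen += 1;
  loader.loading = false;
}

async function guardedLoad(runQuery) {
  if (loader.loading) return "skipped"; // reload guard
  const gen = ++loader.gen;
  loader.loading = true;
  try {
    const rows = await runQuery();
    return gen === loader.gen ? rows : "discarded";
  } finally {
    if (gen === loader.gen) loader.loading = false;
  }
}
```

With a bare generation bump in place of `invalidateClusterLoads()`, the second `guardedLoad` call would return "skipped" for as long as the first query stays in flight, and forever if that query never resolves.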
…ered clusters

THE BUG: with a facet active at world zoom, filtered clusters never loaded. The loadRes filtered query, issued during boot's concurrent query storm, NEVER resolved. The identical query completes in ~2.5s once the connection is idle, and even two concurrent post-boot queries are fine, but DuckDB-WASM (the non-threaded MVP build this page loads) DEADLOCKS when the heavy filtered aggregation (samples_map_lite + sample_facet_masks) runs amid boot's other in-flight queries.

THE FIX: serialize every db.query through a FIFO chain (wrap DuckDBClient.query in the `db` cell), so at most one query runs at a time. SQL queries are atomic (none awaits another mid-execution), so chaining can't deadlock; the latency cost is small. Single-point fix; without it isamplesorg#300's filtered clusters never appear.

Also (found while debugging the double-fire):

- !loading guards on the readiness reconcile triggers (setTimeout0 + onReady hook) so boot's filtered load isn't redundantly issued twice.

Verification (tests/playwright/filtered-clusters-300.spec.js, [data], against a local res46 lite served by dev_server.py):

- broad facet (anyanthropogenicmaterial) at world zoom → _clusterFilterSig kind:filtered, cluster mode (not forced point), 81 res4 cells; cluster sample_count sum == independent masks-backed COUNT(*) (count conservation).
- zoom-in below ENTER_POINT_ALT → point mode.

scripts/regen_lite_res46.py: derive h3_res4/res6 onto the existing 202608 lite via the h3 extension (no wide needed); validated against the shipped h3 summaries.

DESIGN_300.md: design + Codex-review record.
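The FIFO chain described under THE FIX can be sketched as a wrapper around any client object that exposes an async `query` method. This is a minimal sketch, not the `db` cell itself: `serializeQueries` is a hypothetical name, and the only assumption about the client is that `query` returns a promise.

```javascript
// Hypothetical wrapper: serializes calls to client.query through one promise chain.
function serializeQueries(client) {
  let tail = Promise.resolve();
  const rawQuery = client.query.bind(client);
  client.query = (...args) => {
    // Start this query only after every earlier one has settled.
    const run = tail.then(() => rawQuery(...args));
    // Swallow the error on the chain (not on the returned promise) so one
    // failed query cannot block the queries queued behind it.
    tail = run.catch(() => {});
    return run;
  };
  return client;
}
```

The caller still sees each query's own result or rejection; only the internal `tail` promise ignores failures, which keeps the queue moving.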
…ading guards, dedup)

- P1: drop the `!loading` guards on the readiness reconcile triggers. They were added to avoid the boot double-fire, but db.query serialization already prevents the deadlock, and the guards are LOSSY: readiness arriving while an older unfiltered loadRes is in flight would skip the only reconcile signal permanently (Codex serialization-review P1).
- Add a sig-dedup at the top of reconcileGlobeForFilters' cluster branch: if already in cluster mode with the matching filter signature, no-op. Keeps the now-unguarded redundant reconciles (boot + hook + setTimeout0) cheap instead of re-running the heavy filtered aggregation.
- P2: narrow the `db` serialization comment. It wraps `.query` (and `.sql`, which calls `.query`); it does NOT cover queryStream/queryRow/raw connect. All 45 data calls in the page use db.query (verified), so coverage is complete today.
…he-bust)

Activates filtered clusters in production: the _v2 lite carries h3_res4/h3_res6. A new filename (not overwriting the original) preserves the immutable-cache contract (isamples_YYYYMM_*.parquet is served immutable/1-yr), so every visitor fetches fresh data and no Cloudflare purge is needed. One-off retrofit for 202608; the next generation builds res4/res6 into the canonical name natively.

REQUIRES isamples_202608_samples_map_lite_v2.parquet uploaded to R2 (bucket isamples-ry) BEFORE this merges. lite_url is load-bearing (point mode, deep links, filtered clusters all read it), so a missing _v2 would 404 the explorer.
…the filtered load

The full db.query FIFO serialization (boot-deadlock fix) queued interactive queries behind boot's whole storm. The pre-deploy smoke gate's "pottery" search exceeded its 90s budget on a cold CI runner, blocking the staging deploy.

Surgical replacement: keep all queries CONCURRENT (fast boot + search, smoke gate happy) and gate ONLY the heavy filtered-cluster aggregation on an idle connection.

- db cell: in-flight COUNTER (non-serializing) instead of the FIFO chain; exposes instance._inFlight().
- loadRes: when filtered, `await whenConnectionIdle()` before issuing. This waits out boot's concurrent query storm (the deadlock trigger) and is a no-op post-boot. Re-checks supersession after the wait. The light summary path is unchanged (safe concurrent).

The deadlock only occurs with a facet active at boot; the smoke test (text search, no facet) never hit that path, so its failure was purely the serialization slowing search.

Verified: filtered-clusters-300 2/2 (idle-gate avoids deadlock); smoke passes ~28s (was ~40s serialized).
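The counter-plus-idle-gate approach can be sketched like this. It is a sketch under assumptions: `trackInFlight` is a hypothetical name, the client is assumed to expose an async `query`, and the polling loop (with its `pollMs` interval) is one simple way to implement `whenConnectionIdle`; the commit does not say how the real helper waits.

```javascript
// Hypothetical wrapper: counts in-flight queries without serializing them.
function trackInFlight(client) {
  let inFlight = 0;
  const rawQuery = client.query.bind(client);
  client.query = async (...args) => {
    inFlight += 1;
    try {
      return await rawQuery(...args);
    } finally {
      inFlight -= 1; // decrement on success and on failure
    }
  };
  client._inFlight = () => inFlight;
  return client;
}

// Assumed implementation: poll until no query is running.
// Resolves on the first check when the connection is already idle.
async function whenConnectionIdle(client, pollMs = 25) {
  while (client._inFlight() > 0) {
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
}
```

A caller would await `whenConnectionIdle(db)` only before the heavy filtered aggregation and then re-check supersession, since filters or the load generation may have changed during the wait.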
This was referenced Jun 19, 2026
Closed
Closes #300. Builds on PR4c #301 (merged): the centralized `computeTargetMode`/`filtersForcePoint` seam.

What

When a facet filter is active and the camera is zoomed out (above `EXIT_POINT_ALT`), the map now renders an h3-clustered view of the filtered set instead of promoting to capped raw point mode (#267). Zoom-in still drops to individual dots. Search keeps point-mode (out of scope). Foundation: the #293 masks make filtered h3 aggregation fast.

How

- `build_frontend_derived.py`: add `h3_res4`/`h3_res6` to `samples_map_lite` (+ validator + fixtures, 23/23).
- `loadRes` is filter-aware: with a facet active + masks ready, it aggregates the FILTERED set off `samples_map_lite` (`filteredClusterSQL`, same grain/columns as the summary parquet) instead of loading the facet-blind pre-aggregated summary. A semantic cluster signature (`_clusterFilterSig`) drives stale-reload.
- `computeTargetMode`: facets use the normal ENTER/EXIT altitude hysteresis once filtered clusters are ready (gated on `filteredClustersReady` + masks); search still forces point.
- `fetchClusterByH3`), and the `#facetNote` apology all made filter-aware.
- `db.query` through a FIFO chain (all 45 data calls use `db.query`; verified). See the `db` cell comment.

Activation / deploy (ORDER MATTERS)

`lite_url` now points at `isamples_202608_samples_map_lite_v2.parquet` (a cache-bust filename that keeps the immutable-cache contract). This `_v2` file must be uploaded to R2 (`isamples-ry`) BEFORE this PR merges. `lite_url` is load-bearing (point mode, deep links, filtered clusters all read it), so a missing `_v2` would 404 the explorer's lite entirely. The file is built/validated (reproduces the shipped h3 summaries exactly).

If `_v2` is absent, the feature is dormant (gates on the res4/res6 columns) and the explorer behaves as before #300, so there's no half-state, but the explorer does need some lite at `lite_url`.

Watch-item

`db.query` serialization ships to all users and makes boot queries sequential. Local boot is fast (verify 9–14s); worth eyeballing real-network boot latency. Fallback if too slow: defer only the heavy filtered query until the connection is idle (keep other queries concurrent).

Verification

`tests/playwright/filtered-clusters-300.spec.js` [data] (against a local res46 lite): broad facet (anyanthropogenicmaterial) at world zoom → filtered clusters (kind:filtered, not point), 81 res4 cells, count conservation (cluster sum == masks-backed `COUNT(*)`); zoom-in → point. 2/2 pass. (`(e)` facet-hydration is a pre-existing cold-cache flake; the `listings.json` 404 is "listings.json gets a 404 error" #295.)

🤖 Generated with Claude Code