transform: equality-saturation MIR optimizer (mz_transform::eqsat) by antiguru · Pull Request #37160 · MaterializeInc/materialize

antiguru · 2026-06-19T15:55:43Z

An e-graph / equality-saturation optimizer over a subset of MIR, wired into the logical optimizer behind enable_eqsat_optimizer (default on) and the physical optimizer behind enable_eqsat_physical_optimizer (default off).

It lowers MIR to a relational IR with real MirScalarExpr payloads, saturates a ported rule engine over an e-graph, extracts the cheapest plan under a memory-primary cost model, and raises back. Unsupported subtrees bail to opaque leaves carried verbatim, so the pass is equivalence preserving on arbitrary input; an arity guard at the transform boundary makes any mismatch a no-op.

The intended win is the cyclic-join decision: on a cyclic (e.g. triangle) join the AGM cost model proves worst-case-optimal (delta) evaluation dominates the binary join on memory and time, where JoinImplementation picks the dominated binary plan. The physical pass commits that decision by tagging the join JoinImplementation::DeltaQuery (synthesized by reusing the real delta planner), which JoinImplementation then leaves unre-planned. WcoJoin is created only for cyclic joins, and a binary join's cost charges its persistent arrangements rather than its transient N-squared output, so the choice is delta where it wins and binary parity elsewhere.

Supporting infrastructure: a polarity-aware extractor that never selects a multiplicity-signed representative as the input of a non-linear Reduce/TopK (with a graceful no-op when a root is otherwise unextractable), a logical/physical phase annotation on rewrite rules, and a constant-column e-class analysis. These are equivalence preserving and scoped to the appropriate pass.

A prior cutover that skipped CanonicalizeMfp under the flag is reverted, since the e-graph does not process every plan and so cannot subsume it.

The logical flag is default-on as the intended steady state (the strangler-fig plan: grow eqsat until it can delete pipeline passes). CI exercises the pass across the SLT corpus with the flag on and surfaces any plan or result diffs; the affected goldens are rewritten to the eqsat-on output. The logical pass is parity, not improvement, today; the genuine plan win is the physical worst-case-optimal-join decision behind the second, default-off flag.

🤖 Generated with Claude Code
https://claude.ai/code/session_016CwwG1wCgqf3v8g6uhf1Nx

antiguru

Critical review: eqsat MIR optimizer

I read the engine end-to-end plus the design-review / status / roadmap / polarity-extractor docs. This is genuinely impressive work and the self-assessment in the status doc is unusually honest. The notes below are the gaps I'd want closed (or explicitly punted with eyes open) before this leaves draft, followed by the bigger-picture framing.

The bigger picture first

By the team's own framing, the live logical pass is "parity, not improvement." Every active rule duplicates a rewrite the pipeline already performs, and the one genuine divergence (the WCOJ→DeltaQuery decision) ships only behind the second, default-off flag (enable_eqsat_physical_optimizer). So in the default-on configuration this PR proposes, the user-visible plan benefit today is ≈0, while it adds a 16.5k-line optimizer surface, real optimize latency, and new soundness obligations. The justification is strangler-fig: grow eqsat until it can delete pipeline passes. That's a sound strategy — but no deletion has landed yet, and the one cutover that had landed (skipping CanonicalizeMfp under the flag) is reverted in this very PR because it was unsound (lib.rs:958 comment). So the current state is "extra work for parity," and the merge decision should be judged on that basis, not on the eventual vision.

That's the lens for the blockers below.

Blockers (internal contradictions that make the PR un-mergeable in either flag state)

1. The flag default contradicts itself across the PR. The PR body says "the flag is temporarily defaulted on so CI exercises the pass… Revert to off before un-drafting," and the status doc lists the same revert under "What is left." But definitions.rs (enable_eqsat_optimizer) now carries the opposite comment: "Defaulted on… the intended steady state… not a pre-merge revert." Pick one. Shipping a brand-new experimental optimizer default-on to every user, for zero plan benefit today, is a big call that needs an explicit owner sign-off — not an ambiguity buried in a comment.

2. The SLT goldens are silently coupled to the default-on flag, and nothing pins it. Ten SLT files are rewritten to eqsat-on output (not-null-propagation.slt +218/-… , locally_optimized_plan.slt 118 lines, joins, chbench, aoc_*, record, subquery, demand, reduction_pushdown), but no SLT file sets enable_eqsat_optimizer (grep across test/sqllogictest/ finds zero references). So those goldens pass only because the global default is true. This makes #1 not just a doc nit but a hard contradiction: if you "revert to off before un-drafting," these tests immediately fail; if you keep it on, you're shipping on. There is no flag state in which the PR as written is both green and consistent with its own stated plan. Resolution options: (a) commit to default-on and fix the PR body/status doc; (b) gate each eqsat-divergent golden behind an explicit per-file SET enable_eqsat_optimizer (or mzcompose override) and revert the default to off; (c) revert the golden rewrites. (b) is the honest one if the intent is really "off at merge."

3. Synthesized empty-constant types are placeholders, and the pass is now live. raise::repr_type_of_arity (raise.rs:391) types every column of a synthesized Empty (from union_cancel / empty_false_filter) as Int64?. Its own doc comment still says "The pass is offline; the surrounding optimizer recomputes column types when this is ever wired live" — but it is wired live now. The boundary arity guard in EqSatTransform (transform.rs:64) catches arity drift but not column-type drift. So when one of those rules collapses a branch whose true column types aren't Int64 (e.g. a text/numeric column), eqsat emits an internally-consistent-but-wrong-typed plan. Because Typecheck recomputes bottom-up, the wrong type is self-consistent and may not be rejected — which means it can silently change a query's result-column type, or feed a type-incompatible arm to a sibling Union. Please carry the real ColumnType through the Empty node (it's available at lower time), and add a regression test: an empty-collapsed branch with a non-Int64 column must round-trip its type and survive the final strict Typecheck.

Soundness surface that is asserted but not enforced

4. Negate-join rules are enabled; their soundness rests entirely on the polarity-aware extractor; the matching Lean obligations are sorry. The status doc says the two unsound negate-join rules "were removed," but they're live in the rule file (relational.rewrite:131 distribute_negate_join, :136 factor_negate_join). The newest workstream (2026-06-21-polarity-aware-extractor.md) re-enabled them, with soundness moved into the extractor ("no Negate-rooted representative directly under a non-linear Reduce/TopK") rather than a rule guard. That's a defensible design — but it makes the extractor's polarity lattice a load-bearing correctness mechanism, and the corresponding Lean theorems (negative-multiplicity / non-linear-reduce) are among the 24 sorrys. So the formal-soundness story for the riskiest rules is currently aspirational.

5. The Lean spec is not gated by CI and can rot. There's no CI job that builds src/transform/lean/, and no check that gen-lean's Generated.lean is in sync with the rule file (unlike the repo's other generated artifacts, which have sync gates). 24/32 theorems are sorry. Without a gate, (a) a rule change can desync Generated.lean unnoticed, and (b) the sorry count can grow silently. Either wire a real gate (build Lean, fail on new sorry, assert gen-lean output is regenerated) or state plainly in the docs that the Lean artifact is advisory, not enforced — right now the prose implies more assurance than CI provides.

6. raise alone can emit un-renderable plans; correctness depends on a bundled cleanup. lib.rs/raise.rs note that raise can produce a single Reduce mixing reduction types that ReducePlan::create_from panics on, and only the bundled logical_fixpoint_02 re-run (raise.rs:306) splits it via ReduceReduction. That's an implicit invariant on a comment. Any future caller of raise that skips the cleanup crashes rendering. Make it an enforced invariant + test ("raise output never contains a mixed-reduction Reduce"), or split the reduction inside raise itself.

Architecture / cost

7. The keystone the design-review prescribed was only half-done. The 2026-06-19 design review's #1 recommendation was to retarget ENode onto real MirRelationExpr/MirScalarExpr and delete the dual Rel/lower/raise IR, calling the dual IR "structurally unsuited to replace production passes" (lossy round-trips, type loss, per-operator bail treadmill). What landed is a partial retarget — scalar payloads are real MirScalarExpr, but the relational dual IR remains. So every problem that recommendation targeted persists: raise still drops constant-singleton identity inputs via join_scalars (raise.rs:170), still synthesizes placeholder types (#3), and still bails per-operator. Please reconcile this explicitly: is direct-on-MIR still the plan, or is the dual IR the accepted steady state? The docs don't close the loop with the design review's verdict.

8. The default-on path runs fixpoint_logical_02 twice, back-to-back, plus a full Demand/ProjectionPushdown and two coalesce_mfp passes. EqSatTransform is pushed right after fixpoint_logical_02() in logical_optimizer (lib.rs:815 then the push), and optimize_logical re-runs the entire fixpoint_logical_02 inside raise (raise.rs:306), plus coalesce_mfp twice and full Demand+ProjectionPushdown (eqsat.rs:139-164). For zero plan benefit today, that's a lot of repeated optimizer work. Combined with MAX_PLAN_SIZE = 200 (which makes the pass a no-op on most non-trivial plans anyway), the net effect at default-on is "added latency on small plans, parity output." Please measure the added optimize latency (the status doc itself cites 27s and ~6.5s episodes that were tamed by caps, not by being fast) and let that inform the default-on decision.

9. The "wins" aren't wins. compare_real reports 3 wins / 0 losses / 17 ties, but the 3 are self-described as "cost-model artifacts (eqsat omits a canonicalizing Project)." An omitted canonicalizing projection is plausibly a less-canonical plan, not an improvement. I'd treat the harness as showing parity, and double-check the omitted Project isn't a latent canonicalization regression rather than a feature.

Repo hygiene / process

10. ~3k lines of agent working docs are committed under docs/superpowers/ (plus .superpowers/sdd/e1-report.md), several already stale relative to the code: the design review predates the retarget; the status doc claims "32 rules / negate rules removed / CanonicalizeMfp cutover done" while the code has 34 rules, negate rules enabled, and the cutover reverted. Decide whether these belong in the product repo at all (vs. doc/developer/design/ per existing convention) and, if kept, mark the historical ones as such so they don't read as current truth.

11. 61 commits / 63 files / 16.5k lines in one PR is effectively un-reviewable by a human as a unit. Strongly consider splitting for landing: (a) flag plumbing + no-op registration, (b) the engine, (c) the Lean spec + its CI gate, (d) the SLT golden changes. Each is independently reviewable and bisectable.

What I'd do to "complete the work"

Resolve #1/#2 as a unit — decide the merge-time flag state and make the goldens consistent with it (I'd default off and gate the divergent goldens explicitly until the first real pass deletion lands, since default-on buys nothing today and adds risk).
Fix the empty-constant type loss (#3) and add the type round-trip test; promote the mixed-Reduce invariant (#6) from comment to test.
Add the Lean CI gate or downgrade the soundness claims in prose (#5); track the negate-join sorrys (#4) as explicit release-blockers for the physical flag, not the logical one.
Write the docs reconciliation: dual-IR-vs-direct-on-MIR decision (#7), refresh the stale status doc, and quantify the latency (#8).
Then the strangler-fig payoff becomes real: land the first actual fixpoint deletion behind the flag with a green SLT gate — that's the milestone that converts this from "parity at a cost" into "a pass worth defaulting on."

Happy to dig into any single thread (the polarity lattice, the cost model's memory axis, or the empty-type path) in more depth.

Generated by Claude Code

…commit Squashed feature branch (eqsat MIR optimizer subsystem) for rebase onto main. Includes: the equality-saturation MIR optimizer (mz_transform::eqsat), the WMR-lift work, the delta-aware join cost + spelling selector (enable_eqsat_delta_join_cost), and the cost-model-native acyclic-join commit (enable_eqsat_native_join_commit) that emits JoinImplementation::Differential at raise time so JoinImplementation no-ops on eqsat-produced joins. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

…hints Design notes for the next eqsat sub-project: commit-time delta detection (crosses==0 && delta_new==0 -> DeltaQuery, recovering JI's free-delta plans on indexed joins without the AGM over-pick) and restoring JoinInputCharacteristics hints in the emitter (K/UK/A now; cardinality/filters when SP-B2/B3 land). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

…on design Strengthen the SP-B1 follow-on design notes per the design review: - Reframe delta condition (a) crosses==0 as a cheap guard, not independently load-bearing: a non-constant cross always needs a broadcast arrangement absent from `available`, so (b) delta_new==0 already rejects it. Stop attributing VOJ correctness to (a). - Add the spelling-selector interaction: the delta test runs over canonicalized equivalences, so the enable_eqsat_delta_join_cost selector is moot under native commit. The two sub-projects do not disagree about the VOJ. - Name the commit-time vs cost-time decoupling seam and bound delta commit to the acyclic case, where the touched input set is order-independent. - Back Design 2's "display-only" claim: characteristics are set post-decision and JI no-ops on committed joins; spec must grep consumers and defer partial fields rather than emit None/none(). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Adds the implementation spec and task plan for the SP-B1 follow-on (delta detection at commit time + JoinInputCharacteristics on committed joins), and adds a supersession banner to the brainstorm notes correcting the crosses==0 framing (viability, not a find_bound_expr panic guard). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Native join commit now populates K/UK/A markers via step_characteristics instead of None. Introduces NativeJoinFlags to thread the commit and prioritize_arranged flags. EXPLAIN-only; result rows unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

delta_join_order mirrors binary_join_order: one keyed left-deep path per driver, or None when the join is disconnected (no delta plan). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

commit_delta_query lowers per-driver paths to DeltaQuery (lookups only, no stored start key), reusing JI's implement_arrangements/permute_order/ install_lifted_mfp. delta_new_arrangements counts distinct missing arrangements like delta_queries::plan. Also adds an n >= 32 guard to delta_join_order to prevent u32 overflow on wide joins, matching delta_join_terms. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

raise.rs now tries a delta commit before differential: commit DeltaQuery iff delta_join_order yields a fully-keyed plan AND it needs no new arrangements (strict) or no more than differential (eager, when enable_eager_delta_joins). Recovers JI's free-delta plans on indexed joins that SP-B1 committed as differential. Result rows unchanged; VOJ stays differential. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

A 1-input "join" produces an unrenderable delta plan (empty per-driver path panics source_keys derivation). A 2-input join gains nothing from delta (no intermediate arrangement to save, adds a second redundant path). Mirror JoinImplementation's num_inputs <= 2 short-circuit (join_implementation.rs:408-429) by guarding the entire delta branch behind inputs.len() > 2. Regenerate four golden files where 2-input joins reverted from type=delta to type=differential. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

…— E4 parity) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

…— strict E4 parity) The differential join orderer predicted an input 'already arranged' by matching referenced COLUMN SETS against index key columns. For an expression edge like foo.a + 4 = baz.a this credited an index on plain foo.a to a start keyed by the expression foo.a + 4, which the index cannot cover. The orderer then steered to an order the commit could not honor, and the input was re-arranged (a full scan). Score the START's arrangement at expression granularity instead. `start_key_exprs` derives the start's first-edge key as expressions exactly as `commit_differential` does (find_bound_expr aligned to the first lookup), and `expr_key_arranged` compares them against the index keys with the same `available.contains(key)` Vec<MirScalarExpr> equality `implement_arrangements` uses. Lookup steps keep the column-set check; only the start moved. This is tightening-only: it removes false-positive arranged predictions, so orders can only improve. Closes the database-issues#2449 residual: eqsat now reuses foo's index on that join (parity with JoinImplementation) instead of a full scan. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Records the capability-hunt scorecard (E0-E5), the consolidation-not-capability verdict, Fix A/B/C dispositions, and the open engine-adoption decision. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01F2yQ2WPoYvdHXU3b2Pvfus

Move the 16 eqsat and JoinImplementation design docs out of the flat design/ directory into design/20260624_eqsat/, and update every path reference (4 source doc-comments, 10 plans/specs, cross-links between the docs, and one .claude/plans note). No content changes beyond the path rewrites. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01F2yQ2WPoYvdHXU3b2Pvfus

Pure rustfmt cleanup of pre-existing formatting drift in the crate (long signatures wrapped, no logic changes). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01F2yQ2WPoYvdHXU3b2Pvfus

Retirement ledger for subsuming the directional optimizer into eqsat: transforms classed A (retirable now) through E (eqsat dependencies, never retirable by subsumption), batched by retire target with a full-corpus rows+errors parity gate. Names the two-phase ceiling: subsumption retires A-D, phase 2 (self-contained engine) is needed for E. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01F2yQ2WPoYvdHXU3b2Pvfus

Port the 21 hand-written scalar rules to the declarative rewrite DSL and run them through the single CombinedLang machinery, deleting the standalone EGraph<ScalarLang>. Behavior-neutral, gated by a per-slice differential oracle against the old engine, with the strict no-rewrite slt-golden gate at the slice-7 production flip. Approach A (one grammar over CNode), builtin-applier escape hatch for the 6 eval/type-context rules, determinism-parity scalar extractor, 7-slice decomposition with slice 1 as the go/no-go. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Address review blockers B1/B2 and non-blockers N1-N4: - B1: corpus construction procedure (collect via instrumented callers, commit static fixture, three-axis coverage acceptance criterion). - B2: operationalize the slice-1 gate as extraction-identity + termination, and pin the copied saturate bounds (600/1000/100) as constants. - N1: re-home call_scalar_type + raise into scalar_extract.rs; builtins import from there. Slice 1 pins the public API. - N2: Lean permanent-sorry ledger convention + CI count check (exactly 6). - N3: concrete relational-golden regression command per slice. - N4: cite the existing rest... splice usage (relational.rewrite:169). B3 (claimed 20 rules) rejected: rules() has 21 function pointers (rules.rs:36-56); the reviewer's enumeration dropped const_fold. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Planning audit found 3 of slice 1's rules need capability beyond the bare seam, so they move to the slice that lands that machinery: - and_or_single (variadic matching) -> slice 2 - not_binary_negate (func-metavar + negate builtin) -> slice 5 - and_or_dedup (operand-dedup builtin) -> slice 6 Slice 1 now ports not_not alone (fixed-func unary), the minimal go/no-go. Add func-metavar binding as an explicit slice-5 machinery line, and a 2.1 note on fixed-func vs func-metavar scalar-call spelling. Tally 1+3+5+2+5+5=21. Rule count reconfirmed 21 by direct count of rules() (const_fold through isnull_fold); the earlier "20" claim dropped const_fold. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Task-by-task plan to port not_not declaratively and prove the one-grammar- over-CNode seam end-to-end: scalar match view methods, Pat::SUnary + grammar, codegen scalar arms + SCALAR_COMPILED_RULES split, lower_scalar, determinism- parity extractor, scalar saturate driver + canonicalize_combined, the rule itself, a differential parity harness against the old engine, Lean emission, and the go/no-go gate. Production stays on the old engine (delete-last is slice 7). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Adds MatchGraph::scalar_class_nodes / nodes_by_scalar_sym and the ScalarIndex/EGraph::scalar_index() builder so the matcher view can address scalar e-nodes in the combined e-graph, mirroring the existing relational Index/rel_index. BaseView threads a scalar_index field (empty in the relational saturation pass, since no scalar rule runs there yet). ColoredView gets sound no-op stubs for the two new trait methods, since colored scalar matching is not wired up until a later task.

Adds Pat::SUnary { func, input } and the Unary[<func>](child) grammar arm, parsing e.g. Unary[not](Unary[not](x)) into the AST. Codegen for the new variant lands in a follow-up task; until then codegen.rs's Pat matches (sym_name, Matcher::node, pat()) and lean.rs's Pat matches (collect_binders, translate_pat) are non-exhaustive, so mz-transform does not build yet.

…orb (SP2b Slice 6e, Task 5)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Port the two variadic null/error-propagation rules from the standalone ScalarEGraph engine into the CombinedLang DSL as Rust builtins, mirroring the already-shipped binary null_prop_binary / err_prop_binary. Behavior-neutral, colored: false. Task 2 wires the declarative rules. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Add null_prop_variadic and err_prop_variadic to scalar.rewrite, wiring Task 1's Rust builtins into two declarative rules. Both mirror the already-shipped null_prop_binary / err_prop_binary rules: a null-propagating variadic call folds to null (resp. the operand's error) when a literal-null (resp. literal-error) operand is present and every other operand cannot error. Behavior-neutral, colored: false. Builtin dispatch is fully generic, so no grammar/codegen/AST change is needed. SCALAR_COMPILED_RULES grows 24 -> 26; COMPILED_RULES unchanged at 37. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Regenerate Generated.lean to add the two permanent-sorry obligations for the null_prop_variadic / err_prop_variadic builtins and bump the trip-wire count 5 -> 7. The emitter (translate_tmpl) maps each Tmpl::Builtin RHS name to an opaque Lean function; Task 2 added the DSL rules but not this mapping, so regeneration panicked. Extend the match with the two variadic names and declare the matching opaque nullPropVariadic / errPropVariadic in Semantics.lean, mirroring the binary precedent exactly. Both obligations carry a permanent sorry (opaque RHS), so no proof is authored. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

The 'Grows to 6 by slice 6' line was chronology and factually stale after the count reached 7. Remove it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Add the slice-6b differential parity corpus and tests for the variadic null/error propagation rules (null_prop_variadic, err_prop_variadic): - scalar_parity_slice6b: MakeTimestamp positive cases (null_prop fires on leading/middle/trailing null; err_prop fires on a literal error), the null-vs-error priority crux (both null and error operand -> the error, since null_prop is gated off and err_prop wins), the blocked case (1 / c0 can error but is not a literal error, so neither fires), Coalesce as a non-null-propagating variadic left untouched, and And/Or negative controls (both propagate_nulls == false, so 6b never fires; they fold via short-circuit/drop_unit/single with parity preserved). - null_prop_variadic_guard_blocks_erroring_operand: the load-bearing other_can_error gate. Deleting the gate makes this test and the null-vs-error parity case fail (null and error literals collide in one class, tripping the scalar analysis merge-conflict assertion), and only those two. - null_err_prop_variadic_terminates: saturation converges (<= 10 iters). - corpus_covers_slice6b + five corpus entries. Gate: 386/386 eqsat unit tests; relational goldens 1187/1187 across 37 transform slt files, no --rewrite; Lean gate exit 0, PERMANENT SORRY == 7. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

The design claimed the emitter was fully generic on the builtin name. The Lean side is not: translate_tmpl uses a per-name match and each builtin needs an opaque decl in Semantics.lean. Correct the Architecture section to match the shipped mechanism. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

…iadics) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

…2b Slice 6d, Task 1) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

…lice 6d, Task 2) Port `scalar::rules::flatten_assoc` (nested same-op flattening) into the `rest_filters` idiom for all 5 associative variadics, and wire the seams so the Task 1 DSL variants (`RestFilter::FlattenSameFunc`, `Cond::FlattenApplies`) have their apply/guard implementations. - `rest_filters.rs`: add `FLATTEN_MAX_OPERANDS`, `flatten_inner` (private, the one-hop circular-ref skip), `flatten_applies`, `rest_flatten` (the cap). The guard and the splice share `flatten_inner` so they never disagree. - `MatchGraph::cond_flatten_applies`: trait + both impls (BaseView delegates, ColoredView inert `false`). - `codegen.rs`: FilterSplice emit, Cond emit, `cond_is_color_exact => false`, serialization arms, and `coalesce`/`greatest`/`least` in `variadic_func_value`. - `lean.rs`: placeholder `unimplemented!` arm so the exhaustive `RestFilter` match compiles. The real Lean render lands in Task 4. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Add flatten_and/flatten_or/flatten_coalesce/flatten_greatest/flatten_least to scalar.rewrite, porting scalar::rules::flatten_assoc (nested same-op flattening) to all 5 associative variadics using the Task 1-2 rest_filters plumbing (`flatten`, `flatten_applies`). This makes `rest_flatten`/`FLATTEN_MAX_OPERANDS` live, clearing Task 2's two dead_code warnings without any #[allow(dead_code)]. Also fixes a gap in build/codegen.rs::variadic_func_pat: Task 2 added coalesce/greatest/least to variadic_func_value (RHS construction) but not to variadic_func_pat (LHS matching), since no prior rule pattern-matched those funcs on the LHS. flatten_coalesce/greatest/least are the first to do so, which a `cargo check` panic ("unknown scalar variadic func keyword: coalesce") caught immediately. Added the 3 missing match arms, mirroring the existing and/or arms. SCALAR_COMPILED_RULES: 26 -> 31. COMPILED_RULES: unchanged (37). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Port the Lean side of `flatten_assoc`. And/Or flatten prove outright via a new list-flatten helper reusing the `innerAnd`/`innerOr` structural spec plus the existing `all`/`any` fold-invariance lemmas. Coalesce/Greatest/Least return non-Bool values (first-non-null/max/min) that the two-valued model cannot faithfully denote, so they are represented with an opaque `variadicOpaqueE` constructor (`VFunc` tag + opaque `denoteVariadicOpaque`) and their obligations are permanent sorries. - Semantics.lean: `VFunc` + `denoteVariadicOpaque`, `variadicOpaqueE` constructor, `denoteSMap` structural walker, `flattenSameFuncAnd`/`_Or` with `denoteSFold_{and,or}_flatten` proofs, opaque `flattenVariadicOpaque`. - lean.rs: `scalar_variadic_ctor` coalesce/greatest/least arms, `FlattenSameFunc` render, `choose_proof` and/or-flatten and opaque-variadic arms. - Generated.lean regenerated: flatten_and/or proved (axioms: propext only), flatten_coalesce/greatest/least carry PERMANENT SORRY. - lean-mir-rewrite.sh: expected_permanent 7 -> 10. Aggregate lake build green (Docker), PERMANENT SORRY count == 10, script exit 0. `#print axioms` confirms rule_flatten_and/or have no sorryAx. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

…oc (SP2b Slice 6d, Task 5) Close SP2b slice 6d with the differential parity corpus, the termination test, and the full-slice gate for the five `flatten_assoc` rules. - scalar_saturate.rs: `scalar_parity_slice6d` asserts `canonicalize_combined` == the standalone `scalar::canonicalize` oracle across the flatten domain: pure-flatten And/Or (leading/middle/trailing) landing flat 4-ary, the new-domain proof on the three non-boolean variadics (Coalesce/Greatest/Least), deep three-level nesting, flatten feeding a cascade (short-circuit, drop_unit, cross-boundary dedup, single/empty collapse), a mixed-op negative control, and Bool-typed `(1 / 0) = (1 / 0)` error operands inside And nesting (the slice-6c collapse-trap type). - scalar_saturate.rs: `flatten_terminates` asserts saturation converges in a small iteration bound on deep nesting, cross-boundary dedup, and the `and_or_single` self-loop shape the circular-ref skip guards, with no exponential operand growth. - scalar_saturate.rs: `corpus_covers_slice6d` pins the fixture substrings. - eqsat_scalar_corpus: slice-6d section mirroring the corpus cases. GATE GREEN: eqsat unit 389/389 (incl the 3 new tests); relational goldens 1187/1187 across 37 transform slt files, NO --rewrite, 0 failures; Lean gate `ci/test/lean-mir-rewrite.sh` exit 0, PERMANENT SORRY == 10. Test-only change, production path untouched. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Correct the factor_and_or precondition comment (dual-connective, not same-connective) and add a NOTE on choose_proof's variadicOpaqueE arm that it sorries every such rule, a latent footgun if a provable variadicOpaque rule is added later. Comment-only, Generated.lean unaffected. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

…ule) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Port `scalar/rules.rs::factor_and_or` (dual-connective distributive factoring, `(a∧b)∨(a∧c) => a∧(b∨c)`) into the DSL as a Rust builtin in `scalar_builtins.rs`, mirroring `null_prop_variadic`. Full-intersection factoring with the non-empty-residual condition (absorption deferred) and the residual-error gate (residual operands must be error-free, common factor exempt: the CLU-137 narrowing). Adds 8 unit tests covering both fire directions and every no-fire path. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Add the factor_and_or rule to scalar.rewrite, dispatching to the scalar_builtins::factor_and_or builtin added previously. No grammar/codegen change: the generic Tmpl::Builtin dispatch already handles it. SCALAR_COMPILED_RULES grows 31 -> 32; COMPILED_RULES unchanged at 37. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

…-> 11) Bank the three locked Lean refinements for the last scalar rule. Semantics.lean gets its own explicitly named `opaque factorAndOr : ScalarExpr -> ScalarExpr`, mirroring the `constEval`/`nullPropVariadic` opaque precedent, and lean.rs gets the matching `translate_tmpl` arm `"factor_and_or" => format!("factorAndOr {}", args[0])`. Factor is NAMED and MATCHED, not routed through 6d's `variadicOpaqueE` arm. `choose_proof` gets a factor-specific arm keyed on `rhs.contains( "factorAndOr")`, placed BEFORE the generic `is_builtin_rhs` arm, emitting the THIRD sorry category: distributivity IS provable in the two-valued Bool model, so the sorry is a representation artifact of the builtin RHS (not declaratively expressed), dischargeable by declarativizing. Distinct from opaque-computation (const_fold) and outside-value-domain (6d non-Bool flatten). The rule LHS `Scalar(e)` renders to a plain var `e`, so the `variadicOpaqueE` arm is never reached for this rule. Regenerated Generated.lean: exactly one new obligation `rule_factor_and_or` carrying the third-category comment. Trip-wire `expected_permanent` 10 -> 11 in ci/test/lean-mir-rewrite.sh, comment extended. Aggregate Docker `lake build` green, PERMANENT SORRY marker count == 11. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

…r factor_and_or Add the differential parity corpus, termination-under-saturation test, and corpus-coverage assertions for `factor_and_or`, the last ported scalar rule. - `scalar_parity_slice6f`: runs both the combined engine and the standalone `scalar::canonicalize` oracle and asserts parity per case (the fork-5 neutral-portability proof). Covers firing (Or-of-Ands, dual And-of-Ors, n-ary, non-leading factor), non-firing (no common factor, partial share, mixed, single branch), the residual-error gate (blocks an erroring residual, fires on an erroring common factor per CLU-137), and cascade interactions (feeding/fed-by flatten, short-circuit, drop_unit, dedup, absorb, empty). - `factor_terminates` (fork 4): asserts saturation converges (iters <= 10) over factorable nesting, confirming fire-once and no ping-pong with flatten. - `corpus_covers_slice6f` + a slice-6f section in the corpus fixture. Errors are Bool-typed non-literal predicates to avoid the slice-6c collapse trap. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

The slice-6f corpus commit left an unsorted use-import that bin/fmt --check rejects. Reorder to satisfy the fmt gate. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

…aph) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

…Lang Swap the single injection point (scalar::canonicalize_predicates) from the standalone ScalarEGraph to canonicalize_combined. The standalone engine stays present as the parity oracle, so both engines coexist and the differential tests still gate this reroute. Flag enable_eqsat_scalar_canonicalize (default on) keeps the mz_expr reduce path as the off kill-switch. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

The combined engine is production (slice7 commit A). Delete the now-dead standalone driver (scalar/egraph.rs, rules.rs, raise.rs) and strip scalar.rs to the canonicalize_predicates entry, keeping the shared substrate (SNode, ClassAnalysis, lower_into, ScalarLang) the combined engine uses. Repoint the 13 differential-parity tests from the deleted oracle to frozen snapshots so the adversarial scalar-rule coverage survives. Completes SP2b: one CombinedLang engine, all 20 rules ported. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

…odes) Promoting the combined engine to production (this slice) surfaced that factor_and_or picked one And/Or node from the class via find_map over hash-ordered nodes and committed to it. A class can hold several And/Or nodes at once: a drop_unit rewrite (And(Or,Or)) sits in the same class as its pre-drop form (And(Or,Or,true)), and only the dropped form factors. Picking the pre-drop node yields an empty intersection and no fire, so whether factor fired depended on hash iteration order, making the extracted plan nondeterministic (~30% of runs left the predicate unfactored). Rows/errors were unaffected (the two forms are equivalent), but plan text and EXPLAIN were unstable. Try every eligible And/Or node and union each successful factoring. Union is commutative, so the resulting e-class is identical regardless of node order and extraction is reproducible. This also restores the standalone engine's exact semantics: it invoked the rule once per node, firing on every factorable one. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Reword comments that referenced the deleted scalar/egraph.rs, scalar/rules.rs, scalar/raise.rs and "the old/standalone engine" to describe the current code on its own terms. Comment-only, no behavior change. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

The eqsat scalar differential corpus lived under tests/testdata/, which test_runner's datadriven::walk scans, so the walk tried to parse the corpus as a datadriven test file and the run test failed. The corpus is not a datadriven file, it is consumed by an include_str! in the eqsat scalar_saturate unit tests. Relocate it to tests/ (a sibling of testdata/ that neither datadriven walk covers) and update the include path. Behavior-neutral: no corpus content, rule, or sqllogictest change. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

EBindings keyed its four metavar maps (rels, payloads, rests, binary_funcs) by String, forcing a heap allocation per bind on every match attempt during saturation. Every metavar name originates as a compile-time string literal in the generated rule code, so the keys are always 'static. Flip the key type to &'static str and drop the per-insert .to_string() in the codegen templates. Lookups are unchanged: map.get("name") still compiles against &'static str keys via Borrow<str>. binding_arities returns a &'static str-keyed map to match. Behavior-neutral: &str and String hash and order identically, so iteration and extraction are byte-for-byte unchanged. Keys only, no value type or logic change. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

antiguru force-pushed the claude/mir-equality-optimizer-sodbej branch from 3c41464 to 58565b3 Compare June 20, 2026 10:56

antiguru commented Jun 21, 2026

View reviewed changes

antiguru force-pushed the claude/mir-equality-optimizer-sodbej branch 2 times, most recently from a0dac0d to 1d2ed6a Compare June 30, 2026 15:28

antiguru and others added 14 commits July 1, 2026 14:19

eqsat: add step_characteristics helper for join-input hints

b12e22f

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

eqsat: index-aware differential join orderer (Fix A, closes C_some)

b958157

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

eqsat: reuse index under a liftable projection (Fix B, closes C_skew …

8850814

…— E4 parity) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

eqsat: corpus-wide golden regen for Fix C arranged check

a1e6643

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

antiguru force-pushed the claude/mir-equality-optimizer-sodbej branch from 8cc3ae6 to a1e6643 Compare July 1, 2026 12:21

antiguru and others added 11 commits July 1, 2026 15:22

cargo fmt: mz-transform

5fcc866

Pure rustfmt cleanup of pre-existing formatting drift in the crate (long signatures wrapped, no logic changes). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01F2yQ2WPoYvdHXU3b2Pvfus

docs: SP2b slice-1 plan, replace prose em-dashes per house style

a68c28d

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

antiguru and others added 30 commits July 3, 2026 12:09

eqsat: differential parity + guard mutation-test + slice gate for abs…

652f99b

…orb (SP2b Slice 6e, Task 5)

sdd: slice-6e Task 5 complete (guard mutation-proven)

a6290b8

sdd: slice-6e final review READY TO MERGE; slice complete

8408d03

eqsat SP2b slice6b: spec + plan (null/err_prop_variadic builtins)

3d1d23d

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

eqsat SP2b slice6d: spec + plan (flatten_assoc, all 5 associative var…

5ee3f66

…iadics) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

eqsat dsl: FlattenSameFunc filter + FlattenApplies cond + grammar (SP…

141a976

…2b Slice 6d, Task 1) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

eqsat SP2b slice6f: spec + plan (factor_and_or builtin, last scalar r…

5fda700

…ule) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

eqsat SP2b slice7: spec + plan (reroute + delete standalone ScalarEGr…

837a23e

…aph) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

transform: equality-saturation MIR optimizer (mz_transform::eqsat)#37160

transform: equality-saturation MIR optimizer (mz_transform::eqsat)#37160
antiguru wants to merge 174 commits into
MaterializeInc:mainfrom
antiguru:claude/mir-equality-optimizer-sodbej

antiguru commented Jun 19, 2026 •

edited

Loading

Uh oh!

antiguru left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

antiguru commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

antiguru left a comment

Choose a reason for hiding this comment

Critical review: eqsat MIR optimizer

The bigger picture first

Blockers (internal contradictions that make the PR un-mergeable in either flag state)

Soundness surface that is asserted but not enforced

Architecture / cost

Repo hygiene / process

What I'd do to "complete the work"

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

antiguru commented Jun 19, 2026 •

edited

Loading