Skip to content

transform: equality-saturation MIR optimizer (mz_transform::eqsat)#37160

Draft
antiguru wants to merge 174 commits into
MaterializeInc:mainfrom
antiguru:claude/mir-equality-optimizer-sodbej
Draft

transform: equality-saturation MIR optimizer (mz_transform::eqsat)#37160
antiguru wants to merge 174 commits into
MaterializeInc:mainfrom
antiguru:claude/mir-equality-optimizer-sodbej

Conversation

@antiguru

@antiguru antiguru commented Jun 19, 2026

Copy link
Copy Markdown
Member

An e-graph / equality-saturation optimizer over a subset of MIR, wired into the logical optimizer behind enable_eqsat_optimizer (default on) and the physical optimizer behind enable_eqsat_physical_optimizer (default off).

It lowers MIR to a relational IR with real MirScalarExpr payloads, saturates a ported rule engine over an e-graph, extracts the cheapest plan under a memory-primary cost model, and raises back. Unsupported subtrees bail to opaque leaves carried verbatim, so the pass is equivalence preserving on arbitrary input; an arity guard at the transform boundary makes any mismatch a no-op.

The intended win is the cyclic-join decision: on a cyclic (e.g. triangle) join the AGM cost model proves worst-case-optimal (delta) evaluation dominates the binary join on memory and time, where JoinImplementation picks the dominated binary plan. The physical pass commits that decision by tagging the join JoinImplementation::DeltaQuery (synthesized by reusing the real delta planner), which JoinImplementation then leaves unre-planned. WcoJoin is created only for cyclic joins, and a binary join's cost charges its persistent arrangements rather than its transient N-squared output, so the choice is delta where it wins and binary parity elsewhere.

Supporting infrastructure: a polarity-aware extractor that never selects a multiplicity-signed representative as the input of a non-linear Reduce/TopK (with a graceful no-op when a root is otherwise unextractable), a logical/physical phase annotation on rewrite rules, and a constant-column e-class analysis. These are equivalence preserving and scoped to the appropriate pass.

A prior cutover that skipped CanonicalizeMfp under the flag is reverted, since the e-graph does not process every plan and so cannot subsume it.

The logical flag is default-on as the intended steady state (the strangler-fig plan: grow eqsat until it can delete pipeline passes). CI exercises the pass across the SLT corpus with the flag on and surfaces any plan or result diffs; the affected goldens are rewritten to the eqsat-on output. The logical pass is parity, not improvement, today; the genuine plan win is the physical worst-case-optimal-join decision behind the second, default-off flag.

🤖 Generated with Claude Code
https://claude.ai/code/session_016CwwG1wCgqf3v8g6uhf1Nx

@antiguru antiguru force-pushed the claude/mir-equality-optimizer-sodbej branch from 3c41464 to 58565b3 Compare June 20, 2026 10:56

@antiguru antiguru left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical review: eqsat MIR optimizer

I read the engine end-to-end plus the design-review / status / roadmap / polarity-extractor docs. This is genuinely impressive work and the self-assessment in the status doc is unusually honest. The notes below are the gaps I'd want closed (or explicitly punted with eyes open) before this leaves draft, followed by the bigger-picture framing.

The bigger picture first

By the team's own framing, the live logical pass is "parity, not improvement." Every active rule duplicates a rewrite the pipeline already performs, and the one genuine divergence (the WCOJ→DeltaQuery decision) ships only behind the second, default-off flag (enable_eqsat_physical_optimizer). So in the default-on configuration this PR proposes, the user-visible plan benefit today is ≈0, while it adds a 16.5k-line optimizer surface, real optimize latency, and new soundness obligations. The justification is strangler-fig: grow eqsat until it can delete pipeline passes. That's a sound strategy — but no deletion has landed yet, and the one cutover that had landed (skipping CanonicalizeMfp under the flag) is reverted in this very PR because it was unsound (lib.rs:958 comment). So the current state is "extra work for parity," and the merge decision should be judged on that basis, not on the eventual vision.

That's the lens for the blockers below.

Blockers (internal contradictions that make the PR un-mergeable in either flag state)

1. The flag default contradicts itself across the PR. The PR body says "the flag is temporarily defaulted on so CI exercises the pass… Revert to off before un-drafting," and the status doc lists the same revert under "What is left." But definitions.rs (enable_eqsat_optimizer) now carries the opposite comment: "Defaulted on… the intended steady state… not a pre-merge revert." Pick one. Shipping a brand-new experimental optimizer default-on to every user, for zero plan benefit today, is a big call that needs an explicit owner sign-off — not an ambiguity buried in a comment.

2. The SLT goldens are silently coupled to the default-on flag, and nothing pins it. Ten SLT files are rewritten to eqsat-on output (not-null-propagation.slt +218/-… , locally_optimized_plan.slt 118 lines, joins, chbench, aoc_*, record, subquery, demand, reduction_pushdown), but no SLT file sets enable_eqsat_optimizer (grep across test/sqllogictest/ finds zero references). So those goldens pass only because the global default is true. This makes #1 not just a doc nit but a hard contradiction: if you "revert to off before un-drafting," these tests immediately fail; if you keep it on, you're shipping on. There is no flag state in which the PR as written is both green and consistent with its own stated plan. Resolution options: (a) commit to default-on and fix the PR body/status doc; (b) gate each eqsat-divergent golden behind an explicit per-file SET enable_eqsat_optimizer (or mzcompose override) and revert the default to off; (c) revert the golden rewrites. (b) is the honest one if the intent is really "off at merge."

3. Synthesized empty-constant types are placeholders, and the pass is now live. raise::repr_type_of_arity (raise.rs:391) types every column of a synthesized Empty (from union_cancel / empty_false_filter) as Int64?. Its own doc comment still says "The pass is offline; the surrounding optimizer recomputes column types when this is ever wired live" — but it is wired live now. The boundary arity guard in EqSatTransform (transform.rs:64) catches arity drift but not column-type drift. So when one of those rules collapses a branch whose true column types aren't Int64 (e.g. a text/numeric column), eqsat emits an internally-consistent-but-wrong-typed plan. Because Typecheck recomputes bottom-up, the wrong type is self-consistent and may not be rejected — which means it can silently change a query's result-column type, or feed a type-incompatible arm to a sibling Union. Please carry the real ColumnType through the Empty node (it's available at lower time), and add a regression test: an empty-collapsed branch with a non-Int64 column must round-trip its type and survive the final strict Typecheck.

Soundness surface that is asserted but not enforced

4. Negate-join rules are enabled; their soundness rests entirely on the polarity-aware extractor; the matching Lean obligations are sorry. The status doc says the two unsound negate-join rules "were removed," but they're live in the rule file (relational.rewrite:131 distribute_negate_join, :136 factor_negate_join). The newest workstream (2026-06-21-polarity-aware-extractor.md) re-enabled them, with soundness moved into the extractor ("no Negate-rooted representative directly under a non-linear Reduce/TopK") rather than a rule guard. That's a defensible design — but it makes the extractor's polarity lattice a load-bearing correctness mechanism, and the corresponding Lean theorems (negative-multiplicity / non-linear-reduce) are among the 24 sorrys. So the formal-soundness story for the riskiest rules is currently aspirational.

5. The Lean spec is not gated by CI and can rot. There's no CI job that builds src/transform/lean/, and no check that gen-lean's Generated.lean is in sync with the rule file (unlike the repo's other generated artifacts, which have sync gates). 24/32 theorems are sorry. Without a gate, (a) a rule change can desync Generated.lean unnoticed, and (b) the sorry count can grow silently. Either wire a real gate (build Lean, fail on new sorry, assert gen-lean output is regenerated) or state plainly in the docs that the Lean artifact is advisory, not enforced — right now the prose implies more assurance than CI provides.

6. raise alone can emit un-renderable plans; correctness depends on a bundled cleanup. lib.rs/raise.rs note that raise can produce a single Reduce mixing reduction types that ReducePlan::create_from panics on, and only the bundled logical_fixpoint_02 re-run (raise.rs:306) splits it via ReduceReduction. That's an implicit invariant on a comment. Any future caller of raise that skips the cleanup crashes rendering. Make it an enforced invariant + test ("raise output never contains a mixed-reduction Reduce"), or split the reduction inside raise itself.

Architecture / cost

7. The keystone the design-review prescribed was only half-done. The 2026-06-19 design review's #1 recommendation was to retarget ENode onto real MirRelationExpr/MirScalarExpr and delete the dual Rel/lower/raise IR, calling the dual IR "structurally unsuited to replace production passes" (lossy round-trips, type loss, per-operator bail treadmill). What landed is a partial retarget — scalar payloads are real MirScalarExpr, but the relational dual IR remains. So every problem that recommendation targeted persists: raise still drops constant-singleton identity inputs via join_scalars (raise.rs:170), still synthesizes placeholder types (#3), and still bails per-operator. Please reconcile this explicitly: is direct-on-MIR still the plan, or is the dual IR the accepted steady state? The docs don't close the loop with the design review's verdict.

8. The default-on path runs fixpoint_logical_02 twice, back-to-back, plus a full Demand/ProjectionPushdown and two coalesce_mfp passes. EqSatTransform is pushed right after fixpoint_logical_02() in logical_optimizer (lib.rs:815 then the push), and optimize_logical re-runs the entire fixpoint_logical_02 inside raise (raise.rs:306), plus coalesce_mfp twice and full Demand+ProjectionPushdown (eqsat.rs:139-164). For zero plan benefit today, that's a lot of repeated optimizer work. Combined with MAX_PLAN_SIZE = 200 (which makes the pass a no-op on most non-trivial plans anyway), the net effect at default-on is "added latency on small plans, parity output." Please measure the added optimize latency (the status doc itself cites 27s and ~6.5s episodes that were tamed by caps, not by being fast) and let that inform the default-on decision.

9. The "wins" aren't wins. compare_real reports 3 wins / 0 losses / 17 ties, but the 3 are self-described as "cost-model artifacts (eqsat omits a canonicalizing Project)." An omitted canonicalizing projection is plausibly a less-canonical plan, not an improvement. I'd treat the harness as showing parity, and double-check the omitted Project isn't a latent canonicalization regression rather than a feature.

Repo hygiene / process

10. ~3k lines of agent working docs are committed under docs/superpowers/ (plus .superpowers/sdd/e1-report.md), several already stale relative to the code: the design review predates the retarget; the status doc claims "32 rules / negate rules removed / CanonicalizeMfp cutover done" while the code has 34 rules, negate rules enabled, and the cutover reverted. Decide whether these belong in the product repo at all (vs. doc/developer/design/ per existing convention) and, if kept, mark the historical ones as such so they don't read as current truth.

11. 61 commits / 63 files / 16.5k lines in one PR is effectively un-reviewable by a human as a unit. Strongly consider splitting for landing: (a) flag plumbing + no-op registration, (b) the engine, (c) the Lean spec + its CI gate, (d) the SLT golden changes. Each is independently reviewable and bisectable.

What I'd do to "complete the work"

  1. Resolve #1/#2 as a unit — decide the merge-time flag state and make the goldens consistent with it (I'd default off and gate the divergent goldens explicitly until the first real pass deletion lands, since default-on buys nothing today and adds risk).
  2. Fix the empty-constant type loss (#3) and add the type round-trip test; promote the mixed-Reduce invariant (#6) from comment to test.
  3. Add the Lean CI gate or downgrade the soundness claims in prose (#5); track the negate-join sorrys (#4) as explicit release-blockers for the physical flag, not the logical one.
  4. Write the docs reconciliation: dual-IR-vs-direct-on-MIR decision (#7), refresh the stale status doc, and quantify the latency (#8).
  5. Then the strangler-fig payoff becomes real: land the first actual fixpoint deletion behind the flag with a green SLT gate — that's the milestone that converts this from "parity at a cost" into "a pass worth defaulting on."

Happy to dig into any single thread (the polarity lattice, the cost model's memory axis, or the empty-type path) in more depth.


Generated by Claude Code

@antiguru antiguru force-pushed the claude/mir-equality-optimizer-sodbej branch 2 times, most recently from a0dac0d to 1d2ed6a Compare June 30, 2026 15:28
antiguru and others added 14 commits July 1, 2026 14:19
…commit

Squashed feature branch (eqsat MIR optimizer subsystem) for rebase onto main.
Includes: the equality-saturation MIR optimizer (mz_transform::eqsat), the
WMR-lift work, the delta-aware join cost + spelling selector
(enable_eqsat_delta_join_cost), and the cost-model-native acyclic-join commit
(enable_eqsat_native_join_commit) that emits JoinImplementation::Differential at
raise time so JoinImplementation no-ops on eqsat-produced joins.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
…hints

Design notes for the next eqsat sub-project: commit-time delta detection
(crosses==0 && delta_new==0 -> DeltaQuery, recovering JI's free-delta plans on
indexed joins without the AGM over-pick) and restoring JoinInputCharacteristics
hints in the emitter (K/UK/A now; cardinality/filters when SP-B2/B3 land).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
…on design

Strengthen the SP-B1 follow-on design notes per the design review:

- Reframe delta condition (a) crosses==0 as a cheap guard, not independently
  load-bearing: a non-constant cross always needs a broadcast arrangement
  absent from `available`, so (b) delta_new==0 already rejects it. Stop
  attributing VOJ correctness to (a).
- Add the spelling-selector interaction: the delta test runs over canonicalized
  equivalences, so the enable_eqsat_delta_join_cost selector is moot under
  native commit. The two sub-projects do not disagree about the VOJ.
- Name the commit-time vs cost-time decoupling seam and bound delta commit to
  the acyclic case, where the touched input set is order-independent.
- Back Design 2's "display-only" claim: characteristics are set post-decision
  and JI no-ops on committed joins; spec must grep consumers and defer partial
  fields rather than emit None/none().

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
Adds the implementation spec and task plan for the SP-B1 follow-on (delta
detection at commit time + JoinInputCharacteristics on committed joins), and
adds a supersession banner to the brainstorm notes correcting the crosses==0
framing (viability, not a find_bound_expr panic guard).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
Native join commit now populates K/UK/A markers via step_characteristics
instead of None. Introduces NativeJoinFlags to thread the commit and
prioritize_arranged flags. EXPLAIN-only; result rows unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
delta_join_order mirrors binary_join_order: one keyed left-deep path per
driver, or None when the join is disconnected (no delta plan).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
commit_delta_query lowers per-driver paths to DeltaQuery (lookups only, no
stored start key), reusing JI's implement_arrangements/permute_order/
install_lifted_mfp. delta_new_arrangements counts distinct missing
arrangements like delta_queries::plan.

Also adds an n >= 32 guard to delta_join_order to prevent u32 overflow
on wide joins, matching delta_join_terms.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
raise.rs now tries a delta commit before differential: commit DeltaQuery
iff delta_join_order yields a fully-keyed plan AND it needs no new
arrangements (strict) or no more than differential (eager, when
enable_eager_delta_joins). Recovers JI's free-delta plans on indexed joins
that SP-B1 committed as differential. Result rows unchanged; VOJ stays
differential.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
A 1-input "join" produces an unrenderable delta plan (empty per-driver
path panics source_keys derivation). A 2-input join gains nothing from
delta (no intermediate arrangement to save, adds a second redundant
path). Mirror JoinImplementation's num_inputs <= 2 short-circuit
(join_implementation.rs:408-429) by guarding the entire delta branch
behind inputs.len() > 2. Regenerate four golden files where 2-input
joins reverted from type=delta to type=differential.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
…— E4 parity)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
…— strict E4 parity)

The differential join orderer predicted an input 'already arranged' by matching
referenced COLUMN SETS against index key columns. For an expression edge like
foo.a + 4 = baz.a this credited an index on plain foo.a to a start keyed by the
expression foo.a + 4, which the index cannot cover. The orderer then steered to
an order the commit could not honor, and the input was re-arranged (a full scan).

Score the START's arrangement at expression granularity instead. `start_key_exprs`
derives the start's first-edge key as expressions exactly as `commit_differential`
does (find_bound_expr aligned to the first lookup), and `expr_key_arranged`
compares them against the index keys with the same `available.contains(key)`
Vec<MirScalarExpr> equality `implement_arrangements` uses. Lookup steps keep the
column-set check; only the start moved. This is tightening-only: it removes
false-positive arranged predictions, so orders can only improve.

Closes the database-issues#2449 residual: eqsat now reuses foo's index on that
join (parity with JoinImplementation) instead of a full scan.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
@antiguru antiguru force-pushed the claude/mir-equality-optimizer-sodbej branch from 8cc3ae6 to a1e6643 Compare July 1, 2026 12:21
antiguru and others added 11 commits July 1, 2026 15:22
Records the capability-hunt scorecard (E0-E5), the consolidation-not-capability
verdict, Fix A/B/C dispositions, and the open engine-adoption decision.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01F2yQ2WPoYvdHXU3b2Pvfus
Move the 16 eqsat and JoinImplementation design docs out of the flat
design/ directory into design/20260624_eqsat/, and update every path
reference (4 source doc-comments, 10 plans/specs, cross-links between
the docs, and one .claude/plans note). No content changes beyond the
path rewrites.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01F2yQ2WPoYvdHXU3b2Pvfus
Pure rustfmt cleanup of pre-existing formatting drift in the crate
(long signatures wrapped, no logic changes).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01F2yQ2WPoYvdHXU3b2Pvfus
Retirement ledger for subsuming the directional optimizer into eqsat:
transforms classed A (retirable now) through E (eqsat dependencies,
never retirable by subsumption), batched by retire target with a
full-corpus rows+errors parity gate. Names the two-phase ceiling:
subsumption retires A-D, phase 2 (self-contained engine) is needed
for E.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01F2yQ2WPoYvdHXU3b2Pvfus
Port the 21 hand-written scalar rules to the declarative rewrite DSL and
run them through the single CombinedLang machinery, deleting the standalone
EGraph<ScalarLang>. Behavior-neutral, gated by a per-slice differential
oracle against the old engine, with the strict no-rewrite slt-golden gate at
the slice-7 production flip.

Approach A (one grammar over CNode), builtin-applier escape hatch for the 6
eval/type-context rules, determinism-parity scalar extractor, 7-slice
decomposition with slice 1 as the go/no-go.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
Address review blockers B1/B2 and non-blockers N1-N4:
- B1: corpus construction procedure (collect via instrumented callers,
  commit static fixture, three-axis coverage acceptance criterion).
- B2: operationalize the slice-1 gate as extraction-identity + termination,
  and pin the copied saturate bounds (600/1000/100) as constants.
- N1: re-home call_scalar_type + raise into scalar_extract.rs; builtins
  import from there. Slice 1 pins the public API.
- N2: Lean permanent-sorry ledger convention + CI count check (exactly 6).
- N3: concrete relational-golden regression command per slice.
- N4: cite the existing rest... splice usage (relational.rewrite:169).

B3 (claimed 20 rules) rejected: rules() has 21 function pointers
(rules.rs:36-56); the reviewer's enumeration dropped const_fold.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
Planning audit found 3 of slice 1's rules need capability beyond the bare
seam, so they move to the slice that lands that machinery:
- and_or_single (variadic matching) -> slice 2
- not_binary_negate (func-metavar + negate builtin) -> slice 5
- and_or_dedup (operand-dedup builtin) -> slice 6

Slice 1 now ports not_not alone (fixed-func unary), the minimal go/no-go.
Add func-metavar binding as an explicit slice-5 machinery line, and a 2.1
note on fixed-func vs func-metavar scalar-call spelling. Tally 1+3+5+2+5+5=21.

Rule count reconfirmed 21 by direct count of rules() (const_fold through
isnull_fold); the earlier "20" claim dropped const_fold.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
Task-by-task plan to port not_not declaratively and prove the one-grammar-
over-CNode seam end-to-end: scalar match view methods, Pat::SUnary + grammar,
codegen scalar arms + SCALAR_COMPILED_RULES split, lower_scalar, determinism-
parity extractor, scalar saturate driver + canonicalize_combined, the rule
itself, a differential parity harness against the old engine, Lean emission,
and the go/no-go gate. Production stays on the old engine (delete-last is
slice 7).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
Adds MatchGraph::scalar_class_nodes / nodes_by_scalar_sym and the
ScalarIndex/EGraph::scalar_index() builder so the matcher view can
address scalar e-nodes in the combined e-graph, mirroring the existing
relational Index/rel_index. BaseView threads a scalar_index field
(empty in the relational saturation pass, since no scalar rule runs
there yet). ColoredView gets sound no-op stubs for the two new trait
methods, since colored scalar matching is not wired up until a later
task.
Adds Pat::SUnary { func, input } and the Unary[<func>](child) grammar arm,
parsing e.g. Unary[not](Unary[not](x)) into the AST. Codegen for the new
variant lands in a follow-up task; until then codegen.rs's Pat matches
(sym_name, Matcher::node, pat()) and lean.rs's Pat matches (collect_binders,
translate_pat) are non-exhaustive, so mz-transform does not build yet.
antiguru and others added 30 commits July 3, 2026 12:09
Port the two variadic null/error-propagation rules from the standalone
ScalarEGraph engine into the CombinedLang DSL as Rust builtins, mirroring the
already-shipped binary null_prop_binary / err_prop_binary. Behavior-neutral,
colored: false. Task 2 wires the declarative rules.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
Add null_prop_variadic and err_prop_variadic to scalar.rewrite, wiring
Task 1's Rust builtins into two declarative rules. Both mirror the
already-shipped null_prop_binary / err_prop_binary rules: a
null-propagating variadic call folds to null (resp. the operand's error)
when a literal-null (resp. literal-error) operand is present and every
other operand cannot error. Behavior-neutral, colored: false.

Builtin dispatch is fully generic, so no grammar/codegen/AST change is
needed. SCALAR_COMPILED_RULES grows 24 -> 26; COMPILED_RULES unchanged
at 37.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
Regenerate Generated.lean to add the two permanent-sorry obligations for
the null_prop_variadic / err_prop_variadic builtins and bump the trip-wire
count 5 -> 7.

The emitter (translate_tmpl) maps each Tmpl::Builtin RHS name to an opaque
Lean function; Task 2 added the DSL rules but not this mapping, so
regeneration panicked. Extend the match with the two variadic names and
declare the matching opaque nullPropVariadic / errPropVariadic in
Semantics.lean, mirroring the binary precedent exactly. Both obligations
carry a permanent sorry (opaque RHS), so no proof is authored.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
The 'Grows to 6 by slice 6' line was chronology and factually stale
after the count reached 7. Remove it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
Add the slice-6b differential parity corpus and tests for the variadic
null/error propagation rules (null_prop_variadic, err_prop_variadic):

- scalar_parity_slice6b: MakeTimestamp positive cases (null_prop fires on
  leading/middle/trailing null; err_prop fires on a literal error), the
  null-vs-error priority crux (both null and error operand -> the error,
  since null_prop is gated off and err_prop wins), the blocked case
  (1 / c0 can error but is not a literal error, so neither fires),
  Coalesce as a non-null-propagating variadic left untouched, and And/Or
  negative controls (both propagate_nulls == false, so 6b never fires;
  they fold via short-circuit/drop_unit/single with parity preserved).
- null_prop_variadic_guard_blocks_erroring_operand: the load-bearing
  other_can_error gate. Deleting the gate makes this test and the
  null-vs-error parity case fail (null and error literals collide in one
  class, tripping the scalar analysis merge-conflict assertion), and only
  those two.
- null_err_prop_variadic_terminates: saturation converges (<= 10 iters).
- corpus_covers_slice6b + five corpus entries.

Gate: 386/386 eqsat unit tests; relational goldens 1187/1187 across 37
transform slt files, no --rewrite; Lean gate exit 0, PERMANENT SORRY == 7.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
The design claimed the emitter was fully generic on the builtin name.
The Lean side is not: translate_tmpl uses a per-name match and each
builtin needs an opaque decl in Semantics.lean. Correct the Architecture
section to match the shipped mechanism.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
…2b Slice 6d, Task 1)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
…lice 6d, Task 2)

Port `scalar::rules::flatten_assoc` (nested same-op flattening) into the
`rest_filters` idiom for all 5 associative variadics, and wire the seams so
the Task 1 DSL variants (`RestFilter::FlattenSameFunc`, `Cond::FlattenApplies`)
have their apply/guard implementations.

- `rest_filters.rs`: add `FLATTEN_MAX_OPERANDS`, `flatten_inner` (private, the
  one-hop circular-ref skip), `flatten_applies`, `rest_flatten` (the cap). The
  guard and the splice share `flatten_inner` so they never disagree.
- `MatchGraph::cond_flatten_applies`: trait + both impls (BaseView delegates,
  ColoredView inert `false`).
- `codegen.rs`: FilterSplice emit, Cond emit, `cond_is_color_exact => false`,
  serialization arms, and `coalesce`/`greatest`/`least` in `variadic_func_value`.
- `lean.rs`: placeholder `unimplemented!` arm so the exhaustive `RestFilter`
  match compiles. The real Lean render lands in Task 4.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
Add flatten_and/flatten_or/flatten_coalesce/flatten_greatest/flatten_least
to scalar.rewrite, porting scalar::rules::flatten_assoc (nested same-op
flattening) to all 5 associative variadics using the Task 1-2
rest_filters plumbing (`flatten`, `flatten_applies`). This makes
`rest_flatten`/`FLATTEN_MAX_OPERANDS` live, clearing Task 2's two
dead_code warnings without any #[allow(dead_code)].

Also fixes a gap in build/codegen.rs::variadic_func_pat: Task 2 added
coalesce/greatest/least to variadic_func_value (RHS construction) but
not to variadic_func_pat (LHS matching), since no prior rule
pattern-matched those funcs on the LHS. flatten_coalesce/greatest/least
are the first to do so, which a `cargo check` panic
("unknown scalar variadic func keyword: coalesce") caught immediately.
Added the 3 missing match arms, mirroring the existing and/or arms.

SCALAR_COMPILED_RULES: 26 -> 31. COMPILED_RULES: unchanged (37).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
Port the Lean side of `flatten_assoc`. And/Or flatten prove outright via a
new list-flatten helper reusing the `innerAnd`/`innerOr` structural spec plus
the existing `all`/`any` fold-invariance lemmas. Coalesce/Greatest/Least
return non-Bool values (first-non-null/max/min) that the two-valued model
cannot faithfully denote, so they are represented with an opaque
`variadicOpaqueE` constructor (`VFunc` tag + opaque `denoteVariadicOpaque`)
and their obligations are permanent sorries.

- Semantics.lean: `VFunc` + `denoteVariadicOpaque`, `variadicOpaqueE`
  constructor, `denoteSMap` structural walker, `flattenSameFuncAnd`/`_Or` with
  `denoteSFold_{and,or}_flatten` proofs, opaque `flattenVariadicOpaque`.
- lean.rs: `scalar_variadic_ctor` coalesce/greatest/least arms,
  `FlattenSameFunc` render, `choose_proof` and/or-flatten and opaque-variadic
  arms.
- Generated.lean regenerated: flatten_and/or proved (axioms: propext only),
  flatten_coalesce/greatest/least carry PERMANENT SORRY.
- lean-mir-rewrite.sh: expected_permanent 7 -> 10.

Aggregate lake build green (Docker), PERMANENT SORRY count == 10, script
exit 0. `#print axioms` confirms rule_flatten_and/or have no sorryAx.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
…oc (SP2b Slice 6d, Task 5)

Close SP2b slice 6d with the differential parity corpus, the termination
test, and the full-slice gate for the five `flatten_assoc` rules.

- scalar_saturate.rs: `scalar_parity_slice6d` asserts `canonicalize_combined`
  == the standalone `scalar::canonicalize` oracle across the flatten domain:
  pure-flatten And/Or (leading/middle/trailing) landing flat 4-ary, the
  new-domain proof on the three non-boolean variadics
  (Coalesce/Greatest/Least), deep three-level nesting, flatten feeding a
  cascade (short-circuit, drop_unit, cross-boundary dedup, single/empty
  collapse), a mixed-op negative control, and Bool-typed `(1 / 0) = (1 / 0)`
  error operands inside And nesting (the slice-6c collapse-trap type).
- scalar_saturate.rs: `flatten_terminates` asserts saturation converges in a
  small iteration bound on deep nesting, cross-boundary dedup, and the
  `and_or_single` self-loop shape the circular-ref skip guards, with no
  exponential operand growth.
- scalar_saturate.rs: `corpus_covers_slice6d` pins the fixture substrings.
- eqsat_scalar_corpus: slice-6d section mirroring the corpus cases.

GATE GREEN: eqsat unit 389/389 (incl the 3 new tests); relational goldens
1187/1187 across 37 transform slt files, NO --rewrite, 0 failures; Lean gate
`ci/test/lean-mir-rewrite.sh` exit 0, PERMANENT SORRY == 10. Test-only change,
production path untouched.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
Correct the factor_and_or precondition comment (dual-connective, not
same-connective) and add a NOTE on choose_proof's variadicOpaqueE arm
that it sorries every such rule, a latent footgun if a provable
variadicOpaque rule is added later. Comment-only, Generated.lean
unaffected.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
Port `scalar/rules.rs::factor_and_or` (dual-connective distributive
factoring, `(a∧b)∨(a∧c) => a∧(b∨c)`) into the DSL as a Rust builtin in
`scalar_builtins.rs`, mirroring `null_prop_variadic`. Full-intersection
factoring with the non-empty-residual condition (absorption deferred) and
the residual-error gate (residual operands must be error-free, common
factor exempt: the CLU-137 narrowing). Adds 8 unit tests covering both
fire directions and every no-fire path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
Add the factor_and_or rule to scalar.rewrite, dispatching to the
scalar_builtins::factor_and_or builtin added previously. No
grammar/codegen change: the generic Tmpl::Builtin dispatch already
handles it. SCALAR_COMPILED_RULES grows 31 -> 32; COMPILED_RULES
unchanged at 37.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
…-> 11)

Bank the three locked Lean refinements for the last scalar rule.

Semantics.lean gets its own explicitly named `opaque factorAndOr :
ScalarExpr -> ScalarExpr`, mirroring the `constEval`/`nullPropVariadic`
opaque precedent, and lean.rs gets the matching `translate_tmpl` arm
`"factor_and_or" => format!("factorAndOr {}", args[0])`. Factor is NAMED
and MATCHED, not routed through 6d's `variadicOpaqueE` arm.

`choose_proof` gets a factor-specific arm keyed on `rhs.contains(
"factorAndOr")`, placed BEFORE the generic `is_builtin_rhs` arm, emitting
the THIRD sorry category: distributivity IS provable in the two-valued
Bool model, so the sorry is a representation artifact of the builtin RHS
(not declaratively expressed), dischargeable by declarativizing. Distinct
from opaque-computation (const_fold) and outside-value-domain (6d non-Bool
flatten). The rule LHS `Scalar(e)` renders to a plain var `e`, so the
`variadicOpaqueE` arm is never reached for this rule.

Regenerated Generated.lean: exactly one new obligation `rule_factor_and_or`
carrying the third-category comment. Trip-wire `expected_permanent` 10 ->
11 in ci/test/lean-mir-rewrite.sh, comment extended. Aggregate Docker
`lake build` green, PERMANENT SORRY marker count == 11.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
…r factor_and_or

Add the differential parity corpus, termination-under-saturation test, and
corpus-coverage assertions for `factor_and_or`, the last ported scalar rule.

- `scalar_parity_slice6f`: runs both the combined engine and the standalone
  `scalar::canonicalize` oracle and asserts parity per case (the fork-5
  neutral-portability proof). Covers firing (Or-of-Ands, dual And-of-Ors,
  n-ary, non-leading factor), non-firing (no common factor, partial share,
  mixed, single branch), the residual-error gate (blocks an erroring residual,
  fires on an erroring common factor per CLU-137), and cascade interactions
  (feeding/fed-by flatten, short-circuit, drop_unit, dedup, absorb, empty).
- `factor_terminates` (fork 4): asserts saturation converges (iters <= 10) over
  factorable nesting, confirming fire-once and no ping-pong with flatten.
- `corpus_covers_slice6f` + a slice-6f section in the corpus fixture.

Errors are Bool-typed non-literal predicates to avoid the slice-6c collapse trap.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
The slice-6f corpus commit left an unsorted use-import that bin/fmt
--check rejects. Reorder to satisfy the fmt gate.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
…Lang

Swap the single injection point (scalar::canonicalize_predicates) from the
standalone ScalarEGraph to canonicalize_combined. The standalone engine stays
present as the parity oracle, so both engines coexist and the differential
tests still gate this reroute. Flag enable_eqsat_scalar_canonicalize (default
on) keeps the mz_expr reduce path as the off kill-switch.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
The combined engine is production (slice7 commit A). Delete the now-dead
standalone driver (scalar/egraph.rs, rules.rs, raise.rs) and strip scalar.rs to
the canonicalize_predicates entry, keeping the shared substrate (SNode,
ClassAnalysis, lower_into, ScalarLang) the combined engine uses. Repoint the 13
differential-parity tests from the deleted oracle to frozen snapshots so the
adversarial scalar-rule coverage survives. Completes SP2b: one CombinedLang
engine, all 20 rules ported.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
…odes)

Promoting the combined engine to production (this slice) surfaced that
factor_and_or picked one And/Or node from the class via find_map over
hash-ordered nodes and committed to it. A class can hold several And/Or nodes at
once: a drop_unit rewrite (And(Or,Or)) sits in the same class as its pre-drop
form (And(Or,Or,true)), and only the dropped form factors. Picking the pre-drop
node yields an empty intersection and no fire, so whether factor fired depended
on hash iteration order, making the extracted plan nondeterministic
(~30% of runs left the predicate unfactored). Rows/errors were unaffected (the
two forms are equivalent), but plan text and EXPLAIN were unstable.

Try every eligible And/Or node and union each successful factoring. Union is
commutative, so the resulting e-class is identical regardless of node order and
extraction is reproducible. This also restores the standalone engine's exact
semantics: it invoked the rule once per node, firing on every factorable one.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
Reword comments that referenced the deleted scalar/egraph.rs, scalar/rules.rs,
scalar/raise.rs and "the old/standalone engine" to describe the current code on
its own terms. Comment-only, no behavior change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
The eqsat scalar differential corpus lived under tests/testdata/, which
test_runner's datadriven::walk scans, so the walk tried to parse the
corpus as a datadriven test file and the run test failed. The corpus is
not a datadriven file, it is consumed by an include_str! in the eqsat
scalar_saturate unit tests. Relocate it to tests/ (a sibling of testdata/
that neither datadriven walk covers) and update the include path.

Behavior-neutral: no corpus content, rule, or sqllogictest change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
EBindings keyed its four metavar maps (rels, payloads, rests,
binary_funcs) by String, forcing a heap allocation per bind on every
match attempt during saturation. Every metavar name originates as a
compile-time string literal in the generated rule code, so the keys are
always 'static. Flip the key type to &'static str and drop the
per-insert .to_string() in the codegen templates. Lookups are unchanged:
map.get("name") still compiles against &'static str keys via
Borrow<str>. binding_arities returns a &'static str-keyed map to match.

Behavior-neutral: &str and String hash and order identically, so
iteration and extraction are byte-for-byte unchanged. Keys only, no
value type or logic change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QNZHc5J4bdG29HzPbaCu7H
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant