reduce: add a model-derived reference tactic (differential oracle)#780
Merged
frankmcsherry merged 10 commits intoJul 3, 2026
Merged
Conversation
TimeReplay mirrors the time-facing half of HistoryReplay (time / meet / step /
step_while_time_is / advance_buffer_by / buffer) but carries only *times* and walks
a shrinking view of the built history rather than popping it, so the underlying
history stays intact for a later value walk. advance_buffer_by compacts by join +
dedup -- the time-only analogue of consolidation -- which keeps a join-closure over
the times an antichain, hence non-quadratic. Adds ValueHistory::{walk, replay_times}.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Adds mod reference, a second ReduceTactic whose determination computes reduce's
interesting times directly as formal/Differential/Model.lean's Reached -- the
truncated join-closure over the stored input/output/seed times -- the non-quadratic
way: an ordered walk that keeps its live join-partners an antichain via
meet-compaction, over TimeReplay so it is time-only and leaves the histories intact
for the value walk. It runs at parity with the default cursor tactic, and exists as
a differential oracle for that tactic and an executable form of the model.
Also splits the cursor tactic's per-key compute into the same two phases
(determination, then evaluation); semantically identical, validated by the suite.
Entry points: Collection::{reduce_reference, reduce_named_reference},
Arranged::{reduce_abelian_reference, reduce_core_reference}, reduce_trace_reference.
Optional interesting-time counters in operators::reduce::metrics behind the
reduce-metrics feature.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…actic tests/reduce_reference.rs: a differential harness (cursor vs reference, assert_eq at every time) over Pair<u64,u64> partial-order random input with retractions and an iterative BFS; an oracle checking each tactic against a from-scratch f(acc input) (the executable form of Model.lean's stream-correctness); and a feature-gated over-derivation metric -- measured 1.00x, i.e. the value-blind reference derives the same interesting-time set as the value-aware cursor. examples/reduce_bench.rs: a cursor-vs-reference benchmark, at parity across churn / wide-interval / iterative workloads. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The reference tactic's time-only walk needs a scratch buffer, but it was added as a field on the shared ValueHistory, which the standard cursor tactic allocates and clears per key. Move the buffer to ReferenceThinker (pooled across keys, as before) and pass it into replay_times, which now borrows &self. ValueHistory is byte-identical to before the reference tactic. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The reference tactic is a testing and demonstration oracle, not something to
build on. Mark its entry points `#[doc(hidden)]` so they leave the documented
surface without inviting downstream reliance: Collection::{reduce_reference,
reduce_named_reference}, Arranged::{reduce_abelian_reference,
reduce_core_reference}, and reduce_trace_reference. Still callable, so the
differential and oracle tests are unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nged
The reference tactic is a testing and demonstration oracle. Inherent methods
on the stable public types (Collection::{reduce_reference, reduce_named_reference},
Arranged::{reduce_abelian_reference, reduce_core_reference}) commit us to them
under semver and invite reliance, so remove them. The one standalone entry
reduce_trace_reference stays (doc-hidden). The arrange / abelian-negate /
as_collection glue those methods provided moves into a local `reduce_reference`
helper in the test and the benchmark, the only callers.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Move reduce_trace_reference into `pub(crate) mod reference` alongside the tactic it constructs, and re-export it (doc-hidden) from the parent as the sole public handle its out-of-crate tests need. Move the reduce-metrics counters off the top of the file to the bottom; they are an optional diagnostic, not the headline. Also drop a stray blank line left in arrangement.rs by the earlier removal. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The interesting-time counters served only the `over_derivation` characterization test, which measured (not verified) the value-blind reference's over-derivation and asserted a non-correctness property. They cost a Cargo feature, two global AtomicUsize statics, and two fetch_add lines in both tactics' per-key hot loops (and would silently sum across workers). Correctness is already covered by the differential and oracle tests. Drop the feature, the metrics module, the instrumentation, and the test; the measurement stays reproducible via ad-hoc instrumentation when wanted. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
It existed only to race the default cursor tactic against the reference tactic. The reference is a testing and demonstration oracle, not a shipped code path, so a standing benchmark of it does not belong in `examples/`. Remove it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…l order) Phase 2 of the reference tactic visits only the active interesting times, so at each it catches the input/output histories up to `next_time`. `history` is sorted by the total `Ord` and `step` pops the least, but the walk stopped on the partial `t.less_equal(next_time)`. The partial order interleaves with the sort: an `Ord`-earlier time incomparable to `next_time` halts the walk early and strands later edits that are genuinely `<= next_time`. The reduce then accumulates an incomplete input and emits a transiently wrong output, self-correcting one interesting-time later. Non-iterative and single-iterate (two-coordinate) shapes never observe the in-between times, so it stayed hidden; SCC's nested iteration (three coordinates) feeds the transient error back and it compounds into a wrong result. Fix: step the `Ord`-prefix `*t <= next_time`; the existing `less_equal` filter when assembling the buffers selects the true `<= next_time` edits. This matches the value-aware cursor, which walks every time in `Ord` order and filters by `less_equal`. The Model.lean stream-correctness statement is unaffected — the traversal was simply unfaithful to the partial-order accumulation it realizes. Adds `scc_*` to tests/reduce_reference.rs: a differential SCC (two nested product-timed fixpoints) that reaches the three-coordinate depth the prior tests did not. The buggy tactic also thrashed the SCC fixpoint; with the fix it converges (scc_sweep ~0.26s). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds a second
ReduceTactic—reference— that computesreduce's interesting times directly as the join-closure the formal model (formal/Differential/Model.lean) callsReached, then evaluates them. It runs at parity with the default cursor tactic and exists to differential-test that tactic and tie the code to the proof. The default (cursors::CursorTactic) stays the only tactic on production paths;referenceis opt-in.Why
reduce's hard part is determining which times to re-evaluate (joins of input/output times). The optimized cursor tactic does this with an intricate incremental navigation; there was no independent implementation to check it against, and no executable link to the model.referenceis both:Reached, so agreement with the cursor is the operational↔declarative correspondence tested, and its agreement withf(accumulated input)isModel.lean's stream-correctness tested.What it does
Per key, two phases:
Reached) — close the seeds (batch + due pending) under join with the ground set (stored input/output times), truncated atupper. Computed the non-quadratic way: an ordered walk keeping its live join-partners an antichain by advancing them with the running meet. Time-only — reads the histories' times via a newTimeReplay, leaving each history intact.emit_correct) — walk those times over the same loaded histories, keep accumulations tight by meet, apply the logic, emit corrections.The naive (uncompacted) closure was ~230× slower on wide intervals; the meet-compaction is what makes it linear — the same idea the cursor tactic uses.
Also included: cursor tactic phase-split
cursors::CursorTactic's per-keycomputeis refactored into the same two phases (determination, then evaluation), so its structure mirrors the reference's. Semantically identical — validated by the full test suite. This touches the blessed tactic; happy to drop it into a separate PR if preferred.Performance
Parity with the cursor tactic (
examples/reduce_bench.rs, cursor vs reference): churn (reduce-dominated) ~1.0×, fewkeys (wide intervals) ~1.0×, bfs (iterative) ~1.0×.Correctness & testing (
tests/reduce_reference.rs)differential_*,bfs_*): cursor vs reference,assert_eqat every time, overPair<u64,u64>partial-order random input with retractions, and an iterative BFS. The drift detector — run in CI to keep the tactics in sync.oracle_*): each tactic vs a from-scratchf(accumulated input)at many times — the executable form of stream-correctness. Independent of the two agreeing, so it catches bugs in shared code (the driver,ValueHistory).over_derivation,--features reduce-metrics): the reference is value-blind, so in theory it could derive extra zero-debt addresses. Measured across a cancellation-heavy density sweep: 1.00× — identical interesting-time counts.Contract
referencecomputes the fullReached; the cursor tactic is a value-aware equivalent (its consolidation additionally drops advance-cancelled, zero-debt times). Both are output-equivalent; the tests verify output equality, not interesting-set equality — a set-size difference is not a bug.Files
src/operators/mod.rs—TimeReplay,ValueHistory::{walk, replay_times}.src/operators/reduce.rs—mod reference,reduce_trace_reference, feature-gatedmetrics; cursor phase-split.src/operators/arrange/arrangement.rs,src/collection.rs—reduce_referenceentry points.tests/reduce_reference.rs,examples/reduce_bench.rs,Cargo.toml(reduce-metrics).🤖 Generated with Claude Code