reduce: add a model-derived `reference` tactic (differential oracle) by frankmcsherry · Pull Request #780 · TimelyDataflow/differential-dataflow

frankmcsherry · 2026-07-02T18:48:07Z

Summary

Adds a second ReduceTactic — reference — that computes reduce's interesting times directly as the join-closure the formal model (formal/Differential/Model.lean) calls Reached, then evaluates them. It runs at parity with the default cursor tactic and exists to differential-test that tactic and tie the code to the proof. The default (cursors::CursorTactic) stays the only tactic on production paths; reference is opt-in.

Why

reduce's hard part is determining which times to re-evaluate (joins of input/output times). The optimized cursor tactic does this with an intricate incremental navigation; there was no independent implementation to check it against, and no executable link to the model. reference is both:

a differential oracle — an independent determination whose output must match the cursor tactic on every input; and
the operational form of the model — its determination is Reached, so agreement with the cursor is the operational↔declarative correspondence tested, and its agreement with f(accumulated input) is Model.lean's stream-correctness tested.

What it does

Per key, two phases:

Determination (Reached) — close the seeds (batch + due pending) under join with the ground set (stored input/output times), truncated at upper. Computed the non-quadratic way: an ordered walk keeping its live join-partners an antichain by advancing them with the running meet. Time-only — reads the histories' times via a new TimeReplay, leaving each history intact.
Application (emit_correct) — walk those times over the same loaded histories, keep accumulations tight by meet, apply the logic, emit corrections.

The naive (uncompacted) closure was ~230× slower on wide intervals; the meet-compaction is what makes it linear — the same idea the cursor tactic uses.

Also included: cursor tactic phase-split

cursors::CursorTactic's per-key compute is refactored into the same two phases (determination, then evaluation), so its structure mirrors the reference's. Semantically identical — validated by the full test suite. This touches the blessed tactic; happy to drop it into a separate PR if preferred.

Performance

Parity with the cursor tactic (examples/reduce_bench.rs, cursor vs reference): churn (reduce-dominated) ~1.0×, fewkeys (wide intervals) ~1.0×, bfs (iterative) ~1.0×.

Correctness & testing (`tests/reduce_reference.rs`)

Differential (differential_*, bfs_*): cursor vs reference, assert_eq at every time, over Pair<u64,u64> partial-order random input with retractions, and an iterative BFS. The drift detector — run in CI to keep the tactics in sync.
Oracle (oracle_*): each tactic vs a from-scratch f(accumulated input) at many times — the executable form of stream-correctness. Independent of the two agreeing, so it catches bugs in shared code (the driver, ValueHistory).
Over-derivation (over_derivation, --features reduce-metrics): the reference is value-blind, so in theory it could derive extra zero-debt addresses. Measured across a cancellation-heavy density sweep: 1.00× — identical interesting-time counts.

Contract

reference computes the full Reached; the cursor tactic is a value-aware equivalent (its consolidation additionally drops advance-cancelled, zero-debt times). Both are output-equivalent; the tests verify output equality, not interesting-set equality — a set-size difference is not a bug.

Files

src/operators/mod.rs — TimeReplay, ValueHistory::{walk, replay_times}.
src/operators/reduce.rs — mod reference, reduce_trace_reference, feature-gated metrics; cursor phase-split.
src/operators/arrange/arrangement.rs, src/collection.rs — reduce_reference entry points.
tests/reduce_reference.rs, examples/reduce_bench.rs, Cargo.toml (reduce-metrics).

🤖 Generated with Claude Code

TimeReplay mirrors the time-facing half of HistoryReplay (time / meet / step / step_while_time_is / advance_buffer_by / buffer) but carries only *times* and walks a shrinking view of the built history rather than popping it, so the underlying history stays intact for a later value walk. advance_buffer_by compacts by join + dedup -- the time-only analogue of consolidation -- which keeps a join-closure over the times an antichain, hence non-quadratic. Adds ValueHistory::{walk, replay_times}. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Adds mod reference, a second ReduceTactic whose determination computes reduce's interesting times directly as formal/Differential/Model.lean's Reached -- the truncated join-closure over the stored input/output/seed times -- the non-quadratic way: an ordered walk that keeps its live join-partners an antichain via meet-compaction, over TimeReplay so it is time-only and leaves the histories intact for the value walk. It runs at parity with the default cursor tactic, and exists as a differential oracle for that tactic and an executable form of the model. Also splits the cursor tactic's per-key compute into the same two phases (determination, then evaluation); semantically identical, validated by the suite. Entry points: Collection::{reduce_reference, reduce_named_reference}, Arranged::{reduce_abelian_reference, reduce_core_reference}, reduce_trace_reference. Optional interesting-time counters in operators::reduce::metrics behind the reduce-metrics feature. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…actic tests/reduce_reference.rs: a differential harness (cursor vs reference, assert_eq at every time) over Pair<u64,u64> partial-order random input with retractions and an iterative BFS; an oracle checking each tactic against a from-scratch f(acc input) (the executable form of Model.lean's stream-correctness); and a feature-gated over-derivation metric -- measured 1.00x, i.e. the value-blind reference derives the same interesting-time set as the value-aware cursor. examples/reduce_bench.rs: a cursor-vs-reference benchmark, at parity across churn / wide-interval / iterative workloads. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

The reference tactic's time-only walk needs a scratch buffer, but it was added as a field on the shared ValueHistory, which the standard cursor tactic allocates and clears per key. Move the buffer to ReferenceThinker (pooled across keys, as before) and pass it into replay_times, which now borrows &self. ValueHistory is byte-identical to before the reference tactic. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The reference tactic is a testing and demonstration oracle, not something to build on. Mark its entry points `#[doc(hidden)]` so they leave the documented surface without inviting downstream reliance: Collection::{reduce_reference, reduce_named_reference}, Arranged::{reduce_abelian_reference, reduce_core_reference}, and reduce_trace_reference. Still callable, so the differential and oracle tests are unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…nged The reference tactic is a testing and demonstration oracle. Inherent methods on the stable public types (Collection::{reduce_reference, reduce_named_reference}, Arranged::{reduce_abelian_reference, reduce_core_reference}) commit us to them under semver and invite reliance, so remove them. The one standalone entry reduce_trace_reference stays (doc-hidden). The arrange / abelian-negate / as_collection glue those methods provided moves into a local `reduce_reference` helper in the test and the benchmark, the only callers. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Move reduce_trace_reference into `pub(crate) mod reference` alongside the tactic it constructs, and re-export it (doc-hidden) from the parent as the sole public handle its out-of-crate tests need. Move the reduce-metrics counters off the top of the file to the bottom; they are an optional diagnostic, not the headline. Also drop a stray blank line left in arrangement.rs by the earlier removal. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The interesting-time counters served only the `over_derivation` characterization test, which measured (not verified) the value-blind reference's over-derivation and asserted a non-correctness property. They cost a Cargo feature, two global AtomicUsize statics, and two fetch_add lines in both tactics' per-key hot loops (and would silently sum across workers). Correctness is already covered by the differential and oracle tests. Drop the feature, the metrics module, the instrumentation, and the test; the measurement stays reproducible via ad-hoc instrumentation when wanted. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

It existed only to race the default cursor tactic against the reference tactic. The reference is a testing and demonstration oracle, not a shipped code path, so a standing benchmark of it does not belong in `examples/`. Remove it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…l order) Phase 2 of the reference tactic visits only the active interesting times, so at each it catches the input/output histories up to `next_time`. `history` is sorted by the total `Ord` and `step` pops the least, but the walk stopped on the partial `t.less_equal(next_time)`. The partial order interleaves with the sort: an `Ord`-earlier time incomparable to `next_time` halts the walk early and strands later edits that are genuinely `<= next_time`. The reduce then accumulates an incomplete input and emits a transiently wrong output, self-correcting one interesting-time later. Non-iterative and single-iterate (two-coordinate) shapes never observe the in-between times, so it stayed hidden; SCC's nested iteration (three coordinates) feeds the transient error back and it compounds into a wrong result. Fix: step the `Ord`-prefix `*t <= next_time`; the existing `less_equal` filter when assembling the buffers selects the true `<= next_time` edits. This matches the value-aware cursor, which walks every time in `Ord` order and filters by `less_equal`. The Model.lean stream-correctness statement is unaffected — the traversal was simply unfaithful to the partial-order accumulation it realizes. Adds `scc_*` to tests/reduce_reference.rs: a differential SCC (two nested product-timed fixpoints) that reaches the three-coordinate depth the prior tests did not. The buggy tactic also thrashed the SCC fixpoint; with the fix it converges (scc_sweep ~0.26s). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

frankmcsherry and others added 10 commits July 2, 2026 14:47

frankmcsherry merged commit 5bbb9f6 into TimelyDataflow:master-next Jul 3, 2026
6 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

reduce: add a model-derived `reference` tactic (differential oracle)#780

reduce: add a model-derived `reference` tactic (differential oracle)#780
frankmcsherry merged 10 commits into
TimelyDataflow:master-nextfrom
frankmcsherry:reduce-reference-tactic

frankmcsherry commented Jul 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

frankmcsherry commented Jul 2, 2026

Summary

Why

What it does

Also included: cursor tactic phase-split

Performance

Correctness & testing (tests/reduce_reference.rs)

Contract

Files

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Correctness & testing (`tests/reduce_reference.rs`)