diff --git a/formal/Differential/Model.lean b/formal/Differential/Model.lean index 3a6b7457f..3b08bbb28 100644 --- a/formal/Differential/Model.lean +++ b/formal/Differential/Model.lean @@ -25,7 +25,10 @@ Correspondence with TARGETS2.md: advance-and-consolidate is one instance of the adversary * "The operator, abstracted", steps 1–4 → `seedSet`, `Reached`/`activeSet`/`pendedSet` (step 2, with the truncated closure: a time reached in `Beyond upper` is closed no - further), the `emit_*` fields of `Step` (step 3), `pending_eq` (step 4) + further), the `emit_*` fields of `Step` (step 3), the `pending_*` fields of `Step` + (step 4, relaxed: the implementation may warn about MORE times than the enumeration + defers, so long as they stay beyond the frontier; `Step.of_pending_eq` recovers the + exact choice) * "The goal theorem" → `Run.StreamCorrect`, `StreamCorrectness` * "The round invariant" → `Run.INV` (clause 1 = finalized correctness, clause 2 = pending coverage); `inv_zero` is the base case @@ -106,6 +109,27 @@ theorem acc_mapDomain (g : T → T) (d : T →₀ A) (t : T) Finsupp.sum_mapDomain_index (by simp) (fun b m₁ m₂ => by split <;> simp)] exact Finsupp.sum_congr fun x hx => by simp only [hg x hx] +/-! ### Intra-round compaction is invisible + +The implementation also compacts WITHIN a round: moving through the interesting times in +order, it advances its input and output buffers by joining their times with the meet `m` +of every time still to be examined, and consolidates. Because `m` is at or below +everything the round still reads, this needs no adversary move at all: the two lemmas +below (TARGETS2.md's *join preservation*, specialized to a one-element frontier) say the +compacted buffer yields the same accumulation at every remaining evaluation point and the +same join against every remaining seed as the uncompacted history would. -/ + +/-- Advancing by a meet below `t` and consolidating leaves the accumulation at `t` + unchanged. -/ +theorem acc_mapDomain_sup (d : T →₀ A) {m t : T} (hmt : m ≤ t) : + acc (Finsupp.mapDomain (· ⊔ m) d) t = acc d t := + acc_mapDomain _ d t fun _ _ => ⟨fun h => le_trans le_sup_left h, fun h => sup_le h hmt⟩ + +/-- Joins against a seed above the meet are unchanged by advancing the joinee, so the + enumeration discovers the same interesting times from the compacted buffer. -/ +theorem sup_advance_eq {m s : T} (x : T) (hms : m ≤ s) : s ⊔ (x ⊔ m) = s ⊔ x := by + rw [sup_comm x m, ← sup_assoc, sup_eq_left.mpr hms] + /-! ## The operator, abstracted (`TARGETS2.md`, steps 1–4) -/ /-- Operator state after a round: the stored input and output representatives and the @@ -151,15 +175,20 @@ def pendedSet (σ : State T A) (b : T →₀ A) (upper : Finset T) : Set T := state after, delivering batch `b` and advancing the frontier to `upper`. The emission `e` is supported on the active times and makes the output correct there - (step 3). The new pending set is exactly the deferred times plus the carried ones - (step 4). The new stored input and output are ADVERSARIAL: any update sets whose - accumulations agree with the true ones on `Beyond upper` ("Compaction, modeled - adversarially"). -/ + (step 3). The new pending set CONTAINS the deferred times plus the carried ones, and + is confined beyond the frontier (step 4, relaxed): the implementation warns about + more times than the enumeration defers — e.g. joins with already-finalized times it + kept around "to scare future times" — and over-warning is harmless, since a warned + time that owes nothing is evaluated to an empty correction when it comes due. The + new stored input and output are ADVERSARIAL: any update sets whose accumulations + agree with the true ones on `Beyond upper` ("Compaction, modeled adversarially"). -/ structure Step (f : A → A) (upper : Finset T) (b : T →₀ A) (σ : State T A) (e : T →₀ A) (σ' : State T A) : Prop where emit_support : ↑e.support ⊆ activeSet σ b upper emit_correct : ∀ t ∈ activeSet σ b upper, acc (σ.output + e) t = f (acc (σ.stored + b) t) - pending_eq : σ'.pending = pendedSet σ b upper ∪ (σ.pending ∩ Beyond upper) + pending_super : pendedSet σ b upper ∪ (σ.pending ∩ Beyond upper) ⊆ σ'.pending + pending_confined : σ'.pending ⊆ Beyond upper + pending_finite : σ'.pending.Finite stored_agrees : ∀ t ∈ Beyond upper, acc σ'.stored t = acc (σ.stored + b) t output_agrees : ∀ t ∈ Beyond upper, acc σ'.output t = acc (σ.output + e) t @@ -252,9 +281,8 @@ theorem active_not_beyond {σ : State T A} {b : T →₀ A} {upper : Finset T} { set. -/ theorem Step.reached_beyond_pended {f : A → A} {upper : Finset T} {b : T →₀ A} {σ σ' : State T A} {e : T →₀ A} (hs : Step f upper b σ e σ') - {x : T} (hr : x ∈ reachedSet σ b upper) (hb : x ∈ Beyond upper) : x ∈ σ'.pending := by - rw [hs.pending_eq] - exact Or.inl ⟨hr, hb⟩ + {x : T} (hr : x ∈ reachedSet σ b upper) (hb : x ∈ Beyond upper) : x ∈ σ'.pending := + hs.pending_super (Or.inl ⟨hr, hb⟩) /-- Scenario 1 (cancellation), concretely over `ℕ × ℕ` and `ℤ`: the batch `{+1 at (1,0), −1 at (0,1)}` accumulates to zero at every point of `Beyond {(1,1)}`, @@ -589,18 +617,12 @@ theorem Run.output_agrees_emitted {f : A → A} {n : ℕ} (R : Run T A f n) : /-- Pending times always sit at or beyond the current frontier. -/ theorem Run.pending_beyond {f : A → A} {n : ℕ} (R : Run T A f n) : ∀ r ≤ n, (R.state r).pending ⊆ Beyond (R.frontier r) := by - intro r - induction r with + intro r hr + cases r with | zero => - intro _ rw [R.state_zero] exact Set.empty_subset _ - | succ r _ => - intro hr x hx - rw [(R.step r (by omega)).pending_eq] at hx - rcases hx with hx | hx - · exact hx.2 - · exact hx.2 + | succ r => exact (R.step r (by omega)).pending_confined /-- The three-part discrepancy trace of a round: the change term, the carried staleness, minus the emission. -/ @@ -877,14 +899,10 @@ theorem Run.step_residual {f : A → A} {n : ℕ} (R : Run T A f n) {r : ℕ} (h · exact Or.inl (Or.inl hy) · exact Or.inr (Or.inl hy)) (Or.inl (Finset.mem_coe.mpr hnu)) hnux hcl hxb - refine ⟨p, ?_, hple⟩ - rw [hstep.pending_eq] - exact Or.inl ⟨hpr, hpb⟩ + exact ⟨p, hstep.pending_super (Or.inl ⟨hpr, hpb⟩), hple⟩ · by_cases hyp : ∃ y ∈ (R.state r).pending, y ≤ x ∧ y ∈ Beyond (R.frontier (r + 1)) · obtain ⟨y, hyP, hyx, hyb⟩ := hyp - refine ⟨y, ?_, hyx⟩ - rw [hstep.pending_eq] - exact Or.inr ⟨hyP, hyb⟩ + exact ⟨y, hstep.pending_super (Or.inr ⟨hyP, hyb⟩), hyx⟩ · replace hyp : ∀ y ∈ (R.state r).pending, y ≤ x → y ∉ Beyond (R.frontier (r + 1)) := fun y hy hyx hyb => hyp ⟨y, hy, hyx, hyb⟩ @@ -900,9 +918,7 @@ theorem Run.step_residual {f : A → A} {n : ℕ} (R : Run T A f n) {r : ℕ} (h · exact Or.inl (Or.inl hy) · exact Or.inl (Or.inr hy)) (Or.inr ⟨hp₀P, hp₀due⟩) hp₀x hcl hxb - refine ⟨p, ?_, hple⟩ - rw [hstep.pending_eq] - exact Or.inl ⟨hpr, hpb⟩ + exact ⟨p, hstep.pending_super (Or.inl ⟨hpr, hpb⟩), hple⟩ /-- **The transfer property**: the residual staleness trace, supported in the OLD ground set's closure above a new pending time, can be re-based onto the NEW representatives' @@ -1016,6 +1032,41 @@ theorem pendedSet_subset_supClosure (σ : State T A) (b : T →₀ A) (upper : F pendedSet σ b upper ⊆ supClosure (baseSet σ b upper) := fun _ hx => Reached.mem_supClosure Set.subset_union_right hx.1 +/-- The deferred set is finite: it lives in the sup-closure of a finite base. -/ +theorem pendedSet_finite (σ : State T A) (b : T →₀ A) (upper : Finset T) + (hP : σ.pending.Finite) : (pendedSet σ b upper).Finite := by + refine Set.Finite.subset ?_ (pendedSet_subset_supClosure σ b upper) + refine Set.Finite.supClosure ?_ + exact Set.Finite.union + (Set.Finite.union (Finset.finite_toSet _) (Finset.finite_toSet _)) + (Set.Finite.union (Finset.finite_toSet _) + (Set.Finite.subset hP Set.sdiff_subset)) + +/-- Build a `Step` from the EXACT pending choice — the deferred times plus the carried + ones, nothing more. This is TARGETS2.md's step 4 as originally written; the relaxed + `pending_*` fields follow from exactness. -/ +theorem Step.of_pending_eq {f : A → A} {upper : Finset T} {b : T →₀ A} + {σ σ' : State T A} {e : T →₀ A} (hP : σ.pending.Finite) + (emit_support : ↑e.support ⊆ activeSet σ b upper) + (emit_correct : ∀ t ∈ activeSet σ b upper, + acc (σ.output + e) t = f (acc (σ.stored + b) t)) + (pending_eq : σ'.pending = pendedSet σ b upper ∪ (σ.pending ∩ Beyond upper)) + (stored_agrees : ∀ t ∈ Beyond upper, acc σ'.stored t = acc (σ.stored + b) t) + (output_agrees : ∀ t ∈ Beyond upper, acc σ'.output t = acc (σ.output + e) t) : + Step f upper b σ e σ' where + emit_support := emit_support + emit_correct := emit_correct + pending_super := by rw [pending_eq] + pending_confined := by + rw [pending_eq] + exact Set.union_subset (fun _ hx => hx.2) (fun _ hx => hx.2) + pending_finite := by + rw [pending_eq] + exact Set.Finite.union (pendedSet_finite σ b upper hP) + (Set.Finite.subset hP Set.inter_subset_left) + stored_agrees := stored_agrees + output_agrees := output_agrees + /-! ## Consumption-time re-derivation: the with-output trace, for ANY adversary Rather than carrying the staleness trace across the adversary (the `Transfer`), rebuild @@ -1029,27 +1080,15 @@ stored output times, as the implementation does): it is exactly what makes `Run.stream_correct` unconditional. An output-free variant of the model would still need `Transfer` proved by other means. -/ -/-- Pending sets are finite: they live in the closure of a finite base. -/ +/-- Pending sets are finite. -/ theorem Run.pending_finite {f : A → A} {n : ℕ} (R : Run T A f n) : ∀ r ≤ n, ((R.state r).pending).Finite := by - intro r - induction r with + intro r hr + cases r with | zero => - intro _ rw [R.state_zero] exact Set.finite_empty - | succ r ih => - intro hr - have hrn : r < n := by omega - rw [(R.step r hrn).pending_eq] - refine Set.Finite.union ?_ ?_ - · refine Set.Finite.subset ?_ (pendedSet_subset_supClosure _ _ _) - refine Set.Finite.supClosure ?_ - refine Set.Finite.union - (Set.Finite.union (Finset.finite_toSet _) (Finset.finite_toSet _)) - (Set.Finite.union (Finset.finite_toSet _) ?_) - exact Set.Finite.subset (ih (by omega)) Set.sdiff_subset - · exact Set.Finite.subset (ih (by omega)) Set.inter_subset_left + | succ r => exact (R.step r (by omega)).pending_finite /-- **The with-output re-derivation.** If the staleness at round `r` vanishes wherever no pending time reaches (which `step_residual` supplies), then it has a trace supported @@ -1293,7 +1332,7 @@ noncomputable def scenario1Run : Run (ℕ × ℕ) ℤ s1f 2 where interval_cases r · -- Round 1: seeds `(1,0)` and `(0,1)`, join pended, adversary cancels the input. show Step s1f {(1, 1)} s1b s1st0 s1e1 s1st1 - refine ⟨?_, ?_, rfl, ?_, ?_⟩ + refine Step.of_pending_eq (by exact Set.finite_empty) ?_ ?_ rfl ?_ ?_ · -- emissions at the two seeds, both active intro x hx have hmem : x ∈ ({(1, 0), (0, 1)} : Set (ℕ × ℕ)) := by @@ -1363,7 +1402,11 @@ noncomputable def scenario1Run : Run (ℕ × ℕ) ℤ s1f 2 where s1_not_beyond_10 (Reached.seed (Or.inl (Finsupp.mem_support_iff.mpr (by simp [s1b_apply_01])))) s1_not_beyond_01 - refine ⟨?_, ?_, rfl, ?_, ?_⟩ + have hP1 : (s1st1.pending).Finite := by + exact Set.Finite.union + (pendedSet_finite s1st0 s1b {(1, 1)} (by exact Set.finite_empty)) + (Set.Finite.subset (by exact Set.finite_empty) Set.inter_subset_left) + refine Step.of_pending_eq hP1 ?_ ?_ rfl ?_ ?_ · intro x hx have hmem : x = ((1, 1) : ℕ × ℕ) := by have := Finsupp.support_single_subset (Finset.mem_coe.mp hx) diff --git a/formal/README.md b/formal/README.md index 2b7d4c744..c04789f0b 100644 --- a/formal/README.md +++ b/formal/README.md @@ -21,6 +21,37 @@ The goal theorem (`Model.Run.stream_correct`, quantified as `Model.streamCorrect The enumeration's ground set includes the stored output support (TARGETS2's step 2, as amended there); whether the OUTPUT-FREE enumeration also suffices is the open question recorded in TARGETS2's amendment note. TARGETS2.md states the model in prose; the file header of `Model.lean` has a sentence-by-sentence correspondence table, and `scenario1Run` checks TARGETS2's cancellation scenario end to end. +## The algorithm, in prose + +The proof organizes around four observations; they are worth having separately from the Lean, as a way to study the algorithm. + +**What is owed has no memory.** +At any moment the debt at a time `t` is the difference of two readable quantities: `f` of the input accumulated at or below `t`, minus the corrections emitted at or below `t`. +Compaction of either collection may change anything except accumulations at or beyond the frontier — and everything the operator will ever again read is an accumulation at such a time. +So the compacted input and compacted output together determine the debt at every non-finalized time exactly, and each round can recompute what it owes from scratch rather than carry anything across the compaction (`Run.staleness_rederived`). +This is where keeping the output earns its keep: the input testifies to what SHOULD have been said, the output to what WAS said, and the debt is the gap between the two. +Without the stored output, "what was said" is knowable only by attributing past emissions to joins of input times as they stood at emission; input compaction can orphan that attribution when the constituents later merge or cancel, and re-basing debts onto the surviving times is exactly the open output-free question. + +**Debt is created only above the seeds.** +Values are not enough; the operator must also know the ADDRESSES at which to re-evaluate. +Track the debt across a round and ask what can change it at a time `t`: the input accumulation moves only if a batch update sits at or below `t`, and the emitted accumulation moves only if the round emits at or below `t` — which happens only at active times, all above the seed set `N` (batch times plus pending times now due). +So at a `t` above nothing in `N`, both ingredients are frozen: whatever debt `t` has after the round it already had before, and inductively it had none — pre-existing debt sits above pending times (the invariant's pending-coverage clause), a pending time at or below an in-interval `t` is due and hence in `N`, and a pending time not yet due is beyond `upper`, putting everything above it outside the interval. +Debt never appears at a time nothing touched; `N` is by construction the complete list of a round's creation sites. + +**Seeding on `N` is the full closure, restricted to where debt can live.** +Within `Above(N)` the restriction to "joins containing at least one seed" is vacuous, by absorption: if a join `x` of purely stored times sits above a seed `n`, then `x = x ⊔ n`, a join containing the seed after all (`cl_inter_above_eq_novelJoins`, `Reached.absorb`). +So seeding on the new times and closing against stored input and output times enumerates exactly the full join-closure of everything, intersected with the only region where debt can exist; what the full closure would add — joins of old times above no new time — is precisely the region the accounting above shows was paid and left untouched. +Old times still matter as join partners because `f` is opaque: a batch update at `b` changes the collection everywhere above `b`, but `f` may respond only where the new update first co-habits with an old one, at `s ⊔ b`. +New times are ignition; old input times are fuel; old output times are where past statements sit and may need retraction. +The output times cannot replace the pending set, though: for updates whose values satisfy `f(a) = f(b) = 0` but `f(a + b) ≠ 0`, the constituent evaluations emit nothing, so no output evidence exists anywhere, while the join of their times still owes a first emission — and the adversary may re-represent the input (say, as one update at a common lower bound) so that no visible join structure points there either. +Only the pended time knows. + +**Pends are resumption tokens.** +Within a round the closure is not even completed: a join crossing `upper` is recorded at its first crossing and closed no further. +When the frontier passes it, it enters `N`, and closing it against the stored times regenerates the deeper joins — the pend resumes a suspended closure computation. +Resuming from the CURRENT representatives is sound by the first observation: the debt above a due pend is re-derived from the compacted input and output, so the resumption needs nothing that compaction could have destroyed. +A pend carries no other content — the implementation's own comment, "we know nothing about why we were warned", is the honest reading — which is also why over-warning is harmless and the model asks only that the recorded pends be CONTAINED in the new pending set (`Step.pending_super`). + ## Building ``` diff --git a/formal/TARGETS2.md b/formal/TARGETS2.md index 3f1d24320..6ce2d33f4 100644 --- a/formal/TARGETS2.md +++ b/formal/TARGETS2.md @@ -59,6 +59,7 @@ Round `r+1` computes: 3. Emission: an update set `e_{r+1}` supported on `active`, chosen so that for every `t ∈ active`, `acc (O_r + e_{r+1}) t = f (acc (S_r + b_{r+1}) t)`. (The greedy `dtrace` construction of `Coverage.lean` builds it.) 4. `P_{r+1} = pended ∪ (P_r ∩ Beyond u_{r+1})`, and the adversary chooses fresh representatives `S_{r+1}`, `O_{r+1}`. + (Amended 2026-07-02: the model requires only `P_{r+1} ⊇ pended ∪ (P_r ∩ Beyond u_{r+1})` with `P_{r+1} ⊆ Beyond u_{r+1}` finite — the implementation warns about MORE times than the enumeration defers, e.g. joins with already-finalized times it keeps around "to scare future times", and over-warning is harmless: a warned time owing nothing evaluates to an empty correction when due. The exact choice above is one instance, `Step.of_pending_eq` in the Lean.) ## The goal theorem