feat(storage): prune only old finalized block signatures, keep blocks forever#453
Conversation
🤖 Kimi Code ReviewThe PR changes block pruning to retain headers and bodies indefinitely while only pruning signatures of finalized blocks outside a retention window. This is a safer approach for consensus. Code correctness and safety:
Performance:
Testing:
Documentation:
Nitpick:
Conclusion: The logic is sound and safely preserves consensus-critical data. The pruning strategy correctly prioritizes safety (never pruning non-finalized data) over storage efficiency. Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt |
🤖 Codex Code ReviewFindings
I did not run Automated review by OpenAI Codex · gpt-5.4 · custom prompt |
Greptile SummaryThis PR relaxes block pruning by replacing the old count-based
Confidence Score: 4/5The change is safe to merge; the pruning logic, boundary conditions, and the guard against touching non-finalized signatures are all correct. The core logic is sound: the cutoff <= finalized_slot guard correctly ensures only fully-finalized signature ranges are deleted, and the test suite covers the three key pruning scenarios. The two small gaps — orphaned signatures being silently skipped and the prune_old_data integration test not asserting BlockSignatures count — are edge-case quality issues rather than current defects. crates/storage/src/store.rs — specifically the orphaned-signature skip path and the integration test assertion coverage.
|
| Filename | Overview |
|---|---|
| crates/storage/src/store.rs | Replaces count-based prune_old_blocks with slot-based prune_old_block_signatures; headers/bodies kept forever, signatures pruned only when the cutoff lies entirely within finalized history. Logic and boundary conditions are correct; two minor gaps: orphaned signatures are silently skipped rather than cleaned up, and the prune_old_data integration test does not assert the BlockSignatures count. |
| CLAUDE.md | Updates block-storage documentation to reflect the new signature-only pruning policy and the fact that get_signed_block returns None for deep historical finalized blocks. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[prune_old_data called] --> B[Compute finalized_slot from latest_finalized checkpoint]
B --> C[Get tip_slot from head block header]
C -->|header found| D[tip_slot = header.slot]
C -->|header missing| E[tip_slot = finalized_slot fallback]
D --> F[prune_old_block_signatures]
E --> F
F --> G[cutoff = tip_slot - SIGNATURE_PRUNING_RANGE]
G --> H{cutoff > finalized_slot?}
H -->|Yes: non-finalized range exceeds window| I[Return 0 - prune nothing]
H -->|No: window is fully within finalized history| J[Scan BlockSignatures table]
J --> K[For each signature key, fetch header from BlockHeaders]
K -->|header.slot < cutoff| L[Mark for deletion]
K -->|header.slot >= cutoff| M[Keep signature]
K -->|header missing - orphaned| N[Skip silently]
L --> O[Delete batch from BlockSignatures only]
O --> P[BlockHeaders and BlockBodies never deleted]
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
A[prune_old_data called] --> B[Compute finalized_slot from latest_finalized checkpoint]
B --> C[Get tip_slot from head block header]
C -->|header found| D[tip_slot = header.slot]
C -->|header missing| E[tip_slot = finalized_slot fallback]
D --> F[prune_old_block_signatures]
E --> F
F --> G[cutoff = tip_slot - SIGNATURE_PRUNING_RANGE]
G --> H{cutoff > finalized_slot?}
H -->|Yes: non-finalized range exceeds window| I[Return 0 - prune nothing]
H -->|No: window is fully within finalized history| J[Scan BlockSignatures table]
J --> K[For each signature key, fetch header from BlockHeaders]
K -->|header.slot < cutoff| L[Mark for deletion]
K -->|header.slot >= cutoff| M[Keep signature]
K -->|header missing - orphaned| N[Skip silently]
L --> O[Delete batch from BlockSignatures only]
O --> P[BlockHeaders and BlockBodies never deleted]
Comments Outside Diff (1)
-
crates/storage/src/store.rs, line 1662-1722 (link)Integration test for
prune_old_datadoesn't assertBlockSignaturescountfallback_pruning_removes_old_states_and_blockscallsprune_old_data()and then assertsBlockHeadersandStatescounts, but never checksBlockSignatures. In this testtip_slot = STATES_TO_KEEP + 4 = 3004 < SIGNATURE_PRUNING_RANGE, socutoffsaturates to 0 and no signatures are pruned — all 3005 should remain. Without an explicit assertion the new signature-pruning path withinprune_old_datagoes unvalidated at the integration level.Prompt To Fix With AI
This is a comment left during a code review. Path: crates/storage/src/store.rs Line: 1662-1722 Comment: **Integration test for `prune_old_data` doesn't assert `BlockSignatures` count** `fallback_pruning_removes_old_states_and_blocks` calls `prune_old_data()` and then asserts `BlockHeaders` and `States` counts, but never checks `BlockSignatures`. In this test `tip_slot = STATES_TO_KEEP + 4 = 3004 < SIGNATURE_PRUNING_RANGE`, so `cutoff` saturates to 0 and no signatures are pruned — all 3005 should remain. Without an explicit assertion the new signature-pruning path within `prune_old_data` goes unvalidated at the integration level. How can I resolve this? If you propose a fix, please make it concise.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
crates/storage/src/store.rs:993-1001
**Orphaned signatures silently skipped, never cleaned up**
When a `BlockSignatures` entry exists but `BlockHeaders` returns `Ok(None)` for the same key, the `?` in the `filter_map` closure short-circuits to `None`, causing the entry to be skipped. Under the new design (headers kept forever) this scenario cannot arise in the forward path, but any signature entries that outlived their headers from prior runs will accumulate in storage indefinitely since nothing else will ever delete them. A defensive pass to collect orphan keys and delete them — or at minimum an instrumented warn — would surface corruption early.
### Issue 2 of 2
crates/storage/src/store.rs:1662-1722
**Integration test for `prune_old_data` doesn't assert `BlockSignatures` count**
`fallback_pruning_removes_old_states_and_blocks` calls `prune_old_data()` and then asserts `BlockHeaders` and `States` counts, but never checks `BlockSignatures`. In this test `tip_slot = STATES_TO_KEEP + 4 = 3004 < SIGNATURE_PRUNING_RANGE`, so `cutoff` saturates to 0 and no signatures are pruned — all 3005 should remain. Without an explicit assertion the new signature-pruning path within `prune_old_data` goes unvalidated at the integration level.
Reviews (1): Last reviewed commit: "feat(storage): prune only old finalized ..." | Re-trigger Greptile
🤖 Claude Code ReviewHere is the review: PR 453 — Review:
|
… forever Replace prune_old_blocks (which deleted headers, bodies, and signatures beyond a fixed block window) with prune_old_block_signatures, which drops only the signatures of old finalized blocks while keeping headers and bodies for the full history. - BlockSignatures is keyed by slot||root (big-endian slot), so pruning iterates in slot order and stops at the first entry past the cutoff, without a per-row BlockHeaders lookup to recover the slot. The shared slot||root codec (also used by LiveChain) is named encode_slot_root_key. - With cutoff = tip_slot - SIGNATURE_PRUNING_RANGE: prune signatures for slot < cutoff only when cutoff <= finalized_slot (healthy finality); during deep non-finality (non-finalized range > window) prune nothing, so non-finalized signatures are never touched. - get_signed_block reads the header first (so it has the slot) and looks up the signature by slot||root; it returns None when absent, including a pruned finalized block. Headers and bodies are kept forever. - prune_old_data computes the tip slot from the head header and runs signature pruning alongside state pruning. - InMemoryBackend::prefix_iterator now returns keys in sorted order to match the RocksDB backend, which the slot-ordered range scans rely on. Note: changes the BlockSignatures key format; existing root-keyed entries are not read after upgrade (old finalized blocks read as pruned). Fine for fresh/checkpoint-synced nodes; no backfill.
6600cb2 to
fb1cf37
Compare
Summary
Relaxes block pruning: instead of deleting old blocks wholesale, keep block headers and bodies forever and prune only the signatures of old finalized blocks. This preserves the full block history (for debugging, re-org safety, and state reconstruction) while still reclaiming the heavy signature data (~3KB+ per block).
Change
prune_old_blocks(deleted headers, bodies, and signatures beyond a fixedBLOCKS_TO_KEEPwindow) withprune_old_block_signatures(finalized_slot, tip_slot).cutoff = tip_slot - SIGNATURE_PRUNING_RANGE:cutoff <= finalized_slot): delete signatures forslot < cutoff(entirely within finalized history);BlockHeadersandBlockBodiesare kept forever; all non-finalized signatures are always retained.get_signed_blockreturnsNonewhen a signature is absent, now including a pruned finalized block (deep historical signed-block serving via BlocksByRoot is lost; peers use checkpoint sync).prune_old_dataderives the tip slot from the head header and runs signature pruning alongside state pruning.Key layout (slot-ordered pruning)
BlockSignaturesis now keyed byslot||root(big-endian slot), reusing the sharedencode_slot_root_keycodec also used byLiveChain.take_while), and no longer does a per-rowBlockHeaderslookup to recover the slot. Every read site already has the slot:get_signed_blockloads the header first.InMemoryBackend::prefix_iteratornow returns keys in sorted order to match the RocksDB backend, which these slot-ordered range scans rely on.Migration
Changes the
BlockSignatureskey format: existing root-keyed entries are not read after upgrade (old finalized blocks read as pruned). Fine for fresh / checkpoint-synced nodes; no backfill.Tests
prune_signatures_keeps_recent_window_when_finality_healthy,prune_signatures_noop_when_non_finalized_range_exceeds_window,prune_signatures_noop_when_tip_within_window.-D warningsclean.Context
Split out of #444 (diff-layer state storage), which depends on this: reconstructing historical states needs block headers retained. #444 stacks on top of this PR.