Skip to content

feat(storage): diff-layer state storage with an in-memory LRU cache#444

Open
MegaRedHand wants to merge 9 commits into
mainfrom
feat/state-diff-layers
Open

feat(storage): diff-layer state storage with an in-memory LRU cache#444
MegaRedHand wants to merge 9 commits into
mainfrom
feat/state-diff-layers

Conversation

@MegaRedHand

@MegaRedHand MegaRedHand commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

Closes #238

Summary

Replaces aggressive state pruning with a diff-layer storage model plus an in-memory cache, so the full state history stays available cheaply and no state is ever pruned. Builds on the block-signature pruning from #453 (merged to main): keeping block headers/bodies forever is what lets historical states be reconstructed.

State storage

  • Every non-genesis state is stored as a parent-linked StateDiff (StateDiffs table, never pruned), preserving full state history.
  • Full-state snapshots (States table) are written only at anchors: the bootstrap state plus one per SNAPSHOT_ANCHOR_INTERVAL (1024-slot) window crossing. Also never pruned. There is no separate anchor table; an anchor is just a state whose slot crosses a 1024 boundary relative to its parent.
  • StateDiff stores slot, both checkpoints, and the justification fields in full, plus the appended historical_block_hashes tail. config/validators come from the nearest ancestor snapshot (they never change); latest_block_header is read back from BlockHeaders (the stored state caches the real state_root there, so it matches byte-for-byte). These assumptions are documented on StateDiff::from_base.
  • get_state resolves in three tiers: (1) in-memory LRU cache, (2) a full snapshot in States, else (3) reconstruct by walking base_root back to the nearest ancestor snapshot and replaying appended tails forward. Results are memoized in the cache (STATE_CACHE_CAPACITY = 32) so recent/repeated reads stay hot.
  • DiffBase captures the parent (root, hbh_len, slot) before it is consumed into the post-state; StateDiff/DiffBase live in the storage crate.

Why this over snapshot eviction

The earlier approach pruned old snapshots on a hot window and scanned the whole States table to do it. Anchors are kept forever and spread across all slots, so a slot||root-keyed range scan still touches every anchor: bounding the scan doesn't help. Keeping diffs-only (never pruned) plus an LRU for hot reads removes the periodic scan entirely and keeps full history, at the cost of on-demand reconstruction for cold, non-anchor states.

Tests

  • StateDiff build/SSZ round-trip; state reconstruction (single-diff and multi-diff) verified via a cold store with an empty cache so the diff walk is actually exercised; snapshot-only-on-boundary-crossing; bootstrap-anchor snapshot.
  • Storage + blockchain suites green; clippy -D warnings clean.

Validation

  • Local 4-node devnet, ~52-slot soak: all nodes finalize and stay byte-identical (same head/justified/finalized), aggregated attestations packed, zero reconstruction errors / panics.
  • A >1024-slot soak is running to also exercise the second-anchor snapshot write and anchor-to-anchor reconstruction (the short soak only crosses the bootstrap anchor).
  • Migration: existing DBs keep their full states as snapshots; diffs start accruing going forward (no backfill).

@MegaRedHand MegaRedHand force-pushed the feat/state-diff-layers branch 4 times, most recently from 2cb4ca3 to a342fa7 Compare June 23, 2026 18:56
@MegaRedHand MegaRedHand changed the base branch from main to feat/prune-block-signatures June 23, 2026 18:57
@MegaRedHand MegaRedHand force-pushed the feat/prune-block-signatures branch from 6600cb2 to fb1cf37 Compare June 23, 2026 19:27
@MegaRedHand MegaRedHand force-pushed the feat/state-diff-layers branch from a342fa7 to 34d6345 Compare June 23, 2026 19:31
MegaRedHand added a commit that referenced this pull request Jun 23, 2026
… forever (#453)

## Summary

Relaxes block pruning: instead of deleting old blocks wholesale, keep
block headers and bodies **forever** and prune only the signatures of
old finalized blocks. This preserves the full block history (for
debugging, re-org safety, and state reconstruction) while still
reclaiming the heavy signature data (~3KB+ per block).

## Change

- Replaces `prune_old_blocks` (deleted headers, bodies, and signatures
beyond a fixed `BLOCKS_TO_KEEP` window) with
`prune_old_block_signatures(finalized_slot, tip_slot)`.
- Policy, with `cutoff = tip_slot - SIGNATURE_PRUNING_RANGE`:
- **healthy finality** (`cutoff <= finalized_slot`): delete signatures
for `slot < cutoff` (entirely within finalized history);
- **deep non-finality** (non-finalized range exceeds the window): prune
nothing, so non-finalized signatures are never touched.
- `BlockHeaders` and `BlockBodies` are kept forever; all non-finalized
signatures are always retained.
- `get_signed_block` returns `None` when a signature is absent, now
including a pruned finalized block (deep historical signed-block serving
via BlocksByRoot is lost; peers use checkpoint sync).
- `prune_old_data` derives the tip slot from the head header and runs
signature pruning alongside state pruning.

## Key layout (slot-ordered pruning)

- `BlockSignatures` is now keyed by **`slot||root`** (big-endian slot),
reusing the shared `encode_slot_root_key` codec also used by
`LiveChain`.
- Pruning iterates in slot order and **stops at the first entry past the
cutoff** (`take_while`), and no longer does a per-row `BlockHeaders`
lookup to recover the slot. Every read site already has the slot:
`get_signed_block` loads the header first.
- `InMemoryBackend::prefix_iterator` now returns keys in sorted order to
match the RocksDB backend, which these slot-ordered range scans rely on.

## Migration

Changes the `BlockSignatures` key format: existing root-keyed entries
are not read after upgrade (old finalized blocks read as pruned). Fine
for fresh / checkpoint-synced nodes; no backfill.

## Tests

- `prune_signatures_keeps_recent_window_when_finality_healthy`,
`prune_signatures_noop_when_non_finalized_range_exceeds_window`,
`prune_signatures_noop_when_tip_within_window`.
- Storage lib suite green (42 tests); clippy `-D warnings` clean.

## Context

Split out of #444 (diff-layer state storage), which depends on this:
reconstructing historical states needs block headers retained. #444
stacks on top of this PR.

Co-authored-by: Pablo Deymonnaz <pdeymon@fi.uba.ar>
Base automatically changed from feat/prune-block-signatures to main June 23, 2026 20:56
@MegaRedHand MegaRedHand force-pushed the feat/state-diff-layers branch 3 times, most recently from 608e97f to d1a76aa Compare June 24, 2026 17:56
Store every non-genesis state as a parent-linked StateDiff (StateDiffs,
never pruned) plus a full snapshot (States) written only at 1024-slot
anchors (and the bootstrap). Neither table is ever pruned, so the full
state history is preserved cheaply.

- get_state returns an anchor snapshot directly, else reconstructs by
  walking base_root back to the nearest anchor and replaying the appended
  historical_block_hashes tails; config/validators come from the snapshot
  and latest_block_header from BlockHeaders.
- Reconstructed and freshly imported states are memoized in an in-memory
  LRU (STATE_CACHE_CAPACITY), keyed by block root. States are immutable per
  root, so the cache never needs invalidation; it keeps recent reads (e.g. a
  child block's parent state) hot without reconstruction.
- DiffBase captures the parent (root, hbh_len, slot) before it is consumed
  into the post-state. StateDiff/DiffBase live in the storage crate.
- No snapshot eviction and no StateAnchors table: anchors are simply the
  snapshots in States, so the prune-states scan is gone entirely.
@MegaRedHand MegaRedHand force-pushed the feat/state-diff-layers branch from d1a76aa to 1a060f2 Compare June 24, 2026 18:31
The migrated test built its DiffBase from the target post-state itself
(DiffBase::from_state(a, &head_state)), so base.slot/base.hbh_len came
from the target rather than the parent. That made the anchor-boundary
check always false (no snapshot written, contradicting the comment) and
left the diff self-referential, passing only via the cache memoization.

Diff against the genesis anchor already present in the store instead, so
the base correctly describes the parent state.
@MegaRedHand MegaRedHand linked an issue Jun 24, 2026 that may be closed by this pull request
@MegaRedHand MegaRedHand changed the title feat(storage): diff-layer state storage with bounded pruning feat(storage): diff-layer state storage with an in-memory LRU cache Jun 24, 2026
@MegaRedHand MegaRedHand marked this pull request as ready for review June 24, 2026 20:21
The reconstruction and boundary tests built DiffBase struct literals
directly from the crate-internal fields instead of the public
DiffBase::from_state constructor. Use the constructor so the tests
exercise the real construction path (and would catch a regression in
from_state itself).
@github-actions

Copy link
Copy Markdown

🤖 Kimi Code Review

Overall this is a well-architected optimization that significantly reduces storage overhead by diffing states against their parents rather than storing full copies. The logic appears correct and the tests are comprehensive.

Critical Observations

  1. Fragile panic in reconstruction (crates/storage/src/state_diff.rs:182)
    The reconstruct function panics if diffs is empty. While reconstruct_state currently ensures this is unreachable (it only calls reconstruct after pushing at least one diff), this invariant is fragile. Consider returning Option<State> or asserting with a detailed message linking to the invariant.

  2. N+1 query pattern in reconstruction (crates/storage/src/store.rs:1169-1180)
    reconstruct_state performs one DB read per diff in the chain (up to 1024 reads for a full anchor interval). Each get_ssz creates a new read view. For performance, consider opening a single read transaction for the entire walk, though this is acceptable for the current milestone.

  3. Cache lock held during DB operations (crates/storage/src/store.rs:1139, 1146)
    The LRU cache is locked separately for get and put operations. While the window is small, between releasing the lock after the miss and acquiring it for the insert, another thread could reconstruct the same state. Consider restructuring to minimize lock contention, though this is not a correctness issue.

Code Quality & Safety

  • Correctness: The diff logic correctly assumes append-only historical_block_hashes and immutable config/validators. The boundary check for anchor snapshots (slot / INTERVAL > base.slot / INTERVAL) is correct.
  • SSZ Safety: HistoricalBlockHashesTail properly uses SszList with the same limit as the full list, ensuring encoding consistency.
  • Memory Safety: The DiffBase capture before state consumption (line 559 in blockchain/src/store.rs) correctly avoids cloning the multi-MB historical hashes list.
  • Error Handling: The transition from insert_state to insert_state_with_diff in tests properly uses the genesis snapshot as the diff base.

Minor Suggestions

  • Documentation: Add a comment in reconstruct_state explaining why diffs is never empty when reaching the reconstruct call to prevent future refactors from breaking the invariant.
  • Metrics: Consider adding a counter for cache hits vs. reconstructions to monitor performance in production.

Consensus Security

The reconstruction logic preserves all consensus-critical fields (latest_justified, latest_finalized, justification fields) verbatim from the diff. The latest_block_header is retrieved from the BlockHeaders table, which preserves the state_root that was computed when the block was imported. This maintains the cryptographic link between the reconstructed state and the block root.

Verdict: Approved with minor defensive programming suggestions.


Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

@github-actions

Copy link
Copy Markdown

🤖 Codex Code Review

  1. Medium: crates/storage/src/state_diff.rs and crates/storage/src/store.rs make the diff format depend on “parent historical_block_hashes is an exact prefix of child” but only validate len >= base_hbh_len. If that invariant is ever broken by a future STF change or bad DB repair, reconstruction will silently fabricate the wrong state; for one subset of failures it will instead panic on assert! during block import. In consensus code, this should be a fallible check with enough base metadata to validate the prefix/root relationship explicitly.

  2. Medium: crates/storage/src/store.rs, crates/storage/src/store.rs, and crates/storage/src/store.rs add a fairly expensive cold path. reconstruct_state() opens a fresh begin_read() for every diff hop, and get_state() / insert_state_with_diff() deep-clone full States into and out of the LRU. In the worst case that is roughly 1024 diff hops, so one cache miss turns into thousands of DB-read creations plus multiple multi-MB copies. Reusing a single read view for reconstruction and caching Arc<State>-style values would avoid a noticeable restart/RPC regression.

I did not run the test suite successfully here: cargo test hit read-only rustup/cargo home paths in this environment before dependency resolution completed.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

The table dropped the 4 stale rows (LatestKnown/LatestNewAttestations,
GossipSignatures, AggregatedPayloads) that no longer exist as persisted
tables: attestations and gossip signatures are now in-memory Store
buffers. Also fixes the BlockSignatures key (slot||root, not H256).
@greptile-apps

greptile-apps Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR replaces aggressive state pruning with a diff-layer storage model: every non-genesis post-state is persisted as a parent-linked StateDiff (never pruned), while full State snapshots are written only at 1024-slot anchor boundaries. An in-memory LRU cache memoises recently accessed states so the hot path (parent state of the next block) avoids any reconstruction.

  • StateDiff captures slot, checkpoints, justification fields, and the appended historical_block_hashes tail; config/validators are omitted (taken from the nearest ancestor snapshot); latest_block_header is read from BlockHeaders.
  • get_state resolves via (1) LRU cache, (2) States snapshot, or (3) diff-chain walk + reconstruction, memoising the result on every miss.
  • prune_old_states is removed; prune_old_data now only prunes finalized block signatures.

Confidence Score: 4/5

The core diff-layer write and reconstruction paths are correct and the tests cover single-diff, multi-diff, anchor-crossing, and bootstrap-snapshot cases with a cold store. No state is silently lost or incorrectly reconstructed under normal operation.

The doc comment on prune_old_data still describes snapshot eviction that the new design abandoned entirely, which could mislead future maintainers. Anchor states also get a redundant StateDiffs row that is never read back. The backward walk in reconstruct_state ignores the LRU cache for intermediate steps and lacks any depth bound, so DB corruption could cause an indefinite hang. None of these are correctness issues under well-formed data, but the documentation inaccuracy and unbounded walk are worth addressing before the second anchor soak completes.

crates/storage/src/store.rs — specifically the prune_old_data doc comment, insert_state_with_diff (redundant anchor diff), and the reconstruct_state backward walk loop.

Important Files Changed

Filename Overview
crates/storage/src/state_diff.rs New file implementing StateDiff/DiffBase types, SSZ-encode/decode, and the reconstruct() function for replaying diffs onto a snapshot. Well-documented with explicit invariant contracts.
crates/storage/src/store.rs Core storage changes: adds LRU state cache, replaces insert_state with insert_state_with_diff (snapshot-on-anchor), adds reconstruct_state for diff-chain replay, removes prune_old_states. Doc comment on prune_old_data misleadingly describes snapshot eviction that no longer happens; anchor states get a redundant StateDiffs entry; the backward walk has no cycle-detection/depth guard.
crates/blockchain/src/store.rs Captures DiffBase before parent is consumed into post-state, then calls insert_state_with_diff. Test updated to use the new diff API correctly.
crates/storage/src/api/tables.rs Adds StateDiffs table variant and updates ALL_TABLES, name(), plus updates rocksdb.rs to delegate cf_name to table.name() for single source of truth.
crates/storage/src/lib.rs Adds state_diff module and re-exports DiffBase for callers.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant Caller
    participant Store
    participant LRU as LRU Cache
    participant States as States Table (DB)
    participant Diffs as StateDiffs Table (DB)
    participant Headers as BlockHeaders Table (DB)

    Note over Caller,Headers: get_state(root)
    Caller->>Store: get_state(root)
    Store->>LRU: get(root)
    alt Cache hit
        LRU-->>Store: State
        Store-->>Caller: Some(State)
    else Cache miss
        Store->>States: get(root)
        alt Snapshot found
            States-->>Store: State (full snapshot)
        else Not an anchor
            Store->>Store: reconstruct_state(root)
            loop walk back until snapshot found
                Store->>States: get(cursor)
                alt Snapshot found
                    States-->>Store: snapshot State
                else
                    Store->>Diffs: get(cursor)
                    Diffs-->>Store: StateDiff
                    Note over Store: cursor = diff.base_root
                end
            end
            Store->>Headers: get(root)
            Headers-->>Store: latest_block_header
            Store->>Store: reconstruct(snapshot, diffs[], header)
        end
        Store->>LRU: put(root, state)
        Store-->>Caller: Some(State)
    end
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
    participant Caller
    participant Store
    participant LRU as LRU Cache
    participant States as States Table (DB)
    participant Diffs as StateDiffs Table (DB)
    participant Headers as BlockHeaders Table (DB)

    Note over Caller,Headers: get_state(root)
    Caller->>Store: get_state(root)
    Store->>LRU: get(root)
    alt Cache hit
        LRU-->>Store: State
        Store-->>Caller: Some(State)
    else Cache miss
        Store->>States: get(root)
        alt Snapshot found
            States-->>Store: State (full snapshot)
        else Not an anchor
            Store->>Store: reconstruct_state(root)
            loop walk back until snapshot found
                Store->>States: get(cursor)
                alt Snapshot found
                    States-->>Store: snapshot State
                else
                    Store->>Diffs: get(cursor)
                    Diffs-->>Store: StateDiff
                    Note over Store: cursor = diff.base_root
                end
            end
            Store->>Headers: get(root)
            Headers-->>Store: latest_block_header
            Store->>Store: reconstruct(snapshot, diffs[], header)
        end
        Store->>LRU: put(root, state)
        Store-->>Caller: Some(State)
    end
Loading
Prompt To Fix All With AI
Fix the following 4 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 4
crates/storage/src/store.rs:848-852
The doc comment says "evicting old state snapshots" but the new design never prunes snapshots — anchors are kept forever. Only block signatures are removed. The second sentence ("only full-state snapshots outside the hot window … are removed") reinforces the misconception. This will mislead future readers who rely on this doc to understand storage bounds.

```suggestion
    /// Prune finalized block signatures to keep signature storage bounded.
    ///
    /// State diffs, block headers, block bodies, and full-state snapshots are
    /// all retained for the full history and are never pruned. Only signatures
    /// of finalized blocks older than the pruning window are removed.
```

### Issue 2 of 4
crates/storage/src/store.rs:1213-1235
**Redundant `StateDiffs` entry written for every anchor state**

`insert_state_with_diff` always writes a `StateDiffs` row, even when `is_anchor` is true and a full `States` snapshot is also written. In `reconstruct_state` the loop checks `States[cursor]` before `StateDiffs[cursor]`, so when reconstruction reaches an anchor it breaks immediately on the snapshot — the diff row is never read. Over a long-running chain each anchor accumulates a useless diff entry (anchor-slot states get two on-disk representations).

### Issue 3 of 4
crates/storage/src/store.rs:1164-1189
**`reconstruct_state` backward walk skips the LRU cache for intermediate states**

The loop only checks `States` (the DB snapshot table) to decide whether to stop. If an intermediate state (e.g. the direct parent, already imported a moment ago) is present in the LRU cache, the walk continues all the way back to the nearest DB snapshot anyway, doing up to `SNAPSHOT_ANCHOR_INTERVAL` extra DB reads. Checking `self.state_cache.lock().unwrap().get(&cursor)` at the top of each iteration would short-circuit the walk and use the cached state as the reconstruction base, which is safe because states are immutable per root.

### Issue 4 of 4
crates/storage/src/store.rs:1164-1175
**Unbounded backward walk with no cycle detection or depth limit**

The `loop` in `reconstruct_state` terminates only when it finds a snapshot in `States` or a missing diff. With DB corruption (e.g. a `StateDiff` whose `base_root` points to itself or forms a multi-hop cycle), the loop runs indefinitely and hangs the calling thread. A simple guard — capping iterations at `SNAPSHOT_ANCHOR_INTERVAL + 1` and returning `None` if exceeded — would bound the worst case and surface corruption as a clean error rather than a hang.

Reviews (1): Last reviewed commit: "docs: correct storage tables list to mat..." | Re-trigger Greptile

Comment on lines +848 to +852
/// Bound storage by evicting old state snapshots and finalized signatures.
///
/// State diffs, block headers, and block bodies are retained for the full
/// history; only full-state snapshots outside the hot window (diffs remain)
/// and signatures of finalized blocks are removed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 The doc comment says "evicting old state snapshots" but the new design never prunes snapshots — anchors are kept forever. Only block signatures are removed. The second sentence ("only full-state snapshots outside the hot window … are removed") reinforces the misconception. This will mislead future readers who rely on this doc to understand storage bounds.

Suggested change
/// Bound storage by evicting old state snapshots and finalized signatures.
///
/// State diffs, block headers, and block bodies are retained for the full
/// history; only full-state snapshots outside the hot window (diffs remain)
/// and signatures of finalized blocks are removed.
/// Prune finalized block signatures to keep signature storage bounded.
///
/// State diffs, block headers, block bodies, and full-state snapshots are
/// all retained for the full history and are never pruned. Only signatures
/// of finalized blocks older than the pruning window are removed.
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/storage/src/store.rs
Line: 848-852

Comment:
The doc comment says "evicting old state snapshots" but the new design never prunes snapshots — anchors are kept forever. Only block signatures are removed. The second sentence ("only full-state snapshots outside the hot window … are removed") reinforces the misconception. This will mislead future readers who rely on this doc to understand storage bounds.

```suggestion
    /// Prune finalized block signatures to keep signature storage bounded.
    ///
    /// State diffs, block headers, block bodies, and full-state snapshots are
    /// all retained for the full history and are never pruned. Only signatures
    /// of finalized blocks older than the pruning window are removed.
```

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment on lines +1213 to 1235
pub fn insert_state_with_diff(&mut self, root: H256, base: DiffBase, state: State) {
let slot = state.slot;
let is_anchor = slot / SNAPSHOT_ANCHOR_INTERVAL > base.slot / SNAPSHOT_ANCHOR_INTERVAL;

// Snapshot only at anchors; serialize before `state` is consumed.
let snapshot_bytes = is_anchor.then(|| state.to_ssz());
// Memoize the post-state for fast reads, then move it into the diff so
// its multi-MB justification fields are not cloned again.
self.state_cache.lock().unwrap().put(root, state.clone());
let diff_bytes = StateDiff::from_base(base.root, base.hbh_len, state).to_ssz();

let key = root.to_ssz();
let mut batch = self.backend.begin_write().expect("write batch");
let entries = vec![(root.to_ssz(), state.to_ssz())];
batch.put_batch(Table::States, entries).expect("put state");
batch
.put_batch(Table::StateDiffs, vec![(key.clone(), diff_bytes)])
.expect("put state diff");
if let Some(snapshot_bytes) = snapshot_bytes {
batch
.put_batch(Table::States, vec![(key, snapshot_bytes)])
.expect("put state snapshot");
}
batch.commit().expect("commit");
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Redundant StateDiffs entry written for every anchor state

insert_state_with_diff always writes a StateDiffs row, even when is_anchor is true and a full States snapshot is also written. In reconstruct_state the loop checks States[cursor] before StateDiffs[cursor], so when reconstruction reaches an anchor it breaks immediately on the snapshot — the diff row is never read. Over a long-running chain each anchor accumulates a useless diff entry (anchor-slot states get two on-disk representations).

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/storage/src/store.rs
Line: 1213-1235

Comment:
**Redundant `StateDiffs` entry written for every anchor state**

`insert_state_with_diff` always writes a `StateDiffs` row, even when `is_anchor` is true and a full `States` snapshot is also written. In `reconstruct_state` the loop checks `States[cursor]` before `StateDiffs[cursor]`, so when reconstruction reaches an anchor it breaks immediately on the snapshot — the diff row is never read. Over a long-running chain each anchor accumulates a useless diff entry (anchor-slot states get two on-disk representations).

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +1164 to +1189
fn reconstruct_state(&self, root: &H256) -> Option<State> {
// Walk back collecting diffs until we reach a snapshot.
let mut diffs: Vec<StateDiff> = Vec::new();
let mut cursor = *root;
let snapshot = loop {
if let Some(snapshot) = self.get_ssz::<State>(Table::States, &cursor) {
break snapshot;
}
let diff = self.get_ssz::<StateDiff>(Table::StateDiffs, &cursor)?;
cursor = diff.base_root;
diffs.push(diff);
};

// `diffs` runs target -> snapshot child; reverse to snapshot child -> target.
diffs.reverse();

// The latest block header lives in BlockHeaders; the stored state caches
// the real state_root there, so it equals the header byte-for-byte.
let latest_block_header = self.get_block_header(root)?;

Some(crate::state_diff::reconstruct(
snapshot,
&diffs,
latest_block_header,
))
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 reconstruct_state backward walk skips the LRU cache for intermediate states

The loop only checks States (the DB snapshot table) to decide whether to stop. If an intermediate state (e.g. the direct parent, already imported a moment ago) is present in the LRU cache, the walk continues all the way back to the nearest DB snapshot anyway, doing up to SNAPSHOT_ANCHOR_INTERVAL extra DB reads. Checking self.state_cache.lock().unwrap().get(&cursor) at the top of each iteration would short-circuit the walk and use the cached state as the reconstruction base, which is safe because states are immutable per root.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/storage/src/store.rs
Line: 1164-1189

Comment:
**`reconstruct_state` backward walk skips the LRU cache for intermediate states**

The loop only checks `States` (the DB snapshot table) to decide whether to stop. If an intermediate state (e.g. the direct parent, already imported a moment ago) is present in the LRU cache, the walk continues all the way back to the nearest DB snapshot anyway, doing up to `SNAPSHOT_ANCHOR_INTERVAL` extra DB reads. Checking `self.state_cache.lock().unwrap().get(&cursor)` at the top of each iteration would short-circuit the walk and use the cached state as the reconstruction base, which is safe because states are immutable per root.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +1164 to +1175
fn reconstruct_state(&self, root: &H256) -> Option<State> {
// Walk back collecting diffs until we reach a snapshot.
let mut diffs: Vec<StateDiff> = Vec::new();
let mut cursor = *root;
let snapshot = loop {
if let Some(snapshot) = self.get_ssz::<State>(Table::States, &cursor) {
break snapshot;
}
let diff = self.get_ssz::<StateDiff>(Table::StateDiffs, &cursor)?;
cursor = diff.base_root;
diffs.push(diff);
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Unbounded backward walk with no cycle detection or depth limit

The loop in reconstruct_state terminates only when it finds a snapshot in States or a missing diff. With DB corruption (e.g. a StateDiff whose base_root points to itself or forms a multi-hop cycle), the loop runs indefinitely and hangs the calling thread. A simple guard — capping iterations at SNAPSHOT_ANCHOR_INTERVAL + 1 and returning None if exceeded — would bound the worst case and surface corruption as a clean error rather than a hang.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/storage/src/store.rs
Line: 1164-1175

Comment:
**Unbounded backward walk with no cycle detection or depth limit**

The `loop` in `reconstruct_state` terminates only when it finds a snapshot in `States` or a missing diff. With DB corruption (e.g. a `StateDiff` whose `base_root` points to itself or forms a multi-hop cycle), the loop runs indefinitely and hangs the calling thread. A simple guard — capping iterations at `SNAPSHOT_ANCHOR_INTERVAL + 1` and returning `None` if exceeded — would bound the worst case and surface corruption as a clean error rather than a hang.

How can I resolve this? If you propose a fix, please make it concise.

@github-actions

Copy link
Copy Markdown

🤖 Claude Code Review

Now I have everything to write the final review.


PR Review: feat(storage): diff-layer state storage with an in-memory LRU cache

This PR replaces time-windowed state pruning with a parent-linked diff-layer model: every non-genesis block gets a StateDiff entry (never pruned), full-state snapshots are written only at 1024-slot boundary crossings and the bootstrap anchor, and get_state resolves in three tiers — LRU cache, snapshot, or diff-chain reconstruction. The design is conceptually sound and the test coverage is good, but there are a few issues worth addressing before merge.


Finding 1 — Cache populated before DB commit (liveness hazard on crash)

crates/storage/src/store.rs:1221

insert_state_with_diff writes the state into the LRU cache at line 1221, then commits the StateDiffs write batch at line 1234:

self.state_cache.lock().unwrap().put(root, state.clone());   // ← cache, in-memory only
let diff_bytes = StateDiff::from_base(..., state).to_ssz();
let mut batch = self.backend.begin_write()...;
batch.put_batch(Table::StateDiffs, ...)...;
batch.commit().expect("commit");                              // ← DB commit

If the process is killed (OOM, SIGKILL, panic from .expect() in the batch path) between these two points, the cache is discarded on restart (new_state_cache() is called fresh in every construction path) and no StateDiff row exists in RocksDB. On restart, get_state(root) misses the empty cache, misses Table::States, calls reconstruct_stateget_ssz::<StateDiff> returns None → returns None → any child block fails with MissingParentState. Recovery requires the parent block to be re-gossiped from the network (which re-triggers on_block_core since has_state checks only the DB). The fix is simple: move state_cache.lock().unwrap().put(...) to after batch.commit().


Finding 2 — Undocumented write-order precondition for snapshot/reconstruction consistency

crates/storage/src/store.rs:1213 / crates/blockchain/src/store.rs:571

reconstruct_state reads latest_block_header from BlockHeaders[root] (line 1182), which stores the proposer's block.state_root. This matches post_state.latest_block_header.state_root only because on_block_core:571 explicitly sets it after state transition:

post_state.latest_block_header.state_root = state_root;   // blockchain/src/store.rs:571
// ...
store.insert_state_with_diff(block_root, diff_base, post_state);  // :585

If insert_state_with_diff is ever called with a state where lbh.state_root is still H256::ZERO (as it comes out of process_block_header), then the anchor snapshot serialized from that state would have lbh.state_root = ZERO, while reconstruction via get_block_header would return the block's non-zero state_root. The snapshot path and reconstruction path would then disagree on the reconstructed state — making child-block process_block_header parent-root validation non-deterministic depending on whether the parent was hot in cache or reconstructed cold. The state_diff.rs doc comment mentions this on the reader side ("the persisted post-state caches the real state_root there") but insert_state_with_diff's own doc comment does not list it as a precondition on the writer side.


Finding 3 — No depth cap or cycle guard in reconstruct_state

crates/storage/src/store.rs:1172

The diff-walk loop has no visited-set, depth counter, or iteration cap:

let snapshot = loop {
    if let Some(snapshot) = self.get_ssz::<State>(Table::States, &cursor) {
        break snapshot;
    }
    let diff = self.get_ssz::<StateDiff>(Table::StateDiffs, &cursor)?;  // None exits via ?
    cursor = diff.base_root;   // no cycle check
    diffs.push(diff);          // unbounded push
};

If storage corruption produces StateDiff[A].base_root = B and StateDiff[B].base_root = A, neither entry misses the ? and neither is in Table::States, so the loop pushes into diffs indefinitely, eventually OOM-ing the process. This path is not reachable through any correct write sequence (real base_root values come from block hashes, which cannot form cycles), but the loop is one corrupt DB entry away from a hung thread. A simple depth cap of SNAPSHOT_ANCHOR_INTERVAL + 1 would make corruption fail-fast rather than hang.


Finding 4 — Cold reconstruction: up to 1023 DB reads with no intermediate caching

crates/storage/src/store.rs:106

STATE_CACHE_CAPACITY = 32 is appropriate for steady-state block import (each block's parent is hot from the previous write). However, reconstruct_state does one get_ssz call — and thus one begin_read()view.get() round-trip — per step in the diff chain, without caching intermediate states. On a cold node (after restart, or for any historical API read), requesting a state 1023 diffs past the last anchor triggers 1023 separate DB reads and populates the cache with only the single target state, not the intermediates. A second request for that state's sibling (same ancestor snapshot, 1022 diffs away) re-walks from scratch. For a validator node that restarts mid-window or an API serving historical block requests, this means every non-anchor get_state is O(depth) in DB round-trips rather than O(1). Consider either increasing STATE_CACHE_CAPACITY significantly (e.g., to SNAPSHOT_ANCHOR_INTERVAL / 8 ≈ 128) or memoizing intermediate states during the walk.


Finding 5 — put_batch called with inline vec![] (CLAUDE.md formatting rule)

crates/storage/src/store.rs:1227–1232

batch
    .put_batch(Table::StateDiffs, vec![(key.clone(), diff_bytes)])
    .expect("put state diff");
if let Some(snapshot_bytes) = snapshot_bytes {
    batch
        .put_batch(Table::States, vec![(key, snapshot_bytes)])
        .expect("put state snapshot");
}

CLAUDE.md states: "Extract long arguments into variables so formatter can join lines", and shows let entries = vec![...] before batch.put_batch(Table::X, entries) as the preferred form. Every other multi-entry batch write in the same file (e.g., init_store around line 662) follows this pattern. The two new calls use the inline form the convention explicitly discourages.


Automated review by Claude (Anthropic) · sonnet · custom prompt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Block building optimizations Implement diff-layers for state calculation

1 participant