Skip to content

feat(inprocess): in-process N-validator harness#3642

Open
bdchatham wants to merge 16 commits into
mainfrom
feat/inprocess-harness
Open

feat(inprocess): in-process N-validator harness#3642
bdchatham wants to merge 16 commits into
mainfrom
feat/inprocess-harness

Conversation

@bdchatham

@bdchatham bdchatham commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

in-process N-validator harness

Stands up N sei-chain validators in one Go process — real CometBFT consensus, each node serving its own Tendermint RPC + EVM JSON-RPC (HTTP/WS), with deterministic teardown. The in-process foundation for the SDK "local" provider, and the compute backend for running the YAML query/tx suites in-memory (no docker).

Gated behind the inprocess build tag, so the heavy sei-tendermint/sei-cosmos bring-up never enters a normal seid build. The only production app.App change is retaining the two EVM server handles for teardown (harness-only accessors live in app/app_inprocess.go behind the tag); app.App's public surface doesn't widen and the serve goroutines keep their original bare-panic path.

What's here

  • inprocess/Start(ctx, Options) (*Network, error), per-node Node handles, WaitReady, idempotent Close. Handles mirror the SDK sei.NodeHandle/NetworkHandle by shape (no SDK import — toolchain skew would break the seid build), so a future thin adapter satisfies the interface structurally.
  • integration_test/runner/ — a pluggable execer seam: docker stays the default; WithInProcessNetwork (build-tagged) runs the YAML suites in-memory.

Validator count: 1 or ≥ 3 (never 2)

Driven by CometBFT's block-sync→consensus handoff, not a quorum: N=1 proposes solo; N=2 deadlocks (each node has one peer, IsCaughtUp needs >1) so Start rejects it; N≥3 works. The bring-up invariants (empty valset, gentx-derived peer mesh, EVM-enable, metrics-off, loopback scope) are documented at point-of-use in inprocess/doc.go.

Served surface

TM RPC + EVM JSON-RPC HTTP/WS. No cosmos gRPC (the harness never starts it). REST is an honest parity stub for the SDK handle shape.

Tests

TestInProcessNetwork (N=4) asserts each node serves TM RPC + EVM and round-trips a tx (broadcast on node 0, observed on node 1); plus input-rejection and chain-id guards, and TestInProcessBankModule running the bank suite in-memory.

go test -tags inprocess ./inprocess/ ./integration_test/runner/

Reviewer notes

  • The EVM worker pool is a process-wide singleton; Close deliberately doesn't close it (its sync.Once won't re-fire), so one network runs per process today. De-globalizing it in evmrpc is the fix if repeated Start/Close is ever needed.
  • Not yet wired into CI: the harness runs on-demand (-tags inprocess) and the pipeline still runs the docker suites. CI adoption is a deliberate follow-up.

🤖 Generated with Claude Code

bdchatham and others added 2 commits June 24, 2026 14:05
RegisterLocalServices constructs the EVM HTTP/WS listeners in detached
goroutines that panic on a bind failure. An in-process host running N apps
in one process needs (a) the listener handles to Stop() at teardown and
(b) a single node's bind failure to be a reportable error, not a
process-wide panic that kills all N.

Add evmHTTPServer/evmWSServer handles (EVMHTTPServer/EVMWebSocketServer
getters), and SetEVMServeErr to redirect Start() failures to a buffered
channel. With no channel set (production seid) behavior is unchanged: the
listener still panics on bind failure.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Stands up N sei-chain validators in one Go process reaching real CometBFT
consensus, each serving its own RPC stack (Tendermint RPC + EVM JSON-RPC
HTTP/WS + gRPC), with deterministic teardown. Gated behind the inprocess
build tag so the heavy bring-up never enters a normal seid build.

The load-bearing recipe (vs testutil/network): empty genesis valset
(derive from InitChain), full P2P mesh, EVM enabled on per-node loopback
ports, metrics off, raised conn-tracker ceiling for the loopback burst.

Productionization: fresh per-run chain-id (no cross-run genesis collision),
partial-startup cleanup, per-node EVM serve-error channel, idempotent
Close. Handle methods mirror the SDK sei.NodeHandle signatures by name so a
future adapter satisfies the interface structurally — without importing the
SDK (its module graph + grpc replace conflict would break the seid build).

Test: TestInProcessNetwork stands up N=4, asserts each node serves its RPC
stack, and round-trips a tx (broadcast on node0, observed on node1).

  go test -tags inprocess -run TestInProcessNetwork -v -timeout 300s ./inprocess/

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@bdchatham

Copy link
Copy Markdown
Contributor Author

bugbot run

@github-actions

github-actions Bot commented Jun 24, 2026

Copy link
Copy Markdown

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed✅ passed✅ passed✅ passedJun 30, 2026, 7:58 PM

@cursor

cursor Bot commented Jun 24, 2026

Copy link
Copy Markdown

PR Summary

Medium Risk
Large new consensus/RPC bring-up path and router changes in app.go for test-only fields; production behavior should be unchanged without the inprocess tag, but singleton limits (one network per process, global EVM worker pool) affect test ergonomics.

Overview
Introduces an inprocess package ( inprocess build tag ) that Starts N validators in one process with CometBFT consensus, per-node Tendermint RPC, and EVM JSON-RPC (HTTP/WS), plus WaitReady, Close, and node handles shaped for a future SDK local provider. Bring-up deliberately diverges from testutil/network: empty genesis valset (except N=1), gentx memo → PersistentPeers mesh wiring with a post-collectGentxs guard, loopback TM/P2P, metrics off, per-node EVM ports via injected AppOptions, and rejection of N=2 (block-sync deadlock).

app.App now stores constructed EVM HTTP/WS servers in fields assigned at RegisterLocalServices; app/app_inprocess.go exposes EVMHTTPServer / EVMWebSocketServer only under the same tag so production App API stays unchanged while the harness can Stop() listeners.

integration_test/runner gains an execer seam (docker remains default). WithInProcessNetwork (tagged) runs YAML steps on the host: builds seid with -tags inprocess, prepends a seid shim that injects --home / --node for client subcommands, and sets EVM env vars. Tests cover N=4 consensus + cross-node tx, config guards, and in-process bank YAML.

Reviewed by Cursor Bugbot for commit 0c875ce. Bugbot is set up for automated code reviews on this repo. Configure here.

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit a869e15. Configure here.

@codecov

codecov Bot commented Jun 24, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 0% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.14%. Comparing base (2378fca) to head (0c875ce).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
integration_test/runner/runner.go 0.00% 11 Missing ⚠️
app/app.go 0.00% 2 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3642      +/-   ##
==========================================
- Coverage   58.97%   58.14%   -0.84%     
==========================================
  Files        2263     2176      -87     
  Lines      187223   177070   -10153     
==========================================
- Hits       110421   102953    -7468     
+ Misses      66858    65028    -1830     
+ Partials     9944     9089     -855     
Flag Coverage Δ
sei-chain-pr 47.16% <0.00%> (?)
sei-db 70.41% <ø> (ø)
sei-db-state-db ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
app/app.go 71.19% <0.00%> (-0.09%) ⬇️
integration_test/runner/runner.go 0.00% <0.00%> (ø)

... and 153 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

The harness never starts a cosmos gRPC listener (servergrpc.StartGRPCServer
is only on the seid start path), so enabling GRPC in app.toml and exposing
Node.GRPC() advertised a port nothing binds. Remove the gRPC surface
entirely: harness serves TM RPC + EVM (HTTP/WS) only. REST stays an honest
"" parity stub.

Also: move the harness-only app.App accessors (SetEVMServeErr,
EVMHTTPServer, EVMWebSocketServer) behind //go:build inprocess in
app/app_inprocess.go so production App's public surface stays unchanged;
remove the dead wireMesh path (collectGentxs is authoritative for
persistent-peers); correct serve-error wording to listener-start (construct
-time bind is still fail-fast); state the metrics-off and 0.0.0.0-bind
invariants as standing conditions; stripScheme via strings.CutPrefix.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@bdchatham

Copy link
Copy Markdown
Contributor Author

bugbot run

Comment thread inprocess/handle.go
Explicitly set GRPC.Enable/GRPCWeb.Enable=false so app.toml matches the
"gRPC stays off" comment and can't collide on the fixed default port if the
standard start path is ever wired. Scope doc.go recipe #5's bare "listeners"
to consensus/RPC, and note on EVMRPC/EVMWS that the URL dials loopback while
the listener binds 0.0.0.0.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@bdchatham

Copy link
Copy Markdown
Contributor Author

bugbot run

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

1 issue from previous review remains unresolved.

Fix All in Cursor

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit a1ae06c. Configure here.

@bdchatham bdchatham changed the title feat(inprocess): in-process N-validator harness (C1) feat(inprocess): in-process N-validator harness Jun 24, 2026
…unner execer (C2)

Wire the integration_test/runner to drive a real bank query/tx suite against
the C1 inprocess.Network — no docker.

Runner seam: extract execCmd into an `execer` interface. The docker-exec arm
stays the zero-value default (existing yaml_integration runs unaffected). A new
build-tagged in-process arm (runner_inprocess.go, tag `inprocess`) runs each
command on the host against a `seid` it builds once, redirected to a node via a
PATH shim that prepends `--home "$SEID_HOME"` — so opaque sourced helpers that
call bare `seid` land on the right node without rewriting the commands.

Harness bridge: keyring moves into the node home (so host `seid --home` resolves
it), each home gets a client.toml pinning test keyring + chain-id + that node's
loopback RPC, and Options.ExtraKeys genesis-funds non-validator signing keys
(admin on node 0) mirroring the docker localnode topology the suites sign as.

bank_module/send_funds_test.yaml is GREEN in-memory (N=3, the min topology that
leaves block-sync and forms consensus): a real admin->bank-test send plus
historical balance queries at distinct heights, all four verifiers passing.

  go test -tags inprocess -run TestInProcessBankModule ./integration_test/runner/

Out of scope (process/binary boundary): upgrade + statesync suites.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@bdchatham

Copy link
Copy Markdown
Contributor Author

bugbot run

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

1 issue from previous review remains unresolved.

Fix All in Cursor

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 3a6be99. Configure here.

…geting + cleanup

The N-floor was documented as a >2/3 voting-power quorum (N<3 stay in
block-sync). That is wrong. The real constraint is CometBFT's block-sync
handoff, verified against sei-tendermint and empirically:

- N=1 produces blocks as solo proposer IF onlyValidatorIsUs fires — which
  needs state.Validators.Size()==1 at the blockSync decision. Recipe #1's
  empty genesis valset leaves size 0 there (decision precedes InitChain), so
  the solo node fell into block-sync and hung at height 1. Fixed by pinning
  the single validator into genesis for N=1.
- N=2 deadlocks: each node has exactly 1 peer and BlockPool.IsCaughtUp
  requires >1. Start now rejects N=2 loudly instead of hanging.
- N>=3 works (>=2 peers each). Bank suite stays at N=3.

Corrected the false call-site comment + Options doc; added a doc.go recipe
entry. Guard test now asserts N=2 rejected.

Hardening:
- F2: shim injects --node (client subcommands only; --node is not
  root-persistent, so keys/* would break) so RPC targeting is explicit, not
  client.toml-only. writeClientConfig returns its error (keyring-backend=test
  resolves only from it). Fixed the stale "injects same values defensively"
  comment.
- F5: t.Cleanup removes the temp build dir holding seid.real + shim.
- repoRoot surfaces the Getwd error instead of degrading to ".".

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@bdchatham

Copy link
Copy Markdown
Contributor Author

bugbot run

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

1 issue from previous review remains unresolved.

Fix All in Cursor

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 8403b1e. Configure here.

waitEVMServing now selects on the node's serveErr channel alongside the
poll tick, so a reported EVM listener-start failure short-circuits with
the real error instead of polling eth_blockNumber until the ctx deadline
and masking it as a generic timeout.

Consumption is non-destructive: the received error is re-sent (non-blocking,
slot just freed) so Node.ServeErr() still observes it after WaitReady
returns. Production seid (nil channel -> panic in app.go) is untouched.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@bdchatham

Copy link
Copy Markdown
Contributor Author

bugbot run

…leanup

The harness's P2P mesh is derived implicitly: collectGentxs mutates each
node's tmCfg.P2P.PersistentPeers in place. Correct the doc to describe that
mechanism (it never set PersistentPeers itself) and add a post-collectGentxs
guard that fails loudly for N>=2 if the wiring didn't land, turning a fragile
silent dependency into a fast failure.

Replace the rot-prone "recipe #N" taxonomy with self-describing named
invariants referenced at point-of-use (empty-valset, gentx-derived peer mesh,
EVM-enable injection, metrics-off constraint, loopback bind scope / 0.0.0.0 EVM
caveat, loopback conn-tracker ceiling, validator-count rule).

Also: nolint:gosec on the seid build exec (consistency with siblings); drop the
F5 step-tag comment; probeInterval var -> const; document that ServeErr() must
be read after WaitReady, not concurrently; add a test asserting metrics stay off.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@bdchatham

Copy link
Copy Markdown
Contributor Author

bugbot run

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 30459ac. Configure here.

The diversion (route a listener Start() failure to a channel instead of
panicking) only softened a rare EVM port-bind collision. For a test
harness a loud panic on a rare event is fine, and the diversion's
production footprint is not worth it. Production app.go reverts to the
original bare panic(err) serve goroutines; the sole retained production
change is keeping the constructed EVM HTTP/WS handles so the harness can
Stop() them at teardown.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@bdchatham

Copy link
Copy Markdown
Contributor Author

bugbot run

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 4e143fc. Configure here.

Consolidate the package doc (lead with the validator-count rule, centralize
the N=1 mechanism, distill the invariant prose) and strip work-item
provenance from code comments — "productionizes the spike", "C2 end-to-end
proof", "proven live by", "the point of this demo". Collapse the N-count
re-derivation in the runner test to a cross-reference; the canonical
statement lives in the package doc. No constraint dropped.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@bdchatham bdchatham marked this pull request as ready for review June 30, 2026 14:41
Comment thread integration_test/runner/runner_inprocess.go

@seidroid seidroid Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A well-isolated, build-tagged in-process N-validator test harness with thorough docs and tests; the only production change (two nil-default App fields) is inert. No blockers found — the lone external (Codex) finding appears to be a false positive, and several minor non-blocking notes remain.

Findings: 0 blocking | 7 non-blocking | 2 posted inline

Blockers

  • None at the file/PR level.

Non-blocking

  • Disagreement with the Codex review: it flags runner_inprocess.go:181 claiming the shim's seid --home … --node … query … fails because --node is not a root flag. This is a false positive — cobra resolves the subcommand via stripFlags (which skips --flag value tokens whether or not they're defined) and then parses all interspersed flags against the resolved leaf command, which registers --node (via AddQueryFlagsToCmd/AddTxFlagsToCmd, and StatusCommand adds it locally), with --home inherited as a persistent root flag from tmcli.PrepareBaseCmd. The shim's gating of --node to q|query|tx|status is correct and necessary (avoids adding it to keys).
  • Cursor's second-opinion review (cursor-review.md) is empty — that pass produced no output.
  • PR description mentions a redirectable serve-error channel and a SetEVMServeErr accessor ("Production seid behavior is unchanged when no channel is set"), but neither appears in the diff — app/app.go's serve goroutines still panic(err) and app/app_inprocess.go only exposes EVMHTTPServer/EVMWebSocketServer. Stale description vs. actual change; worth reconciling so reviewers/readers aren't misled.
  • Security (author-disclosed, accepted): the EVM JSON-RPC HTTP/WS listeners bind 0.0.0.0 for the harness lifetime (only TM RPC/P2P are loopback-scoped). Fine for a test harness on ephemeral ports, but flagged here for the record; tightening needs a bind-host option in evmrpc.
  • Documented limitations that are acceptable for current single-network-per-process tests but should not silently regress: (a) the EVM worker-pool process singleton is intentionally not closed, so a second Start/Close in the same process inherits a closed pool; (b) serve goroutines park until process exit if a node is Closed before its EVM start-signal fires. Both are clearly documented.
  • 2 suggestion(s)/nit(s) flagged inline on specific lines.

Comment thread integration_test/runner/runner_inprocess.go Outdated
Comment thread integration_test/runner/runner_inprocess.go
The in-process arm built its shared seid binary lazily in the first YAML
case's subtest and registered t.Cleanup there, so binDir was deleted before
later cases ran (cursor/bugbot: multi-case runs). Build it via a
backendPreparer seam in RunFile against the parent test, before the subtest
loop, so the binary outlives every case.

Also rewrite the seid shim to a two-branch quoted exec form: fixes the
doc-vs-code "appends"->"prepends" wording (seidroid) and drops the unquoted
word-splitting fragility for home paths with spaces. Tighten the touched
comments per the standards-champion xreview (centralize the cleanup-why,
lead the shim doc with the rule, type the serial-cases assumption).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@seidroid seidroid Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A well-documented, build-tag-isolated in-process N-validator test harness. Production risk is essentially zero (only two inert fields added to app.go; everything else is gated behind the inprocess tag). No blockers; the main substantive note is a stats-tracker goroutine leak when EVM is enabled, plus a stale PR description.

Findings: 0 blocking | 5 non-blocking | 1 posted inline

Blockers

  • None at the file/PR level.

Non-blocking

  • Codex finding #1 (shim prepends --node before the subcommand → Cobra rejects it) appears to be a FALSE POSITIVE: Cobra's Find/stripFlags skips the leading --home/--node tokens regardless of whether root knows them, then executes the leaf command (status, balances, etc.) whose merged flagset includes both --node (client flag) and --home (persistent) — pflag parses interspersed flags/positionals, so flag position relative to the subcommand name does not matter. The shim also correctly omits --node for non-client subcommands (keys), which is the only case that would actually fail. That said, the PR description only claims TestInProcessNetwork/TestStartRejectsZeroValidators/TestFreshChainIDPerRun pass — TestInProcessBankModule (the one that actually exercises the shim against seid status/seid q) is NOT confirmed passing, so a quick manual run of the bank suite would put this to rest.
  • PR description is stale relative to the diff: it describes "a redirectable serve-error channel", a SetEVMServeErr accessor, and "Per-node EVM serve-error channel + goroutine recover, so one node's listener-start failure reports instead of killing all N." None of that is in the diff — app/app.go's serve goroutines still panic(err) on start failure, and only EVMHTTPServer()/EVMWebSocketServer() accessors exist. The in-code docs (doc.go/readiness.go) correctly describe the fail-loud panic path, so only the PR summary over-claims; please reconcile it before merge.
  • cursor-review.md was present but empty — the Cursor second-opinion pass produced no output for this synthesis.
  • Accepted/documented caveat worth a second look from a maintainer: the EVM HTTP/WS listeners bind 0.0.0.0 (all interfaces) for the harness lifetime because evmrpc has no bind-host option, while TM RPC/P2P are loopback-scoped. Fine for ephemeral test ports on CI, but it does briefly publish externally-reachable EVM JSON-RPC; tightening needs a bind-host option in evmrpc.
  • 1 suggestion(s)/nit(s) flagged inline on specific lines.

Comment thread inprocess/appoptions.go
NewEVMHTTPServer/NewEVMWebSocketServer spawn a stats-reporter goroutine when
rpc_stats_interval > 0 (the unset default is 10s), on a package global that
EVMServer.Stop never cancels and that's reassigned per node — so N nodes
orphan N-1 reporters. Return a 0 interval from appOptions so the tracker is
skipped, keeping teardown deterministic.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@seidroid seidroid Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A well-isolated, heavily-documented in-process N-validator test harness gated behind the inprocess build tag; the only production change is two retained EVM-server handles plus tag-gated accessors. No blocking correctness or security issues found. Non-blocking notes concern a likely-incorrect Codex finding, an empty Cursor pass, and a lint risk worth confirming.

Findings: 0 blocking | 5 non-blocking | 1 posted inline

Blockers

  • None at the file/PR level.

Non-blocking

  • Codex's only finding (that the seid shim places --node before the subcommand and Cobra will reject it as an unknown root flag) appears INCORRECT and should not block. --node is a local flag on the leaf query/tx/status commands (sei-cosmos/client/flags/flags.go:90,107); Cobra's Find uses stripFlags, which consumes an unknown long flag's value during traversal, so seid.real --home X --node Y q bank balances <addr> resolves to the leaf and then flag-parses --home/--node against the leaf's merged flag set successfully. TestInProcessBankModule drives exactly this path (seid q bank balances, seid tx bank send) and is reported passing.
  • The Cursor second-opinion pass produced no output (cursor-review.md is empty); REVIEW_GUIDELINES.md is also empty, so no repo-specific standards were applied beyond AGENTS.md.
  • Reviewer-noted limitation (acceptable): the EVM worker pool is a process-wide sync.Once singleton that Close deliberately does not drain, so only one Network can run per process. This is documented in handle.go and the PR body; fine for an on-demand harness, but worth a follow-up to de-globalize if repeated Start/Close is ever needed.
  • Not wired into CI (runs only under -tags inprocess); acknowledged as a deliberate follow-up. Worth tracking so the harness doesn't silently bit-rot since the default build never compiles these files.
  • 1 suggestion(s)/nit(s) flagged inline on specific lines.

Comment thread app/app.go

@seidroid seidroid Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A well-documented, build-tagged (inprocess) test-only harness for N-validator in-process networks; the sole production change is harmless retention of two EVM-server handles. No blockers — only minor, mostly pre-existing limitations and a low-severity goroutine leak on the failure path.

Findings: 0 blocking | 5 non-blocking | 1 posted inline

Blockers

  • None at the file/PR level.

Non-blocking

  • Review tooling gaps: cursor-review.md and REVIEW_GUIDELINES.md are both empty/missing, so neither Cursor's pass nor repo-specific guidelines were available for this synthesis (only OpenAI Codex produced output).
  • Codex flagged the shim injecting --node before the subcommand as producing unknown flag: --node for seid q/tx/status. I disagree: cobra strips flags to resolve the command path (consuming --node <val> even when unknown at root) and then parses the full flagset of the resolved leaf command, which does register --node (AddQueryFlagsToCmd / AddTxFlagsToCmd / StatusCommand). Flag position relative to the subcommand is irrelevant, and TestInProcessBankModule would fail if this were broken. No change needed, but worth a quick manual seid q ... confirmation under the shim.
  • Pre-existing limitation (Codex #2): RegisterLocalServices passes DefaultNodeHome (not each node's n.home) to NewEVMHTTPServer/NewEVMWebSocketServer, so EVM hosted-key methods (eth_accounts/eth_sign/eth_sendTransaction) read the wrong keyring under the harness. Not introduced by this PR and not exercised by the harness's served surface, but worth documenting as an in-process-mode caveat alongside the other doc.go invariants.
  • doc.go notes the EVM listeners bind 0.0.0.0 (no bind-host option in evmrpc); acceptable and documented for a loopback-dialed test harness, but means each in-process run briefly exposes EVM JSON-RPC on all interfaces.
  • 1 suggestion(s)/nit(s) flagged inline on specific lines.

Comment thread inprocess/handle.go
Comment thread inprocess/appoptions.go
Comment thread integration_test/runner/runner_inprocess.go Outdated

@amir-deris amir-deris left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — in-process N-validator harness

Thorough, well-documented PR. The production-facing change (app.App retaining two EVM server handles) is clean and additive; the docker-arm refactor in runner.go is byte-for-byte behavior-preserving; gofmt -s, go vet -tags inprocess, and errcheck are all clean; and the genesis/client.toml reimplementations are justified (the cosmos helpers they mirror are genuinely unexported).

All findings below are in the new inprocess-tagged code — none block the production path. The two I'd act on before merge are the port-allocation flake and a Start guard for the one-network-per-process limitation so it fails loudly instead of silently misbehaving.

CI coverage gap (not a rule violation): .golangci.yml uses build-tags: [codeanalysis] / tests: false, so none of the new //go:build inprocess code is linted in CI, and the PR notes it isn't in the test pipeline either — so this code currently has no automated lint or test gate. If the harness is going to be relied on, an -tags inprocess lint+test job is the deeper fix.

Inline comments follow, most-severe first.

Comment thread inprocess/harness.go
Comment thread inprocess/handle.go
Comment thread inprocess/handle.go Outdated
Comment thread integration_test/runner/runner_inprocess.go
Comment thread integration_test/runner/runner_inprocess.go
Comment thread inprocess/appoptions.go
Comment thread inprocess/readiness.go
- freePort: probe 127.0.0.1 + a process-wide allocated-port set, closing the
  dual-stack localhost->::1 family mismatch and intra-process reuse across the
  4*N allocations (residual cross-process TOCTOU documented).
- Start: a one-shot networkStarted guard fails loudly on a second in-process
  network, since the EVM worker pool / metrics printer / Prometheus registries
  are process-global singletons that don't re-init.
- shim: a leading-global-flag branch errors (exit 64) instead of silently
  targeting the default RPC; rewritten as a quoted raw-string template.
- readiness: drop the unused CatchingUp field. The dual-shape /status parse is
  KEPT — the in-process node emits the unwrapped shape, so an enveloped-only
  parse hangs WaitReady (the reviewer's "dead branch" claim was empirically
  wrong; verified by bisect).
- handle.go: document the metrics-printer singleton alongside the worker pool,
  and the parked-EVM-serve-goroutine leak on a pre-first-block stop.

Pin the SeiDB AppOptions explicitly rather than embedding app.TestAppOpts:
delegating would adopt its giga-OFF default and flip the in-process app off
the production Giga execution engine (an unset giga flag selects it).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@bdchatham

Copy link
Copy Markdown
Contributor Author

Thanks for the thorough review, Amir — all addressed in b55917c (also ran it through our standards-champion review). Dispositions:

Acted on

  • Port-allocation flakefreePort now probes 127.0.0.1 + tracks a process-wide allocated set, closing the dual-stack ::1 mismatch and intra-process reuse; residual cross-process TOCTOU documented.
  • One-network-per-process guard — a networkStarted one-shot in Start fails loudly on a second network. Also documented the StopMetricsPrinter singleton you flagged alongside the worker pool.
  • --node allowlist brittleness — the shim now errors (exit 64) on a leading global flag instead of silently targeting :26657; the allowlist is documented as the maintained contract.
  • Shim raw-string literal, serve-goroutine leak / binDir unsync — done / documented as caveats tied to the one-network limit.

Rejected (with evidence)

  • Dead /status branch — the unwrapped branch is the live path: the in-process node's HTTP /status returns the unwrapped shape, and an enveloped-only parse hangs WaitReady. Confirmed by bisect (enveloped-only → TestInProcessNetwork times out in WaitReady; dual-shape → 18s pass). Kept both; did drop the genuinely-unused CatchingUp field per your note.

Declined (with reason)

  • Embed app.TestAppOpts — good catch that it dedups the SeiDB flags, but embedding silently flips the app off the production Giga execution engine: an unset giga flag defaults Giga ON (prior harness + production), while TestAppOpts' zero value forces Giga OFF. Pinned the SeiDB flags explicitly instead, with a forward-guard comment.

Your CI-coverage-gap point is the most important one and exactly right: the inprocess-tagged code isn't linted or run in CI, and that gap let both the giga flip and a readiness regression pass CI green this round — only the local test caught them. Filing a follow-up to wire an -tags inprocess lint+test job.

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes using default effort and found 2 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit b55917c. Configure here.

Comment thread inprocess/harness.go
Comment thread inprocess/handle.go

@seidroid seidroid Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A well-documented, build-tag-gated in-process N-validator harness with only a minimal, safe change to production app.App (two retained EVM server handles). No blocking issues; a handful of non-blocking notes, the main one being that keyring-backed EVM RPC methods on harness nodes read the global ~/.seid home rather than the per-node home.

Findings: 0 blocking | 6 non-blocking | 1 posted inline

Blockers

  • None at the file/PR level.

Non-blocking

  • Keyring-home mismatch (confirmed; raised by Codex): RegisterLocalServices in app/app.go:2733/2748 passes the package-global DefaultNodeHome (~/.seid) as homeDir to NewEVMHTTPServer/NewEVMWebSocketServer. That homeDir is what getTestKeyring reads for the keyring-backed EVM methods (eth_accounts, eth_sign, eth_sendTransaction — see evmrpc/info.go:79, send.go:243, tx.go:436). The harness writes each node's genesis keys into its per-node home (n.home), so those EVM methods on a harness node would resolve against ~/.seid, not the node's keyring. The PR's tests don't exercise those methods (they use TM RPC broadcast + eth_blockNumber) and doc.go doesn't claim them, so this isn't a correctness break today, but the served-surface claim of 'EVM JSON-RPC' is partial; worth documenting the limitation or threading n.home through for the harness path.
  • 0.0.0.0 EVM listener bind (raised by Codex): the EVM HTTP/WS listeners bind all interfaces while TM RPC/P2P are scoped to loopback, making the test EVM endpoints reachable on the host network. This is explicitly acknowledged and justified in inprocess/doc.go (evmrpc has no bind-host option yet, ephemeral ports dialed via 127.0.0.1). Acceptable for a test-only, build-tagged harness; flagging for awareness only.
  • REVIEW_GUIDELINES.md and cursor-review.md were both empty/missing — no repo-specific guideline input and no Cursor second-opinion pass was available for this synthesis. Codex's pass produced three findings, all incorporated.
  • Node.Object() doc (inprocess/handle.go) says it returns the live *node.Node, but it returns h.n.tmNode whose static type is rpclocal.NodeService (handle.go field at the tmNode declaration). Minor doc/type-description drift.
  • No CI coverage yet (acknowledged in the PR description): the harness only runs under -tags inprocess and isn't wired into the pipeline, so regressions in this code won't be caught automatically until the deliberate CI follow-up lands.
  • 1 suggestion(s)/nit(s) flagged inline on specific lines.

Comment thread inprocess/harness.go Outdated
The networkStarted guard was claimed before resolveBaseDir, so a temp-dir
failure (or any pre-app.New error) permanently burned the slot though no EVM
singleton had been touched (cursor/bugbot + Codex). Move the CompareAndSwap to
immediately before the startNode loop — the first point that touches the
process-global singletons — so only a real bring-up claims it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@claude claude Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Code review skipped — your organization's overage spend limit has been reached.

Code review is billed via overage credits. To resume reviews, an organization admin can raise the monthly limit at claude.ai/admin-settings/claude-code.

Once credits are available, reopen this pull request to trigger a review.

@seidroid seidroid Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A well-documented, build-tag-gated in-process N-validator harness whose only production change is retaining two EVM server handles for teardown. No blocking issues found; the one blocker-grade external finding (Codex's --node-placement claim) does not reproduce against the actual Cobra/cosmos flag wiring. A few forward-looking limitations and a lint risk are worth confirming.

Findings: 0 blocking | 6 non-blocking | 1 posted inline

Blockers

  • None at the file/PR level.

Non-blocking

  • Codex finding #1 (the seid --home … --node … q/tx/status shim form is rejected by Cobra because --node is not a root persistent flag) does NOT reproduce and is assessed as a false positive: the repo sets no TraverseChildren, so with Cobra's default the root does not ParseFlags before traversal — Find uses stripFlags (which never errors on unknown flags) to locate the subcommand, then the resolved LEAF command parses the whole arg list. The leaves that run here all register --node (status.go:86, and AddQueryFlagsToCmd/AddTxFlagsToCmd on the balances/account/send leaves); the DisableFlagParsing:true is only on the non-executing q bank grouping parent. The PR's own TestInProcessBankModule drives seid status and seid q bank balances through this exact shim, which would fail at the first step if the claim held. No change needed.
  • Codex finding #2 is a real but forward-looking limitation, not a defect in this PR: existing EVM YAML suites (e.g. integration_test/seidb/flatkv_evm_test.yaml) hard-code http://localhost:8545, while the harness allocates ephemeral EVM ports and exposes them via SEI_EVM_RPC/EVM_RPC. No EVM suite is wired to the in-process arm here (only bank_module/send_funds_test.yaml, which uses no EVM HTTP), so nothing regresses — but adopting EVM suites in-process will require those YAMLs to read $SEI_EVM_RPC instead of the fixed port. Worth a follow-up note in doc.go.
  • cursor-review.md and REVIEW_GUIDELINES.md are both empty — the Cursor second-opinion pass and repo-specific guidelines produced no input to merge.
  • freePort() retains an unavoidable bind-time TOCTOU across the 4*N allocations (probe-close then later bind); it is documented and accepted, but under heavy parallel test load it could occasionally flake a node's serve goroutine into a panic. Acceptable for a test harness; flagging for awareness.
  • The EVM HTTP/WS listeners bind 0.0.0.0 (all interfaces) rather than loopback, unlike TM RPC/P2P which are correctly scoped to 127.0.0.1. This is documented as an accepted caveat (evmrpc has no bind-host option), and ports are ephemeral, but it does briefly expose a serving EVM endpoint on all interfaces during tests.
  • 1 suggestion(s)/nit(s) flagged inline on specific lines.

Comment thread app/app.go
// in-process harness) can Stop() them at teardown. Nil when the respective
// listener is disabled. Production seid does not read these; its process exit
// reaps the listeners.
evmHTTPServer evmrpc.EVMServer

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] These two new unexported fields are written here in the untagged build but only ever read via the accessors in app_inprocess.go, which are behind the inprocess build tag. In a normal (non-inprocess) seid build the fields are assigned and never read — worth confirming make lint stays green, since staticcheck (U1000) can flag write-only unexported fields. If it does complain, a //nolint:unused or a brief justifying comment will keep CI clean.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants