Skip to content

Fix/ws byte mismatch verify signed payload#14

Merged
maltsev-dev merged 4 commits into
masterfrom
fix/ws-byte-mismatch-verify-signed-payload
Jun 18, 2026
Merged

Fix/ws byte mismatch verify signed payload#14
maltsev-dev merged 4 commits into
masterfrom
fix/ws-byte-mismatch-verify-signed-payload

Conversation

@maltsev-dev

Copy link
Copy Markdown
Member

What

Why

How

Test plan

  • Unit tests pass (per-repo, e.g. cd backend && cargo test, cd frontend && npm test)
  • Lint passes (per-repo, e.g. cd frontend && npm run lint)
  • Type-check passes (per-repo, e.g. cd frontend && npm run type-check)
  • Manually verified in dev / staging

Risk

Checklist

  • I have read the repo's CONTRIBUTING.md (if present)
  • My change does not introduce new lint warnings
  • I have updated the CHANGELOG (if user-visible)
  • I have considered backwards compatibility

maltsev-dev and others added 4 commits June 18, 2026 13:07
The SDK's import chain (nullrun.__init__ -> nullrun.decorators ->
nullrun.instrumentation.langgraph -> 'from langchain_core.callbacks
import BaseCallbackHandler') runs at pytest *collection* time, not at a
specific test. With CI installing [dev] only, every test in the suite
errored on collection with:

    ModuleNotFoundError: No module named 'langchain_core'

This is the same class of bug that 'nullrun[langgraph]' exists to
prevent for end users, except the dev install never benefited from
the extras indirection.

Fix: add 'langchain-core>=0.3,<1.0' to the [dev] extras. The
heavier 'langgraph' / 'langchain' extras pull in stacks the unit
tests don't use; the bare core is the smallest dep that makes the
import chain resolve and unblocks test collection on every
supported Python (3.10 / 3.11 / 3.12) on every PR.

Validation: locally on Python 3.14.2 (which is outside the
3.10/3.11/3.12 matrix that CI tests), 'pip install -e .[dev]'
followed by 'pytest tests/' runs 443/443 + 9/9 new byte-mismatch
unit tests, no collection error. CI will re-confirm on the 3.10 /
3.11 / 3.12 matrix.
Counterpart of NULLRUN fix(ws-control) (commit 5e2f65b). The
backend now embeds the exact bytes that were HMAC-signed in a
separate signed_payload field. The SDK:

  1. Verifies the signature against bytes.fromhex(signed_payload),
     falling back to the legacy wire-bytes path only when the
     field is absent (pre-FIX-C servers).
  2. Dispatches state changes from the parsed signed_payload
     bytes, not from the outer envelope body. This closes a
     security hole: an attacker who captured a (signed_payload,
     signature) pair from a benign 'state=Normal' event could
     otherwise splice a forged 'state=Killed' into the outer body
     and the signature would still verify, because the signature
     covers only the signed_payload bytes. Reading dispatch state
     from the trusted source keeps the captured signature
     semantically bound to its captured body.

Tests in test_ws_signed_payload.py cover:
  - round-trip, wrong-secret, tampered-payload rejection
  - malformed signed_payload does not crash
  - replay-with-spliced-body: signature still verifies, but the
    dispatched state is the captured one (not the forged one) -
    the attack is harmless
  - replays where the attacker also rewrites signed_payload are
    rejected via signature mismatch

Note: the two ACK tests are still failing because
ACKNOWLEDGED_STATES is still lowercase. That is fixed separately
by S-2 in the same release - kept as a separate commit so the
byte-mismatch/security fix is reviewable on its own.
The server's WsWorkflowState enum (NULLRUN/backend/src/proxy/http/
ws_control.rs) emits 'Killed' / 'Paused' (PascalCase). The SDK was
comparing against {'killed', 'paused'} (lowercase), so the ACK path
was dead and the server's pending-ack queue grew without ever
being drained.

This unblocks the two remaining failing tests in
test_ws_signed_payload.py:
  - test_state_change_with_signed_payload_is_dispatched (now sends
    the ACK that the server expects)
  - test_acknowledged_states_use_pascalcase (now matches server
    casing)

With byte-mismatch FIX-C in place (commits 5e2f65b + 105fb80), the
KILL/PAUSE path now works end-to-end:
  1. server signs the inner message and embeds the bytes in
     signed_payload
  2. server sends the envelope (flattened WsMessage + signature +
     timestamp + api_key_id + signed_payload)
  3. SDK verifies signature against bytes.fromhex(signed_payload)
  4. SDK dispatches from the trusted source (parsed signed_payload),
     so a captured (signed_payload, signature) pair can only
     re-trigger its captured state, never a forged one
  5. SDK sends ACK on Killed/Paused, draining server's pending-acks
@maltsev-dev maltsev-dev merged commit 5459964 into master Jun 18, 2026
8 checks passed
@maltsev-dev maltsev-dev deleted the fix/ws-byte-mismatch-verify-signed-payload branch June 18, 2026 15:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant