Wip/working tree 2026 06 18 #13

Closed
maltsev-dev wants to merge 3 commits into master from wip/working-tree-2026-06-18

@maltsev-dev

Member

What

Why

How

Test plan

  • Unit tests pass (per-repo, e.g. cd backend && cargo test, cd frontend && npm test)
  • Lint passes (per-repo, e.g. cd frontend && npm run lint)
  • Type-check passes (per-repo, e.g. cd frontend && npm run type-check)
  • Manually verified in dev / staging

Risk

Checklist

  • I have read the repo's CONTRIBUTING.md (if present)
  • My change does not introduce new lint warnings
  • I have updated the CHANGELOG (if user-visible)
  • I have considered backwards compatibility

Counterpart of NULLRUN fix(ws-control) (commit 5e2f65b). The
backend now embeds the exact bytes that were HMAC-signed in a
separate signed_payload field. The SDK:

  1. Verifies the signature against bytes.fromhex(signed_payload),
     falling back to the legacy wire-bytes path only when the
     field is absent (pre-FIX-C servers).
  2. Dispatches state changes from the parsed signed_payload
     bytes, not from the outer envelope body. This closes a
     security hole: an attacker who captured a (signed_payload,
     signature) pair from a benign 'state=Normal' event could
     otherwise splice a forged 'state=Killed' into the outer body
     and the signature would still verify, because the signature
     covers only the signed_payload bytes. Reading dispatch state
     from the trusted source keeps the captured signature
     semantically bound to its captured body.
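
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the SDK's actual code: the function name `verify_and_parse`, the `legacy_wire_bytes` parameter, the choice of HMAC-SHA256 with a hex-encoded signature, and a JSON-encoded inner message are all assumptions made for the example. Only the `signed_payload` and `signature` field names and the `bytes.fromhex` call come from the description.

```python
import hashlib
import hmac
import json
from typing import Optional


def verify_and_parse(envelope: dict, secret: bytes,
                     legacy_wire_bytes: bytes = b"") -> Optional[dict]:
    """Return the trusted inner message, or None if verification fails."""
    signed_hex = envelope.get("signed_payload")
    if signed_hex is None:
        # Field absent: pre-FIX-C server, fall back to the legacy wire bytes.
        # (What exactly those bytes are is an assumption of this sketch.)
        signed = legacy_wire_bytes
    else:
        try:
            signed = bytes.fromhex(signed_hex)
        except (TypeError, ValueError):
            return None  # malformed signed_payload is rejected, not raised

    # Assumption: HMAC-SHA256 with a hex-encoded signature.
    expected = hmac.new(secret, signed, hashlib.sha256).hexdigest()
    received = str(envelope.get("signature", ""))
    if not hmac.compare_digest(expected.encode(), received.encode()):
        return None

    # Parse the *signed* bytes; the outer envelope body is never consulted.
    try:
        message = json.loads(signed)
    except ValueError:
        return None
    return message if isinstance(message, dict) else None
```

Because the returned dict is built only from the verified bytes, a caller that dispatches on `message["state"]` sees the captured state even if the outer envelope carries a different one.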

Tests in test_ws_signed_payload.py cover:
  - round-trip, wrong-secret, tampered-payload rejection
  - malformed signed_payload does not crash
  - replay-with-spliced-body: signature still verifies, but the
    dispatched state is the captured one (not the forged one) -
    the attack is harmless
  - replays where the attacker also rewrites signed_payload are
    rejected via signature mismatch

Note: the two ACK tests are still failing because
ACKNOWLEDGED_STATES is still lowercase. That is fixed separately
by S-2 in the same release - kept as a separate commit so the
byte-mismatch/security fix is reviewable on its own.

The server's WsWorkflowState enum (NULLRUN/backend/src/proxy/http/
ws_control.rs) emits 'Killed' / 'Paused' (PascalCase). The SDK was
comparing against {'killed', 'paused'} (lowercase), so the ACK path
was dead and the server's pending-ack queue grew without ever
being drained.
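
The casing mismatch is easy to reproduce in isolation. The sketch below is illustrative only: `should_ack` and `LEGACY_LOWERCASE_STATES` are hypothetical names, and the real `ACKNOWLEDGED_STATES` may be a different container type than the `frozenset` assumed here.

```python
# PascalCase, matching what the server's WsWorkflowState enum emits.
ACKNOWLEDGED_STATES = frozenset({"Killed", "Paused"})

# The old comparison set (hypothetical name), kept only to show the dead path.
LEGACY_LOWERCASE_STATES = frozenset({"killed", "paused"})


def should_ack(state: str, acknowledged=ACKNOWLEDGED_STATES) -> bool:
    """Return True when the SDK should send an ACK for this state."""
    # Membership is case-sensitive, so the set must use the server's casing.
    return state in acknowledged
```

With the lowercase set, `should_ack("Killed", LEGACY_LOWERCASE_STATES)` is false, which is why no ACK was ever sent.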

This unblocks the two remaining failing tests in
test_ws_signed_payload.py:
  - test_state_change_with_signed_payload_is_dispatched (now sends
    the ACK that the server expects)
  - test_acknowledged_states_use_pascalcase (now matches server
    casing)

With byte-mismatch FIX-C in place (commits 5e2f65b + 105fb80), the
KILL/PAUSE path now works end-to-end:
  1. server signs the inner message and embeds the bytes in
     signed_payload
  2. server sends the envelope (flattened WsMessage + signature +
     timestamp + api_key_id + signed_payload)
  3. SDK verifies signature against bytes.fromhex(signed_payload)
  4. SDK dispatches from the trusted source (parsed signed_payload),
     so a captured (signed_payload, signature) pair can only
     re-trigger its captured state, never a forged one
  5. SDK sends ACK on Killed/Paused, draining server's pending-acks
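
For test fixtures, steps 1 and 2 can be mimicked in Python along the following lines. The real server is written in Rust, so this is only a stand-in: `build_envelope` is a hypothetical helper, and the JSON serialisation, HMAC-SHA256, hex signature and integer `timestamp` are assumptions. The envelope field names are the ones listed in step 2, and the signature covers only the `signed_payload` bytes, as described above.

```python
import hashlib
import hmac
import json
import time


def build_envelope(inner: dict, secret: bytes, api_key_id: str) -> dict:
    """Sign the inner message and wrap it in a flattened envelope."""
    # Step 1: serialise once and sign exactly these bytes.
    signed = json.dumps(inner, separators=(",", ":"), sort_keys=True).encode("utf-8")
    signature = hmac.new(secret, signed, hashlib.sha256).hexdigest()

    # Step 2: flatten the inner message into the envelope and attach metadata.
    envelope = dict(inner)
    envelope.update(
        signature=signature,
        timestamp=int(time.time()),
        api_key_id=api_key_id,
        signed_payload=signed.hex(),  # the exact signed bytes, hex-encoded
    )
    return envelope
```

Embedding the signed bytes verbatim means the receiver never has to re-serialise the message to reproduce what was signed, which is the byte mismatch FIX-C removes.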

The working tree contained a large uncommitted changeset that was
never pushed: 68 files, +8955/-3328 lines. Judging by the shape of
the diff, this is the 0.3.0 -> 0.4.0 production-readiness migration
(per CHANGELOG.md / audit §6.1):

  - PoolConfig / AdaptivePool removed (Transport is now a
    context manager; weakref.finalize replaces atexit.register)
  - gRPC transport removed (NULLRUN_USE_GRPC no-op; create_grpc_transport
    was a NameError)
  - signal.signal global hijack removed
  - track.proto removed
  - decision_history / flow / gate / common placeholders removed
  - six zombie exceptions removed (CostLimitExceeded,
    ApprovalRequired, BreakerTimeout, LoopDetectedException,
    RetryStormException, RateLimitExceededException)
  - _organization_id_var, _api_key_id_var removed
  - patch_openai / unpatch_openai removed
  - auto-instrumentation extended with langgraph / llama-index /
    crewai / autogen / openai-agents via safe_patch
  - SENSITIVE_ARG_KEYS expanded from 7 to 29 tokens
  - HMAC always-on for /track/batch, /gate, /evaluate, /status,
    /auth/verify + WS ACKs signed
  - 14 new test files
  - analyze.md (this session's plan)
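
The first item in the list, a context-manager Transport with weakref.finalize in place of atexit.register, follows a standard-library pattern that can be sketched as below. This is not the SDK's Transport: the class body, the `_cleanup` helper and the `closed` property are invented for illustration, and only `weakref.finalize` itself is the real standard-library API.

```python
import weakref


class Transport:
    """Toy transport whose cleanup runs at most once."""

    def __init__(self):
        # Shared mutable flag; the callback must not hold a reference to self,
        # otherwise the object could never be garbage collected.
        self._state = {"closed": False}
        self._finalizer = weakref.finalize(self, Transport._cleanup, self._state)

    @staticmethod
    def _cleanup(state):
        # Stand-in for closing sockets, flushing buffers, and so on.
        state["closed"] = True

    @property
    def closed(self) -> bool:
        return self._state["closed"]

    def close(self):
        # Calling a finalizer runs the callback once; later calls do nothing.
        self._finalizer()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
```

Unlike a module-level atexit.register hook, the finalizer is tied to one object: it fires on explicit close, on garbage collection, or at interpreter exit, whichever comes first.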

Tracking as a wip branch so the work is preserved. This commit does
not change the byte-mismatch FIX-C landing in
fix/ws-byte-mismatch-verify-signed-payload (commits 105fb80,
73f3197) - those branches are based on 316a694 + the byte-mismatch
fixes only.
@maltsev-dev

Member Author

Closing: the wip staging snapshot (16f8fca) and the two upstream commits (105fb80 HMAC, 73f3197 ACKNOWLEDGED_STATES) are already in master via PRs #11 and #12. No unique content remains in this branch. The branch will be deleted to keep the repo tidy.

@maltsev-dev maltsev-dev deleted the wip/working-tree-2026-06-18 branch June 18, 2026 15:05