fix: security and correctness hardening across API, tools, MCP, persistence, and gateways#59
Conversation
Critical: - api/server.go: add ReadHeaderTimeout/Read/Write/Idle timeouts (slowloris DoS fix); add validateAuthConfig loopback auth gate; Stop uses 15s bounded timeout instead of unbounded ctx - tool/file_write.go: replace os.WriteFile with atomic temp-file -> sync -> rename (crash-safe writes); log backup failures via slog.Warn instead of silently dropping them High: - tool/bash.go: replace cmd.CombinedOutput with limitedWriter that caps memory at 500 KB (yes / cat /dev/urandom can no longer OOM the process before the timeout kills the command) - tool/safety.go: validateURLPublic now returns originalHost so callers can preserve the Host header for virtual-host routing; add 0.0.0.0/8, 100.64.0.0/10, 198.18.0.0/15 SSRF ranges; check IPv4-mapped IPv6 (::ffff:x.x.x.x) against IPv4 CIDR blocks - tool/web_fetch.go + download.go: set req.Host to preserve original hostname after IP pinning - mcp/mcp.go: time.After -> time.NewTimer + Stop (avoid timer leak); preserve JSON-RPC error code/message in returned error via pendErrors map; log scanner.Err() with slog.Warn on oversized responses Medium: - permissions/guardian.go: isBase64Injection uses byte iteration (consistent with len(s) byte count) instead of mixing rune count with byte length
… submodule - tool/retry.go: replace time.After with time.NewTimer+Stop in RetryExecutor backoff wait - resilience/retry/retry.go: same fix in Do() and DoWithResult() backoff waits - api/server.go: add securityHeaders middleware (X-Content-Type- Options, X-Frame-Options, Cache-Control) on all routes - external/eyrie: bump submodule to c5ab1f0 — picks up eyrie's timer-leak fixes (ratelimit, adaptive_ratelimit), security headers, and all prior security/correctness fixes (loopback auth guard, stream close, guardrails safe variants, etc.)
- engine/stream.go: replace time.After with time.NewTimer+Stop in the stream retry backoff wait (line 464) to avoid leaking the timer in the runtime when ctx is cancelled before the delay elapses
…yrie - resilience/ratelimit/ratelimit.go: replace time.After with time.NewTimer+Stop in Wait() backoff loop - engine/stream.go: add 30s timeout context to snapshot Track goroutine (was fire-and-forget with no timeout) - external/eyrie: bump submodule to 356184a — picks up centralized httputil package, bounded keyring lookups, and all prior fixes
Session persistence: - persist.go: atomic write with sync before rename (was os.WriteFile without sync — crash could leave partial file) - sqlite_store.go: add SetMaxOpenConns(1) + busy_timeout=5000 pragma to prevent 'database is locked' under concurrent access Sandbox: - netproxy.go: add HTTP server timeouts (ReadHeader 10s, Read 30s, Write 5min for CONNECT tunnels, Idle 120s) — was no timeouts, vulnerable to slowloris Daemon gateways: - telegram.go: replace fmt.Sprintf JSON injection with json.Marshal; bound response body read to 1 MiB; rune-safe truncation at 4000 - discord.go: bound error response body read to 4 KiB - gateway.go: bound response body read to 1 MiB
- bash.go: cap limitedWriter at maxOutputBytes+1 so TruncateOutput's truncation marker fires when output exceeds the cap (was silently lost) - stream.go: thread snapCtx into TrackCtx so the 30s snapshot timeout actually bounds git operations (was dead code; Track ignored context) - snapshot.go: add TrackCtx/gitWorkCtx/gitWorkOutputCtx to accept context - safety.go: remove dead IPv4-mapped IPv6 checkIPs branch (net.IPNet.Contains already handles mapped addresses via To4) - mcp.go: only delete pendErrors for undelivered requests in readLoop cleanup, preserving error details for already-signaled callers
Code Review — Findings & FixesReviewed all 20 changed files. Found 4 issues (1 P1, 1 P2, 2 P3), all fixed in commit [P1] bash.go: Truncation marker silently lost (regression)
[P2] engine/stream.go: Snapshot 30s timeout was dead code
[P3] tool/safety.go: Dead IPv4-mapped IPv6 branchThe [P3] mcp/mcp.go: pendErrors detail lost on quick server EOF
Verified cleanAtomic file_write, SSRF IP pinning + Host preservation, timer.Stop() on all paths, sqlite_store MaxOpenConns + busy_timeout, daemon body bounding + rune-safe truncation, security headers middleware, bounded Shutdown timeout. |
Summary
Comprehensive security and correctness hardening across hawk, covering 6 commits and 19 files + eyrie submodule bumps.
Critical fixes
High fixes
Medium fixes
Submodule bumps
Verification
go build ./...go test -race -count=1 ./...— all passgo vet ./...go fmtclean