Headroom × sandcastle: Phase 1 validation — measure token savings on a live run #84

Description

@lsfera

Goal

Verify the headroom context compression integration works correctly end-to-end and quantify actual token savings.

Background

#83 completed the technical integration: @ai-hero/sandcastle is patched via patch-package with a promptCompression callback injected at 3 points, and .sandcastle/context-compressor.ts provides the headroom-ai wrapper, gated by HEADROOM_MODE.
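As a rough illustration of what "gated by HEADROOM_MODE" could look like, here is a minimal sketch. It is not the actual contents of .sandcastle/context-compressor.ts: `parseHeadroomMode`, `makePromptCompression`, and the `Compress` signature are hypothetical names, and the real headroom-ai call is represented by an injected function rather than guessed at. The fallback to the original prompt on failure is also an assumption.

```typescript
// Hypothetical sketch: these names are illustrative, not the real wrapper
// or the headroom-ai API. The compressor is injected so nothing is assumed
// about how headroom-ai is actually called.
type HeadroomMode = "off" | "conservative" | "aggressive";
type Compress = (
  prompt: string,
  mode: Exclude<HeadroomMode, "off">,
) => Promise<string>;

// Read HEADROOM_MODE from an env-like object (e.g. process.env).
// Anything missing or unrecognized falls back to "off".
function parseHeadroomMode(
  env: Record<string, string | undefined>,
): HeadroomMode {
  const raw = (env.HEADROOM_MODE ?? "off").trim().toLowerCase();
  if (raw === "conservative" || raw === "aggressive") return raw;
  return "off";
}

// Build the callback that the patched sandcastle hook would receive.
function makePromptCompression(mode: HeadroomMode, compress: Compress) {
  return async (prompt: string): Promise<string> => {
    // Pass-through: the compressor is never called, so "off" adds no cost.
    if (mode === "off") return prompt;
    try {
      return await compress(prompt, mode);
    } catch {
      // Assumption: on compressor failure, use the uncompressed prompt.
      return prompt;
    }
  };
}
```

Treating every unrecognized value as "off" keeps a typo in the environment variable from silently enabling compression.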

This issue covers the validation step — running it against a real issue.

Test plan

  1. Functional test (conservative): run /afk with HEADROOM_MODE=conservative on a small, self-contained issue
  2. Functional test (aggressive): repeat with HEADROOM_MODE=aggressive on a slightly more complex prompt to check for over-compression
  3. Token measurement: compare API call logs or sandcastle session JSONL usage events across the off, conservative, and aggressive modes
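For step 3, one possible way to total tokens from a session log is sketched below. The log schema is an assumption: the code presumes each JSONL line is a JSON object and that usage events carry a `usage` object with numeric `input_tokens` and `output_tokens` fields. The field names and the `sumUsage` helper are hypothetical and should be adjusted to whatever the sandcastle session files actually contain. The sample lines are synthetic, not real measurements.

```typescript
// Assumed schema (hypothetical): { ..., "usage": { "input_tokens": n, "output_tokens": n } }
interface TokenTotals {
  input: number;
  output: number;
  events: number;
}

// Sum token usage over every line of a JSONL string that has a `usage` object.
function sumUsage(jsonl: string): TokenTotals {
  const totals: TokenTotals = { input: 0, output: 0, events: 0 };
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    let record: unknown;
    try {
      record = JSON.parse(line);
    } catch {
      continue; // skip malformed or partially written lines
    }
    const usage = (record as { usage?: Record<string, unknown> } | null)?.usage;
    if (typeof usage !== "object" || usage === null) continue;
    const input = usage["input_tokens"];
    const output = usage["output_tokens"];
    totals.input += typeof input === "number" ? input : 0;
    totals.output += typeof output === "number" ? output : 0;
    totals.events += 1;
  }
  return totals;
}

// Synthetic sample, for illustration only.
const sampleLog = [
  '{"type":"usage","usage":{"input_tokens":1200,"output_tokens":300}}',
  '{"type":"message","text":"hello"}',
  "not json",
  '{"type":"usage","usage":{"input_tokens":800,"output_tokens":100}}',
].join("\n");

console.log(sumUsage(sampleLog)); // { input: 2000, output: 400, events: 2 }
```

Running the same function over one log per mode gives the raw totals that the acceptance criteria ask for.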

Expected savings (estimates)

| Component | Tokens (off) | Tokens (conservative) | Savings |
|---|---|---|---|
| Issue body in prompt (12 iterations) | ~36,000 | ~9,600 | ~75% |
| Reviewer diff (1 pass) | ~4,000–8,000 | ~1,000–2,500 | ~70% |
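The savings column follows from (off − compressed) / off. A small helper makes the arithmetic explicit and can be reused on measured numbers later; `savingsPercent` is an illustrative name, not part of the integration. Applied to the estimates above, the issue-body row works out to about 73%, which the table rounds to ~75%, and the reviewer-diff range spans roughly 69% to 75%.

```typescript
// Percentage of tokens saved relative to the uncompressed baseline.
function savingsPercent(tokensOff: number, tokensCompressed: number): number {
  if (tokensOff <= 0) {
    throw new RangeError("baseline token count must be positive");
  }
  return ((tokensOff - tokensCompressed) / tokensOff) * 100;
}

console.log(savingsPercent(36000, 9600).toFixed(1)); // "73.3"
console.log(savingsPercent(4000, 1000).toFixed(1)); // "75.0"
console.log(savingsPercent(8000, 2500).toFixed(1)); // "68.8"
```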

Acceptance criteria

  • Conservative mode completes a live issue without regressions
  • Token savings measured and documented (raw numbers, not estimates)
  • Aggressive mode tested; over-compression behavior documented
  • Default HEADROOM_MODE=off confirmed — zero accidental cost

Notes

Compression operates on the prompt text, because sandcastle doesn't expose structured messages. The compressed prompt becomes the user instructions passed to Claude Code. headroom preserves structural and formatting-sensitive content, so the <promise>ISSUE_COMPLETE</promise> completion signal and the scope guardrails should survive compression intact.
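Since the survival of the completion signal is an expectation rather than a guarantee, a cheap check during validation could compare the prompt before and after compression. The sketch below is one way to do that; `lostMarkers` is a hypothetical helper, and the example prompts are invented for illustration.

```typescript
// Return the markers that occur in the original prompt but are missing,
// or occur fewer times, in the compressed prompt.
function lostMarkers(
  original: string,
  compressed: string,
  markers: string[],
): string[] {
  const count = (text: string, marker: string): number =>
    text.split(marker).length - 1;
  return markers.filter(
    (m) => m.length > 0 && count(compressed, m) < count(original, m),
  );
}

const completionMarker = "<promise>ISSUE_COMPLETE</promise>";
const originalPrompt = `Fix the bug. When done, output ${completionMarker}.`;
const intactPrompt = `Fix bug. When done output ${completionMarker}.`;
const damagedPrompt = "Fix bug. Output ISSUE_COMPLETE when done.";

console.log(lostMarkers(originalPrompt, intactPrompt, [completionMarker])); // []
console.log(lostMarkers(originalPrompt, damagedPrompt, [completionMarker]));
// [ '<promise>ISSUE_COMPLETE</promise>' ]
```

A non-empty result in aggressive mode would be a concrete instance of the over-compression behavior the acceptance criteria ask to document.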

Metadata

Labels: enhancement
