Headroom × sandcastle: Phase 1 validation — measure token savings on a live run #84

## Goal

Verify the headroom context compression integration works correctly end-to-end and quantify actual token savings.

## Background

#83 completed the technical integration: `@ai-hero/sandcastle` is patched via patch-package so that a `promptCompression` callback is injected at 3 points, and `.sandcastle/context-compressor.ts` provides the headroom-ai wrapper, gated by `HEADROOM_MODE`.
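For orientation, the gating described above could look roughly like the sketch below. Every name and signature in it (`HeadroomMode`, `Compressor`, `resolveMode`, `makePromptCompression`) is hypothetical and chosen for illustration; it does not reproduce the actual contents of `.sandcastle/context-compressor.ts`, the headroom-ai API, or the patched sandcastle hook. The compressor is injected as a plain synchronous function so the sketch stays self-contained; a real wrapper may well be asynchronous.

```typescript
// Hypothetical sketch: these names are assumptions, not the real
// sandcastle or headroom-ai API.
type HeadroomMode = "off" | "conservative" | "aggressive";
type ActiveMode = Exclude<HeadroomMode, "off">;

// Stand-in for whatever the headroom-ai wrapper actually calls.
type Compressor = (text: string, mode: ActiveMode) => string;

// Parse the raw environment value; anything unrecognized falls back to "off".
function resolveMode(raw: string | undefined): HeadroomMode {
  const value = (raw ?? "").trim().toLowerCase();
  if (value === "conservative") return "conservative";
  if (value === "aggressive") return "aggressive";
  return "off";
}

// Build the callback that would be handed to a `promptCompression`-style hook.
function makePromptCompression(
  rawMode: string | undefined,
  compress: Compressor,
): (prompt: string) => string {
  const mode = resolveMode(rawMode);
  if (mode === "off") {
    // Identity function: the compressor is never invoked, so no extra cost.
    return (prompt) => prompt;
  }
  const active: ActiveMode = mode;
  return (prompt) => {
    try {
      return compress(prompt, active);
    } catch {
      // Fail open: an error in compression must not block the run.
      return prompt;
    }
  };
}
```

In a Node process the first argument would typically be `process.env.HEADROOM_MODE`. Treating unset or unknown values as `off` keeps the default path free of compression calls, which is the behavior the last acceptance criterion asks to confirm.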

This issue covers the **validation** step — running it against a real issue.

## Test plan

1. **Functional test (conservative)**: run `/afk` with `HEADROOM_MODE=conservative` on a small, self-contained issue
2. **Functional test (aggressive)**: run `/afk` with `HEADROOM_MODE=aggressive` on a slightly more complex prompt to check for over-compression
3. **Token measurement**: compare API call logs or the usage events in the sandcastle session JSONL across the off, conservative, and aggressive runs
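The token measurement in step 3 can be sketched as a small aggregator over a session log. The event shape is an assumption: one JSON object per line with an optional `usage: { input_tokens, output_tokens }` field, mirroring the field names of the Anthropic API usage object. The actual sandcastle JSONL schema may differ, so the field access would need adjusting. `sumUsage` and `UsageTotals` are hypothetical names.

```typescript
// Assumed event shape: one JSON object per line, optionally carrying
// `usage: { input_tokens, output_tokens }`. Adjust to the real schema.
interface UsageTotals {
  inputTokens: number;
  outputTokens: number;
  events: number; // number of lines that carried a usage object
}

function sumUsage(jsonl: string): UsageTotals {
  const totals: UsageTotals = { inputTokens: 0, outputTokens: 0, events: 0 };
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // skip blank lines

    let event: unknown;
    try {
      event = JSON.parse(trimmed);
    } catch {
      continue; // skip lines that are not valid JSON
    }

    const usage = (
      event as { usage?: { input_tokens?: unknown; output_tokens?: unknown } } | null
    )?.usage;
    if (typeof usage !== "object" || usage === null) continue;

    // Count only numeric fields; missing or malformed values contribute 0.
    totals.inputTokens += typeof usage.input_tokens === "number" ? usage.input_tokens : 0;
    totals.outputTokens += typeof usage.output_tokens === "number" ? usage.output_tokens : 0;
    totals.events += 1;
  }
  return totals;
}
```

Running this once per mode, with the log contents read from each run's session file, gives three totals that can be reported side by side as the raw numbers the acceptance criteria call for.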

## Expected savings (estimates)

| Component | Tokens (off) | Tokens (conservative) | Savings |
|-----------|-------------|----------------------|---------|
| Issue body in prompt (12 iterations) | ~36,000 | ~9,600 | ~75% |
| Reviewer diff (1 pass) | ~4,000–8,000 | ~1,000–2,500 | ~70% |
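As a sanity check on the estimates, the savings column can be recomputed from the token columns as `1 - compressed / baseline`. The sketch below uses only the figures from the table; `savingsFraction` is a hypothetical helper name. The issue-body row works out to about 73%, which the table rounds to ~75%, and the reviewer-diff row spans roughly 69% to 75% depending on which ends of the ranges are paired.

```typescript
// Fraction of tokens saved relative to the uncompressed baseline.
function savingsFraction(baseline: number, compressed: number): number {
  if (baseline <= 0) {
    throw new RangeError("baseline must be positive");
  }
  return 1 - compressed / baseline;
}

// Issue body across 12 iterations: 36,000 tokens off, 9,600 conservative.
const issueBodySavings = savingsFraction(36000, 9600); // about 0.733

// Implied per-iteration size, assuming the 12 iterations are equal in size.
const perIterationOff = 36000 / 12; // 3000
const perIterationConservative = 9600 / 12; // 800

// Reviewer diff: low end against low end, high end against high end.
const diffSavingsLow = savingsFraction(4000, 1000); // 0.75
const diffSavingsHigh = savingsFraction(8000, 2500); // 0.6875
```

The same helper can be applied to the measured totals once the live runs are done, so that estimated and measured savings are computed the same way.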

## Acceptance criteria

- [ ] Conservative mode completes a live issue without regressions
- [ ] Token savings measured and documented (raw numbers, not estimates)
- [ ] Aggressive mode tested; over-compression behavior documented
- [ ] Default `HEADROOM_MODE=off` confirmed: compression is never invoked, so there is zero accidental cost

## Notes

Compression operates at the **prompt text level**, because sandcastle doesn't expose structured messages. The compressed prompt becomes the user instructions to Claude Code. headroom preserves structural and formatting-sensitive content, so the `<promise>ISSUE_COMPLETE</promise>` completion signal and the scope guardrails should survive compression intact.
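Since survival of the completion signal is an expectation rather than a guarantee, the validation runs could include a check like the sketch below. It is a hypothetical guard, not part of the existing integration: `REQUIRED_SENTINELS`, `missingSentinels`, and `safeCompressed` are illustrative names, and the sentinel list contains only the one string named above (scope guardrail text would have to be added from the real prompt).

```typescript
// Hypothetical guard. The sentinel list is an assumption for illustration;
// extend it with the actual guardrail strings from the prompt.
const REQUIRED_SENTINELS: readonly string[] = ["<promise>ISSUE_COMPLETE</promise>"];

// Return the sentinels present in the original prompt but lost in compression.
function missingSentinels(
  original: string,
  compressed: string,
  sentinels: readonly string[] = REQUIRED_SENTINELS,
): string[] {
  return sentinels.filter((s) => original.includes(s) && !compressed.includes(s));
}

// Use the compressed prompt only if every sentinel survived; otherwise
// fall back to the original, uncompressed prompt.
function safeCompressed(
  original: string,
  compressed: string,
  sentinels: readonly string[] = REQUIRED_SENTINELS,
): string {
  return missingSentinels(original, compressed, sentinels).length === 0
    ? compressed
    : original;
}
```

Logging the result of `missingSentinels` during the aggressive-mode test would give a concrete record of over-compression behavior for the acceptance criteria.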
