Background
#84's Phase 1 live validation (see PR #96, comment) confirmed the headroom
integration is wired correctly end-to-end — both the one-shot prompt
compression (context-compressor.ts) and the full-session proxy routing
(sandbox-runner.ts's sandboxEnv → ANTHROPIC_BASE_URL) reach the
headroom service, and its stats are tracked accurately.
But measured compression savings across every live test run so far (2 direct
compress() calls plus 14 real proxied API requests) are 0%:
compressions_by_strategy: {} (empty — never fired)
compression_cache.total_tokens_saved: 0
cost.total_tokens_saved: 0 (of $0.627 total spend)
agent_usage.totals.savings_percent: 0.0
The only nonzero discount anywhere is Anthropic's own native prefix-caching
(prefix_cache.discount_usd: $0.4988) — unrelated to headroom.
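To make the "0% savings" check repeatable on future runs, something like the following sketch could be applied to a stats snapshot. The nested object shape is only inferred from the dotted stat names quoted above, and headroomSavedAnything is a hypothetical helper, not part of headroom or of our codebase; the real payload may be structured differently.

```typescript
// Shape inferred from the dotted stat names in this issue (an assumption).
interface HeadroomStats {
  compressions_by_strategy: Record<string, unknown>;
  compression_cache: { total_tokens_saved: number };
  cost: { total_tokens_saved: number };
  agent_usage: { totals: { savings_percent: number } };
  prefix_cache?: { discount_usd: number };
}

// Hypothetical helper: true only if headroom itself reports any savings.
// prefix_cache.discount_usd is deliberately ignored, because that discount
// comes from Anthropic's native prefix caching rather than from headroom.
function headroomSavedAnything(stats: HeadroomStats): boolean {
  const strategiesFired =
    Object.keys(stats.compressions_by_strategy).length > 0;
  return (
    strategiesFired ||
    stats.compression_cache.total_tokens_saved > 0 ||
    stats.cost.total_tokens_saved > 0 ||
    stats.agent_usage.totals.savings_percent > 0
  );
}

// Values observed across the Phase 1 live test runs.
const observed: HeadroomStats = {
  compressions_by_strategy: {},
  compression_cache: { total_tokens_saved: 0 },
  cost: { total_tokens_saved: 0 },
  agent_usage: { totals: { savings_percent: 0.0 } },
  prefix_cache: { discount_usd: 0.4988 },
};

console.log(headroomSavedAnything(observed)); // false
```

Keeping the prefix-cache discount out of the check avoids mistaking Anthropic's caching for headroom compression when the numbers are read at a glance.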
Likely root cause
From the headroom container's own startup banner:
License: OSS (no license key)
Code-Aware: DISABLED (install headroom-ai[code] to enable)
Headroom's compression is turn-based/staleness-driven
(HEADROOM_COMPRESSION_STABLE_AFTER_TURN / HEADROOM_STALE_READ_COMPRESS_AFTER_TURNS),
and the code-aware strategies that do most of the actual compression work
aren't available without a license key. Our test sessions were short: one-shot
or small multi-turn runs of at most 14 turns, plausibly too short to cross
whatever staleness threshold triggers compression, on top of code-aware
being unavailable at all.
Open questions to resolve before further investment here
- Is a Headroom Cloud / licensed key available or worth acquiring, and would
it actually change compressions_by_strategy from empty?
- Does tuning HEADROOM_COMPRESSION_STABLE_AFTER_TURN /
HEADROOM_STALE_READ_COMPRESS_AFTER_TURNS down (to trigger compression
sooner in shorter sessions) produce nonzero savings on the OSS tier alone,
without a license?
- If neither moves the needle, is the one-shot context-compressor.ts
prompt-compression path (which by construction never has multi-turn
history to compress) worth keeping at all, versus relying solely on the
proxy-routing path for any future compression gains?
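For the threshold-tuning question, one way to structure the experiment is a small matrix of environment overrides for the headroom container, with one test session per combination. The sketch below assumes both variables take integer turn counts; the candidate values are hypothetical, since the defaults and valid ranges are not known from this issue, and buildThresholdMatrix is an illustrative helper only.

```typescript
type HeadroomEnv = Record<string, string>;

// Hypothetical candidate values (assumption: integer turn counts).
const STABLE_AFTER_TURN_CANDIDATES = [1, 2, 4];
const STALE_READ_AFTER_TURNS_CANDIDATES = [1, 2, 4];

// Builds one env override per combination of the two thresholds.
function buildThresholdMatrix(
  stableAfterTurn: number[],
  staleReadAfterTurns: number[],
): HeadroomEnv[] {
  const matrix: HeadroomEnv[] = [];
  for (const stable of stableAfterTurn) {
    for (const stale of staleReadAfterTurns) {
      matrix.push({
        HEADROOM_COMPRESSION_STABLE_AFTER_TURN: String(stable),
        HEADROOM_STALE_READ_COMPRESS_AFTER_TURNS: String(stale),
      });
    }
  }
  return matrix;
}

const matrix = buildThresholdMatrix(
  STABLE_AFTER_TURN_CANDIDATES,
  STALE_READ_AFTER_TURNS_CANDIDATES,
);
console.log(matrix.length); // 9
```

Each entry would be applied to the headroom container before a run, and compressions_by_strategy compared afterwards; if it stays empty for every combination, the OSS tier alone is unlikely to help for sessions of this length.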
This is a product/spend decision (whether to acquire a license) as much as
an engineering one, so this issue is left without the ready-for-agent label
pending a human call on scope and budget.