A development harness for Claude Code, built around one position: the field answers AI-code drift with more specs, and I think specs and verification are two different things. A spec is feedforward; it sets the target once and hopes. What bounds drift is a sensor, so the rigor here goes to the intent layer (what exactly to build, confirmed before code) and the verification layer (isolated, evidence-bound checks after it), and the code layer is left to the model. The full argument, what it cannot do, and the experiment that would falsify it: docs/intent-diff.md.
The six skills below are the current implementation; they have carried 40+ of my projects from research to deployment.
```
/scout → /plan → /dev → /qa → /deploy
                         │
                         └→ /fix (then re-run /qa)
```
| Command | Purpose |
|---|---|
| `/scout` | Research: investigate technologies, map existing codebases, evaluate options |
| `/plan` | Generate a structured `phases.md` with tasks, acceptance criteria, and validation gates |
| `/dev N` | Implement phase N; each task runs in a fresh subagent for clean context |
| `/qa N` | Adversarial validation across 8 categories (regression, functional, security, a11y, etc.) |
| `/fix` | Targeted bug fixes scoped to ~15 files, with root cause analysis |
| `/deploy` | Deployment readiness verification, changelog generation, and release execution |
The rebuild in progress makes the intent-diff position structural: intent frozen at the start with concrete examples, an isolated evaluator reconstructing behavior from the finished code alone, the harness computing the drift delta, and deterministic gates owning rot and security with zero human attention. The v2 skills land here as they finish dogfooding.
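One way to picture the drift delta is as a set comparison between what was frozen and what the evaluator reconstructed. The following is a minimal sketch, assuming intent is frozen as a list of case/expected-behavior pairs and the evaluator's output is a mapping from case to inferred behavior. `IntentExample`, `drift_delta`, and the result keys are hypothetical names for illustration; the v2 harness may represent intent quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntentExample:
    """One concrete example frozen before any code is written (hypothetical shape)."""
    case: str      # short description of the input or scenario
    expected: str  # behavior confirmed up front


def drift_delta(intent: list[IntentExample],
                reconstructed: dict[str, str]) -> dict[str, list[str]]:
    """Compare frozen intent with behavior inferred from the finished code.

    `reconstructed` maps a case description to the behavior an isolated
    evaluator believes the code exhibits. The result lists cases whose
    behavior differs, cases the evaluator found no behavior for, and
    behavior that was never part of the intent.
    """
    intended_cases = {ex.case for ex in intent}
    mismatched = [ex.case for ex in intent
                  if ex.case in reconstructed
                  and reconstructed[ex.case] != ex.expected]
    missing = [ex.case for ex in intent if ex.case not in reconstructed]
    unrequested = sorted(set(reconstructed) - intended_cases)
    return {"mismatched": mismatched,
            "missing": missing,
            "unrequested": unrequested}
```

An empty result in all three lists would mean no measurable drift under this simplified model. Exact string equality is used here only to keep the sketch short; a real comparison would need something more tolerant.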
Skills: Each skill is a markdown file (`SKILL.md`) that tells Claude how to behave at that stage. Skills load references on demand and delegate heavy work to subagents.
Stack Packs (`_shared/references/stacks/`): Language-specific knowledge loaded automatically when the project's tech stack is detected. Each pack provides conventions, toolchain commands, safety patterns, anti-patterns, and testing patterns; packs exist for Rust, Python, Go, Swift/iOS, and more. See `stacks/README.md` for how to add new languages.
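Stack detection could work by looking for well-known manifest files in the project root. The sketch below assumes exactly that; the marker-to-pack mapping and the `detect_stack_packs` name are illustrative guesses, not the pipeline's documented detection rules.

```python
from pathlib import Path

# Assumed marker files. The pack file names come from the stacks/ directory;
# the mapping itself is hypothetical.
STACK_MARKERS = {
    "Cargo.toml": "rust.md",
    "pyproject.toml": "python.md",
    "requirements.txt": "python.md",
    "go.mod": "go.md",
    "Package.swift": "swift-ios.md",
}


def detect_stack_packs(project_root: Path) -> list[str]:
    """Return the stack pack files whose marker files exist in the project root."""
    found = {pack for marker, pack in STACK_MARKERS.items()
             if (project_root / marker).is_file()}
    return sorted(found)
```

A project with both `Cargo.toml` and `pyproject.toml` would load two packs under this scheme, which matches how polyglot repositories are usually handled.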
Subagents (8 total):
| Agent | Model | Used By | Role |
|---|---|---|---|
| `dev-planner` | opus | `/dev` | Rich planning with full project context |
| `task-implementer` | sonnet | `/dev` | Single-task execution, fresh context per task |
| `qa-planner` | opus | `/qa` | Test plan generation across 8 categories |
| `category-executor` | sonnet | `/qa` | Isolated test execution for high-context categories |
| `security-auditor` | sonnet | `/qa` | OWASP Top 10 vulnerability scanning |
| `project-analyzer` | inherit | `/scout` | Codebase analysis for brownfield projects |
| `migration-planner` | inherit | `/scout` | Dependency upgrade impact analysis |
| `deep-researcher` | general | any | Deep research when WebSearch isn't enough |
Hooks (7 total):
| Hook | Trigger | What It Does |
|---|---|---|
| `guard-dangerous-commands.sh` | Before Bash | Blocks `rm -rf /`, force push to main |
| `scan-written-code.sh` | After Write/Edit | Async security scan (secrets, eval, XSS, SQLi) |
| `post-edit-typecheck.sh` | After Edit | Runs `tsc --noEmit` on edited TypeScript files |
| `suggest-compact.sh` | After Write/Edit | Suggests `/compact` at 50 tool calls |
| `save-pipeline-state.sh` | Before Compact | Preserves active skill state |
| `reinject-pipeline-state.sh` | After Compact | Re-injects pipeline state |
| `verify-pipeline-completion.sh` | On Stop | Warns if work is still in progress |
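The Before Bash guard can be pictured as a pattern check on the command string. The sketch below is written in Python rather than shell purely for illustration, and it covers only the decision logic, not how the hook receives its input. The patterns and the `is_dangerous` name are assumptions, not the contents of `guard-dangerous-commands.sh`.

```python
import re

# Illustrative patterns only; the real hook may match differently.
DANGEROUS_PATTERNS = [
    # rm with both -r and -f in one flag group (e.g. -rf or -fr),
    # targeting the filesystem root itself
    re.compile(r"\brm\s+-(?=\w*r)(?=\w*f)\w+\s+/(\s|$)"),
    # git push carrying --force or -f, with main named in the command
    re.compile(r"\bgit\s+push\b(?=.*(\s--force\b|\s-f\b)).*\bmain\b"),
]


def is_dangerous(command: str) -> bool:
    """Return True if the command matches any blocked pattern."""
    return any(p.search(command) for p in DANGEROUS_PATTERNS)
```

A check like this is deliberately narrow: `rm -rf ./build` and an ordinary `git push origin main` pass through, while the two cases named in the table are stopped. Regex matching on shell strings is easy to evade, so it is best read as a guard against accidents rather than a security boundary.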
```
git clone <this-repo> pipeline
cd pipeline
./install.sh
```
Open Claude Code and say:
```
Read SETUP.md in the pipeline repo and follow its instructions to install.
```
Claude walks you through installation interactively: it asks whether to install globally or locally, copies files, configures hooks, helps personalize your profile, and verifies that everything works.
- Edit your profile: `~/.claude/skills/_shared/owner-profile.md`. Fill in your role and tech preferences.
- Merge settings (if needed): Hook configuration in `settings.json` must be merged with existing settings.
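The settings merge can be done by hand, but the idea is simple enough to sketch. The function below assumes both files keep hooks under a top-level `"hooks"` object that maps an event name to a list of entries; `merge_hook_settings` is a hypothetical helper, not something `install.sh` is known to ship.

```python
import json


def merge_hook_settings(existing: dict, incoming: dict) -> dict:
    """Return a copy of `existing` with hook entries from `incoming` appended.

    Assumes the layout {"hooks": {event_name: [entry, ...]}}. Entries that
    are already present (by exact equality) are not duplicated, and every
    non-hook key in `existing` is left untouched.
    """
    merged = json.loads(json.dumps(existing))  # deep copy of plain JSON data
    hooks = merged.setdefault("hooks", {})
    for event, entries in incoming.get("hooks", {}).items():
        current = hooks.setdefault(event, [])
        for entry in entries:
            if entry not in current:
                current.append(entry)
    return merged
```

Appending rather than overwriting keeps any hooks you already had, and the equality check makes the merge safe to run twice.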
```
Design doc or idea
        │
        ▼
    /scout ────────► discovery/discovery-{slug}.md
        │
        ▼
    /plan  ────────► phases.md (tasks + ACs)
        │            pipeline-state.md
        │            key-learnings/ directory
        ▼
    /dev N ────────► key-learnings/key-learnings-{NN}.md
        │            Updated phases.md checkboxes
        ▼
    /qa N  ────────► qa-reports/qa-report-phase-{NN}.md
        │
        ├─ PASS ───► /dev N+1 (next phase)
        ├─ COND ───► address issues, then continue
        └─ FAIL ───► /fix → re-run /qa

    All phases PASS:
        │
        ▼
    /deploy  ──────► release-checklist.md, CHANGELOG.md
                     Deployment + post-deploy verification
```
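The branching after `/qa` in the workflow above amounts to a small decision function. This is a sketch only: `next_step` and its return strings are hypothetical, and in the pipeline itself the choice is surfaced to the user rather than taken automatically.

```python
def next_step(verdict: str, phase: int, total_phases: int) -> str:
    """Suggest the next command after /qa reports on a phase.

    `verdict` is one of PASS, COND, or FAIL (case-insensitive), as in the
    workflow diagram. The function only suggests; it does not run anything.
    """
    verdict = verdict.upper()
    if verdict == "PASS":
        # The last phase passing means every phase has passed.
        return "/deploy" if phase >= total_phases else f"/dev {phase + 1}"
    if verdict == "COND":
        return "address issues, then continue"
    if verdict == "FAIL":
        return f"/fix, then /qa {phase}"
    raise ValueError(f"unknown QA verdict: {verdict!r}")
```

For example, `next_step("FAIL", 2, 3)` yields `"/fix, then /qa 2"`, the loop shown in the diagram.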
Key concepts:
- Key-learnings chain: Each `/dev` writes learnings, each `/qa` appends findings. Later phases read all prior learnings.
- pipeline-state.md: Tracks active skill, phase, task progress. Hooks preserve this across context compaction.
- Subagent isolation: Heavy work runs in fresh-context subagents. Main conversation stays lightweight.
- User approval gates: No skill advances without explicit user approval.
- [UNVERIFIED] markers: Plan marks ungrounded details. Dev verifies before implementing.
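The `[UNVERIFIED]` convention lends itself to a mechanical check before implementation starts. A minimal sketch, assuming the markers appear literally in the text of `phases.md`; `find_unverified` is a hypothetical helper, not part of the shipped skills.

```python
def find_unverified(plan_text: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for every line marked [UNVERIFIED]."""
    return [(number, line.strip())
            for number, line in enumerate(plan_text.splitlines(), start=1)
            if "[UNVERIFIED]" in line]
```

Running this over a plan before `/dev` would give a checklist of details to confirm first; an empty list means nothing in the plan is flagged as ungrounded.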
- Claude Code installed and configured
- No other dependencies
```
pipeline/
├── skills/
│   ├── scout/       (SKILL.md + 7 references + 2 assets)
│   ├── plan/        (SKILL.md + 7 references + 1 asset)
│   ├── dev/         (SKILL.md + 6 references + 1 asset)
│   ├── qa/          (SKILL.md + 5 references + 2 assets)
│   ├── fix/         (SKILL.md + 3 references)
│   ├── deploy/      (SKILL.md + 3 references + 1 asset)
│   └── _shared/
│       ├── references/
│       │   ├── stacks/       (language-specific knowledge packs)
│       │   │   ├── rust.md, python.md, go.md, swift-ios.md
│       │   │   └── README.md (template for adding new languages)
│       │   ├── testing-strategy-archetypes.md (A-F)
│       │   ├── pre-mortem-framework.md
│       │   ├── ai-output-determinism.md
│       │   ├── user-journey-simulation.md
│       │   ├── pipeline-constitution.md
│       │   ├── pipeline-state-protocol.md
│       │   └── port-registry.md
│       ├── ground.md
│       └── owner-profile.md
├── agents/          (8 agent definitions)
├── hooks/           (7 shell scripts)
├── settings.json    (hook configuration)
├── install.sh
├── SETUP.md         (Claude Code guided installation)
└── README.md
```