Agentwall

Runtime firewall and operator console for AI agents.

License: MIT · TypeScript · Node >= 20 · Fastify 5 · Status: active · PRs welcome


Agentwall sits between AI agents and the surfaces where their actions become real — shell commands, tools, network egress, browser actions, communication channels, identity, and content delivery. Before an agent acts, it answers one question: should this action be allowed, redacted, routed to a human, or denied? Prompts are not a security boundary; Agentwall enforces policy at runtime, where the risk is, and leaves a tamper-evident record an operator can review.

Features

  • Deterministic policy engine — every request is scored across six action planes (network, tool, content, browser, identity, governance) into one of four decisions (allow / redact / approve / deny). The highest-severity matching rule wins, and every decision returns the matched rule IDs, human reasons, risk level, and MITRE ATT&CK-mapped detections.
  • Provenance- and flow-aware — trust labels (trusted / untrusted / derived) and flow labels (secret_material, pii, credential_access, destructive_action, payment, private_network_target, …) drive high-risk detection, so untrusted content can't quietly escalate into privileged egress.
  • DLP secret & PII scanning — built-in detectors for AWS keys, GitHub PATs/OAuth tokens, OpenAI keys, Slack tokens, private keys, and JWTs, plus SSNs, credit cards, emails, and phone numbers — with inline redaction.
  • Egress / SSRF firewall — default-deny egress with scheme/port/host allowlists; blocks private, loopback, and link-local ranges and cloud metadata endpoints (169.254.169.254, metadata.google.internal).
  • Human approval gate — auto / always / never modes, a persistent approval queue, operator and channel notifications, and HMAC-signed, TTL-bound capability tickets minted on allow.
  • Damage Control command preflight — static analysis of shell/bash commands before execution, combined with the policy decision so risky commands escalate to a human.
  • Communication-channel containment — scoped guardrails for agent bots in shared Telegram/Slack/Discord chats: deny filesystem mutation and secret egress, redact PII in outbound replies.
  • Runtime FloodGuard — per-session and per-actor rate limits, pending-approval caps, cost budgets, one-command shield mode, and temporary session boosts.
  • Tamper-evident audit — SHA-256 hash-chained audit events plus OpenTelemetry (OTLP/HTTP) decision traces for downstream observability.
  • Operator console & org control plane — a local Fastify dashboard with live SSE updates, session containment (pause / resume / terminate), and a single-pane federation summary across multiple Agentwall instances via authenticated peer polling.
  • Manifest integrity — detects drift in tool/agent manifests against approved fingerprints.
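
The "highest-severity matching rule wins" behaviour described above can be sketched roughly as follows. This is an illustration, not Agentwall's actual code: the type names, the `resolve` function, and the severity ordering (allow < redact < approve < deny) are assumptions made for the example.

```typescript
// Hypothetical types and names for illustration; not Agentwall's real API.
type Decision = "allow" | "redact" | "approve" | "deny";

interface RuleMatch {
  ruleId: string;
  decision: Decision;
  reason: string;
}

interface Verdict {
  decision: Decision;
  matchedRules: string[];
  reasons: string[];
}

// Assumed ordering: a higher number means a more severe outcome.
const SEVERITY: Record<Decision, number> = { allow: 0, redact: 1, approve: 2, deny: 3 };

function resolve(matches: RuleMatch[]): Verdict {
  // Default-deny: an action that matches no rule is blocked.
  if (matches.length === 0) {
    return { decision: "deny", matchedRules: [], reasons: ["no rule matched (default-deny)"] };
  }
  // The most severe matching rule determines the decision.
  const winner = matches.reduce((a, b) => (SEVERITY[b.decision] > SEVERITY[a.decision] ? b : a));
  // Every matched rule is reported so the decision stays explainable.
  return {
    decision: winner.decision,
    matchedRules: matches.map((m) => m.ruleId),
    reasons: matches.map((m) => m.reason),
  };
}
```

Because the result is a pure function of the matched rules, the same request against the same rule set always yields the same decision, which is what makes the engine deterministic.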

Architecture

The request → policy → decision → audit flow for a single agent action:

flowchart TD
    A["Agent action request"] -->|"POST /evaluate"| B["Fastify server"]
    B --> C{"FloodGuard<br/>rate &amp; cost limits"}
    C -->|throttled| Z["Blocked &amp; audited"]
    C -->|ok| D{"Session paused<br/>or terminated?"}
    D -->|contained| Z
    D -->|active| E["Policy engine"]

    subgraph INPUTS["Evaluation inputs"]
      direction LR
      F["DLP scan<br/>secrets &amp; PII"]
      G["Egress / SSRF<br/>inspector"]
      H["Provenance &amp;<br/>flow labels"]
      I["Built-in &amp; YAML<br/>policy rules"]
    end
    INPUTS --> E

    E --> J{"Decision"}
    J -->|allow| K["Issue capability ticket<br/>HMAC + TTL"]
    J -->|redact| L["Return redacted content"]
    J -->|approve| M["Approval queue"]
    J -->|deny| N["Block action"]

    M --> O["Operator console<br/>&amp; channel notifier"]
    O -->|"approve / deny"| J

    K --> P["Emit audit event"]
    L --> P
    N --> P
    P --> Q["SHA-256 hash chain"]
    P --> R["OTLP decision trace"]
    P --> S["Runtime state<br/>dashboard &amp; org summary"]

Getting started

Requires Node.js ≥ 20.

git clone https://github.com/reesepj/agentwall.git
cd agentwall
npm install
npm run build

# scaffold config + policy, then start
node dist/cli.js init --mode guarded --allow-hosts api.openai.com
node dist/cli.js doctor
node dist/cli.js start

The operator console is served at /dashboard on the configured host/port (default http://127.0.0.1:3000/dashboard).

npm run dev    # ts-node, no build step
npm test       # Jest test suite
npm run lint   # tsc --noEmit typecheck

Ask for a decision over HTTP:

curl -s http://127.0.0.1:3000/evaluate \
  -H 'content-type: application/json' \
  -d '{"agentId":"demo","plane":"network","action":"http_get",
       "payload":{"url":"http://169.254.169.254/latest/meta-data/"},
       "flow":{"direction":"egress"}}'
# -> { "decision": "deny", "riskLevel": "critical", "matchedRules": ["net:block-ssrf-private","net:block-metadata-endpoint"], ... }
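
The same call from an agent runtime might look like the sketch below. The request fields mirror the curl example and the response type lists only the fields shown in its output; the helper names `evaluate` and `mayProceed` are hypothetical, and treating only an explicit allow as permission to run is a conservative assumption rather than documented client behaviour.

```typescript
// Request fields are taken from the curl example above.
interface EvaluateRequest {
  agentId: string;
  plane: string;
  action: string;
  payload: Record<string, unknown>;
  flow?: Record<string, unknown>;
}

// Only the fields visible in the example output are typed; the rest is left open.
interface EvaluateResponse {
  decision: "allow" | "redact" | "approve" | "deny";
  riskLevel?: string;
  matchedRules?: string[];
  [key: string]: unknown;
}

// Uses the global fetch that ships with Node.js >= 18.
async function evaluate(baseUrl: string, req: EvaluateRequest): Promise<EvaluateResponse> {
  const res = await fetch(`${baseUrl}/evaluate`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(req),
  });
  if (!res.ok) throw new Error(`evaluate failed: HTTP ${res.status}`);
  return (await res.json()) as EvaluateResponse;
}

// Fail closed: anything other than an explicit allow means the action is not run as-is.
function mayProceed(r: EvaluateResponse): boolean {
  return r.decision === "allow";
}
```

A caller would run `evaluate("http://127.0.0.1:3000", request)` before executing the action and skip it whenever `mayProceed` returns false.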

Core API surface:

POST /evaluate                                          # policy decision (+ capability ticket)
POST /inspect/content                                   # DLP secret/PII scan + redaction
POST /inspect/network                                   # egress / SSRF inspection
POST /inspect/manifest                                  # manifest drift detection
POST /approval/request    GET /approval/pending         # approval queue
POST /approval/:id/respond
POST /integrations/communication-channel/guardrail      # channel containment
POST /integrations/damage-control/command-preflight     # bash command firewall
GET  /detections          GET /rules                    # decision catalog & active rules
GET  /api/dashboard/state GET /api/dashboard/events     # operator console (state + SSE)
GET  /api/org/summary                                   # federation single-pane summary
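
To give a feel for what an egress / SSRF inspection involves, here is a minimal sketch built on Node's `net.BlockList`. It is not the code behind `/inspect/network`: the function name `isEgressAllowed` and the exact range list are assumptions, and it only inspects the literal host in the URL. A real check would also need to validate the addresses a hostname resolves to.

```typescript
import { BlockList, isIP } from "node:net";

// Private, loopback, and link-local ranges (the categories named in the feature list).
const blocked = new BlockList();
blocked.addSubnet("10.0.0.0", 8, "ipv4");
blocked.addSubnet("172.16.0.0", 12, "ipv4");
blocked.addSubnet("192.168.0.0", 16, "ipv4");
blocked.addSubnet("127.0.0.0", 8, "ipv4");
blocked.addSubnet("169.254.0.0", 16, "ipv4"); // covers 169.254.169.254
blocked.addAddress("::1", "ipv6");
blocked.addSubnet("fe80::", 10, "ipv6");
blocked.addSubnet("fc00::", 7, "ipv6");

// Cloud metadata hostname named in the feature list.
const BLOCKED_HOSTNAMES = new Set(["metadata.google.internal"]);

function isEgressAllowed(rawUrl: string, allowHosts: Set<string>): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // unparseable URLs are denied
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;

  // URL.hostname keeps the brackets around IPv6 literals; strip them.
  const host = url.hostname.replace(/^\[|\]$/g, "").toLowerCase();
  if (BLOCKED_HOSTNAMES.has(host)) return false;

  // Blocked ranges win even if the address was allowlisted by mistake.
  const family = isIP(host);
  if (family !== 0 && blocked.check(host, family === 4 ? "ipv4" : "ipv6")) return false;

  // Default-deny: everything else must be explicitly allowlisted.
  return allowHosts.has(host);
}
```

Checking the block list before the allowlist means a misconfigured allowlist cannot reopen a private or metadata address.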

Tech stack

| Layer | Choice |
| --- | --- |
| Language | TypeScript 5 (strict), Node.js ≥ 20 |
| HTTP server | Fastify 5 |
| Schema validation | Zod |
| Logging | pino |
| Config & policy | YAML (js-yaml) |
| Crypto | Node crypto (HMAC capability tickets, SHA-256 audit chain) |
| Telemetry | OpenTelemetry OTLP/HTTP decision traces |
| Tests | Jest + ts-jest |
| Tooling | bundled agentwall CLI |
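
As an illustration of what an HMAC-signed, TTL-bound capability ticket can look like with Node's `crypto` module, consider the sketch below. The ticket layout, the claim names, the use of HMAC-SHA256, and the function names are all assumptions for the example; the actual ticket format is not described here.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical claim set; the real ticket contents may differ.
interface TicketClaims {
  agentId: string;
  action: string;
  expiresAt: number; // Unix epoch milliseconds
}

function sign(body: string, secret: string): Buffer {
  return createHmac("sha256", secret).update(body).digest();
}

// Ticket layout (assumed): base64url(JSON claims) + "." + base64url(HMAC).
function mintTicket(claims: TicketClaims, secret: string): string {
  const body = Buffer.from(JSON.stringify(claims)).toString("base64url");
  return `${body}.${sign(body, secret).toString("base64url")}`;
}

// Returns the claims if the signature is valid and the ticket has not expired, else null.
function verifyTicket(ticket: string, secret: string, now: number = Date.now()): TicketClaims | null {
  const [body, mac] = ticket.split(".");
  if (!body || !mac) return null;
  const expected = sign(body, secret);
  const given = Buffer.from(mac, "base64url");
  // Compare in constant time; timingSafeEqual requires equal lengths.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
  const claims = JSON.parse(Buffer.from(body, "base64url").toString("utf8")) as TicketClaims;
  return claims.expiresAt > now ? claims : null;
}
```

The signature is checked before the body is parsed, so a forged ticket never reaches `JSON.parse`, and the expiry check bounds how long an allow decision can be replayed.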

How it works

A few load-bearing design decisions:

  • Enforce at the action surface, not in the model. Agentwall is independent of the agent framework; it gates the moment an action would touch a real resource.
  • Default-deny. Unmatched actions and outbound egress are denied unless explicitly allowed, and a watchdog kill-switch fails closed (deny_all) when heartbeats go stale.
  • Explainable decisions. Results carry matched rule IDs, reasons, and ATT&CK-mapped detections — no opaque scores — so operators and audit logs agree on why.
  • Tamper-evident by construction. Audit events are hash-chained and capability tickets are signed, so allow decisions are verifiable after the fact.
  • Operator-first. Live dashboard, approval queue, FloodGuard shield, and per-session pause/resume/terminate give a human real-time containment controls.
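
The tamper-evidence idea can be shown with a small SHA-256 hash chain. This is a sketch under assumptions, not Agentwall's audit format: the event fields, the all-zero genesis value, and the use of `JSON.stringify` for serialization are illustrative choices (a production chain would want a canonical serialization).

```typescript
import { createHash } from "node:crypto";

// Hypothetical event shape for illustration.
interface AuditEvent {
  decision: string;
  ruleIds: string[];
  timestamp: string;
}

interface ChainedEvent {
  event: AuditEvent;
  prevHash: string;
  hash: string;
}

// Assumed starting value for the first link.
const GENESIS = "0".repeat(64);

// Each hash covers the previous hash plus the serialized event.
function hashEvent(prevHash: string, event: AuditEvent): string {
  return createHash("sha256").update(prevHash).update(JSON.stringify(event)).digest("hex");
}

function append(chain: ChainedEvent[], event: AuditEvent): ChainedEvent[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  return [...chain, { event, prevHash, hash: hashEvent(prevHash, event) }];
}

// Recomputes every link; any edited, removed, or reordered entry breaks the chain.
function verify(chain: ChainedEvent[]): boolean {
  let prev = GENESIS;
  for (const entry of chain) {
    if (entry.prevHash !== prev || entry.hash !== hashEvent(prev, entry.event)) return false;
    prev = entry.hash;
  }
  return true;
}
```

Because each hash depends on the one before it, changing an old event requires recomputing every later hash, which a reviewer can detect by comparing against any previously recorded head hash.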

Policy ships as a built-in rule pack plus hot-reloadable YAML, so posture can be tightened without a redeploy. See examples/policy.yaml and the docs/ directory for threat model, architecture, and tutorials.

License

MIT — see LICENSE.