docs: add logo and shields.io badges to README#16

Merged
maltsev-dev merged 2 commits into master from docs/logo-badges-classifiers
Jun 19, 2026

@maltsev-dev

Summary

Adds the NullRun logo and a set of shields.io badges to the top of README.md so the package page on PyPI renders the same visual identity as the brand.

Changes

  • docs/nullrun-logo.png (new file): the NullRun NR mark, hosted in-repo so it is reachable from the rendered README on PyPI via raw.githubusercontent.com (https:// only; PyPI strips http:// images).
  • README.md — adds two badge rows at the top:
    • Release row: PyPI version, Python versions, License, Downloads
    • Quality/Project row: CI, Coverage, Stars, Documentation

All badges link to the corresponding real page (PyPI, GitHub Actions, Codecov, docs.nullrun.io).

Why

  • The PyPI page currently renders only plain text. A logo + badges make the package discoverable and signal project health at a glance (matches how nullout-mcp and other ecosystem packages present themselves).
  • Badges are https://-only (PyPI sanitiser requirement).
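
The https-only constraint can be checked before publishing. The following is a minimal sketch and not part of this PR: `insecure_image_urls` is a hypothetical helper that scans README text for Markdown or HTML image sources served over plain http://, and the URLs in the usage lines are made up for illustration.

```python
import re

# Matches Markdown images ![alt](url) and HTML <img src="url"> tags.
_IMG_RE = re.compile(
    r'!\[[^\]]*\]\(([^)\s]+)|<img[^>]*\bsrc="([^"]+)"',
    re.IGNORECASE,
)


def insecure_image_urls(markdown_text):
    """Return every image URL in the text that uses plain http://."""
    urls = [md or html for md, html in _IMG_RE.findall(markdown_text)]
    return [url for url in urls if url.lower().startswith("http://")]


# Hypothetical README fragment: one https badge, one http logo.
sample = (
    "[![ci](https://img.shields.io/badge/ci-passing-green)](https://example.com)\n"
    '<img src="http://example.com/logo.png" width="120">\n'
)
flagged = insecure_image_urls(sample)
```

An empty result means every image reference already uses https://.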

Not changed

  • pyproject.toml classifiers — left as-is per project decision.
  • The @protect example form — kept exactly as in the current README.

Verification

  • twine check dist/* — README.md renders correctly as the package long description.
  • After merge, the PyPI page for nullrun will show the logo + badges above the heading.

maltsev-dev and others added 2 commits June 18, 2026 19:10
analyze.md is a session-scoped working-notes file (~240 KB of
audit/plan material) that does not belong in the public SDK
repo. Remove from version control but keep on disk for the
author's reference.

- git rm --cached analyze.md: drop from index, file stays on disk
- add analyze.md to .gitignore so it isn't accidentally re-added
- drop the self-referential '.gitignore' entry from .gitignore
  so future edits don't need 'git add -f'

- Add docs/nullrun-logo.png (NullRun NR logo) and render it centered
  at the top of README.md via raw.githubusercontent.com
- Add shields.io badges in two rows:
  * Release: PyPI version, Python versions, License, Downloads
  * Quality/Project: CI, Coverage, Stars, Documentation
- All badges use https:// (PyPI readme sanitizer strips http://)
- No classifier changes (left as-is per project decision)

Co-Authored-By: Claude <noreply@anthropic.com>
@maltsev-dev maltsev-dev merged commit 0945be9 into master Jun 19, 2026
0 of 4 checks passed
maltsev-dev added a commit that referenced this pull request Jun 19, 2026
Closes the P0/P1/P2/P3 issues from the security review (plan §10/§11.4).

Security / PCI-DSS / GDPR

- P0-1: Mask positional PII in `_enforce_sensitive_tool` by introspecting
  the wrapped function's signature and applying `SENSITIVE_ARG_KEYS` to
  positional params. Pre-fix, `charge("4111-…-1111", 50)` forwarded the
  PAN into `/execute` and the audit log.
- P0-6 / P3-3: `_safe_repr` now redacts BEFORE truncating. The pre-fix
  order truncated first, so `details={…}` past position 50 leaked
  verbatim. `_safe_repr` is now the single source of truth for the
  redact-then-truncate flow.
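
Both fixes above can be illustrated in a few lines. This is a sketch under stated assumptions, not the SDK's code: the contents of `SENSITIVE_ARG_KEYS`, the PAN pattern, and the helpers `mask_call_args`, `truncate_then_redact` and `charge` are hypothetical. It shows one way ordering can matter: a secret that straddles the truncation boundary is cut into a fragment the pattern no longer matches.

```python
import inspect
import re

# Hypothetical key set; the SDK's real SENSITIVE_ARG_KEYS is not shown in this PR.
SENSITIVE_ARG_KEYS = {"card_number", "pan", "cvv", "password"}
MASK = "***"

# Illustrative pattern: 13 to 19 digits, optionally separated by spaces or dashes.
_PAN_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def mask_call_args(func, args, kwargs):
    """Map each argument to its parameter name, then mask by name."""
    bound = inspect.signature(func).bind_partial(*args, **kwargs)
    return {
        name: MASK if name.lower() in SENSITIVE_ARG_KEYS else value
        for name, value in bound.arguments.items()
    }


def safe_repr(value, limit=50):
    """Redact on the full text first, then truncate."""
    text = _PAN_RE.sub(MASK, repr(value))
    return text if len(text) <= limit else text[:limit] + "..."


def truncate_then_redact(value, limit=50):
    """The unsafe order, kept only for contrast."""
    return _PAN_RE.sub(MASK, repr(value)[:limit])


def charge(card_number, amount):
    """Stand-in for a sensitive tool."""
    return amount


masked = mask_call_args(charge, ("4111-1111-1111-1111", 50), {})
straddling = "x" * 38 + " 4111 1111 1111 1111"
```

Binding positional values to parameter names lets the same name-based key set cover both positional and keyword calls.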

Cost-audit / reliability

- P0-3: Bounded chunked reads on the sync + async httpx transports
  (`MAX_RESPONSE_BYTES`, default 16 MiB, `NULLRUN_MAX_RESPONSE_BYTES`
  env override). Above the cap, tracking is skipped and
  `_coverage_streaming_skipped` is incremented. Replaces the
  `response.read()` / `await response.aread()` unbounded buffer that
  held entire LLM streaming bodies in memory.
- P0-4: `_do_flush_locked` re-queue on CB OPEN now drops the NEWEST
  non-critical events instead of the oldest. The oldest events
  (incident start, billing-period start) are exactly what a billing
  investigator needs; losing them silently broke monthly rollups.
  Control-plane events (`state_change`, `kill_received`,
  `policy_invalidated`, `key_rotated`) are preserved unconditionally
  so the dashboard KILL switch lands even under sustained backend
  outage.
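
A minimal sketch of the two policies above, assuming events are dicts with a `type` field and arrive ordered oldest to newest. `read_bounded` and `requeue_drop_newest` are hypothetical names; with httpx the chunk iterator could be `response.iter_bytes()`, but the sketch takes any iterable of bytes so it stays self-contained.

```python
import os

# Default mirrors the 16 MiB figure above; the env var name is taken from the commit message.
MAX_RESPONSE_BYTES = int(os.environ.get("NULLRUN_MAX_RESPONSE_BYTES", 16 * 1024 * 1024))

CRITICAL_TYPES = {"state_change", "kill_received", "policy_invalidated", "key_rotated"}


def read_bounded(chunks, max_bytes=MAX_RESPONSE_BYTES):
    """Collect chunks up to max_bytes; return None once the cap would be exceeded."""
    buf = bytearray()
    for chunk in chunks:
        if len(buf) + len(chunk) > max_bytes:
            return None  # caller skips tracking and bumps a "skipped" counter
        buf.extend(chunk)
    return bytes(buf)


def requeue_drop_newest(events, capacity):
    """Keep critical events unconditionally; fill the remaining room oldest-first."""
    critical = sum(1 for event in events if event["type"] in CRITICAL_TYPES)
    room = max(capacity - critical, 0)
    kept = []
    for event in events:  # assumed ordered oldest -> newest
        if event["type"] in CRITICAL_TYPES:
            kept.append(event)
        elif room > 0:
            kept.append(event)
            room -= 1
    return kept


events = [
    {"type": "usage", "n": 1},
    {"type": "usage", "n": 2},
    {"type": "kill_received", "n": 3},
    {"type": "usage", "n": 4},
]
kept = requeue_drop_newest(events, capacity=2)
```

With a capacity of 2, the oldest usage event and the control-plane event survive, and the two newer usage events are dropped.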

Identity

- S-8 / P2-4: `agent()` now emits `str(uuid.uuid4())` (with dashes).
  Pre-fix the format was `f"agent-{uuid.uuid4().hex}"` — 32 hex chars,
  no dashes — and backend UUID-typed columns dropped these to NULL
  on insert. User-supplied names are still preserved verbatim.
- §7.2 #16: `workflow()` context manager now resets `span_id` (not
  only `workflow_id` / `trace_id`) so nested `with span()` blocks
  don't leave the inner span_id visible inside the workflow scope.
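
The format difference behind S-8 is easy to see with the standard library alone. In this sketch `new_agent_id` and `is_uuid` are hypothetical helpers; how a given backend treats a non-UUID string is backend-specific and only the parsing behaviour of Python's `uuid.UUID` is shown.

```python
import uuid


def new_agent_id(name=None):
    """Return the caller's name unchanged, or a canonical dashed UUID string."""
    return name if name is not None else str(uuid.uuid4())


def is_uuid(text):
    """True when the text parses as a UUID."""
    try:
        uuid.UUID(text)
        return True
    except ValueError:
        return False


generated = new_agent_id()
legacy = "agent-" + uuid.uuid4().hex  # the pre-fix format described above
```

The generated value is 36 characters long and parses as a UUID, while the prefixed legacy form does not.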

Resource leaks

- S-9: `_active_runs` on `NullRunCallback` is now an `OrderedDict`
  capped at 4096 with FIFO eviction. Pre-fix the dict grew
  unbounded when `on_chain_end` did not fire (some LangChain
  versions short-circuit the end hook on chain-body errors).
- S-10: WebSocket reconnect loop is now capped at 10 consecutive
  failures, then falls back to HTTP-poll. Pre-fix the loop ran
  forever when the backend was permanently down, leaking the
  WS thread.
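
Both caps can be sketched with the standard library. `BoundedRuns` and `connect_with_cap` are hypothetical stand-ins for the callback's run table and the reconnect loop; a real loop would also sleep between attempts, which is omitted here.

```python
from collections import OrderedDict


class BoundedRuns:
    """Run table that evicts the oldest entry once the cap is exceeded."""

    def __init__(self, cap=4096):
        self._cap = cap
        self._runs = OrderedDict()

    def start(self, run_id, info):
        self._runs[run_id] = info
        while len(self._runs) > self._cap:
            self._runs.popitem(last=False)  # FIFO: drop the oldest insertion

    def end(self, run_id):
        return self._runs.pop(run_id, None)

    def __len__(self):
        return len(self._runs)


def connect_with_cap(connect, max_failures=10):
    """Try to connect; give up after max_failures consecutive errors."""
    failures = 0
    while failures < max_failures:
        try:
            return connect()
        except ConnectionError:
            failures += 1
    return None  # caller falls back to polling


runs = BoundedRuns(cap=2)
for run_id in ("a", "b", "c"):
    runs.start(run_id, {"started": True})

attempts = []


def always_down():
    attempts.append(1)
    raise ConnectionError("backend unreachable")


result = connect_with_cap(always_down)
```

Because eviction happens on insert, a run whose end hook never fires cannot pin memory beyond the cap.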

Transport

- §7.2 #6: Separate `hmac_verify_expired_total` counter so SRE can
  distinguish clock-skew (NTP drift) from forged packets. Mirrored
  in both the HTTP and WebSocket verify paths.
- §7.2 #35: `CircuitBreaker.call` now dispatches the OPEN→HALF_OPEN
  jitter through `_maybe_apply_open_jitter_sync` /
  `_maybe_apply_open_jitter_async`. Pre-fix the jitter used
  `time.sleep` before dispatching to async, which blocked the
  caller's event loop on every transition.
- P2-1: `_coverage_seen` now bumps in the httpx path (sync + async).
  Pre-fix the counter was only bumped by the `requests` transport,
  so the dashboard's coverage view was empty for the dominant
  OpenAI / Anthropic / Gemini / Mistral / Cohere traffic.
- P2-3: `is_sensitive_tool` match is case-insensitive. Pre-fix
  `"stripe.charge"` did not match `"Stripe.Charge"`, bypassing the
  sensitive gate.
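
The sync/async split in §7.2 #35 comes down to which sleep primitive is used. The sketch below is an assumption-laden illustration: the helper names and the uniform jitter distribution are hypothetical, and only the 5s cap is taken from the text.

```python
import asyncio
import random
import time

JITTER_CAP_SECONDS = 5.0  # the 5s cap mentioned for both helpers


def jitter_delay(max_delay, cap=JITTER_CAP_SECONDS):
    """Pick a random delay, never above the cap."""
    return random.uniform(0.0, min(max_delay, cap))


def apply_jitter_sync(max_delay):
    """Blocking variant: only safe outside an event loop."""
    time.sleep(jitter_delay(max_delay))


async def apply_jitter_async(max_delay):
    """Non-blocking variant: yields to the event loop while waiting."""
    await asyncio.sleep(jitter_delay(max_delay))


samples = [jitter_delay(100.0) for _ in range(1000)]
```

Calling `time.sleep` inside a coroutine stalls every task on the loop, whereas `asyncio.sleep` suspends only the calling task.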

Concurrency

- §7.2 #39: New `_tools_lock` guards every mutation of
  `_strict_mode_tools` / `_sensitive_tools`. Same lock guards the
  coverage-counter bump+prune sequence (§7.2 #33) so two threads
  can't both observe the dict at length 4095 and both grow it to
  4097 before either prune lands.
- §7.2 #47: New `_langchain_lock` / `_langgraph_lock` guard the
  patch sequences end-to-end. Pre-fix two threads racing through
  `auto_instrument` could both pass the early `_x_patched` check
  and double-wrap `BaseCallbackManager` / `Pregel`.
- §7.2 #33: `_COVERAGE_CAP` (4096) bounds the per-host coverage
  dicts.
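
The locking pattern described above can be sketched as follows. `CappedCounter` and `PatchOnce` are hypothetical classes, not the SDK's; they show a bump-and-prune sequence and a check-then-patch sequence each held under a single lock so no second thread can interleave.

```python
import threading

COVERAGE_CAP = 4096  # same figure as _COVERAGE_CAP above


class CappedCounter:
    """Per-key counter whose size never exceeds the cap, even under contention."""

    def __init__(self, cap=COVERAGE_CAP):
        self._cap = cap
        self._lock = threading.Lock()
        self._counts = {}

    def bump(self, key):
        with self._lock:  # bump and prune happen as one atomic step
            self._counts[key] = self._counts.get(key, 0) + 1
            while len(self._counts) > self._cap:
                oldest = next(iter(self._counts))
                del self._counts[oldest]

    def __len__(self):
        with self._lock:
            return len(self._counts)


class PatchOnce:
    """Runs a patch function at most once across threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._patched = False

    def run(self, apply_patch):
        with self._lock:  # check and patch under the same lock
            if self._patched:
                return False
            apply_patch()
            self._patched = True
            return True


counter = CappedCounter(cap=100)


def worker(n):
    for i in range(500):
        counter.bump(f"host-{n}-{i}")


threads = [threading.Thread(target=worker, args=(n,)) for n in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Checking the flag outside the lock and patching inside it would reintroduce the double-wrap race, so both steps stay inside the `with` block.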

Webhook delivery

- P3-2: Exponential backoff (0.5s, 1s, 2s, 4s, 8s, 16s, 30s cap)
  replaces the previous linear schedule. Linear didn't back off
  fast enough under sustained outage — each KILL/PAUSE spawned
  its own delivery thread, producing 1000+ spinning threads
  hammering the dead endpoint.
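
The schedule quoted above follows from doubling a 0.5s base and clamping at 30s. A minimal sketch, with `backoff_delays` as a hypothetical helper name:

```python
def backoff_delays(base=0.5, cap=30.0, attempts=7):
    """Exponential backoff: base * 2**attempt, clamped to cap."""
    return [min(base * 2 ** attempt, cap) for attempt in range(attempts)]


schedule = backoff_delays()
```

The seventh value would be 32s unclamped, which is why the list ends at the 30s cap.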

WAL crash-recovery

- P1-5b: Atomic WAL writes (tmp + `fsync` + `os.replace`), 64 MiB
  rotation with `os.replace(wal, wal.1)`, replay drains both
  `wal.1` and `wal`. New `NULLRUN_WAL_PATH` / `NULLRUN_WAL_MAX_BYTES`
  env overrides for containers with `readOnlyRootFilesystem: true`.
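
The write, rotate and replay steps can be sketched with the standard library. This is one reading of the description, not the SDK's implementation: `atomic_write`, `rotate_if_needed` and `replay` are hypothetical names, and the write shown replaces the whole file rather than appending.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write to a temp file in the same directory, fsync, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".wal-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp, path)  # atomic rename on the same filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def rotate_if_needed(path, max_bytes):
    """Move the WAL to <path>.1 once it reaches max_bytes."""
    if os.path.exists(path) and os.path.getsize(path) >= max_bytes:
        os.replace(path, path + ".1")
        return True
    return False


def replay(path):
    """Read the rotated file first, then the live one, so order is preserved."""
    records = []
    for candidate in (path + ".1", path):
        if os.path.exists(candidate):
            with open(candidate, "rb") as handle:
                records.extend(handle.read().splitlines())
    return records


workdir = tempfile.mkdtemp()
wal = os.path.join(workdir, "nullrun.wal")
atomic_write(wal, b"a\nb\n")
rotated = rotate_if_needed(wal, max_bytes=4)
atomic_write(wal, b"c\n")
```

Creating the temp file in the target directory keeps the final `os.replace` on one filesystem, which is what makes the swap atomic.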

Tests

8 new regression test files (57 tests total):
  test_agent_id_uuid.py, test_args_pii_masked.py,
  test_streaming_oom_cap.py, test_lru_active_runs.py,
  test_reconnect_cap.py, test_coverage_seen_httpx.py,
  test_webhook_backoff.py, test_redact.py

`test_buffer_invariants.py` extended with drop-newest +
critical-event preservation cases. `test_release_polish.py`
updated to pin the 5s cap on both the sync and async jitter
helpers (post §7.2 #35 split).

Full incident write-ups in CHANGELOG.md under the same P0/S/P tags.
maltsev-dev added a commit that referenced this pull request Jun 19, 2026
* fix: P0 security/stability hardening bundle

* fix: address ruff lint findings from CI

Three CI lint failures on `ruff check src/` — fixes only, no
behavioural changes:

- **B905** (`src/nullrun/decorators.py:162`): `zip(bound_params,
  args)` now passes `strict=False` explicitly. The two iterables
  can legitimately differ in length: `bound_params` is sliced to
  `[: len(args)]`, but the function may declare fewer positional
  parameters than the args provided (e.g. *args-style callables),
  in which case the trailing loop below handles the excess.
  Leaving `strict=` implicit triggered B905; making it explicit
  documents the intent in code.

- **I001** (`src/nullrun/instrumentation/auto.py:1146`): the late
  `import os as _os` was moved to the top-of-file import block as
  `import os` (alphabetical order: hashlib, json, logging, os,
  threading). The `_os` alias was only there to avoid shadowing —
  there is no top-level `os` in scope, so the plain name is fine.
  Call site updated to use `os.environ.get(...)`.

- **S108** (`src/nullrun/transport.py:632`): replaced the
  hardcoded `/tmp/nullrun.wal` with
  `os.path.join(tempfile.gettempdir(), "nullrun.wal")`. The
  hardcoded `/tmp` flagged S108 (insecure / non-portable temp
  path) and would have broken the SDK on Windows out of the box.
  `gettempdir()` returns the OS-appropriate temp dir
  (`/tmp` on Linux, `/var/folders/...` on macOS, `%TEMP%` on
  Windows). `NULLRUN_WAL_PATH` env override still wins, so
  containers with `readOnlyRootFilesystem: true` are unaffected.
  Added `import tempfile` to the top-of-file imports.

Verified:
  - `ruff check src/` → All checks passed!
  - `mypy src/` → Success: no issues found in 23 source files
  - `pytest` → 493 passed, 13 skipped (CI default, no `-W error`)
maltsev-dev added a commit that referenced this pull request Jun 19, 2026
* fix: P0 security/stability hardening bundle

* fix: address ruff lint findings from CI

* chore(release): bump to 0.5.2

- Promote [Unreleased] to [0.5.2] — 2026-06-19; merge the two
  [Unreleased] sections that had drifted during Sprint 2.5 +
  Phase 0 development so release tooling scanning for the
  [Unreleased] anchor picks up the complete change set exactly
  once.
- Add PEP 561 marker (py.typed) — the package ships inline type
  annotations; the marker tells mypy / pyright / pylance to honour
  them.
- runtime.py (S-4): case-insensitive state compare in
  check_control_plane. Defensive against any backend casing drift
  beyond the current PascalCase (handlers.rs:9258). Pinned by
  tests/test_state_compare_case_insensitive.py (10 cases covering
  PascalCase / UPPERCASE / lowercase / mixed-case).
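
A case-insensitive comparison of this kind can be sketched with `str.casefold`. The helper `state_matches` and the state name `Killed` are hypothetical; the real `check_control_plane` logic is not shown in this PR.

```python
def state_matches(received, expected):
    """Compare two state names while ignoring case; non-strings never match."""
    return isinstance(received, str) and received.casefold() == expected.casefold()


variants = ["Killed", "KILLED", "killed", "kIlLeD"]
results = [state_matches(value, "Killed") for value in variants]
```

The `isinstance` guard keeps a missing or malformed state field from raising inside the control-plane check.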

Working-notes file docs/integration-baseline-2026-06-19.md is
deliberately left untracked, matching the analyze.md pattern from
d74712e.