diff --git a/README.md b/README.md
index aeca388..e9ff7e4 100644
--- a/README.md
+++ b/README.md
@@ -24,6 +24,10 @@ the right evidence path.
 For the shortest boundary check before adding or reviewing new material, use
 the [repository scope map](docs/repo-scope-map.md).
 
+For why the scientific-computing background helps this repository's review
+style without widening its scope, use
+[`docs/why-scientific-computing-background-helps.md`](docs/why-scientific-computing-background-helps.md).
+
 For the SBOM tool's risk-model boundary, use
 [`docs/risk-model-boundary.md`](docs/risk-model-boundary.md). It states which
 fields affect risk buckets, which fields are context only, and which claims the
@@ -170,6 +174,8 @@ they do not prove the same thing.
   [`scripts/validate-reviewer-routes.py`](scripts/validate-reviewer-routes.py)
 - Repository scope map:
   [`docs/repo-scope-map.md`](docs/repo-scope-map.md)
+- Scientific-computing background note:
+  [`docs/why-scientific-computing-background-helps.md`](docs/why-scientific-computing-background-helps.md)
 - Risk model boundary:
   [`docs/risk-model-boundary.md`](docs/risk-model-boundary.md)
 
diff --git a/docs/reviewer-brief.md b/docs/reviewer-brief.md
index f364195..f6ac513 100644
--- a/docs/reviewer-brief.md
+++ b/docs/reviewer-brief.md
@@ -25,6 +25,7 @@ workflows, but they are not part of the `sbom-diff-and-risk` release surface.
 | Review question | Start here | Stop when |
 | --- | --- | --- |
 | What is the repository shape? | This brief, the root [README](../README.md), and the [repository scope map](repo-scope-map.md). | You can distinguish the flagship SBOM tool from the supporting diagnostics projects. |
+| Why does scientific-computing background help review? | The [scientific-computing background note](why-scientific-computing-background-helps.md). | You can explain reproducibility, data-pipeline, and uncertainty-boundary habits without widening repository scope. |
 | What should I review for the SBOM tool? | The SBOM [reviewer path](../tools/sbom-diff-and-risk/docs/reviewer-path.md). | You have chosen the right 30-second, 5-minute, 15-minute, release, or deep-review route. |
 | What does the SBOM risk model actually use? | The [risk model boundary](risk-model-boundary.md). | You can separate risk inputs from context-only fields and non-claims. |
 | Can the SBOM examples be reproduced? | The SBOM [example artifact regeneration guide](../tools/sbom-diff-and-risk/docs/example-artifact-regeneration.md). | `python scripts/regenerate-example-artifacts.py --check` passes. |
@@ -53,6 +54,10 @@ workflows, but they are not part of the `sbom-diff-and-risk` release surface.
   intentionally deferred production PyPI decision docs.
 - Scope map: `docs/repo-scope-map.md` keeps the flagship/supporting split and
   repository non-claims explicit.
+- Scientific-computing background note:
+  `docs/why-scientific-computing-background-helps.md` explains
+  reproducibility, data-pipeline, and uncertainty-boundary habits without
+  widening repository scope.
 - Risk model boundary: `docs/risk-model-boundary.md` states which fields affect
   risk classification, which fields are context only, and what the model never
   infers.
diff --git a/docs/why-scientific-computing-background-helps.md b/docs/why-scientific-computing-background-helps.md
new file mode 100644
index 0000000..5170a46
--- /dev/null
+++ b/docs/why-scientific-computing-background-helps.md
@@ -0,0 +1,66 @@
+# Why Scientific Computing Background Helps
+
+This note explains why scientific-computing habits are useful to this
+repository's review style. It is a working-method note, not a domain identity
+claim and not a reason to expand repository scope.
+
+## Reproducibility
+
+Scientific-computing work rewards runs that can be repeated from explicit
+inputs. That habit maps directly to this repository's strongest reviewer
+surface:
+
+- checked-in fixtures instead of private source material
+- deterministic commands that can be rerun locally
+- generated artifacts that can be compared against known outputs
+- documentation that separates what was run from what was inferred
+
+For `sbom-diff-and-risk`, this means example reports, policy sidecars, SARIF
+samples, and release evidence should stay reproducible from public-safe inputs.
+The point is not to claim broad expertise; it is to make review evidence easier
+to repeat.
+
+## Data Pipeline
+
+Scientific-computing workflows often make the pipeline visible: ingest,
+normalize, transform, summarize, and report. That pattern helps keep this
+repository inspectable.
+
+For the flagship SBOM tool, the useful pipeline boundary is:
+
+- parse SBOMs or dependency manifests
+- normalize package records into a stable internal shape
+- compute local diffs and heuristic findings
+- apply explicit local policy when requested
+- emit machine-readable and human-readable review artifacts
+
+Each stage should have a clear input and output. When a later report includes
+context from an earlier stage, the report should preserve enough provenance for
+a reviewer to understand where the value came from. Hidden enrichment, opaque
+scoring, and untraceable conclusions work against that goal.
+
+## Uncertainty Boundary
+
+Scientific-computing review also depends on knowing what the data cannot prove.
+That habit matters here because dependency evidence is easy to overstate.
+
+The repository should keep uncertainty boundaries explicit:
+
+- local manifests and SBOMs prove what they contain, not what the ecosystem
+  currently knows
+- optional enrichment is evidence for that run, not a universal truth source
+- policy output is a local decision, not a package safety verdict
+- missing evidence should stay visible as missing evidence
+- unknowns should be reported as unknown or `not_evaluated`, not filled with
+  guesses
+
+This is why the docs keep non-claims close to the examples. A reviewer should
+be able to say what was observed, what was reproduced, and what remains outside
+the evidence boundary.
+
+## Scope Rule
+
+Use scientific-computing background as a discipline for reproducible evidence,
+clear data flow, and careful uncertainty handling. Do not use it as a reason to
+add unrelated project surfaces, broaden claims, or dilute the flagship
+`sbom-diff-and-risk` reviewer route.
diff --git a/scripts/validate-reviewer-routes.py b/scripts/validate-reviewer-routes.py
index d8a3d7d..61d2d69 100644
--- a/scripts/validate-reviewer-routes.py
+++ b/scripts/validate-reviewer-routes.py
@@ -14,6 +14,7 @@
     Path("README.md"),
     Path("docs/reviewer-brief.md"),
     Path("docs/repo-scope-map.md"),
+    Path("docs/why-scientific-computing-background-helps.md"),
     Path("docs/risk-model-boundary.md"),
     Path("tools/sbom-diff-and-risk/docs/report-schema.md"),
     Path("tools/sbom-diff-and-risk/docs/github-actions-consumer-example.md"),
@@ -47,6 +48,7 @@
     Path("README.md"): {
         "docs/reviewer-brief.md",
         "docs/repo-scope-map.md",
+        "docs/why-scientific-computing-background-helps.md",
         "docs/risk-model-boundary.md",
         "tools/sbom-diff-and-risk/docs/reviewer-path.md",
         "tools/sbom-diff-and-risk/docs/reviewer-evidence-pack.md",
@@ -57,6 +59,7 @@
     Path("docs/reviewer-brief.md"): {
         "README.md",
         "docs/repo-scope-map.md",
+        "docs/why-scientific-computing-background-helps.md",
         "docs/risk-model-boundary.md",
         "tools/sbom-diff-and-risk/docs/reviewer-path.md",
         "tools/sbom-diff-and-risk/docs/example-artifact-regeneration.md",
@@ -65,6 +68,7 @@
         "projects/python-weather-diagnostics-toolkit/docs/reviewer-path.md",
     },
     Path("docs/repo-scope-map.md"): set(),
+    Path("docs/why-scientific-computing-background-helps.md"): set(),
     Path("docs/risk-model-boundary.md"): {
         "tools/sbom-diff-and-risk/docs/dependency-risk-heuristics.md",
         "tools/sbom-diff-and-risk/src/sbom_diff_risk/diffing.py",
@@ -150,11 +154,13 @@
     Path("README.md"): (
         "current flagship tool",
         "not part of the `sbom-diff-and-risk` release surface",
+        "why the scientific-computing background helps",
         "Production PyPI publishing: intentionally deferred",
     ),
     Path("docs/reviewer-brief.md"): (
         "The current flagship project is",
         "supporting diagnostics projects",
+        "scientific-computing background note",
         "production PyPI publishing remains intentionally deferred",
     ),
     Path("docs/repo-scope-map.md"): (
@@ -170,6 +176,18 @@
         "not a CVE resolver",
         "not a production PyPI release claim",
     ),
+    Path("docs/why-scientific-computing-background-helps.md"): (
+        "Reproducibility",
+        "Data Pipeline",
+        "Uncertainty Boundary",
+        "not a domain identity claim",
+        "not a reason to expand repository scope",
+        "checked-in fixtures instead of private source material",
+        "Each stage should have a clear input and output.",
+        "missing evidence should stay visible as missing evidence",
+        "not a package safety verdict",
+        "Do not use it as a reason to add unrelated project surfaces",
+    ),
     Path("docs/risk-model-boundary.md"): (
         "Fields that affect risk classification",
         "Context-only fields",
@@ -243,6 +261,16 @@
     ),
 }
 
+FORBIDDEN_TEXT = {
+    Path("docs/why-scientific-computing-background-helps.md"): (
+        "meteorology",
+        "weather",
+        "climate",
+        "atmospheric",
+        "precipitation",
+    ),
+}
+
 REQUIRED_REVIEWER_PATHS = (
     Path("tools/sbom-diff-and-risk/docs/reviewer-path.md"),
     Path("projects/precipitation-anomaly-diagnostics/docs/reviewer-path.md"),
@@ -527,6 +555,13 @@ def validate_required_text(markdown_path: Path, errors: list[str]) -> None:
             errors.append(f"{markdown_path}: missing reviewer contract phrase: {phrase!r}")
 
 
+def validate_forbidden_text(markdown_path: Path, errors: list[str]) -> None:
+    text = read_markdown(markdown_path).lower()
+    for phrase in FORBIDDEN_TEXT.get(markdown_path, ()):
+        if phrase.lower() in text:
+            errors.append(f"{markdown_path}: forbidden scope phrase present: {phrase!r}")
+
+
 def validate_required_paths(errors: list[str]) -> None:
     for path in REQUIRED_REVIEWER_PATHS:
         if not (REPO_ROOT / path).is_file():
@@ -577,6 +612,7 @@ def main() -> int:
 
         validate_required_links(markdown_path, errors)
         validate_required_text(markdown_path, errors)
+        validate_forbidden_text(markdown_path, errors)
 
     validate_required_paths(errors)
     validate_workflow_path_filters(reviewer_surface_markdown, errors)
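
Reviewer note: the `validate_forbidden_text` check added in this diff is a case-insensitive substring match. The sketch below reproduces that matching rule in isolation so its behavior can be checked without the repository. It is a minimal illustration, not the script itself: `find_forbidden_phrases`, the `docs/example-note.md` path, and the sample strings are hypothetical, and the text is passed in directly instead of going through the script's `read_markdown` helper.

```python
from pathlib import Path

# Hypothetical stand-in for the script's FORBIDDEN_TEXT mapping.
FORBIDDEN_TEXT = {
    Path("docs/example-note.md"): ("weather", "climate"),
}


def find_forbidden_phrases(markdown_path: Path, text: str) -> list[str]:
    """Return one error message per forbidden phrase found in text."""
    lowered = text.lower()
    errors: list[str] = []
    # Paths without an entry in the mapping have nothing to check.
    for phrase in FORBIDDEN_TEXT.get(markdown_path, ()):
        # Substring match: "weather" also matches inside "Weathering".
        if phrase.lower() in lowered:
            errors.append(
                f"{markdown_path}: forbidden scope phrase present: {phrase!r}"
            )
    return errors


note = Path("docs/example-note.md")
clean = find_forbidden_phrases(note, "Reproducible runs from explicit inputs.")
flagged = find_forbidden_phrases(note, "Weathering a refactor in any climate.")
unlisted = find_forbidden_phrases(Path("README.md"), "weather")

print(clean)
print(flagged)
print(unlisted)
```

Because the match is on substrings, a word such as "weathering" would also trip the check. If that turns out to be too strict for the note, a word-boundary regular expression would be one alternative, at the cost of a slightly more complex validator.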