Skip to content

Phase 1: Seq<T> ghost type#143

Draft
coord-e wants to merge 4 commits into
mainfrom
claude/nifty-newton-wfxz2e
Draft

Phase 1: Seq<T> ghost type#143
coord-e wants to merge 4 commits into
mainfrom
claude/nifty-newton-wfxz2e

Conversation

@coord-e

@coord-e coord-e commented Jun 20, 2026

Copy link
Copy Markdown
Owner

Summary

Phase 1 of the Creusot-style ghost-sequence support tracked in
/root/.claude/plans/how-can-we-implement-buzzing-cascade.md. Eventual
goal: verify Creusot's iter::Map / iter::MapExt in Thrust.

Introduces thrust_models::model::Seq<T> as (Array<Int, T>, Int)
(mirroring the existing Vec model) and wires the following operations,
each with paired pass/fail UI tests:

Op Lowering
s.len() tuple_proj(s, 1)
s.push(e) tuple([store(s.0, s.1, e), s.1 + 1])
s[i] select(s.0, i) — detected via #[thrust::def::seq_model] marker
Seq::<T>::empty() tuple([seq_default_arr_int, 0])
Seq::singleton(x) tuple([store(seq_default_arr_int, 0, x), 1])
s.concat(t) tuple([seq_concat_arr_int(s.0, s.1, t.0), s.1 + t.1])
s.subsequence(l, r) tuple([seq_subseq_arr_int(s.0, l), r - l])

Solver-side details

seq_default_arr_int is declared once in the SMT preamble; the array
contents are irrelevant because indices ≥ length are never accessed.

seq_concat_arr_int(a, la, b) / seq_subseq_arr_int(a, l) are
peephole markers: the select simplifier rewrites
select(seq_concat_arr_int(a, la, b), i) → element-getter
seq_concat_get_int(a, la, b, i) (and similarly for subseq) before
the term reaches the SMT solver. The getters are emitted via SMT-LIB2
(define-fun-rec ...) whenever used, on demand (gated by
System::uses_seq_concat / uses_seq_subseq).

  • Length reasoning on concat / subsequence works under either
    solver, because the length is computed inline by the analyzer
    (a.len + b.len, r - l).
  • Indexed reasoning (e.g. s.concat(t)[i] == s[i] when
    i < s.len()) works under pcsat — opt in with
    //@rustc-env: THRUST_SOLVER=tests/thrust-pcsat-wrapper. z3 / Spacer
    in HORN logic returns unknown for define-fun-rec over arrays,
    which is a known solver limitation.

The marker symbols themselves are never declared and should not reach
the SMT output; if they do (e.g. through a whole-array equality), the
SMT solver reports an undeclared symbol. The reference iterator
examples never compare array slots directly, so this is not expected to
block the eventual port.

Term simplifiers added to chc::Term

Two small peephole rules keep the encoding inside what HORN solvers
reliably reduce:

  • tuple_proj(tuple(ts...), i)ts[i] — Spacer doesn't reduce
    datatype projections of literal constructors, so this is needed for
    e.g. Seq::empty().len() to reduce syntactically to 0.
  • select(store(a, i, v), j)v / select(a, j) for concrete i,
    j integer literals — needed for Seq::singleton(x)[0] == x.

Plus the marker-consuming select(seq_*_arr_int(...), i) rewrites
described above.

Additionally fixes chc::Term::App emission to print bare f (not
(f)) for nullary functions, matching SMT-LIB conventions for
declare-const.

Phases that follow (separate PRs)

  • Phase 2: snapshot!{} / proof_assert!{} macros
  • Phase 3: trait #[predicate] propagation
  • Phase 4: end-to-end port of map.rs / map_ext.rs

Test plan

  • cargo build
  • cargo fmt --all -- --check
  • cargo clippy -- -D warnings
  • cargo test (270 / 270 passing, including 12 new paired
    seq_* tests: seq_len_push, seq_index, seq_empty,
    seq_singleton, seq_concat, seq_subsequence,
    seq_concat_index, seq_concat_index_right, seq_subseq_index)

🤖 Generated with Claude Code

https://claude.ai/code/session_01A7nMxM1sJ8wxVJmMvsEvSX

claude added 3 commits June 19, 2026 12:53
Implements minimal Phase 1 of the Creusot-style ghost sequence support:
defines `thrust_models::model::Seq<T>` as `(Array<Int, T>, Int)` (mirroring
the existing Vec model) and wires per-operation analyzer arms in
`annot_fn.rs` so calls like `s.len()` and `s.push(e)` lower mechanically
to tuple projections / array stores.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01A7nMxM1sJ8wxVJmMvsEvSX
In ExprKind::Index, detect a `model::Seq<_>` receiver and lower
`s[i]` to `select(s.0, i)` instead of treating the receiver as an
Array directly. Introduces the `#[thrust::def::seq_model]` marker and
`DefIdCache::seq_model()` to recognise the Seq ADT.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01A7nMxM1sJ8wxVJmMvsEvSX
Adds the remaining Seq operations beyond `len` / `push` / indexing:

- `Seq::empty()` / `Seq::singleton(x)` — lower to tuple values whose array
  slot is the new global `seq_default_arr_int` constant (declared in the
  SMT preamble) and whose length is `0` / `1`. The array contents are
  irrelevant because indices ≥ length are never accessed.
- `s.concat(t)` / `s.subsequence(l, r)` — lower to `Tuple([..., len])`
  where `len` is computed inline (`a.len + b.len` / `r - l`). The array
  slot uses fresh uninterpreted symbols `seq_concat_arr_int` /
  `seq_subseq_arr_int`, declared on demand via new `System` flags
  `uses_seq_concat` / `uses_seq_subseq` to avoid tripping solvers that
  can't synthesize their sort even when unused. The extensional
  pointwise axioms describing index access on the result are
  intentionally omitted — they cause Spacer to return `unknown` even
  on clauses that don't touch the axiomatized positions. As a result
  this round supports length-only reasoning on `concat` /
  `subsequence`; indexed reasoning is a follow-up.

Two peephole simplifications in `chc::Term` keep the encoding inside
what HORN solvers reliably reduce:

- `tuple_proj(tuple(ts...), i)` ↦ `ts[i]` — needed so that
  `Seq::empty().len()` reduces syntactically to `0` (Spacer doesn't
  reduce datatype projections of literal constructors).
- `select(store(a, i, v), j)` ↦ `v` / `select(a, j)` when both indices
  are concrete integer literals — needed for `Seq::singleton(x)[0] == x`.

Also fixes `chc::Term::App` emission to print bare `f` (not `(f)`) for
nullary functions, matching SMT-LIB conventions for `declare-const`.

Adds paired pass/fail UI tests for each new operation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01A7nMxM1sJ8wxVJmMvsEvSX
@coord-e coord-e changed the title Phase 1: Seq&lt;T&gt; ghost type — len, push, indexing Phase 1: Seq&lt;T&gt; ghost type Jun 20, 2026
Per-element semantics of `concat` / `subsequence` now verify under
pcsat — previously only length-only reasoning was supported.

Approach: keep the array-returning `seq_concat_arr_int` /
`seq_subseq_arr_int` symbols as *peephole markers* in the array slot
of concat / subsequence results, and add a `select` simplification
rule that rewrites `select(seq_concat_arr_int(a, la, b), i)` to a new
element-getter `seq_concat_get_int(a, la, b, i)` (and analogously for
subsequence) before the term reaches the SMT solver. The getters are
emitted via SMT-LIB2 `(define-fun-rec ...)` whenever they're used.

pcsat parses `define-fun-rec` returning `Int` and reasons about it;
z3 / Spacer in HORN logic still returns `unknown` for these
definitions, so indexed-access tests opt into pcsat via
`THRUST_SOLVER=tests/thrust-pcsat-wrapper`. Length-only reasoning
continues to work under either solver (the length is computed inline
by the analyzer as `a.len + b.len` / `r - l`).

Paired pass/fail tests added: `seq_concat_index{,_right}`,
`seq_subseq_index`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01A7nMxM1sJ8wxVJmMvsEvSX
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants