test(oracle): add cross-attention, AdaLN, timestep-embed to gradcheck+oracle by dndungu · Pull Request #164 · zerfoo/ztensor

dndungu · 2026-06-17T07:20:20Z

Extends the ADR-091 gradcheck + PyTorch-oracle harness with three more E127/T127.1.0a diffusion-DiT op classes. Each is composed from existing engine ops, with an analytic backward verified against finite differences on CPU:

- CrossAttention: single-head scaled dot-product attention (Q, K, V; no params). torch: scaled_dot_product_attention.
- AdaLN: out = x*(1 + c@Ws) + c@Wsh modulation core (two projection params).
- TimestepEmbed: concat(sin(t@freqs), cos(t@freqs)) sinusoidal embedding (freqs leaf).
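To make "analytic backward verified against finite-difference" concrete, here is a minimal standalone sketch for the AdaLN core only. It does not use the ztensor engine or the harness API: the helper names (`matmul`, `adaLNLoss`, `adaLNGrads`, `maxFDError`), the shapes, the input values, the upstream gradient `g`, and the step size `eps` are all assumptions chosen for illustration. It checks the gradients of the two projection params, Ws and Wsh, against a central difference.

```go
package main

import (
	"fmt"
	"math"
)

// matmul returns a@b for row-major a [n x k] and b [k x d].
func matmul(a, b []float64, n, k, d int) []float64 {
	out := make([]float64, n*d)
	for i := 0; i < n; i++ {
		for p := 0; p < k; p++ {
			for j := 0; j < d; j++ {
				out[i*d+j] += a[i*k+p] * b[p*d+j]
			}
		}
	}
	return out
}

// adaLNLoss reduces out = x*(1 + c@Ws) + c@Wsh to the scalar sum(g*out),
// where g plays the role of an upstream gradient (illustrative choice).
func adaLNLoss(x, c, ws, wsh, g []float64, n, k, d int) float64 {
	scale := matmul(c, ws, n, k, d)
	shift := matmul(c, wsh, n, k, d)
	loss := 0.0
	for i := range x {
		loss += g[i] * (x[i]*(1+scale[i]) + shift[i])
	}
	return loss
}

// adaLNGrads returns the analytic gradients of that scalar:
// dWs = c^T @ (g*x) and dWsh = c^T @ g.
func adaLNGrads(x, c, g []float64, n, k, d int) (dWs, dWsh []float64) {
	dWs = make([]float64, k*d)
	dWsh = make([]float64, k*d)
	for i := 0; i < n; i++ {
		for p := 0; p < k; p++ {
			for j := 0; j < d; j++ {
				dWs[p*d+j] += c[i*k+p] * g[i*d+j] * x[i*d+j]
				dWsh[p*d+j] += c[i*k+p] * g[i*d+j]
			}
		}
	}
	return dWs, dWsh
}

// maxFDError perturbs each element of param in place and returns the largest
// gap between the central-difference estimate and the analytic gradient.
func maxFDError(param, analytic []float64, loss func() float64) float64 {
	const eps = 1e-6 // assumed step size for this sketch
	worst := 0.0
	for i := range param {
		orig := param[i]
		param[i] = orig + eps
		up := loss()
		param[i] = orig - eps
		down := loss()
		param[i] = orig // restore before moving on
		fd := (up - down) / (2 * eps)
		worst = math.Max(worst, math.Abs(fd-analytic[i]))
	}
	return worst
}

func main() {
	n, k, d := 2, 3, 2 // rows, conditioning width, feature width (arbitrary)
	x := []float64{0.5, -1.2, 0.3, 0.8}
	c := []float64{0.1, -0.4, 0.7, 0.9, 0.2, -0.3}
	ws := []float64{0.2, -0.1, 0.4, 0.3, -0.5, 0.6}
	wsh := []float64{-0.3, 0.5, 0.1, -0.2, 0.7, 0.4}
	g := []float64{1.0, -0.5, 0.25, 2.0}

	loss := func() float64 { return adaLNLoss(x, c, ws, wsh, g, n, k, d) }
	dWs, dWsh := adaLNGrads(x, c, g, n, k, d)
	fmt.Printf("max |fd - analytic| for Ws:  %.2e\n", maxFDError(ws, dWs, loss))
	fmt.Printf("max |fd - analytic| for Wsh: %.2e\n", maxFDError(wsh, dWsh, loss))
}
```

Because the output is linear in Ws and Wsh, the central difference is exact up to rounding here, so both gaps should be tiny. The same pattern would extend to x and c, and to the nonlinear CrossAttention and TimestepEmbed ops, where the choice of step size and tolerance matters more.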

Verified

TestRegistry/{CrossAttention,AdaLN,TimestepEmbed} gradcheck pass; full gradcheck + oracle registry↔torchmap lockstep green; go vet/build clean.

Coverage

With GroupNorm (already merged, #159), 4 of the 6 T127.1.0a op classes are now covered. The remaining two — Conv3D, ConvTranspose — are forward-only per ADR-092 (inference-only VAE, forward-parity not gradcheck) and do not fit this backward-checking harness; they need a separate forward-parity path (tracked follow-up, not this PR).
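For contrast with gradcheck, a forward-parity check only compares forward outputs against reference values within a tolerance. The sketch below is one possible shape for such a check, not the tracked follow-up itself: `convTranspose1D` is a hypothetical stand-in for an engine op (a 1-D, stride-1, no-padding transposed convolution rather than the real Conv3D/ConvTranspose), `allClose` is an illustrative helper, the tolerances are assumed, and the reference values are worked out by hand where a real harness would take them from the PyTorch oracle.

```go
package main

import (
	"fmt"
	"math"
)

// convTranspose1D is a hypothetical stand-in for an engine forward op:
// stride-1, no-padding transposed convolution, out[i+j] += x[i]*w[j].
func convTranspose1D(x, w []float64) []float64 {
	out := make([]float64, len(x)+len(w)-1)
	for i, xv := range x {
		for j, wv := range w {
			out[i+j] += xv * wv
		}
	}
	return out
}

// allClose reports whether |got-want| <= atol + rtol*|want| holds for every
// element. The negated comparison makes a NaN in either slice fail the check.
func allClose(got, want []float64, rtol, atol float64) bool {
	if len(got) != len(want) {
		return false
	}
	for i := range got {
		diff := math.Abs(got[i] - want[i])
		if !(diff <= atol+rtol*math.Abs(want[i])) {
			return false
		}
	}
	return true
}

func main() {
	got := convTranspose1D([]float64{1, 2, 3}, []float64{1, -1})
	// Reference worked out by hand for this sketch; a real harness would
	// load it from the oracle instead.
	want := []float64{1, 1, 1, -3}
	fmt.Println("forward parity:", allClose(got, want, 1e-5, 1e-8))
}
```

No backward pass is involved, which is why this kind of check suits forward-only ops.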

Companion to #159. Refs zerfoo E127.

…+oracle

Extends the ADR-091 gradcheck + PyTorch-oracle harness with three more E127 diffusion-DiT op classes (T127.1.0a), each composed from existing engine ops with an analytic backward verified against finite-difference on CPU:

- CrossAttention: single-head scaled dot-product attention (Q,K,V; no params). torch: scaled_dot_product_attention.
- AdaLN: out = x*(1+c@Ws) + c@Wsh modulation core (two projection params).
- TimestepEmbed: concat(sin(t@freqs), cos(t@freqs)) sinusoidal embedding.

Verified: TestRegistry/{CrossAttention,AdaLN,TimestepEmbed} gradcheck pass; full gradcheck + oracle registry<->torchmap lockstep green; go vet clean.

With GroupNorm (already merged), 4 of the 6 T127.1.0a op classes are now covered. The remaining two (Conv3D, ConvTranspose) are FORWARD-ONLY per ADR-092 and do not fit a backward-checking gradcheck harness; they need a separate forward-parity path (tracked follow-up).

dndungu merged commit cc6948a into main Jun 17, 2026
1 check failed

dndungu deleted the feat/oracle-attn-adaln-timestep-t127 branch June 17, 2026 07:20

dndungu merged 1 commit into main from feat/oracle-attn-adaln-timestep-t127
