Fix int16 TOSA.TABLE LUT zeroed when output range uses <16 bits #20668

Open
christine-long-meta wants to merge 2 commits into pytorch:main from christine-long-meta:export-D107331163
christine-long-meta (Contributor) commented Jul 1, 2026

Summary:
`InsertTableOpsPass.generate_16_bit_table_values` builds the int16 `TOSA.TABLE` lookup table for unary ops (sigmoid, tanh, ...). It computes `rshift = ceil(log2(max_table_value)) + 1 - 16` to fit the table into 16 signed bits, then applies `lut_values >> rshift`, assuming the table fills roughly 16 bits (its own comment notes "for int16, rshift == 0").

This breaks when the op's output range uses fewer than 16 bits. A sigmoid output lies in `[0, 1]`; quantized with a small scale (e.g. `1/4096`), the largest table value is `4096`, which needs 13 signed bits, so `rshift = 13 - 16 = -3`. `lut_values >> -3` is a right shift by a negative count, which is undefined; on the host the shift count is masked and the entire table is zeroed, so the activation returns 0 for every input. Any int16 `TABLE` op with a small output range (e.g. a sigmoid in a Squeeze-and-Excitation block) therefore degenerates to a constant zero.
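
The arithmetic can be reproduced in plain Python. This is an illustrative sketch, not the ExecuTorch code: `unclamped_rshift` restates the formula quoted above, and `masked_rshift` is a hypothetical helper that emulates a host masking the shift count to the operand width (64 bits is an assumption here; the real behaviour for a negative count is undefined and may differ).

```python
import math

def unclamped_rshift(max_table_value: int) -> int:
    # Formula quoted in the summary: bits needed (incl. sign) minus 16.
    return math.ceil(math.log2(max_table_value)) + 1 - 16

def masked_rshift(value: int, count: int, width: int = 64) -> int:
    # Hypothetical emulation: keep only the low bits of the shift count,
    # so a count of -3 becomes 61 for a 64-bit operand.
    return value >> (count & (width - 1))

rshift = unclamped_rshift(4096)  # 12 + 1 - 16 = -3
zeroed = [masked_rshift(v, rshift) for v in (0, 2048, 4096)]
print(rshift, zeroed)  # -3 [0, 0, 0]
```

Under that assumption every entry is shifted by 61 bits, which is why the whole table collapses to zero.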

Fix: clamp `rshift` to `>= 0`. A negative value means the table values already fit in int16, so no shift is needed; clamping restores the documented `rshift == 0` / `rescale_lshift == -7` case. The fix is general: it covers any int16 `TABLE` op whose output range is small.
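
A minimal sketch of the clamp, again in plain Python rather than the actual pass. `clamped_rshift` and `shift_table` are hypothetical names used only for illustration, and the sketch assumes the table contains at least one non-zero value.

```python
import math

def clamped_rshift(max_table_value: int) -> int:
    # Same formula as before, but never negative: a table that already
    # fits in 16 signed bits is left unshifted.
    rshift = math.ceil(math.log2(max_table_value)) + 1 - 16
    return max(rshift, 0)

def shift_table(values):
    # Assumes at least one non-zero entry, so log2 is defined.
    max_abs = max(abs(v) for v in values)
    rshift = clamped_rshift(max_abs)
    return [v >> rshift for v in values], rshift

# Small output range: rshift clamps from -3 to 0, values are unchanged.
small, small_shift = shift_table([0, 2048, 4096])
# Wide range: rshift is 5 and the values are scaled into int16.
wide, wide_shift = shift_table([0, 524288, 1048575])
print(small, small_shift)  # [0, 2048, 4096] 0
print(wide, wide_shift)    # [0, 16384, 32767] 5
```

Tables that genuinely exceed 16 bits still get the same positive shift as before; only the negative case changes.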

Differential Revision: D107331163

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell @rascani

pytorch-bot Bot commented Jul 1, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20668

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 926f6e5 with merge base fc408f8:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla Bot added the CLA Signed label Jul 1, 2026
meta-codesync Bot (Contributor) commented Jul 1, 2026

@christine-long-meta has exported this pull request. If you are a Meta employee, you can view the originating Diff in D107331163.

github-actions Bot commented Jul 1, 2026

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Labels

ciflow/trunk, CLA Signed, meta-exported, module: arm
