Honour LZWDecode /EarlyChange 0; expand filter test coverage #14
Merged
decodeLZW defaulted EarlyChange to 1 whenever the value was 0, conflating
"unset" (PDF default 1) with an explicit /EarlyChange 0, so streams that
set it to 0 were decoded with the wrong code-width timing. Replace the
EarlyChange int field with NoEarlyChange bool: the zero value keeps the
default early change and an explicit 0 is now honoured. This also clamps
the flag to {0,1}, so a hostile /EarlyChange can no longer distort the
width threshold.
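
The mapping described above can be sketched as follows. This is only an illustration: `Params` here is a cut-down stand-in, `fromEarlyChange` and `widthThreshold` are hypothetical helper names, and the rule that any value other than an explicit 0 keeps early change on is an assumption drawn from the description, not the PR's actual code.

```go
package main

import "fmt"

// Params is a hypothetical, cut-down stand-in for the PR's filter parameters.
type Params struct {
	NoEarlyChange bool // zero value keeps the PDF default (/EarlyChange 1)
}

// fromEarlyChange maps a raw /EarlyChange entry onto the flag.
// Assumption: only an explicit 0 disables early change; every other
// value (including a huge, hostile one) leaves the default in place.
func fromEarlyChange(raw int64, present bool) Params {
	return Params{NoEarlyChange: present && raw == 0}
}

// widthThreshold returns the table size at which the code width grows.
// Because early is always 0 or 1, the result can never go negative.
func widthThreshold(width uint, p Params) int {
	early := 1
	if p.NoEarlyChange {
		early = 0
	}
	return (1 << width) - early
}

func main() {
	cases := []Params{
		{},                           // key absent
		fromEarlyChange(0, true),     // explicit /EarlyChange 0
		fromEarlyChange(1<<40, true), // hostile value
	}
	for _, p := range cases {
		fmt.Println(p.NoEarlyChange, widthThreshold(9, p))
	}
}
```

With a bool field, the "unset" and "explicitly 0" cases can no longer collide, and the threshold at width 9 is either 511 or 512 regardless of what the stream dictionary contains.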
Adds filter tests: stdlib-LZW round-trips exercising dictionary reuse,
KwKwK, 9->12-bit growth and the dictionary-full reset; an EarlyChange
regression at the Open()/Content() surface; a truncated-stream no-panic
check; ASCII85 round-trips and edge cases (z, whitespace, invalid byte,
partial groups); and a chained /Filter array. Removes the dead errors
import from lzw.go.
internal/filter 80.7% -> 93.2%; decodeLZW/readBits/decodeASCII85 to 100%.
d04da68 to a6d484f
Bug
decodeLZW treated an EarlyChange value of 0 as unset and substituted the default of 1. /EarlyChange defaults to 1 but may be set to 0. Because the field's zero value (0) doubled as "unset", a stream that explicitly declared /EarlyChange 0 was silently decoded with early change on, producing garbage past the first 9→10-bit code-width boundary (~511 dict entries).

Fix
Replace Params.EarlyChange int with Params.NoEarlyChange bool: /EarlyChange 0 → NoEarlyChange = true.

As a bonus this clamps the effective flag to {0,1}, so a hostile /EarlyChange value (e.g. a huge int) can no longer drive the width threshold (1<<width) - early negative.

How the oracle works
Go's compress/lzw (MSB) turned out to use the non-early convention, so its output round-trips through this decoder only with NoEarlyChange: true. That makes it a sound independent oracle for the early=0 path and the shared width-growth / clear / KwKwK machinery. The red→green transition was verified by temporarily reverting the fix (the /EarlyChange 0 stream then fails with invalid code … at width 9).

Tests
- TestLZWRoundTripStdlib: stdlib-encoded streams (incl. a 64 KiB varied buffer) exercise dict reuse, KwKwK, 9→12-bit growth and the dictionary-full reset.
- TestLZWEarlyChangeHonored / TestLZWStreamEarlyChangeZero: the flag is honoured, the latter at the Open() → Content() surface (also covers paramsFromDict + streamFilterChain).
- TestLZWTruncatedNoPanic: a stream cut mid-code must not panic (readBits zero-padding).
- TestASCII85RoundTripStdlib / TestASCII85EdgeCases: partial groups, z, whitespace, <~, invalid byte.
- TestStreamFilterChainArray: a chained [/ASCII85Decode /FlateDecode] filter.

Also removes the dead errors import (var _ = errors.New) from lzw.go.

Coverage
internal/filter 80.7% → 93.2%; decodeLZW, readBits, decodeASCII85 → 100%; paramsFromDict 0% → 67%; streamFilterChain 42% → 67%.