#1298 - Fix in LZWDecoder when valid PDF stream data does not start with LZW Clear Table code (256) #1299

Merged

scottmore (Contributor) commented May 1, 2025

This problem occurs in PostScript calculator functions in several PDFs we have encountered. The changes include unit tests that cover both the common LZW case and this less common scenario.

Description of the new Feature/Bugfix

The fix simplifies the main decoding loop in LZWDecoder and explicitly detects when there is no "oldCode", i.e. no previous code. Hopefully the decode() method is now slightly easier to follow, and it supports the valid scenario where the LZW-encoded data does not start with the Clear Table (256) special code.
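To illustrate the idea, here is a minimal sketch of an LZW decoding loop that treats "no previous code" as an explicit state, so the input may or may not begin with the Clear Table code. It is not the actual OpenPDF implementation: the class `LzwSketch`, the method `decodeCodes`, and the `NO_CODE` sentinel are hypothetical names, and the sketch works on already-unpacked integer codes, so bit unpacking, variable code widths, and the 4096-entry table limit are all left out.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the OpenPDF LZWDecoder.
final class LzwSketch {
    static final int CLEAR_TABLE = 256;
    static final int END_OF_DATA = 257;
    static final int NO_CODE = -1; // assumed sentinel meaning "no previous code"

    // Fresh table: 256 single-byte entries plus placeholders for codes 256 and 257.
    static List<byte[]> newTable() {
        List<byte[]> table = new ArrayList<>();
        for (int i = 0; i < 256; i++) {
            table.add(new byte[] {(byte) i});
        }
        table.add(new byte[0]); // 256: Clear Table
        table.add(new byte[0]); // 257: End of Data
        return table;
    }

    static byte[] append(byte[] prefix, byte b) {
        byte[] out = Arrays.copyOf(prefix, prefix.length + 1);
        out[prefix.length] = b;
        return out;
    }

    // Decodes a sequence of already-unpacked LZW codes.
    static byte[] decodeCodes(int[] codes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        List<byte[]> table = newTable();
        int oldCode = NO_CODE; // nothing seen yet, even without a leading Clear Table code
        for (int code : codes) {
            if (code == END_OF_DATA) {
                break;
            }
            if (code == CLEAR_TABLE) {
                table = newTable();
                oldCode = NO_CODE;
                continue;
            }
            byte[] entry;
            if (oldCode == NO_CODE) {
                // First data code after start or after a clear: it must already be in the table.
                if (code >= table.size()) {
                    throw new IllegalArgumentException("Invalid first code: " + code);
                }
                entry = table.get(code);
            } else {
                byte[] previous = table.get(oldCode);
                if (code < table.size()) {
                    entry = table.get(code);
                } else if (code == table.size()) {
                    // Code not yet in the table: previous string plus its own first byte.
                    entry = append(previous, previous[0]);
                } else {
                    throw new IllegalArgumentException("Invalid code: " + code);
                }
                table.add(append(previous, entry[0]));
            }
            out.write(entry, 0, entry.length);
            oldCode = code;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        int[] withoutClear = {65, 66, 258, 260};
        int[] withClear = {CLEAR_TABLE, 65, 66, 258, 260, END_OF_DATA};
        System.out.println(new String(decodeCodes(withoutClear), StandardCharsets.US_ASCII));
        System.out.println(new String(decodeCodes(withClear), StandardCharsets.US_ASCII));
    }
}
```

Both code sequences in `main` decode to the same text; the only difference is whether the stream opens with code 256. Keeping the "no previous code" check in one branch means the start of the stream and the position right after a Clear Table code share the same path.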

Related Issue: #1298

Unit-Tests for the new Feature/Bugfix

Three unit tests were added:

  • [LZWDecoderTest.shouldDecodeType4PSCalcFunction1] Highlights the problem of LZWDecoder producing garbled output before the fix.
  • [LZWDecoderTest.shouldDecodeType4PSCalcFunction2] Highlights another possible result of the same problem: LZWDecoder throwing a null exception before the fix.
  • [LZWDecoderTest.shouldDecodeCmapData] A standard LZW-encoded data sample that worked before and needs to keep working. This is a CMap dictionary from a PDF stream that starts with the common Clear Table code.

Compatibility Issues

No compatibility issues or changes to function signatures. The fix is internal to the LZWDecoder.decode() method.

Your real name

Scott More

Testing details

Nothing specific. All PDFs with LZW decode streams should work with no data corruption or exceptions now.

scottmore added 4 commits May 1, 2025 09:24
…art with LZW Clear Table code (256), occurs in PostScript calculator functions in several PDFs we have encountered, changes include unit tests to cover both common LZW and this more uncommon scenario
…t throw RuntimeException and silently proceed, not as big a fan of this change but cannot get past static code analyzers without it

sonarqubecloud bot commented May 1, 2025

@scottmore changed the title from "Issue #1298 - Fix in LZWDecoder when valid PDF stream data does not start with LZW Clear Table code (256)" to "#1298 - Fix in LZWDecoder when valid PDF stream data does not start with LZW Clear Table code (256)" on May 1, 2025
@andreasrosdal andreasrosdal merged commit 4a8f2de into LibrePDF:master May 14, 2025
4 checks passed