rdeltour
Member

@rdeltour rdeltour commented May 1, 2020

Unless I'm missing something, the spec doesn’t say that a reading system MUST or SHOULD support EPUB encryption.

EPUBCheck currently cannot decrypt encrypted or obfuscated resources. Previously, it reported any attempt to read an encrypted resource as an error, RSC-004. Internally, the code declared obfuscated resources as readable, even though it cannot actually read them, so that obfuscated fonts would not be reported as errors.
This is problematic for resources that EPUBCheck will try to parse deeper down the validation workflow (like SVG). The resource is declared as readable, so EPUBCheck tries to parse it, and fails with a NullPointerException (issue #1077).
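
A minimal sketch of that failure mode, under stated assumptions: `ResourceFilter`, `canDecrypt`, `decrypt`, and `parse` are hypothetical names for illustration and not EPUBCheck's actual API, and the assumption that the filter hands back a `null` stream is a guess at one way such a crash can arise.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class LyingFilterSketch {
    /** Hypothetical stand-in for a decryption filter; not EPUBCheck's real interface. */
    interface ResourceFilter {
        boolean canDecrypt(String path);
        InputStream decrypt(InputStream in); // assumed to return null when unsupported
    }

    /** Claims the resource is readable but has nothing behind the claim. */
    static final ResourceFilter LYING = new ResourceFilter() {
        public boolean canDecrypt(String path) { return true; }
        public InputStream decrypt(InputStream in) { return null; }
    };

    /** Naive parsing step: trusts canDecrypt and reads without a null check. */
    static String parse(ResourceFilter filter, String path, InputStream raw) {
        if (!filter.canDecrypt(path)) {
            return "skipped";
        }
        try {
            InputStream in = filter.decrypt(raw);
            in.read(); // throws NullPointerException when decrypt returned null
            return "parsed";
        } catch (NullPointerException e) {
            return "NullPointerException";
        } catch (IOException e) {
            return "IOException";
        }
    }

    public static void main(String[] args) {
        InputStream raw = new ByteArrayInputStream(new byte[] {1, 2, 3});
        System.out.println(parse(LYING, "EPUB/emoji.svg", raw));
    }
}
```

The point of the sketch is only that a filter answering "readable" for content it cannot read pushes the failure into whichever later step first touches the content.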

The inability to read encrypted or obfuscated resources is a limitation of EPUBCheck. EPUBCheck should:

  • report the inability to read the resource as an informative-only message
  • do its best to carry on the validation

This PR tries to do that by downgrading the severity of RSC-004 to USAGE. It also updates the decryption filters to accurately report that they cannot read the content. Validation of encrypted or obfuscated resources will then abort early, before trying to parse the content.
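
As a rough illustration of the behavior described above (not the actual patch), the sketch below has a filter that honestly reports it cannot decrypt, and a validation step that records RSC-004 at a configurable severity and returns before parsing. All type and method names here (`ResourceFilter`, `Result`, `validate`, the local `Severity` enum) are assumptions made for the example, and the message format is simplified.

```java
import java.util.ArrayList;
import java.util.List;

public class EarlyAbortSketch {
    // Local stand-in for severity levels; not EPUBCheck's own type.
    enum Severity { ERROR, WARNING, INFO, USAGE }

    // Hypothetical filter contract: answers honestly whether it can decrypt.
    interface ResourceFilter {
        boolean canDecrypt(String path);
    }

    // A filter without decryption support now says so.
    static final ResourceFilter HONEST = path -> false;

    static final class Result {
        final List<String> messages = new ArrayList<>();
        boolean parsed = false;
    }

    // Reports RSC-004 at the given severity and stops before any parsing
    // when the resource is encrypted and the filter cannot decrypt it.
    static Result validate(ResourceFilter filter, String path,
                           boolean encrypted, Severity rsc004) {
        Result result = new Result();
        if (encrypted && !filter.canDecrypt(path)) {
            result.messages.add(rsc004 + "(RSC-004): File \"" + path
                    + "\" could not be decrypted.");
            return result; // abort early: the content is never parsed
        }
        result.parsed = true; // placeholder for the real parsing step
        return result;
    }

    public static void main(String[] args) {
        Result r = validate(HONEST, "EPUB/emoji.svg", true, Severity.USAGE);
        System.out.println(r.messages + " parsed=" + r.parsed);
    }
}
```

Passing the severity as a parameter keeps the sketch neutral about which level the message should finally use.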

Obfuscated fonts, which are likely the most common use of obfuscation, will still not be reported as warnings (see issue #220). The only difference is that they will now trigger the RSC-004 usage message.

Fixes #1077

A USAGE message better describes EPUBCheck’s inability to decrypt or
de-obfuscate a resource.

- All the incomplete `EncryptionFilter` implementations now accurately
report they cannot decrypt related content.
- Validation will abort early for encrypted (or obfuscated) files,
reporting RSC-004 as a USAGE message.

Fixes #1077, caused by an attempt to parse an obfuscated SVG document
which was falsely assumed decryptable by the lying `EncryptionFilter`
(boo).
@rdeltour rdeltour added this to the 4.2.3 milestone May 1, 2020
@rdeltour rdeltour requested a review from mattgarrish May 1, 2020 08:36
@rdeltour rdeltour self-assigned this May 1, 2020
@mattgarrish
Member

I'm kind of torn on this one. Issuing a warning sounds like the right thing to do, not because of anything the specification says, but because it can lead to content slipping by without being thoroughly checked. The usage message isn't going to show up unless you turn it on, right?

It sort of seems like something the person validating should silence through a flag, so that it's a conscious decision to ignore the encrypted/obfuscated content and any problems that might exist in it.

@rdeltour
Member Author

rdeltour commented May 1, 2020

Maybe using INFO instead would be appropriate then? INFO messages are displayed by the command line by default. In a way, the message is information about EPUBCheck’s limitation.

@rdeltour
Member Author

rdeltour commented May 1, 2020

@mattgarrish I updated the PR.

Running it on the obfuscated SVG test will give the following output:

$ epubcheck --mode exp src/test/resources/30/expanded/valid/container-obfuscation-svg-valid
Validating using EPUB version 3.2 rules.
INFO(RSC-004): ./src/test/resources/30/expanded/valid/container-obfuscation-svg-valid.epub(-1,-1): File "EPUB/emoji.svg" could not be decrypted.
No errors or warnings detected.
Messages: 0 fatals / 0 errors / 0 warnings / 1 info

EPUBCheck completed

Member

@mattgarrish mattgarrish left a comment

Works for me!

@rdeltour rdeltour changed the title from "feat: downgrade RSC-004 (cannot decrypt resource) to USAGE" to "feat: downgrade RSC-004 (cannot decrypt resource) to INFO" May 1, 2020
@rdeltour rdeltour merged commit e732068 into master May 1, 2020
@rdeltour rdeltour deleted the fix/1077/obfuscated-svg branch May 1, 2020 15:08
karenhanson added a commit to karenhanson/jhove that referenced this pull request Oct 21, 2020
Two changes to EPUBCheck were causing the tests to change:
The first changes the error messages returned for validation of a non-EPUB:
w3c/epubcheck#1134
The solution was to change the message count.
The second downgrades the encrypted file message to INFO
w3c/epubcheck#1136
The original implementation of the JHOVE EPUB module did not include INFO messages generated by EPUBCheck in the JHOVE report, but looking at the kinds of messages that might be listed as INFO messages I thought it might be useful to include them. The change to add INFO messages is reflected in this commit.
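
To make the reporting change concrete, here is a small sketch of filtering validator messages by severity before they reach a report. It is not the JHOVE module's code: `Message`, `reportedIds`, and the local `Severity` enum are hypothetical names, and the sets of included severities are an assumed reading of "did not include INFO" versus "includes INFO".

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

public class InfoReportSketch {
    // Local stand-in for message severities; not a JHOVE or EPUBCheck type.
    enum Severity { FATAL, ERROR, WARNING, INFO }

    static final class Message {
        final Severity severity;
        final String id;

        Message(Severity severity, String id) {
            this.severity = severity;
            this.id = id;
        }
    }

    // Keeps only the IDs of messages whose severity is in the included set.
    static List<String> reportedIds(List<Message> messages, Set<Severity> included) {
        List<String> ids = new ArrayList<>();
        for (Message m : messages) {
            if (included.contains(m.severity)) {
                ids.add(m.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Message> messages = List.of(new Message(Severity.INFO, "RSC-004"));

        // Assumed earlier behavior: INFO messages are left out of the report.
        Set<Severity> withoutInfo =
                EnumSet.of(Severity.FATAL, Severity.ERROR, Severity.WARNING);
        // Assumed behavior after the change: INFO messages are kept.
        Set<Severity> withInfo = EnumSet.allOf(Severity.class);

        System.out.println(reportedIds(messages, withoutInfo));
        System.out.println(reportedIds(messages, withInfo));
    }
}
```

With the encrypted-file message now at INFO, a report that drops INFO would silently lose it, which is the case the commit message above addresses.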

Successfully merging this pull request may close these issues.

null pointer exception on valid ePub w/ obfuscated SVG