fix: normalize URLs before checking if the resources exist #1506
URLs were not normalized before performing existence checks, so percent-encoded URLs sometimes triggered `RSC-001` or `RSC-007` errors.

This commit introduces a new `normalize(URL)` method in the `URLUtils` class. Normalization is now used when checking a URL. This notably applies to resource and ID existence checks.

Important Note:
URL normalization is not well-defined. Some percent-encoding normalization is described in RFC 3986, but it is not defined in the URL Standard. Normalization (as useful for EPUBCheck) also depends on the URL scheme.
The normalization we apply is quite naïve and might need to be improved in the future. It should, however, cover the majority of real-world HTTP URL scenarios.
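
For readers unfamiliar with the percent-encoding normalization mentioned in RFC 3986 (section 6.2.2), here is a minimal sketch of the general idea. It is not the `normalize(URL)` method added to `URLUtils` in this PR, whose code is not shown here. The class name `PercentEncodingSketch` and the `String`-based signature are hypothetical, chosen only for illustration. The sketch uppercases the hex digits of percent-encoded triplets and decodes triplets that encode unreserved ASCII characters. It ignores scheme-specific rules entirely.

```java
// Hypothetical illustration only; not EPUBCheck's URLUtils.normalize(URL).
final class PercentEncodingSketch {

    // RFC 3986 section 2.3: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    private static boolean isUnreserved(int c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
            || (c >= '0' && c <= '9')
            || c == '-' || c == '.' || c == '_' || c == '~';
    }

    // Returns the value of an ASCII hex digit, or -1 if c is not one.
    private static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // Normalizes percent-encoded triplets in the given URL string:
    // - triplets encoding an unreserved character are decoded;
    // - all other triplets are kept, with uppercase hex digits;
    // - a '%' not followed by two hex digits is left untouched.
    static String normalize(String url) {
        StringBuilder out = new StringBuilder(url.length());
        int i = 0;
        while (i < url.length()) {
            char c = url.charAt(i);
            if (c == '%' && i + 2 < url.length()) {
                int hi = hexValue(url.charAt(i + 1));
                int lo = hexValue(url.charAt(i + 2));
                if (hi >= 0 && lo >= 0) {
                    int value = hi * 16 + lo;
                    if (isUnreserved(value)) {
                        out.append((char) value);
                    } else {
                        out.append('%')
                           .append(Character.toUpperCase(url.charAt(i + 1)))
                           .append(Character.toUpperCase(url.charAt(i + 2)));
                    }
                    i += 3;
                    continue;
                }
            }
            out.append(c);
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Two spellings of the same resource compare equal once normalized.
        String declared = normalize("images/caf%c3%a9.png");
        String referenced = normalize("images/caf%C3%A9.png");
        System.out.println(declared.equals(referenced));
    }
}
```

With a step like this applied to both the declared resource URL and the referencing URL before comparison, differently encoded spellings of the same path no longer look like a missing resource.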
Fix #1479