Skip to content

Conversation

adrianosela
Copy link
Contributor

[CVE-2024-6844] Replace use of (urllib) unquote_plus with unquote

The unquote_plus function was being used here to normalize the request path before applying pattern matching against defined CORS resources.

BUT unquote_plus() decodes + signs into spaces, which is NOT appropriate for URL paths. The + character is not reserved in URL paths — it can be a valid part of the path.

So we want to use unquote instead of unquote_plus (which does not decode + into spaces.

@codecov-commenter
Copy link

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

Thanks for integrating Codecov - We've got you covered ☂️

@LewisCowlesMotive
Copy link

unquote_plus() decodes + signs into spaces, which is NOT appropriate for URL paths. The + character is not reserved in URL paths — it can be a valid part of the path.

I've been trying to understand how this is not appropriate for paths, and have found nothing stating this in any RFC's...

Is it because it is omitted in https://datatracker.ietf.org/doc/html/rfc3986#section-5.2.2 that you believe it is not appropriate, or is that called out explicitly anywhere?

Is there more specific text in any RFC you can point to which explicitly says that this is the case?

There is some old Java notes: https://blog.lunatech.com/posts/2009-02-03-what-every-web-developer-must-know-about-url-encoding which states the above position, but it's not referenced anywhere officially, so I feel like it has very little to do with HTTP specification.

Thanks for doing this; I would just like to understand more about how the commit
67c4b2c introduced a bug, and if there might be anything that can be added to this, to prevent future regressions

@corydolphin
Copy link
Owner

First, thank you very much for the contribution and fix! I really appreciate the help and support maintaining this library.

Second, can you please add unit tests to codify the behavior? If you are able to add some tests to verify the new behavior, I'll happily merge!

@corydolphin corydolphin self-requested a review May 15, 2025 04:09
Copy link
Owner

@corydolphin corydolphin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add unit tests!

@adrianosela
Copy link
Contributor Author

@LewisCowlesMotive I mostly went off of confidence of knowing that the + symbol is valid in paths. The reason why unquote_plus() exists is because in query strings the + symbol is used to denote spaces. For example when searching for something on google, the resulting URL is https://www.google.com/search?q=what+is+my+IP where the + symbols are spaces in the query.

But for paths the plus sign is a valid character e.g. abc.com/this/is/+/valid i.e. the + symbol here is NOT to be decoded as a different character other than +. This is why using unquote_plus for paths is not appropriate.


With respect to actual evidence, the authoritative spec for URI syntax is RFC 3986:

Note: relevant summary straight from chatGPT but Ive verified it

From Section 3.3 - definition of a path:

path          = path-abempty    ; begins with "/" or is empty
              / path-absolute   ; begins with "/" but not "//"
              / path-noscheme   ; begins with a non-colon segment
              / path-rootless   ; begins with a segment
              / path-empty      ; zero characters

Each path-* variant is made up of segment components, defined as:

segment       = *pchar

and:

pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

From section 2.3 - definition of a subdelimeter:

sub-delims  = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="

Here it is: + is explicitly listed as a sub-delimiter.

Since pchar includes sub-delims, and segment includes pchar, and paths are made of segments:

Therefore, the + character is valid and allowed in path segments.

@adrianosela
Copy link
Contributor Author

Hey @corydolphin -- some unit tests added. I ran the tests with the old code (e.g. unquote_plus) and saw the expected failures:

________________________________ AppExtensionPlusInPath.test_plus_path_origin_allowed ________________________________

self = <tests.extension.test_app_extension.AppExtensionPlusInPath testMethod=test_plus_path_origin_allowed>

    def test_plus_path_origin_allowed(self):
        '''
        Ensure that CORS matches + literally and allows the correct origin
        '''
        response = self.client.get('/service+path', headers={'Origin': 'http://foo.com'})
        self.assertEqual(response.status_code, 200)
>       self.assertEqual(response.headers.get(ACL_ORIGIN), 'http://foo.com')
E       AssertionError: None != 'http://foo.com'

tests/extension/test_app_extension.py:410: AssertionError
_____________________________ AppExtensionPlusInPath.test_plus_path_rejects_other_origin _____________________________

self = <tests.extension.test_app_extension.AppExtensionPlusInPath testMethod=test_plus_path_rejects_other_origin>

    def test_plus_path_rejects_other_origin(self):
        '''
        Origin not allowed for + path should be rejected
        '''
        response = self.client.get('/service+path', headers={'Origin': 'http://bar.com'})
        self.assertEqual(response.status_code, 200)
>       self.assertIsNone(response.headers.get(ACL_ORIGIN))
E       AssertionError: 'http://bar.com' is not None

tests/extension/test_app_extension.py:426: AssertionError

@LewisCowlesMotive
Copy link

This is great thank you. I read the spec. I Think I understand your point better now. Initially I read it as plus should not be in paths. I then re-read it and read up on it's origin from browsers. So while folks might be able to interpret the path, it's non-standard and nothing to do with HTTP or CORS. Thanks.

@adrianosela
Copy link
Contributor Author

Hi @corydolphin, how's it going? Would you be able to merge this lil guy and cut a new release? Otherwise, any other tests you would like to see?

Copy link
Owner

@corydolphin corydolphin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for the discussion and fixes! You all make software development great!

@LewisCowlesMotive let me know if you are interested in becoming a maintainer and we can discuss transition processes. I frankly don’t focus on Flask and may no longer be the best person to maintain this.

@corydolphin corydolphin merged commit 35d8753 into corydolphin:main May 17, 2025
7 checks passed
@adrianosela adrianosela deleted the CVE-2024-6844-fix branch May 17, 2025 19:18
@adrianosela
Copy link
Contributor Author

Hey @corydolphin, I assume you meant to tag me? If so -- I would be interested in sharing the burden but not entirely taking it over. I'll send you an email to follow up outside of this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CVE-2024-6844
4 participants