Contributor

@bplaxco bplaxco commented Apr 16, 2025

Description:

Add percent decoding (aka URL decoding) support.

Goals:

  • Be fast
  • Avoid adding many extra passes over the content being scanned
  • Make it easier to add more formats in the future
  • Nice to have: handle cases where one encoding is partially nested inside another[1]

[1]: For example:

This aGVsbG8sIHdvcmxkIQ%3D%3D%0A should decode first to aGVsbG8sIHdvcmxkIQ== and then to hello, world!. But this hello%2C%20world%21 aGVsbG8sIHdvcmxkIQ== shouldn't need two passes to be fully decoded, because its two encoded segments sit side by side rather than one inside the other.
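To make the footnote concrete, here is a minimal sketch of the two cases using only the Go standard library. The helper names `percentDecodeOnce` and `decodeNested` are hypothetical and exist only for illustration; this is not the decoder added in this PR, which works on segments and tracks positions.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net/url"
	"strings"
)

// percentDecodeOnce is a hypothetical helper: one percent-decoding pass.
// url.PathUnescape is used rather than url.QueryUnescape so that a literal
// '+' (valid in base64) is not turned into a space.
func percentDecodeOnce(s string) (string, error) {
	return url.PathUnescape(s)
}

// decodeNested is a hypothetical helper for the nested case: the base64
// padding and trailing newline are percent-encoded, so the percent pass
// has to run before the base64 pass can succeed.
func decodeNested(s string) (string, error) {
	unescaped, err := percentDecodeOnce(s)
	if err != nil {
		return "", err
	}
	// %0A decodes to a newline, which is not part of the base64 payload.
	raw, err := base64.StdEncoding.DecodeString(strings.TrimSpace(unescaped))
	if err != nil {
		return "", err
	}
	return string(raw), nil
}

func main() {
	// Nested: percent-encoding wraps part of a base64 string.
	nested, err := decodeNested("aGVsbG8sIHdvcmxkIQ%3D%3D%0A")
	fmt.Printf("%q %v\n", nested, err)

	// Side by side: the percent-encoded segment and the base64 segment do
	// not overlap, so neither depends on the other being decoded first.
	flat, err := percentDecodeOnce("hello%2C%20world%21 aGVsbG8sIHdvcmxkIQ==")
	fmt.Printf("%q %v\n", flat, err)
}
```

In the second input the base64 segment is left untouched by the percent pass, which is why both segments could in principle be handled in the same pass.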

TODO

  • Get working base implementation
  • Update docs
  • Refactor
  • (Extra) add hex encoding
  • Add tests in detect_test.go (make sure to test multiple levels and that the parent child relationship works properly with segments)
  • Add tests to confirm detection for when to decode things in multiple steps works as expected
  • Try to think of some edge cases to add to the tests (@zricethezav && @rgmz suggestions welcome ^_^)
  • Performance test, tune & refactor
  • Fix scan bug when scanning gitleaks repo

Checklist:

  • Does your PR pass tests?
  • Have you written new tests for your changes?
  • Have you linted your code locally prior to submission?

@bplaxco bplaxco force-pushed the urldecoder branch 5 times, most recently from bef24cd to 7a0d3f5 Compare April 19, 2025 05:54
@bplaxco bplaxco force-pushed the urldecoder branch 8 times, most recently from 7e1eefd to 5ebf6a7 Compare May 2, 2025 07:56
@bplaxco bplaxco force-pushed the urldecoder branch 7 times, most recently from b4ddd81 to b62dba3 Compare May 9, 2025 02:43
@bplaxco bplaxco changed the title [WIP] Urldecoding support Urldecoding support May 11, 2025
@bplaxco bplaxco changed the title Urldecoding support [WIP] Urldecoding support May 11, 2025
@bplaxco bplaxco changed the title [WIP] Urldecoding support Urldecoding support May 12, 2025
@bplaxco bplaxco changed the title Urldecoding support [WIP] Urldecoding support May 12, 2025
@bplaxco bplaxco changed the title [WIP] Urldecoding support Percent/URL Decoding Support May 12, 2025
Collaborator

@zricethezav zricethezav left a comment

At this point I'm convinced @bplaxco is some kind of wizard or time traveling AI

}

segments = append(segments, segment)
logging.Debug().Msgf(
Collaborator

love this debug message

@zricethezav zricethezav merged commit 78eebac into gitleaks:master May 12, 2025
2 checks passed
@bplaxco bplaxco deleted the urldecoder branch May 18, 2025 02:34
alayne222 pushed a commit to alayne222/gitleaks that referenced this pull request May 28, 2025
* Add initial percent decoding support

* Refactor multi encoding support

* Add detect tests to confirm positions

* Avoid a few extra passes during decoding

* Do multiple passes for finding encodings re

* Fix issue with overlapping encodings when doing separate passes
Comment on lines +28 to +30
if err == nil && isPrintableASCII(decodedValue) {
	return string(decodedValue)
}
Contributor

@bplaxco Hypothetically, couldn't there be chunks that have both standard and url encoding? I guess since this is recursive it should catch both cases?
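For readers following the question: assuming "standard and url encoding" refers to base64's two alphabets (`+` and `/` versus `-` and `_`), which is an interpretation and not something the thread confirms, the rough sketch below tries each alphabet in turn and accepts the first result that is printable ASCII. `decodeBase64Either` is a hypothetical name, and `isPrintableASCII` is reimplemented here as a guess at what the helper in the quoted lines checks; neither is taken from the PR.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// isPrintableASCII is an assumed reimplementation: non-empty input made of
// printable ASCII bytes plus tab, newline and carriage return.
func isPrintableASCII(b []byte) bool {
	if len(b) == 0 {
		return false
	}
	for _, c := range b {
		if (c < 0x20 || c > 0x7e) && c != '\t' && c != '\n' && c != '\r' {
			return false
		}
	}
	return true
}

// decodeBase64Either is a hypothetical helper that tries the standard and
// URL-safe alphabets, padded and unpadded, and returns the first decoding
// that yields printable ASCII.
func decodeBase64Either(s string) (string, bool) {
	encodings := []*base64.Encoding{
		base64.StdEncoding,
		base64.URLEncoding,
		base64.RawStdEncoding,
		base64.RawURLEncoding,
	}
	for _, enc := range encodings {
		decoded, err := enc.DecodeString(s)
		if err == nil && isPrintableASCII(decoded) {
			return string(decoded), true
		}
	}
	return "", false
}

func main() {
	for _, in := range []string{"aGVsbG8sIHdvcmxkIQ==", "Pz8_", "AAAA"} {
		out, ok := decodeBase64Either(in)
		fmt.Printf("%q -> %q (%v)\n", in, out, ok)
	}
}
```

A single token that mixes characters from both alphabets (for example both `+` and `_`) would be rejected by every encoding in this sketch, so whether such chunks get caught depends on how the real decoder splits and revisits segments.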
