-
Notifications
You must be signed in to change notification settings - Fork 37.8k
build: Add linter checking for accidental introduction of locale dependence #13041
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@laanwj Do you think the |
8a19194
to
4ba8cc1
Compare
@laanwj |
toupper towlower towupper | ||
) | ||
REGEXP_IGNORE_EXTERNAL="src/(crypto/ctaes/|leveldb/|secp256k1/|tinyformat.h|univalue/)" | ||
REGEXP_IGNORE_KNOWN_VIOLATIONS="^(src/base58.cpp:.*isspace|src/qt/rpcconsole.cpp:.*isdigit|src/rest.cpp:.*strtol|src/torcontrol.cpp:.*strtol|src/uint256.cpp:.*isspace|src/uint256.cpp:.*tolower|src/util.cpp:.*tolower|src/utilmoneystr.cpp:.*isdigit|src/utilmoneystr.cpp:.*isspace|src/utilstrencodings.cpp:.*isspace|src/utilstrencodings.cpp:.*strtoll|src/utilstrencodings.cpp:.*strtol)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a shocking number of known violations.
I didn't know e.g. strtol
is also locale dependent, otherwise I wouldn't have used it in torcontrol for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to split this list over multiple lines, so that it's possible to remove an exception by removing a line?
4ba8cc1
to
d7c3267
Compare
@laanwj Updated version with the known violations list split over multiple lines. Please re-review :-) |
d7c3267
to
0b4965b
Compare
Candidates for inclusion in the list of locale dependent functions:
|
The use of |
0b4965b
to
29b73b5
Compare
Updated: Added more functions to Please re-review :-) |
utACK 29b73b509f1aa537a835f9e75878622c3a18335f |
29b73b5
to
b2af0ad
Compare
Updated:
Please re-review :-) |
1a99d9a
to
ddb1719
Compare
I think |
@laanwj You saw that Should I keep |
Oh that's good! No, hadn't noticed that yet.
Yes I think so! |
ddb1719
to
6271495
Compare
@laanwj Updated version with |
Thinking of it, it would make sense to add this list to the developer notes as well (next to the mention of the locale risks) , so it doesn't come out of the blue. |
echo | ||
EXIT_CODE=1 | ||
fi | ||
done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to simplify this by processing them at once? I'm skeptical grouping by function will be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Empact For performance or usability reasons?
If you mean for performance reasons, then please note that the costly operation git grep
is run only once.
From a usability perspective I think it makes sense to group the output per function – try running the script with an empty KNOWN_VIOLATIONS
and you'll see what I mean :-) But I'm open to alternative outputs, so please provide a diff and I'll check out your suggestion :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Speaking from a maintainability perspective - seems in the general case KNOWN_VIOLATIONS will be populated so the list will be relatively short and only introduced by they who open the PR, in which case the grouping isn't very beneficial IMO. If not much usability benefit is gained, then if the code can be simplified by removing it I think it's worthwhile to do so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Empact Do you have any specific output formatting in mind? Please give an example on how it would look :-)
utACK f947ef02f9c8eee5957bb02e12cee4edd636c9fa |
On Jun 5, 2018 11:15 AM, "Wladimir J. van der Laan" < notifications@github.com> wrote:
utACK f947ef0
<f947ef0>
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#13041 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAfpiNwEhc18PlReCb8tP5iuXoUiYLn6ks5t5sqtgaJpZM4TdXns>
.
|
Need to move |
f947ef0
to
698cfd0
Compare
@ken2812221 Thanks! Moved! @laanwj @cnavigato Please re-review :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK 698cfd0
…of locale dependence 698cfd0 docs: Mention lint-locale-dependence.sh in developer-notes.md (practicalswift) 0a4ea2f build: Add linter for checking accidental locale dependence (practicalswift) Pull request description: This linter will check for code accidentally introducing locale dependencies. Unnecessary locale dependence can cause bugs that are very tricky to isolate and fix. We should avoid using locale dependent functions if possible. Context: #12881 (comment) Example output: ``` $ contrib/devtools/lint-locale-dependence.sh The locale dependent function tolower(...) appears to be used: src/init.cpp: if (s[0] == '0' && std::tolower(s[1]) == 'x') { Unnecessary locale dependence can cause bugs that are very tricky to isolate and fix. Please avoid using locale dependent functions if possible. Advice not applicable in this specific case? Add an exception by updating the ignore list in contrib/devtools/lint-locale-dependence.sh ``` **Note to reviewers:** What is the most appropriate `LOCALE_DEPENDENT_FUNCTIONS` function list? What should be added or removed? Tree-SHA512: 14e448828804bb02bf59070647e38b52fce120c700c903a4a8472769a2cee5dd529bd3fc182386993cb8720482cf4250b63a0a477db61b941ae4babe5c65025f
14b29a7 Fix reference to lint-locale-dependence.sh (Hennadii Stepanov) Pull request description: The wrong reference sneaked through bitcoin#13041 and bitcoin#13281. Tree-SHA512: 39cc74297d00141206ce68b84575288a20e39e20ef717fa8b8c09076b5fb46d8281320b144a596094365f2a0704e5dd6bf960e35980ae4730546a72957403d69
…uction of locale dependence 698cfd0 docs: Mention lint-locale-dependence.sh in developer-notes.md (practicalswift) 0a4ea2f build: Add linter for checking accidental locale dependence (practicalswift) Pull request description: This linter will check for code accidentally introducing locale dependencies. Unnecessary locale dependence can cause bugs that are very tricky to isolate and fix. We should avoid using locale dependent functions if possible. Context: bitcoin#12881 (comment) Example output: ``` $ contrib/devtools/lint-locale-dependence.sh The locale dependent function tolower(...) appears to be used: src/init.cpp: if (s[0] == '0' && std::tolower(s[1]) == 'x') { Unnecessary locale dependence can cause bugs that are very tricky to isolate and fix. Please avoid using locale dependent functions if possible. Advice not applicable in this specific case? Add an exception by updating the ignore list in contrib/devtools/lint-locale-dependence.sh ``` **Note to reviewers:** What is the most appropriate `LOCALE_DEPENDENT_FUNCTIONS` function list? What should be added or removed? Tree-SHA512: 14e448828804bb02bf59070647e38b52fce120c700c903a4a8472769a2cee5dd529bd3fc182386993cb8720482cf4250b63a0a477db61b941ae4babe5c65025f
…uction of locale dependence 698cfd0 docs: Mention lint-locale-dependence.sh in developer-notes.md (practicalswift) 0a4ea2f build: Add linter for checking accidental locale dependence (practicalswift) Pull request description: This linter will check for code accidentally introducing locale dependencies. Unnecessary locale dependence can cause bugs that are very tricky to isolate and fix. We should avoid using locale dependent functions if possible. Context: bitcoin#12881 (comment) Example output: ``` $ contrib/devtools/lint-locale-dependence.sh The locale dependent function tolower(...) appears to be used: src/init.cpp: if (s[0] == '0' && std::tolower(s[1]) == 'x') { Unnecessary locale dependence can cause bugs that are very tricky to isolate and fix. Please avoid using locale dependent functions if possible. Advice not applicable in this specific case? Add an exception by updating the ignore list in contrib/devtools/lint-locale-dependence.sh ``` **Note to reviewers:** What is the most appropriate `LOCALE_DEPENDENT_FUNCTIONS` function list? What should be added or removed? Tree-SHA512: 14e448828804bb02bf59070647e38b52fce120c700c903a4a8472769a2cee5dd529bd3fc182386993cb8720482cf4250b63a0a477db61b941ae4babe5c65025f
…se of uninitialized memory 870f0cd build: Add MemorySanitizer (MSan) in Travis to detect use of uninitialized memory (practicalswift) Pull request description: Add MemorySanitizer (MSan) in Travis to detect use of uninitialized memory. First UBSan, then ASan followed by TSan... and now: yes, the wait is over -- **MSan is finally here!** :) Some historical context: * 2017: Continuous compilation with Clang Thread Safety analysis enabled (#10866, #10923) * 2018: Continuous testing with trapping on signed integer overflows (`-ftrapv`) (#12686) * 2018: Continuous testing of use of locale dependent functions (#13041) * 2018: Continuous testing of format strings (#13705) * 2018: Continuous compilation with MSVC `TreatWarningAsError` (#14151) * 2018: Continuous testing under UndefinedBehaviorSanitizer – UBSan (#14252, #14673, #17006) * 2018: Continuous testing under AddressSanitizer – ASan (#14794, #17205, #17674) * 2018: Continuous testing under ThreadSanitizer – TSan (#14829) * 2019: Continuous testing in an unsigned char environment (`-funsigned-char`) (#15134) * 2019: Continuous compile-time testing of assumptions we're making (#15391) * 2019: Continuous testing of fuzz test cases under Valgrind (#17633, #18159, #18166) * 2020: Finally... MemorySanitizer – MSAN! :) What is the next step? What tools should we add to CI to keep bugs from entering `master`? :) ACKs for top commit: MarcoFalke: ACK 870f0cd Tree-SHA512: 38327c8b75679d97d469fe42e704cacd1217447a5a603701dd8a58ee50b3be2c10248f8d68a479ed081c0c4b254589d3081c9183f991640b06ef689061f75578
Backport useful lints from upstream Cherry-picked from the following upstream PRs: - bitcoin/bitcoin#6892 - bitcoin/bitcoin#11151 - bitcoin/bitcoin#11300 - bitcoin/bitcoin@96d91b7 - bitcoin/bitcoin#12097 - bitcoin/bitcoin#12098 - bitcoin/bitcoin#12442 - bitcoin/bitcoin#12572 - bitcoin/bitcoin#12757 - bitcoin/bitcoin#11878 - bitcoin/bitcoin#12933 - bitcoin/bitcoin#12871 - bitcoin/bitcoin#12972 - bitcoin/bitcoin#13281 - bitcoin/bitcoin#13385 - bitcoin/bitcoin#13041 - bitcoin/bitcoin#13454 - bitcoin/bitcoin#13448 - bitcoin/bitcoin#13510 - bitcoin/bitcoin#13851 - bitcoin/bitcoin#13863 - bitcoin/bitcoin#14115 - bitcoin/bitcoin#14831 - bitcoin/bitcoin#15164 - bitcoin/bitcoin#15170 - bitcoin/bitcoin#15166 - bitcoin/bitcoin#16036 - bitcoin/bitcoin#16768 Several of the lints fail for our current codebase; these will be addressed in a subsequent PR.
Backport useful lints from upstream Cherry-picked from the following upstream PRs: - bitcoin/bitcoin#6892 - bitcoin/bitcoin#11151 - bitcoin/bitcoin#11300 - bitcoin/bitcoin@96d91b7 - bitcoin/bitcoin#12097 - bitcoin/bitcoin#12098 - bitcoin/bitcoin#12442 - bitcoin/bitcoin#12572 - bitcoin/bitcoin#12757 - bitcoin/bitcoin#11878 - bitcoin/bitcoin#12933 - bitcoin/bitcoin#12871 - bitcoin/bitcoin#12972 - bitcoin/bitcoin#13281 - bitcoin/bitcoin#13385 - bitcoin/bitcoin#13041 - bitcoin/bitcoin#13454 - bitcoin/bitcoin#13448 - bitcoin/bitcoin#13510 - bitcoin/bitcoin#13851 - bitcoin/bitcoin#13863 - bitcoin/bitcoin#14115 - bitcoin/bitcoin#14831 - bitcoin/bitcoin#15164 - bitcoin/bitcoin#15170 - bitcoin/bitcoin#15166 - bitcoin/bitcoin#16036 - bitcoin/bitcoin#16768 - bitcoin/bitcoin#13494 Several of the lints fail for our current codebase; these will be addressed in a subsequent PR.
14b29a7 Fix reference to lint-locale-dependence.sh (Hennadii Stepanov) Pull request description: The wrong reference sneaked through bitcoin#13041 and bitcoin#13281. Tree-SHA512: 39cc74297d00141206ce68b84575288a20e39e20ef717fa8b8c09076b5fb46d8281320b144a596094365f2a0704e5dd6bf960e35980ae4730546a72957403d69
14b29a7 Fix reference to lint-locale-dependence.sh (Hennadii Stepanov) Pull request description: The wrong reference sneaked through bitcoin#13041 and bitcoin#13281. Tree-SHA512: 39cc74297d00141206ce68b84575288a20e39e20ef717fa8b8c09076b5fb46d8281320b144a596094365f2a0704e5dd6bf960e35980ae4730546a72957403d69
14b29a7 Fix reference to lint-locale-dependence.sh (Hennadii Stepanov) Pull request description: The wrong reference sneaked through bitcoin#13041 and bitcoin#13281. Tree-SHA512: 39cc74297d00141206ce68b84575288a20e39e20ef717fa8b8c09076b5fb46d8281320b144a596094365f2a0704e5dd6bf960e35980ae4730546a72957403d69
14b29a7 Fix reference to lint-locale-dependence.sh (Hennadii Stepanov) Pull request description: The wrong reference sneaked through bitcoin#13041 and bitcoin#13281. Tree-SHA512: 39cc74297d00141206ce68b84575288a20e39e20ef717fa8b8c09076b5fb46d8281320b144a596094365f2a0704e5dd6bf960e35980ae4730546a72957403d69
…etect use of uninitialized memory 870f0cd build: Add MemorySanitizer (MSan) in Travis to detect use of uninitialized memory (practicalswift) Pull request description: Add MemorySanitizer (MSan) in Travis to detect use of uninitialized memory. First UBSan, then ASan followed by TSan... and now: yes, the wait is over -- **MSan is finally here!** :) Some historical context: * 2017: Continuous compilation with Clang Thread Safety analysis enabled (bitcoin#10866, bitcoin#10923) * 2018: Continuous testing with trapping on signed integer overflows (`-ftrapv`) (bitcoin#12686) * 2018: Continuous testing of use of locale dependent functions (bitcoin#13041) * 2018: Continuous testing of format strings (bitcoin#13705) * 2018: Continuous compilation with MSVC `TreatWarningAsError` (bitcoin#14151) * 2018: Continuous testing under UndefinedBehaviorSanitizer – UBSan (bitcoin#14252, bitcoin#14673, bitcoin#17006) * 2018: Continuous testing under AddressSanitizer – ASan (bitcoin#14794, bitcoin#17205, bitcoin#17674) * 2018: Continuous testing under ThreadSanitizer – TSan (bitcoin#14829) * 2019: Continuous testing in an unsigned char environment (`-funsigned-char`) (bitcoin#15134) * 2019: Continuous compile-time testing of assumptions we're making (bitcoin#15391) * 2019: Continuous testing of fuzz test cases under Valgrind (bitcoin#17633, bitcoin#18159, bitcoin#18166) * 2020: Finally... MemorySanitizer – MSAN! :) What is the next step? What tools should we add to CI to keep bugs from entering `master`? :) ACKs for top commit: MarcoFalke: ACK 870f0cd Tree-SHA512: 38327c8b75679d97d469fe42e704cacd1217447a5a603701dd8a58ee50b3be2c10248f8d68a479ed081c0c4b254589d3081c9183f991640b06ef689061f75578
…uction of locale dependence 698cfd0 docs: Mention lint-locale-dependence.sh in developer-notes.md (practicalswift) 0a4ea2f build: Add linter for checking accidental locale dependence (practicalswift) Pull request description: This linter will check for code accidentally introducing locale dependencies. Unnecessary locale dependence can cause bugs that are very tricky to isolate and fix. We should avoid using locale dependent functions if possible. Context: bitcoin#12881 (comment) Example output: ``` $ contrib/devtools/lint-locale-dependence.sh The locale dependent function tolower(...) appears to be used: src/init.cpp: if (s[0] == '0' && std::tolower(s[1]) == 'x') { Unnecessary locale dependence can cause bugs that are very tricky to isolate and fix. Please avoid using locale dependent functions if possible. Advice not applicable in this specific case? Add an exception by updating the ignore list in contrib/devtools/lint-locale-dependence.sh ``` **Note to reviewers:** What is the most appropriate `LOCALE_DEPENDENT_FUNCTIONS` function list? What should be added or removed? Tree-SHA512: 14e448828804bb02bf59070647e38b52fce120c700c903a4a8472769a2cee5dd529bd3fc182386993cb8720482cf4250b63a0a477db61b941ae4babe5c65025f
This linter will check for code accidentally introducing locale dependencies.
Unnecessary locale dependence can cause bugs that are very tricky to isolate and fix. We should avoid using locale dependent functions if possible.
Context: #12881 (comment)
Example output:
Note to reviewers: What is the most appropriate
LOCALE_DEPENDENT_FUNCTIONS
function list? What should be added or removed?