-
Notifications
You must be signed in to change notification settings - Fork 164
Add fast path for ASCII case folding #159
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I've cherry-picked the newly added benchmarks on top of master as well for a baseline comparison (this is available at https://github.com/mszabo-wikia/re2j/tree/case-insensitive-benchmark). I executed the benchmarks using OpenJDK 8u332 on a 2021 M1 Pro. Here are the baseline results:
And the results with the ASCII case folding improvements from this patch:
|
// for the likely scenario where both runes are ASCII characters. | ||
static boolean equalsIgnoreCase(int r1, int r2) { | ||
// Runes already match, or one of them is EOF | ||
if (r1 < 0 || r2 < 0 || r1 == r2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That -1 means EOF should be noted in the doc comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 da63ef6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only looked at the production logic, but LGTM.
One of our production services uses re2j to match several hundred mostly case-insensitive patterns of varying complexity against text. We observed that approximately 12% of CPU time was being spent in toLowerCase() as called from simpleFold(), due to the necessity of doing at least one character data lookup per Inst.Rune in the common case that the input rune being examined did not match the instruction. As a fix, implement a method equalsIgnoreCase() that performs Unicode-aware case-insensitive comparison between two runes, with a fast path for the common case where both input runes are ASCII, and use it in Inst for single-rune case-insensitive comparison. This takes character data lookups out of the hot path. The existing re2j benchmarks did not exercise case-insensitive patterns, so add a new benchmark that executes a mostly ASCII regex pattern on a text containing a mix of ASCII and Unicode characters (generated using a Hungarian "lorem ipsum" text generator). Also add unit tests for the new equality comparison logic. Signed-off-by: Máté Szabó <mszabo@fandom.com>
784eac3
to
da63ef6
Compare
Codecov Report
@@ Coverage Diff @@
## master #159 +/- ##
==========================================
- Coverage 89.13% 89.06% -0.08%
==========================================
Files 19 19
Lines 3029 3037 +8
Branches 613 618 +5
==========================================
+ Hits 2700 2705 +5
- Misses 188 189 +1
- Partials 141 143 +2
Continue to review full report at Codecov.
|
Thanks! |
One of our production services uses re2j to match several hundred mostly
case-insensitive patterns of varying complexity against text.
We observed that approximately 12% of CPU time was being spent in
toLowerCase() as called from simpleFold(), due to the necessity of doing
at least one character data lookup per Inst.Rune in the common case that
the input rune being examined did not match the instruction.
As a fix, implement a method equalsIgnoreCase() that performs
Unicode-aware case-insensitive comparison between two runes, with a fast
path for the common case where both input runes are ASCII, and use it in
Inst for single-rune case-insensitive comparison. This takes character
data lookups out of the hot path.
The existing re2j benchmarks did not exercise case-insensitive patterns,
so add a new benchmark that executes a mostly ASCII regex pattern on a
text containing a mix of ASCII and Unicode characters (generated using
a Hungarian "lorem ipsum" text generator).
Also add unit tests for the new equality comparison logic.