-
-
Notifications
You must be signed in to change notification settings - Fork 5.4k
perf: extract regex patterns #29620
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: extract regex patterns #29620
Conversation
|
I also followed up on it in the warning itself, but I believe the CodeQL is inaccurate. The pattern is the exact same, the only difference is the placement. And given that marking this as ready for review inherently notifies you, I decided not to mention you additionally. I assume you would like ui-templates to be left unchanged? I didn't make any changes there yet because I remember from my other perf PR that you decided it is not a concern |
Hey, Pooya! The benchmark you made wouldn't show any noticeable improvements because it's a single iteration, and that wasn't the point of this PR. You can see that even in this PR I left many inline patterns untouched because it only compiles once, and therefore performance there won't be increased. However, this PR is for loops/hot code paths (like most of these patterns are) as that's where the performance will really improve. It is also logical, because each iteration will encounter the same reference to the pattern as opposed to a brand new pattern to be recompiled I will be able to provide a more thorough benchmark soon Edit: I have provided a reproduction based on your example. It's also expected that given this is a fairly simple pattern, its relative performance improvements would be less than that of more complicated patterns, like some of those I covered in the PR, especially when the loop uses multiple regex patterns (like the route key which has 3 patterns per iteration) |
@GalacticHypernova On the benchmark with 100,000 iterations, the performance improvement is 1.1-1.5ms. That seems negligible, and possibly within the margin of error of any test. It's so tricky - with tiny improvements like this on the one hand there's no real downside, on the other I wouldn't be surprised if the engine could optimise this kind of thing itself - and will we rewrite things back in three months if/when it does? Also, are there any readability costs here? |
@danielroe Allow me to begin by saying that the margin of error is the .4 ms difference. The benchmark was ran multiple times and the improvement was consistent. This specific benchmark indeed shows a relative small improvement, but it's important to remember it's a simple benchmark with a simple pattern. In real world cases, a lot more is happening in parallel with much more complex patterns, which will slow everything down. Consider for example the build process. Nuxt builds do way more than just running regex, and use way more complicated patterns with capturing groups, lookaheads/lookbehinds, atomic groups, and more. The readability here is also not overly impacted, because the change is not that big, it's just pre-compiling the regex, assigning it to a variable, and then using the variable in stead of the inline pattern. Now, can the engine optimize it on its own? Unfortunately, it can't, and that's why this PR is essential for resource-limited environments: I decided to test the actual memory differences: This was taken multiple times (as always) with a memory profiler in the Devtools of Microsoft Edge, so the results are reliable given that they are consistent. This includes objects discarded both by Major and Minor GC, which can be selected in the Devtools' You might be wondering why I didn't use the
The results of the memory profiling indicate that there is over 50% decrease in memory usage when using the variable approach, from 5MB to 2MB, and that's only in this simple benchmark. On weaker devices or outdated browsers that use less powerful engines, this is especially significant, but it would also very likely be noticeable on stronger devices. Is there anything you're particularly concerned about that you would like me to address? P.S. I hope this wasn't too much text. I wanted to explain it in as much detail as I could to highlight the importance of such changes, which don't end solely on runtime performance, but also have a huge impact on the memory consumption, which is more often than not the main bottleneck in resource-constrained environments and a hugely neglected aspect in performance. In addition, these seemingly small changes could quickly accumulate in real world use cases when at scale (for example, in performance-critical apps), providing more benefit than a simple, targeted benchmark could ever demonstrate. |
let's give it a go! |
Thanks! I'm sure this will have a great impact |
🔗 Linked issue
📚 Description
Extracting regex patterns avoid the need to recompile them on each use, especially in loops or hot code paths.
This, in return, will provide runtime performance improvements with faster-executing code, and memory efficiency improvements with less object creation overhead and GC pressure with inline usage.
Benchmark (taken from Pooya's example, each part ran separately to prevent any sort of inline cache from altering the performance results):
Results:
Browser:
Node:
It also shows a 50% decrease in memory usage in this simple benchmark alone, and has been tested using a memory profiler: