@dconeybe dconeybe commented Jul 7, 2025

This PR optimizes the UTF-8 string comparison logic by comparing UTF-16 code units directly and handling surrogate pairs as a special case. The result is faster string comparison without sacrificing correctness, and a simpler, more readable algorithm.

Highlights

  • Performance Improvement: Improves the performance of UTF-8 string comparison logic, bringing it back to near its original speed before a previous fix introduced a performance degradation.
  • Algorithm Simplification: The performance improvements also happily led to a simplification of the algorithm.
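  • UTF-16 Code Unit Comparison: The comparison logic now compares UTF-16 code units directly instead of encoding to UTF-8 first. This works because, when the first differing code units are both non-surrogates or both surrogates, their numeric order matches the byte order of the corresponding UTF-8 encodings.
  • Surrogate Handling: When exactly one of the first differing code units is a surrogate, the string containing the surrogate is ordered after the other one, since surrogate pairs encode code points above U+FFFF, whose 4-byte UTF-8 encodings sort after every 1-, 2-, or 3-byte encoding.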
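
The two comparison rules in the highlights can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the names `compareUtf8Order` and `isSurrogate` are hypothetical, and the sketch assumes both inputs are well-formed UTF-16 strings (no unpaired surrogates).

```typescript
// Hypothetical helper (not from the PR): true if the UTF-16 code unit is a
// surrogate, i.e. in the range 0xD800..0xDFFF.
function isSurrogate(codeUnit: number): boolean {
  return codeUnit >= 0xd800 && codeUnit <= 0xdfff;
}

// Hypothetical sketch: orders two well-formed strings the same way their
// UTF-8 encodings would be ordered byte by byte, without encoding them.
function compareUtf8Order(left: string, right: string): number {
  const length = Math.min(left.length, right.length);
  for (let i = 0; i < length; i++) {
    const l = left.charCodeAt(i);
    const r = right.charCodeAt(i);
    if (l === r) continue;

    const leftIsSurrogate = isSurrogate(l);
    const rightIsSurrogate = isSurrogate(r);
    if (leftIsSurrogate === rightIsSurrogate) {
      // Both surrogates or both non-surrogates: code unit order already
      // agrees with UTF-8 byte order.
      return l < r ? -1 : 1;
    }
    // Exactly one is a surrogate: it belongs to a code point above U+FFFF,
    // whose 4-byte UTF-8 encoding sorts after any shorter encoding.
    return leftIsSurrogate ? 1 : -1;
  }
  // One string is a prefix of the other (or they are equal): shorter first.
  if (left.length === right.length) return 0;
  return left.length < right.length ? -1 : 1;
}

// U+FFFD is 0xFFFD in UTF-16 and EF BF BD in UTF-8.
// U+1F600 is 0xD83D 0xDE00 in UTF-16 and F0 9F 98 80 in UTF-8.
console.log(compareUtf8Order("\uFFFD", "\uD83D\uDE00")); // -1
console.log("\uFFFD" < "\uD83D\uDE00"); // false
```

The last two lines show why the surrogate rule is needed: a plain code unit comparison places U+FFFD after U+1F600, while UTF-8 byte order places it before.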

The semantics of the UTF-8 string comparison logic were originally fixed by #2275, but that fix caused a material performance degradation, which was then partially addressed by #2299. The performance was, however, still suboptimal; this PR brings the speed back close to its original level and, serendipitously, simplifies the algorithm too.

This commit is a port of firebase/firebase-js-sdk#9143

@dconeybe dconeybe marked this pull request as ready for review July 7, 2025 20:21
@dconeybe dconeybe requested review from a team as code owners July 7, 2025 20:21
@dconeybe dconeybe added the owlbot:run Add this label to trigger the Owlbot post processor. label Jul 7, 2025
@gcf-owl-bot gcf-owl-bot bot removed the owlbot:run Add this label to trigger the Owlbot post processor. label Jul 7, 2025
@dconeybe dconeybe merged commit bc6a03e into main Jul 7, 2025
24 of 25 checks passed
@dconeybe dconeybe deleted the dconeybe/Utf8StringComparePerformanceFix branch July 7, 2025 20:31
Labels
api: firestore Issues related to the googleapis/nodejs-firestore API. size: m Pull request size is medium.