-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Issue #6882: REGEXP_EXTRACT Capture Groups #6918
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Add a new variant that takes a list of field names and produces a STRUCT of all the capture groups.
Fix release unused variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just had a comment about an optimization we could do
(could be considered not an optimization as well, in the case where you have a really big string and the groups are only a small portion of the string, then keeping this string alive needlessly might be wasteful)
Reuse the input string memory instead of allocating new buffers for each group column.
A good point, but I have to say that having worked on a text shaping engine, you don't usually end up with a lot of filler data, so I think reuse is the way to go. |
Switch to CreateStringPiece to make sure strings get created correctly.
Fix formatting from merge.
Use the new safe casting infrastructure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! Looks good. Some comments:
CR feedback: Case-insensitivity, validity masks, more tests.
More validity masks.
Fix result output for vsize=2
Thanks! LGTM |
Add a new variant that takes a list of field names and produces a STRUCT of all the capture groups.