@zhouliqi zhouliqi commented Nov 5, 2022

It supports Unicode. #5156

@Tishj Tishj left a comment

Thanks for the updates, I just have one little remark, otherwise it looks good to me :)

auto codepoint_haystack = Utf8Proc::UTF8ToCodepoint(input_haystack, sz);
if (to_replace.count(codepoint_haystack) != 0) {
    Utf8Proc::CodepointToUtf8(to_replace[codepoint_haystack], c_sz, c);
    result.insert(result.end(), c, c + c_sz);

I believe we can make a pretty decent estimate of what the size of result will be. Can we add a result.reserve(..) before this loop?

Contributor Author

You're right. Reserving enough space for the result avoids frequent memory allocations.
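
For readers following the thread, here is a minimal standalone sketch of the idea being discussed: reserve the output buffer once, then append the re-encoded codepoints in the loop. It is not the code from this PR. `DecodeUtf8`, `AppendUtf8`, and `TranslateCodepoints` are hypothetical names written for illustration, the decoder assumes well-formed UTF-8, and using the input's byte length as the size estimate is an assumption that is exact only when each replacement encodes to the same number of bytes as the codepoint it replaces.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical helper: decode the codepoint starting at s[pos] and report
// its byte length in sz. Assumes the input is well-formed UTF-8.
static uint32_t DecodeUtf8(const std::string &s, size_t pos, int &sz) {
    unsigned char b0 = static_cast<unsigned char>(s[pos]);
    uint32_t cp;
    if (b0 < 0x80) {
        sz = 1;
        return b0;
    } else if ((b0 & 0xE0) == 0xC0) {
        sz = 2;
        cp = b0 & 0x1F;
    } else if ((b0 & 0xF0) == 0xE0) {
        sz = 3;
        cp = b0 & 0x0F;
    } else {
        sz = 4;
        cp = b0 & 0x07;
    }
    // Each continuation byte contributes its low 6 bits.
    for (int i = 1; i < sz; i++) {
        cp = (cp << 6) | (static_cast<unsigned char>(s[pos + i]) & 0x3F);
    }
    return cp;
}

// Hypothetical helper: append the UTF-8 encoding of cp to out.
static void AppendUtf8(uint32_t cp, std::string &out) {
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

// Replace every codepoint that has an entry in to_replace; copy the rest.
std::string TranslateCodepoints(const std::string &haystack,
                                const std::unordered_map<uint32_t, uint32_t> &to_replace) {
    std::string result;
    // Assumption: the output is roughly as long as the input, so a single
    // up-front reservation covers the common case. Longer replacements can
    // still trigger a reallocation; the result stays correct either way.
    result.reserve(haystack.size());
    for (size_t i = 0; i < haystack.size();) {
        int sz = 0;
        uint32_t cp = DecodeUtf8(haystack, i, sz);
        auto it = to_replace.find(cp);
        AppendUtf8(it != to_replace.end() ? it->second : cp, result);
        i += static_cast<size_t>(sz);
    }
    return result;
}
```

In this sketch the reservation is only a hint. A stricter upper bound would be four bytes per input codepoint, at the cost of over-allocating for mostly-ASCII input.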

@Mytherin Mytherin changed the base branch from master to feature November 21, 2022 09:25
@Mytherin Mytherin merged commit d8ee87a into duckdb:feature Nov 21, 2022
@Mytherin

Thanks!

@jjerphan jjerphan mentioned this pull request Dec 4, 2022
2 tasks