Closed
Description
What happens?
When I use a regex to match a specific character by its Unicode code point, the non-breaking space (chr: 160) appears to be converted to a regular space (chr: 32).
The RE2 engine seems to support this fine: https://regex101.com/r/7SjXN9/1
To Reproduce
with
data(wsc, zipcode) as (
values (32, '00' || chr(32) || '001'), (160, '00' || chr(160) || '001')
)
select *
from data
where 1=1
and regexp_matches(zipcode, '^00\x{00A0}001$')
and regexp_matches(zipcode, '^00\x{0020}001$')
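Because the query above combines both predicates with `and`, it can be hard to tell which pattern matches which row. A minimal sketch of a diagnostic variant, using the same data and the same two patterns, reports each predicate as its own column (the column names `matches_nbsp` and `matches_space` are made up for illustration):

```sql
-- Same test data as in the reproduction: one row with a regular space (32)
-- and one row with a non-breaking space (160) between '00' and '001'.
with data(wsc, zipcode) as (
    values (32, '00' || chr(32) || '001'), (160, '00' || chr(160) || '001')
)
select
    wsc,
    -- pattern written with the non-breaking space escape
    regexp_matches(zipcode, '^00\x{00A0}001$') as matches_nbsp,
    -- pattern written with the regular space escape
    regexp_matches(zipcode, '^00\x{0020}001$') as matches_space
from data;
```

If the escapes are handled as intended, I would expect each row to match only the pattern for its own whitespace character.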
OS:
Linux
DuckDB Version:
0.9.2
DuckDB Client:
CLI
Full Name:
Tomasz Taraś
Affiliation:
Orsted
Have you tried this on the latest main branch?
I have tested with a release build (and could not test with a main build)
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
- Yes, I have