@sjamesr sjamesr commented Jun 6, 2020

This change makes `RE2.match(CharSequence input, int start, int end, ...)`
treat `input` as spanning `[0, input.length())` for the purpose of
zero-width assertions. Previously, `input` was considered to span only
`[start, end)`.

This is required when evaluating capturing groups, because RE2/J computes
them by re-matching the pattern within the bounds of a previous
successful match. For example:

    Matcher m = Pattern.compile("(he)llo").matcher("hello world");
    m.find();    // true
    m.group(0);  // "hello"
    m.group(1);  // "he"

During the evaluation of the last statement, RE2/J re-evaluates the
pattern within the bounds of group(0) (i.e. "hello"). Before this change,
RE2/J would consider the position immediately after 'o' (before the
space that precedes 'w') to be both end-of-text and end-of-line. This
would cause zero-width assertions (e.g. `$`) to match incorrectly,
generating erroneous group matches in some cases.
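
The difference between the two interpretations of the input bounds can be reproduced with the JDK's own `java.util.regex`, which exposes the same choice through `Matcher.region` and `Matcher.useAnchoringBounds`. The sketch below is only an analogy: it does not use RE2/J or its internals, and the class and method names (`AnchoringBoundsSketch`, `findInRegion`) are hypothetical. With anchoring bounds on, the end of the region counts as end-of-text, which mirrors the old behavior described above; with them off, only the end of the whole input does, which mirrors the behavior after this change.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; uses java.util.regex as an analogy, not RE2/J.
final class AnchoringBoundsSketch {

    /**
     * Searches input[start, end) for the regex.
     * If anchoringBounds is true, the region edges satisfy ^ and $.
     * If false, only the edges of the whole input satisfy them.
     */
    static boolean findInRegion(String regex, String input, int start, int end,
                                boolean anchoringBounds) {
        Matcher m = Pattern.compile(regex).matcher(input);
        m.region(start, end);
        m.useAnchoringBounds(anchoringBounds);
        return m.find();
    }

    public static void main(String[] args) {
        String input = "hello world";

        // Region [0, 5) is "hello". Treating the region end as end-of-text
        // lets $ match right after 'o'.
        System.out.println(findInRegion("hello$", input, 0, 5, true));   // prints "true"

        // Treating only the real end of the input as end-of-text means
        // $ cannot match at offset 5, so the search fails.
        System.out.println(findInRegion("hello$", input, 0, 5, false));  // prints "false"
    }
}
```

The second call corresponds to the intended semantics: the search is restricted to `[start, end)`, but assertions such as `$` are still judged against the full input.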

Fixes #96; see that issue for more information.

@sjamesr sjamesr self-assigned this Jun 6, 2020
@sjamesr sjamesr requested review from adonovan and alandonovan and removed request for adonovan June 6, 2020 15:31
@sjamesr sjamesr merged commit 0a7c5df into google:master Jun 9, 2020
Linked issue: Weird incorrect match for named group