Closed
Description
Explainer: https://github.com/tc39/proposal-regexp-modifiers
Spec text: https://tc39.es/proposal-regexp-modifiers/
regexp-modifiers testing plan
- Syntax errors: Throw both in parsing and when constructed with `new RegExp("...")`
  - Basic regular expression flags (n.b. "source text" refers to the text matched by the "regular expression flags" production of the grammar)
    - Source text contains code points other than `i`, `m`, `s`
    - Source text contains combining code points alongside `i`, `m`, `s`
    - Source text contains other non-display code points alongside `i`, `m`, `s`
    - Source text contains ZWNJ, ZWJ, ZWNBSP alongside `i`, `m`, `s` (I think this is right? https://tc39.es/ecma262/#sec-unicode-format-control-characters)
    - Source text contains `i`, `m`, and/or `s` more than once
    - Source text in a case-ignoring context contains code points that case fold to `i`, `m`, `s`, e.g. `I`, `M`, `S`
    - Source text contains code points outside the Basic Latin range that, were they canonicalized by a Unicode-mapping regex, would map to e.g. `i`, `m`, or `s` (e.g. ſ (U+017F) would map to `s`, U+0130 to `i`) (ref. https://www.unicode.org/Public/12.1.0/ucd/CaseFolding.txt)
      - (e.g. `/foo(?\u{017F}:bar)/u` is a syntax error, `/foo(?s:bar)/u` is not)
  - Arithmetic regular expression flags
    - First or second source text exhibits any of the 'basic regular expression flags' errors
    - Both source texts are empty
    - Code point matched by first flags is also contained in source text matched by second flags
  - Various forms of `(?ims-ims)` - no colon - is a syntax error
  - Source text cannot use Unicode escape sequences to express code points `i`, `m`, `s`
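The constructor half of the syntax-error cases above could be exercised roughly as follows. This is a minimal sketch, not test262 harness code: `throwsSyntaxError` and the `invalidPatterns` table are hypothetical names chosen for illustration, and only one representative pattern is given per bullet. The literal-parsing half would still need separate files, since an invalid literal fails when the whole script is parsed.

```javascript
// Hypothetical helper (not part of the test262 harness): reports whether
// constructing the pattern throws a SyntaxError.
function throwsSyntaxError(pattern, flags = "") {
  try {
    new RegExp(pattern, flags);
    return false;
  } catch (error) {
    return error instanceof SyntaxError;
  }
}

// [pattern, top-level flags, which bullet it illustrates].
// One illustrative pattern per bullet; a real suite would enumerate many more.
const invalidPatterns = [
  ["(?x:a)", "", "code point other than i, m, s"],
  ["(?i\u0301:a)", "", "combining code point alongside a flag"],
  ["(?i\u200D:a)", "", "ZWJ alongside a flag"],
  ["(?ii:a)", "", "flag repeated"],
  ["(?I:a)", "i", "case folds to i in a case-ignoring context"],
  ["(?\u017F:a)", "iu", "U+017F would canonicalize to s"],
  ["(?i-i:a)", "", "flag both added and removed"],
  ["(?-:a)", "", "both flag lists empty"],
  ["(?ims-ims)", "", "no colon"],
  ["(?\\u0069:a)", "", "Unicode escape used to spell i"],
];

// Collect every pattern the engine accepted even though it should not.
const failures = invalidPatterns.filter(
  ([pattern, flags]) => !throwsSyntaxError(pattern, flags)
);

for (const [pattern, flags, reason] of failures) {
  console.log(`unexpectedly accepted /${pattern}/${flags} (${reason})`);
}
console.log(`${invalidPatterns.length - failures.length} of ${invalidPatterns.length} patterns rejected`);
```

Each pattern here is also rejected by engines that predate the proposal, so a check like this only shows that nothing invalid is accepted; it says nothing about whether valid modifiers are supported.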
- Valid syntax
  - Basic regular expression flags parse correctly
  - Source text with any valid combination of flags or arithmetic flags - reasonable to enumerate
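To illustrate why enumeration is reasonable: if each of `i`, `m`, `s` is either added, removed, or absent, and at least one flag is present, there are 3^3 - 1 = 26 combinations, ignoring the different orderings of flags within each list. The sketch below generates them; `enumerateModifierPrefixes` and `supportsModifiers` are hypothetical helper names, and the compile check is skipped on engines that do not implement the proposal.

```javascript
const FLAGS = ["i", "m", "s"];

// Hypothetical helper: builds the opening of a modifier group, such as
// "(?i-ms:", for every add/remove assignment with at least one flag.
function enumerateModifierPrefixes() {
  const prefixes = [];
  // Read `code` as a base-3 number with one digit per flag:
  // 0 = absent, 1 = added, 2 = removed. Code 0 (all absent) is skipped.
  for (let code = 1; code < 3 ** FLAGS.length; code++) {
    let add = "";
    let remove = "";
    let rest = code;
    for (const flag of FLAGS) {
      const state = rest % 3;
      rest = Math.floor(rest / 3);
      if (state === 1) add += flag;
      if (state === 2) remove += flag;
    }
    prefixes.push(remove === "" ? `(?${add}:` : `(?${add}-${remove}:`);
  }
  return prefixes;
}

// Feature detection: engines without the proposal reject "(?i:a)".
function supportsModifiers() {
  try {
    new RegExp("(?i:a)");
    return true;
  } catch {
    return false;
  }
}

const prefixes = enumerateModifierPrefixes();
console.log(`${prefixes.length} modifier prefixes generated`);

if (supportsModifiers()) {
  // Every generated prefix should compile once the group is closed.
  for (const prefix of prefixes) {
    new RegExp(`${prefix}a)`);
  }
}
```

A fuller enumeration would also permute flag order (`(?mi:`, `(?si-m:`, ...) and include the trailing-hyphen form such as `(?i-:`, which the grammar allows as long as both lists are not empty.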
- Behavior
  - Disabling flag in subexpression behaves correctly when corresponding top-level flag is and isn't already set
  - Enabling flag in subexpression behaves correctly when corresponding top-level flag is and isn't already set
  - Constructing a RegExp from a literal but changing flags by an argument to the RegExp constructor does (or does not) correctly change behavior of a subexpression that enables or removes flags.
  - `i`
    - Ignore case applies appropriately inside subexpression, but not outside; when turned on, off, and when nested inside a subexpression that has previously modified behavior
    - Behavior as normal when other flags modified but `i` flag not modified
    - Callers of Canonicalize:
      - Backreferences ignore case in captures
      - Individual characters ignore case
      - Character sets ignore case
      - Character escapes ignore case
      - Character class escapes ignore case
      - \w class, \b, \B all ignore case
  - `m`
    - `^` and `$` apply appropriately inside subexpression, but not outside; when turned on, off, and when nested inside a subexpression that has previously modified behavior
  - `s`
    - `.` applies appropriately inside subexpression, but not outside; when turned on, off, and when nested inside a subexpression that has previously modified behavior
  - Subexpressions with flags set do not cause `RegExp(...).flags` or `/.../.flags` to have the flags set, e.g. `(new RegExp("(?i:a)")).flags` does not include `i`.
    - Same for `RegExp.prototype.dotAll`, `.multiline`, `.ignoreCase`
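A few of the behavior checks above, written as a table-driven sketch. It assumes nothing beyond the semantics in the spec text linked at the top; `tryCompile`, `cases`, and `mismatches` are hypothetical names, and the whole run is skipped when the engine rejects `(?i:a)`, so it is safe to execute on engines without the proposal. Patterns go through the `RegExp` constructor rather than literals so the file still parses everywhere.

```javascript
// Returns a RegExp, or null when the engine rejects the pattern
// (for example because it does not implement regexp modifiers).
function tryCompile(pattern, flags = "") {
  try {
    return new RegExp(pattern, flags);
  } catch (error) {
    if (error instanceof SyntaxError) return null;
    throw error;
  }
}

// Each entry: [pattern, top-level flags, input, expected result of test()].
const cases = [
  ["(?i:a)b", "", "Ab", true], // i enabled inside the group only
  ["(?i:a)b", "", "aB", false],
  ["(?-i:a)b", "i", "aB", true], // i disabled inside the group only
  ["(?-i:a)b", "i", "Ab", false],
  ["(?i:a(?-i:b)c)", "", "AbC", true], // nested group turns i back off
  ["(?i:a(?-i:b)c)", "", "ABC", false],
  ["(?m:^b)", "", "a\nb", true], // ^ matches after a line break inside the group
  ["(?-m:^b)", "m", "a\nb", false],
  ["(?s:.)", "", "\n", true], // . matches a line terminator inside the group
  ["(?-s:.)", "s", "\n", false],
];

const mismatches = [];
const supported = tryCompile("(?i:a)") !== null;

if (supported) {
  for (const [pattern, flags, input, expected] of cases) {
    const actual = tryCompile(pattern, flags).test(input);
    if (actual !== expected) mismatches.push(`/${pattern}/${flags}`);
  }

  // Modifiers inside the pattern must not show up in the flag accessors.
  const scoped = tryCompile("(?ims:a)");
  if (scoped.flags !== "" || scoped.ignoreCase || scoped.multiline || scoped.dotAll) {
    mismatches.push("flag accessors reflect in-pattern modifiers");
  }

  // Rebuilding from an existing RegExp with different top-level flags
  // keeps the in-pattern modifier but drops the outer i.
  const rebuilt = new RegExp(tryCompile("(?-i:a)b", "i"), "");
  if (rebuilt.test("aB")) mismatches.push("rebuilt pattern kept the outer i flag");
}

console.log(supported ? `${mismatches.length} mismatches` : "modifiers not supported; skipped");
```

The Canonicalize callers (backreferences, character classes, `\w`, `\b`, `\B`) are left out of this sketch and would each need their own cases.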