Skip to content

Port/implement s.n.regex.Regexp hashCode(), equals(), & tests. #1698

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

LeeTibbert
Copy link
Contributor

@LeeTibbert LeeTibbert commented Aug 31, 2019

  • This PR ports re2j PR #79 "Add Regexp.hashCode and tests.", dated 2018-10-23,
    to Scala Native.

    My thanks and appreciation to the re2j developer @sjamesr for the
    re2j PR.

  • Regexp#equals existed in re2j when re2j was originally ported to SN
    but Regexp#hashCode did not. SN refused to compile with this
    mismatch, so the former was dropped from the port. Now that the
    pair is available in re2j, both routines has been ported & implemented.

  • The original scalanative.regex code used the default reference
    equality in Regexp. equals(). Changing Regexp#equals to use content
    equality necessitated a change in Simplify.scala to use reference
    equality in two critical places.

  • The original re2j code supported only the Perl syntax for
    named capture groups "(?P<foo?)". Java supports only a slightly
    different syntax "(?<foo?)". The original port to scalanative
    changed the code to support only the Java syntax.

    This PR changes Parser.scala to support both idioms. This allows
    two re2j test cases for equals() and hashCode() to execute and
    pass. The change was documented, see below.

    Two java.util.regex Suites were edited to allow the now supported
    Perl syntax.

  • The unit-test Suite RegexpHashcodeEqualsSuite.scala was added.

Documentation:

  • The standard changelog entry is requested.

  • A section was added to docs/lib/javalib.rst to describe some
    RE2 Perl expressions now available in java.util.regex. These
    are not strictly Java 8 conformant but are likely to be useful
    to developers porting Perl regexen. Both the Java and
    Perl regular expressions for named capture groups are available.

Testing:

  • Built and tested ("test-all") in release-fast mode using sbt 1.2.8 on
    X86_64 only . All tests pass.

  • A manual trial sphinx-build -b html of javalib.rst was done and
    the result inspected in a browser.

LeeTibbert pushed a commit to LeeTibbert/scala-native-fork that referenced this pull request Sep 1, 2019
  * This PR requires earlier PRs: scala-native#1397, scala-native#1693, and scala-native#1698.
    Travis CI will fail until those are merged.

  * This PR ports [re2j](https://github.com/google/re2j/) PR scala-native#49,
    "Named group support", dated 2018-03-04,
    commit d0ec5a7cfec67a08735b720a956c92d1440b3789 to Scala Native.

    My thanks and appreciation to the re2j developer @sjamesr for the
    re2j PR.

 * Some named group support was activated/implemented when re2j was
   originally ported to Scala Native.  This PR removes that code
   in favor to the quote "new" re2j named group support.  I believe
   there is no major functional change. As more fully described below,
   a few Exceptions and their messages have been changed to conform to Java 8.

   My hope is that using some close relative of the re2j code will
   help future devos, myself included, identify and track relevant changes in
   the re2j repository.

   Requiescat in Pace, sn.regex named group support, we hardly knew ye!

 * Deviations of note from re2j code:

      + Adjusted Exception thrown and message to match Java 8.
      The re2j message was better in the sense that it gave the
      name not found, but SN is trying to hew close to Java 8.

      + Converted from namedGroups from Map[String, Integer] to
      Map[String, Int] to avoid boxing values which are known to be Ints.

      + Re-worked the named group methods start(name), end(name), and
        group(name) to use a common subroutine rather than open coding.
	This makes it easier to get the lookup & exception handling
	right and consistent.

  * The unit-test points were ported/added to sn.regex.MatcherSuite,
    sn.regex.ParserSuite, and java.util.MatcherSuite. The majority
    of the testing of both Perl and Java syntax named group
    handling is done in sn.regex to stay close to the code which
    actually does the work.

Documentation:

  * The standard changelog entry is requested.

Testing:

  * Built and tested ("test-all") in release-fast mode using sbt 1.2.8 on
    X86_64 only . All tests pass.
LeeTibbert pushed a commit to LeeTibbert/scala-native-fork that referenced this pull request Sep 1, 2019
  * This PR requires earlier PRs: scala-native#1397, scala-native#1693, and scala-native#1698.
    Travis CI will fail until those are merged.

  * This PR ports [re2j](https://github.com/google/re2j/) PR scala-native#49,
    "Named group support", dated 2018-03-04,
    commit d0ec5a7cfec67a08735b720a956c92d1440b3789 to Scala Native.

    My thanks and appreciation to the re2j developer @sjamesr for the
    re2j PR.

 * Some named group support was activated/implemented when re2j was
   originally ported to Scala Native.  This PR removes that code
   in favor to the quote "new" re2j named group support.  I believe
   there is no major functional change. As more fully described below,
   a few Exceptions and their messages have been changed to conform to Java 8.

   My hope is that using some close relative of the re2j code will
   help future devos, myself included, identify and track relevant changes in
   the re2j repository.

   Requiescat in Pace, sn.regex named group support, we hardly knew ye!

 * Deviations of note from re2j code:

      + Adjusted Exception thrown and message to match Java 8.
      The re2j message was better in the sense that it gave the
      name not found, but SN is trying to hew close to Java 8.

      + Converted from namedGroups from Map[String, Integer] to
      Map[String, Int] to avoid boxing values which are known to be Ints.

      + Re-worked the named group methods start(name), end(name), and
        group(name) to use a common subroutine rather than open coding.
	This makes it easier to get the lookup & exception handling
	right and consistent.

  * unit-test points were ported/added:
      + sn.regex.MatcherSuite
      + sn.regex.ParserSuite
      + java.util.regex.MatcherSuite
      + java.util.regex.PatternSuite

    The majority of the testing of both Perl and Java syntax named group
    handling is done in sn.regex to stay close to the code which
    actually does the work.

Documentation:

  * The standard changelog entry is requested.

Testing:

  * Built and tested ("test-all") in release-fast mode using sbt 1.2.8 on
    X86_64 only . All tests pass.
Lee Tibbert added 2 commits May 15, 2020 15:05
  * This PR ports [re2j](https://github.com/google/re2j/) PR
    #[79](google/re2j#85)
    "Add Regexp.hashCode and tests.", dated 2018-10-23, to Scala Native.

    My thanks and appreciation to the re2j developer @sjamesr for the
    re2j PR.

 * Regexp#equals  existed in re2j was originally ported to SN
   but Regexp#hashCode did not. SN refused to compile with this
   mismatch, so the former was dropped from the port. Now that the
   pair is available in re2j, both routines has been ported & implemented.

 * The original scalanative.regex code used the default reference
   equality in Regexp. equals(). Changing Regexp#equals to use content
   equality necessitated a change in Simplify.scala to use reference
   equality in two critical places.

  * The original re2j code supported only the Perl syntax for
    named capture groups "(?P<foo?)". Java supports only a slightly
    different syntax "(?<foo?)". The original port to scalanative
    changed changed the code to support only the Java syntax.

    This PR changes Parser.scala to support both idioms.  This allows
    two re2j test cases for equals() and hashCode() to execute and
    pass.  The change was documented, see below.

    Two java.util.regex Suites were edited to allow the now supported
    Perl syntax.

  * The unit-test Suite RegexpHashcodeEqualsSuite.scala was added.

Documentation:

  * The standard changelog entry is requested.

  * A section was added to `docs/lib/javalib.rst` to describe some
    RE2 Perl expressions now available in `java.util.regex`. These
    are not strictly Java 8 conformant but are likely to be useful
    to developers porting Perl regexen. Both the Java and
    Perl regular expressions for named capture groups are available.

Testing:

  * Built and tested ("test-all") in release-fast mode using sbt 1.2.8 on
    X86_64 only . All tests pass.

  * A  manual trial `sphinx-build -b html ` of javalib.rst was done and
    the result inspected in a browser.
@LeeTibbert LeeTibbert force-pushed the PR_sn_regex_Regexp_HashCode branch from 7d5f0bf to 59fff77 Compare May 15, 2020 20:42
@LeeTibbert
Copy link
Contributor Author

proactively rebased, may need to do it again after review.
Travis CI is all green.

I believe that a number of other PRs depend upon this one. I
propose that I rebase them after this one is merged, in the fullness of time.

jchyb pushed a commit to jchyb/scala-native that referenced this pull request Oct 5, 2021
  * This PR requires earlier PRs: scala-native#1397, scala-native#1693, and scala-native#1698.
    Travis CI will fail until those are merged.

  * This PR ports [re2j](https://github.com/google/re2j/) PR scala-native#49,
    "Named group support", dated 2018-03-04,
    commit d0ec5a7cfec67a08735b720a956c92d1440b3789 to Scala Native.

    My thanks and appreciation to the re2j developer @sjamesr for the
    re2j PR.

 * Some named group support was activated/implemented when re2j was
   originally ported to Scala Native.  This PR removes that code
   in favor to the quote "new" re2j named group support.  I believe
   there is no major functional change. As more fully described below,
   a few Exceptions and their messages have been changed to conform to Java 8.

   My hope is that using some close relative of the re2j code will
   help future devos, myself included, identify and track relevant changes in
   the re2j repository.

   Requiescat in Pace, sn.regex named group support, we hardly knew ye!

 * Deviations of note from re2j code:

      + Adjusted Exception thrown and message to match Java 8.
      The re2j message was better in the sense that it gave the
      name not found, but SN is trying to hew close to Java 8.

      + Converted from namedGroups from Map[String, Integer] to
      Map[String, Int] to avoid boxing values which are known to be Ints.

      + Re-worked the named group methods start(name), end(name), and
        group(name) to use a common subroutine rather than open coding.
	This makes it easier to get the lookup & exception handling
	right and consistent.

  * unit-test points were ported/added:
      + sn.regex.MatcherSuite
      + sn.regex.ParserSuite
      + java.util.regex.MatcherSuite
      + java.util.regex.PatternSuite

    The majority of the testing of both Perl and Java syntax named group
    handling is done in sn.regex to stay close to the code which
    actually does the work.

Documentation:

  * The standard changelog entry is requested.

Testing:

  * Built and tested ("test-all") in release-fast mode using sbt 1.2.8 on
    X86_64 only . All tests pass.

Add a probabilistic fix to "stack safe" test.

This is a micro-edit point fix. A better comprehensive, deterministic fix
is being exercised in work-in-progress for a future PR.

Prior to this commit, the "stack safe" test could intermittently
fail with duplicate groups. This fix increases the size of the string
to drastically reduce the chance of a duplicate name.

Note that this code to generate a group name can still generate, with
a 10 in 62 probability, a digit for the first character of the group
name. Such a name is not a valid Java 8 group name and will cause
the test to fail intermittently and in a hard to debug manner.

Again, a better fix is on the way. Both fixes are in a future PR
to simplify the merge process.
jchyb pushed a commit to jchyb/scala-native that referenced this pull request Oct 7, 2021
  * This PR requires earlier PRs: scala-native#1397, scala-native#1693, and scala-native#1698.
    Travis CI will fail until those are merged.

  * This PR ports [re2j](https://github.com/google/re2j/) PR scala-native#49,
    "Named group support", dated 2018-03-04,
    commit d0ec5a7cfec67a08735b720a956c92d1440b3789 to Scala Native.

    My thanks and appreciation to the re2j developer @sjamesr for the
    re2j PR.

 * Some named group support was activated/implemented when re2j was
   originally ported to Scala Native.  This PR removes that code
   in favor to the quote "new" re2j named group support.  I believe
   there is no major functional change. As more fully described below,
   a few Exceptions and their messages have been changed to conform to Java 8.

   My hope is that using some close relative of the re2j code will
   help future devos, myself included, identify and track relevant changes in
   the re2j repository.

   Requiescat in Pace, sn.regex named group support, we hardly knew ye!

 * Deviations of note from re2j code:

      + Adjusted Exception thrown and message to match Java 8.
      The re2j message was better in the sense that it gave the
      name not found, but SN is trying to hew close to Java 8.

      + Converted from namedGroups from Map[String, Integer] to
      Map[String, Int] to avoid boxing values which are known to be Ints.

      + Re-worked the named group methods start(name), end(name), and
        group(name) to use a common subroutine rather than open coding.
	This makes it easier to get the lookup & exception handling
	right and consistent.

  * unit-test points were ported/added:
      + sn.regex.MatcherSuite
      + sn.regex.ParserSuite
      + java.util.regex.MatcherSuite
      + java.util.regex.PatternSuite

    The majority of the testing of both Perl and Java syntax named group
    handling is done in sn.regex to stay close to the code which
    actually does the work.

Documentation:

  * The standard changelog entry is requested.

Testing:

  * Built and tested ("test-all") in release-fast mode using sbt 1.2.8 on
    X86_64 only . All tests pass.

Add a probabilistic fix to "stack safe" test.

This is a micro-edit point fix. A better comprehensive, deterministic fix
is being exercised in work-in-progress for a future PR.

Prior to this commit, the "stack safe" test could intermittently
fail with duplicate groups. This fix increases the size of the string
to drastically reduce the chance of a duplicate name.
jchyb pushed a commit to jchyb/scala-native that referenced this pull request Oct 7, 2021
  * This PR requires earlier PRs: scala-native#1397, scala-native#1693, and scala-native#1698.
    Travis CI will fail until those are merged.

  * This PR ports [re2j](https://github.com/google/re2j/) PR scala-native#49,
    "Named group support", dated 2018-03-04,
    commit d0ec5a7cfec67a08735b720a956c92d1440b3789 to Scala Native.

    My thanks and appreciation to the re2j developer @sjamesr for the
    re2j PR.

 * Some named group support was activated/implemented when re2j was
   originally ported to Scala Native.  This PR removes that code
   in favor to the quote "new" re2j named group support.  I believe
   there is no major functional change. As more fully described below,
   a few Exceptions and their messages have been changed to conform to Java 8.

   My hope is that using some close relative of the re2j code will
   help future devos, myself included, identify and track relevant changes in
   the re2j repository.

   Requiescat in Pace, sn.regex named group support, we hardly knew ye!

 * Deviations of note from re2j code:

      + Adjusted Exception thrown and message to match Java 8.
      The re2j message was better in the sense that it gave the
      name not found, but SN is trying to hew close to Java 8.

      + Converted from namedGroups from Map[String, Integer] to
      Map[String, Int] to avoid boxing values which are known to be Ints.

      + Re-worked the named group methods start(name), end(name), and
        group(name) to use a common subroutine rather than open coding.
	This makes it easier to get the lookup & exception handling
	right and consistent.

  * unit-test points were ported/added:
      + sn.regex.MatcherSuite
      + sn.regex.ParserSuite
      + java.util.regex.MatcherSuite
      + java.util.regex.PatternSuite

    The majority of the testing of both Perl and Java syntax named group
    handling is done in sn.regex to stay close to the code which
    actually does the work.

Documentation:

  * The standard changelog entry is requested.

Testing:

  * Built and tested ("test-all") in release-fast mode using sbt 1.2.8 on
    X86_64 only . All tests pass.

Add a probabilistic fix to "stack safe" test.

This is a micro-edit point fix. A better comprehensive, deterministic fix
is being exercised in work-in-progress for a future PR.

Prior to this commit, the "stack safe" test could intermittently
fail with duplicate groups. This fix increases the size of the string
to drastically reduce the chance of a duplicate name.
@WojciechMazur
Copy link
Contributor

Rebased and merged in #2407, thank you for providing the original solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants