Add MoonScript lexer #1091

Merged

omnivelociraptor
Contributor

This change...

  • Generates a MoonScript lexer with the pygments2chroma_xml.py script using Pygments version 2.19.1
  • Replaces regexp2-incompatible Python-style patterns with compatible ones:
    • Capture group pattern ?P<name> with .NET-style ?<name>
    • Named back reference ?P=name with ECMAScript-style \k<name>
  • Removes and, or, and not from the Keyword rule, since they are also listed under the OperatorWord rule
  • Fixes incorrect string escape rules discovered while writing the test data
  • Adds test data for the lexer
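
The pattern replacement described above can be sketched in Python, the language of the generator script. This is only an illustration of the substitution, not the code used in this PR: the function name `to_regexp2` is hypothetical, and the naive regexes assume the Python-style constructs never appear escaped or inside a character class.

```python
import re

# Python-only named-group syntax, matched textually (naive: ignores escaping).
_NAMED_GROUP = re.compile(r"\(\?P<(\w+)>")
_NAMED_BACKREF = re.compile(r"\(\?P=(\w+)\)")


def to_regexp2(pattern: str) -> str:
    """Rewrite (?P<name>...) as (?<name>...) and (?P=name) as (\\k<name>)."""
    # Capture group: drop the "P" to get the .NET-style form.
    pattern = _NAMED_GROUP.sub(r"(?<\1>", pattern)
    # Back reference: keep the surrounding parentheses, as the diff does,
    # and swap the body for the ECMAScript-style \k<name>.
    pattern = _NAMED_BACKREF.sub(r"(\\k<\1>)", pattern)
    return pattern


if __name__ == "__main__":
    # A Lua-style long-comment pattern written with Python-style syntax.
    print(to_regexp2(r"(?:--\[(?P<level>=*)\[[\w\W]*?\](?P=level)\])"))
```

Keeping the parentheses around `\k<name>` adds an extra unnamed group, which is harmless for rules that emit a single token.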

built with:

Package  Version
-------- -------
pip      25.0.1
Pygments 2.19.1
pystache 0.6.8
...in the generated MoonScript lexer.

The regexp2 library only handles the Python-style capture group
`?P<name>` if a flag is set, and it doesn't handle the Python-style named
back reference `?P=name` at all.

Replace '?P<name>' with .NET-style '?<name>' and '?P=name' with
ECMAScript-style '\k<name>', which regexp2 parses.
...because they are also listed under OperatorWord rule
Comment on lines +53 to +81
<state name="ws">
<rule pattern="(?:--\[(?&lt;level&gt;=*)\[[\w\W]*?\](\k&lt;level&gt;)\])"><token type="CommentMultiline"/></rule>
<rule pattern="(?:--.*$)"><token type="CommentSingle"/></rule>
<rule pattern="(?:\s+)"><token type="TextWhitespace"/></rule>
</state>
<state name="varname">
<rule><include state="ws"/></rule>
<rule pattern="\.\."><token type="Operator"/><pop depth="1"/></rule>
<rule pattern="[.:]"><token type="Punctuation"/></rule>
<rule pattern="(?:[^\W\d]\w*)(?=(?:(?:--\[(?&lt;level&gt;=*)\[[\w\W]*?\](\k&lt;level&gt;)\])|(?:--.*$)|(?:\s+))*[.:])"><token type="NameProperty"/></rule>
<rule pattern="(?:[^\W\d]\w*)(?=(?:(?:--\[(?&lt;level&gt;=*)\[[\w\W]*?\](\k&lt;level&gt;)\])|(?:--.*$)|(?:\s+))*\()"><token type="NameFunction"/><pop depth="1"/></rule>
<rule pattern="(?:[^\W\d]\w*)"><token type="NameProperty"/><pop depth="1"/></rule>
</state>
<state name="funcname">
<rule><include state="ws"/></rule>
<rule pattern="[.:]"><token type="Punctuation"/></rule>
<rule pattern="(?:[^\W\d]\w*)(?=(?:(?:--\[(?&lt;level&gt;=*)\[[\w\W]*?\](\k&lt;level&gt;)\])|(?:--.*$)|(?:\s+))*[.:])"><token type="NameClass"/></rule>
<rule pattern="(?:[^\W\d]\w*)"><token type="NameFunction"/><pop depth="1"/></rule>
<rule pattern="\("><token type="Punctuation"/><pop depth="1"/></rule>
</state>
<state name="goto">
<rule><include state="ws"/></rule>
<rule pattern="(?:[^\W\d]\w*)"><token type="NameLabel"/><pop depth="1"/></rule>
</state>
<state name="label">
<rule><include state="ws"/></rule>
<rule pattern="::"><token type="Punctuation"/><pop depth="1"/></rule>
<rule pattern="(?:[^\W\d]\w*)"><token type="NameLabel"/></rule>
</state>
Contributor Author

@omnivelociraptor omnivelociraptor Jun 20, 2025

MoonScript inherits these states from the Lua lexer, but I don't think MoonScript uses them. I know MoonScript has no goto, labels, or multi-line comments, and I don't see any way for these states to be reached, so I think they can be safely removed. The tests still pass with these lines deleted, because I found no way to exercise them in the test data.
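
One way to check the "no way to reach these states" claim mechanically is a small reachability walk over the lexer XML. The sketch below is an assumption-laden illustration, not part of this PR: it treats only `push` and `include` elements with a `state` attribute as transitions, starts from a state named `root`, and runs on a made-up sample rather than the real MoonScript lexer. Any other transition mechanism a real lexer uses is ignored, so treat the result as a hint.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down lexer rules used only for this illustration.
SAMPLE = """
<rules>
  <state name="root">
    <rule><include state="ws"/></rule>
    <rule pattern="&quot;"><token type="LiteralString"/><push state="dqs"/></rule>
  </state>
  <state name="ws">
    <rule pattern="\\s+"><token type="TextWhitespace"/></rule>
  </state>
  <state name="dqs">
    <rule pattern="&quot;"><token type="LiteralString"/><pop depth="1"/></rule>
  </state>
  <state name="goto">
    <rule><include state="ws"/></rule>
  </state>
  <state name="label">
    <rule pattern="::"><token type="Punctuation"/><pop depth="1"/></rule>
  </state>
</rules>
"""


def unreachable_states(rules_xml: str, start: str = "root") -> set[str]:
    """Return state names that cannot be reached from `start`."""
    rules = ET.fromstring(rules_xml)
    # Map each state to the states it pushes or includes.
    edges: dict[str, set[str]] = {}
    for state in rules.iter("state"):
        targets = set()
        for tag in ("push", "include"):
            for node in state.iter(tag):
                if node.get("state"):
                    targets.add(node.get("state"))
        edges[state.get("name")] = targets
    # Depth-first walk from the start state.
    seen, stack = set(), [start]
    while stack:
        name = stack.pop()
        if name in seen or name not in edges:
            continue
        seen.add(name)
        stack.extend(edges[name])
    return set(edges) - seen


if __name__ == "__main__":
    print(sorted(unreachable_states(SAMPLE)))
```

On the sample, `goto` and `label` are reported because nothing pushes or includes them, while `ws` and `dqs` are reached from `root`.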

@alecthomas alecthomas merged commit 970eacc into alecthomas:master Jun 20, 2025
2 checks passed
@omnivelociraptor omnivelociraptor deleted the ccm-add-moonscript-lexer branch June 20, 2025 15:44
@alecthomas
Owner

Thanks!
