lint: add WARN for regex features that contain unescaped dot #2635
Hello @mike-hunhoff, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
This pull request introduces a new linting rule to detect regular expressions that contain unescaped periods. The goal is to help prevent accidental misinterpretations of regular expressions where a literal period was intended but could be interpreted as a wildcard. The linting rule warns developers about such cases, suggesting they escape the period unless the wildcard behavior is intended.
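To illustrate the misinterpretation described above, here is a minimal sketch using Python's standard `re` module. The pattern `evil.com` and the sample strings are made up for illustration and do not come from the capa rules.

```python
import re

# Intended to match the literal domain "evil.com", but "." is a wildcard here.
unescaped = re.compile(r"evil.com")
# With the period escaped, only a literal "." matches.
escaped = re.compile(r"evil\.com")

# Hypothetical sample strings.
samples = ["evil.com", "evilXcom", "evil-com"]

for s in samples:
    # Print whether each pattern matches the sample.
    print(s, bool(unescaped.search(s)), bool(escaped.search(s)))
```

The unescaped pattern matches all three samples, while the escaped one matches only the string that contains a literal period.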
Highlights
- New Linting Rule: A new linting rule, `FeatureRegexContainsUnescapedPeriod`, has been added to `scripts/lint.py` to check for regular expressions with unescaped periods.
- Warning for Unescaped Periods: The new linting rule generates a warning when it finds an unescaped period in a regular expression, recommending that the period be escaped unless it's meant to be a regex dot operator.
- Integration with Existing Linting Framework: The new linting rule is integrated into the existing linting framework by adding it to the `FEATURE_LINTS` tuple.
Changelog
- scripts/lint.py
  - Imported the `Regex` feature from `capa.features.common`.
  - Added a new linting class `FeatureRegexContainsUnescapedPeriod` to detect unescaped periods in regular expressions.
  - The `FeatureRegexContainsUnescapedPeriod` class checks for unescaped periods in regex features and provides a warning with a recommendation to escape the period.
  - Added the new linting rule `FeatureRegexContainsUnescapedPeriod` to the `FEATURE_LINTS` tuple to include it in the linting process.
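The changes listed above could look roughly like the following sketch. It is not the code from the pull request: the `Regex` dataclass is a hypothetical stand-in for the feature imported from `capa.features.common`, the attribute names and the `run_feature_lints` helper are assumptions, and the escape check is deliberately simple.

```python
import re
from dataclasses import dataclass


@dataclass
class Regex:
    """Hypothetical stand-in for a regex feature; holds the raw pattern text."""

    value: str


class FeatureRegexContainsUnescapedPeriod:
    """Sketch of a WARN-level lint; not the implementation from the PR."""

    name = "feature regex contains unescaped period"
    level = "WARN"
    recommendation = "escape the period unless it is meant to be the regex dot operator"

    def check_features(self, features):
        # Return True (lint fires) if any regex feature has a "." that is not
        # directly preceded by a backslash. This simple check does not handle
        # escaped backslashes or character classes.
        for feature in features:
            if isinstance(feature, Regex) and re.search(r"(?<!\\)\.", feature.value):
                return True
        return False


# Lints are collected in a tuple and run one after another.
FEATURE_LINTS = (FeatureRegexContainsUnescapedPeriod(),)


def run_feature_lints(features):
    """Hypothetical driver: return (level, name, recommendation) per fired lint."""
    return [
        (lint.level, lint.name, lint.recommendation)
        for lint in FEATURE_LINTS
        if lint.check_features(features)
    ]
```

Under these assumptions, `run_feature_lints([Regex("evil.com")])` reports one warning, while `run_feature_lints([Regex(r"evil\.com")])` reports none.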
Please add bug fixes, new features, breaking changes and anything else you think is worthwhile mentioning to the `master (unreleased)` section of CHANGELOG.md. If no CHANGELOG update is needed add the following to the PR description: `[x] No CHANGELOG update needed`
Code Review
This pull request introduces a new lint check to warn about regular expressions containing unescaped periods. This is a good addition to help prevent false positives in capa rules. The implementation seems reasonable, but there are a few areas that could be improved for clarity and correctness.
Summary of Findings
- Regex pattern extraction: The logic for extracting the regex pattern from the feature value could be simplified and made more robust.
- Unnecessary return: The `check_features` method returns `False` early when no unescaped period is found, but this is unnecessary and can be removed.
- Inconsistent escaping check: The logic for checking escaped periods is not consistent and may lead to missed cases or false positives.
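As one possible way to make an escaping check consistent, the sketch below scans the pattern left to right instead of looking only at the character before each period. It assumes Python `re` syntax, the helper name `has_unescaped_period` is hypothetical, and it is not the fix adopted in the pull request. It treats a period inside a `[...]` character class as literal and handles an escaped backslash followed by a period.

```python
def has_unescaped_period(pattern: str) -> bool:
    """Return True if `pattern` contains a "." acting as the regex wildcard."""
    i = 0
    in_class = False  # inside a [...] character class, where "." is literal
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            # A backslash escapes exactly the next character; skip both.
            i += 2
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
            # A "]" right after "[" or "[^" is a literal member, not the end.
            if pattern[i + 1 : i + 2] == "^":
                i += 1
            if pattern[i + 1 : i + 2] == "]":
                i += 1
        elif ch == ".":
            return True
        i += 1
    return False
```

With this scan, `evil\.com` and `[.]com` are not flagged, while `evil.com` and `a\\.b` (an escaped backslash followed by a wildcard period) are.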
Merge Readiness
The pull request introduces a valuable lint check. However, there are some issues that need to be addressed before merging. Specifically, the regex pattern extraction and the escaping check logic should be improved. I am unable to approve this pull request, and recommend that it not be merged until the high severity issues are addressed. Users should have others review and approve this code before merging.
CHANGELOG updated or no update needed, thanks! 😄
see mandiant/capa-rules#1028 for results
sidebar: how was the AI review? helpful or spam?
nice work
Co-authored-by: Willi Ballenthin <wballenthin@google.com>
I found the code review portion to be useful and I vote we keep it around for now. The initial feedback gets you thinking and it was 50/50 on suggestions (1 was helpful, 1 was not).
see mandiant/capa-rules#1026 for an example of how an unescaped `.` can result in FPs. This adds a WARN for regular expressions that contain an unescaped `.`.