Add linter for future reserved words #207

morgo · 2023-03-14T21:27:01Z

Thank you for your interest. Please refer to the Contributing link in the sidebar for guidance. Be sure to reference an existing issue number in your pull request description.

Fixes #120

I took the liberty of making it a warning. It seems useful to me.

evanelias · 2023-03-14T23:48:15Z

Thanks for the PR! I'll aim to review later this week.

The most recent test failure is just from the Dockerized db being flaky / dropping conns or timing out. Seems to be happening at random on GitHub Actions the past few weeks, although I can't ever repro it locally 🤷 I just kicked off a re-run, usually it works fine the second time. If not, I'll try again at a later hour.

coveralls · 2023-03-14T23:51:44Z

Coverage: 93.282% (+0.02%) from 93.263% when pulling da593a2 on morgo:lint-future-reserved-words into 94eaa99 on skeema:main.

evanelias · 2023-03-16T23:27:52Z

Thanks again for the PR!

One tricky wrinkle here is that whenever Skeema dumps a CREATE TABLE statement to the filesystem (skeema init, skeema pull, skeema format, etc), it uses the canonical format from SHOW CREATE TABLE, which means all table names and column names will be backtick-quoted. And some tables/columns may be using names that are already reserved words (for the current db version), and that's legal SQL due to the backtick-quoting.

So as a result of that, I believe this linter implementation would flag those too right? i.e. it's not strictly finding future reserved words. If so, I could see a couple paths forward:

(a) Keep the implementation as-is, but change the linter annotation wording, to not be about "later" releases / "forward-compatibility" etc

or

(b) Keep the focus on future reserved words by tracking the version which introduces each reserved word, and only flag ones which aren't already a reserved word in the user's current DB version

There's almost certainly other approaches I haven't considered, so also totally open to discussing alternatives.

Of these two, (a) is simpler but potentially noisier. In my experience, whenever a company had the misfortune of picking a reserved word for a table or column name long ago, the engineers tend to already be acutely aware of it -- from the constant need to use backticks around the name in ad-hoc SQL queries. So having Skeema complain about it further might just be frustrating. They could disable the linter rule to avoid that, but then they lose the benefit of future-proofing new tables and columns.

Similarly, regarding MySQL-specific vs MariaDB-specific reserved words: Some companies try to remain cross-compatible, but others likely don't ever plan to migrate between the two, so flagging the reserved words from the other DBMS may be viewed as noisy.

(b) feels more useful, but is trickier to implement. Because both MySQL version X and MariaDB version Y can hypothetically introduce the same reserved word, the data structure for tracking versions would likely need to be map[string][]tengo.Flavor -- that is, a mapping from (lowercased reserved word) to (slice of flavors that introduce this reserved word). For example, "except": []{tengo.FlavorMySQL80, tengo.FlavorMariaDB103}. One benefit though is that the list of reserved words to track is much smaller: anything appearing on both MySQL 5.5 and MariaDB 10.1's reserved word lists can be omitted entirely, since Skeema doesn't support versions older than that, so there's never a case where those ones are "future" reserved words.

If going with (b), the last arg to the checker function will be useful -- it's a linter.Options struct value, which contains a Flavor field indicating what dbms + version the user is actually running. The tengo.Flavor.Min() method can be used for comparisons.

I'll likely have some additional review comments later, depending a bit on which approach is taken.

morgo · 2023-03-20T14:41:01Z

I prefer (a) and renaming the error message. It is good practice even in current versions not to use reserved words. I think I have seen bugs from internal SQL (or tool SQL) statements tripping on reserved words, so even if the user's ORM supports it - if they had upfront knowledge they may not have chosen this path.

I see your point about being 'nagging' because renames are typically difficult though :(

I am not sure how practical it is to implement, but #160 seems like a better implementation of silencing nags.

evanelias · 2023-03-20T21:10:41Z

OK, sounds good. I might tweak it a bit post-merge to apply to names of other object types, but still tbd. I'll leave some review comments momentarily.

Re: #160, yeah agreed that would be ideal for silencing nags in the future, but in the near-term it just isn't practical to prioritize relative to some other more pressing feature requests.

internal/linter/check_reserved.go

This commit enhances lint-reserved-word (from #207) in several ways: * Non-table objects, such as stored procedures and functions, now have their names checked for reserved words as well. * Names will only be flagged by the linter if they match a reserved word in some version of your same DBMS vendor (MySQL vs MariaDB). That is, if you're running MySQL then only MySQL reserved words are considered; ditto for MariaDB. If you're migrating from one vendor to another, you can still perform cross-vendor linting by using workspace=docker and overriding the flavor on the command-line. For example, ahead of a MariaDB-to-MySQL8 migration, this command-line will lint using a MySQL 8 container -- switching the reserved word list flagging to use MySQL's list instead of MariaDB's: skeema lint --workspace=docker --docker-cleanup=destroy --flavor=mysql:8.0 * The linter annotation message now differs for current reserved words (already reserved in your DB version, or in a previous version) vs future reserved words (not reserved yet in your version, but will be if you upgrade). * Internally, the reserved word list is now tracked in the tengo subpackage, instead of the linter subpackage. This will likely be expanded in the future to also track regular non-reserved keywords, for purposes of improving the parser code (reducing ad hoc regex usage throughout the codebase) and also solving issues like #175 and #199.

evanelias · 2023-03-24T21:35:27Z

Just committed 4c2ddd0 on top, which reworked this a fair bit -- see that commit message for reasoning. To be clear, there was nothing wrong with your original implementation here, it was great and the PR is hugely appreciated 👍 It just made sense to make those changes on top, as they're in alignment with some other upcoming work.

morgo added 2 commits March 14, 2023 15:25

Add linter for future reserved words

2aaae0c

move mariadb oracle sql mode out of reserved list (breaks other tests)

bf7e4ae

evanelias self-requested a review March 20, 2023 21:21

evanelias requested changes Mar 20, 2023

View reviewed changes

morgo added 2 commits March 21, 2023 08:44

Incorporate reviewer feedback

d0db7a3

fix test

da593a2

morgo requested a review from evanelias March 21, 2023 15:08

evanelias approved these changes Mar 24, 2023

View reviewed changes

evanelias merged commit c914202 into skeema:main Mar 24, 2023

morgo deleted the lint-future-reserved-words branch May 12, 2023 14:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add linter for future reserved words #207

Add linter for future reserved words #207

Uh oh!

morgo commented Mar 14, 2023

Uh oh!

evanelias commented Mar 14, 2023

Uh oh!

coveralls commented Mar 14, 2023 •

edited

Loading

Uh oh!

evanelias commented Mar 16, 2023

Uh oh!

morgo commented Mar 20, 2023

Uh oh!

evanelias commented Mar 20, 2023

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

evanelias commented Mar 24, 2023

Uh oh!

Uh oh!

Add linter for future reserved words #207

Add linter for future reserved words #207

Uh oh!

Conversation

morgo commented Mar 14, 2023

Uh oh!

evanelias commented Mar 14, 2023

Uh oh!

coveralls commented Mar 14, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

evanelias commented Mar 16, 2023

Uh oh!

morgo commented Mar 20, 2023

Uh oh!

evanelias commented Mar 20, 2023

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

evanelias commented Mar 24, 2023

Uh oh!

Uh oh!

coveralls commented Mar 14, 2023 •

edited

Loading