-
Notifications
You must be signed in to change notification settings - Fork 104
Foreign key lint detectors #79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
1. `non-unique-fk-ref` InnoDB allows FK references to non-unique keys in the parent table but it is advised against in the mysql reference manual. 2. `fk-missing-parent-table` InnoDB lets you create a FK which references a non-existent table (with FOREIGN_KEY_CHECKS=0), added a detector for that. It doesn't seem to allow a reference to an existing table but non-existent column though, so not checking that. 3. `duplicate-fk` Detects if there are several FK definitions that are identical (except for their names).
Thank you for the PR! The detector logic looks good at a high level. I'm going to need to keep this PR open for a few weeks though, while I land some other changes on my end:
Thanks again, and apologies in advance for the delay! |
Sounds great! As I mentioned before I've looked into other FK checks as well, e.g.
Feel free to ping me if you want some feedback!
Most likely it is. I'm using go 1.12.6, the CI system uses 1.9.7. I think there have been a few tweaks to the formatting rules in between. |
I definitely plan on giving modules a try soon.
I'm hesitant to have too many separate FK-related checks, since many MySQL deployments don't use FKs at all (including all of the largest companies), and each new detector entails configuration and documentation overhead. But perhaps some of these could be combined into a single detector, which sanity-checks FK references in general. Along those lines, how would you feel about combining Part of my thinking is that users probably would want to enable all of these FK reference checks or none, so the granularity of separate detectors may be unnecessary, from a configuration complexity standpoint. That said, keeping the duplicate-FK detector separate may make sense, since that one isn't specifically about reference sanity-checking, and wouldn't share code / has a different looping setup anyway.
Thanks! At a high level, two things I'm still undecided on:
|
I guess I'm coming from different type of companies use-case, the databases I work with (and have worked with previously at other companies) make heavy use FK's and rely on it to keep the data consistent (as opposed to having application-level checks). But I see your point about added cost in terms of maintenance etc., building larger detectors might alleviate this, yes.
Actually, I think the fk-missing-parent-table and non-unique-fk-ref should be separate passes. From a user-perspective maybe there could be 2 options to enable/disable these types of FK lint errors. Not sure I explained that very well, would have to think it through a bit more :)
Yes, I think this is the right way to go, it's how I expected it to work when I first looked at skeema :)
That would be a nice feature, but as you note about the FK I guess it could be tricky to do because you need to figure out all dependencies. e.g. for the |
All of the larger MySQL shops avoid (and often outright ban) foreign keys for three major reasons:
An additional consideration, although not as significant as the above, is that foreign key error messages and error codes are often overly vague in MySQL. Skeema was designed with sharding and OSC tools in mind as first-class concerns. Unfortunately that inherently means FKs have not been a primary concern. Full support for all aspects of FKs -- meaning mapping out the dependency chains to run ordered DDL without With that in mind, I'd like to avoid having too many config options regarding foreign keys. That could give users the mistaken impression that all aspects of FKs are fully supported, and/or that Skeema is very focused on FKs. As a professional database consultant there have been a number of cases where I've helped clients remove their FKs to solve performance problems, and having a ton of FK-specific support in my tooling just won't make sense from a business perspective. That's not to say I'm opposed to adding anything FK related; just that I'd like to avoid having 3+ new lint detectors / config options that are all FK related at this time, especially while the total number of lint detectors overall is so small. |
Thanks for the detailed explanation! I wasn't aware of the lack of support with the OSC tools, that's good to know.
As the FK linting isn't useful for your intended users I think it makes most sense to close this PR without merging, particularly since it could have negative value for your business (it will add maintenance cost). I will probably try to keep my fork in sync with upstream and add more FK linting over time as I need it. |
Makes sense, but I'd be happy to keep the PR open a while. Up to you, but here's my thinking: Other users may have similar FK-related linting needs, and this PR could serve as a good place to centralize discussion. If there's sufficient interest among users, I would definitely reprioritize FK support accordingly. In the near term -- over the next few weeks, once the v1.3 linter refactor is done -- I'm still totally open to merging one or two FK-related detectors from here. It's just a question of which ones or whether things are combined. (That v1.3 refactor will necessitate some minor code changes here, but it will help make the integration testing easier, so should be a net win.) Longer term, once more detectors are in place in general, having additional FK-related detectors will be more feasible. I just want to avoid a situation in the near-term where like half of the total lint detectors are FK-related, while the rest of Skeema is less FK-focused. I also plan to look into the feasibility of using dynamically-loaded Golang plugins to support arbitrary linter extensions. Ideally that could allow you to support a larger library of FK checks smoothly, without even needing to use a fork. From what I understand, there are currently some compile-time headaches with Go plugins, but I'm eager to see if Go 1.13 improves the situation. |
Now that v1.3 has been released I'm going to close this PR, since the interfaces related to linter checks have all changed substantially. The good news is that the v1.3 interfaces are designed to make it much easier to create and maintain new linter checks, regardless of whether submitting them upstream or keeping them as a private fork / patchset. In the coming weeks I'll write a thorough doc covering how to code new linter checks. The brief version of what's changed:
|
So I know this PR is from 5 years ago, but I think it's finally the right time to include a "parent side of FK needs a unique index" linter check 😄 Why now? Because lately MySQL is moving towards making that a requirement for FKs in InnoDB -- see the foreign key section of https://www.skeema.io/blog/2024/05/14/mysql84-surprises/ However, a lot has changed in Skeema's codebase in 5 years. And GitHub won't even let me reopen this PR as-is anyway, since we switched from master to main for the default branch a long while back, so the merge target of master no longer exists. @johlo I'm not sure if you're still using Skeema or MySQL (or even GitHub), and definitely have no expectations that you'd want to rework any of this old code on your end. However, I want to make sure you get credit for the idea and initial implementation. So my current thinking (unless you object) is that I'll use some Git hackery to include your commits here in a branch, and then add some commits on top to rework/modernize and avoid a million merge conflicts. A few notes, mostly just for myself:
|
*** This is a work-in-progress commit, which will be rebased/amended ***
*** This is a work-in-progress commit, which will be rebased/amended ***
*** This is a work-in-progress commit, which will be rebased/amended ***
This commit adds a new linter check/option, lint-fk-parent, which detects two types of issues with the parent-side table of a foreign key constraint: * Parent table does not exist * Parent table lacks a unique constraint covering the exact referenced columns The latter problem is especially relevant in MySQL 8.4. For more information see https://www.skeema.io/blog/2024/05/14/mysql84-surprises/#foreign-key-restrictions This linter check does not ever flag foreign keys which extend across schema (database) boundaries, since Skeema's workspace model only operates on a single schema at a time. The logic for this linter check was originally inspired by pull request #79 from 2019. This new commit is being merged to main along with preceding commit db33690 from that PR, in order to assign credit to the author; however this commit must fully rewrite/remove that older commit due to underlying code structure changes in the past five years. Also, it was not possible to merge that old PR directly today due to change of Skeema's default branch name. Separate from the new linter check, this commit also makes some minor tweaks to the linter package: * Old test logic relating to the no-longer-relevant MariaDB server variable innodb_read_only_compressed has been removed; this variable isn't enabled by default in any MariaDB release since Feb 2022 * Linter logic will now panic upon a duplicate attempt to register the same check name, as this indicates programmer error / copypasta * lint-has-fk will now properly set the annotation line number even if multiple spaces occur between the keywords FOREIGN and KEY.
Hi Evan, |
This PR adds 3 foreign-key related lint detectors:
non-unique-fk-ref
InnoDB allows FK references to non-unique keys in the parent table but it is advised against in the mysql reference manual.
Personally I came across this as I had issues with code generation tools (from schema to code) that didn't handle our FK references to non-unique keys.
A lint warning should be useful.
fk-missing-parent-table
InnoDB lets you create a FK which references a non-existent table (with FOREIGN_KEY_CHECKS=0), added a detector for that.
It doesn't seem to allow a reference to an existing table but non-existent column though, so not checking that.
duplicate-fk
Detects if there are several FK definitions that are identical (except for their names).
Some notes:
FKs, but it could still be interesting to check that they are valid in the linter? e.g. if FKs are used for documentation purposes.
getCoveringKey(table *tengo.Table, fkColumns []*tengo.Column) *tengo.Index
is useful for other code as well and should be moved to thetengo
package as a method ontengo.Table
.