Apply ruff/flake8-comprehensions rule C420 #13284

DimitriPapadopoulos · 2025-03-17T08:38:04Z

C420 Unnecessary dict comprehension for iterable; use `dict.fromkeys` instead

Starting with ruff 0.10.0, the above rule has been stabilized and is no longer in preview.

C420 Unnecessary dict comprehension for iterable; use `dict.fromkeys` instead Starting with ruff 0.10.0, the above rule has been stabilized and is no longer in preview.

ichard26 · 2025-03-20T20:37:38Z

I looked into the reasoning for this rule. The reasons are A) readability, and B) efficiency. I contest the efficiency argument as both approaches will be implemented at the C level, thus the per item overhead should be similar, but the method approach has to pay the expensive overhead of the method lookup and call.

But sure, why not. I have never used the dict.fromkeys method, but I do like it better having seen it.

pfmoore · 2025-03-20T21:01:40Z

I'm not going to argue here, as the change is now merged, but I prefer the comprehension. The fromkeys method is fine, but it took me a while to work out that the extra argument was the value used for each entry (and I wouldn't have worked it out without looking at the comprehension that it was replacing).

I would like to request that we don't simply add ruff rules because they "have been stabilized and are no longer in preview".

We should stick with the list of rules we have enabled, for stability, and only add new ones when there's demonstrated benefits (preferably batching up such changes to do them in one combined PR). I don't want to re-introduce the practice of having endless debates over every style choice, but equally I don't want our style rules changing unpredictably, based purely on "what ruff changed this week" 🙁

ichard26 · 2025-03-20T21:31:51Z

I would like to request that we don't simply add ruff rules because they "have been stabilized and are no longer in preview".

I took the "out of preview" comment to mean that this rule is enabled by default in the upcoming Ruff 0.10.0 release, thus at some point we're going to have to adopt this change (whenever pre-commit.ci sends us an update PR, really). We're just accepting the change early. I'm not really a fan of early PRs like these since it's pointless busy work, but rejecting them is also work.

I agree that I should've left this open for debate. I would revert the change, but the eventual Ruff 0.10.0 release would undo that, so that's actually not helpful. I'd be fine with a loose policy to not accept linting PRs unless it's a linting tool update or a well-reasoned change (as you've mentioned).

Sorry :(

notatallshaw · 2025-03-20T21:34:41Z

This rule will have been applied next time ruff got updated anyway because this rule group is enabled:

pip/pyproject.toml

Line 175 in 0285e1a

"C4",

Ruff continues to go through rapid development: https://github.com/astral-sh/ruff/releases. So if we have broad rule groups enabled and we want to update ruff we will have to continue to accept PRs that are stabilized in these rule groups or PRs that disable specific rules.

We could be more selective on what rules are enabled or more carefully going through the release notes. But given the development pace and the size of the rule list (https://docs.astral.sh/ruff/rules/) I actually think it's less effort to be reactive than proactive here.

FWIW, I'm exactly 50-50 on this rule, but I'm pretty familiar from fromkeys, and I find it a little easier to spot that a constant values is being used vs. in a comprehension, but were this PR not rule driven I would have considered this unnecessary churn.

This reverts commit 0285e1a.

ichard26 · 2025-03-20T21:41:35Z

See #13294. I'm personally +0 on the rule so given the (I assume) -1 expressed here, I'd totally be fine with ignoring this rule specifically. It is surprising they stabilized this rule.

pfmoore · 2025-03-20T22:14:07Z

Thanks. I'm fairly strongly against blindly accepting ruff's rules changes, especially given the rapid pace of change and their tendency to include questionable rules preferring particular code constructs over others (such as this one). IMO, that goes way beyond the linting remit that I thought we'd agreed on when we first decided to implement auto-linting and autoformatting.

I don't have a good suggestion for how to avoid this, though - I have no appetite for debating rules every time ruff releases a new version, nor do I want to go through each rule that's currently available. If there were a way to programmatically take our existing rule configuration and generate an explicit list of every rule we currently apply, which we could then "freeze" in our config, I'd be OK with doing that. But I'm not expecting anyone else to do that work, and I don't have the motivation to do it myself, so I guess we're stuck 🙁

It is surprising they stabilized this rule.

I don't see a problem with them stabilising the rule (assuming that just means "make it not-experimental"). What I do see a problem with is putting it in a group of "unnecessary X" constructs. Choosing a comprehension rather than fromkeys isn't "unnecessary", it's a style choice, and one where there isn't necessarily a single right answer - which is better depends on context.

Sigh. At this point we're right back to why I always had reservations about autoformatters.

I'm just going to drop this now, with a final note that these sort of inflexible style rules make working on pip's codebase a lot less appealing to me 🙁

@ichard26 Thanks for the reversion PR, and I apologise for dragging you into one of my pet peeves 🙂

ichard26 · 2025-03-20T22:55:21Z

Sigh. At this point we're right back to why I always had reservations about autoformatters.

As a former maintainer of Black, I'm going to be biased, but I generally think this is unavoidable. I actually don't really like Black's code style either, but it's acceptable enough that I'll use it to avoid the endless code formatting debates (e.g., single vs double quotes). For solo projects, I've stopped using Black. This is a discussion that has been ligitated ad nauseam, so I'll leave it at that.

For linting, OTOH, there is still room for debate because linters are insanely configurable. Also, when we do debate over a style rule, it's probably for a good reason (e.g., how should this logic be structured?). We may not (probably won't?) agree, but the result will be more readable code. I realize code formatting is also a part of readability, but I think Black's style is acceptable enough to be readable (even if it's admittedly rather ugly).

I'm just going to drop this now, with a final note that these sort of inflexible style rules make working on pip's codebase a lot less appealing to me 🙁

I'm willing to investigate our linting story and clamp down on the linting rules we have enabled. I'm pragmatic and if our linting configuration is actually making it less desirable to contribute, then it's a net negative. Would you appreciate that? I'm not willing to debate over using an autoformatter and (I'm not really sure if there is room to be more pragmatic), but I'd be happy to take a proper look at the linting we do.¹

Finally, I want to acknowledge that I haven't been around for long enough so I guess I still have the energy to talk about this (🙃). I don't know what the discussions over black/mypy/flake8/ruff have looked like in the past. If you really do not want to discuss this further, then please don't reply. I'm trying to find something actionable/a path forward even if we clearly disagree in certain respects.

Actually, I'd assume that the mypy check is what bothers you the most, but unfortunately for us, I do like using type annotations even if they are annoying at times. This isn't something we're going to agree on, thus the status quo will prevail. ↩

pfmoore · 2025-03-20T23:40:15Z

Thanks for the offer. I would appreciate it if we took a bit more control of the linting rules we use. Originally we used flake8 (I think) and had a fairly limited set of rules. I proposed switching to ruff, largely for the speed and the unified interface if I recall. Unfortunately, since then, ruff's list of rules has grown enormously, and seems to have gone way beyond what I'd consider to be "linting" (this rule, and a lot of the pyupgrade ones, being examples of that). I feel bad about this, because I feel like I triggered the introduction of a bunch of rules that I frankly don't agree with.

I'm happy to offer opinions on rules, but I don't think it's productive to get into debates - trying to justify things like "I prefer the comprehension over using fromkeys" is exhausting and pointless (it's a matter of preference, after all). Generally, I'd just prefer fewer, less controversial rules over a "take everything" approach. And in case it's not obvious, I'd hope that by working on reducing the number of rules we use, we'd avoid needing to fix existing code - which is another reason I prefer limiting what we use.

As for type annotations, I don't dislike them. I'm concerned about the risk of overspecifying types, but that's more for libraries than for a standalone application like pip. And I leave complex type stuff to someone else¹. But I see no reason to change our type checking approach.

I'm sorry about the autoformatter comment. I meant to say linters. My mistake.

Anything that needs me to understand the difference between covariance and contravariance just gets me frustrated 🙂 ↩

DimitriPapadopoulos · 2025-03-21T09:42:48Z

I contest the efficiency argument as both approaches will be implemented at the C level, thus the per item overhead should be similar, but the method approach has to pay the expensive overhead of the method lookup and call.

Both defending and contesting the rule without facts is pointless. However:

I usually assume that linter suggestions are based on benchmarks, but I must admit that has been proven false in the past. See for example UP038 rewrites code to make it slower and more verbose astral-sh/ruff#7871 about UP038 which eventually got deprecated.
I usually assume that if a method such as dict.fromkeys has been added to Python, it's for a good reason (performance or readability and maintainability).

In this specific case, ruff 0.10.0 indeed enabled C420 by default, requiring to either explicitly disable the rule or apply it:

Stabilization

The following rules have been stabilized and are no longer in preview:
[...]
unnecessary-dict-comprehension-for-iterable (C420)
[...]

DimitriPapadopoulos · 2025-03-21T09:59:38Z

Some rulesets look really useful to me: I, ISC, RUF100. The usefulness of other rules is indeed debatable: C4, PERF, UP. I am happy to help remove the latter rulesets or keep only those rules you feel improve the codebase.

I'm happy to offer opinions on rules, but I don't think it's productive to get into debates - trying to justify things like "I prefer the comprehension over using fromkeys" is exhausting and pointless (it's a matter of preference, after all).

I nevertheless feel that applying ruff rules is globally positive. Again, I usually assume that the implementers of a new ruff rule or a new feature in Python have done their performance/maintainability homework, even if that has not always been the case in the past. Precisely because it's a matter of preference, why not let an automated process handle low-level choices and keep our creativity for higher-level tasks? Running a formatter and ruff on a codebase is supposed to result in globally more predictable and readable code in the long term, despite the short-term annoyance of having to adapt to new features. At least that's the theory behind formatters and linters. Practice may differ from theory, though.

pfmoore · 2025-03-21T12:02:30Z

Practice may differ from theory, though.

In particular, in practice linters, when their rules are enforced rather than advisory, remove choice over how to express given tasks in code. Linter rules are applied globally, whereas context is often critical when considering style choices.

I think of linters as like spellcheckers - they should fix clear mistakes without any fuss. Unfortunately, they often overstep that remit, and act more like grammar checkers or worse, imposing style choices that are debatable at best, and harm readability at worst.

Again, I usually assume that the implementers of a new ruff rule or a new feature in Python have done their performance/maintainability homework, even if that has not always been the case in the past.

Performance isn't (IMO) the job of a linter to decide. Nor is maintainability, to the extent that only the project maintainers can decide what's maintainable for them. And ruff's track record isn't perfect, as you note, so there's a risk in trusting their judgement (especially on matters of taste, which is most of what linters do).

Precisely because it's a matter of preference, why not let an automated process handle low-level choices

Because linters don't allow for the possibility of "choose the best option for the given piece of code". They are all or nothing (ignoring #noqa directives). That's not "preference", it's a project rule. And we're deciding whether we want to choose our project rules or let the ruff maintainers do that for us.

I'm not against automation of our project standards for code style. That's useful, and linters are a good tool for doing it. I am against letting someone else decide our code style standards for us. Especially if those standards continually change. That's why I'd prefer a small set of lint rules, ideally all explicitly requested, over a broad policy of applying whatever ruff chooses to dump into multiple (arguably only vaguely defined) categories.

Practicality dictates that we need to compromise a little here, and accept some broad categories.

Looking at our rule configuration:

ASYNC - Why are we applying this? We barely use any async functionality, so why have rules about it?
B - This is "whatever flake8-bugbear had". There's no cohesive theme here, so what's the logic for assuming we blindly apply everything. On a quick scan, they seem OK, so I'd be inclined to leave this with a ploicy that we review changes to the list every time we upgrade ruff.
C4 - This is the one that kicked off this whole debate. It's "policy on how people should use list comprehensions" and I'd rather remove it and trust people's judgement. Do we really need strict consistency here?
C90 - Function is too complex. No. Let reviewers judge that, not a tool.
E - A bunch of unrelated opinions, again. As they are considered "errors", they are probably OK, and I don't have the energy to debate every one, I'd be willing to accept this (with reservations). Again, we should review changes to the list when we upgrade ruff, though.
F - Meh. Much the same opinion as for "E".
G - I suppose.
I - Not linting, this is related to the fact that we use ruff as an isort replacement. That's a separate discussion (I like isort, and I don't have a problem with using ruff for the functionality).
ISC - Only ISC001 is self-evident to me. The others I'll accept, but grudgingly (again because context is important).
PERF - Performance advice is not the job of a linter! We can debate these rules, but not as "style choices".
PLE - Another "bunch of unrelated stuff" set (this from pylint rather than pyflakes). See E.
PLR0 - Why only the "0" set? And mandated refactoring is so far from being "linting" that I think it's borderline dishonest to try to lump them under "project style rules".
W - Like "E", but these are acknowledged as "warnings" rather than "errors". They don't look particularly bad, though.
RUF100 - This is a very specific rule, so I'm OK with it. Presumably it was added intentionally, and I don't want to re-debate cases where we clearly made a deliberate choice.
UP - I'm on record as being against this. We shouldn't rewrite working code just because there's a newer approach available. And we certainly shouldn't enforce doing so via a linter.
FA102 - Again, a single specific rule.

With the rules that are quoted as being from other linters (pyflakes, pylint) how true is that these days anyway? I feel like those rulesets at this point are simply appealing to the credibility of other projects. It would be more honest to just bundle them all together as "general stuff" that ruff does, and not pretend that they are different categories.

Anyhow, I don't know how much use all that is. It feels like this is always what happens with linter discussions, they end up getting into a rabbit hole of debate, condensing all of the arguments we wanted the linter to bypass into one place, and making it theoretical rather than rooted in a specific issue. Is that a net plus? I'm not sure.

I'm happy to go back to mostly ignoring linter questions unless/until they trigger an issue that I do have an opinion on. But if anyone wants to work on improving our linter config, at least my opinions are now on record.

DimitriPapadopoulos · 2025-03-21T12:49:48Z

E - A bunch of unrelated opinions, again. As they are considered "errors", they are probably OK, and I don't have the energy to debate every one, I'd be willing to accept this (with reservations). Again, we should review changes to the list when we upgrade ruff, though.

Some E rules are part of ruff's default ruleset:

By default, Ruff enables Flake8's F rules, along with a subset of the E rules, omitting any stylistic rules that overlap with the use of a formatter, like ruff format or Black.

I usually prefer using the default ruleset, specifying additional rules with extend-select instead of select, because this automatically disables E rules that are not compatible with the formatter. The downside is that this pulls the default ruleset, which may evolve as new versions of ruff are released - in practice the default ruleset has been stable for quite a long time.

F - Meh. Much the same opinion as for "E".

Same, F rules are part of the default ruleset.

Anyway, I'm happy to remove rulesets such as UP or disable rules if you want me to.

notatallshaw · 2025-03-21T14:07:49Z

Taking a step back and taking a look at the state of ruff at the moment:

The rule names represent their lineage, not directly a type, or category, of rule
Ruff tends to accept/implement rules from other projects unless they are "obviously" wrong or contradicting existing rules (unless specifically an opposing style choice where both are known to be used)
Ruff maintainers agree the default rule set and names could do with significant reorganization: https://discuss.python.org/t/official-lint-rules-tool/72089/24

Because of this there should not be any implication from why a rule exists, why it has been stabilized, or why it is in the default ruleset. For reasons beyond preferred styling you would need to look at the history of the rule, both in ruff and the lineage of where it comes from, and actually test if that reasoning is still true.

I think it would be a reasonable proposal for pip to switch to a minimal list of specific rules, and not use rule groups, that match strong criteria like:

Almost certainly an error
Using something deprecated
Overwhelmingly the preferred styling (e.g. Enforce f-strings via Ruff #12393)
Redundant code (e.g. where code has no effect and can be removed)

Then it can be documented in the linting section why each rule was chosen and the standard required to add new rules.

While I said that being reactive rather than proactive takes less effort, I think this discussion has already taken up more effort than I expected, and now I’m leaning towards taking up this task myself and reviewing each rule.

Anyway, I'm happy to remove rulesets such as UP or disable rules if you want me to.

I appreciate the offer, but IMO such a PR raised by a maintainer, I think the discussion here has shown that small choices in linting can be contentious, and affect using the codebase as a whole.

pfmoore · 2025-03-21T14:35:10Z

I think the discussion here has shown that small choices in linting can be contentious, and affect using the codebase as a whole.

Although it may also simply show that I'm stroppy when it comes to the subject of linters 🙂 Please don't think that this is a big deal to me. In the grand scheme of things it's pretty minor - it just happens to be one of my pet peeves.

ichard26 · 2025-03-21T15:28:59Z

I don't have the time at this very moment to write an elaborate reply, but I did want to propose that we configure Ruff to reuse our old flake8 (and our old plugins?) rule set and then work our way from that with the specific additions (as exemplified earlier). That way we don't have to discuss every single rule, but only the ones that we never really agreed to but accidentally enabled when switching to Ruff. Honestly, we could just use the flake8 set, keep the obviously good additions, and then let rules be added individually naturally over time.

I'm assuming that flake8 itself wasn't particularly controversial or was already sufficiently tuned, though.

notatallshaw · 2025-03-21T15:54:52Z

It looks to me like the flake8 config was largely relying on the flake8 defaults with bugbear, logging-format, and implicit str concat added:

pip/setup.cfg

Line 13 in 78ab4cf

[flake8]
pip/.pre-commit-config.yaml

Line 24 in 78ab4cf

- repo: https://github.com/PyCQA/flake8

If I were to make a PR for this, the value proposition of reviewing each rule against a strong minimal criteria is much greater than determining the default rules of flake8 circa 2022 and figuring out which ones have been migrated to ruff and just enabling those. But I would have no objection to someone else carrying out that task if they saw it as valuable.

Revert "Apply ruff/flake8-comprehensions rule C420 (#13284)"

Apply ruff/flake8-comprehensions rule C420

0c10a8b

C420 Unnecessary dict comprehension for iterable; use `dict.fromkeys` instead Starting with ruff 0.10.0, the above rule has been stabilized and is no longer in preview.

ichard26 added the skip news Does not need a NEWS file entry (eg: trivial changes) label Mar 17, 2025

ichard26 merged commit 0285e1a into pypa:main Mar 20, 2025
28 checks passed

ichard26 added a commit to ichard26/pip that referenced this pull request Mar 20, 2025

Revert "Apply ruff/flake8-comprehensions rule C420 (pypa#13284)"

dcbfc44

This reverts commit 0285e1a.

DimitriPapadopoulos deleted the C420 branch March 21, 2025 09:43

uranusjr added a commit that referenced this pull request Mar 23, 2025

Merge pull request #13294 from ichard26/revert-ruff

4b265b3

Revert "Apply ruff/flake8-comprehensions rule C420 (#13284)"

github-actions bot locked as resolved and limited conversation to collaborators Apr 6, 2025

Apply ruff/flake8-comprehensions rule C420 #13284

Apply ruff/flake8-comprehensions rule C420 #13284

Uh oh!

Conversation

DimitriPapadopoulos commented Mar 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ichard26 commented Mar 20, 2025

Uh oh!

Uh oh!

pfmoore commented Mar 20, 2025

Uh oh!

ichard26 commented Mar 20, 2025

Uh oh!

notatallshaw commented Mar 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ichard26 commented Mar 20, 2025

Uh oh!

pfmoore commented Mar 20, 2025

Uh oh!

ichard26 commented Mar 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Footnotes

Uh oh!

pfmoore commented Mar 20, 2025

Footnotes

Uh oh!

DimitriPapadopoulos commented Mar 21, 2025

Stabilization

Uh oh!

DimitriPapadopoulos commented Mar 21, 2025

Uh oh!

pfmoore commented Mar 21, 2025

Uh oh!

DimitriPapadopoulos commented Mar 21, 2025

Uh oh!

notatallshaw commented Mar 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pfmoore commented Mar 21, 2025

Uh oh!

ichard26 commented Mar 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

notatallshaw commented Mar 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

DimitriPapadopoulos commented Mar 17, 2025 •

edited

Loading

notatallshaw commented Mar 20, 2025 •

edited

Loading

ichard26 commented Mar 20, 2025 •

edited

Loading

notatallshaw commented Mar 21, 2025 •

edited

Loading

ichard26 commented Mar 21, 2025 •

edited

Loading

notatallshaw commented Mar 21, 2025 •

edited

Loading