Skip to content

Conversation

Anselmoo
Copy link
Contributor

@Anselmoo Anselmoo commented Nov 5, 2023

Expect the files of cclibparser

Via https://docs.sourcery.ai applied some basic refactors to make the code a little bit more pythonic

@Andrew-S-Rosen
Copy link
Contributor

Sweet! I was going to open some PRs for sourcery refactorings but now I don't have to 😄

@Anselmoo
Copy link
Contributor Author

Sweet! I was going to open some PRs for sourcery refactorings, but now I don't have to 😄

Unfortunately, most of the parsers of https://github.com/cclib/cclib/tree/master/cclib/parser cannot be optimized by sourcery in a reasonable time due to the length of the extract-function. Even after several hours, the ORCA parser still stays untouched ... 😢

It might be worse to think about totally refactoring these functions with code patterns like adapter or factory; see also https://github.com/faif/python-patterns.

@Andrew-S-Rosen
Copy link
Contributor

Andrew-S-Rosen commented Nov 14, 2023

@Anselmoo: Interesting. Are you relying on the cloud version or did you try running sourcery locally?

The unwieldy extract methods make me a little sad :( They should probably be refactored to call several private functions. That'd make it easier to refactor and debug, but alas.

@Anselmoo
Copy link
Contributor Author

@Anselmoo: Interesting. Are you relying on the cloud version or did you try running sourcery locally?

@Andrew-S-Rosen thx for question

sourcery review --check --fix --verbose .

I run locally ...

@Andrew-S-Rosen
Copy link
Contributor

Ooof! Oh well... 😅

@Anselmoo
Copy link
Contributor Author

Anselmoo commented Nov 14, 2023

  1. In terms of cloud solution, you are talking about https://github.com/marketplace/sourcery-ai ? I find that the local solution, after 2-3 time running over and over, find more refactors.
  2. Also, it can suggest extract_methods to avoid duplicating code sections

@Andrew-S-Rosen
Copy link
Contributor

@Anselmoo:

  1. Yes, that's right. I also prefer the local solution.
  2. Makes sense. I was going to suggest something similar, e.g. having it run on a subset of files or for a subset of rules at a time.

@Andrew-S-Rosen
Copy link
Contributor

Andrew-S-Rosen commented Nov 14, 2023

One cautionary point: Did you set the minimum Python version when you ran sourcery locally (e.g. via the sourcery.yaml file)? By default, it will make some changes that are not compatible with Python 3.8, which cclib still supports.

@Anselmoo
Copy link
Contributor Author

I realized that a little later during the PR. I was taking a look at the pyproject.toml,

cclib/pyproject.toml

Lines 1 to 3 in 8800925

[tool.black]
line-length = 100
target-version = ['py39']

where the definitions stay at 3.9. Probably, somebody should open an Issue and definite the minimum / supported versions, in a setup.cfg or/and setup.py.

I am more curious about master vs. main, so if it runs once, the reactors should be applied for python 3.8.

@Andrew-S-Rosen
Copy link
Contributor

Probably, somebody should open an Issue and definite the minimum / supported versions, in a setup.cfg or/and setup.py.

Yup, I just raised that issue in #1305.

@berquist berquist added this to the v1.8.1 milestone Nov 15, 2023
Copy link
Member

@berquist berquist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for not responding to this on time, especially since a lot of good discussion has happened here. Rather than just say this is a "drive by", I'll try and respond to everything here and provide some reasoning for why things are the way they are, for better or for worse.

First, the PR itself. This is difficult to review because of its size and how many files it touches, but the whole thing doesn't need review for us to discuss it. It looks like there are at least three kinds of changes here:

  • Positive stylistic changes that are more than just formatting.
  • Formatting changes like we'd get with Black or Ruff.
  • Some stylistic changes that are harmful.

I've left comments about the kinds of stylistic changes are (not so) good. I'd expect many but not necessarily all of these to be cause by pylint or Ruff, and wouldn't be added as part of reformatting. We would be open to these changes, separate from any formatting changes.

The idea with #1122 is that we would do reformatting in one pass once ~most of the open PRs have been merged, so it can be contained in a single PR with a .git-blame-ignore-revs that keeps git blame helpful and doesn't hurt contributors. It turns out that PRs are not getting merged and we're hurting contributors anyway. A reason we are not doing an implementation refactor is that we are working on version 2.0 of the library based on small parser combinators. It's a WIP in the main branch which will be changed to the default once everything is done. We'll run Black on all the new code as we develop and I'm not sure yet what we'll do on all the old code. It will not be a full rewrite since all the nitty-gritty for each attribute will be taken out of the individual extract methods.

For versioning, currently supported versions are visible in our CI image repository (https://github.com/shivupa/cclib-ci). There is no particular reason why cclib shouldn't currently work with 3.11 or 3.12. We'll remove 3.7 and 3.8 from CI once 3.11 and 3.12 are added. The full pyproject.toml migration (#1180) will also happen soon.

Back to this PR. If you are willing to make smaller PRs with some of these non-formatting changes, that would be helpful. I'll close this one once the discussion reaches some conclusion.

@@ -95,22 +95,18 @@ def write_trajectory(filename, ccdata, popname="mulliken", index=None):

properties = {}
if hasattr(ccdata, "scfenergies"):
properties.update({"energy": ccdata.scfenergies[i]})
properties["energy"] = ccdata.scfenergies[i]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some groups are against these sorts of changes, but we do think these are positive.

@@ -85,7 +86,7 @@ def makeopenbabel(
obmol.ConnectTheDots()
obmol.PerceiveBondOrders()
obmol.SetTotalSpinMultiplicity(mult)
obmol.SetTotalCharge(int(charge))
obmol.SetTotalCharge(charge)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same.

missing = [x for x in required_attrs if not hasattr(data, x)]

if missing:
if missing := [x for x in required_attrs if not hasattr(data, x)]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't written anywhere (yet), and we are still discussing it, but we are generally not going to use the walrus, and it is not due to Python version requirements, but rather readability reasons.

if self.identified_program is None:
self.identified_program = program
if do_break:
break
else:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an an example of a negative change: the comment means something and was accidentally tossed as though the branch was

else:
    pass

self.identified_program = None
if do_break:
break
if self.identified_program == None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A positive change.

])
# fmt: on
return atomradius
return np.array(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is...not good. It is trashing comments that contain Black directives.

@Anselmoo
Copy link
Contributor Author

@berquist thx, for your very detailed answer!

  • Formatting changes like we'd get with Black or Ruff.

I would prefer to use Ruff, because it also include isort, and many more
The nice thing is that we can also include the target version in pyproject.toml and use the built-in PyUpgrade to support the maintenance.

Using https://github.com/chartboost/ruff-action as a GitHub Action allows building dependent CI pipelines and quickly rejecting improper linted PRs.

So, maybe first taking care about the pyproject.toml?

@berquist
Copy link
Member

I am going to use https://pre-commit.ci/. Local development should allow as close to the same workflow as CI as possible.

@Anselmoo
Copy link
Contributor Author

I am going to use https://pre-commit.ci/. Local development should allow as close to the same workflow as CI as possible.

The great thing about https://pre-commit.ci is that it can automatically create pull requests to fix some issues.

@Anselmoo
Copy link
Contributor Author

See also: #1307 (comment)

@Anselmoo Anselmoo changed the title Apply some refactorsto code base Apply some refactors to code base Nov 22, 2023
@berquist
Copy link
Member

Closing because #1307 will make this a mess and we have some understanding of the kinds of changes we're looking for.

@berquist berquist closed this Nov 29, 2023
@Anselmoo
Copy link
Contributor Author

Anselmoo commented Dec 2, 2023

@berquist should this PR not be reopen after make rebase to #1307? Or better make from stretch again?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants