Skip to content

Refactor InstallRequirement #5051

@pradyunsg

Description

@pradyunsg

I've had this cooking in my head for a bit now. These aren't critical path for some feature but this definitely something I want to do to make pip more approachable for new contributors and make it easier to reason about the installation flow. I've hinted this before in #4713 (comment).

The idea is that, eventually, the resolver should consume Requirements, get Candidates during resolution and spew out Distributions. The Distributions would then be operated upon during installation/uninstallation etc, downloading/caching would deal with Candidates, listing/freezing/checking would deal with Distributions. This 3-class model works cleanly for each of these use cases.

Currently, all three "jobs" are served by InstallRequirement (and InstallationCandidate :P), and it's sometimes hard to keep track of what's happening since there's a lot of stuff that's happening in that one class.

This would result in:

  • sanity while actually keeping track of the installation flow
  • make it easier later on to improve unit test coverage - scope for improving test speed
  • better decoupling of UI elements from actual code -- easier Redesigning pip's output (for install, wheel and download commands) #4649
  • ease when modifying behaviours for certain kinds of Distributions -- easier to implement PEP 517/518

I'm curious what @pypa/pip-committers and others think about this since, all in all, this work would result in changes in about a quarter of pip._internal, in terms of LOC.


What follows is basically the mind-dump for this (from about 4 months back, I think):

  • Refactor Refactor Refactor! {confirm it looks fine, figure out path}
    • Do away with all the respect for InstallRequirement and start chopping it.
    • Rename InstallationCandidate to CandidateDistribution
    • Move all "link" related information into CandidateDistribution
    • Create pip._internal.models.requirement.Requirement - Holds information that is given by the user
      • Essentially, a helpful wrapper around packaging.requirements.Requirement
    • pip._internal.models.candidate.CandidateDistribution - Holds information fetched from the index
      • package-version visible to the resolver
      • url-hash visible to download logic
      • TODO: check how is this different from current InstallationCandidate
      • TODO: Figure out how VCS looks with this?
    • pip._internal.models.distribution.Distribution - Describing the build, install, uninstall and other related behaviours for different kinds of distributions.
      • might wrap a pkg_resources.Distribution?
      • LegacySourceDistribution (non PEP 518/517)
      • ModernSourceDistribution (PEP 518/517)
      • WheelDistribution
      • Maybe do this early?
    • An intermediator to hold the logic of conversion from CandidateDistribution to a Distribution object
      • need to "build" to get dependencies and Distribution should hold that behaviour, not CandidateDistribution.

The reason I want to use the pip._internal.models is to represent that these are essentially objects representing data stored elsewhere that would be operated upon and isn't that what models are, in some sense? :P


PS: This is after I am done with zazo and then bringing it in. zazo is the project where I'm making the resolver.

Metadata

Metadata

Assignees

Labels

auto-lockedOutdated issues that have been locked by automationstate: needs discussionThis needs some more discussiontype: refactorRefactoring code

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions