Add option to search in lock file first when reuse #3522

@adsko

Feature Description

Hello!

When pdm add <package> is called, PDM makes a lot of network calls to retrieve packages. Yet all of that information is already available in the lock file, if one exists.

Right now, when we add a dependency with the reuse strategy, PDM calls the get_dependencies method. That method goes to the repository and retrieves the package in order to get its list of dependencies.

But when a lock file exists and the reuse strategy is in use, most of that data may already be there (as long as the lock file does not need to be refreshed).

Problem and Solution

Reading from the lock file first would speed up adding a new package a lot. PDM would no longer need network calls to retrieve the list of dependencies and the hashes.

PoC:

Quick example for get_dependencies:

        try:
            try:
                # Get from lockfile
                deps, requires_python, _ = self.locked_repo.get_dependencies(candidate)
            except CandidateNotFound:
                # Not in the lock file: fall back to the regular repository
                deps, requires_python, _ = self.repository.get_dependencies(candidate)
        except (RequirementError, InvalidPyVersion, InvalidSpecifier) as e:
            # When the metadata is invalid, skip this candidate by marking it as conflicting.
            # Here we pass an empty criterion so it doesn't provide any info to the resolution.
            logger.error("Invalid metadata in %s: %s", candidate, e)
            raise RequirementsConflicted(Criterion([], [], [])) from None
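
For anyone who wants to try the fallback idea in isolation, here is a minimal self-contained sketch. Everything in it (Candidate, StubRepository, get_dependencies_lock_first, the package names) is a hypothetical stand-in rather than PDM's real classes; the only thing taken from the snippet above is the shape of the call, a get_dependencies method that returns a 3-tuple and raises CandidateNotFound on a miss.

```python
from dataclasses import dataclass, field


class CandidateNotFound(Exception):
    """Stand-in for the exception raised when a repository lacks a candidate."""


@dataclass(frozen=True)
class Candidate:
    # Hypothetical minimal candidate: just a name and a pinned version.
    name: str
    version: str


@dataclass
class StubRepository:
    """Hypothetical in-memory repository; `calls` counts lookups made against it."""

    entries: dict = field(default_factory=dict)
    calls: int = 0

    def get_dependencies(self, candidate):
        self.calls += 1
        try:
            return self.entries[(candidate.name, candidate.version)]
        except KeyError:
            raise CandidateNotFound(candidate) from None


def get_dependencies_lock_first(locked_repo, remote_repo, candidate):
    """Try the lock-file-backed repository first, then fall back to the remote one."""
    try:
        return locked_repo.get_dependencies(candidate)
    except CandidateNotFound:
        return remote_repo.get_dependencies(candidate)


# Made-up data: pkg-a is already locked, pkg-b is only known to the remote repository.
locked = StubRepository({("pkg-a", "1.0"): (["dep-x"], ">=3.8", "")})
remote = StubRepository({
    ("pkg-a", "1.0"): (["dep-x"], ">=3.8", ""),
    ("pkg-b", "2.0"): (["dep-y"], ">=3.8", ""),
})

locked_hit = get_dependencies_lock_first(locked, remote, Candidate("pkg-a", "1.0"))
remote_hit = get_dependencies_lock_first(locked, remote, Candidate("pkg-b", "2.0"))
```

In this sketch the remote repository is consulted only for pkg-b, which is the behaviour the PoC is aiming for: a locked candidate never triggers the slow path.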

With the above change, PDM also needs to propagate hashes (since the change makes a bit of a mess with Candidates). For testing purposes, something like the following solves the problem in most cases (it's a PoC only, and I know it still has bugs):

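        # Entries whose hashes cannot be taken from the lock file
        to_do = []
        for entry in data:
            try:
                if self.locked_repo.candidates[entry.name].version == entry.version:
                    # Same version is locked: reuse its hashes
                    entry.hashes = self.locked_repo.candidates[entry.name].hashes
                else:
                    to_do.append(entry)
            except KeyError:
                # Package is not in the lock file at all
                to_do.append(entry)

        self.repository.fetch_hashes(to_do)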
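
The same partitioning step can be tried outside PDM with a small standalone sketch. Entry and reuse_locked_hashes are hypothetical names, the hash strings are placeholders, and the extra check for an empty hash list is my own assumption rather than something the PoC does:

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    # Hypothetical stand-in for a resolved package entry.
    name: str
    version: str
    hashes: list = field(default_factory=list)


def reuse_locked_hashes(entries, locked):
    """Copy hashes from `locked` (a name -> Entry mapping) when name and version match.

    Returns the entries that still need their hashes fetched over the network.
    """
    to_fetch = []
    for entry in entries:
        pinned = locked.get(entry.name)
        if pinned is not None and pinned.version == entry.version and pinned.hashes:
            # Copy the list so later edits to the entry do not touch the locked data.
            entry.hashes = list(pinned.hashes)
        else:
            to_fetch.append(entry)
    return to_fetch


# Made-up lock contents and resolution result.
locked = {
    "pkg-a": Entry("pkg-a", "1.0", ["sha256:aaa"]),
    "pkg-b": Entry("pkg-b", "2.0", ["sha256:bbb"]),
}
wanted = [Entry("pkg-a", "1.0"), Entry("pkg-b", "2.1"), Entry("pkg-c", "0.3")]

to_fetch = reuse_locked_hashes(wanted, locked)
```

Here only pkg-b (version changed) and pkg-c (not locked) end up in to_fetch, so only those two would need a network call for hashes.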

With those changes, installation time went from 3-5 min down to 10 s.

Additional Context

The examples above were written only as a PoC, so:

  1. They are not tested.
  2. They have bugs, e.g. some hashes are still retrieved even if the lock file already knows about them.
  3. requires_python is lost when reading from the lock file. It looks like the lock-based repository returns that data differently.

Are you willing to contribute to the development of this feature?

  • Yes, I am willing to contribute to the development of this feature.

Metadata

Labels

enhancement: Improvements for existing features
