-
Notifications
You must be signed in to change notification settings - Fork 261
Pyproject validation: report an error when a license file does not exist #825
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pyproject validation: report an error when a license file does not exist #825
Conversation
Reviewer's Guide by SourceryThis pull request introduces a check to ensure that the license file specified in the pyproject.toml exists. It also adds a utility for temporarily changing the current directory during tests. Sequence diagram for license file validationsequenceDiagram
participant Factory as Poetry Factory
participant FS as File System
Factory->>Factory: validate()
Note over Factory: Check project.license
alt project has license data
Factory->>Factory: Get license data
alt license has file property
Factory->>FS: Check if license file exists
alt file does not exist
Factory->>Factory: Add error to result
end
end
end
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @MaddyGuthridge - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider moving filesystem checks out of the validation phase. While this fix works, performing filesystem operations during pyproject.toml validation could cause issues in scenarios where we only want to parse/validate the file structure (e.g., CI/CD pipelines, documentation tools).
Here's what I looked at during the review
- 🟡 General issues: 2 issues found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
src/poetry/core/factory.py
Outdated
license_data = project.get("license") | ||
if isinstance(license_data, dict) and "file" in license_data: | ||
license_path: str = license_data["file"] | ||
if not Path(license_path).exists(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (bug_risk): The license file path should be resolved relative to the pyproject.toml location, not the current working directory
This could cause false validation results if the code is run from a different directory. Consider using the config file's directory as the base path for resolution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the bot's assessment here, but to my knowledge there is no clear way to access the path of the license file within the validation. Changing it would likely require significant refactoring, which makes me wonder if this is the best place for this bug fix...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my answer from #800 (comment) applies here, too. License files and readme files are quite similar and should be treated the same. Thus, the question is whether it would be sufficient for your use case to handle it in poetry check
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for raising this point, @MaddyGuthridge and @radoering. I agree that refactoring to access the pyproject.toml's directory might be extensive, and that consistency with how README files are handled is important.
@MaddyGuthridge, could you elaborate on why accessing the pyproject.toml's path for license file resolution would require significant refactoring? Understanding the specific challenges would help in exploring alternative solutions. Perhaps there's a less intrusive way to achieve this.
Regarding handling this in poetry check
, @radoering, while that might be sufficient for some use cases, it still leaves a gap in the validation process. Ideally, the validation should be accurate regardless of where it's invoked. A user might encounter this issue in other contexts beyond poetry check
, and having consistent behavior is crucial.
Let's explore a few options:
-
Investigate minimal refactoring: Even if directly accessing the pyproject.toml's path is difficult, there might be a less invasive way to get the base directory. Perhaps we can pass it as a parameter during the validation process.
-
Defer file existence check: If resolving the path is too complex for this stage, we could consider deferring the file existence check to a later stage where the project's root directory is readily available. This would avoid the immediate validation error but ensure the check happens eventually.
-
Document the limitation: If neither of the above is feasible, we should clearly document this limitation, explaining that the license file path is resolved relative to the current working directory during this specific validation step. This would at least inform users about the potential for false positives/negatives.
I'm leaning towards option 1, as it addresses the root cause. Let's discuss the feasibility of a minimal refactoring approach before considering other options. I'm happy to help explore this further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm perfectly happy to handle it in poetry check
(which I presume does strict validation). My reason for handling it in non-strict validation is that otherwise, a very confusing error message is given later-on in Poetry, as per python-poetry#10105. I'm interpreting your suggested fix as comprising of a few steps:
- Remove the check for the license path from code paths that are called from
poetry run
and similar commands which don't need to validate the license. - Move this more-detailed error message to
poetry check
. - Ensure that commands such as
poetry build
(which do need to check the license file) also give helpful error info.
To clarify for @sourcery-ai (and onlookers), by significant refactoring, I don't mean refactoring the entire project, but rather changing the interface in a way that may not be desired by the core maintainers (I want to avoid this without their approval given I am an outside contributor).
How does this sound, @radoering ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarifying, @MaddyGuthridge. I understand your concern about changing interfaces without maintainer approval. Your proposed 3-step approach sounds like a good compromise, addressing both the confusing error message and the validation accuracy.
To summarize your proposal:
-
Remove license check from non-essential paths: This prevents the confusing error during commands like
poetry run
where license validation isn't strictly required. -
Move detailed error to
poetry check
: This ensures a comprehensive error message is presented during the dedicated checking phase. -
Ensure helpful errors during essential operations: Commands like
poetry build
, where the license file is crucial, should also provide informative error messages.
This approach avoids the need for immediate refactoring while still improving the user experience. I agree that this is a good path forward, especially for an outside contribution. It balances the need for a fix with the constraints of working with an existing codebase.
@radoering, what are your thoughts on this revised approach? Does it align with your suggestions from the linked issue comment?
@MaddyGuthridge, if this approach is agreed upon, I'm happy to help you think through the implementation details or review a revised PR based on these steps. Let me know if you'd like any assistance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MaddyGuthridge I checked your issue again and noticed that license
is different from readme
in the way that we read the license file in Factory._configure_package_metadata()
:
poetry-core/src/poetry/core/factory.py
Line 187 in eccf08b
raw_license = (root / license_file).read_text(encoding="utf-8") |
(We do not read the readme file that early.)
Maybe, just catch the error there and raise one with a better error message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing that out, @radoering! Catching the error in Factory._configure_package_metadata()
and providing a better error message there seems like the most direct and least intrusive solution. This avoids the need for refactoring the validation logic or moving the check to a different part of the codebase.
@MaddyGuthridge, this approach aligns well with your original goal of providing a clearer error message without extensive changes. It also addresses the concern about the confusing error appearing in contexts where license validation isn't strictly necessary.
If you're comfortable with this approach, I'd suggest focusing your efforts on modifying _configure_package_metadata()
to handle the missing license file gracefully and provide a user-friendly error message. This would likely be the most efficient and maintainable solution.
I'm happy to review a revised PR with this change. Let me know if you have any questions or would like to discuss the implementation details further.
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
for more information, see https://pre-commit.ci
A little beyond the scope of this PR but I wonder if it could be helpful to have some simple |
Based on the advice above, I've opened #827. It ended up being a much simpler approach than this, so I'll close this PR. |
Resolves: python-poetry#10105
My first Poetry contribution!
temporary_cd
context manager to thetestutils
module. Is there a better way of doing this?poetry run pytest
as per the original comment)?Summary by Sourcery
Tests: