CI conda: Ignore baseline test failures #36678
Conversation
There are only a couple of test failures in these files. These ignore patterns are way too coarse.
(Removing the blocker label for that reason, so as not to hide possible issues, and because it's not working anyway in this context.)
It's working as designed: green checkmarks, with failures marked as "failed in baseline".
So if another new test in the same file fails, the conda CI is red?
Perhaps you should clarify what you think the purpose of the ci-conda workflows is. Marking these files as baseline failures is done so that on unrelated PRs, the red checkmarks on the conda workflows do not annoy developers.
And obviously work is needed to actually diagnose, report, and fix these failures. The approach of #36372 ("skip_conda") hides these failures too well.
They have the same purpose as the Build & Test workflow for people using the conda environment. So yes, they should duplicate the test coverage. (In the long term, once the conda runs are a bit more stable and there are no more conda-specific issues, we can remove the Build & Test workflow.)
B&T is mainly for checking sage library changes, to help developers and reviewers. It is unlikely that a code change in the sage library works on one platform and fails on another. That is why B&T runs only on one specific, decent platform, and tests on all other platforms are done only for the weekly release. I do not use conda (nor Linux); I regard conda as just another platform. So it looks strange to me that currently 6 B&T workflows for conda platforms run for each PR. I think there is already too much duplication. Perhaps conda is special, and a separate B&T for conda is necessary for PRs (perhaps for upgrading packages). Then why not just two B&T runs for conda (one for Ubuntu and one for macOS) for PRs, and more tests for the weekly release? About #36372: I object to introducing conda-specific doctest tags into the sage library. That is yet another cluttering of the sage library, in addition to all the doctest tags for modularization. We decided to live with modularization. But no platform-specific doctest tags, please.
FWIW, I approved the change to run these jobs on all PRs in #36373. Tobias has already implemented a mitigation there on my request (see #36373 (comment)) by making it "fail-fast" and setting max-parallel to 2. But I think at least the macOS jobs need reconsideration / additional adjustment. Given that we only have 5 macOS runners, launching 3 jobs (each 1 hour) for every PR is clearly too much. For example, at the time 10.2.rc1 was released (over an hour ago), at least 7 "ci-conda" workflows were already in the queue (https://github.com/sagemath/sage/actions/workflows/ci-conda.yml?query=is%3Aqueued), with at least 14 macOS jobs waiting for a runner. As a result, the CI-macos run of 10.2.rc1 (https://github.com/sagemath/sage/actions/runs/6827163283) and the cibuildwheel job for macOS (https://github.com/sagemath/sage/actions/runs/6827163257/job/18568938031) have not been able to start yet.
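For readers unfamiliar with the mitigation mentioned here: "fail-fast" plus a capped max-parallel corresponds to a `strategy` block roughly like the following (a sketch only; the job name, matrix entries, and test command are illustrative, not the actual ci-conda workflow):

```yaml
# Hypothetical excerpt of a ci-conda style GitHub Actions workflow:
# cancel the remaining matrix jobs on the first failure, and run at
# most 2 jobs at a time to ease pressure on the small macOS runner pool.
jobs:
  test:
    strategy:
      fail-fast: true
      max-parallel: 2
      matrix:
        os: [ubuntu-latest, macos-latest]
        python: ["3.10", "3.11"]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - name: Run tests
        run: ./sage -t --all   # placeholder for the real test step
```

With `max-parallel: 2`, a 2×2 matrix still runs all 4 jobs, but never more than 2 concurrently, so a flood of PRs queues jobs instead of monopolizing runners.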
Implemented in #36694 |
883e05f to e349b00 (compare)
I agree and have updated #36694 to do this. |
Documentation preview for this PR (built with commit 3c2edf3; changes) is ready! 🎉 |
Currently the conda CI is the main cause of broken and delayed checks. Let's get this in.
It seems to be the established practice: #36647 (among others). Anyway, I'm okay with ignoring the failing tests using an external file. But this mechanism should only exclude the tests that are really failing, to a) prevent additional errors from being introduced and b) make it easier to work on these issues (it's essential to know which line actually breaks randomly in a source file of 1000+ lines).
Just focus on the failing test that made the file excluded. We may add a comment noting the failing test (or tests) when we add the file to the list of ignored files. It would make things complicated to try to exclude only the failing tests. Why make things complicated just for rare cases (which look almost imaginary to me)?
One sees this in the logs -- because with the approach here, the failures are not hidden but only marked as "[failed in baseline]". |
That clearly only works for issues that occur often or are reproducible. I was talking about the random failures that happen once every 100 runs or so (which is the order of magnitude of the remaining failures that we are currently seeing from time to time).
The random failure you are talking about seems unrelated to PRs. Why do you want to observe it in PR checks? To investigate such a random failure, I would just devise massive repetitive test runs with just one version of sage. No?
I also don't understand the comment in #36678 (comment) |
I don't want to observe them at all. It's just reality that these failing tests occur very rarely, which makes it hard to reproduce them locally. Heck, even now there are random test failures in the Build & Test workflow (e.g. https://github.com/sagemath/sage/actions/runs/6970122859/job/18967437726). Do you want to exclude the whole file from the tests because of this?
I actually did this. But that only revealed a couple of errors. If you have spare computation time that you don't need, I would very much appreciate it if you could run the conda tests in a loop and let me know about the failures.
Take a failure that occurs once every 100 runs. If you only know that the failure happens in a given file, you need to run all doctests in that file 100 times (I know, I know, statistics...), which could easily take a few hours depending on the file, just to reproduce it locally. And that assumes the error actually happens on the given system (some of the errors occur only on e.g. macOS). If you know the exact test that fails, you copy it to a test file, put a loop around it, and have it reproduced in a couple of minutes.
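The "copy the one test and put a loop around it" idea can be sketched with Python's standard `doctest` module (a stand-in example; the test source, name, and run count below are made up, not the actual flaky sage doctest):

```python
import doctest

# Hypothetical: the single doctest extracted from the large source file.
# In the real scenario this would be the one flaky test, not "1 + 1".
FLAKY_SOURCE = """
>>> 1 + 1
2
"""

def reproduce(runs=100):
    """Run the extracted doctest repeatedly.

    Returns the run index of the first failure, or None if the test
    never failed in the given number of runs.
    """
    test = doctest.DocTestParser().get_doctest(
        FLAKY_SOURCE, {}, "flaky", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    for i in range(runs):
        results = runner.run(test, clear_globs=True)
        if results.failed:
            return i
    return None
```

Looping one extracted test like this is what makes a 1-in-100 failure reproducible in minutes rather than re-running a whole file's doctests for hours.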
... then you open an Issue to report it. There is no need for adding a record of the exact failing test to the source code. |
Asked and answered already. |
That example was for the situation where I have 15 minutes of extra time and want to use them to fix a conda bug. With your solution here, I may already need 2 hours just to find out which test is actually failing. Another example: in #36372 (comment) it was very easy to recognize that another PR fixed a certain failure on conda, because the info was there directly in the code. With the coarse ignore rules proposed in this PR, that would not have been as easy. (That also brings up the question of who maintains the exclusion list, and how.)
I don't think this adds anything to the discussion here. I'll explain again:
|
and items in the exclusion list are commented with the GitHub Issue numbers...
LGTM.
Matthias, can you please just stop adding "positive review" labels on your PRs. Thanks! Labeling it as "r: invalid" and "s: needs review" as a motion to close this PR. There are clear shortcomings with the approach of this PR, as described above.
Nope, all your claims have been refuted. |
sagemathgh-36694: CI conda: On `pull_request`, only run 1 macOS job and 1 Linux job

As discussed in sagemath#36678 (comment), sagemath#36616 (comment).

On pushes to a tag, all 3 Linux and 3 macOS jobs are still run, as demonstrated in https://github.com/mkoeppe/sage/actions/runs/6829619271

After merging, the branch protection rule https://github.com/sagemath/sage/settings/branch_protection_rules/33965567 will need to be updated. It controls which workflows show as "Required" in the PR Checks.

### 📝 Checklist

- [x] The title is concise, informative, and self-explanatory.
- [ ] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation accordingly.

URL: sagemath#36694
Reported by: Matthias Köppe
Reviewer(s): Kwankyu Lee, Tobias Diez
This makes the CI conda pass by marking affected files as "known baseline failures", thus reducing the noise that this workflow makes on PRs (including #36666 (comment))
Less intrusive version of #36372.
In addition to the failure marked there, here we collect a number of further failures observed over the course of a week in https://github.com/sagemath/sage/actions/workflows/ci-conda.yml?query=is%3Acompleted
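As a sketch of what the baseline file driving this mechanism could look like (the file name, paths, and entries here are hypothetical; the actual format is whatever the doctest runner's baseline mechanism expects):

```json
{
    "sage/some/flaky_module.py": {"failed": true},
    "sage/other/module_with_timeouts.py": {"failed": true}
}
```

Tests in the listed files still run; their failures are reported as "[failed in baseline]" in the logs rather than turning the workflow red, so new failures elsewhere remain visible.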