Conversation

@mkoeppe (Contributor) commented Nov 8, 2023

This makes the CI conda pass by marking affected files as "known baseline failures", thus reducing the noise that this workflow makes on PRs (including #36666 (comment))

Less intrusive version of #36372.

In addition to the failure marked there, here we collect a number of further failures observed over the course of a week in https://github.com/sagemath/sage/actions/workflows/ci-conda.yml?query=is%3Acompleted
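
For illustration, a minimal sketch of how such a baseline list could be consumed, assuming a hypothetical JSON file `known-test-failures.json` that maps file names to known-failure entries (the names and format here are illustrative only, not the exact mechanism of the Sage doctest framework):

```python
# Hypothetical sketch: check doctest results against a baseline of known
# failures, so that known-bad files are reported but do not fail the run.
import json

def load_baseline(path="known-test-failures.json"):
    # Example content: {"sage/interfaces/gap.py": {"failed": true}}
    with open(path) as f:
        return json.load(f)

def evaluate(results, baseline):
    """Return an exit status, treating baseline failures as non-fatal.

    ``results`` maps file names to True (passed) or False (failed).
    """
    fatal = 0
    for filename, passed in results.items():
        if passed:
            continue
        if baseline.get(filename, {}).get("failed"):
            # Known baseline failure: visible in the log, green checkmark.
            print(f"{filename}: [failed in baseline]")
        else:
            print(f"{filename}: FAILED")
            fatal += 1
    return 1 if fatal else 0
```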

📝 Checklist

  • The title is concise, informative, and self-explanatory.
  • The description explains in detail what this PR is about.
  • I have linked a relevant issue or discussion.
  • I have created tests covering the changes.
  • I have updated the documentation accordingly.

⌛ Dependencies

@tobiasdiez (Contributor) left a comment

There are only a couple of test failures in these files. These ignore patterns are way too coarse.

(removing the blocker label for that reason, so as not to hide possible issues, and because it's not working anyway in this context)

@mkoeppe (Contributor, Author) commented Nov 9, 2023

> because it's not working anyway in this context

It's working as designed: green checkmarks, with failures marked as "failed in baseline":
https://github.com/sagemath/sage/actions/runs/6804955526/job/18503541137?pr=36678#step:11:10205

@tobiasdiez (Contributor) commented

So if another new test in the same file fails, the conda ci is red?

@mkoeppe (Contributor, Author) commented Nov 9, 2023

Perhaps you should clarify what you think the purpose of the ci-conda workflows is.
To my understanding, they are being run so that the major breakage that we have seen in the past (from unmonitored package upgrades in conda-forge) does not go unnoticed.
I don't think it's really the job of this workflow to duplicate the test coverage done by build.yml.

Marking these files as baseline failures is done so that on unrelated PRs, the red checkmarks on the conda workflows do not annoy developers.

@mkoeppe (Contributor, Author) commented Nov 9, 2023

And obviously work is needed to actually diagnose, report, and fix these failures. The approach of #36372 ("skip_conda") hides these failures too well.

@tobiasdiez (Contributor) commented

> Perhaps you should clarify what you think the purpose of the ci-conda workflows is. To my understanding, they are being run so that the major breakage that we have seen in the past (from unmonitored package upgrades in conda-forge) does not go unnoticed. I don't think it's really the job of this workflow to duplicate the test coverage done by build.yml.

They have the same purpose as the Build & Test workflow for people using the conda environment. So yes, they should duplicate the test coverage.

(In the long term, once the conda runs are a bit more stable and there are no more conda-specific issues, we can remove the Build & Test workflow).

@kwankyu (Collaborator) commented Nov 10, 2023

> They have the same purpose as the Build & Test workflow for people using the conda environment. So yes, they should duplicate the test coverage.

B&T is mainly for checking sage library changes, to help developers and reviewers. It is unlikely that a code change in the sage library works on one platform and fails on another. That is why B&T runs only on one specific, decent platform, and tests on all other platforms are done only for the weekly release.

I do not use conda (nor Linux). I regard conda as just another platform. So it looks strange to me that currently 6 B&T workflows for conda platforms run for each PR. I think there is already too much duplication.

Perhaps conda is special, and a separate B&T for conda is necessary for PRs (perhaps for upgrading packages). Then why not just two B&T workflows for conda (one for Ubuntu and one for macOS) for PRs, and more tests for the weekly release?

About #36372: I object to introducing conda-specific doctest tags into the sage library. That is another cluttering of the sage library, in addition to all the doctest tags for modularization. We decided to live with modularization. But no platform-specific doctest tags, please.

@mkoeppe (Contributor, Author) commented Nov 10, 2023

> it looks strange to me that currently 6 B&T workflows for conda platforms run for each PR. I think there is already too much duplication.

FWIW, I approved the change to run these jobs on all PRs in #36373. Tobias has already implemented a mitigation there at my request (see #36373 (comment)) by making it "fail-fast" and setting max-parallel to 2.

But I think at least the macOS jobs need reconsideration / additional adjustment. Given that we only have 5 macOS runners, launching 3 jobs (each taking 1 hour) for every PR is clearly too much.

For example, at the time that 10.2.rc1 was released (over an hour ago), at least 7 "ci-conda" workflows were already in the queue (https://github.com/sagemath/sage/actions/workflows/ci-conda.yml?query=is%3Aqueued), with at least 14 macOS jobs waiting for a runner. As a result, the CI-macos run of 10.2.rc1 (https://github.com/sagemath/sage/actions/runs/6827163283) as well as the cibuildwheel job for macOS (https://github.com/sagemath/sage/actions/runs/6827163257/job/18568938031) have not been able to start yet.

@mkoeppe (Contributor, Author) commented Nov 11, 2023

> I think at least the macOS jobs need reconsideration / additional adjustment. Given that we only have 5 macOS runners, launching 3 jobs (each taking 1 hour) for every PR is clearly too much.

Implemented in #36694

@vbraun force-pushed the develop branch 2 times, most recently from 883e05f to e349b00 on November 12, 2023 at 16:25
@mkoeppe (Contributor, Author) commented Nov 14, 2023

> Then why not just two B&T workflows for conda (one for Ubuntu and one for macOS) for PRs, and more tests for the weekly release?

I agree and have updated #36694 to do this.

Documentation preview for this PR (built with commit 3c2edf3; changes) is ready! 🎉

@kwankyu (Collaborator) commented Nov 23, 2023

Currently the conda CI is the main cause of broken and delayed checks.

Let's get this in.

@tobiasdiez (Contributor) commented Nov 25, 2023

>>> I would. We should. "B&T for conda" is specifically for conda. If there is a failure with it, I would exclude the file from being tested for other PRs until the failure is fixed.
>>
>> Really? You would exclude the whole file and not just mark the single failing test with "# random"?
>
> We are talking about some doctest failing only in the conda CI, with no problem with the doctest on the rest of the platforms. Why tag it with # random?

It seems to be the established practice: #36647 (among others). I agree with you that # random is not a good solution for this, but since there are currently no means to tag a doctest "this breaks on system xyz, but not others", it's the next best thing. To have a more conceptual solution, I introduced the conda-specific exception.

Anyway, I'm okay with ignoring the failing tests using an external file. But this mechanism should only exclude the tests that are really failing, in order to a) prevent additional errors from being introduced and b) make it easier to work on these issues (it's essential to know which line actually breaks randomly in a source file of 1000+ lines).

@kwankyu (Collaborator) commented Nov 25, 2023

>>>> I would. We should. "B&T for conda" is specifically for conda. If there is a failure with it, I would exclude the file from being tested for other PRs until the failure is fixed.
>>>
>>> Really? You would exclude the whole file and not just mark the single failing test with "# random"?
>>
>> We are talking about some doctest failing only in the conda CI, with no problem with the doctest on the rest of the platforms. Why tag it with # random?
>
> It seems to be the established practice: #36647 (among others). I agree with you that # random is not a good solution for this, but since there are currently no means to tag a doctest "this breaks on system xyz, but not others", it's the next best thing. To have a more conceptual solution, I introduced the conda-specific exception.
>
> Anyway, I'm okay with ignoring the failing tests using an external file. But this mechanism should only exclude the tests that are really failing, in order to a) prevent additional errors from being introduced and b) make it easier to work on these issues (it's essential to know which line actually breaks randomly in a source file of 1000+ lines).

Just focus on the failing test that got the file excluded. We may add a comment naming the failing test (or tests) when we add the file to the list of ignored files. Trying to exclude only the failing tests would make things complicated. Why make things complicated just for rare cases (which look almost imaginary to me)?

@mkoeppe (Contributor, Author) commented Nov 25, 2023

> b) to make it easier to work on these issues (it's essential to know which line actually breaks randomly in a source file of 1000+ lines).

One sees this in the logs -- because with the approach here, the failures are not hidden but only marked as "[failed in baseline]".

@tobiasdiez (Contributor) commented Nov 25, 2023

>> b) to make it easier to work on these issues (it's essential to know which line actually breaks randomly in a source file of 1000+ lines).
>
> One sees this in the logs -- because with the approach here, the failures are not hidden but only marked as "[failed in baseline]".

That clearly only works for issues that occur often or are reproducible. I was talking about the random failures that happen once every 100 runs or so (which is the order of magnitude of the remaining failures we are currently seeing from time to time).

@kwankyu (Collaborator) commented Nov 25, 2023

The random failures you are talking about seem unrelated to PRs. Why do you want to observe them in PR checks? To investigate such a random failure, I would just devise massive repetitive test runs with just one version of Sage. No?

@mkoeppe (Contributor, Author) commented Nov 25, 2023

I also don't understand the comment in #36678 (comment)

@tobiasdiez (Contributor) commented

> Why do you want to observe them in PR checks?

I don't want to observe them at all. It's just reality that these failing tests occur very rarely, which makes them hard to reproduce locally. Heck, even now there are random test failures of the Build & Test workflow (e.g. https://github.com/sagemath/sage/actions/runs/6970122859/job/18967437726). Do you want to exclude the whole file from the tests because of this?

> To investigate such a random failure, I would just devise massive repetitive test runs with just one version of Sage. No?

I actually did this, but it only revealed a couple of errors. If you have spare computation time, I would appreciate it very much if you could run the conda tests in a loop and let me know about the failures.

> I also don't understand the comment in #36678 (comment)

Take a failure that occurs once every 100 runs. If you only know that the failure happens in a given file, you need to run all the doctests in that file 100 times (I know, I know, statistics...), which could easily take a few hours depending on the file, just to reproduce it locally. And that assumes the error actually happens on the given system (some of the errors occur only on e.g. macOS). If you know the exact test that fails, you copy it to a test file, put a loop around it, and have it reproduced in a couple of minutes.
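
A minimal sketch of such a reproduction loop, assuming the failing doctest has been copied into a hypothetical file `flaky_test.py` (`sage -t` is Sage's doctest runner):

```python
# Rerun a single extracted doctest file until the rare failure reproduces.
# 'flaky_test.py' is a hypothetical file containing only the one doctest
# that was observed to fail in the conda CI.
import subprocess

for run in range(1, 101):
    result = subprocess.run(["sage", "-t", "flaky_test.py"],
                            capture_output=True, text=True)
    if result.returncode != 0:
        print(f"Reproduced the failure on run {run}:")
        print(result.stdout)
        break
else:
    print("No failure in 100 runs")
```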

@mkoeppe (Contributor, Author) commented Nov 26, 2023

> [...] If you know the exact test that fails [...]

... then you open an Issue to report it.

There is no need for adding a record of the exact failing test to the source code.

@mkoeppe (Contributor, Author) commented Nov 26, 2023

> even now there are random test failures of the Build & Test workflow [...]. Do you want to exclude the whole file from the tests because of this?

Asked and answered already.

@tobiasdiez (Contributor) commented

>> [...] If you know the exact test that fails [...]
>
> ... then you open an Issue to report it.
>
> There is no need for adding a record of the exact failing test to the source code.

That example was for the situation where I have 15 minutes of extra time and want to use them to fix a conda bug. With your solution here, I may already need 2 hours just to find out which test is actually failing.

Another example: in #36372 (comment), it was very easy to recognize that another PR fixed a certain failure on conda, because the info was right there in the code. With the coarse ignore rules proposed in this PR, that would not have been as easy. (That also brings up the question of who maintains the exclusion list, and how.)

@mkoeppe (Contributor, Author) commented Dec 3, 2023

I don't think this adds anything to the discussion here. I'll explain again:

  1. Test failure is noticed
  2. You record it in a GitHub Issue. There you include the necessary information so that it does not take 2 hours to reconstruct it.
  3. One adds stuff to the exclusion list to eliminate noise in the tests. It's not for humans to read, and definitely not for humans to use 2 hours of guessing time to reconstruct the failure.
  4. The exclusion list is maintained like everything else in Sage -- by PRs, not by specific people in charge.

@kwankyu (Collaborator) commented Dec 3, 2023

> 1. Test failure is noticed
> 2. You record it in a GitHub Issue. There you include the necessary information so that it does not take 2 hours to reconstruct it.
> 3. One adds stuff to the exclusion list to eliminate noise in the tests.

and items in the exclusion list are commented with the GitHub Issue numbers...
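
For example, an annotated entry could look like this (a sketch only; the format, file names, and issue numbers are placeholders, not the actual list from this PR):

```python
# Hypothetical exclusion list: each ignored file carries a comment pointing
# to the GitHub Issue that tracks the underlying failure.
KNOWN_BASELINE_FAILURES = {
    # https://github.com/sagemath/sage/issues/NNNNN (placeholder issue number)
    "src/sage/example_a.py": "failed",
    # https://github.com/sagemath/sage/issues/MMMMM (placeholder issue number)
    "src/sage/example_b.py": "failed",
}
```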

@kwankyu (Collaborator) left a comment

LGTM.

@tobiasdiez (Contributor) commented Dec 8, 2023

Matthias, can you please just stop adding "positive review" labels on your PRs. Thanks!

Labeling it as "r: invalid" and "s: needs review" as a motion to close this PR. There are clear shortcomings with the approach of this PR, as described above.

@mkoeppe (Contributor, Author) commented Dec 8, 2023

> There are clear shortcomings with the approach of this PR, as described above.

Nope, all your claims have been refuted.

@vbraun vbraun merged commit 0a69c2a into sagemath:develop Dec 19, 2023
tobiasdiez added a commit to tobiasdiez/sage that referenced this pull request Dec 19, 2023
vbraun pushed a commit to vbraun/sage that referenced this pull request Feb 19, 2024
sagemathgh-36694: CI conda: On `pull_request`, only run 1 macOS job and 1 Linux job

As discussed in sagemath#36678 (comment) and sagemath#36616 (comment).

On pushes to a tag, all 3 Linux and 3 macOS jobs are still run, as
demonstrated in https://github.com/mkoeppe/sage/actions/runs/6829619271

After merging, the branch protection rule
https://github.com/sagemath/sage/settings/branch_protection_rules/33965567
will need to be updated. It controls what workflows show as "Required" in
the PR Checks.

### 📝 Checklist

- [x] The title is concise, informative, and self-explanatory.
- [ ] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation accordingly.

URL: sagemath#36694
Reported by: Matthias Köppe
Reviewer(s): Kwankyu Lee, Tobias Diez
vbraun pushed a commit to vbraun/sage that referenced this pull request Feb 22, 2024
sagemathgh-36694: CI conda: On `pull_request`, only run 1 macOS job and 1 Linux job
Labels: c: scripts, disputed (PR is waiting for community vote, see https://groups.google.com/g/sage-devel/c/IgBYUJl33SQ), p: critical / 2