can-anyscale
Contributor

@can-anyscale can-anyscale commented Nov 17, 2023

Why are these changes needed?

As a follow-up to #40451, this change makes all the Ray release tests use pydantic>=2.5.0.

Additional changes:

  • The linter now uses mypy==1.7.0.
  • All Ray release tests, including workspace templates, now use deepspeed>=0.12.3.
  • This change fixes the serve_resnet_benchmark.py release test, which is broken on master.
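
The minimum versions listed above can be checked against a local environment. The following is a minimal sketch and not part of this PR: the helper names (`parse_version`, `meets_minimum`, `check_environment`) are hypothetical, and the simplified parser compares only the leading numeric components of each version, so it is a stand-in for a full version parser rather than a replacement for one. The exact `mypy==1.7.0` pin is left out because it is not a minimum bound.

```python
from __future__ import annotations

from importlib import metadata

# Minimum versions named in the PR description (assumed to apply to the
# release-test environment only).
MINIMUMS = {"pydantic": "2.5.0", "deepspeed": "0.12.3"}


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '2.5.0' into (2, 5, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if `installed` is at least `minimum` (numeric parts only)."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '2.5' and '2.5.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums: dict[str, str] = MINIMUMS) -> dict[str, str]:
    """Report, per package, whether the installed version meets its minimum."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        status = "ok" if meets_minimum(installed, minimum) else f"too old, need >={minimum}"
        report[name] = f"{installed} ({status})"
    return report
```

Calling `check_environment()` returns a dictionary keyed by package name, with the value "not installed" for any package that is absent from the environment.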

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Release tests
      • This change updates Ray release tests.

Signed-off-by: can <can@anyscale.com>
Collaborator

@aslonnie aslonnie left a comment

Is this ready for review? Could you add more context to the PR title and description?

@can-anyscale
Contributor Author

w00t, did I add you as a reviewer? This is for @shrekris-anyscale; he will pick this up.

Signed-off-by: Shreyas Krishnaswamy <shrekris@anyscale.com>
@shrekris-anyscale shrekris-anyscale changed the title from "byod change" to "Make Ray releases test use pydantic>=2.5.0" Nov 18, 2023
@shrekris-anyscale shrekris-anyscale changed the title from "Make Ray releases test use pydantic>=2.5.0" to "Make Ray releases tests use pydantic>=2.5.0" Nov 18, 2023
@shrekris-anyscale shrekris-anyscale requested review from kouroshHakha and removed request for shrekris-anyscale November 19, 2023 20:15
@shrekris-anyscale
Contributor

serve_resnet_benchmark passed with the change. Buildkite link

@shrekris-anyscale
Contributor

The only remaining stable/non-flaky/non-jailed test that failed is long_running_serve.aws (Buildkite link). This also failed on master last night due to an infra error (link). I don’t think this test uses pydantic in any unique way, and the error traceback was the same on master and on the PR, so I think the PR run also failed due to an infra error.

@shrekris-anyscale
Contributor

shrekris-anyscale commented Nov 20, 2023

Here's a link to the Buildkite run where the rest of the stable/non-flaky/non-jailed tests pass.

Note that this run happened before I made the serve_resnet_benchmark fix, so the serve_resnet_benchmark failure is no longer relevant.

@shrekris-anyscale shrekris-anyscale added the release-blocker (P0 Issue that blocks the release) label Nov 20, 2023
Contributor

@ericl ericl left a comment

Stamping for codeowners

@shrekris-anyscale
Contributor

shrekris-anyscale commented Nov 20, 2023

I kicked off a long_running_serve.aws run on the commit just before the pydantic version was updated: Buildkite link.

[Screenshot taken 2023-11-20, 11:00:06 AM]

Contributor

@kouroshHakha kouroshHakha left a comment

I checked the failures on RLlib. They are not related. Separately pinged @sven1977 on what needs to be done there.

Contributor

@c21 c21 left a comment

The jailed test failure on the Data side should be unrelated (out-of-disk error). Okay to merge.

@shrekris-anyscale
Contributor

The long_running_serve.aws smoke test ran successfully: Buildkite link

@shrekris-anyscale
Contributor

shrekris-anyscale commented Nov 21, 2023

I kicked off a long_running_serve.aws run on #41237 just before the pydantic version was updated: Buildkite link.

The run failed with the same error as this PR. It's unlikely that this PR is the root cause. There are no remaining test failures connected to this PR.

@architkulkarni This PR is ready to merge.
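
The triage reasoning used in this thread (a failure that also appears on a baseline run with the same error is unlikely to be caused by the PR) can be sketched as a small comparison. This is an illustrative sketch only: `classify_failures` is a hypothetical helper, not a Ray or Buildkite API, and the error strings below are placeholders rather than real logs.

```python
from __future__ import annotations


def classify_failures(
    pr_failures: dict[str, str],
    baseline_failures: dict[str, str],
) -> dict[str, str]:
    """Label each test that failed on the PR run.

    Both arguments map a test name to an error signature (for example, the
    last line of the traceback). A failure is "pre-existing" only when the
    same test failed on the baseline with the same signature.
    """
    labels = {}
    for test, error in pr_failures.items():
        if baseline_failures.get(test) == error:
            labels[test] = "pre-existing"
        else:
            labels[test] = "needs investigation"
    return labels


# Placeholder inputs: the test name comes from this thread, the error
# signature is invented for illustration.
pr_run = {"long_running_serve.aws": "<infra error>"}
baseline_run = {"long_running_serve.aws": "<infra error>"}
print(classify_failures(pr_run, baseline_run))
```

A matching signature is evidence, not proof, that a failure is unrelated, which is why the thread also reran the test on a commit from before the pydantic update.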

@edoakes
Collaborator

edoakes commented Nov 23, 2023

Thanks for helping out with this @can-anyscale

ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Nov 29, 2023
As a follow-up to ray-project#40451, this change makes all the Ray release tests use pydantic>=2.5.0.

Additional changes:

  • The linter now uses mypy==1.7.0.
  • All Ray release tests, including workspace templates, now use deepspeed>=0.12.3.
  • This change fixes the serve_resnet_benchmark.py release test, which is broken on master.

---------

Signed-off-by: can <can@anyscale.com>
Signed-off-by: Shreyas Krishnaswamy <shrekris@anyscale.com>
Co-authored-by: Shreyas Krishnaswamy <shrekris@anyscale.com>