Fix Failed Loaded Modules being considered Ready #763

HonakerM · 2024-09-04T18:39:09Z

What this PR does / why we need it:
There is a bug in Caikit: when a model fails to load, its LoadedModel instance is still reported as "loaded", because the loaded() check only considers whether the load future has finished, not whether the model was actually loaded. This PR fixes this by adding a parameter, require_instance, which ensures that a local instance is actually available.
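
For context, here is a minimal standalone sketch of the failure mode. It uses only the standard library's concurrent.futures, not Caikit's own classes, and failing_load is a hypothetical stand-in for a model load that raises. It shows that a future reports done() even when the work it ran failed, which is why a done() check alone cannot tell a loaded model from a failed one.

```python
from concurrent.futures import ThreadPoolExecutor


def failing_load():
    """Stand-in for a model load that fails (hypothetical, not Caikit code)."""
    raise RuntimeError("model failed to load")


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(failing_load)
# Leaving the `with` block waits for the submitted work to finish.

# done() is True for any finished future, including one that raised.
finished = future.done()
# exception() returns the raised exception instead of re-raising it.
failed = future.exception() is not None
print(finished, failed)
```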

Special notes for your reviewer:

If applicable:

this PR contains documentation
this PR contains unit tests
this PR has been tested for backwards compatibility

Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>

gabe-l-hart

One NIT on the docstring and a general question about why this is staying optional rather than just fixing the logic

caikit/runtime/model_management/loaded_model.py

gabe-l-hart · 2024-09-05T15:15:58Z

@@ -118,7 +118,11 @@ def model(self) -> ModuleBase:
        self.wait()
        return self._model

-    def loaded(self) -> bool:
+    def loaded(self, require_instance: bool = False) -> bool:


Why is the default False? I guess really, why make this an argument at all? It seems like the old logic was just wrong? Did you trace history to figure out why or self._caikit_model_future.done() was added?

I didn't trace the history, but logically it seems loaded() was meant to check whether the ModelFuture had completed at all, not just completed successfully. When I checked Caikit in general, it doesn't seem like we use loaded() outside the servicer, so I assumed it was used in a subclass or another implementation.

After looking through the history, it seems like it is only used in the info servicer. I'm game to change this to the default.

caikit/runtime/model_management/loaded_model.py

Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com> Signed-off-by: Michael Honaker <37811263+HonakerM@users.noreply.github.com>

gabe-l-hart

Ok, yeah, I looked through history and it looks like I added this all in one line, so that second clause was not added for a specific reason. The rest of the code seems to clearly indicate that self._model is only ever set to self._caikit_model_future.result(), so I would say let's just remove that second clause and not have this argument.
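
A rough sketch of what the simplified check could look like, assuming the shape described above. LoadedModelSketch is a hypothetical stand-in rather than the real LoadedModel, and its wait() body is an assumption based only on the statement that self._model is set from the future's result().

```python
from concurrent.futures import Future
from typing import Any, Optional


class LoadedModelSketch:
    """Simplified stand-in for LoadedModel; not the real Caikit class."""

    def __init__(self, future: Future):
        self._caikit_model_future = future
        self._model: Optional[Any] = None

    def wait(self) -> None:
        # Blocks until the load finishes and re-raises if the load failed,
        # so _model is only ever set from a successful result().
        if self._model is None:
            self._model = self._caikit_model_future.result()

    def model(self) -> Any:
        self.wait()
        return self._model

    def loaded(self) -> bool:
        # Only a locally held instance counts as loaded.
        return self._model is not None


# A load that failed: the future is done, but no instance is ever stored.
failed = Future()
failed.set_exception(RuntimeError("load failed"))
broken = LoadedModelSketch(failed)
try:
    broken.model()
except RuntimeError:
    pass  # the load error surfaces to the caller
print(broken.loaded())  # False

# A load that succeeded and whose result has been fetched.
ok = Future()
ok.set_result("fake-model")
good = LoadedModelSketch(ok)
good.model()
print(good.loaded())  # True
```

Note that in this sketch loaded() stays False until model() or wait() has been called, even after a successful load, since nothing else stores the instance.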

Also, linting and passing tests!

HonakerM · 2024-09-05T15:32:46Z

> so I would say let's just remove that second clause and not have this argument.

Sounds good!

> Also, linting and passing tests!

I just committed your changes 👀

Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>

gabe-l-hart · 2024-09-05T15:43:31Z

> I just committed your changes 👀

🤦 This is why I still don't really trust suggestions in GH. You can't see how long the lines are!

Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>

HonakerM · 2024-09-05T17:39:55Z

@gabe-l-hart it looks like a test is failing because it was using loaded() to check whether a job had completed, without calling model(). Any suggestions on a solution? Should I change the test, or add a new function completed() to check whether the job completed?

gabe-l-hart · 2024-09-05T17:42:02Z

> Any suggestions on a solution? Should I change the test or add a new function completed() to check if the job completed

Hmmm. Yeah, that sounds like the right option. I'd suggest calling it load_finished or something that makes it clear that it's just checking the status of the load operation and not the outcome.
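
To illustrate the split between the two checks, a small sketch follows. LoadStatusSketch and its attribute names are hypothetical, and load_finished is only the name floated above, not necessarily what was merged. The idea is that one method reports whether the load operation has ended and the other reports whether an instance is actually held.

```python
from concurrent.futures import Future
from typing import Any, Optional


class LoadStatusSketch:
    """Hypothetical stand-in for LoadedModel; names are illustrative."""

    def __init__(self, future: Future):
        self._future = future
        self._model: Optional[Any] = None

    def load_finished(self) -> bool:
        # True once the load operation has ended, whether it succeeded or failed.
        return self._future.done()

    def loaded(self) -> bool:
        # True only when a model instance is held locally.
        return self._model is not None

    def model(self) -> Any:
        if self._model is None:
            self._model = self._future.result()  # re-raises on a failed load
        return self._model


future = Future()
status = LoadStatusSketch(future)
before = (status.load_finished(), status.loaded())  # still loading

future.set_result("fake-model")
after_load = (status.load_finished(), status.loaded())  # finished, not fetched

status.model()
after_fetch = (status.load_finished(), status.loaded())  # finished and held
print(before, after_load, after_fetch)
```

A test that only needs to know the job has ended can poll load_finished() without calling model(), which is the case the failing test was relying on.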

Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>

gabe-l-hart

Ship it. Thanks Casey!

HonakerM requested review from gabe-l-hart, joerunde, prashantgupta24, gkumbhat, hickeyma, evaline-ju, alex-jw-brooks, tharapalanivel and aluu317 as code owners September 4, 2024 18:39

Fix Failed Loaded Modules being considered Ready

37a3037

Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>

HonakerM force-pushed the add_restrictive_load_check branch from 0533fa5 to 37a3037 Compare September 4, 2024 18:55

gabe-l-hart requested changes Sep 5, 2024

HonakerM and others added 2 commits September 5, 2024 11:20

Update caikit/runtime/model_management/loaded_model.py

4e21efb

Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com> Signed-off-by: Michael Honaker <37811263+HonakerM@users.noreply.github.com>

Update caikit/runtime/model_management/loaded_model.py

d78e2c1

Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com> Signed-off-by: Michael Honaker <37811263+HonakerM@users.noreply.github.com>

gabe-l-hart requested changes Sep 5, 2024

Remove new param

fa26039

Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>

Fix Formatting in Test

754ea9e

Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>

Address NewFunc changes

514491f

Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>

HonakerM requested a review from gabe-l-hart September 5, 2024 19:30

gabe-l-hart approved these changes Sep 5, 2024

gabe-l-hart merged commit d72d5bd into caikit:main Sep 5, 2024
8 checks passed
