-
Notifications
You must be signed in to change notification settings - Fork 70
Implement Prediction Job #751
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
6efcfae
to
815c02a
Compare
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
815c02a
to
c0419bb
Compare
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
e19b10d
to
875f66b
Compare
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
875f66b
to
ba034eb
Compare
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
394a6a4
to
6badeab
Compare
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
6badeab
to
1a10065
Compare
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
1a10065
to
38b3ced
Compare
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
c049d97
to
c38fdaa
Compare
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
e403758
to
854323e
Compare
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
b49d449
to
b4be1ee
Compare
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A pile of NITs still (mostly copy-pasta) and a few structural things I still think need to be addressed to avoid API breaking changes:
- Preserve kwarg names for
ModelFuture
to avoid breaking existing trainers - Don't switch 200 -> 202 (I kind of hate this one, but I think it's important)
- Let's see if we can solve the
py-to-proto
duplicate import issue too!
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
3416581
to
1db2be0
Compare
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
21c487f
to
e30724e
Compare
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
f682417
to
e8a31f8
Compare
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple more NITs and a small outstanding list:
- We need to figure out if changing
TrainingStatus
->JobStatus
in theTrainingStatusResponse
is an API breaking change job_id
->prediction_id
in data model to mirrortraining_id
for training jobs- Duplicate log code search (we should really put in a CI check for this)
@@ -43,7 +43,7 @@ class ModelPointer(DataObjectBase): | |||
@dataobject(RUNTIME_PACKAGE) | |||
class TrainingStatusResponse(DataObjectBase): | |||
training_id: Annotated[str, FieldNumber(1)] | |||
state: Annotated[TrainingStatus, FieldNumber(2)] | |||
state: Annotated[JobStatus, FieldNumber(2)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we figured out whether this will cause an overall API breaking change by changing the name of the protobuf?
class PredictionJobInfoRequest(DataObjectBase): | ||
"""DataModel to request information about a PredictionJob""" | ||
|
||
job_id: Annotated[str, FieldNumber(1)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're officially differentiating training_id
and prediction_id
elsewhere, I think maybe this should be named prediction_id
instead of job_id
? That would also mirror TrainingInfoRequest
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. I'll make the update. I originally thought of making it more generic but since training is specific predictions should be as well
self.notify_backends_with_context(model_id, context) | ||
|
||
# Retrieve the model from the model manager | ||
log.debug("<RUN52259029D>", "Retrieving model '%s'", model_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate log code! Let's make sure we do a grep for these. You could also just remove them on any messages debug
or lower
## Interface-agnostic entrypoints ## | ||
#################################### | ||
|
||
def get_job_result(self, job_id: str) -> DataObjectBase: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do job_id
-> prediction_id
in the interface, I think these should all change to <verb>_prediction_<noun>
Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com> Signed-off-by: Michael Honaker <37811263+HonakerM@users.noreply.github.com>
Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com> Signed-off-by: Michael Honaker <37811263+HonakerM@users.noreply.github.com>
Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com> Signed-off-by: Michael Honaker <37811263+HonakerM@users.noreply.github.com>
Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com> Signed-off-by: Michael Honaker <37811263+HonakerM@users.noreply.github.com>
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
0e8065e
to
c03ef83
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ONE. MORE. NIT!
caikit/core/model_manager.py
Outdated
Args: | ||
future_id (str): The ID string from the original prediction | ||
submission's ModelFuture | ||
|
||
Returns: | ||
model_future (JobPredictorFutureBase): The future handle | ||
to the job which holds the status of the in-flight prediction. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Args: | |
future_id (str): The ID string from the original prediction | |
submission's ModelFuture | |
Returns: | |
model_future (JobPredictorFutureBase): The future handle | |
to the job which holds the status of the in-flight prediction. | |
Args: | |
prediction_id (str): The ID string from the original prediction | |
submission's ModelFuture | |
Returns: | |
prediction_future (JobPredictorFutureBase): The future handle | |
to the job which holds the status of the in-flight prediction. |
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ship it! I agree that we should call this an API breaking change even though it's very minimal. We should spell out exactly what would break in the release notes.
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
What this PR does / why we need it:
This PR refactors the model trainer abstraction to be a more generic "job" abstraction. With this job abstraction we are able to implement both inferences and trainings in the background and poll their status/results.
This type of functionality has multiple use-cases inside Caikit with a few listed below:
This PR creates new http and grpc endpoints for the job predictions:
HTTP
GRPC:
Special notes for your reviewer:
If applicable: