[FEATURE]: Save inference output as a dataset which can be downloaded #405

@ividal

Description

Motivation

When the user runs an inference job (#339), an extra field is computed containing a model's predictions. The semantics of this field may differ depending on what inference was used for (e.g., prediction or ground-truth generation), but in all cases it is a model's prediction. It is up to the user to keep track of what the data was used for.

Inference should create a new dataset with one extra column containing the model's predictions. This dataset should be stored in object storage, and the database should be updated accordingly.
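A minimal sketch of that flow, as a pure-Python illustration: the helper names (`add_predictions`, `to_csv_bytes`) and the hard-coded `predictions` column are assumptions for this example, not this repo's actual API.

```python
import csv
import io

def add_predictions(rows, model_fn, column="predictions"):
    """Return new rows with a standardized predictions column appended."""
    out = []
    for row in rows:
        new_row = dict(row)
        new_row[column] = model_fn(row)  # model output is treated uniformly as a prediction
        out.append(new_row)
    return out

def to_csv_bytes(rows):
    """Serialize rows to CSV bytes, ready to upload to object storage."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue().encode("utf-8")

# Example with a trivial stand-in "model" that predicts the input text length.
dataset = [{"id": "1", "text": "hello"}, {"id": "2", "text": "world!"}]
enriched = add_predictions(dataset, lambda r: str(len(r["text"])))
payload = to_csv_bytes(enriched)
# The real job would then PUT `payload` to object storage and record the
# resulting URI in the database alongside the new dataset entry.
```

The upload and database write themselves are omitted here, since they depend on the project's storage client and schema.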

Alternatives

  • One idea is to name the prediction field differently depending on whether the job's goal was to generate ground truth or another form of prediction. However, this could lead to errors. To keep things simple, we could standardize the output of an inference job to always be "predictions", leaving any further differentiation to the user.
  • Having MLflow (Outline the introduction of MLFlow for experiment tracking #438) handle this entirely. Ideally, this work would be independent: whether or not there is a tracking service, it is reasonable for the inference job to store the output somewhere and return the URI. Which service tracks that URI is a separate concern.
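To illustrate the second bullet, here is a hypothetical result payload the job could return regardless of tracking backend. All field names are illustrative assumptions, not a settled schema.

```python
import json

def inference_job_result(dataset_uri: str, model_name: str) -> str:
    """Return a JSON payload describing where the enriched dataset was stored.

    Any tracking service (MLflow or otherwise) can consume this independently.
    """
    return json.dumps({
        "dataset_uri": dataset_uri,          # URI of the enriched dataset in object storage
        "model": model_name,                 # model used to produce the predictions
        "prediction_column": "predictions",  # standardized column name (first alternative above)
    })

# Example URI shown here is made up for illustration.
result = inference_job_result("s3://bucket/datasets/job-123.csv", "my-model")
```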

Labels

  • api — Changes which impact API/presentation layer
  • backend
  • enhancement — New feature or request
