[ENH] Added save/load functionality for Deep Learning models #3128
Conversation
```
@@ -46,6 +46,15 @@ class CNNClassifier(BaseDeepClassifier):

    Adapted from the implementation from Fawaz et. al
    https://github.com/hfawaz/dl-4-tsc/blob/master/classifiers/cnn.py

    Examples
```
misformatted, please fix
```diff
@@ -134,7 +143,11 @@ def _fit(self, X, y):

    check_random_state(self.random_state)
    self.input_shape = X.shape[1:]
-   self.model_ = self.build_model(self.input_shape, self.n_classes_)
+   self.model_ = (
```
Can you kindly explain this change? Should `self.model_` not always be `None`, and be replaced even if it is already fitted?
This is related to the `load` functionality. The thought here is that:

- the user first creates an object of the estimator class
- the user then calls the `load` function with a path; `BaseDeepClassifier` loads the model into `self.model_`
- the user calls `model.fit`
- within the `CNNClassifier.fit` function, we first check whether `self.model_` is already defined (that is, whether the user has already loaded in the network) or whether it is `None`. If it is `None`, then we build the network.
Sounds like an odd workflow, can you explain why that would be desired or intuitive?

The "normal" save/load workflow is as follows:

A. save
- A1. user builds model `model`
- A2. user saves `model` to a location `saved_model`, via `model.save`

B. load
- B1. user loads `model` from `saved_model`, via `load` (a loose function)
Your workflow part B takes three steps.
We should also distinguish, are you trying to save/load a model configuration (unfitted), or a model serialization (fitted)?
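The two-step workflow sketched above can be illustrated with `pickle`; the `Model` class and file name here are stand-ins, not sktime code:

```python
import os
import pickle
import tempfile


class Model:
    """Stand-in estimator with a save method, as in workflow A."""

    def __init__(self, k=3):
        self.k = k

    def fit(self):
        self.fitted_ = True
        return self

    def save(self, path):
        # A2: serialize the fitted estimator to `path`
        with open(path, "wb") as f:
            pickle.dump(self, f)


def load(path):
    """Loose load function (workflow B): returns a fitted estimator in one step."""
    with open(path, "rb") as f:
        return pickle.load(f)


# A: build and save; B: one-step load, no prior construction needed
path = os.path.join(tempfile.mkdtemp(), "saved_model")
Model(k=5).fit().save(path)
restored = load(path)
```

Note that the loose `load` never requires constructing an estimator first, which is the point of contention with the method-based design.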
I was thinking in terms of saving/loading a fitted model (model serialisation). Are there any use cases where there would be a point in saving a model configuration?
```
        """Save the model at the given path."""
        self.model_.save(path)

    def load(self, path):
```
Should `load` be a method of an estimator? I thought of it as a loose function that produces a fitted estimator.
Change requests:

- I think `load` should be a loose function, not a method of an estimator. Otherwise you would always have to construct an estimator before loading sth. Think of the user workflow.
- Is there a possibility to serialize the model in memory, if `path=None`? The return would be a reference.
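The second request could look like this, sketched with `pickle`; the signature and behaviour are an assumption for illustration, not the final sktime API:

```python
import pickle


def save(estimator, path=None):
    """Serialize `estimator`; if path is None, return the in-memory bytes instead."""
    serial = pickle.dumps(estimator)
    if path is None:
        return serial  # in-memory reference, no file written
    with open(path, "wb") as f:
        f.write(serial)
    return path


# round trip purely in memory, without touching the filesystem
blob = save([1, 2, 3])
assert pickle.loads(blob) == [1, 2, 3]
```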
Hey @fkiraly ,
If
Could you kindly spell out, with code, intended save/load workflows? We should cover:
I think it is really important that the design is intuitive and matches user expectations. I would also recommend we write a STEP, since this might become a central interface point.
Sure,

```python
cnn = CNNClassifier()
cnn.fit(X_train, y_train)
cnn.save("./saved_model")
```

```python
cnn = CNNClassifier(batch_size=32)  # if the user wants to mention a different parameter
cnn.load("./saved_model")
output = cnn.predict(X_test)
```

Here, I have written the functionality to save a fitted model. If, for some reason, a user were to do
```python
import pickle

vecm = VECM()
vecm.fit(train, fh=fh)
save_output = pickle.dumps(vecm)
```

```python
model = pickle.loads(save_output)
model.predict(fh=fh)
```

This save functionality is there via `pickle`. The idea behind `load_network` is:

```python
from sktime import load_network

cnn_network = load_network('path')
```

In this case, we are making the neural network directly accessible to the user (which we have not done yet in any model by design, either neural or statistical, from what I know. Let me know if otherwise). My implementation was with the idea that users should not directly interact with the model, but with an abstraction layer of class/object (either that is a wrapper, as in the case of
Thanks for elucidating! I still think the problems are as I have explained. The principles I'd like to invoke are:
```python
from sktime.classification import load_network

cnn = load_network(pathA)
```

```python
cnn = CNNClassifier()
cnn.load(pathA)
cnn.predict(X)
```

Here, the user does not need to know the exact configuration with which they had originally fitted the network. This functionality is ensured in the

b. If the user wants to load the model to train on more data points: this is useful if after some time you realise there are more data points you can train your model on, or if you want to do something related to transfer learning. In such a case:

```python
cnn = CNNClassifier()
cnn.load(path)
cnn.fit(train_X, train_y)
```

Here, the user does not need to know the exact configuration of the saved network either, because of lines 146-150:

```python
self.model_ = (
    self.build_model(self.input_shape, self.n_classes_)
    if self.model_ is None
    else self.model_
)
```

The parameters are only required/used in
No, I think
That's why we need to write our own save/load functions. The
I see - but then we have another problem, which I think is worse: the model will be loaded with the wrong parameters, and any inspection of it will return the wrong parameters. So, after
Hm, model updating is currently something the interface does not support. A call for

We could think about an
PS: you have conflicts with
Right, I was waiting for this discussion to resolve before solving the merge conflicts, but I will do that asap.
Aaaah, okay, that does resolve a lot of discussion we had. Maybe some day later then.
Okay, I understand. The main issue is when we generally save a class, it saves the state dictionary. This state dictionary contains all the class parameters (as well as current state of the object). The state is saved via pickle, generally. The issue is that
I do not understand this point, can you please give an example?
An update on this solution in line with what Franz has put forward in #3336 (writing it here since I do not think a STEP document is still required).

A simple solution would be to overwrite the proposed

```python
def __getstate__(self):
    out = self.__dict__.copy()
    del out['optimizer']
    del out['optimizer_']
    return out
```

This will be done in

In the proposed

```python
def save(self, path):
    with ZipFile(path) as zipfile:
        with zipfile.open("metadata", mode="w") as meta_file:
            meta_file.write(type(self))
        with zipfile.open("object", mode="w") as object:
            object.write(pickle.dumps(self))
        with zipfile.open("network", mode="w") as object:
            self.model_.save(path)
    return ZipFile(path)
```

And we will just need to reinitialise the particular optimizer when we are loading them again:

```python
def load_from_path(self, path):
    self.model = keras.load(path)
    self.optimizer = self.model.optimizer
```

Hence, we have all the information stored (and thus the user does not need to remember anything at all), and there is no information loss either.
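To illustrate why overriding `__getstate__` helps, here is a runnable toy in which an unpicklable "optimizer" attribute is dropped before pickling; the class and attributes are stand-ins, not the real keras objects:

```python
import pickle
import threading  # a lock is a convenient example of an unpicklable object


class NetWrapper:
    def __init__(self):
        self.history = [0.9, 0.5, 0.2]    # picklable training state
        self.optimizer = threading.Lock()  # stand-in for an unpicklable optimizer

    def __getstate__(self):
        # drop the unpicklable attribute, mirroring the proposal above;
        # without this, pickle.dumps would raise TypeError
        out = self.__dict__.copy()
        del out["optimizer"]
        return out


restored = pickle.loads(pickle.dumps(NetWrapper()))
# restored.history survives; restored has no 'optimizer' attribute
```

As in the `load_from_path` sketch above, the dropped attribute would then be reinitialised after loading rather than restored from the pickle.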
Some feedback:
If it is common to all DL estimators (not classifiers), should it not be done in
these are just files in the
Yes, good idea! Re overwriting
Well, I've changed the pickle tests to save/load, so those should pass. Further comments:
In the current design this would be done in
We are not technically removing the optimizer. In the comment I made, in the

I will be creating a STEP document with this in mind; I hope the other doubts will be resolved there.
I'm closing this in favour of #3425, where I have also added you as an author (for get/setstate).
Serialization and deserialization for sktime estimators

Serialization and deserialization interface for sktime estimators, supporting the following workflows:

1. serialization to in-memory object and deserialization
2. serialization to single file and deserialization

Initially proposed in sktime issues [#3128](sktime/sktime#3128) and [#3336](sktime/sktime#3336). The proposed design introduces:

* an object interface point for estimators, `save`
* a loose method `load`
* a file format specification `skt` for serialized estimator objects

which are counterparts and universal. Concrete class or intermediate class specific logic is implemented in:

* `save`, a class method
* `load_from_path` and `load_from_serial`, class methods called by `load`

while no class specific logic is present in `load`.
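A toy sketch of the two workflows in this design, with class-specific logic kept in class methods and a class-agnostic loose `load`; this is a simplified illustration under assumed names, not the real sktime implementation:

```python
import os
import pickle
import tempfile


class BaseObject:
    def save(self, path=None):
        serial = pickle.dumps(self)
        if path is None:
            return (type(self), serial)  # workflow 1: in-memory object
        with open(path, "wb") as f:      # workflow 2: single file
            f.write(serial)
        return path

    @classmethod
    def load_from_serial(cls, serial):
        # class-specific deserialization logic lives here
        return pickle.loads(serial)

    @classmethod
    def load_from_path(cls, path):
        with open(path, "rb") as f:
            return pickle.loads(f.read())


def load(obj):
    """Class-agnostic loose load: dispatches to the class methods above."""
    if isinstance(obj, tuple):  # in-memory (class, serial) pair
        cls, serial = obj
        return cls.load_from_serial(serial)
    return BaseObject.load_from_path(obj)  # otherwise treat as a file path


class Est(BaseObject):
    def __init__(self, alpha=0.1):
        self.alpha = alpha
```

Usage mirrors the two workflows: `load(Est(alpha=0.7).save())` round-trips in memory, while `load(est.save("est.skt"))` round-trips through a single file.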
Reference Issues/PRs
Fixes #3022
What does this implement/fix? Explain your changes.
Added functions in `BaseDeepClassifier` (as discussed with @GuzalBulatova) to allow users to save and load their keras models.

PR checklist
For all contributions