-
Notifications
You must be signed in to change notification settings - Fork 4.3k
Modified the model_serialization to have correct inputs and outputs #5018
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modified the model_serialization to have correct inputs and outputs #5018
Conversation
@@ -40,6 +40,36 @@ def is_exporting(): | |||
return exporting_to_onnx._local_data._is_exporting | |||
|
|||
|
|||
class TensorNames: | |||
BatchSizePlaceholder = "batch_size" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please follow python naming conventions for these. I understand that you probably wanted to copy-paste the C# code here, but it makes the resulting python code stick out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, will do.
Just to make sure, are we okay with this approach of having a class with static strings or is there a more Pythonic way to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could do it as a enum too (python enums aren't constrained to integer values) - see BufferKey
in buffer.py, but there's not much advantage to doing that here (and you'd have to write a bit more code to cast to strings).
You could also add methods to the class for combining the prefix + index, like TensorNames.get_visual_observation_name(i)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to have as much as possible the same code between C# and Python for the TensorNames, but I think adding helper methods would help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think adding helper methods would help.
It's right there in the name :)
Fee free to add to C# too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
recurrentOutputH = "recurrent_out_h" | ||
recurrentOutputC = "recurrent_out_c" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these used anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but I think there is a benefit in having symmetry with the C# code that does the same thing. I can remove them if someone disagrees strongly with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct me if I'm wrong but seems like in C# recurrentOutputH
and recurrentOutputC
are not used either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are. They are actually the only ones used. recurrentOutput is not used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TensorGenerator and TensorApplier are both using TensorNames.RecurrentOutput but I don't see where they're using recurrentOutputH? same for recurrent input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, the names used are recurrent_out_h
and recurrent_out_c
but these are not grabbed from the TensorNames but directly from Barracuda. But you are right, it does not make sense to keep them explicitly here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarifying, that makes sense to me now.
… removed the _h and _c placeholders
Proposed change(s)
Before :


And after :
This should not be a breaking change since the model uses the Barracuda implicit input and output names. This will be useful when Barracuda will allow us to use named tensors for LSTM.
EDIT:
Some context. Currently, Barracuda creates new names for inputs and outputs of the LSTM (see here https://github.com/Unity-Technologies/ml-agents/blob/main/com.unity.ml-agents/Runtime/Inference/TensorApplier.cs#L79) and we use these at inference. This means that neither RecurrentOutput nor RecurrentOutputH/C are used at all. When Barracuda switches to use the names of the ONNX model, we need to make sure that there are appropriately named inputs and outputs to the LSTM.
Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)
Slack message : https://unity.slack.com/archives/C8FECS6L9/p1614370112116900
JIRA : MLA-1807
Types of change(s)
Checklist
Other comments