It was not immediately clear to me from the documentation how the BatchNorm operator works. In the original paper, during the training stage the operator normalizes with the batch statistics and accumulates them as moving averages, which are then used in the inference stage.
The MXNet BatchNorm operator has a `use_global_stats` flag, which, if I understand correctly, adjusts that behavior: if set to true, it uses the global statistics stored in the auxiliary arrays, and if set to false, it uses batch statistics.
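For reference, here is a minimal sketch of the two configurations as I understand them (the shapes and the `bn` name are just for illustration):

```python
import mxnet as mx

data = mx.sym.Variable('data')

# use_global_stats=True: normalize with the stored moving_mean / moving_var
# auxiliary arrays rather than the current batch's statistics.
bn_global = mx.sym.BatchNorm(data=data, use_global_stats=True, name='bn')

# use_global_stats=False: normalize with the current batch's mean/variance.
bn_batch = mx.sym.BatchNorm(data=data, use_global_stats=False, name='bn')
```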
Now, my question is: how does setting `is_train` to True/False in the forward pass affect the behavior of BatchNorm, in combination with the `use_global_stats` flag? For example, would setting `use_global_stats` to False override the `is_train` flag and cause the operator to use batch statistics every time? Or does `is_train` have no effect on BatchNorm at all?
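Concretely, this is the combination I am unsure about (a sketch continuing the snippet above; the binding shapes are arbitrary):

```python
# Bind the use_global_stats=False variant and run it in both modes.
exe = bn_batch.simple_bind(mx.cpu(), data=(2, 3, 8, 8))
x = mx.nd.random.uniform(shape=(2, 3, 8, 8))

# Training-mode forward pass.
out_train = exe.forward(is_train=True, data=x)[0]

# Inference-mode forward pass: does this now use the moving averages even
# though use_global_stats=False, or does it still use the batch statistics?
out_infer = exe.forward(is_train=False, data=x)[0]
```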