This repository was archived by the owner on Nov 17, 2023. It is now read-only.

How does BatchNorm exactly work in MxNet? #3871

@ufukcbicici

Description


It was not immediately clear to me from the documentation how the BatchNorm operator works. In the original paper, during the training stage the operator normalizes with the batch statistics and accumulates them as moving averages, which are then used in the inference stage.
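To spell out what I mean by the paper's behavior, here is a minimal NumPy sketch (not MXNet code; the function names and momentum convention are my own illustration):

```python
import numpy as np

def batchnorm_train(x, gamma, beta, running_mean, running_var,
                    momentum=0.9, eps=1e-5):
    # Training mode: normalize with the current batch statistics...
    batch_mean = x.mean(axis=0)
    batch_var = x.var(axis=0)
    x_hat = (x - batch_mean) / np.sqrt(batch_var + eps)
    out = gamma * x_hat + beta
    # ...and fold them into the moving averages kept for inference.
    running_mean = momentum * running_mean + (1 - momentum) * batch_mean
    running_var = momentum * running_var + (1 - momentum) * batch_var
    return out, running_mean, running_var

def batchnorm_infer(x, gamma, beta, running_mean, running_var, eps=1e-5):
    # Inference mode: normalize with the accumulated global statistics.
    x_hat = (x - running_mean) / np.sqrt(running_var + eps)
    return gamma * x_hat + beta
```

So at train time the output depends only on the current batch, while the moving averages are a side effect consumed later at inference time.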

The MxNet BatchNorm operator has a "use_global_stats" flag, which, if I understand correctly, adjusts that behavior: if set to true, the operator uses the global statistics from the auxiliary arrays, and if set to false, it uses the batch statistics.

Now, my question is: how does setting "is_train" to True/False in the forward pass affect the behavior of BatchNorm, combined with the use_global_stats flag? For example, does setting use_global_stats to False override the is_train flag and cause the operator to use batch statistics every time? Or does "is_train" have no effect on BatchNorm at all?
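To make the question concrete, here is one plausible dispatch in NumPy. This is purely a hypothesis about the interplay of the two flags (whether it matches MXNet's actual implementation is exactly what I am asking):

```python
import numpy as np

def batchnorm_forward(x, gamma, beta, moving_mean, moving_var,
                      is_train, use_global_stats, eps=1e-5):
    """Hypothetical rule: use_global_stats takes precedence over is_train."""
    if use_global_stats or not is_train:
        # Global statistics from the auxiliary arrays.
        mean, var = moving_mean, moving_var
    else:
        # Current batch statistics (training without use_global_stats).
        mean, var = x.mean(axis=0), x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta
```

Under this reading, is_train=False would always fall back to the global statistics, and use_global_stats=True would force them even during training; the alternative reading, where use_global_stats=False forces batch statistics regardless of is_train, would give different inference-time outputs.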
