PPO reward normalization works only for default gamma #203

@Howuhh


Problem Description

The current implementation of continuous-action PPO uses gym.wrappers.NormalizeReward with its default gamma value. For any gamma other than the default of 0.99, this normalization is incorrect, because the wrapper's running discounted-return statistics are computed with a discount factor that does not match the one used for advantage estimation.

env = gym.wrappers.NormalizeReward(env)

Possible Solution

Very easy: just pass gamma=args.gamma to the normalization wrapper, i.e. gym.wrappers.NormalizeReward(env, gamma=args.gamma).
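To see why the wrapper's gamma matters, here is a minimal self-contained sketch of the core logic NormalizeReward uses (class names RewardNormalizer and RunningMeanStd are illustrative, not gym's actual internals): each reward is divided by the standard deviation of a gamma-discounted running return, so the scaling depends directly on gamma.

```python
import numpy as np

class RunningMeanStd:
    # Running mean/variance with parallel-variance (Chan et al.) updates.
    def __init__(self):
        self.mean, self.var, self.count = 0.0, 1.0, 1e-4

    def update(self, x):
        batch_mean, batch_var, batch_count = np.mean(x), np.var(x), len(x)
        delta = batch_mean - self.mean
        tot = self.count + batch_count
        self.mean += delta * batch_count / tot
        m_a = self.var * self.count
        m_b = batch_var * batch_count
        self.var = (m_a + m_b + delta ** 2 * self.count * batch_count / tot) / tot
        self.count = tot

class RewardNormalizer:
    # Sketch of the wrapper's idea: maintain a discounted running return
    # ret = ret * gamma + reward, and scale each reward by the running
    # std of that return. A different gamma therefore yields different
    # normalization statistics.
    def __init__(self, gamma=0.99, epsilon=1e-8):
        self.gamma, self.epsilon = gamma, epsilon
        self.ret = 0.0
        self.rms = RunningMeanStd()

    def normalize(self, reward):
        self.ret = self.ret * self.gamma + reward
        self.rms.update(np.array([self.ret]))
        return reward / np.sqrt(self.rms.var + self.epsilon)
```

Feeding the same reward stream through two normalizers with different gammas produces different scaled rewards after the first step, which is exactly the mismatch the issue describes when args.gamma != 0.99.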
