Add Learning Rate Annealing to PPO #22

@awjuliani

The current implementation of PPO uses a fixed learning rate for the entire training process. This can produce degenerate models later in training, when a smaller learning rate is needed.

The learning rate should be annealed to 0 over the course of training.
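
A minimal sketch of what this could look like, assuming a linear schedule (the schedule shape is not specified here, so linear is only one option). The function name `annealed_learning_rate` and the parameters `initial_lr`, `step`, and `max_steps` are hypothetical and not taken from the existing code:

```python
def annealed_learning_rate(initial_lr: float, step: int, max_steps: int) -> float:
    """Linearly decay initial_lr to 0 as step goes from 0 to max_steps.

    Hypothetical helper: names and the linear shape are assumptions.
    """
    if max_steps <= 0:
        raise ValueError("max_steps must be positive")
    # Clamp the step so the rate never goes negative past max_steps
    # and never exceeds initial_lr for negative steps.
    clamped_step = min(max(step, 0), max_steps)
    progress = clamped_step / max_steps
    return initial_lr * (1.0 - progress)


if __name__ == "__main__":
    # Print the rate at a few points in a hypothetical 1000-step run.
    for step in (0, 250, 500, 750, 1000):
        print(step, annealed_learning_rate(3e-4, step, 1000))
```

The returned value would be passed to the optimizer at each update in place of the fixed rate. A polynomial or exponential decay would slot in the same way if linear annealing turns out to drop the rate too quickly.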
