Closed
Description
As CleanRL matures, it's time to rethink its future. With CleanRL 1.0, we hope to further improve the documentation and design better contribution guidelines. This issue tracks a few desired items for CleanRL 1.0.
1.0
- Refactor the `argparse` parameters to have `learning_rate` and `total_timesteps` move down to algorithm-specific arguments. #116
- Table of implemented algorithms and links that refer to the paper. (see https://docs.cleanrl.dev/rl-algorithms/overview/)
- Refactor documentation #121
- Re-think Open RL Benchmark. #123
- Friendlier `CONTRIBUTING.md` #117
- Seed issue with `dqn.py` and others #171
- TD3 should also log losses for `qf2` #172
- Also log `episodic_length` for non-PPO scripts. #168
- Investigate `nn.utils.clip_grad_norm_` for DQN, DDPG, and TD3 #148
- Auto-upgrade syntax via `pyupgrade` #158
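
To illustrate the `argparse` item (#116), here is a minimal sketch of what grouping `learning_rate` and `total_timesteps` under algorithm-specific arguments could look like. The group titles, the `--exp-name` and `--seed` flags, and all default values are assumptions made for illustration, not the layout the refactor actually settled on.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser()

    # Common arguments shared by every script (hypothetical grouping).
    common = parser.add_argument_group("common arguments")
    common.add_argument("--exp-name", type=str, default="example",
                        help="name of this experiment")
    common.add_argument("--seed", type=int, default=1,
                        help="seed of the experiment")

    # Algorithm-specific arguments: learning_rate and total_timesteps
    # live here instead of among the common arguments.
    algo = parser.add_argument_group("algorithm-specific arguments")
    algo.add_argument("--learning-rate", type=float, default=2.5e-4,
                      help="learning rate of the optimizer")
    algo.add_argument("--total-timesteps", type=int, default=500000,
                      help="total timesteps of the experiment")

    return parser.parse_args(argv)


if __name__ == "__main__":
    # Passing an explicit list keeps the example independent of sys.argv.
    args = parse_args(["--learning-rate", "1e-3"])
    print(args.learning_rate, args.total_timesteps)
```

Argument groups only change how `--help` is organized; the parsed namespace is still flat, so existing code that reads `args.learning_rate` would keep working.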
1.1+
- Hyperparameter optimization #228
- Multi-objective hyperparameter optimization #265
- Decide whether to adopt TensorBoard's native hyperparameter recording #184
- Add pre-commit for Markdown formatting #183
- Add `rnd_ppo.py` documentation and refactor #127
- DDPG/TD3 target_actor output clip #196
- PPO improvements #206
- JAX Integration with CleanRL #218
- Upgrade gym version to 0.26.1 #263
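
The "DDPG/TD3 target_actor output clip" item (#196) is not detailed in this issue. Assuming it refers to keeping the target actor's action inside the environment's action bounds, the idea can be sketched in plain Python as below. The function names, the list-based action representation, and the default noise values are hypothetical; a real script would operate on tensors.

```python
import random


def clip(x, low, high):
    """Clamp a scalar to the closed interval [low, high]."""
    return max(low, min(high, x))


def clipped_target_action(target_action, action_low, action_high,
                          policy_noise=0.2, noise_clip=0.5, rng=random):
    """Add clipped Gaussian noise to each action dimension (TD3-style
    target policy smoothing), then clip the result to the action bounds.

    With policy_noise=0.0 no noise is added, which corresponds to simply
    clipping a DDPG-style target action.
    """
    clipped = []
    for a, low, high in zip(target_action, action_low, action_high):
        noise = clip(rng.gauss(0.0, policy_noise), -noise_clip, noise_clip)
        clipped.append(clip(a + noise, low, high))
    return clipped


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    action = clipped_target_action([0.95, -0.95, 0.0],
                                   [-1.0, -1.0, -1.0],
                                   [1.0, 1.0, 1.0], rng=rng)
    print(action)
```

Whatever noise is drawn, every returned component stays within the given bounds, which is the property the clip is meant to guarantee.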