vwxyzjn
Owner

@vwxyzjn vwxyzjn commented May 29, 2022

Description

Follow up to #144.

Types of changes

  • New feature

Checklist:

  • I've read the CONTRIBUTION guide (required).
  • I have ensured pre-commit run --all-files passes (required).
  • I have updated the documentation and previewed the changes via mkdocs serve.
  • I have updated the tests accordingly (if applicable).

If you are adding new algorithms or your change could result in performance difference, you may need to (re-)run tracked experiments. See #137 as an example PR.

  • I have contacted @vwxyzjn to obtain access to the openrlbenchmark W&B team (required).
  • I have tracked applicable experiments in openrlbenchmark/cleanrl with --capture-video flag toggled on (required).
  • I have added additional documentation and previewed the changes via mkdocs serve.
    • I have explained note-worthy implementation details.
    • I have explained the logged metrics.
    • I have added links to the original paper and related papers (if applicable).
    • I have added links to the PR related to the algorithm.
    • I have created a table comparing my results against those from reputable sources (i.e., the original paper or other reference implementation).
    • I have added the learning curves (in PNG format with width=500 and height=300).
    • I have added links to the tracked experiments.
  • I have updated the tests accordingly (if applicable).

@vwxyzjn vwxyzjn marked this pull request as ready for review May 31, 2022 13:52
@vwxyzjn vwxyzjn requested review from dipamc, dosssman and yooceii May 31, 2022 13:52

vwxyzjn commented May 31, 2022

@benblack769 @araffin @Miffyli @jkterry1 @kcorder would you mind helping review this PR? In particular, could you help review the following:

Thanks!

kcorder commented May 31, 2022

This all looks good to me!

Just some things I think we should try out:

  • we have a NoopReset wrapper for PZ envs
  • Jordan/Ben previously found that the InvertColor agent indicator worked better than the normal agent indicator
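
For readers unfamiliar with the idea, a no-op reset can be sketched roughly as follows. This is only an illustration, not the wrapper referred to above: `ToyParallelEnv`, `NoopResetSketch`, the `noop_max` default, and the assumption that action `0` is the no-op are all hypothetical, and the toy environment uses a simplified `reset()`/`step()` interface rather than any real library's API.

```python
import random


class ToyParallelEnv:
    """Hypothetical stand-in for a two-player parallel environment."""

    agents = ["first_0", "second_0"]

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return {agent: self.t for agent in self.agents}

    def step(self, actions):
        # The toy observation is just the step counter.
        self.t += 1
        obs = {agent: self.t for agent in self.agents}
        rewards = {agent: 0.0 for agent in self.agents}
        dones = {agent: False for agent in self.agents}
        return obs, rewards, dones, {}


class NoopResetSketch:
    """On reset, advance the wrapped env by a random number of no-op steps."""

    def __init__(self, env, noop_max=30, noop_action=0, seed=None):
        self.env = env
        self.noop_max = noop_max
        self.noop_action = noop_action  # assumption: action 0 means "do nothing"
        self.rng = random.Random(seed)

    def reset(self):
        obs = self.env.reset()
        # Take between 1 and noop_max no-op steps for every agent at once.
        for _ in range(self.rng.randint(1, self.noop_max)):
            actions = {agent: self.noop_action for agent in self.env.agents}
            obs, _, dones, _ = self.env.step(actions)
            if any(dones.values()):
                # Start over if the episode ended during the no-ops.
                obs = self.env.reset()
        return obs

    def step(self, actions):
        return self.env.step(actions)
```

The point of the random number of no-ops is to vary the starting state between episodes, so the agents do not always begin from the exact same frame.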

vwxyzjn commented May 31, 2022

Thank you @kcorder, I’d be happy to try out the no-op reset wrapper. Is the InvertColor agent indicator in SuperSuit? Also see https://wandb.ai/costa-huang/cleanRL/reports/MA-ALE--VmlldzoxNzAzMDQx#invert-color-indicator, which shows the performance of the InvertColor indicator: at least in Pong, it does not perform as well as the naive indicator.
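
To make the two indicator styles concrete, here is a rough NumPy sketch. It rests on assumptions that this thread does not confirm: that the "naive" indicator appends one constant channel per agent, and that the "invert color" indicator flips pixel values for every agent except the first. Both function names are hypothetical, and neither is taken from SuperSuit or MA-ALE.

```python
import numpy as np


def naive_indicator(frame, agent_index, num_agents):
    """Append one constant channel per agent; the acting agent's channel is 255."""
    height, width, _ = frame.shape
    indicator = np.zeros((height, width, num_agents), dtype=frame.dtype)
    indicator[:, :, agent_index] = 255
    return np.concatenate([frame, indicator], axis=-1)


def invert_color_indicator(frame, agent_index):
    """Leave agent 0's frame unchanged; invert pixel values for every other agent."""
    if agent_index == 0:
        return frame
    return 255 - frame  # assumes a uint8 frame with values in [0, 255]
```

Under these assumptions, the naive variant grows the observation by `num_agents` channels, while the inverted variant keeps the observation shape unchanged and encodes the agent identity in the pixel values themselves.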

kcorder commented May 31, 2022

Oh interesting, good to know about the agent indicator - I hadn't tried it myself.

The NoopReset is here: https://github.com/jkterry1/MA-ALE2/blob/74f562d088c795e7fa4fdeba494f2573ac9c6c7e/env_utils.py#L324-L345

We've been using this InvertColorAgentIndicator - there was a bug fix there since the original code actually

vwxyzjn commented Jun 1, 2022

@kcorder thanks for the helpful pointers. While it would be interesting to try this preprocessing, I would like to defer this as future work since we are aiming for a 1.0.0 release soon.

@vwxyzjn vwxyzjn merged commit e547cc7 into master Jun 1, 2022