[Proposal] New info API for vectorized environments #2657

@vwxyzjn

Motivation

The current vector env API returns one info dict per environment. So if I have 5 envs, the info could look like
[{}, {}, {}, {}, {"truncated": True}]

While this API is intuitive, it is not a friendly interface for high-throughput environments such as brax (also see google/brax#164) and envpool. These environments prefer a different paradigm: instead of 5 dictionaries, a single dictionary whose values are arrays of length 5.

{"truncated":[False, False, False, False, True]}
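To make the difference concrete, here is a minimal sketch of converting the per-env list into the single-dict layout. The helper name `stack_infos` and its `fill` parameter are hypothetical, not part of Gym, brax, or envpool, and plain Python lists stand in for arrays.

```python
from typing import Any


def stack_infos(infos: list[dict[str, Any]], fill: Any = None) -> dict[str, list[Any]]:
    """Turn a per-env list of info dicts into one dict of per-env lists.

    Hypothetical helper for illustration only. Envs that did not report
    a key get `fill` in their slot.
    """
    # Collect every key that appears in any sub-env, preserving first-seen order.
    keys: list[str] = []
    for info in infos:
        for key in info:
            if key not in keys:
                keys.append(key)
    return {key: [info.get(key, fill) for info in infos] for key in keys}


infos = [{}, {}, {}, {}, {"truncated": True}]
stacked = stack_infos(infos, fill=False)
print(stacked)
```

Note that a fill value has to be chosen for envs that did not report a key, which is where the "sparse" keys become awkward in the single-dict layout.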

Pros and Cons

Brax and envpool's approach feels more ergonomic for high-throughput envs, whereas Gym's approach may be more efficient for "sparse" info keys that don't appear often (I could be wrong about this).

Proposal

I propose to change the info API for vectorized environments to adopt brax's current design for info. This might also have implications for wrappers and other APIs. For example, the current vectorized environment API would return info like this (with the RecordEpisodeStatistics wrapper applied):

[
    {
        "episode": {
            "r": 75.0,
            "l": 75,
            "t": 36.770559,
        },
        "terminal_observation": array([0.2097086, -0.5355723, -0.21343598, -0.11173592], dtype=float32),
    },
    {},
    {},
    {},
    {},
]

I am not sure what's the best way to handle this... Maybe something like the following?

{
    "episode": {
        "r": [75.0, None, None, None, None],
        "l": [75, None, None, None, None],
        "t": [36.770559, None, None, None, None],
    },
    "terminal_observation": [
        array([0.2097086, -0.5355723, -0.21343598, -0.11173592], dtype=float32),
        None,
        None,
        None,
        None,
    ]
}
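As a rough sketch of how the nested layout above could be produced from the current per-env list, assuming `None` as the placeholder for envs that report nothing: the names `merge_infos` and `_insert` are hypothetical, and plain lists are used instead of NumPy arrays to keep the example self-contained.

```python
from typing import Any


def _insert(target: dict, info: dict, index: int, num_envs: int) -> None:
    """Write one sub-env's info into the merged dict at position `index`."""
    for key, value in info.items():
        if isinstance(value, dict):
            # Nested dicts such as "episode" are merged recursively.
            _insert(target.setdefault(key, {}), value, index, num_envs)
        else:
            # Leaf values go into a per-env list pre-filled with None.
            target.setdefault(key, [None] * num_envs)[index] = value


def merge_infos(infos: list[dict[str, Any]]) -> dict[str, Any]:
    """Merge a list of per-env info dicts into one dict of per-env lists.

    Assumes a given key is either always a dict or always a leaf value.
    """
    merged: dict[str, Any] = {}
    for index, info in enumerate(infos):
        _insert(merged, info, index, len(infos))
    return merged


infos = [
    {"episode": {"r": 75.0, "l": 75, "t": 36.770559}, "terminal_observation": [0.2, -0.5]},
    {},
    {},
    {},
    {},
]
merged = merge_infos(infos)
print(merged["episode"]["r"])
```

One design question this leaves open is whether `None` padding is acceptable, since it forces object-typed containers; a separate boolean mask per key would be another option.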

Alternative 1

Alternatively, it's possible to do this in a very hacky way: brax or envpool could always override the first sub-env's info dict to provide their desired info, so something like:

[{"truncated":[False, False, False, False, True]}, {}, {}, {}, {}]

This way, we don't have to change the current API. However, it's extremely hacky and unintuitive.

Alternative 2

Alternatively, maybe info should be reserved for something sparse like episode and terminal_observation, and more frequent infos should be accessed via attributes, i.e., envs.get_attr(name) (enabled with #1600). So something like

root_joints = brax_envs.get_attr("root_joints")
# instead of 
# root_joints = info["root_joints"]
ale_lives = envpool_envs.get_attr("ale_lives")
# instead of 
# ale_lives = info["ale_lives"]
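For illustration, a toy stand-in showing the access pattern this alternative implies. `ToySubEnv` and `ToyVectorEnv` are hypothetical classes written for this sketch, not the Gym, brax, or envpool implementations; only the `get_attr(name)` call shape is taken from the snippet above.

```python
class ToySubEnv:
    """Hypothetical sub-environment exposing a frequently read attribute."""

    def __init__(self, lives: int) -> None:
        self.ale_lives = lives


class ToyVectorEnv:
    """Hypothetical vector env that gathers an attribute from every sub-env."""

    def __init__(self, sub_envs) -> None:
        self.sub_envs = list(sub_envs)

    def get_attr(self, name: str) -> tuple:
        # One value per sub-env, in sub-env order.
        return tuple(getattr(env, name) for env in self.sub_envs)


envs = ToyVectorEnv(ToySubEnv(lives) for lives in (3, 3, 2))
ale_lives = envs.get_attr("ale_lives")
print(ale_lives)
```

Under this split, info would carry only the sparse, per-episode keys, and dense per-step values would be read on demand.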

What are your thoughts, @RedTachyon, @JesseFarebro, @tristandeleu, @XuehaiPan, @Trinkle23897, @erikfrey, @araffin, @Miffyli?

Checklist

  • I have checked that there is no similar issue in the repo (required)
