Add Text to Spaces #2908

ryanrudes · 2022-06-17T21:19:01Z

Description

This adds text spaces to Gym.

Type of change

New feature (non-breaking change which adds functionality)
This change requires a documentation update

Checklist:

I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

pseudo-rnd-thoughts · 2022-06-19T20:42:03Z

Thanks for your PR, are there any particular environments that we wish to use this space with?
I haven't done any text-based ML in a long time, but could the discrete space not be used as a link to the embedding either on observations or actions?
You have a couple of todos in your Text class, what is your planning in finishing them?

I can tell before you run the CI that the pre-commit section will fail.
Could add a commit for the pre-commit, read here for more details

RedTachyon · 2022-06-19T22:00:03Z

The idea is kinda interesting, but this feels like something that should start out with a discussion that would guide the implementation, instead of dropping a PR and then we have to simultaneously judge the idea and the implementation.

So before even approaching a code review, can you do a proper pitch of this idea? With the motivation, example uses, and the proposed structure.

ryanrudes · 2022-06-20T00:42:33Z

Sure, @RedTachyon.

I was hoping the Text space to be useful for environments with textual prompts guiding the agent. Take for instance an agent that uses an autoregressive language model to interpret prompts such as "pick up the red block", "rotate wheel clockwise 90 degrees", etc. This would allow for gym to support research at the intersection between RL and NLP/U.

In use, one thing that concerns me is the sample() method, particularly how one might want the space to sample random textual prompts in a certain situation. The implementation I provided does not contain this, but it is mentioned as a TODO, that is, providing the ability to instantiate the class with a regular expression to allow for coherent sampling. Let's say we have an environment featuring a robot arm that will be directed to identify and grasp a particular block on a countertop based on its color. Prompts will be in the format of "lift the red block". But perhaps we might want to promote a bit of generalization and vary the target block color in addition to the language used in the instruction. Instead of "lift", we might want to select a verb randomly from "pick", "raise", "grasp", etc. And we may want to vary the color of the target block between "orange", "yellow", "blue", etc. We might consider building something to allow something similar to the below example:

from gym.spaces import Text, Box
from gym import Env

import numpy as np
import random

class PickAndLift(Env):
    VERBS = ["lift", "pick", "raise", "grasp"]
    COLORS = ["red", "orange", "yellow", "green", "blue", "purple", "pink", "white", "black"]

    def __init__(self):
        self.observation_space = Text(f"{verb} the {color} block")
        self.action_space = Box(low = -np.inf, high = np.inf, shape = (7,), dtype = np.float32)

    def reset(self) -> str:
        verb = random.choice(self.VERBS)
        color = random.choice(self.COLORS)
        return f"{verb} the {color} block".format(verb = verb, color = color)

    ...

I hope this provides some clarity in terms of the intentions/motivations for adding this text modality to gym.spaces

hannahrawr · 2022-06-20T18:58:04Z

I think the motivation behind this is quite sound, though I think there can be some improvements before this gets merged

regex should definitely be integrated into the space, with at least just a default match-all supported ((?s).* should suffice). Though, one concern is that this might be inconsistent with the charset if we decide to keep both attributes.
- particularly, it is not clear to me how we might keep both regex and charset and ensure that they are mutually consistent.
- regex should be able to replace most of the use of charset or _charlist, but sampling might be difficult. There are external libraries on how to perform random sampling of strings from regex (https://github.com/asciimoo/exrex), but it would be a fairly big decision to add this just to support this space (not to mention the differing license).
Maybe a default toy environment could be implemented, so that there is an example to provide intuition to others on how to use it. The PickAndLift example shown looks well thought out enough to expand on.

hannahrawr

Some additional minor comments on the code itself

hannahrawr · 2022-06-20T19:01:50Z

gym/spaces/text.py

+
+    def __init__(
+        self,
+        max_length: int,


It would probably be best if max_length could be -1, and that would then bypass the check. This would help if the strings themselves could be arbitrarily long (e.g. if we're passing paragraphs of text for whatever reason), so users won't need to specify an arbitrary upper bound

I can't imagine a situation in which a user would genuinely want the sampler to output random combinations of characters between certain length bounds. So, if we constrain sampling to a regular expression, are bounds on the length even necessary at all?

We could remove max_length and leave it to the regex, but if the regex itself does not have a defined length bound, it could match an infinite pattern, and the length of the sampled string will depend on whichever internal limit the sample algorithm might have.

gym/spaces/text.py

ryanrudes · 2022-06-20T20:21:44Z

@andrewtanJS I'm thinking a card game might be a decent choice for a toy environment that could use the Text space.

gym/spaces/text.py

pseudo-rnd-thoughts · 2022-06-29T15:53:00Z

@andrewtanJS @ryanrudes @RedTachyon What changes still need to be made to this PR for it to be reviewed and merged?

ryanrudes · 2022-06-29T21:26:59Z

@pseudo-rnd-thoughts For one, we need to devise and implement a simple toy task that uses the text space. I don't have the time to write this at the moment, perhaps someone else is up to it?

hannahrawr · 2022-06-30T16:20:37Z

@pseudo-rnd-thoughts it'll need the environment example, as well as the regex implementation if we go with it. After some thought, I wonder if this might just add unnecessary complexity or assumptions to the space, and we could just use the existing charset and sample() with the assumption that any environment implementation will flesh it out if they really need to.

@ryanrudes what are your thoughts on this? Would it be easy to support sampling if we add regex? And ultimately, how would the text space be utilized so that the implemented attributes best support the use cases?

ryanrudes · 2022-06-30T19:53:25Z

@pseudo-rnd-thoughts it'll need the environment example, as well as the regex implementation if we go with it. After some thought, I wonder if this might just add unnecessary complexity or assumptions to the space, and we could just use the existing charset and sample() with the assumption that any environment implementation will flesh it out if they really need to.

@ryanrudes what are your thoughts on this? Would it be easy to support sampling if we add regex? And ultimately, how would the text space be utilized so that the implemented attributes best support the use cases?

I agree that adding the regex sampling would certainly overcomplicate things, and possibly for little to no good reason anyways... if we were to aim for an all-encompassing sampling method that would be appropriate for most environments, I couldn't imagine a way to do that without further complicating things. If possible, perhaps the sample() method should be more just for consistency and less for functionality, since I can't really think of a situation where it would be used besides to visually inspect an example output of the space. As you said, anyone can simply implement their own sampling in their environment. However, there should be some critical initializers to allow for baseline agents to be constructed purely from the metadata of this space. I wonder if a contributor to stable-baselines could provide some advice on this?

pseudo-rnd-thoughts · 2022-07-08T10:43:04Z

The problem that I see with this space, is for almost all games then a MultiDiscrete space can be used instead of the Text, except in the case of emergent communication where two or more agents "learn" a novel language to play a game. I have a PhD college who would be interested in using this space.
However, these are generally multi-agent problems not single agent problems thus we wouldn't want to add them to Gym.
If @ryanrudes fixed all of the CI and pre-commit issues and @andrewtanJS believed that the general structure is effective then I would be happy to merge the PR.
Allowing researchers to use the space and make feature requests as necessary for the space. Thoughts?

hannahrawr · 2022-07-08T10:49:55Z

@pseudo-rnd-thoughts that sounds like a good direction to take things. This space should sufficient for general use, and if there's more feature requests that arise we can continue to iterate on it further.

ryanrudes added 4 commits June 17, 2022 17:06

Add Text space

b2355c4

Add Text space

0e90c8a

Added Text space

805a26e

Updated comment

33866d7

RedTachyon mentioned this pull request Jun 20, 2022

[Proposal] Add an environment showcasing the Graph space #2912

Closed

hannahrawr reviewed Jun 20, 2022

View reviewed changes

Refactored conditional

22c4730

hannahrawr reviewed Jun 22, 2022

View reviewed changes

gym/spaces/text.py Outdated Show resolved Hide resolved

Refactored contains()

12e084b

pseudo-rnd-thoughts mentioned this pull request Jul 11, 2022

Text space #2959

Merged

jkterry1 closed this Jul 11, 2022

Add Text to Spaces #2908

Add Text to Spaces #2908

Uh oh!

Conversation

ryanrudes commented Jun 17, 2022

Description

Type of change

Checklist:

Uh oh!

pseudo-rnd-thoughts commented Jun 19, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

RedTachyon commented Jun 19, 2022

Uh oh!

ryanrudes commented Jun 20, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hannahrawr commented Jun 20, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hannahrawr left a comment

Choose a reason for hiding this comment

Uh oh!

hannahrawr Jun 20, 2022

Choose a reason for hiding this comment

Uh oh!

ryanrudes Jun 20, 2022

Choose a reason for hiding this comment

Uh oh!

hannahrawr Jun 22, 2022

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ryanrudes commented Jun 20, 2022

Uh oh!

Uh oh!

pseudo-rnd-thoughts commented Jun 29, 2022

Uh oh!

ryanrudes commented Jun 29, 2022

Uh oh!

hannahrawr commented Jun 30, 2022

Uh oh!

ryanrudes commented Jun 30, 2022

Uh oh!

pseudo-rnd-thoughts commented Jul 8, 2022

Uh oh!

hannahrawr commented Jul 8, 2022

Uh oh!

Uh oh!

pseudo-rnd-thoughts commented Jun 19, 2022 •

edited

Loading

ryanrudes commented Jun 20, 2022 •

edited

Loading

hannahrawr commented Jun 20, 2022 •

edited

Loading