Gym Unity - Baselines

ademord · May 27, 2021

Hello guys,

I finished my environment in Unity and now I am trying to "export it to gym" to try different algorithms (I will do my own implementations afterwards). I am trying Baselines now, and I exported the environment as:
env = UnityToGymWrapper(unity_env, uint8_visual=True, flatten_branched=True, allow_multiple_obs=True)
And now, from this line:
model = PPO(MlpPolicy, env, verbose=0)
I am getting the error:
NotImplementedError: Tuple(Box(-inf, inf, (91,), float32)) observation space is not supported
What could I do? I am a bit lost.

vincentpierre · Jun 1, 2021

PPO baselines does not support observations of type Tuple(Box(-inf, inf, (91,), float32)) (which I think corresponds to flat vector observations of 91 floats). If you want to use baselines, you need to create an environment with observations and actions that baselines can work with.
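
For readers hitting the same error: with allow_multiple_obs=True the wrapper returns observations as a list of arrays, which is why the space shows up as a Tuple even when it only holds one Box of 91 floats. Below is a minimal sketch, assuming only NumPy and a hypothetical helper name flatten_observations, of collapsing such a list into the single flat vector an MLP policy expects. It is an illustration, not something taken from baselines or ML-Agents.

```python
import numpy as np


def flatten_observations(obs_list):
    """Concatenate a list/tuple of arrays into one flat float32 vector.

    Hypothetical helper for illustration; not part of gym_unity or baselines.
    """
    parts = [np.asarray(o, dtype=np.float32).ravel() for o in obs_list]
    return np.concatenate(parts)


# Example with the shape from the error message: one vector of 91 floats.
single = flatten_observations([np.zeros(91)])
print(single.shape)  # (91,)
```

The observation space handed to the algorithm would have to be changed to a matching single Box as well. If the environment only has vector observations, leaving allow_multiple_obs at its default of False may avoid the Tuple altogether.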

ademord · Jun 2, 2021

vincentpierre said:

PPO baselines does not support observations of type Tuple(Box(-inf, inf, (91,), float32)) (which I think corresponds to flat vector observations of 91 floats). If you want to use baselines, you need to create an environment with observations and actions that baselines can work with.

I am using raycasts and one boolean (so yes, vector obs as you mention). How can I know what kind of observations baselines works with? Do I check which obs input the algorithm supports, or do I try to change my obs? I just need some direction.

vincentpierre · Jun 14, 2021

I have not worked with PPO baselines in a while, so I think you will have better luck looking at their documentation or issues page. If my memory is correct, it should work on single visual observations, but I really am not sure.

simmax21 · Oct 21, 2021

ademord said:

I am using raycasts and one boolean (so yes, vector obs as you mention). How can I know what kind of observations baselines works with? Do I check which obs input the algorithm supports, or do I try to change my obs? I just need some direction.

Hi, I have a similar problem with stable_baselines3. How did you solve it?

ademord · Oct 27, 2021

@simmax21 I had to make a custom environment with the help of @aakarshanc01.
If you can improve this code further, that would also be great for me and for the people who come after us:

Code (python):

# Imports are inferred from the names used below; "UE" is assumed to be an
# alias for UnityEnvironment.
import uuid
from collections import defaultdict

import gym
import numpy as np
from gym_unity.envs import UnityToGymWrapper
from mlagents_envs.environment import UnityEnvironment as UE
from mlagents_envs.side_channel.engine_configuration_channel import EngineConfigurationChannel
from mlagents_envs.side_channel.side_channel import IncomingMessage, SideChannel
from mlagents_envs.side_channel.stats_side_channel import EnvironmentStats, StatsAggregationMethod

# config, rank, env_callback, wandb_run_identifier, pretty_print and Colors
# are defined elsewhere in the project and are not shown here.


def get_wandb_ue_env():
    # engine config
    engine_channel = EngineConfigurationChannel()
    engine_channel.set_configuration_parameters(time_scale=config.time_scale)
    # side channels
    channel = SB3StatsRecorder()
    # environment
    env = UE(config.env_path,
             seed=1,
             worker_id=rank,
             base_port=5000 + rank,
             no_graphics=config.no_graphics,
             side_channels=[engine_channel, channel])
    return env


class CustomEnv(gym.Env):
    def __init__(self):
        super(CustomEnv, self).__init__()
        env = get_wandb_ue_env()
        env = UnityToGymWrapper(env, allow_multiple_obs=True)
        self.env = env
        self.action_space = self.env.action_space
        self.action_size = self.env.action_size
        self.observation_space = gym.spaces.Dict({
            0: gym.spaces.Box(low=0, high=1, shape=(27, 60, 3)),  # (40, 90, 3)
            1: gym.spaces.Box(low=0, high=1, shape=(20, 40, 1)),  # (56, 121, 1)
            # numeric infinities instead of the strings '-inf' / 'inf'
            2: gym.spaces.Box(low=-np.inf, high=np.inf, shape=(400,))
        })

    @staticmethod
    def tuple_to_dict(s):
        # the wrapper returns a list of observations; the Dict space needs a dict
        obs = {
            0: s[0],
            1: s[1],
            2: s[2]
        }
        return obs

    def reset(self):
        # print("LOG: returning reset" + self.tuple_to_dict(self.env.reset()))
        # print("LOG: returning reset" + (self.env.reset()))
        # np.array(self._observation)
        return self.tuple_to_dict(self.env.reset())

    def step(self, action):
        s, r, d, info = self.env.step(action)
        return self.tuple_to_dict(s), float(r), d, info

    def close(self):
        self.env.close()
        global rank
        rank -= 1

    def render(self, mode="human"):
        self.env.render()


class SB3StatsRecorder(SideChannel):
    """
    Side channel that receives (string, float) pairs from the environment, so that they can eventually
    be passed to a StatsReporter.
    """

    def __init__(self) -> None:
        # >>> uuid.uuid5(uuid.NAMESPACE_URL, "com.unity.ml-agents/StatsSideChannel")
        # UUID('a1d8f7b7-cec8-50f9-b78b-d3e165a78520')
        super().__init__(uuid.UUID("a1d8f7b7-cec8-50f9-b78b-d3e165a78520"))
        pretty_print("Initializing SB3StatsRecorder", Colors.FAIL)
        self.stats: EnvironmentStats = defaultdict(list)
        self.i = 0
        self.wandb_tables: dict = {}

    def on_message_received(self, msg: IncomingMessage) -> None:
        """
        Receive the message from the environment, and save it for later retrieval.
        :param msg:
        :return:
        """
        key = msg.read_string()
        val = msg.read_float32()
        agg_type = StatsAggregationMethod(msg.read_int32())
        self.stats[key].append((val, agg_type))
        # assign different Drone[id] to each subprocess within this wandb run
        key = key.split("/")[1]
        self.i += 1
        if env_callback is not None and wandb_run_identifier == "test":  # and "Speed" in "val"
            # if self.i % 100 == 0:
            my_table_id: str = "Performance[{}]".format(wandb_run_identifier)
            # pretty_print("Publishing Table: key: {}, val: {}".format(my_table_id, key, val), Colors.FAIL)
            env_callback(my_table_id, key, val)

    def get_and_reset_stats(self) -> EnvironmentStats:
        """
        Returns the current stats, and resets the internal storage of the stats.
        :return:
        """
        s = self.stats
        self.stats = defaultdict(list)
        return s

ademord · Oct 27, 2021

I then register this environment through the gym registration method and create it everywhere else with gym.make("my_id"). Since the environment pulls from the config file, it can always adapt to different builds, and I don't need any more code to register "new" builds.
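
A minimal sketch of what that registration step could look like, assuming the classic gym package. The id "UnityCustom-v0" and the PlaceholderEnv class are hypothetical: the placeholder stands in for the CustomEnv posted above so the snippet runs without a Unity build, and the "-v0" suffix is used because older gym releases expect ids in that form.

```python
import gym
import numpy as np
from gym import spaces


class PlaceholderEnv(gym.Env):
    """Hypothetical stand-in for the CustomEnv class posted earlier."""

    def __init__(self):
        super().__init__()
        # Spaces chosen only so the class is a valid gym.Env for this sketch.
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf, shape=(91,), dtype=np.float32
        )
        self.action_space = spaces.Discrete(2)


# Register once at import time; entry_point may be a class or a "module:Class" string.
gym.register(id="UnityCustom-v0", entry_point=PlaceholderEnv)

# Look the registration up again to confirm it is known to gym.
registered_spec = gym.spec("UnityCustom-v0")
print(registered_spec.id)
```

After this, gym.make("UnityCustom-v0") would construct the environment wherever it is needed.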

Also, something to take into account: SubprocVecEnv is a bit unstable, at least for me. No context from previously defined variables is passed into the subprocesses, so the training has to be fully separated and then brought back together; you might choose a different strategy for that. I decided to limit myself to one trainer instead of a vectorized env for now and just train for ~20 hours.
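
One common way around the missing context is to give every worker its own factory function that captures its rank, instead of relying on a global variable. Below is a minimal sketch in plain Python, assuming a hypothetical helper make_env_fn; the returned dict only stands in for the real Unity environment construction and reuses the worker_id and base_port scheme from the code above.

```python
from typing import Callable, Dict, List


def make_env_fn(rank: int, base_port: int = 5000) -> Callable[[], Dict[str, int]]:
    """Return a zero-argument factory that carries its own rank (hypothetical helper)."""

    def _init() -> Dict[str, int]:
        # A real factory would build and wrap the Unity environment here;
        # this sketch only returns the per-worker settings it would use.
        return {"worker_id": rank, "base_port": base_port + rank}

    return _init


# One factory per worker; each closure remembers its own rank.
env_fns: List[Callable[[], Dict[str, int]]] = [make_env_fn(r) for r in range(4)]
settings = [fn() for fn in env_fns]
print(settings[3])  # {'worker_id': 3, 'base_port': 5003}
```

A list like env_fns is what SubprocVecEnv in stable_baselines3 takes as its first argument, so each subprocess can build its environment without shared state.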

ademord · Oct 28, 2021

@simmax21 @vincentpierre if you can help me with this issue, I would really appreciate it too: "UnityGymWrapper Crash after 2M iterations" (Unity Forum thread)

TiranianHoward · Nov 24, 2021

@ademord thank you for the custom env.
Just a little change: changing the dict keys from int to str makes it work.

Edit:
This was the error
raise TypeError("module name should be a string. Got {}".format(
TypeError: module name should be a string. Got int
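
The error message appears to come from PyTorch, which requires string names for submodules; stable_baselines3 seems to build one feature extractor per observation key, so integer keys fail at that point. Here is a minimal sketch of a string-key version of tuple_to_dict from the custom environment above. The key names "0", "1", "2" are simply the old integers converted, which is an assumption about how the change was made.

```python
from typing import Any, Dict, Sequence


def tuple_to_dict(obs: Sequence[Any]) -> Dict[str, Any]:
    """Map each observation in the tuple to a string key: "0", "1", ..."""
    return {str(index): value for index, value in enumerate(obs)}


# Strings are used as stand-ins for the image and vector arrays.
converted = tuple_to_dict(("camera", "depth", "vector"))
print(converted)  # {'0': 'camera', '1': 'depth', '2': 'vector'}
```

The keys of the gym.spaces.Dict observation space would need the same string names so that the observations and the space stay in sync.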

TiranianHoward · Nov 25, 2021

@ademord sorry, but do you know how to pass Academy.Instance.StatsRecorder data to python?

I want to log all the data that is recorded by Academy.Instance.StatsRecorder from my unity-converted-to-gym env
