Visual observations

Discussion in 'ML-Agents' started by andrzej_, Jun 26, 2020.

  1. andrzej_

    andrzej_

    Joined:
    Dec 2, 2016
    Posts:
    81
    Quite a few people are asking about using visual observations with ML-Agents, and even about how to train the example environments with visual obs (Hallway and Pyramids). I've spent some time expanding on the idea in the Hallway example and even managed to get some proper generalization.


    The agent was trained with a set of 8 different symbols and is then able to recognize new, unseen symbols (30 total used in testing) with a varying success rate (but definitely better than random). It even manages to recognize the Unity logo, which the agent hadn't seen before.

    As for the hyperparameters and RL model, I used PPO with LSTM memory; the most significant change was using the ResNet backbone for the visual observations.
    I also used curriculum learning to ramp up the number of simultaneous symbols and some other environment properties (not shown in the video, but there's some domain randomization too).

    I haven't done a full hyperparameter search, so there's definitely room for improvement, but it took on the order of 10-15M steps to get decent results.
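    If it helps as a starting point, the config entry looked roughly like this, in the old flat trainer_config.yaml format. I'm going from memory here, so treat the numbers as illustrative placeholders rather than my exact values:

        # Sketch of the config -- values are placeholders, not my exact settings
        VisualHallway:
          trainer: ppo
          batch_size: 128
          buffer_size: 2048
          hidden_units: 256
          learning_rate: 3.0e-4
          max_steps: 1.5e7          # it took on the order of 10-15M steps
          time_horizon: 64
          vis_encode_type: resnet   # the change that mattered most
          use_recurrent: true       # LSTM memory
          memory_size: 256
          sequence_length: 64

    The curriculum for the number of symbols was a separate curriculum YAML, along the same lines as the Wall Jump example's.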
     
    celion_unity likes this.
  2. mbaske

    mbaske

    Joined:
    Dec 31, 2017
    Posts:
    473
    That's awesome! Would you be ok with sharing the project files?
     
  3. andrzej_

    andrzej_

    Joined:
    Dec 2, 2016
    Posts:
    81
    Sure, I'd have to do a lot of cleanup first, but there isn't anything that special in the project, except maybe the random placement of the symbols, where I used Poisson disc sampling (which I implemented very lazily - once in a while it throws an error when it can't find coordinates matching the requirements; a more robust sketch is below). The ML-Agents-relevant parts are, I'd say, very similar to the Hallway example.
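    If anyone wants to roll their own placement, a standard Bridson-style sampler along these lines (a sketch, not the code from my project; names and parameters are illustrative) avoids the error case by simply returning fewer points when it runs out of room:

        using System.Collections.Generic;
        using UnityEngine;

        // Bridson-style Poisson disc sampling on a rectangle. Instead of
        // throwing when no valid spot exists, it simply returns fewer points.
        public static class PoissonDisc
        {
            public static List<Vector2> Sample(float width, float height,
                float minDist, int maxPoints, int attemptsPerPoint = 30)
            {
                // Grid cells sized so each cell holds at most one sample.
                float cell = minDist / Mathf.Sqrt(2f);
                int gw = Mathf.CeilToInt(width / cell);
                int gh = Mathf.CeilToInt(height / cell);
                int[,] grid = new int[gw, gh];        // 0 = empty, else index + 1
                var points = new List<Vector2>();
                var active = new List<Vector2>();

                void Store(Vector2 p)
                {
                    points.Add(p);
                    active.Add(p);
                    int gx = Mathf.Min((int)(p.x / cell), gw - 1);
                    int gy = Mathf.Min((int)(p.y / cell), gh - 1);
                    grid[gx, gy] = points.Count;
                }

                Store(new Vector2(Random.value * width, Random.value * height));

                while (active.Count > 0 && points.Count < maxPoints)
                {
                    int i = Random.Range(0, active.Count);
                    bool placed = false;
                    for (int k = 0; k < attemptsPerPoint; k++)
                    {
                        // Candidate in the annulus [minDist, 2 * minDist].
                        float a = Random.value * 2f * Mathf.PI;
                        float r = minDist * (1f + Random.value);
                        Vector2 c = active[i] + r * new Vector2(Mathf.Cos(a), Mathf.Sin(a));
                        if (c.x < 0f || c.x >= width || c.y < 0f || c.y >= height)
                            continue;
                        if (FarFromNeighbors(c, grid, points, cell, minDist, gw, gh))
                        {
                            Store(c);
                            placed = true;
                            break;
                        }
                    }
                    if (!placed) active.RemoveAt(i);  // this seed is exhausted
                }
                return points;                        // may be fewer than maxPoints
            }

            static bool FarFromNeighbors(Vector2 c, int[,] grid, List<Vector2> pts,
                float cell, float minDist, int gw, int gh)
            {
                int cx = (int)(c.x / cell), cy = (int)(c.y / cell);
                for (int x = Mathf.Max(cx - 2, 0); x <= Mathf.Min(cx + 2, gw - 1); x++)
                    for (int y = Mathf.Max(cy - 2, 0); y <= Mathf.Min(cy + 2, gh - 1); y++)
                        if (grid[x, y] != 0 &&
                            Vector2.Distance(pts[grid[x, y] - 1], c) < minDist)
                            return false;
                return true;
            }
        }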
    One more new thing I had in there, which I didn't use in the end, was an experience recorder working with Unity's navigation system. My idea was to randomly spawn obstacles, record playthroughs of an agent avoiding those objects, and then use that data with GAIL. But in the end I left it for another project, as matching the agent's possible actions to the navigation-based system might be a bit problematic.
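    For reference, once you have recorded demos (ML-Agents has a DemonstrationRecorder component for that), hooking them into GAIL is just a reward_signals entry in the trainer config. I never finalized this part, so the snippet below is a sketch and the demo path is made up:

        reward_signals:
          extrinsic:
            strength: 1.0
            gamma: 0.99
          gail:                             # sketch only; never used in the final run
            strength: 0.1
            gamma: 0.99
            demo_path: Demos/NavRuns.demo   # hypothetical path to recorded demos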
     
  4. ervteng_unity

    ervteng_unity

    Unity Technologies

    Joined:
    Dec 6, 2018
    Posts:
    150
    This is fantastic work - thanks for sharing!
     
    andrzej_ likes this.
  5. Jincraftohk

    Jincraftohk

    Joined:
    Jun 24, 2020
    Posts:
    3
    I'm so surprised!
    How did you train the Hallway with visual obs? I tried many hyperparameters, but it just didn't work. I have very little experience with this. Would you please share the details of your experiment? What's the main factor? The ResNet?
    Thank you for sharing!
     
  6. andrzej_

    andrzej_

    Joined:
    Dec 2, 2016
    Posts:
    81
    I think the biggest change was switching to ResNet. Other than that, unfortunately I haven't kept proper documentation for the experiments, and I even removed most of the experimental runs, so I can't give you specific hyperparameter values. In the end it was pretty similar to the example config files.
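    Concretely, relative to the example VisualHallway config it's basically a one-key change (the default encoder is 'simple'):

        VisualHallway:
          # ... keep the example hyperparameters ...
          vis_encode_type: resnet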
     
  7. Jincraftohk

    Jincraftohk

    Joined:
    Jun 24, 2020
    Posts:
    3
    Thanks! I will try the ResNet right now.
     
  8. hestia_p

    hestia_p

    Joined:
    Aug 11, 2020
    Posts:
    5
    @andrzej_ Hello, what you did is exactly what I was looking for. If you don't mind, could you share the project with me? I'd like to try it.
     
  9. andrzej_

    andrzej_

    Joined:
    Dec 2, 2016
    Posts:
    81
    Hey ... I haven't touched that project for a while and it was based on an old version of ML-Agents (still using TF), so most likely a lot of it won't work now. Also, the only 'trick' here was using ResNet for the visual observations.