Hi, I'm trying to use GAIL with visual observations and I'm having some trouble figuring out how to set everything up. This is the error I get with 128 as the encoding size in the yaml config; the visual input is 84x84x3. Not really sure what to make of the 10.3 MiB:

```
Traceback (most recent call last):
  File "C:\Users\jezrd\Anaconda3\envs\mla-r1\lib\threading.py", line 916, in _bootstrap_inner
    self.run()
  File "C:\Users\jezrd\Anaconda3\envs\mla-r1\lib\threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "e:\dok\_ml\ml-agents-release_2\ml-agents\mlagents\trainers\trainer_controller.py", line 350, in trainer_update_func
    trainer.advance()
  File "e:\dok\_ml\ml-agents-release_2\ml-agents\mlagents\trainers\trainer\rl_trainer.py", line 151, in advance
    self._process_trajectory(t)
  File "e:\dok\_ml\ml-agents-release_2\ml-agents\mlagents\trainers\ppo\trainer.py", line 161, in _process_trajectory
    self.update_buffer, training_length=self.policy.sequence_length
  File "e:\dok\_ml\ml-agents-release_2\ml-agents\mlagents\trainers\buffer.py", line 283, in resequence_and_append
    batch_size=batch_size, training_length=training_length
  File "e:\dok\_ml\ml-agents-release_2\ml-agents\mlagents\trainers\buffer.py", line 51, in extend
    self += list(np.array(data))
MemoryError: Unable to allocate 10.3 MiB for an array with shape (128, 84, 84, 3) and data type float32
```

Here's the config file:

```yaml
SymbolFinder:
  use_recurrent: true
  sequence_length: 64
  num_layers: 3
  hidden_units: 256
  memory_size: 128
  beta: 1.0e-2
  num_epoch: 8
  buffer_size: 32768
  batch_size: 2048
  max_steps: 10000000
  summary_freq: 10000
  time_horizon: 96
  reward_signals:
    extrinsic:
      strength: 1.0
      gamma: 0.99
    curiosity:
      strength: 0.015
      gamma: 0.99
      encoding_size: 128
    gail:
      gamma: 0.99
      strength: 0.5
      encoding_size: 128
      learning_rate: 0.0003
      use_actions: false
      demo_path: demos/5goals2.demo
```

I'm not 100% sure it's the right set-up. The training works normally without GAIL.
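For scale, the failing allocation itself is tiny. A quick back-of-envelope check (plain NumPy, nothing ML-Agents-specific) reproduces the 10.3 MiB figure straight from the batch shape in the error, which suggests the process address space was already nearly exhausted rather than this one array being too big:

```python
import numpy as np

# Size of a float32 array with the shape from the MemoryError:
# (batch_size, height, width, channels) = (128, 84, 84, 3)
n_bytes = 128 * 84 * 84 * 3 * np.dtype(np.float32).itemsize
print(f"{n_bytes / 2**20:.1f} MiB")  # → 10.3 MiB
```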
Since memory allocation might be an issue, here's my set-up: Windows 10, 64 GB RAM, 24 GB GPU. The demonstration file is ~160 MB. After dropping curiosity and reducing the encoding size to 64, it started training. The memory load for the GPU is 5/24 GB and for RAM it's 36/64 GB (but that's mostly Chrome and 50 other apps running as well).
Did a few tests and I see that the size of the demonstration file and the number of environments also matter. A 50 MB file with num-envs set to two works fine, but a bigger ~500 MB file with a few more environment instances fails. So I'm guessing the demonstration file is loaded into memory for each agent separately? Would it make sense to start with a smaller pool of instances, bootstrap the training with GAIL, and then resume training without GAIL, leaving the action/observation space and the number of layers/hidden units unchanged?
Hi andrej_, if you keep running into memory issues, you can train with GAIL and then resume, still with GAIL, but pointing at a smaller demo file and with the GAIL reward signal strength set to 0 (so that the small demo file is effectively ignored). However, the best way is still to avoid overloading your memory and to keep training with the same setup.
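A minimal sketch of that config change for the resumed run, mirroring the structure of the config posted above (the demo filename here is a placeholder, not a real file):

```yaml
reward_signals:
  extrinsic:
    strength: 1.0
    gamma: 0.99
  gail:
    strength: 0.0          # signal zeroed out, so the demos no longer shape the reward
    gamma: 0.99
    encoding_size: 64
    demo_path: demos/small_placeholder.demo   # hypothetical smaller demo file
```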
The exe file itself is 625 kB. I'm not entirely sure what the issue is here, as a few things seem to influence it (num-envs, batch size, demo recording file size...). Does the GAIL implementation try to load the whole demo recording at once, once for each environment instance, or does it sample from it? I tend to build the environment and launch it from the command line with the num-envs argument rather than duplicating the agents n times in the editor... would that help here? It would be n learners/agents but only a single Unity instance.
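I can't confirm from the error alone how the demos are held in memory, but if the recording really is deserialized once per environment instance, a rough estimate shows how quickly that adds up. The 4x expansion factor below (for decoding the compressed .demo into float32 observation buffers) is purely an assumption, as is the per-instance loading itself:

```python
# Rough, assumption-laden estimate of demo memory across environment instances.
demo_mb_on_disk = 500   # size of the .demo recording, MB
expansion_factor = 4    # assumed blow-up when decoded into float32 buffers
num_envs = 8            # instances launched with the num-envs argument

total_mb = demo_mb_on_disk * expansion_factor * num_envs
print(f"~{total_mb / 1024:.1f} GB")  # ~15.6 GB
```

If loading really is per-instance, that alone would explain why the 500 MB recording fails where the 50 MB one trains fine.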
Here ml-agents was training for ~80k steps: single agent/env instance run from the editor, batch size 512, buffer size 32768, and the demo recording was ~41 MB. I still have plenty of RAM and VRAM (~30 and ~12 GB free respectively), but starting training eats up a lot of my C: drive space (~20-30 GB), so I'm wondering whether it's caching something and at some point runs out of free space. If you prefer, I can post this as an issue on GitHub.
If I ever get this working I'll definitely post a few notes on how and why... so far most of the runs look like this: Not sure what the odds are of two separate runs, with different hyper-parameters, getting a sudden 'collapse' after nearly 2M steps... edit: both of those runs were PPO + curiosity, without GAIL.
With curiosity, it's possible to get reward collapses later on, as the agent stops being curious about the things you want it to be curious about =P and goes off to explore other things. Do you have the plots for the Extrinsic and Curiosity rewards (in TensorBoard, under Policy)? Usually this can be solved by making the Curiosity reward much smaller than the Extrinsic reward, so that the agent prioritizes the latter.
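As a sketch of that rebalancing, using the config posted earlier in the thread as a base (the exact strength value is illustrative, not a tuned recommendation):

```yaml
reward_signals:
  extrinsic:
    strength: 1.0
    gamma: 0.99
  curiosity:
    strength: 0.005   # illustrative: dropped further below extrinsic than the original 0.015
    gamma: 0.99
    encoding_size: 64
```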
But I should add that all 3 runs might have had some small changes to the environment and/or hyperparameters. Nothing big, but it's still not a 100% perfect comparison.