Transfer Learning with different observation sizes

Discussion in 'ML-Agents' started by jokerHHH, Jun 10, 2021.

  1. jokerHHH

    Joined:
    May 11, 2021
    Posts:
    9
    My source task needs 218 observations, and I want to load the pretrained weights to initialize a new task that has 302 observations. At first it works well, but after about 10,000 steps I get an error like this:

    2021-06-10 13:43:43 WARNING [torch_model_saver.py:118] Failed to load for module Policy. Initializing
    2021-06-10 13:43:43 WARNING [torch_model_saver.py:118] Failed to load for module Optimizer:critic. Initializing
    2021-06-10 13:43:43 INFO [torch_model_saver.py:125] Starting training from step 0 and saving to results\20_Neopets\Neopets.
    2021-06-10 13:44:48 INFO [stats.py:180] Neopets. Step: 10000. Time Elapsed: 75.128 s. Mean Reward: -2.957. Std of Reward: 2.445. Training.
    Exception in thread Thread-2:
    Traceback (most recent call last):
      File "e:\anaconda\envs\ml-agents-release16\lib\threading.py", line 916, in _bootstrap_inner
        self.run()
      File "e:\anaconda\envs\ml-agents-release16\lib\threading.py", line 864, in run
        self._target(*self._args, **self._kwargs)
      File "d:\ml-agents-release16\ml-agents\mlagents\trainers\trainer_controller.py", line 297, in trainer_update_func
        trainer.advance()
      File "d:\ml-agents-release16\ml-agents\mlagents\trainers\trainer\rl_trainer.py", line 313, in advance
        if self._update_policy():
      File "d:\ml-agents-release16\ml-agents\mlagents\trainers\ppo\trainer.py", line 202, in _update_policy
        buffer.make_mini_batch(i, i + batch_size), n_sequences
      File "d:\ml-agents-release16\ml-agents-envs\mlagents_envs\timers.py", line 305, in wrapped
        return func(*args, **kwargs)
      File "d:\ml-agents-release16\ml-agents\mlagents\trainers\ppo\optimizer_torch.py", line 162, in update
        self.optimizer.step()
      File "e:\anaconda\envs\ml-agents-release16\lib\site-packages\torch\autograd\grad_mode.py", line 26, in decorate_context
        return func(*args, **kwargs)
      File "e:\anaconda\envs\ml-agents-release16\lib\site-packages\torch\optim\adam.py", line 119, in step
        group['eps']
      File "e:\anaconda\envs\ml-agents-release16\lib\site-packages\torch\optim\functional.py", line 86, in adam
        exp_avg.mul_(beta1).add_(grad, alpha=1 - beta1)
    RuntimeError: The size of tensor a (302) must match the size of tensor b (218) at non-singleton dimension 1

    I checked the checkpoint file with Netron, and it seems my model was initialized with the right dimensions, so why am I getting this error?
    [attached image: upload_2021-6-10_13-56-21.png — Netron view of the checkpoint]
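    My current guess: the Adam optimizer state (exp_avg / exp_avg_sq) kept one task's observation shape while the gradients have the other task's shape, so the moment update in the last traceback frame no longer matches. A minimal standalone repro of that mismatch (my own sketch, not the ml-agents code; only the 218/302 shapes come from my setup):

    import torch

    # Repro of a shape mismatch between stale Adam moments and the
    # new network's gradients (218 vs 302 observations).
    new_layer = torch.nn.Linear(302, 64)  # target task: 302 observations
    old_layer = torch.nn.Linear(218, 64)  # source task: 218 observations

    opt = torch.optim.Adam(new_layer.parameters())
    # Simulate optimizer state restored from the source-task checkpoint.
    opt.state[new_layer.weight] = {
        "step": torch.tensor(0.0),
        "exp_avg": torch.zeros_like(old_layer.weight),     # shape (64, 218)
        "exp_avg_sq": torch.zeros_like(old_layer.weight),  # shape (64, 218)
    }

    new_layer(torch.randn(1, 302)).sum().backward()
    opt.step()  # RuntimeError: the size of tensor a must match tensor b at dimension 1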
     
  2. jokerHHH

    Joined:
    May 11, 2021
    Posts:
    9
    I have already solved the problem above. My solution was to start training the new agent from scratch first, then copy the pretrained model's parameters into the new model; after that the error was gone. A rough sketch of the copy step is below.
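    This is only a sketch with placeholder paths, and it assumes the saved .pt checkpoint is a dict of per-module state dicts keyed by names like "Policy" (the module names the warnings in my log refer to):

    import torch

    # Load the source-task checkpoint and the freshly started target-task
    # checkpoint (paths are placeholders for my own run ids).
    src = torch.load("results/source_run/checkpoint.pt")
    dst = torch.load("results/20_Neopets/checkpoint.pt")

    # Copy only tensors whose name and shape match the new model, so the
    # layers that depend on the observation size (218 vs 302) keep their
    # fresh initialization.
    for key, value in src["Policy"].items():
        if key in dst["Policy"] and value.shape == dst["Policy"][key].shape:
            dst["Policy"][key] = value

    torch.save(dst, "results/20_Neopets/checkpoint.pt")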
    But I ran into another problem: the transfer learning results (see my log) do not look any better than a model trained from the beginning. So is transfer learning effective for RL or not? The gray line is transfer learning; the orange line is the new agent trained from scratch.
    [attached image: upload_2021-6-11_11-16-14.png — reward curves: gray = transfer learning, orange = trained from scratch]