
Training using Curriculum Learning with Imitation Learning

Discussion in 'ML-Agents' started by Fr2, Mar 12, 2020.

  1. Fr2

    Joined:
    Jan 19, 2013
    Posts:
    39
    I've found that recording a demonstration and using Behavioral Cloning (BC) is a great way to increase the learning speed for my agent when training.

    I'm also using Curriculum Learning: I use BC to train my agent initially, and then increase the difficulty gradually through the curriculum.

    Is there a way to record demonstrations which correspond with each curriculum lesson? For example, with the WallJump example, could I record multiple demonstrations corresponding to each wall height, and then use those demonstrations as the agent progresses through the curriculum?
     
  2. andrewcoh_unity

    Unity Technologies

    Joined:
    Sep 5, 2019
    Posts:
    162
    This is an interesting feature request. I can circulate it with the team.

    My concern is that there may be no use for the easier demonstrations once we have the harder ones. For example, we use a curriculum so that an agent can work up to learning the harder walls. However, if we could just use BC/GAIL with demos from the harder walls, we might be able to solve the easier instances. It's possible that this just isn't a great use case for the feature, and there may be others where this isn't the case. Do you have something else in mind?
     
  3. JulesVerny

    Joined:
    Dec 21, 2015
    Posts:
    47
    I would also like to see Curriculum Learning combined with GAIL, just as a teacher demonstrates new concepts in each new lesson. I can only get my Agent to learn some initial performance from GAIL. It then struggles to learn in the following lessons, and certainly can't manage it with Reinforcement Learning alone. It needs some help.

    So I would have hoped for some form of demo recording at each lesson level, plus some sort of GAIL discriminator network reset at the start of each new lesson. Then maybe my Agent would be able to progress through the lesson levels.
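    To make the idea concrete, here is a rough sketch of the bookkeeping it would need: one demonstration file per lesson, and a signal to re-initialize the GAIL discriminator whenever the lesson changes. This is plain Python and does not use any ML-Agents API. The class name `LessonDemoSchedule`, its methods, and the `.demo` file paths are all hypothetical, made up purely for illustration, and the fallback to the nearest earlier demo is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class LessonDemoSchedule:
    """Hypothetical helper: maps a curriculum lesson index to a demo file."""

    demo_paths: Dict[int, str]   # lesson index -> demonstration file (made-up paths)
    current_lesson: int = -1     # -1 means no lesson has started yet
    resets: int = 0              # how many times a discriminator reset was signalled

    def demo_for(self, lesson: int) -> Optional[str]:
        # Assumption: if a lesson has no demo of its own, fall back to the
        # demo recorded for the nearest earlier lesson.
        candidates = [k for k in self.demo_paths if k <= lesson]
        return self.demo_paths[max(candidates)] if candidates else None

    def on_lesson(self, lesson: int) -> bool:
        # Returns True when the lesson changed, i.e. the point at which the
        # caller would swap demos and re-initialize the discriminator.
        changed = lesson != self.current_lesson
        if changed:
            self.current_lesson = lesson
            self.resets += 1
        return changed
```

    A trainer loop would call `on_lesson(lesson)` each time it reads the current lesson, and on `True` load `demo_for(lesson)` and rebuild the discriminator. Whether resetting the discriminator actually helps is an open question in this thread, not something the sketch demonstrates.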