How to train exclusively with BC?

Discussion in 'ML-Agents' started by niels_modlai, Mar 25, 2020.

  1. niels_modlai

    Joined: Oct 8, 2019
    Posts: 5

    How do I exclusively train an agent with Behavioral Cloning? I don't see any configuration file or explanation of this. I can't really tell if PPO is running in parallel.

    As an additional question, why can't I run BC without running the game? If I have a slow game but hours of recorded demonstrations to train on, I don't see why the game has to run. Is there a way to disable this?
     
  2. christophergoy

    Unity Technologies
    Joined: Sep 16, 2015
    Posts: 735
  3. ahmmmmmwhy

    Joined: Aug 20, 2017
    Posts: 11

    You enable behavior cloning by adding the following section to your Brain config:

    Code (YAML):
        behavioral_cloning:
            demo_path: path to file.demo
            strength: 1
            steps: 200000
    Check out mlagents-learn --help. The command lets you set the time scale factor (x20 by default) and disable graphics. As for why the game has to run: Unity acts as the environment simulator and produces the states and rewards for the training agent.
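As a rough sketch of what such a launch could look like from a script, the helper below only assembles the command line and does not start training. The helper name `build_learn_command`, the config path, and the flag spellings `--run-id`, `--time-scale`, and `--no-graphics` are assumptions here; confirm them against `mlagents-learn --help` for your installed version.

```python
import shlex


def build_learn_command(config_path, run_id, time_scale=20, no_graphics=True):
    """Assemble an mlagents-learn command line (hypothetical helper).

    Flag names are assumptions; confirm them with `mlagents-learn --help`.
    """
    cmd = [
        "mlagents-learn",
        config_path,
        f"--run-id={run_id}",
        f"--time-scale={time_scale}",
    ]
    if no_graphics:
        cmd.append("--no-graphics")
    return cmd


if __name__ == "__main__":
    # Print the command rather than running it.
    # Pass the list to subprocess.run(...) to actually launch training.
    print(shlex.join(build_learn_command("config/trainer_config.yaml", "bc_test")))
```

Building the command as a list keeps quoting out of the picture, and the time scale and graphics settings stay explicit arguments instead of being buried in a shell string.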
     
  4. niels_modlai

    Joined: Oct 8, 2019
    Posts: 5

    Thanks. Let me try to clarify.

    1. The documentation does not clearly state whether BC happens before (in the specified steps) or simultaneously with the PPO/SAC trainer. I want to be sure that it is only BC running.

    2. If only BC is running, it wouldn't need an environment. Only PPO, SAC, and GAIL need an environment, not BC.
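To illustrate point 2: in its plain form, behavioral cloning is supervised learning on recorded (observation, action) pairs, so the training loop itself never has to step an environment. The sketch below is a generic toy example and not ML-Agents' implementation. The demonstrations are synthetic stand-ins for a .demo file, and all function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_demonstrations(n, seed=0):
    """Synthetic stand-in for recorded demos: (observation, expert_action) pairs.

    The toy expert picks action 1 when the observation is positive, else 0.
    """
    rng = random.Random(seed)
    demos = []
    for _ in range(n):
        obs = rng.uniform(-1.0, 1.0)
        demos.append((obs, 1 if obs > 0 else 0))
    return demos


def train_bc(demos, epochs=200, lr=0.5):
    """Fit p(action=1 | obs) = sigmoid(w*obs + b) by minimising cross-entropy."""
    w, b = 0.0, 0.0
    n = len(demos)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for obs, action in demos:
            # Gradient of the cross-entropy loss with respect to the logit.
            err = sigmoid(w * obs + b) - action
            grad_w += err * obs
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def act(w, b, obs):
    """Greedy action of the cloned policy."""
    return 1 if sigmoid(w * obs + b) > 0.5 else 0


if __name__ == "__main__":
    demos = make_demonstrations(500)
    w, b = train_bc(demos)
    agreement = sum(act(w, b, o) == a for o, a in demos) / len(demos)
    print(f"agreement with expert: {agreement:.2f}")
```

Nothing in `train_bc` queries a simulator: it only iterates over the stored pairs. An environment would only be needed to evaluate the cloned policy, or to collect new experience for PPO, SAC, or GAIL.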
     
  5. LexVolkov

    Joined: Sep 14, 2014
    Posts: 62

    But it is not possible for BC to work on its own!
    After all, the demo defines only a set of rewards for a set of behaviors, not the neural network itself.
    I do not think the bot can reproduce the behavior from BC alone. It can only simulate to gain experience.
     
  6. niels_modlai

    Joined: Oct 8, 2019
    Posts: 5

    I found out that this was a feature once:
    if trainer_type == "offline_bc":
        raise UnityTrainerException(
            "The offline_bc trainer has been removed. To train with demonstrations, "
            "please use a PPO or SAC trainer with the GAIL Reward Signal and/or the "
            "Behavioral Cloning feature enabled."
        )
    Sadly, it has been removed.