Only one agent moves when using Self-Play for training

Discussion in 'ML-Agents' started by AndyWarrior, Mar 24, 2020.

  1. AndyWarrior

    AndyWarrior

    Joined:
    Feb 9, 2017
    Posts:
    6
    Hello,

    I have two agents that I want to train using self-play. The red agent is trying to catch the green one, while the green one is trying to escape. The red agent has a Team ID = 0 and the green agent has a Team ID = 1.

    When I start the training only one of the agents is moving (the green one with Team ID = 1). However, if I disable the green agent and start training again, the red agent trains normally. It seems to be an issue when the two agents are training at the same time. I have the self_play parameter in my trainer_config.yaml.

    Is there any special configuration that I need to do to allow my two agents to train at the same time? I basically followed the soccer example, so I know it is possible for both of them to train together, it just seems like I'm doing something wrong.
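
    For context, a self_play section in trainer_config.yaml for ML-Agents of this era (~0.14/0.15) looks roughly like the sketch below; the behavior name and all values are placeholders, not taken from the screenshots:

    ```yaml
    # trainer_config.yaml (ML-Agents ~0.14/0.15) -- behavior name is a placeholder
    ChaserBehavior:
      trainer: ppo
      batch_size: 512
      buffer_size: 5120
      max_steps: 5.0e5
      self_play:
        window: 10                            # past policy snapshots to sample opponents from
        play_against_current_self_ratio: 0.5  # chance of facing the latest policy
        save_steps: 50000                     # steps between snapshot saves
        swap_steps: 50000                     # steps between opponent swaps
    ```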

    Attachments: Screen Shot 2020-03-23 at 6.22.01 PM.png · Screen Shot 2020-03-23 at 6.39.37 PM.png · Screen Shot 2020-03-23 at 6.39.49 PM.png · Screen Shot 2020-03-23 at 6.23.27 PM.png

    Thanks for any help you can provide.
     
  2. mbaske

    mbaske

    Joined:
    Dec 31, 2017
    Posts:
    473
    Not sure self-play is applicable here, because your two agents have different goals. Try training them as distinct agents, without the self-play option. Also, 2x512 units looks like overkill, given the small number of observables and actions. You can reduce that quite a bit, try 2x128 or even less. Since you're using discrete actions, you can set smaller batch and buffer sizes as well, see https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Training-PPO.md#batch-size
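
    To illustrate the suggestion above, the relevant trainer_config.yaml entries might look like this (illustrative starting values, not tuned for this project):

    ```yaml
    # smaller network and buffers for a simple discrete-action task
    hidden_units: 128    # down from 512
    num_layers: 2
    batch_size: 128      # discrete actions: smaller batches (32-512) are typical
    buffer_size: 2048    # keep it a few multiples of batch_size
    ```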
     
    AndyWarrior likes this.
  3. AndyWarrior

    AndyWarrior

    Joined:
    Feb 9, 2017
    Posts:
    6
    Thanks for the reply. Training them separately makes sense, since they have different goals. I'm just curious why the second agent isn't moving in this case. Even if self-play isn't applicable here, it should still move, right? Or does Unity somehow detect whether or not both agents have the same goal and decide that self-play isn't the right choice?
     
  4. ultimatepixel

    ultimatepixel

    Joined:
    Mar 28, 2020
    Posts:
    1
    I've had the same issue and my two agents were actually doing the same exact action (trying to stab each other).
    After a few days I finally found a fix - add the Decision Requester component to your agents.
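
    For anyone finding this later: the fix above means adding the built-in Decision Requester component next to the Agent in the Inspector. If you'd rather do it from code, a sketch would look something like this (namespace and field names are per ML-Agents ~0.14/0.15 as far as I know; check your version):

    ```csharp
    using MLAgents;      // later releases moved this to Unity.MLAgents
    using UnityEngine;

    public class AgentSetup : MonoBehaviour
    {
        void Awake()
        {
            // Make sure every agent requests decisions on a fixed cadence.
            var requester = gameObject.GetComponent<DecisionRequester>()
                            ?? gameObject.AddComponent<DecisionRequester>();
            requester.DecisionPeriod = 5;               // a decision every 5 Academy steps
            requester.TakeActionsBetweenDecisions = true; // repeat last action in between
        }
    }
    ```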
     
    WalkLearner likes this.
  5. LexVolkov

    LexVolkov

    Joined:
    Sep 14, 2014
    Posts:
    62
    For different tasks, different brains.

    Why doesn't your agent move? Maybe try including both green_tag and red_tag in the detectable tags of your ray sensor settings.
     
  6. WalkLearner

    WalkLearner

    Joined:
    Mar 12, 2020
    Posts:
    10
    Thanks! You saved my life! It actually works like a charm! I still don't know the reason, because I had manually added decision requests in code with the same settings as the component in the Inspector. Somehow, only the Decision Requester component runs normally with self-play.
     
  7. AndrewGri

    AndrewGri

    Joined:
    Jan 30, 2020
    Posts:
    12
    I'm also trying to train two different agents in the same scenario. I trained one separately, and now I'm going to train the other one with the trained agent running in inference.
    I have a question. When one agent catches the other, I need to reset both agents. But I guess the reset should be initiated only by the agent that is currently training, not the one running inference.
    I assume the inference agent should not call Done() when it catches the training agent, so I need to check from the agent script whether the current mode is inference or not.
    How do I do that?

    P.S. I temporarily solved this by introducing another public variable in the agent script, but of course that's not convenient.
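
    One way to check this without an extra public variable is to read the agent's BehaviorParameters; a sketch below, assuming ML-Agents ~0.14/0.15 (the property was renamed to BehaviorType and moved to Unity.MLAgents.Policies in later releases, so check your version):

    ```csharp
    using MLAgents;      // Unity.MLAgents.Policies in later releases
    using UnityEngine;

    public static class AgentModeUtil
    {
        // Returns true when the agent is configured to only run its trained model.
        public static bool IsInferenceOnly(Agent agent)
        {
            var bp = agent.GetComponent<BehaviorParameters>();
            return bp != null && bp.behaviorType == BehaviorType.InferenceOnly;
        }
    }

    // Usage inside the agent script: only the training agent ends the episode.
    // if (!AgentModeUtil.IsInferenceOnly(this)) { Done(); }
    ```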
     
  8. WalkLearner

    WalkLearner

    Joined:
    Mar 12, 2020
    Posts:
    10
    I think in any game you should reset all agents, for fairness, or by the very definition of a GAME. The inference agents can also be reset if you call Done() (0.14) or EndEpisode() (0.15) for them. The reset should not affect your inference agent's performance if you trained it well. However, in an adversarial scenario I would recommend training both agents at the same time rather than separately, because asymmetric self-play is already good enough, and it saves time by producing both models in one run.