ML-Agents, multi-policy self-play

Discussion in 'ML-Agents' started by m4l4, Jul 29, 2020.

  1. m4l4

    Joined:
    Jul 28, 2020
    Posts:
    26
    Hi everyone,
    with self-play it's possible to train multiple agents in a competitive environment, but they all share the same goal, the same perception of the world, etc.; basically they share the same policy during episodes.

    In the context of herbivores vs. carnivores, herbivores have to learn to find plants and avoid predators, while carnivores have to learn how to catch herbivores. They need a different perception of the surrounding environment, and they each need their own policy. Reward is life-dependent: the older you get, the higher the score.
    When an agent dies, it gets AddReward(-1f) and EndEpisode(), and it respawns, starting a new episode in the ALREADY running env (no env reset, just the dead agent)! Roughly like the sketch below.
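    A minimal sketch of that death/respawn flow, assuming the standard ML-Agents Agent API (AddReward, EndEpisode and OnEpisodeBegin are the real calls; the class name and spawn helper are made up):

    Code (CSharp):
    using UnityEngine;
    using Unity.MLAgents;

    public class CreatureAgent : Agent
    {
        // Called by whatever detects death (predator contact, starvation, ...).
        public void Die()
        {
            AddReward(-1f);  // death penalty
            EndEpisode();    // ends THIS agent's episode only
        }

        // ML-Agents calls this when the agent's new episode begins;
        // only this agent respawns, the environment keeps running.
        public override void OnEpisodeBegin()
        {
            transform.position = RandomSpawnPoint(); // hypothetical helper
        }

        Vector3 RandomSpawnPoint()
        {
            return new Vector3(Random.Range(-10f, 10f), 0f, Random.Range(-10f, 10f));
        }
    }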

    I made a simple env named EnvSym, gave the agents different behavior names (Carnivore and Herbivore), made a config file named EnvSym.yaml, and launched training.

    Unity does connect with 2 brains, and the names are correct (Carnivore, Herbivore), but their parameters are sort of default, not my config. This already happened once, because the name of the config file didn't match the name of the behavior.
    I tried making 2 config files named after the behaviors, but I don't know if there's a command to pass 2 different files at once:
    mlagents-learn config/ppo/???? --run-id=EnvSym01

    Is there a way to put different configs in the same file? Or should they be separated?

    Is it correct to split carnivores and herbivores into 2 different teams? They are not really a "team"; cooperation can be useful, but the goal is still "survive as long as you can on your own".

    And the most important question: is my project even possible right now with ML-Agents?
     
  2. celion_unity

    Unity Technologies

    Joined:
    Jun 12, 2019
    Posts:
    187
  3. m4l4

    Joined:
    Jul 28, 2020
    Posts:
    26
    Yes, I found by trial and error that I can put multiple behaviors in the same config file, and the training starts just fine.
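    For anyone searching later, a sketch of what that single file can look like in the current config format, with both behaviors keyed by name (the behaviors / trainer_type / hyperparameters keys follow the documented ML-Agents schema; the values here are placeholders, not tuned):

    Code (YAML):
    behaviors:
      Carnivore:
        trainer_type: ppo
        hyperparameters:
          batch_size: 1024
          buffer_size: 10240
          learning_rate: 3.0e-4
        network_settings:
          hidden_units: 128
          num_layers: 2
        reward_signals:
          extrinsic:
            gamma: 0.99
            strength: 1.0
        max_steps: 5.0e6

      Herbivore:
        trainer_type: ppo
        hyperparameters:
          batch_size: 1024
          buffer_size: 10240
          learning_rate: 3.0e-4
        network_settings:
          hidden_units: 128
          num_layers: 2
        reward_signals:
          extrinsic:
            gamma: 0.99
            strength: 1.0
        max_steps: 5.0e6

    Then a single launch picks up both behaviors by name:
    mlagents-learn config/ppo/EnvSym.yaml --run-id=EnvSym01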

    Thanks for the answer, I'll put self-play back on.
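    If it helps anyone: self-play is enabled per behavior with an extra self_play block under that behavior in the same file. A sketch using the documented keys (these values are guesses, not tuned):

    Code (YAML):
    behaviors:
      Carnivore:
        trainer_type: ppo
        # ... same hyperparameters as above ...
        self_play:
          save_steps: 50000
          team_change: 200000
          swap_steps: 10000
          window: 10
          play_against_latest_model_ratio: 0.5

    The two sides then get different Team Id values on their Behavior Parameters components in Unity.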

    if i get something good i'll let you know :)
     