Understanding Self-Play

Discussion in 'ML-Agents' started by James_Initus, Aug 7, 2020.

  1. James_Initus

    Howdy, I was looking at the tennis example for Self-Play and was amazed to see that the reward system gives rewards only when a goal is scored. That means the system figured everything else out by complete magic :)

    The idea behind self-play is to let the agent discover "everything" on its own, I suppose?

    I am experimenting with Self-Play and have had some "ok" success, but I added some extra shaped rewards to help it along a little bit. Is this a bad idea?

    The arena is basically 2 characters using weapons to hit each other. If someone gets hit, the point is awarded to the character that fired the shot.

    I added small rewards for things like facing the right direction when firing, and negative rewards for firing when neither facing nor seeing the target.
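
    For illustration, the shaped scheme described above could be sketched roughly as follows. This is a hypothetical Python sketch, not the actual Unity C# agent code: the `StepInfo` fields, the `shaped_reward` name, and all reward magnitudes are assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    """What happened to one agent during a single step (hypothetical fields)."""
    fired: bool          # the agent fired its weapon this step
    facing_target: bool  # the agent is aimed roughly at the opponent
    sees_target: bool    # the opponent is in the agent's line of sight
    hit_opponent: bool   # the agent's projectile hit the opponent this step


def shaped_reward(info: StepInfo,
                  hit_reward: float = 1.0,
                  aim_bonus: float = 0.01,
                  blind_fire_penalty: float = 0.01) -> float:
    """Return the per-step reward under the shaped scheme (assumed magnitudes)."""
    reward = 0.0
    if info.hit_opponent:
        reward += hit_reward              # the main objective: land a hit
    if info.fired:
        if info.facing_target:
            reward += aim_bonus           # small bonus for firing while aimed
        elif not info.sees_target:
            reward -= blind_fire_penalty  # penalty for firing blind
    return reward
```

    Keeping the shaping terms much smaller than the hit reward is one way to reduce the chance that the agents learn to chase the bonus instead of the actual goal.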

    At some point it worked, but not as well as I was hoping. They would basically find each other totally by accident and then point their weapons but never fire.

    So, this morning I changed the reward system to "only reward a character when its projectile hits the other character."
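
    A sparse, hit-only scheme like this can be sketched as below. Again this is a hypothetical Python sketch rather than the real Unity code; `sparse_rewards` and its parameters are made-up names, and the optional `penalize_victim` flag (which makes the outcome zero-sum) is an assumption that is not part of the setup described here.

```python
from typing import Tuple


def sparse_rewards(a_hit_b: bool,
                   b_hit_a: bool,
                   hit_reward: float = 1.0,
                   penalize_victim: bool = False) -> Tuple[float, float]:
    """Return (reward_a, reward_b) for one step under a hit-only scheme."""
    # Each shooter is rewarded only when its own projectile lands.
    reward_a = hit_reward if a_hit_b else 0.0
    reward_b = hit_reward if b_hit_a else 0.0
    if penalize_victim:
        # Optional zero-sum variant (an assumption): the character that
        # was hit loses what the shooter gained.
        reward_a -= hit_reward if b_hit_a else 0.0
        reward_b -= hit_reward if a_hit_b else 0.0
    return reward_a, reward_b
```

    With the default arguments, nothing is rewarded or punished except landing a hit, so aiming and firing have to be discovered purely through play.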

    It's running now, but is there any guidance anyone can offer for this scenario?

    Thanks