
How to make the agent keep exploring after many training steps?

Discussion in 'ML-Agents' started by namtran_amanotes, Jan 28, 2021.

  1. namtran_amanotes

    namtran_amanotes

    Joined:
    Mar 18, 2019
    Posts:
    2
    Hello Unity team

    I'm training an ML-Agents agent to play a board game (Gomoku). At the beginning of training, the agent explores new combinations of actions at each step, but after ~24 hours of training it tends to choose the same sequence of actions every episode. Because the environment and the opponent are deterministic and rule-based, the agent then receives the same cumulative reward and doesn't learn anything new.

    How can I encourage the agent to keep exploring new combinations of actions? I think it should try a new combination of actions every episode.

    Thanks, team
     
    Last edited: Jan 28, 2021
  2. ervteng_unity

    ervteng_unity

    Unity Technologies

    Joined:
    Dec 6, 2018
    Posts:
    150
    This is natural behavior for an agent trained with RL: early in training it keeps exploring, and by the end it chooses the actions it thinks are best. If the agent is converging on a sub-optimal solution, you could try one of a few things:
    1. Set the learning_rate_schedule parameter to constant. This will maintain learning/exploring for longer.
    2. Increase beta. This weights the entropy bonus more heavily, encouraging more random actions.
    3. Use the SAC trainer. SAC is a "maximum entropy" algorithm that tries to balance exploration and exploitation throughout training.
    4. Add a curiosity or RND reward signal, which encourages finding new states.
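    Suggestions 1, 2, and 4 are all set in the trainer configuration YAML. A minimal sketch of what that might look like is below; the behavior name `Gomoku` and the specific values (batch sizes, reward strengths, etc.) are placeholders you'd tune for your own environment, and for suggestion 3 you would change `trainer_type` to `sac` (SAC has its own hyperparameter set):

    ```yaml
    behaviors:
      Gomoku:                          # placeholder: must match your Behavior Name
        trainer_type: ppo              # switch to "sac" to try suggestion 3
        hyperparameters:
          learning_rate: 3.0e-4
          learning_rate_schedule: constant   # (1) don't decay the LR over training
          beta: 1.0e-2                       # (2) raised entropy bonus (PPO default is 5.0e-3)
        reward_signals:
          extrinsic:
            gamma: 0.99
            strength: 1.0
          curiosity:                   # (4) intrinsic reward for reaching novel states
            gamma: 0.99
            strength: 0.02             # keep small relative to extrinsic strength
        max_steps: 5.0e6
    ```

    You can watch the effect in TensorBoard: if entropy collapses to near zero early, the agent has stopped exploring, and raising `beta` or adding the curiosity signal should keep it higher for longer.
    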
     
    namtran_amanotes likes this.
  3. namtran_amanotes

    namtran_amanotes

    Joined:
    Mar 18, 2019
    Posts:
    2
    Thank you for your reply.

    I had already applied methods 2 & 4; I'll try the others you recommended.

    Will get back soon.
     
  4. carlosm

    carlosm

    Joined:
    Sep 17, 2015
    Posts:
    7
    Were you able to solve this? I'm having the same issue.