
Question BASIC DISCRETE MODEL NOT WORKING

Discussion in 'ML-Agents' started by carlosm, Dec 4, 2022.

  1. carlosm

    carlosm

    Joined:
    Sep 17, 2015
    Posts:
    7
    Sorry if this is too basic, I'm a beginner at ML-Agents.
    I built a basic model with one discrete action branch of size two. Both values (0 and 1) can earn positive rewards depending on the environment values. The issue is that training samples more 0s and learns that 0 is the only correct value, and barely uses 1 anymore, even though there are still rewards for using 1. Basically the model learns, but only one of the two actions. The agent observes the current action (1 observation) and the environment values (2 observations). I've tried:
    • tuning the hyperparameters, especially beta, batch size, buffer size, normalize, and learning rate, with no success
    • both PPO and SAC
    • AddReward values of +1 and -0.1, which seem to work better, but still nothing
    Any help would be appreciated. Thanks!
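    Since you mention tuning beta: in ML-Agents' PPO trainer, `beta` is the strength of the entropy bonus, so raising it is the most direct lever to keep the policy from collapsing onto a single action early in training. A minimal config sketch below — the behavior name is hypothetical and the values are just illustrative starting points, not tuned for your environment:

    ```yaml
    behaviors:
      MyAgent:                 # hypothetical behavior name; must match your Behavior Parameters
        trainer_type: ppo
        hyperparameters:
          batch_size: 64
          buffer_size: 2048
          beta: 1.0e-2         # entropy bonus; try raising this so the policy keeps sampling action 1
          learning_rate: 3.0e-4
        network_settings:
          normalize: false     # normalization usually matters less for small discrete problems
        max_steps: 500000
    ```

    Watch the Entropy curve in TensorBoard while tuning: if it drops to near zero very early, the policy has stopped exploring and a larger beta is worth trying.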
     

    Attached Files:

  2. garytrickett

    garytrickett

    Joined:
    Sep 2, 2018
    Posts:
    60
    If it's getting penalised for taking more actions or for taking more time, it is likely just finding the way to get the most reward in the shortest time.
    It's hard to tell what the issue is without seeing the full picture, but:
    If you want it to explore more actions, you would want to reward such behaviour — e.g. add a reward each time a unique action is taken.
    You can also mask individual actions so they can't be chosen (the discrete action mask); this can help block action 0 from being chosen after it has been picked too often, etc.
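    A minimal sketch of that action-masking idea, assuming the current ML-Agents C# API (`Agent.WriteDiscreteActionMask` with `IDiscreteActionMask.SetActionEnabled`) — the agent class, the streak counter, and the threshold of 5 are all hypothetical, just to show the mechanism; this only runs inside a Unity project with the ML-Agents package installed:

    ```csharp
    using Unity.MLAgents;
    using Unity.MLAgents.Actuators;

    public class TwoActionAgent : Agent   // hypothetical agent name
    {
        int zeroStreak;                   // consecutive times action 0 was chosen

        public override void WriteDiscreteActionMask(IDiscreteActionMask actionMask)
        {
            // If action 0 has been picked too many times in a row, disable it on
            // branch 0 so the policy is forced to sample action 1 and see its rewards.
            if (zeroStreak >= 5)
                actionMask.SetActionEnabled(0, 0, false);
        }

        public override void OnActionReceived(ActionBuffers actions)
        {
            int action = actions.DiscreteActions[0];
            zeroStreak = (action == 0) ? zeroStreak + 1 : 0;
            // ... environment update and AddReward(...) as in the original setup
        }
    }
    ```

    Masking like this is a blunt instrument — it's mainly useful as a diagnostic to confirm that action 1 really does still carry reward once the agent is forced to take it.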
     
  3. carlosm

    carlosm

    Joined:
    Sep 17, 2015
    Posts:
    7
    I'll try that, but I actually simplified the code to a minimal case to prove the point, and it's still not working properly. See attached.
     

    Attached Files: