Binary Decision not balanced at all

Discussion in 'ML-Agents' started by AugustoGartner, Apr 12, 2021.

  1. AugustoGartner

    Joined:
    Dec 7, 2020
    Posts:
    18
    Hi guys,

    I have a GameObject that can act in one of two ways, and my agent should recognize which action the GameObject is taking. The GameObject's action is related to its position, so that is what the Agent observes.
    For the Agent I am using discrete actions and clamping the value to 1, so that in the end I have a binary discrete action (0 or 1).

    The problem is that my agent doesn't seem to learn: it takes action "0" five times more often than action "1", even though the GameObject's actions are quite balanced.

    Has anyone had the same problem? Or could anyone point out what is wrong with my reasoning?
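
    One way to put numbers on the imbalance described above is to log the GameObject's true action and the agent's guess at each step, then compare the two distributions. The following is a minimal Python sketch of such a tally, not ML-Agents code: the helper name `guess_report` and the sample lists are hypothetical, made up purely for illustration.

```python
from collections import Counter

def guess_report(true_actions, guesses):
    """Summarize how balanced the guesses are and how often they match.

    Hypothetical helper for illustration; not part of ML-Agents.
    """
    if len(true_actions) != len(guesses):
        raise ValueError("true_actions and guesses must have the same length")
    if not true_actions:
        raise ValueError("need at least one logged step")
    matches = sum(t == g for t, g in zip(true_actions, guesses))
    return {
        "true_counts": dict(Counter(true_actions)),
        "guess_counts": dict(Counter(guesses)),
        "accuracy": matches / len(true_actions),
    }

# Made-up log: the true actions are balanced (six 0s, six 1s),
# while the guesses contain five times as many 0s as 1s.
true_actions = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
guesses = [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1]

report = guess_report(true_actions, guesses)
print(report["true_counts"])   # {0: 6, 1: 6}
print(report["guess_counts"])  # {0: 10, 1: 2}
print(round(report["accuracy"], 3))
```

    Comparing `guess_counts` with `true_counts` shows whether the policy is leaning toward one action, and the accuracy shows whether its guesses are better than chance.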
     
  2. andrewcoh_unity

    Unity Technologies

    Joined:
    Sep 5, 2019
    Posts:
    162
    Hi @AugustoGartner

    You shouldn't need to clamp discrete actions. They will just be integers from 0 to num_actions - 1. Are you specifying this action space in some way other than 1 branch with 2 discrete actions in the Behavior Parameters script?

    Also, I am not quite sure I understand your scenario. Are the GameObject and the Agent actually two different agents?
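
    To illustrate the point about clamping: a discrete branch with `num_actions` entries only ever produces the integers 0 through `num_actions - 1`, so a branch of size 2 is already binary. The Python sketch below is a plain enumeration, not ML-Agents code, and the helper names `valid_actions` and `clamped_histogram` are hypothetical. It also shows that clamping a larger branch folds several raw actions onto one value. It is not meant to explain the specific imbalance reported in this thread.

```python
from collections import Counter

def valid_actions(branch_size):
    """All integers a discrete branch of this size can emit: 0 .. branch_size - 1."""
    return list(range(branch_size))

def clamped_histogram(branch_size, max_value=1):
    """Count how many raw actions map to each value after clamping to [0, max_value]."""
    return Counter(min(max(a, 0), max_value) for a in valid_actions(branch_size))

# One branch with 2 discrete actions: already binary, clamping changes nothing.
print(valid_actions(2))            # [0, 1]
print(dict(clamped_histogram(2)))  # {0: 1, 1: 1}

# A larger branch clamped to 1: four of the five raw actions collapse onto 1.
print(dict(clamped_histogram(5)))  # {0: 1, 1: 4}
```

    With a branch of size 2 the mapping is one-to-one, which is why no clamping is needed in that configuration.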
     
  3. AugustoGartner

    Joined:
    Dec 7, 2020
    Posts:
    18
    Hi @andrewcoh_unity,

    thanks for the hint! I will change that.

    Actually, no. I just have two different scripts attached to one GameObject. One script randomizes the GameObject's action, and the other contains the ML-Agent that "guesses" which action was taken.

    Is that a problem?

    I also tried attaching the ML-Agent script to another GameObject, but it didn't seem to make a difference.
     
    Last edited: Apr 13, 2021
  4. andrewcoh_unity

    Unity Technologies

    Joined:
    Sep 5, 2019
    Posts:
    162
    So, you have a script that randomly outputs either 0 or 1, that action is executed in the environment, and the agent needs to guess the value?

    My first thought is that it might be very difficult to do this depending on the problem. For example, if there are many paths through the state-action space that can lead to the current state, the agent will have no way of knowing which path got it there just from the current state. I think you will definitely need some kind of observation stacking.
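
    The ambiguity described above can be sketched in a few lines of Python. Two different paths that end at the same position give identical single-frame observations, but become distinguishable once the previous frame is stacked onto the current one. This is only an illustration of the idea: the class `ObservationStacker`, the helper `final_observation`, the zero padding, and the sample paths are all assumptions made for this sketch. In ML-Agents itself, stacking is normally configured through the Stacked Vectors setting in Behavior Parameters rather than written by hand.

```python
from collections import deque

class ObservationStacker:
    """Keeps the last `num_stacked` observations and returns them concatenated.

    Hypothetical helper for illustration; missing history is padded with zeros.
    """

    def __init__(self, obs_size, num_stacked):
        self.obs_size = obs_size
        self.frames = deque(
            [[0.0] * obs_size for _ in range(num_stacked)],
            maxlen=num_stacked,
        )

    def push(self, obs):
        if len(obs) != self.obs_size:
            raise ValueError("observation has the wrong size")
        self.frames.append(list(obs))  # the oldest frame drops out automatically
        # Flatten oldest-to-newest into one vector for the policy.
        return [x for frame in self.frames for x in frame]

def final_observation(positions, num_stacked):
    """Feed a 1-D position path through the stacker and return the last stacked vector."""
    stacker = ObservationStacker(obs_size=1, num_stacked=num_stacked)
    stacked = []
    for p in positions:
        stacked = stacker.push([p])
    return stacked

path_a = [0.0, 1.0, 2.0]  # object arrives at 2.0 from the left
path_b = [4.0, 3.0, 2.0]  # object arrives at 2.0 from the right

# Without stacking, both paths look the same to the agent.
print(final_observation(path_a, 1), final_observation(path_b, 1))  # [2.0] [2.0]

# With two stacked frames, the direction of travel becomes visible.
print(final_observation(path_a, 2), final_observation(path_b, 2))  # [1.0, 2.0] [3.0, 2.0]
```

    How many frames are needed depends on how far back the action's effect on the position reaches, so the stack size is something to experiment with.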
     
  5. AugustoGartner

    Joined:
    Dec 7, 2020
    Posts:
    18
    OK. Thank you, Andrew! I will look into how to implement observation stacking :)