
Question Unable to reach perfection in extraordinarily simple environment

Discussion in 'ML-Agents' started by RylanYancey, Sep 8, 2021.

  1. RylanYancey

    Joined: Jul 18, 2021
    Posts: 10
    Hello! I have recently begun running tests with Unity ML-Agents.

    After running some simple tests and finding the agents very stubborn to train in the environments I put them in, I tried to create the simplest environment I possibly could. I call him YesBot, and he should always answer correctly. [Code below.] YesBot receives a value, 1 or 0, and outputs a value, 1 or 0. If his output is the same as the input, he wins a +1 reward; if it is not, he loses -1 reward.

    Despite how simple this environment is, I noted that after 100,000 steps (as recorded in Windows Terminal) the bot was stuck at a steady 60% accuracy rate. I ran the calculations from my Debug.Log output several times while the experiment was running and found that after about 20,000 steps it was 60% accurate to the second decimal place, and it never progressed above 60%. While watching my visual interpretation in the Scene view (over 36 agents), I noted that the agents were 100% correct on every third decision, and only ~50% correct on the other two. This behavior became incredibly consistent, and it explains the 60% average accuracy. Even when I used a trained model with inference, the problem persisted.

    I am using a single Discrete branch, with 2 possibilities. My behavior type is set to default. I do have a YAML file I am using. I have only changed the behavior name and the max steps.

    I am very new and understand that I may be overlooking simple concepts. If you don't have an answer, I would appreciate it if you could point me toward documentation that might.

    -Rylan Yancey, Amateur dev

    Code (CSharp):
    using System.Collections;
    using System.Collections.Generic;
    using UnityEngine;
    using Unity.MLAgents;
    using Unity.MLAgents.Sensors;
    using Unity.MLAgents.Actuators;

    public class YesBot : Agent
    {
        int yes;

        [SerializeField] Material Win;
        [SerializeField] Material Lose;

        public override void OnEpisodeBegin()
        {
            // Int overload of Random.Range is min-inclusive, max-exclusive: returns 0 or 1.
            yes = Random.Range(0, 2);
        }

        public override void CollectObservations(VectorSensor sensor)
        {
            sensor.AddObservation(yes);
        }

        public override void OnActionReceived(ActionBuffers actions)
        {
            MeshRenderer a = GetComponent<MeshRenderer>();

            if (actions.DiscreteActions[0] == yes)
            {
                AddReward(+5f);
                Debug.Log("Right!");
                a.material = Win;
            }
            else
            {
                AddReward(-5f);
                Debug.Log("Wrong!");
                a.material = Lose;
            }

            EndEpisode();
        }
    }
    [Screenshot attached: upload_2021-9-7_21-49-9.png]
     
  2. ruoping_unity

    Unity Technologies

    Joined: Jul 10, 2020
    Posts: 134
    Random.Range() seems to be inclusive on both the min and max value, so your input might actually be [0, 1, 2] instead of [0, 1].
     
  3. gft-ai

    Joined: Jan 12, 2021
    Posts: 44
    I think the max value given by Random.Range() for integers is exclusive, so I am not sure that is the problem.

    https://docs.unity3d.com/ScriptReference/Random.Range.html
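    For reference, the int overload of Random.Range documented at that link is min-inclusive, max-exclusive, which is the same convention System.Random.Next uses. A quick sketch outside Unity (using System.Random, since UnityEngine.Random needs the engine running) illustrates it:

    ```csharp
    using System;

    class RangeDemo
    {
        static void Main()
        {
            // System.Random.Next(minInclusive, maxExclusive) follows the same
            // convention as the int overload of UnityEngine.Random.Range:
            // Next(0, 2) only ever returns 0 or 1, never 2.
            var rng = new Random();
            for (int i = 0; i < 10000; i++)
            {
                int v = rng.Next(0, 2);
                if (v < 0 || v > 1)
                    throw new Exception("out of range: " + v);
            }
            Console.WriteLine("All 10000 samples were 0 or 1.");
        }
    }
    ```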
     
  4. RylanYancey

    Joined: Jul 18, 2021
    Posts: 10
    I tested it beforehand. Random.Range(0, 2) always returns either 0 or 1: the 2 is exclusive, the 0 inclusive.
     
  5. RankNFyle

    Joined: Jan 4, 2021
    Posts: 31
    Just an idea for debugging: instead of randomizing, set yes to always be 0 and run the training, then set it to always be 1 and run again.
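    That suggestion could be sketched as a small change to OnEpisodeBegin. The debugValue field below is hypothetical, added here only for illustration:

    ```csharp
    // Hypothetical debugging tweak: expose a fixed answer in the Inspector
    // instead of randomizing, so each training run tests a single input.
    [SerializeField] int debugValue = 0;   // set to 0 for one run, 1 for the next

    public override void OnEpisodeBegin()
    {
        yes = debugValue;   // replaces: yes = Random.Range(0, 2);
    }
    ```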
     
  6. unity_-DoCqyPS6-iU3A

    Joined: Aug 18, 2018
    Posts: 26
    I saw a YouTube video about an old version of ML-Agents where an episode length of 1 caused problems. Can you try making your episodes at least 2 steps long?

    Also, can you please show your config.yaml file?
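    For reference, a minimal trainer config for this setup might look like the sketch below. The behavior name and hyperparameter values are assumptions for illustration, since the original post only says the behavior name and max steps were changed from the defaults:

    ```yaml
    behaviors:
      YesBot:                 # must match the Behavior Name on the agent's Behavior Parameters
        trainer_type: ppo
        hyperparameters:
          batch_size: 64
          buffer_size: 1024
          learning_rate: 3.0e-4
        network_settings:
          hidden_units: 128
          num_layers: 2
        reward_signals:
          extrinsic:
            gamma: 0.99
            strength: 1.0
        max_steps: 100000
    ```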
     
  7. smallg2023

    Joined: Sep 2, 2018
    Posts: 147
    What are your training results in Python (i.e. the mean reward and the standard deviation)?
    Also, did you edit the training file / behaviours at all?
    I did a test using your script and it seems to train fine here: at 25k steps it reached a mean reward of 0.999 with 0.033 std, so I would say they are very nearly perfect. However, when using the trained brain it does indeed only get it right 60% of the time. Very odd.
    Edit: found the problem. Set the Decision Period to 1 in the Decision Requester and they will work correctly. :)
    (I am using ML-Agents version 0.27.0.)
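    The Decision Period setting lives on the DecisionRequester component: with a period greater than 1, the agent only requests a new decision every N Academy steps and repeats its last action in between, which matches the "correct on every third decision" pattern described above. Setting it from code might look like this sketch (the component is normally configured in the Inspector instead):

    ```csharp
    using Unity.MLAgents;
    using UnityEngine;

    public class YesBotSetup : MonoBehaviour
    {
        void Awake()
        {
            // With DecisionPeriod > 1 the agent repeats its previous action on
            // the Academy steps between decisions; DecisionPeriod = 1 requests
            // a fresh decision every step, so no stale actions are replayed.
            var requester = GetComponent<DecisionRequester>();
            requester.DecisionPeriod = 1;
        }
    }
    ```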
     
    Last edited: Sep 13, 2021