Question Unable to reach perfection in extraordinarily simple environment

Discussion in 'ML-Agents' started by RylanYancey, Sep 8, 2021.

  1. RylanYancey

    RylanYancey

    Joined:
    Jul 18, 2021
    Posts:
    10
    Hello! I have recently begun running tests with Unity ML-Agents.

    After some simple tests in which the agents stubbornly refused to learn, I built the simplest environment I could think of. I call him Yesbot, and he should always answer yes [code below]. Yesbot receives a value, 1 or 0, and outputs a value, 1 or 0. If his output matches the input, he earns a +1 reward; if not, he gets a -1 penalty (the code below uses +5 and -5).

    Despite how simple this environment is, after 100,000 steps (as reported in Windows Terminal) the bot sat at a steady 60% accuracy. I tallied the "Right!"/"Wrong!" output from my Debug.Log calls several times during the experiment and found that after about 20,000 steps the accuracy was 60% to the second decimal place (60.0x%). It never rose above 60%. Watching the visual feedback in the Scene view (36 agents), I noticed that the agents were 100% correct on every third decision and only about 50% correct on the other two. This pattern was very consistent and seems to account for the roughly 60% average. The problem persisted even when I ran a trained model in inference mode.

    I am using a single discrete branch with 2 possible actions. My Behavior Type is set to Default. I am using a YAML config file in which I changed only the behavior name and the max steps.

    I am very new and understand that I may be overlooking simple concepts. If you don't have an answer, I would appreciate it if you could point me to documentation that might.

    -Rylan Yancey, Amateur dev

    Code (CSharp):
    using System.Collections;
    using System.Collections.Generic;
    using UnityEngine;
    using Unity.MLAgents;
    using Unity.MLAgents.Sensors;
    using Unity.MLAgents.Actuators;

    public class YesBot : Agent
    {
        // Target value for the current episode: 0 or 1.
        int yes;

        [SerializeField] Material Win;
        [SerializeField] Material Lose;

        public override void OnEpisodeBegin()
        {
            // The int overload of Random.Range excludes the max, so this yields 0 or 1.
            yes = Random.Range(0, 2);
        }

        public override void CollectObservations(VectorSensor sensor)
        {
            // The only observation is the target value itself.
            sensor.AddObservation(yes);
        }

        public override void OnActionReceived(ActionBuffers actions)
        {
            MeshRenderer a = GetComponent<MeshRenderer>();

            // Reward a match with the observed value and penalize a mismatch.
            if (actions.DiscreteActions[0] == yes)
            {
                AddReward(+5f);
                Debug.Log("Right!");
                a.material = Win;
            }
            else
            {
                AddReward(-5f);
                Debug.Log("Wrong!");
                a.material = Lose;
            }

            // Every episode lasts exactly one action.
            EndEpisode();
        }
    }
     
  2. ruoping_unity

    ruoping_unity

    Unity Technologies

    Joined:
    Jul 10, 2020
    Posts:
    134
    Random.Range() seems to be inclusive on both the min and max values, so your input might actually be [0, 1, 2] instead of [0, 1].
     
  3. gft-ai

    gft-ai

    Joined:
    Jan 12, 2021
    Posts:
    44
    I think the max value given by Random.Range() for integers is exclusive, so I am not sure that is the problem.

    https://docs.unity3d.com/ScriptReference/Random.Range.html
     
  4. RylanYancey

    RylanYancey

    Joined:
    Jul 18, 2021
    Posts:
    10
    I tested it beforehand. Random.Range(0, 2) always returns either 0 or 1. The 2 is exclusive; the 0 is inclusive.
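
    For reference, a minimal sketch contrasting the two overloads discussed above: in Unity's scripting API the int overload of Random.Range excludes the max, while the float overload includes it. The class name RangeCheck is hypothetical and used only for illustration.

    ```csharp
    using UnityEngine;

    // Hypothetical helper for illustration; attach to any GameObject and press Play.
    public class RangeCheck : MonoBehaviour
    {
        void Start()
        {
            // int overload: min inclusive, max exclusive, so only 0 or 1 can come out.
            int coin = Random.Range(0, 2);

            // float overload: both ends inclusive, so any value in [0, 2] is possible.
            float value = Random.Range(0f, 2f);

            Debug.Log($"int: {coin}, float: {value}");
        }
    }
    ```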
     
  5. RankNFyle

    RankNFyle

    Joined:
    Jan 4, 2021
    Posts:
    31
    Just an idea for debugging: instead of a random value, always set yes to 0 and run the training, then always set yes to 1 and run the training again.
     
  6. unity_-DoCqyPS6-iU3A

    unity_-DoCqyPS6-iU3A

    Joined:
    Aug 18, 2018
    Posts:
    26
    I saw a YouTube video about an old version of ML-Agents where an episode length of 1 caused problems. Can you try making your episodes at least 2 steps long?

    Also, can you please show your config.yaml file?
     
  7. smallg2023

    smallg2023

    Joined:
    Sep 2, 2018
    Posts:
    102
    What are your training results in Python (i.e. the mean reward and the standard deviation)?
    Also, did you edit the training file or behaviours at all?
    I ran a test using your script and it seems to train fine here.
    At 25k steps it reached a mean reward of 0.999 with a std of 0.033, so I would say the agents are very nearly perfect. However, when using the trained brain they do indeed only get it right 60% of the time. Very odd.
    Edit: found the problem. Set the Decision Period to 1 in the Decision Requester and they will work correctly :)
    (I am using ML-Agents version 0.27.0.)
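
    A plausible reading of the 60% figure, assuming the Decision Requester was left at a Decision Period of 5 with Take Actions Between Decisions enabled (believed to be the defaults): OnActionReceived runs on every step and the script ends the episode each time, but only one step in five gets a fresh decision based on the current observation. That step is always right, while the other four match the newly drawn value about half the time, giving (1 + 4 × 0.5) / 5 = 0.6. The same setting can also be applied from code. The sketch below assumes the public DecisionPeriod field on DecisionRequester; the class name DecideEveryStep is hypothetical.

    ```csharp
    using UnityEngine;
    using Unity.MLAgents;

    // Hypothetical helper: makes the agent on this GameObject request a fresh
    // decision on every Academy step instead of reusing or skipping decisions.
    [RequireComponent(typeof(DecisionRequester))]
    public class DecideEveryStep : MonoBehaviour
    {
        void Awake()
        {
            // Assumption: DecisionPeriod is a public int field on DecisionRequester.
            DecisionRequester requester = GetComponent<DecisionRequester>();
            requester.DecisionPeriod = 1;
        }
    }
    ```

    Attaching this next to the agent's Decision Requester should have the same effect as typing 1 into the Decision Period field in the Inspector.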
     
    Last edited: Sep 13, 2021