
ML Agents Vector Observation best practices

Discussion in 'ML-Agents' started by Fr2, Feb 6, 2020.

  1. Fr2

    Fr2

    Joined:
    Jan 19, 2013
    Posts:
    39
    I've been working through the "RollerBall" example for the latest release (0.13.1) in the "Making a New Learning Environment" doc: https://github.com/Unity-Technologi....13.1/docs/Learning-Environment-Create-New.md

    It's learning really quickly, but I had a couple of queries about the way Vector Observations are collected.

    The RollerBall demo collects observations like this:

    Code (CSharp):
    public override void CollectObservations()
    {
        // Target and Agent positions
        AddVectorObs(Target.position);
        AddVectorObs(this.transform.position);

        // Agent velocity
        AddVectorObs(rBody.velocity.x);
        AddVectorObs(rBody.velocity.z);
    }
    According to the docs, "In total, the state observation contains 8 values". I couldn't work out how 8 values were gathered - I only see 4! Or does each of the vectors (Target.position & transform.position) count for more than one value?


    The other query I had related to best practices for collecting Vector Observations: https://github.com/Unity-Technologi...1/docs/Learning-Environment-Best-Practices.md

    In the doc it says "all inputs should be normalized to be in the range 0 to +1 (or -1 to 1)" - but none of the RollerBall values (velocity and positions) are normalized. So is it the case that normalized values are ideal for learning, but not strictly necessary?

    The doc also suggests using relative values for the target: "Positional information of relevant GameObjects should be encoded in relative coordinates wherever possible." But the RollerBall example observes global positions rather than relative ones. Again, just wondering what the best approach would be, and whether relative position is strictly necessary.
     
  2. Fr2

    Fr2

    Joined:
    Jan 19, 2013
    Posts:
    39
    OK the first question about the space size of 8 was an easy one. Looking at VectorSensor I can see that a Vector3 is simply added as three individual values:

    Code (CSharp):
    public void AddObservation(Vector3 observation)
    {
        AddFloatObs(observation.x);
        AddFloatObs(observation.y);
        AddFloatObs(observation.z);
    }
    So two Vector3 values = 6 observations, plus the two velocity values = 8.
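
    For reference, here's the same CollectObservations method from the docs, annotated with per-call value counts (the comments are mine; the code itself is unchanged):

    Code (CSharp):

    public override void CollectObservations()
    {
        AddVectorObs(Target.position);          // Vector3 -> 3 values
        AddVectorObs(this.transform.position);  // Vector3 -> 3 values
        AddVectorObs(rBody.velocity.x);         // float   -> 1 value
        AddVectorObs(rBody.velocity.z);         // float   -> 1 value
    }
    // Total observation size: 3 + 3 + 1 + 1 = 8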

    I forgot that ml-agents is all open source, so I can look into the code myself!
     
  3. celion_unity

    celion_unity

    Joined:
    Jun 12, 2019
    Posts:
    289
    Good point; the example is a bit confusing. We can add some comments to the code sample to indicate that the first two calls are actually adding 3 floats each.

    For normalization, I think it's a good "best practice" and might help your model train faster, but it's not an absolute requirement. In general, you don't want different inputs to have very different orders of magnitude.
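
    As an example, one way to squash the RollerBall observations into roughly the -1 to 1 range would be something like this (a sketch only - the floor half-width and max speed here are assumed values, not from the official example):

    Code (CSharp):

    // Assumed bounds, not part of the official example:
    const float areaHalfWidth = 5f;   // hypothetical half-width of the floor
    const float maxSpeed = 10f;       // hypothetical top speed of the agent

    public override void CollectObservations()
    {
        // Positions scaled into roughly [-1, 1]
        AddVectorObs(Target.position / areaHalfWidth);
        AddVectorObs(this.transform.position / areaHalfWidth);

        // Velocity components scaled into roughly [-1, 1]
        AddVectorObs(rBody.velocity.x / maxSpeed);
        AddVectorObs(rBody.velocity.z / maxSpeed);
    }

    The point is just that every observation ends up on a similar scale; the exact divisors depend on your scene.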

    For relative vs global coordinates, I think this gets a bit better in this section: https://github.com/Unity-Technologi...multiple-training-areas-within-the-same-scene, where the positions are changed to localPosition instead.
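
    For anyone following along, the change in that section is roughly this (a sketch based on the linked doc - using localPosition so positions are relative to each training area's parent transform):

    Code (CSharp):

    public override void CollectObservations()
    {
        // Positions relative to the training area's parent transform
        AddVectorObs(Target.localPosition);
        AddVectorObs(this.transform.localPosition);

        // Agent velocity
        AddVectorObs(rBody.velocity.x);
        AddVectorObs(rBody.velocity.z);
    }

    An alternative is to observe the offset directly, e.g. AddVectorObs(Target.position - this.transform.position), but note that would shrink the observation size from 8 to 5, so the Space Size in the agent's settings would need updating to match.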
     
  4. Fr2

    Fr2

    Joined:
    Jan 19, 2013
    Posts:
    39
    Many thanks