
ML Agents: Only the Agent at position 0,0,0 Works Properly and Odd Reward Drop Off

Discussion in 'ML-Agents' started by BrandonPlays, Mar 6, 2022.

  1. BrandonPlays

    BrandonPlays

    Joined:
    Jun 19, 2019
    Posts:
    4
    Hey Everyone!
    Just getting into ML-Agents for the first time and starting with the classic "cube reach the sphere" example.
    After training for a while, I noticed two very odd things:

    • Only the agent at the coordinates 0,0,0 would perform as it should.
    • A huge, unexplained drop-off in intelligence after training for a while.

    I've included pictures of both situations. This is a very basic example: 2 continuous actions, and the only observations are the agent's own position and its target's position, both as Vector3s.
    Any help and insight is greatly appreciated!

    [Two screenshots attached, one for each situation.]
     
  2. WaxyMcRivers

    WaxyMcRivers

    Joined:
    May 9, 2016
    Posts:
    59
    There isn't enough detail here to say exactly what's going on, but to me it looks like an issue with the coordinate system the agent's observations are expressed in. Make sure you're using coordinates local to each individual training-environment container object, or use Transform.InverseTransformPoint/InverseTransformDirection to re-localize the information from the training environment to the agent.
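    For example, something like this (a sketch, not your actual setup; the field names here are hypothetical):

    // Inside your Agent subclass (using Unity.MLAgents; using Unity.MLAgents.Sensors;).
    public Transform trainingArea; // hypothetical reference to the environment container
    public Transform target;       // hypothetical reference to the target sphere

    public override void CollectObservations(VectorSensor sensor)
    {
        // Re-express world-space positions in the training area's local frame,
        // so every copy of the environment sees the same numbers.
        sensor.AddObservation(trainingArea.InverseTransformPoint(transform.position));
        sensor.AddObservation(trainingArea.InverseTransformPoint(target.position));
    }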
     
    BrandonPlays likes this.
  3. BrandonPlays

    BrandonPlays

    Joined:
    Jun 19, 2019
    Posts:
    4
    This is pretty much everything the agent is doing. Does this account for local coordinates?

    [Screenshot of the agent's code attached.]
     
  4. PolenTz

    PolenTz

    Joined:
    Jan 8, 2021
    Posts:
    3
    You are adding the global positions of the agent and the target (transform.position, target.transform.position) to the VectorSensor, so every agent in your scene sees different coordinates. With coordinates that differ like that, the RL algorithm cannot generalize a correct behavior. Try transform.localPosition and target.transform.localPosition instead.
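    Assuming the agent and the target are parented under their training-area object, that change looks like this in CollectObservations:

    // Before: world-space positions, different for every training area.
    sensor.AddObservation(transform.position);
    sensor.AddObservation(target.transform.position);

    // After: positions relative to the parent training area, the same across areas.
    sensor.AddObservation(transform.localPosition);
    sensor.AddObservation(target.transform.localPosition);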
     
    BrandonPlays likes this.
  5. ChillX

    ChillX

    Joined:
    Jun 16, 2016
    Posts:
    145
    Also, if I may suggest:
    1) Normalize the observations.
    2) Use a delta to the target instead of an absolute position.

    // If your game map's center is positioned at 0,0 in world coordinates:

    float MaxDistance = 100f; // Set to the max distance from 0,0 to the edge of the map.
    // Agent position, normalized to roughly [-1, 1].
    sensor.AddObservation(transform.position / MaxDistance);
    // Delta to the target; the gap can span up to twice MaxDistance, hence the * 2f.
    sensor.AddObservation((transform.position - target.transform.position) / (MaxDistance * 2f));

    If your game map's center is NOT positioned at 0,0 in world coordinates, then create an offset Vector3 and add it to the transform positions in the observations.
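    For example (the center coordinates here are made up; substitute your map's actual center):

    // Hypothetical map centered at (250, 0, 250) in world space.
    Vector3 mapCenterOffset = new Vector3(-250f, 0f, -250f);
    // Adding the offset shifts positions so the map center reads as 0,0,0 again.
    sensor.AddObservation((transform.position + mapCenterOffset) / MaxDistance);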

    Or, like @PolenTz said, parent the agent and target to a game object centered in the middle of the game map and then use local coordinates instead.
     
    BrandonPlays likes this.
  6. BrandonPlays

    BrandonPlays

    Joined:
    Jun 19, 2019
    Posts:
    4
    I figured this out a few hours ago and solved it by subtracting the environment's position from the agent's current position. That essentially normalizes everything to 0,0,0, and it worked! Thank you though! I'll look into .localPosition. Funnily enough, it was able to generate a proper brain before this fix, but it would only work at 0,0,0 XD
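    In code, that's essentially the following (envTransform is just a stand-in name for the reference to the environment object):

    // Observations expressed relative to the environment's origin.
    sensor.AddObservation(transform.position - envTransform.position);
    sensor.AddObservation(target.transform.position - envTransform.position);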