Feedback - Curiosity Module on non Physics based simulations

Discussion in 'ML-Agents' started by ChillX, Jan 16, 2022.

  1. ChillX

    ChillX

    Joined:
    Jun 16, 2016
    Posts:
    145
    After hundreds of tests with many parameter changes (including strength, combining with GAIL, etc.), I've stopped using the Curiosity module in any simulation where movement is not physics based.
    Use case details:
    I am using AStar PathFinding Project Pro for movement and navmesh navigation.
    For the agent actions I have tried both continuous actions mapped to a vector and discrete actions mapped to left, right, up, down and stay movements.
    The agent expresses the desired direction, and this is applied as a delta to the AStar Pathfinding Project movement target.
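The action mapping described above can be sketched roughly as follows. This is a minimal Python illustration, not the project's Unity C# code: the action ordering, the step size and the name `apply_action` are all assumptions made for the example, and in the real project the resulting target would be handed to the pathfinding movement script.

```python
# Hypothetical mapping from a discrete action index to a 2D direction (x, z).
# The ordering of the actions is an assumption for illustration only.
DISCRETE_DIRECTIONS = {
    0: (0.0, 0.0),   # stay
    1: (-1.0, 0.0),  # left
    2: (1.0, 0.0),   # right
    3: (0.0, 1.0),   # up
    4: (0.0, -1.0),  # down
}

STEP_SIZE = 2.0  # assumed distance the target is shifted per decision


def apply_action(target, action, step_size=STEP_SIZE):
    """Return the movement target shifted by the delta the action asks for."""
    dx, dz = DISCRETE_DIRECTIONS[action]
    return (target[0] + dx * step_size, target[1] + dz * step_size)


# Example: shift a target at the origin one step to the right.
new_target = apply_action((0.0, 0.0), 2)
```

For the continuous variant, the two action values would take the place of `dx` and `dz` directly.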

    Observation:
    No matter what I do, the model trains fine with curiosity disabled. But the moment curiosity is enabled, it begins exploring variances in pathfinding movement rather than focusing on the rewards. I've tried both sparse rewards and dense shaped rewards; neither makes any difference. Eventually, after about 500,000 steps, the model collapses to the point where the agent simply runs in a single diagonal direction until it hits the edge of the map. If I restart from scratch it may settle on a different diagonal direction, but the outcome is always the same.

    Running GAIL in parallel dampens the effect of the curiosity module somewhat, but eventually the outcome is still the same.

    With the curiosity module disabled, however, the model trains fine.

    Further note:
    The reason I frame the topic as physics based vs non physics based is that initially I had the simulation working using Unity Physics with rigid bodies, a collider floor, walls, etc. While it was physics based, Curiosity worked perfectly, as expected.
    However, I moved the simulation across to A Star Pathfinding Project so that I could later make actual maps instead of a basic square play area. Once I did this, Curiosity simply would not work, even with the same basic square play area.

    Kind Regards
    Tikiri
     
    Last edited: Jan 16, 2022
  2. ChillX

    Further update on this, and how to reproduce it:

    The default (built-in) sensor on the agent, populated in CollectObservations(), is a vector sensor. For some unknown reason, if a large number of observations (> 64) is written to this built-in vector sensor and stacking is enabled on it, the curiosity module completely messes up training.

    To reproduce:
    1) Create an agent that populates the built-in sensor with more than 64 distinct observations.
    2) Enable stacking by setting the stacking depth to 2.
    3) Train the model to decent performance with Curiosity disabled.
    4) Keeping all hyperparameters (except curiosity) the same, do a second run using --initialize-from to clone the fully trained weights, but this time enable the curiosity module.
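To make the effect of step 2 concrete, here is a rough Python sketch of what stacking does to the input size. It only illustrates the idea and is not ML-Agents code: the class name `StackedObservation` is hypothetical, and the zero padding before the first real frame and the oldest-first ordering are assumptions.

```python
from collections import deque


class StackedObservation:
    """Keeps the last `depth` observation vectors and concatenates them."""

    def __init__(self, obs_size, depth):
        self.obs_size = obs_size
        self.depth = depth
        # Start with zero-filled frames so the output size is constant
        # from the first step (assumed padding behaviour).
        self.frames = deque(
            ([0.0] * obs_size for _ in range(depth)), maxlen=depth
        )

    def push(self, obs):
        """Add the newest observation and return the stacked vector."""
        if len(obs) != self.obs_size:
            raise ValueError("unexpected observation size")
        self.frames.append(list(obs))  # drops the oldest frame automatically
        # Oldest frame first, newest last.
        return [value for frame in self.frames for value in frame]


# 65 observations (just over the 64 mentioned above) with a stacking depth of 2.
stacked = StackedObservation(obs_size=65, depth=2)
vector = stacked.push([0.5] * 65)
print(len(vector))  # 130
```

With a stacking depth of 2, every network that consumes the observation, including the curiosity module's, sees twice as many inputs per step.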
     
  3. ChillX

    Comment:
    In my case the observations are the normalized direction and the magnitude of the offset to each of the 32 other agents on the map.
    Direction is 2D as they are moving on a flat surface, so two floats, X and Z (normalized, with the Y axis zeroed out).
    For magnitude I am feeding two floats:
    Close range magnitude: Mathf.Clamp01(Magnitude / 20f);
    Long range magnitude: Mathf.Clamp01(Magnitude / 50f);
    So the total fed into the built-in VectorSensor is 32 x 4 = 128 floats, normalized to 0,1.
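The encoding described above can be sketched as follows. This is a Python approximation of the idea rather than the original Unity C# code; the function names are hypothetical, and only the divisors 20 and 50 and the count of 32 agents come from the post. Note that the direction components in this sketch fall in [-1, 1]; whether the original code remaps them into 0..1 is not stated.

```python
import math

CLOSE_RANGE = 20.0  # divisor from the post for the close range magnitude
LONG_RANGE = 50.0   # divisor from the post for the long range magnitude


def clamp01(value):
    """Python equivalent of Unity's Mathf.Clamp01."""
    return max(0.0, min(1.0, value))


def encode_other_agent(self_pos, other_pos):
    """Return 4 floats for one other agent: dir X, dir Z, close mag, long mag.

    Positions are (x, z) tuples, i.e. the Y axis has already been dropped.
    """
    dx = other_pos[0] - self_pos[0]
    dz = other_pos[1] - self_pos[1]
    magnitude = math.hypot(dx, dz)
    if magnitude > 0.0:
        dir_x, dir_z = dx / magnitude, dz / magnitude
    else:
        dir_x, dir_z = 0.0, 0.0  # same position: no meaningful direction
    return [
        dir_x,
        dir_z,
        clamp01(magnitude / CLOSE_RANGE),
        clamp01(magnitude / LONG_RANGE),
    ]


def encode_all(self_pos, others):
    """Concatenate the 4-float encoding of every other agent."""
    obs = []
    for other in others:
        obs.extend(encode_other_agent(self_pos, other))
    return obs


others = [(float(i), 10.0) for i in range(32)]  # made-up positions
print(len(encode_all((0.0, 0.0), others)))  # 128
```

Feeding the magnitude at two scales keeps resolution for nearby agents (the close range value saturates at 20 units) while still distinguishing agents out to 50 units.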
     
  4. ChillX

    Update:
    It turns out that increasing the number of neurons and the number of layers in the curiosity module fixes the issue.
    To cater for the larger number of vector observations / complexity I had previously increased the agent network to:
    hidden_units: 1024
    num_layers: 4

    After changing the curiosity module's network to the following, it is now working:
    hidden_units: 1024
    num_layers: 3
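For readers unsure where these two sets of values go, the trainer configuration would look roughly like the fragment below. It is a sketch under assumptions: the behavior name `MyAgent`, the trainer type and the `gamma` / `strength` values are placeholders, not taken from this thread. Only the `hidden_units` and `num_layers` values come from the post.

```yaml
behaviors:
  MyAgent:                 # hypothetical behavior name
    trainer_type: ppo      # assumed; the thread does not say which trainer
    network_settings:      # the agent (policy) network
      hidden_units: 1024
      num_layers: 4
    reward_signals:
      extrinsic:
        gamma: 0.99        # placeholder value
        strength: 1.0      # placeholder value
      curiosity:
        gamma: 0.99        # placeholder value
        strength: 0.02     # placeholder value
        network_settings:  # the curiosity module's own network
          hidden_units: 1024
          num_layers: 3
```

The curiosity module has its own `network_settings` block, so enlarging the agent network does not enlarge the curiosity network.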
     
  5. ChillX

    Sorry, I take that back. It trains fine for about 150,000 steps and then does an about-turn and completely unlearns everything. It eventually reaches the maximum possible negative reward.
     
  6. ChillX

    Finally got it working. The Curiosity module goes haywire when more than one sensor is used: in my case, two grid sensors plus a ray perception sensor plus observations on the built-in sensor.

    However, the Curiosity module works fine if the same observations are custom implemented on a single sensor. In my case, Curiosity works fine after implementing all observations on the built-in vector sensor and removing the other child sensors.

    Having said that, if I enable stacking or LSTM, Curiosity still goes haywire. I'm still trying to figure out a solution for that.