ML-Agents Issue

Discussion in 'ML-Agents' started by thirteengavfiftynine, Apr 24, 2020.

  1. thirteengavfiftynine

    Joined:
    Apr 22, 2020
    Posts:
    2
    Hello.

    I have just started using ML-Agents. I went through the penguin example and it all worked fine. I have now started working on my own game, just to see if I can do all of it myself (I'm new to Unity too! :D).

    So I made a very simple game (https://i.imgur.com/6VPxR5T.png) where the player (cube) has to collect 5 coins that are randomly placed in the level.

    I have made the academy, agent and area scripts. For now I am using the same ML-Agents version as the penguin tutorial (0.13.1), with curriculum learning. This is in my YAML:

    PlayerLearning:
        summary_freq: 5000
        time_horizon: 128
        batch_size: 128
        buffer_size: 128
        hidden_units: 256
        beta: 1.0e-2
        max_steps: 1.0e6

    and the JSON curriculum is this:

    {
        "measure": "reward",
        "thresholds": [-0.1, 0.7, 1.7, 1.7, 1.7, 2.7, 2.7],
        "min_lesson_length": 80,
        "signal_smoothing": true,
        "parameters": {
            "coin_speed": [0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.5]
        }
    }
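
    For reference, a curriculum file like the one above can be sanity-checked before training. The sketch below is not part of ML-Agents; check_curriculum is a hypothetical helper name, and it only tests the one rule the ML-Agents curriculum documentation states for this format: each list under "parameters" should have one more entry than "thresholds" (one value per lesson). It does not explain the zero actions described in this post.

```python
import json

# Curriculum copied from the post above.
CURRICULUM_JSON = """
{
  "measure": "reward",
  "thresholds": [-0.1, 0.7, 1.7, 1.7, 1.7, 2.7, 2.7],
  "min_lesson_length": 80,
  "signal_smoothing": true,
  "parameters": {"coin_speed": [0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.5]}
}
"""


def check_curriculum(curriculum):
    """Hypothetical helper: return a list of length mismatches (empty if none)."""
    problems = []
    # N thresholds separate N + 1 lessons, so each parameter needs N + 1 values.
    expected = len(curriculum["thresholds"]) + 1
    for name, values in curriculum["parameters"].items():
        if len(values) != expected:
            problems.append(f"{name}: expected {expected} values, got {len(values)}")
    return problems


curriculum = json.loads(CURRICULUM_JSON)
problems = check_curriculum(curriculum)
print(problems if problems else "parameter lengths match the thresholds")
```

    With the values from the post, the seven thresholds and the eight coin_speed entries line up, so this particular check passes.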


    I can move the player around fine using heuristics. However, when I try to train, both elements of vectorAction (in my AgentAction override) are 0, so the player does not move.

    Does anyone know why this is? Apologies if this is a stupid question; as I said, I'm new to Unity and ML-Agents. Thanks a lot.

    Code: https://gist.github.com/13gav59/93ed2f8e2b99bcef7afcec9eace57938

    Images: https://i.imgur.com/t1S9lPA.png
    https://i.imgur.com/DUeEYJ3.png
    https://i.imgur.com/iQjmCe0.png
     
  2. vincentpierre

    Joined:
    May 5, 2017
    Posts:
    160
    Can you try printing the vectorAction values directly from AgentAction (these should be floats)?
    It seems strange that they would all be 0 in this case. Did you make sure that you disabled heuristics (it could be that your agents still use "heuristic only" in the scene)? Is the training process running normally? Is the reward changing at all?
    Are you maybe using action masks?
     
  3. thirteengavfiftynine

    Joined:
    Apr 22, 2020
    Posts:
    2
    Yep, they are both 0. It is set to default (so not heuristic only). The reward is increasing, but the agent is not actually moving around (and I did test: moving works when heuristic only is on).
     
  4. vincentpierre

    Joined:
    May 5, 2017
    Posts:
    160
    Could you open an issue on the GitHub repository? Can you also include minimal steps to reproduce this bug (does it happen with an example environment or a toy environment)?