GridWorld

Discussion in 'ML-Agents' started by PxZzYYDS, Apr 3, 2021.

  1. PxZzYYDS

    PxZzYYDS

    Joined:
    Mar 26, 2021
    Posts:
    13
    I changed the grid size of GridWorld from 5 to 10, but the training result is very bad. The reward I get is just max_step * (the reward of each step). I don't know what causes it. I only changed the grid size; the other parameters
    are unchanged. Maybe other parameters should change too, but I don't know what range is appropriate.
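A reward equal to max_step * (the reward of each step) is what an episode returns when the agent never reaches a terminal cell and simply times out. The sketch below works through that arithmetic. The function name `episode_return` and all numeric values (`MAX_STEP`, `STEP_REWARD`, `GOAL_REWARD`) are hypothetical and chosen only for illustration; they are not taken from the GridWorld scene.

```python
def episode_return(steps_taken, max_step, step_reward, terminal_reward):
    """Undiscounted return of one episode with a constant per-step reward."""
    if steps_taken >= max_step:
        # Timed out: only the per-step reward accumulates, no terminal reward.
        return max_step * step_reward
    # Ended early on a terminal cell: per-step reward so far plus the terminal reward.
    return steps_taken * step_reward + terminal_reward


# Hypothetical values for illustration only (assumptions, not scene settings).
MAX_STEP = 1000
STEP_REWARD = -0.01
GOAL_REWARD = 1.0

timed_out = episode_return(MAX_STEP, MAX_STEP, STEP_REWARD, GOAL_REWARD)
reached_goal = episode_return(40, MAX_STEP, STEP_REWARD, GOAL_REWARD)

print(f"timed out:    {timed_out:.2f}")
print(f"reached goal: {reached_goal:.2f}")
```

If the mean reward sits at the timed-out value, almost every episode is ending on the step limit, which suggests the agent rarely or never finds the goal on the larger grid.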

    upload_2021-4-3_10-51-6.png
     
  2. ruoping_unity

    ruoping_unity

    Unity Technologies

    Joined:
    Jul 10, 2020
    Posts:
    134
    A few things to check:

    Have you been able to train the original GridWorld?
    After you changed the grid size, have you checked that the camera sensor is able to capture the whole grid?

    Generally, when your scene doesn't train at all, I'd suggest first checking that your environment is set up correctly, including observations, actions, and rewards.
     
  3. PxZzYYDS

    Of course, I have trained the original GridWorld.
    The camera sensor is able to capture the whole grid.
    I trained again yesterday. The reward varies, but with no regular pattern.
    Example:
    step 20000 reward -10
    step 40000 reward -8.35
    step 60000 reward -6.56
    step 80000 reward -10
    I think maybe the parameters in trainer_config.yaml don't fit, but I don't know what range is suitable.
     
  4. ruoping_unity

    If your environment setup is all good and training still isn't working, the network you use may not be big enough to learn the task.
    In that case you could try a larger network (num_layers, hidden_units). You might also try a larger batch_size and buffer_size, and train for more steps.
    The TensorBoard log is also helpful for debugging training. You should see the loss trending down if training is working.
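As a rough sketch of the suggestion above, the snippet below takes a baseline set of trainer settings and produces a larger variant to try next. The parameter names `num_layers`, `hidden_units`, `batch_size` and `buffer_size` come from this thread. Everything else is an assumption: the nested layout, the behavior name `GridWorld`, the helper `scale_up`, and every numeric value are placeholders, so check them against the config file that ships with your ML-Agents version (older releases use a flatter file layout).

```python
import copy
import json

# Placeholder baseline (assumption): not the values shipped with any release.
BASE_CONFIG = {
    "behaviors": {
        "GridWorld": {
            "trainer_type": "ppo",
            "hyperparameters": {"batch_size": 32, "buffer_size": 256},
            "network_settings": {"hidden_units": 128, "num_layers": 1},
            "max_steps": 500000,
        }
    }
}


def scale_up(config, behavior, factor=2, extra_layers=1):
    """Return a copy of config with a larger network, batch, buffer and step budget."""
    scaled = copy.deepcopy(config)  # leave the baseline untouched
    settings = scaled["behaviors"][behavior]
    hyper = settings["hyperparameters"]
    network = settings["network_settings"]

    # Preserve the buffer-to-batch ratio so the buffer stays a whole multiple of the batch.
    ratio = hyper["buffer_size"] // hyper["batch_size"]
    hyper["batch_size"] *= factor
    hyper["buffer_size"] = hyper["batch_size"] * ratio

    network["hidden_units"] *= factor
    network["num_layers"] += extra_layers
    settings["max_steps"] *= factor
    return scaled


bigger = scale_up(BASE_CONFIG, "GridWorld")
print(json.dumps(bigger, indent=2))
```

Changing one group of settings at a time (network size first, then batch and buffer, then step budget) makes it easier to see in TensorBoard which change actually helped.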
     