Multiple RayPerceptionSensors on one Agent

Discussion in 'ML-Agents' started by BrokenMelon, Jan 12, 2021.

  1. BrokenMelon
    Joined:
    Aug 4, 2020
    Posts:
    3
    Hello,
recently I have been developing a game in which I want an AI to dodge projectiles and harmful enemies. Projectiles fly in a straight line, and the Agent can only jump.
    I got this AI to work with a single RayPerceptionSensor2D, but since I want the AI to be able to look through certain objects while still registering them (e.g. projectiles hiding behind enemies), using one sensor can't be the right way (correct me if I am wrong).
    With one sensor, the AI performed well after about 5 hours of training.
    At the moment there are 5 sensors attached to it (one per layer), and after 2 days of training the AI still shows no sign of progress: it jumps randomly and the mean reward is not improving.

    My assumption is that I made a mistake when attaching the additional sensors.
    [Attached screenshots: upload_2021-1-12_19-31-50.png, upload_2021-1-12_19-32-27.png]

    Thanks to everybody in advance!
     
  2. vincentpierre
    Joined:
    May 5, 2017
    Posts:
    160
Hi, I am surprised this does not work; in the PushBlock example we use two RayPerceptionSensorComponent3Ds. I would recommend looking at the inputs that are communicated to Python. If you connect to the environment using the Python API (https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Python-API.md), you should be able to check that the information Python receives matches what Unity is sending.
    It is also possible that the added complexity makes training slower (there is a lot more information, much of it less useful). I would recommend starting out with only two sensors and seeing if the Agent does at least as well as with one.
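To make that comparison concrete, here is a minimal sketch of inspecting the observations from Python. It assumes the release-12-era `mlagents_envs` package and the documented ray layout (per ray: a one-hot over the detectable tags, plus a miss flag and a normalized hit distance); the function names are my own.

```python
def expected_ray_obs_size(rays_per_direction, num_detectable_tags, stacks=1):
    # Per the ML-Agents docs, each ray encodes: one-hot over detectable tags
    # + a "missed" flag + the normalized hit distance, so (tags + 2) values per ray.
    num_rays = 2 * rays_per_direction + 1
    return num_rays * (num_detectable_tags + 2) * stacks

def inspect_observations():
    # Imported here so the size formula above is usable without mlagents_envs installed.
    from mlagents_envs.environment import UnityEnvironment

    # file_name=None attaches to the Unity Editor: start this script, then press Play.
    env = UnityEnvironment(file_name=None)
    env.reset()
    behavior_name = list(env.behavior_specs)[0]
    decision_steps, _ = env.get_steps(behavior_name)
    # decision_steps.obs holds one array per sensor component on the Agent;
    # compare each shape against expected_ray_obs_size(...) for that sensor.
    for index, obs in enumerate(decision_steps.obs):
        print(f"sensor {index}: shape {obs.shape}")
    env.close()
```

For example, a sensor with 5 rays per direction, 2 detectable tags and no stacking should show up as a length-44 vector: (2·5+1)·(2+2).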
    I hope this helps.
     
  3. BrokenMelon

    BrokenMelon

    Joined:
    Aug 4, 2020
    Posts:
    3

    First of all, thanks for your help!
    I tried to train the model again, first with only one sensor and then with two, and noticed something very strange. While training both of these models, I took a closer look at the statistics shown in the console.
    Training ran for about three days, and every now and then I noticed improvement: the mean reward increased, and the AI seemed to dodge obstacles on purpose.

    But at random points the mean reward reset back to the initial value of -13, or even -14. It feels like the AI's progress is lost every now and then.
    Please let me know if you notice any obvious mistakes, or if you need further information or screenshots. Thank you!

    [Attached screenshot: upload_2021-1-17_18-26-24.png]

  4. vincentpierre
    Joined:
    May 5, 2017
    Posts:
    160
It does sometimes happen that the Agent forgets things. Usually it is because the environment does not consistently reward the expected behavior. I would increase the number of Agents in the environment, or the number of environments, to make training more consistent. I would also consider reducing the learning rate and increasing the batch/buffer size if your machine allows it.
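For reference, those knobs live in the trainer configuration YAML. A hedged sketch of where they sit, assuming the release-12-era config format; the behavior name `DodgeAgent` and the exact values are placeholders, not taken from this thread:

```yaml
behaviors:
  DodgeAgent:                 # placeholder: must match the Agent's Behavior Name
    trainer_type: ppo
    hyperparameters:
      learning_rate: 1.0e-4   # PPO default is 3.0e-4; lower = slower but steadier
      batch_size: 1024
      buffer_size: 20480      # keep buffer_size a multiple of batch_size
    max_steps: 5.0e6
```

Smaller adjustments (e.g. halving the learning rate) are usually easier to evaluate than a 10x change, since each run takes days.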
     
  5. BrokenMelon

    Aug 4, 2020
    Posts:
    3
    I think having multiple agents in my scene would be an immense amount of work, so I have tried to improve the situation solely by changing the values you mentioned: I decreased the learning rate and increased the buffer and batch size, each by a factor of 10.
    Looking at the results, I assume I changed the values by too much, since very little improvement is visible now.

    Again, at the start of training the mean reward is around -14.
    Over the course of 2.5 days I have seen an increase of about 1 (current mean reward: -12.8).
    Note: the increase is consistent, with few random jumps.

    But the agent does seem to remember things this time around (hard to tell, given how little the mean reward changes).

    Do you think adding agents to the scene is inevitable, or can training be shortened drastically just by tuning values (learning rate, buffer and batch size)?
     
  6. vincentpierre
    Joined:
    May 5, 2017
    Posts:
    160
If your machine allows it, you could also run multiple environments in parallel with the --num-envs argument; that is equivalent to having multiple Agents in the scene (but slower). Without knowing more about your environment it is hard for me to say what could be wrong with the training, but I would try to simplify the task until the Agent can solve it, to make sure there are no problems with the Agent's observations, actions or rewards.
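A hedged example of what that command line could look like (the config path, build path and run id are placeholders; note that --num-envs greater than 1 requires a built player rather than the Editor):

```shell
# Launch training against 4 parallel copies of a built player.
# config/trainer_config.yaml, Builds/Dodge and dodge_run are placeholder names.
mlagents-learn config/trainer_config.yaml --env=Builds/Dodge --num-envs=4 --run-id=dodge_run
```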
     
  7. m4l4
    Joined:
    Jul 28, 2020
    Posts:
    81
    Could it be a matter of complexity? Every ray returns something like 4 values (I do not remember exactly).
    Your agent has something like 500+ inputs, and that is just the raycasts; it probably has other inputs defined in the script as well.
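That estimate is plausible. A rough count under assumed settings (every number below is a guess for illustration, not read from the screenshots):

```python
# Assumed per-sensor settings: rays_per_direction = 5 (11 rays total),
# 2 detectable tags, 2 stacked observations, and 5 sensors (one per layer).
num_rays = 2 * 5 + 1
values_per_ray = 2 + 2                       # one-hot over 2 tags + miss flag + hit distance
per_sensor = num_rays * values_per_ray * 2   # x2 for observation stacking
total = 5 * per_sensor
print(total)
```

Even these fairly modest settings already put the observation vector in the hundreds of values, so five sensors plus any extra vector observations can easily pass 500 inputs.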