Agent Not Converging.

Discussion in 'ML-Agents' started by seifmostafa7347, Mar 10, 2023.

  seifmostafa7347


    Nov 2, 2021
    I'm training a robot that takes in an angular velocity and a linear velocity to move to a given target position, with a 3D raycast sensor added to avoid obstacles.
    The environment seemed extremely straightforward to me, but the agent still doesn't solve it even after training with two different configurations: one for 10 hours, which produced a weird behavior of always going backward, and another for 7 hours, which produced an even weirder behavior of the agent going a tiny bit forward and then backward xD

    Sadly, this is the third environment I've tried that doesn't converge. I think there is something critical that I'm missing. If anyone can guide me here I'd appreciate it, as I'm starting to lose hope :(

    Rewards:
    time penalty: -0.00025
    distance reward: -(distance/10), "to scale it lower than 1"
    OnCollisionStay: -1 for each frame the agent is colliding with an obstacle
    OnTriggerEnter: +100 when the agent collides with the target
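
Written out as code, the four reward terms listed above would look roughly like the sketch below. It is plain C# with no Unity dependency, so the arithmetic can be checked in isolation. `RewardSketch`, `StepReward`, and the parameter names are hypothetical, and the sketch assumes that the time penalty and the distance term are both applied once per step, which the post does not state.

```csharp
using System;

// Hypothetical helper that mirrors the reward terms from the post.
public static class RewardSketch
{
    public const float TimePenalty = -0.00025f;   // time penalty from the post
    public const float ObstaclePenalty = -1f;     // per frame in contact with an obstacle
    public const float TargetReward = 100f;       // on touching the target

    // Reward for a single step. Assumption: both per-step terms apply on every step.
    public static float StepReward(float distanceToTarget, bool touchingObstacle, bool reachedTarget)
    {
        float reward = TimePenalty;
        reward += -(distanceToTarget / 10f);      // distance term, as written in the post
        if (touchingObstacle) reward += ObstaclePenalty;
        if (reachedTarget) reward += TargetReward;
        return reward;
    }
}
```

For example, `RewardSketch.StepReward(5f, false, false)` comes out at roughly -0.50025, so at a distance of 5 the distance term is about 2000 times larger than the time penalty.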

    Configuration:

    trainer_type: ppo
    batch_size: 1024
    buffer_size: 10240
    learning_rate: 0.0003
    beta: 0.005
    epsilon: 0.2
    lambd: 0.95
    num_epoch: 3
    learning_rate_schedule: linear
    normalize: true
    num_layers: 3
    vis_encode_type: simple
    gamma: 0.99
    strength: 1.0
    keep_checkpoints: 5
    max_steps: 5000000
    summary_freq: 5000
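
For reference, ML-Agents trainer configuration files group these keys under nested sections. The fragment below is a sketch of how the flat list above would map onto that layout. The behavior name `RobotAgent` is hypothetical, and any setting the post does not list (for example `hidden_units`) is left out rather than guessed.

```yaml
behaviors:
  RobotAgent:                      # hypothetical behavior name
    trainer_type: ppo
    hyperparameters:
      batch_size: 1024
      buffer_size: 10240
      learning_rate: 0.0003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
      learning_rate_schedule: linear
    network_settings:
      normalize: true
      num_layers: 3
      vis_encode_type: simple
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    keep_checkpoints: 5
    max_steps: 5000000
    summary_freq: 5000
```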
  seifmostafa7347


    Nov 2, 2021
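    I randomize the agent's rotation and position each episode, along with the target and the obstacles, such that they don't overlap.
    The observations are:
    Code (CSharp):
            // Assumption: these calls sit inside the agent's CollectObservations(VectorSensor sensor) override.
            sensor.AddObservation(transform.position);              // agent position in world space
            sensor.AddObservation(transform.rotation.eulerAngles);  // agent rotation as Euler angles, in degrees
            sensor.AddObservation(target.position);                 // target position in world space
            sensor.AddObservation(maxAngularSpeed);                 // angular speed limit
            sensor.AddObservation(maxLinearSpeed);                  // linear speed limit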
  seifmostafa7347


    Nov 2, 2021
    After 6 more hours of training and testing some environment tweaks, the robot still doesn't converge. When I tested the brain on the agent in simulation (since I trained it from the command line with the --no-graphics option), it acted as if it knew nothing about the environment: it collides with boundaries and walls and doesn't even try to get close to the target :(
  seifmostafa7347


    Nov 2, 2021
    50M steps update.
    The robot moves very confidently now (meaning it doesn't go back and forth as it used to). However, there are 3 very weird behaviors:
    1) it never seeks the goal, even if the goal is right in front of it
    2) instead, it either goes straight to the boundaries to terminate the episode if it is close to them
    3) or it stays near an obstacle and keeps hovering there
    This is not explainable by my reward functions at all!

    * I give it a negative reward that shrinks linearly as it gets closer to the target, a flat negative reward when it hits obstacles, and a flat negative terminating reward when it hits the boundaries of the map *
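
One way to sanity-check behaviors 2) and 3) against these rewards is to add up the undiscounted return of a few hypothetical episodes. The sketch below is plain C# with no Unity dependency. Only the time penalty, the distance/10 scaling, and the +100 target reward come from the first post. The episode lengths, the fixed distance of 5, and the boundary penalty of -1 are assumptions made up for illustration, since the thread does not give them. Holding the distance fixed also overstates the penalty for an episode that actually approaches the target.

```csharp
using System;

// Hypothetical helper for comparing episode returns under the rewards described in the thread.
public static class ReturnSketch
{
    // Undiscounted return of an episode that stays at a fixed distance for `steps` steps
    // and then receives `terminalReward` once at the end.
    public static double EpisodeReturn(int steps, double distance, double terminalReward)
    {
        const double timePenalty = -0.00025;              // from the first post
        double perStep = timePenalty - distance / 10.0;   // time penalty plus distance term
        return steps * perStep + terminalReward;
    }

    public static void Main()
    {
        // Assumed scenario: hit the boundary after 20 steps, assumed terminal penalty of -1.
        double quickExit = EpisodeReturn(20, 5.0, -1.0);
        // Assumed scenario: wander for 1000 steps without reaching the target.
        double longWander = EpisodeReturn(1000, 5.0, 0.0);
        // Assumed scenario: reach the target after 150 steps and collect +100.
        double reachTarget = EpisodeReturn(150, 5.0, 100.0);

        Console.WriteLine($"quick exit:   {quickExit:F3}");
        Console.WriteLine($"long wander:  {longWander:F3}");
        Console.WriteLine($"reach target: {reachTarget:F3}");
    }
}
```

Under these assumed numbers, ending the episode early at the boundary (about -11) scores far better than wandering for a long time (about -500), while reaching the target (about +25) is still the best outcome. Whether the same ordering holds in the actual environment depends on the real boundary penalty and episode lengths.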

