
Question: Agent Not Converging

Discussion in 'ML-Agents' started by seifmostafa7347, Mar 10, 2023.

  1. seifmostafa7347
     Joined: Nov 2, 2021
     Posts: 22
    I'm training a robot that takes an angular velocity and a linear velocity as actions to move to a given target position, with a 3D raycast sensor added to avoid obstacles.
    The environment seemed extremely straightforward to me, but the agent still doesn't solve it, even after running two different configurations (one for 10 hours, which yielded the odd behavior of always driving backward, and another for 7 hours that yielded an even odder one: the agent inching a tiny bit forward, then backward).
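    Roughly, the action handling looks like this (a simplified sketch with illustrative names and values, not my exact code):

    Code (CSharp):
        using Unity.MLAgents;
        using Unity.MLAgents.Actuators;
        using UnityEngine;

        public class MobileRobotAgent : Agent
        {
            public float maxLinearSpeed = 2f;    // illustrative value
            public float maxAngularSpeed = 90f;  // illustrative value, deg/s

            public override void OnActionReceived(ActionBuffers actions)
            {
                // Two continuous actions in [-1, 1], scaled to speeds.
                float linear  = actions.ContinuousActions[0] * maxLinearSpeed;
                float angular = actions.ContinuousActions[1] * maxAngularSpeed;

                transform.Rotate(0f, angular * Time.deltaTime, 0f);
                transform.Translate(Vector3.forward * linear * Time.deltaTime);
            }
        }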

    This is sadly the third environment I've tried that doesn't converge. I think there is something critical I'm missing; if anyone can guide me, it would help a lot, as I'm starting to lose hope :(


    Rewards:
    time penalty: -0.00025 per step
    distance penalty: -(distance / 10), "to scale it lower than 1"
    OnCollisionStay: -1 for each frame the agent is colliding with an obstacle
    OnTriggerEnter: +100 when the agent reaches the target
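    In code, the reward handling is roughly this (a simplified sketch inside the Agent class, with illustrative tag names, not my exact code):

    Code (CSharp):
        public Transform target;

        void FixedUpdate()
        {
            AddReward(-0.00025f);  // time penalty per step
            float d = Vector3.Distance(transform.position, target.position);
            AddReward(-d / 10f);   // distance penalty, scaled below 1
        }

        void OnCollisionStay(Collision collision)
        {
            if (collision.gameObject.CompareTag("Obstacle"))
                AddReward(-1f);    // each frame in contact with an obstacle
        }

        void OnTriggerEnter(Collider other)
        {
            if (other.CompareTag("Target"))
            {
                AddReward(100f);   // reached the target
                EndEpisode();
            }
        }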


    configuration:

    behaviors:
      MobileRobot:
        trainer_type: ppo
        hyperparameters:
          batch_size: 1024
          buffer_size: 10240
          learning_rate: 0.0003
          beta: 0.005
          epsilon: 0.2
          lambd: 0.95
          num_epoch: 3
          learning_rate_schedule: linear
        network_settings:
          normalize: true
          hidden_units:
          num_layers: 3
          vis_encode_type: simple
        reward_signals:
          extrinsic:
            gamma: 0.99
            strength: 1.0
        keep_checkpoints: 5
        max_steps: 5000000
        summary_freq: 5000
     
    Last edited: Mar 11, 2023
  2. seifmostafa7347
    I randomize the agent's rotation and position each episode, along with the target and the obstacles, such that they don't overlap.
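    The reset logic is roughly like this (a simplified sketch with illustrative ranges; the non-overlap check is omitted for brevity):

    Code (CSharp):
        public override void OnEpisodeBegin()
        {
            // Random planar position and heading for the agent.
            Vector2 p = Random.insideUnitCircle * 5f;
            transform.position = new Vector3(p.x, 0f, p.y);
            transform.rotation = Quaternion.Euler(0f, Random.Range(0f, 360f), 0f);

            // The target (and obstacles) also get fresh positions.
            Vector2 t = Random.insideUnitCircle * 5f;
            target.position = new Vector3(t.x, 0f, t.y);
        }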
    The observations are:
    Code (CSharp):
        sensor.AddObservation(transform.position);
        sensor.AddObservation(transform.rotation.eulerAngles);
        sensor.AddObservation(target.position);

        sensor.AddObservation(maxAngularSpeed);
        sensor.AddObservation(maxLinearSpeed);
    [attached screenshot]
     
  3. seifmostafa7347
    After 6 more hours of training and testing some environment tweaks, the robot still doesn't converge. When I test the trained brain on the agent in the simulation (I had trained it from the command line with the --no-graphics option), it acts as if it knows nothing about the environment: it collides with boundaries and walls without even trying to get close to the target :(
     
  4. seifmostafa7347
    50M-step update.
    The robot moves very confidently now (it no longer jitters back and forth as it used to); however, there are three very odd behaviors:
    1) it never seeks the goal, even when the goal is right in front of it;
    2) instead, it either drives straight into the boundary to terminate the episode, if it is close to one;
    3) or it stays "near" an obstacle and keeps hovering there.
    None of this is explained by my reward function at all!

    *I give it a negative reward that shrinks linearly as it gets closer to the target, a flat negative reward when it hits an obstacle, and a flat negative terminating reward when it hits the boundaries of the map.*
     