Question: How to train an ML agent in Unity to go directly to targets and not in circles

Discussion in 'ML-Agents' started by Bruder, Jul 17, 2023.

  1. Bruder

    Bruder

    Joined:
    Aug 9, 2014
    Posts:
    56
    I'm trying to train an ML agent (a cube) in Unity with reinforcement learning to pick up 6 rewards (circles). Each reward is worth a different number of points (between 2 and 12), and leaving the ring gives a reward of -2. The result is that the agent picks up the objects by moving in odd circles rather than heading straight for them as I expected, and I suspect my hyperparameters are not optimal for this task.

    Here is the end result:


    And here is the config with the parameters:
    behaviors:
      moveToGoal:
        trainer_type: ppo
        hyperparameters:
          batch_size: 10
          buffer_size: 100
          learning_rate: 3.0e-4
          beta: 5.0e-4
          epsilon: 0.2
          lambd: 0.99
          num_epoch: 3
          learning_rate_schedule: linear
          beta_schedule: constant
          epsilon_schedule: linear
        network_settings:
          normalize: false
          hidden_units: 128
          num_layers: 4
        reward_signals:
          extrinsic:
            gamma: 0.99
            strength: 1.0
        max_steps: 1500000
        time_horizon: 64
        summary_freq: 10000
     
  2. smallg2023

    smallg2023

    Joined:
    Sep 2, 2018
    Posts:
    154
    What are your observations? I.e., does it know the location and value of every target at all times, or does it only see what's in front of it?
    And are there rewards for facing a target, time taken, etc.? If you want it to go more directly, you need to reward it for doing so; otherwise time means nothing to an AI.
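
To make the point about time concrete, here is a minimal Python sketch (not ML-Agents code, and not the original poster's setup). It compares the undiscounted episode return of a direct route and a circling route, with and without a small per-step penalty. The point values, step counts, and the penalty of -0.01 are all assumptions chosen for illustration, and `episode_return` is a hypothetical helper.

```python
# Hypothetical illustration, not ML-Agents code: compare the undiscounted
# episode return of a direct route and a circling route.

def episode_return(pickup_rewards, steps_taken, step_penalty=0.0):
    """Sum of pickup rewards plus a fixed penalty applied on every step."""
    return sum(pickup_rewards) + step_penalty * steps_taken

pickups = [2, 4, 6, 8, 10, 12]   # assumed point values in the 2..12 range from the question
direct_steps = 300               # assumed step count for a direct route
circling_steps = 900             # assumed step count for a circling route

# Without a time penalty both routes collect the same total, so the
# undiscounted return gives no reason to prefer the direct route.
no_penalty_direct = episode_return(pickups, direct_steps)
no_penalty_circling = episode_return(pickups, circling_steps)

# With a small per-step penalty the shorter route earns a strictly higher return.
penalty = -0.01                  # assumed value, tune for your environment
with_penalty_direct = episode_return(pickups, direct_steps, penalty)
with_penalty_circling = episode_return(pickups, circling_steps, penalty)

print(no_penalty_direct, no_penalty_circling)
print(with_penalty_direct, with_penalty_circling)
```

In a Unity agent the same idea would typically be a small negative reward added on each step. Note that a discount factor below 1 (gamma is 0.99 in the config above) already gives some preference to earlier rewards, so the penalty strengthens that preference rather than creating it.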
     
  3. Energymover

    Energymover

    Joined:
    Mar 28, 2023
    Posts:
    33
  4. Atilli

    Atilli

    Joined:
    Aug 31, 2022
    Posts:
    11
    Somewhere in the docs it mentions keeping rewards between -1 and 1.
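
As a rough sketch of what that could look like for this task, the Python snippet below divides each raw reward by the largest absolute raw value so every individual reward ends up in [-1, 1]. The list of pickup values is an assumption (the question only says they range from 2 to 12), `scale_reward` is a hypothetical helper, and this is only one possible way to rescale.

```python
# Hypothetical helper: rescale raw point values so each reward lies in [-1, 1].

def scale_reward(raw, max_abs):
    """Divide by the largest absolute raw reward so the result stays within [-1, 1]."""
    return raw / max_abs

raw_pickups = [2, 4, 6, 8, 10, 12]   # assumed values within the 2..12 range from the question
raw_exit_penalty = -2                # penalty for leaving the ring, from the question

# Largest magnitude across all raw rewards (12 with the values assumed here).
max_abs = max(abs(r) for r in raw_pickups + [raw_exit_penalty])

scaled_pickups = [scale_reward(r, max_abs) for r in raw_pickups]
scaled_exit_penalty = scale_reward(raw_exit_penalty, max_abs)

print(scaled_pickups)
print(scaled_exit_penalty)
```

Dividing by a single constant keeps the relative ordering of the rewards, so the 12-point circle is still worth six times the 2-point one. The rescaling applies to each individual reward; the sum over an episode can still exceed 1.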