Question Changing the Max steps

Discussion in 'ML-Agents' started by mathijsl_unity, Jun 20, 2023.

  1. mathijsl_unity

    I'm working with ML-Agents and, to start with, I can't get the behavior to use the script I applied to my agent: it says it can't find an agent with this name. I couldn't figure out how to create that behavior, so I changed the default settings in settings.py to use a larger max_steps instead. This worked until yesterday, but since then training just uses 500000 no matter what I change it to. Does anyone know how to create a new behavior, so that I can have one per agent, or how to change the default max_steps?
    Any help with how to debug this would also be greatly appreciated!
     
  2. GamerLordMat

    On the Agent component there is a Max Step setting for a single episode, but you mean the max_steps for the whole training run.

    When you start training, pass your YAML config as an argument: mlagents-learn path\to\myconfig.yaml

    "MyAgent" also has to be set as the Behavior Name in the Behavior Parameters component on your agent.
    Example .yaml:

    behaviors:
      MyAgent:
        trainer_type: ppo
        hyperparameters:
          batch_size: 2048
          buffer_size: 20480
          learning_rate: 0.0003
          beta: 0.005
          epsilon: 0.2
          lambd: 0.95
          num_epoch: 3
          learning_rate_schedule: linear
        network_settings:
          normalize: true
          hidden_units: 512
          num_layers: 3
          vis_encode_type: simple
          memory:
            sequence_length: 128
            memory_size: 256
        reward_signals:
          extrinsic:
            gamma: 0.995
            strength: 1.0
          curiosity:
            gamma: 0.99
            strength: 0.02
            network_settings:
              hidden_units: 512
            learning_rate: 0.0003
        keep_checkpoints: 20
        checkpoint_interval: 50000
        max_steps: 3000000000
        time_horizon: 1000
        summary_freq: 10000
        threaded: True
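
The name matching described above can be illustrated with a small Python sketch. This is a simplified model, not the ML-Agents source: the function `effective_max_steps` and the constant `FALLBACK_MAX_STEPS` are hypothetical names, the config is written as a plain dict rather than loaded from YAML, and the 500000 fallback is assumed from the value reported in the question. It also assumes that an optional top-level `default_settings` section, if present, is applied before the per-behavior section.

```python
# Simplified model of how a trainer config section is matched to a behavior name.
# NOT the ML-Agents implementation; all names here are hypothetical.

FALLBACK_MAX_STEPS = 500_000  # assumed fallback: the value the original poster kept seeing


def effective_max_steps(config: dict, behavior_name: str) -> int:
    """Return the max_steps a behavior would train for under this model."""
    # Start from the optional top-level default_settings section, if any.
    settings = dict(config.get("default_settings") or {})
    # Overlay the behavior's own section only when the name matches exactly.
    behaviors = config.get("behaviors") or {}
    settings.update(behaviors.get(behavior_name) or {})
    return int(settings.get("max_steps", FALLBACK_MAX_STEPS))


# Dict form of the relevant part of the example config above.
config = {
    "behaviors": {
        "MyAgent": {"trainer_type": "ppo", "max_steps": 3000000000},
    }
}

print(effective_max_steps(config, "MyAgent"))   # name matches the YAML key
print(effective_max_steps(config, "My Agent"))  # name differs, so the fallback applies
```

Under this model, a behavior whose name differs from the key in the config, even by a space or a capital letter, never sees the configured max_steps. That would be consistent with training stopping at 500000 regardless of edits elsewhere, so the Behavior Name in the Unity Inspector is probably the first thing to compare against the YAML.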