Question When do you stop training?

Discussion in 'ML-Agents' started by victornor, Sep 22, 2023.

  1. victornor (Joined: Jan 17, 2014, Posts: 91)

    At what point should I accept that no more significant progress will be made?
    My cumulative reward seems to have reached a plateau.

    behaviors:
      My Behavior:
        trainer_type: ppo
        hyperparameters:
          batch_size: 512
          buffer_size: 10240
          learning_rate: 0.0003
          beta: 0.00005
          epsilon: 0.2
          lambd: 0.95
          num_epoch: 3
          learning_rate_schedule: linear
        network_settings:
          normalize: false
          hidden_units: 128
          num_layers: 2
          vis_encode_type: simple
          memory:
            memory_size: 256
            sequence_length: 64
        reward_signals:
          extrinsic:
            gamma: 0.99
            strength: 1.0
          gail:
            strength: 1.0
            gamma: 0.99
            encoding_size: 128
            demo_path: Assets/Demonstrations/KayakDemo.demo
        keep_checkpoints: 5
        max_steps: 50000000000
        checkpoint_interval: 100000
        time_horizon: 64
        summary_freq: 50000
        threaded: true



  2. smallg2023 (Joined: Sep 2, 2018, Posts: 154)

    It depends on what the highest possible reward is. Just because the agent has reached a peak in its current understanding of the environment doesn't mean it has stopped learning; it could very well find the next breakthrough and climb again. This really depends on the difficulty and variation of the task(s) you're training it on and the reward(s) given.

    For a well-trained brain I would expect the reward curve to be fairly flat, since it can consistently achieve its best result. It doesn't really matter when you stop, though: you can always resume training if, after testing the brain, it hasn't reached the level you need.
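
    One rough way to put a number on "plateau" is to compare the average reward over the most recent summaries with the average over the summaries just before them. The sketch below is only an illustration under assumptions: the function name `has_plateaued`, the `window` size, and the `min_improvement` threshold are all hypothetical and not part of ML-Agents, and the reward values are assumed to have been copied by hand (for example from the training log or TensorBoard) into a plain Python list.

```python
from statistics import mean


def has_plateaued(rewards, window=10, min_improvement=0.01):
    """Hypothetical helper: return True when the mean of the last `window`
    reward summaries exceeds the mean of the preceding `window` summaries
    by less than `min_improvement` (in raw reward units)."""
    if len(rewards) < 2 * window:
        # Not enough summaries yet to compare two full windows.
        return False
    previous = mean(rewards[-2 * window:-window])
    recent = mean(rewards[-window:])
    return (recent - previous) < min_improvement


# Made-up example data: reward still climbing vs. reward that has gone flat.
still_rising = [float(i) for i in range(40)]
gone_flat = [float(i) for i in range(20)] + [20.0] * 20

print(has_plateaued(still_rising))
print(has_plateaued(gone_flat))
```

    With `summary_freq: 50000` as in the config above, a `window` of 10 would cover roughly 500,000 steps per window. As the reply notes, a flat stretch does not prove learning has stopped, so a check like this is at best a prompt to test the current checkpoint, not a stopping rule.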