
Training Error

Discussion in 'ML-Agents' started by gcook1_unity, Apr 6, 2021.

  1. gcook1_unity

    Joined:
    Feb 19, 2021
    Posts:
    1
    I am having two problems when training.
    First, I am unable to watch the training reward progress in TensorBoard, and the update on the earned reward is not printing to the console. I think this may be connected to no summaries folder being generated. (I am basing this on the hummingbird example and how he did it, so let me know if things have just changed since then, because I was able to get TensorBoard to load and show output using results instead of summaries.)
    Second, after I run the test for a couple of minutes I get this error:

    c:\users\capstone\.conda\envs\ml-agents-node\lib\site-packages\mlagents\trainers\torch\utils.py:242: UserWarning: This overload of nonzero is deprecated:
    nonzero()
    Consider using one of the following signatures instead:
    nonzero(*, bool as_tuple) (Triggered internally at ..\torch\csrc\utils\python_arg_parser.cpp:882.)
    res += [data[(partitions == i).nonzero().squeeze(1)]]
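    That UserWarning comes from PyTorch deprecating the zero-argument `nonzero()` call; newer code passes `as_tuple` explicitly. A minimal sketch of the equivalent call, with made-up tensor values for illustration (this is not ML-Agents' actual data, and the warning itself is harmless):

    ```python
    import torch

    # A toy partition tensor standing in for ML-Agents' internal data.
    partitions = torch.tensor([0, 1, 0, 2, 1])
    i = 1

    # Old style (emits the UserWarning on PyTorch >= 1.5):
    #   idx = (partitions == i).nonzero().squeeze(1)
    # New style, passing as_tuple explicitly, silences the warning:
    idx = (partitions == i).nonzero(as_tuple=False).squeeze(1)
    print(idx.tolist())  # -> [1, 4], the indices where partitions == i
    ```

    The warning is cosmetic; it does not explain the crash after ~10 minutes, which is more likely memory- or environment-related.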

    If I let the training proceed, it trains fine for about 10 minutes, then Python crashes, so I have to stop training and restart it. My configuration file is:

    behaviors:
      Node_AI:
        trainer_type: sac
        summary_freq: 50000
        time_horizon: 128
        max_steps: 5.0e6
        keep_checkpoints: 5
        checkpoint_interval: 500000
        init_path: null
        threaded: true
        hyperparameters:
          learning_rate: 3e-4
          batch_size: 100 # this is a guess, avg is 32 - 512
          buffer_size: 50000
          learning_rate_schedule: constant
          buffer_init_steps: 0
          init_entcoef: 0.5
          save_replay_buffer: true
          tau: 0.005
          steps_per_update: 1
        network_settings:
          hidden_units: 256
          num_layers: 2 # typical is 1 - 3
          normalize: false
          vis_encoder_type: match3
        reward_signals:
          extrinsic:
            gamma: 0.99
            strength: 1.0
          curiosity:
            strength: 0.05
            gamma: 0.99
        self_play:
          save_steps: 20000
          team_change: 80000
          swap_steps: 5000
          play_against_latest_model_ratio: 0.5
          window: 10
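    Forum formatting tends to strip YAML indentation, and the nesting matters to the trainer. A quick sketch for sanity-checking that a config parses with the expected structure (the inline snippet is a trimmed stand-in, not the full file; requires PyYAML):

    ```python
    import yaml  # PyYAML

    # A trimmed stand-in for the config above, to show the expected nesting.
    config_text = """
    behaviors:
      Node_AI:
        trainer_type: sac
        hyperparameters:
          learning_rate: 3e-4
    """

    config = yaml.safe_load(config_text)
    behavior = config["behaviors"]["Node_AI"]
    # If indentation were lost, these keys would not be nested as expected.
    assert behavior["trainer_type"] == "sac"
    assert "learning_rate" in behavior["hyperparameters"]
    ```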
     
  2. andrewcoh_unity

    Unity Technologies

    Joined:
    Sep 5, 2019
    Posts:
    157