Has anyone reproduced the Dodge Bullet example?

Discussion in 'ML-Agents' started by zhutian, Feb 7, 2022.

  1. zhutian

    Joined:
    May 30, 2019
    Posts:
    2
    Hi all,

    I am trying to reproduce the dodge bullet example presented in this release note. I got the code from the dev-bullet-hell branch. The pretrained model works well, and I can run the training code to start training. However, I cannot reach the expected reward (the maximum reward should be ~5.0, but I only got ~0.2 after training).
    Has anyone managed to reproduce the example?

    Attached are my training logs and config:



    I used the training config from here
    Code (YAML):
    behaviors:
      Dodge:
        trainer_type: ppo
        hyperparameters:
          batch_size: 1024
          buffer_size: 10240
          learning_rate: 0.0003
          beta: 0.005
          epsilon: 0.2
          lambd: 0.95
          num_epoch: 3
          learning_rate_schedule: linear
        network_settings:
          normalize: true
          hidden_units: 128
          num_layers: 2
          vis_encode_type: simple
        reward_signals:
          extrinsic:
            gamma: 0.99
            strength: 1.0
        keep_checkpoints: 5
        max_steps: 50000000
        time_horizon: 64
        summary_freq: 100000
        threaded: true
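
    As a rough sanity check on the settings above, the sketch below works out what they imply for the training schedule. It assumes the usual PPO reading of these fields (buffer_size experiences are collected per policy update, then split into batch_size minibatches for num_epoch passes) and that max_steps counts the same experiences. The `summarize` helper is hypothetical and not part of ML-Agents.

```python
# Values copied from the posted config; how they are interpreted is an assumption.
HYPERPARAMETERS = {"batch_size": 1024, "buffer_size": 10240, "num_epoch": 3}
MAX_STEPS = 50_000_000
SUMMARY_FREQ = 100_000


def summarize(hp, max_steps, summary_freq):
    """Derive schedule figures from the config (hypothetical helper)."""
    # Minibatches needed to cover one full buffer.
    minibatches = hp["buffer_size"] // hp["batch_size"]
    return {
        "minibatches_per_epoch": minibatches,
        # Each policy update makes num_epoch passes over the buffer.
        "gradient_steps_per_update": minibatches * hp["num_epoch"],
        # Assumes one policy update each time the buffer fills.
        "policy_updates": max_steps // hp["buffer_size"],
        # Number of summary lines expected in the training log.
        "summaries": max_steps // summary_freq,
    }


if __name__ == "__main__":
    for name, value in summarize(HYPERPARAMETERS, MAX_STEPS, SUMMARY_FREQ).items():
        print(f"{name}: {value}")
```

    If the attached log stops well short of the expected number of summary lines, the run ended before max_steps, which would be worth ruling out before comparing rewards.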