Has anyone reproduced the Dodge Bullet example?

zhutian · Feb 7, 2022

Hi all,

I am trying to reproduce the dodge bullet example presented in this release note. I got the code from the dev-bullet-hell branch. The pretrained model works well, and the training code runs without errors. However, I cannot reach the expected reward: the maximum reward should be ~5.0, but I only get ~0.2 after training.
Has anyone managed to reproduce the example?

Attached are my training logs and config:

I used the training config from here

Code (YAML):

behaviors:
  Dodge:
    trainer_type: ppo
    hyperparameters:
      batch_size: 1024
      buffer_size: 10240
      learning_rate: 0.0003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
      learning_rate_schedule: linear
    network_settings:
      normalize: true
      hidden_units: 128
      num_layers: 2
      vis_encode_type: simple
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    keep_checkpoints: 5
    max_steps: 50000000
    time_horizon: 64
    summary_freq: 100000
    threaded: true
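For anyone comparing their run against mine, here is a minimal Python sketch of the quantities these settings imply. It only does arithmetic on the values copied from the config. The helper names (`derived_quantities`, `linear_lr`) are hypothetical and not part of ML-Agents. The update count is approximate, and the learning-rate curve assumes that `linear` decays the rate toward roughly zero at `max_steps`.

```python
# Values copied from the Dodge config above.
config = {
    "batch_size": 1024,
    "buffer_size": 10240,
    "num_epoch": 3,
    "learning_rate": 0.0003,
    "max_steps": 50_000_000,
    "summary_freq": 100_000,
}


def derived_quantities(cfg):
    """Return simple ratios implied by the PPO settings (hypothetical helper)."""
    minibatches_per_epoch = cfg["buffer_size"] // cfg["batch_size"]
    return {
        # Minibatches drawn from one full buffer in a single epoch.
        "minibatches_per_epoch": minibatches_per_epoch,
        # Gradient steps per policy update (all epochs over one buffer).
        "gradient_steps_per_update": minibatches_per_epoch * cfg["num_epoch"],
        # Approximate number of policy updates over the whole run.
        "approx_policy_updates": cfg["max_steps"] // cfg["buffer_size"],
        # Number of summary points written to the logs over the whole run.
        "summary_points": cfg["max_steps"] // cfg["summary_freq"],
    }


def linear_lr(cfg, step):
    """Assumed linear decay from learning_rate toward zero at max_steps."""
    frac = min(step, cfg["max_steps"]) / cfg["max_steps"]
    return cfg["learning_rate"] * (1.0 - frac)


if __name__ == "__main__":
    for name, value in derived_quantities(config).items():
        print(f"{name}: {value}")
    print("lr at halfway point:", linear_lr(config, config["max_steps"] // 2))
```

If a run is stopped well before `max_steps`, the assumed schedule means the learning rate is still close to its initial value at that point, which is worth keeping in mind when comparing reward curves at the same step count.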
