Resolved Cumulative reward decreases when episode length is also decreasing.

meldeg · Mar 1, 2023

Hi,

The cumulative reward decreases as the episode length decreases. The agent receives 20 points when the episode ends. Why does the cumulative reward decrease even though the agent is getting better and better at solving the task? (Stats for the agent are attached.)

Rewards:
-0.2 for colliding with border
2 for taking a treasure from chamber
4 for taking this treasure to own chamber
20 for winning the game.
Config file:

Code (CSharp):

default:
    trainer: ppo
    batch_size: 1024
    beta: 5.0e-3
    buffer_size: 10240
    epsilon: 0.2
    hidden_units: 128
    lambd: 0.95
    learning_rate: 3.0e-4
    learning_rate_schedule: linear
    max_steps: 5.0e5
    memory_size: 128
    normalize: false
    num_epoch: 3
    num_layers: 2
    time_horizon: 64
    sequence_length: 64
    summary_freq: 20000
    use_recurrent: false
    vis_encode_type: simple
    reward_signals:
        extrinsic:
            strength: 1.0
            gamma: 0.99
        curiosity:
            strength: 0.02
            gamma: 0.99
            encoding_size: 64
            learning_rate: 3.0e-3

PlayerAgent:
    time_horizon: 256
    batch_size: 4096
    buffer_size: 40960
    hidden_units: 512
    max_steps: 5.0e6
    beta: 7.5e-3

GitHub repository. You need to import the ML-Agents 1.0.8 package manually into the project.

https://github.com/Badsalt/AI

/Melvin

hughperkins · Mar 2, 2023

Cumulative reward is the total reward earned in the episode.

Imagine you earn $1 per hour. You work 8 hours. Your cumulative reward is $8.

Now imagine that you are being paid to do a task. You learn to do it faster, in only 3 hours. Now your cumulative reward is only $3.
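The analogy can be written out as plain arithmetic. This is a minimal Python sketch, assuming a constant reward on every step; the names `episode_return`, `reward_per_step` and `terminal_reward` are illustrative and not part of ML-Agents.

```python
def episode_return(reward_per_step: float, steps: int, terminal_reward: float = 0.0) -> float:
    """Undiscounted cumulative reward for one episode.

    Assumes the same reward is paid on every step, plus an optional
    one-off reward when the episode ends.
    """
    return reward_per_step * steps + terminal_reward


# The wage analogy: $1 per hour, first 8 hours, then only 3 hours.
slow = episode_return(reward_per_step=1.0, steps=8)
fast = episode_return(reward_per_step=1.0, steps=3)
print(slow, fast)  # 8.0 3.0
```

With a positive per-step reward, finishing sooner always lowers the total, even though the task is being done better.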

hughperkins · Mar 2, 2023

It's OK for cumulative reward to go down, by the way.

hughperkins · Mar 2, 2023

Oh, you don't have per-timestep rewards. Never mind, then; this is not the reason.

hughperkins · Mar 2, 2023

I'd recommend watching what the agent is actually doing. How is it earning reward? Why is the episode getting shorter? There's either a bug in your code, or the agent is somehow 'gaming' your rewards. Watching its behaviour will likely give you some insight.
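One way to answer "how is it earning reward?" is to break an episode's total down by event type. The sketch below uses the reward values quoted in the first post; the event counts are made up purely for illustration, and the thread does not confirm that this is what happened in the actual project.

```python
# Reward values quoted in the original post.
REWARDS = {
    "border_collision": -0.2,
    "treasure_taken": 2.0,
    "treasure_delivered": 4.0,
    "win": 20.0,
}


def reward_breakdown(event_counts):
    """Return (contribution per event type, episode total) for one episode."""
    contributions = {
        name: REWARDS[name] * count for name, count in event_counts.items()
    }
    return contributions, sum(contributions.values())


# Hypothetical event counts (assumptions, not measured data).
long_episode = {"border_collision": 5, "treasure_taken": 6,
                "treasure_delivered": 6, "win": 1}
short_episode = {"border_collision": 0, "treasure_taken": 2,
                 "treasure_delivered": 2, "win": 1}

for label, counts in (("long", long_episode), ("short", short_episode)):
    parts, total = reward_breakdown(counts)
    print(label, parts, total)
```

Under these assumed counts, the shorter episode wins with fewer treasure events and therefore ends with a lower total, even without any per-timestep reward. Logging the real counts per episode would show whether something similar is going on.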

meldeg · Mar 9, 2023

I solved the problem by making the final reward depend on the number of steps.
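The post does not give the exact formula, so the following is only a sketch of one possible step-dependent final reward, written in Python for illustration. The linear scaling, the `min_fraction` floor and all names here are assumptions, not the author's implementation.

```python
def final_reward(steps_taken: int, max_steps: int,
                 base_reward: float = 20.0, min_fraction: float = 0.1) -> float:
    """Scale the winning reward by the fraction of the step budget left.

    Hypothetical scheme: a win on step 0 earns the full base_reward, and
    the reward falls linearly with steps used, but never drops below
    min_fraction * base_reward.
    """
    if max_steps <= 0:
        raise ValueError("max_steps must be positive")
    fraction_left = max(0.0, 1.0 - steps_taken / max_steps)
    return base_reward * max(min_fraction, fraction_left)


# Faster wins earn more, so shorter episodes no longer lower the total.
print(final_reward(250, 1000))
print(final_reward(900, 1000))
```

In a Unity agent, the equivalent value would typically be passed to `AddReward` or `SetReward` just before `EndEpisode`, with the agent's `StepCount` and `MaxStep` standing in for `steps_taken` and `max_steps`; that mapping is also an assumption about how the project is set up.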

Attached Files:

AI_292_PlayerAgent.zip
