I have an agent that learns normally at first and then starts to degrade. What could be the reason?

The scripts give these rewards:

- Reward 1 when obs: [*,*,2] and the ray is close to the target
- Reward -0.001 when obs: [2,*,*]
- Reward -0.01 when obs: [*,*,{1 or 2}] and the ray distance is far away
- Reward -1 when dead

Command used:

    mlagents-learn D:/Unity/ml-agents-0.15.0/config/trainer_config.yaml --run-id=GirlApple_6 --train

yaml:

    WomenBraine:
        batch_size: 128
        buffer_size: 2048
        memory_size: 256
        hidden_units: 512
        beta: 1.0e-2
        sequence_length: 16
        init_entcoef: 0.01
        time_horizon: 128
        tau: 0.01
        max_steps: 3000000

The other parameters are in the screenshots.
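For anyone trying to follow the reward notation, here is one possible reading of the rules listed above, written as a small Python sketch. It is not the actual Unity script. It assumes that `obs` is a 3-element observation, that `*` means "any value", that the rules are additive within a step, and that "close" and "far away" are fixed distance thresholds. The names `step_reward`, `CLOSE_DIST` and `FAR_DIST` and their values are made up for illustration.

```python
# Hypothetical sketch of the reward rules from the post; not the real agent code.
# Assumptions: obs has 3 elements, "*" is a wildcard, rewards add up per step,
# and the distance thresholds below are invented placeholders.
CLOSE_DIST = 1.0   # assumed meaning of "ray close to target"
FAR_DIST = 5.0     # assumed meaning of "ray dist long away"


def step_reward(obs, ray_dist, dead):
    """Return the reward for one step under the assumed reading of the rules."""
    if dead:
        return -1.0  # death penalty overrides everything else

    reward = 0.0
    if obs[2] == 2 and ray_dist <= CLOSE_DIST:
        reward += 1.0      # obs: [*,*,2] and ray close to target
    if obs[0] == 2:
        reward -= 0.001    # obs: [2,*,*]
    if obs[2] in (1, 2) and ray_dist >= FAR_DIST:
        reward -= 0.01     # obs: [*,*,{1 or 2}] and ray far away
    return reward
```

Written this way it is easier to see which conditions can fire together in a single step, and how the small penalties compare in scale to the +1 and -1 terms.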
Hey there, we'll forward this over to the team to check out. Which version of Python and C# are you using?
Hi @crazywolfcub, could you expand on your reward functions a bit? I'm not sure what the syntax of your statement means. Could you also expand a bit more on your action space?
Version information: ml-agents: 0.15.0, ml-agents-envs: 0.15.0, Communicator API: 0.15.0, TensorFlow: 2.0.1

I have already changed the settings for rewards, actions and sensors, so I can't describe the original setup. For now it seems to work and learn well. If the problem comes back, I will describe it in more detail. I have to experiment with the settings, since many of the parameters are not clear to me.