I managed to create an environment and program an agent to train on the task of aiming and shooting a cannon, but I am not getting the results I would like. The command window says it is training, but the mean reward is too low and the agent misses the static target all the time (the target does not even change position). Maybe it just needs to train for longer, but here comes the bug: after it has spent some time training, it simply stops, printing a few lines with the mean reward and the standard deviation of reward. Could this be because the mean reward is too low, staying negative on every line and never improving? It does not print any error; it just stops.

When I then stop the training in the Unity editor, the command window says "Start training with play button" again, and after some time it reports that the results have been exported and copied. Then it shows the error "The Unity environment took too long to respond."

You can find my current solution in my repository: https://github.com/Marioc9955/Training3DCannon. While it is not working as well as I would like, I hope it can give you some insight into my approach and any potential issues.

Can anyone please help me see what I am doing wrong? I am using the default hyperparameters, so I need help tuning the hyperparameters of the agent and the environment to optimize training, as well as any tips or best practices for designing a reward function that encourages the agent to learn to aim and shoot accurately. Thank you for your time and consideration.
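For reference, this is roughly the shape of the trainer configuration YAML I am working from — a sketch of the PPO defaults, where the behavior name `CannonAgent` is a placeholder (the actual behavior name in my project may differ). These are the values I would be grateful for advice on tuning:

```yaml
behaviors:
  CannonAgent:          # placeholder: must match the Behavior Name on the agent
    trainer_type: ppo
    hyperparameters:
      batch_size: 64
      buffer_size: 2048
      learning_rate: 3.0e-4
      beta: 5.0e-3
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
    network_settings:
      hidden_units: 128
      num_layers: 2
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 500000   # training stops when this is reached
    time_horizon: 64
    summary_freq: 10000
```

In particular, I wonder whether `max_steps` explains why training "just stops" without an error, and whether values like `batch_size` or `learning_rate` should change for a sparse-reward aiming task.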
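On the reward-function side, this is the kind of shaping I have been considering — only a sketch, and the names (`projectileLanded`, `landingPoint`, `target`, `hitRadius`, `maxArenaSize`) are placeholders, not the actual fields in my repository:

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Actuators;
using UnityEngine;

public class CannonAgent : Agent
{
    public Transform target;          // placeholder: the static target
    public float hitRadius = 1.0f;    // placeholder: distance that counts as a hit
    public float maxArenaSize = 50f;  // placeholder: used to normalize the penalty

    bool projectileLanded;            // placeholder: set when the shot lands
    Vector3 landingPoint;             // placeholder: where the shot landed

    public override void OnActionReceived(ActionBuffers actions)
    {
        // ... apply aiming and shooting actions here ...

        if (projectileLanded)
        {
            float distance = Vector3.Distance(landingPoint, target.position);

            if (distance < hitRadius)
            {
                AddReward(1.0f);  // large positive reward for a hit
            }
            else
            {
                // Shaped penalty: a near miss is penalized less than a far one,
                // so the agent gets a gradient toward the target instead of a
                // flat negative reward for every miss.
                AddReward(-distance / maxArenaSize);
            }
            EndEpisode();
        }

        // Small per-step penalty to discourage stalling without shooting.
        AddReward(-0.001f);
    }
}
```

Is this the right general direction, or is a distance-shaped penalty like this known to cause problems (e.g., the agent learning to miss "close" rather than hit)?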