I'm trying to train my rabbit agents to collect food in a closed environment. The area is surrounded by four walls that the agents should not touch. If they stay in contact with a wall, they are punished on every step.

After some time, instead of trying to collect food (which is available everywhere), they all start to hug the wall despite being heavily punished for it. I just can't understand this behavior; it seems like they are trying to pursue the most negative reward. As seen in the picture, the rabbits are touching the top-left wall corner. And yes, they have a ray perception sensor to detect the walls.

I'm at step 62M (about 1.7 days of training) and they are accumulating a lot of negative reward because of this.

Any help is appreciated!
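
The reward setup described above can be sketched roughly as follows. This is a minimal Python illustration, not the actual agent code: the names `step_reward`, `episode_return`, `FOOD_REWARD`, and `WALL_PENALTY`, and their values, are placeholders assumed for the example. A positive reward for eating food is also an assumption, since only the wall punishment is stated explicitly.

```python
# Hypothetical reward constants; the real values are not given in the post.
FOOD_REWARD = 1.0     # assumed reward for collecting one piece of food
WALL_PENALTY = -0.1   # assumed punishment for each step spent touching a wall


def step_reward(ate_food: bool, touching_wall: bool) -> float:
    """Reward for a single step under the scheme described in the post."""
    reward = 0.0
    if ate_food:
        reward += FOOD_REWARD
    if touching_wall:
        # Applied on every step the agent stays in contact with a wall.
        reward += WALL_PENALTY
    return reward


def episode_return(steps) -> float:
    """Sum the step rewards over an iterable of (ate_food, touching_wall) pairs."""
    return sum(step_reward(ate, wall) for ate, wall in steps)


if __name__ == "__main__":
    # An agent that hugs the wall for 100 steps without eating anything.
    wall_hugger = [(False, True)] * 100
    print(episode_return(wall_hugger))
```

Under these assumed values, a wall-hugging agent accumulates a steadily more negative return the longer the episode lasts, which is the behavior I can't explain.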