Hi everyone,

I am a beginner in artificial intelligence. I tried to train an AI model to play Tetris with the following observations and rewards.

Observations:
- lines cleared (0-4)
- number of holes bounded by four walls
- board-state score (higher is worse)
- sum of the heights of each column

Rewards:
- pow(2, lines cleared) * 10 when lines are cleared
- -10 for game over
- +1 for each tetromino placed

After training for several hours, the model only moves left and right to place the tetromino horizontally, but it does not clear several lines.

Can anyone suggest better observations and rewards, along with the vector observation space size and the number of stacked vectors? I am using 2 discrete action branches (size 4 for rotation and size 10 for position) and ML-Agents 19.

Thank you.
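To make the setup above concrete, here is a minimal Python sketch of the reward and two of the observations as I have described them. It is an illustration only, not the actual agent code. The names `step_reward`, `column_heights`, `enclosed_holes`, and `build_observation` are hypothetical, and the board layout (a list of rows from top to bottom, 1 = filled, 0 = empty) is an assumption. The sketch also assumes the three reward terms are summed in one step, and that a "hole bounded by four walls" is an empty cell whose four neighbours are filled cells, side walls, or the floor. The board-state score is left out because its formula is not given here.

```python
from typing import List

# Assumed layout: rows listed top to bottom, 1 = filled cell, 0 = empty cell.
Board = List[List[int]]


def step_reward(lines_cleared: int, game_over: bool) -> float:
    """Reward for one placement, assuming the three terms are simply summed."""
    reward = 1.0  # +1 for placing a tetromino
    if lines_cleared > 0:
        reward += (2 ** lines_cleared) * 10  # pow(2, lines cleared) * 10
    if game_over:
        reward -= 10.0  # -10 for game over
    return reward


def column_heights(board: Board) -> List[int]:
    """Height of each column, measured from the floor to its topmost filled cell."""
    rows, cols = len(board), len(board[0])
    heights = []
    for c in range(cols):
        height = 0
        for r in range(rows):
            if board[r][c]:
                height = rows - r
                break
        heights.append(height)
    return heights


def enclosed_holes(board: Board) -> int:
    """Count empty cells closed in on all four sides (assumed definition)."""
    rows, cols = len(board), len(board[0])

    def blocked(r: int, c: int) -> bool:
        if r < 0:
            return False  # above the board is treated as open space
        if r >= rows or c < 0 or c >= cols:
            return True  # floor and side walls count as walls
        return board[r][c] != 0

    neighbours = ((-1, 0), (1, 0), (0, -1), (0, 1))
    count = 0
    for r in range(rows):
        for c in range(cols):
            if board[r][c] == 0 and all(blocked(r + dr, c + dc) for dr, dc in neighbours):
                count += 1
    return count


def build_observation(board: Board, lines_cleared: int) -> List[float]:
    """Observation vector without the board-state score (size 3 in this sketch)."""
    return [
        float(lines_cleared),
        float(enclosed_holes(board)),
        float(sum(column_heights(board))),
    ]
```

With these assumptions, a placement that clears one line gives 1 + 20 = 21, and a placement that clears four lines gives 1 + 160 = 161, so the per-placement bonus is small compared with the line-clear reward.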