Machine Learning AI for Pong Game

Discussion in 'ML-Agents' started by Rebel47, Jun 5, 2021.

  1. Rebel47


    Joined: May 16, 2020
    Posts: 4
    Hello,

    I want to implement a machine learning AI for the game "Pong". I wrote some code for it, but I think I'm missing something, because the agent shows no progress during training.
    Am I doing something wrong?

    I'm punishing the agent for the following things:
    1. When its opponent scores a goal, and
    2. When it sticks to the north or south wall (because it does that very often).

    I'm rewarding the agent for the following things:
    1. When it touches the ball at all, and
    2. When the ball reaches the east wall
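    The penalties and rewards listed above can be written down as a single per-step function. This is a minimal sketch, not the poster's code (the attachment is not part of this thread): the struct, the flag names and all reward magnitudes are assumptions chosen for illustration. Inside an ML-Agents `Agent`, the result would typically be passed to `AddReward`, with `EndEpisode` called once a goal is scored.

```csharp
using System;

// Hypothetical per-step event flags; none of these names come from the original post.
public struct PongStepEvents
{
    public bool TouchedBall;               // paddle hit the ball this step
    public bool BallReachedEast;           // ball reached the east wall
    public bool OpponentScored;            // the opponent scored a goal
    public bool TouchingNorthOrSouthWall;  // paddle is pressed against the north or south wall
}

public static class PongRewardSketch
{
    // Assumed magnitudes: goal events dominate, shaping terms stay small.
    public const float HitBonus = 0.1f;
    public const float GoalReward = 1.0f;
    public const float ConcedePenalty = -1.0f;
    public const float WallPenalty = -0.01f;

    // Sums the reward contributions of every event that occurred in this step.
    public static float StepReward(PongStepEvents e)
    {
        float reward = 0f;
        if (e.TouchedBall) reward += HitBonus;
        if (e.BallReachedEast) reward += GoalReward;
        if (e.OpponentScored) reward += ConcedePenalty;
        if (e.TouchingNorthOrSouthWall) reward += WallPenalty;
        return reward;
    }

    // A goal for either side ends the rally, so it is a natural episode boundary.
    public static bool EndsEpisode(PongStepEvents e)
    {
        return e.BallReachedEast || e.OpponentScored;
    }
}
```

    Keeping the reward logic in one pure function makes it easy to check that the small shaping terms (hit bonus, wall penalty) cannot outweigh the goal rewards.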

    The agent observes the following things:
    1. The current y position of the ball
    2. The distance between itself and the ball
    3. All walls (north, east, south, west), and
    4. Its own velocity


    What am I doing wrong?

    The code is attached below.

    This is what the game looks like (the green paddle is the machine learning agent, and the walls on the west and east are invisible): Pong-Game.PNG



    Thank you in advance.
     

    Attached Files:

  2. andrewcoh_unity


    Unity Technologies

    Joined: Sep 5, 2019
    Posts: 162
    Hi @Rebel47

    Is the blue agent also a learning agent or is it being controlled by a heuristic?

    Additionally, the observations sound a little strange. You might try just giving the agent its own x,y coordinates and the ball's x,y coordinates.
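
    The suggested observation set (the agent's own x,y and the ball's x,y) could be assembled as in the sketch below. It assumes a playfield centred on the origin; `halfWidth` and `halfHeight` are hypothetical parameters, and scaling the values into [-1, 1] is an added assumption rather than part of the suggestion. In an ML-Agents `Agent`, each value would be passed to `sensor.AddObservation` inside `CollectObservations`, and the vector observation size in Behavior Parameters would need to be 4.

```csharp
using System;

public static class PongObservationSketch
{
    // Scales a coordinate by the field half-extent and clamps the result to [-1, 1].
    private static float Normalize(float value, float halfExtent)
    {
        return Math.Max(-1f, Math.Min(1f, value / halfExtent));
    }

    // Returns { paddleX, paddleY, ballX, ballY }, each normalized to [-1, 1].
    // Assumes the playfield is centred on the origin (hypothetical layout).
    public static float[] Build(float paddleX, float paddleY,
                                float ballX, float ballY,
                                float halfWidth, float halfHeight)
    {
        if (halfWidth <= 0f || halfHeight <= 0f)
            throw new ArgumentOutOfRangeException("Field half-extents must be positive.");

        return new[]
        {
            Normalize(paddleX, halfWidth),
            Normalize(paddleY, halfHeight),
            Normalize(ballX, halfWidth),
            Normalize(ballY, halfHeight),
        };
    }
}
```

    For a field that is 16 units wide and 8 units high, a paddle at (-8, 2) and a ball at (4, -3) give the observation vector { -1, 0.5, 0.5, -0.75 }.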
     
  3. AngrySamsquanch


    Joined: Dec 28, 2014
    Posts: 24


    I made a Pong game whose observations cover the position and velocity of each player's paddle and of the ball. I gave the agent a small existential reward, which incentivizes it to keep the ball in play as long as possible. The ball speeds up slightly on each hit, so the agents eventually can't keep up. You can see in the video that the AI paddle (on the right) is a bit twitchy, even though the existential reward is only given while the paddle is nearly stationary, so there's some room for improvement.

    Code (CSharp):
    public override void CollectObservations(VectorSensor sensor)
    {
        // Own paddle: position offset from the environment manager (3 floats)
        // plus velocity (3 floats) = 6 observations
        sensor.AddObservation(_envManagerTransform.position - transform.position);
        sensor.AddObservation(GetVelocity());

        // Other player's paddle: position offset and velocity = 6 observations
        sensor.AddObservation(_envManagerTransform.position - _envManger.GetOtherPlayerPosition(GetPlayerIndex()));
        sensor.AddObservation(_envManger.GetOtherPlayerVelocity(GetPlayerIndex()));

        // Ball: position offset and velocity = 6 observations
        sensor.AddObservation(_envManagerTransform.position - _envManger.GetBallPosition());
        sensor.AddObservation(_envManger.GetBallVelocity());
    }
    Code (CSharp):
    private void FixedUpdate()
    {
        // Apply the most recent movement input to the paddle
        _paddleMovement.MovePaddle(agentMoveInput);

        // Small existential reward while the paddle is nearly stationary
        // (speed below 1 unit per second)
        if (_rigidbody.velocity.magnitude < 1)
            AddReward(0.001f);
    }
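
    The post says the ball speeds up slightly on each hit, but that code is not shown in the thread. One way it might look is sketched below; the function name, the 5% default multiplier and the optional speed cap are all assumptions for illustration.

```csharp
using System;

public static class BallSpeedSketch
{
    // Returns the ball speed after a paddle hit.
    // multiplier: assumed per-hit growth factor (hypothetical default of 5%).
    // maxSpeed: optional cap; the default leaves the speed uncapped.
    public static float SpeedAfterHit(float currentSpeed,
                                      float multiplier = 1.05f,
                                      float maxSpeed = float.PositiveInfinity)
    {
        return Math.Min(currentSpeed * multiplier, maxSpeed);
    }
}
```

    Without a cap the rally length is bounded, because the ball eventually outpaces the paddles, which matches the behaviour described in the post.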