Train agent to push enemies off the platform

Discussion in 'ML-Agents' started by Kozaki2, Mar 22, 2020.

  1. Kozaki2

    Kozaki2

    Joined:
    Apr 8, 2019
    Posts:
    47
    Hi, my training area looks like this
    upload_2020-3-22_13-24-43.png

    I want to train that blue agent to push off red enemies from the platform. Each enemy has tag "Enemy" that is passed to the Ray Perception Sensor 3D component of agent. The inspector of my agent looks like this
    upload_2020-3-22_13-27-48.png

    Unfortunately I can't see any progress during training. My agent doesn't seem to move toward any enemy; he just walks in random(?) directions.

    My BotAgent script rewards:
    1. Each step: -1 / maxStep, which is -0.0002f
    2. -1 when the agent falls off the platform
    3. 5 when the agent pushes all enemies off

    Is something wrong with my configuration? Can I do something better?
     
  2. Antypodish

    Antypodish

    Joined:
    Apr 29, 2014
    Posts:
    10,776
    I am not using ML-Agents, but I have used a different ML algorithm.
    I would add more reward conditions:
    • One for getting close to an enemy.
    • A second for pushing/touching an enemy. (The first condition may already cover this, in which case this one isn't needed.)
    • Another for pushing an enemy toward the nearest edge.
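    The first of those conditions could be sketched roughly like this inside the agent's per-step callback (a sketch only, not ML-Agents-specific: `_enemies` is assumed to be the agent's collection of enemy GameObjects, and the 0.001f scale is a guess that would need tuning so the shaping reward doesn't dominate the task reward):

    Code (CSharp):
        // Hypothetical proximity shaping: small per-step bonus for being
        // near the closest enemy still on the platform.
        var nearest = _enemies
            .Where(e => e.transform.localPosition.y > -1)   // ignore fallen enemies
            .OrderBy(e => Vector3.Distance(transform.position, e.transform.position))
            .FirstOrDefault();
        if (nearest != null)
        {
            float dist = Vector3.Distance(transform.position, nearest.transform.position);
            AddReward(0.001f / (1f + dist));                // assumed scale; tune as needed
        }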
     
  3. celion_unity

    celion_unity

    Joined:
    Jun 12, 2019
    Posts:
    289
    What are the vector observations that you're using? I assume the position of the agent on the platform?

    Following up on the ideas from @Antypodish, giving a small reward each time the agent moves the enemy towards the edge might help. You could also try just a single enemy to try to simplify things.

    Other than that, I don't see anything obviously wrong. How long are you letting it train? Can you paste some of the trainer output here?
     
  4. Kozaki2

    Kozaki2

    Joined:
    Apr 8, 2019
    Posts:
    47
    Hi, thanks for the tips. I now give a 0.1f reward for colliding with enemies, but I can't see any difference. At the beginning the agent makes a lot of collisions, but over time the number of collisions gets smaller.

    My vector observations are the agent's position, so I changed the space size to 3. Maybe there is something wrong with my bot agent script:
    Code (CSharp):
    using System.Collections.Generic;
    using System.Linq;
    using MLAgents;
    using MLAgents.Sensors;
    using UnityEngine;
    using Random = UnityEngine.Random;

    public class BotAgent : Agent
    {
        public float speed = 5.0f;

        private Rigidbody _rigidBody;
        private Vector3 _startPosition;
        private Quaternion _startRotation;
        private IEnumerable<GameObject> _enemies;

        private void Start()
        {
            _rigidBody = GetComponent<Rigidbody>();
            _startPosition = transform.localPosition;
            _startRotation = transform.localRotation;
            _enemies = transform.parent.FindChildsWithTag("Enemy").Where(enemy => enemy != gameObject);
        }

        public override void OnEpisodeBegin()
        {
            if (transform.localPosition.y < -1)
            {
                _rigidBody.angularVelocity = Vector3.zero;
                _rigidBody.velocity = Vector3.zero;
                transform.localPosition = _startPosition;
                transform.localRotation = _startRotation;
            }

            foreach (var enemy in _enemies)
            {
                enemy.transform.localPosition = new Vector3(Random.value * 8 - 4, 2, Random.value * 8 - 4);
                enemy.GetComponent<Rigidbody>().velocity = Vector3.zero;
                enemy.GetComponent<Rigidbody>().angularVelocity = Vector3.zero;
            }
        }

        public override void CollectObservations(VectorSensor sensor)
        {
            sensor.AddObservation(transform.localPosition);
        }

        public override void OnActionReceived(float[] act)
        {
            var controlSignal = Vector3.zero;
            controlSignal.x = act[0];
            controlSignal.z = act[1];

            if (controlSignal != Vector3.zero)
            {
                transform.rotation = Quaternion.LookRotation(controlSignal);
            }

            _rigidBody.MovePosition(transform.position + transform.forward * (speed * Time.deltaTime));

            var enemies = _enemies.Where(enemy => enemy.transform.localPosition.y > -1);
            if (!enemies.Any())
            {
                Debug.Log("Training: 5.0f reward");
                SetReward(5f);
                EndEpisode();
            }

            if (transform.localPosition.y < -1)
            {
                Debug.Log("Training: -1.0f reward");
                SetReward(-1);
                EndEpisode();
            }

            AddReward(-0.0002f);
        }

        private void OnCollisionEnter(Collision other)
        {
            if (!other.gameObject.CompareTag("Enemy")) return;

            Debug.Log("Training: 0.1f reward");
            AddReward(0.1f);
        }
    }
    Output of training: https://pastebin.com/A7e3m21D
     
  5. celion_unity

    celion_unity

    Joined:
    Jun 12, 2019
    Posts:
    289
    Giving the reward in OnCollisionEnter is probably not going to encourage the behavior you want. You want to do something like this (for each enemy):
    * save the initial distance of the enemy to the edge as minDistanceToEdge
    * at each step, get the distance from the enemy to the edge as currentDistanceToEdge
    * if currentDistanceToEdge < minDistanceToEdge, give the agent a reward based on (minDistanceToEdge - currentDistanceToEdge) and set minDistanceToEdge to currentDistanceToEdge.

    That way, the Agent only gets rewarded for moving the enemies in the proper direction (they can't keep moving the enemy back and forth to get a reward).
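    The steps above could be sketched like this (assumptions: a square platform centred at the local origin with half-size 5, matching the ±4 spawn range in the posted script; `DistanceToEdge`, `RewardEdgeProgress`, the dictionary, and the 0.1f scale are all hypothetical names/values to tune):

    Code (CSharp):
        // Per-enemy "best distance to edge so far", reset in OnEpisodeBegin.
        private readonly Dictionary<GameObject, float> _minDistanceToEdge =
            new Dictionary<GameObject, float>();

        private static float DistanceToEdge(Vector3 localPos)
        {
            const float halfSize = 5f; // assumption: platform spans +/-5 on x and z
            float dx = halfSize - Mathf.Abs(localPos.x);
            float dz = halfSize - Mathf.Abs(localPos.z);
            return Mathf.Min(dx, dz);  // distance to the nearest edge
        }

        // In OnEpisodeBegin, for each enemy:
        //     _minDistanceToEdge[enemy] = DistanceToEdge(enemy.transform.localPosition);

        // In OnActionReceived, for each enemy still on the platform:
        private void RewardEdgeProgress(GameObject enemy)
        {
            float current = DistanceToEdge(enemy.transform.localPosition);
            if (current < _minDistanceToEdge[enemy])
            {
                // Reward only new progress toward the edge, so the agent can't
                // farm reward by shoving the enemy back and forth.
                AddReward(0.1f * (_minDistanceToEdge[enemy] - current)); // scale is a guess
                _minDistanceToEdge[enemy] = current;
            }
        }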

    Looking at your logs:
    1) 300 seconds isn't that long, you might need to wait longer before you see the behavior that you want.
    2) How many Agents and environments are you training at once? If it's only 1 Agent and 1 environment, then something might have happened to it between steps 40000 and 50000. Since your Agent.maxSteps is 5000 but summary_freq is 10000, you should always complete an episode unless something weird happened to your Agent (like getting disabled). If you have more than 1 Agent or environment, ignore this (since summary_freq refers to the total number of Agent steps)
     
  6. Kozaki2

    Kozaki2

    Joined:
    Apr 8, 2019
    Posts:
    47
    The reward for pushing enemies toward the edge didn't work. The agents make a lot of collisions, but only at the beginning of training. I ran some tests on the "PushBlock" example:
    • I changed the vector action space type from discrete to continuous and the vector action space size to 2
    • I modified the "move" method to the one from my script
    After these changes the agents are unable to learn, so maybe the problem is the continuous action space? Maybe it should be set to Discrete, but I don't know how to achieve my movement function with such an action type.
    Code (CSharp):
        public override void OnActionReceived(float[] act)
        {
            var controlSignal = Vector3.zero;
            controlSignal.x = act[0];
            controlSignal.z = act[1];

            if (controlSignal != Vector3.zero)
            {
                transform.rotation = Quaternion.LookRotation(controlSignal);
            }

            _rigidBody.MovePosition(transform.position + transform.forward * (speed * Time.deltaTime));
            //(...)
        }