Make agent reach a moving target

Discussion in 'ML-Agents' started by DeathScytheU, May 3, 2020.

  1. DeathScytheU

    Hello people!
    I am looking for a way to define rewards for an agent that has to navigate towards a constantly moving target. Because the target can sometimes move towards the agent purely by chance, I'm finding it especially difficult to define a reward system.
    Any help, advice and insight are much appreciated!
    Cheers!
     
  2. Roboserg

    Look at the reward used in the Crawler example.
     
  3. DeathScytheU

    Thanks for the reply! I have already tried something similar to the Crawler in terms of rewards. The thing is, even in the dynamic target version of the Crawler, the position of the target changes only at the end of the episode, so from the agent's perspective the target is static for the duration of the episode. What I'm trying to find is a way to give the agent an incentive to move towards a target that changes position continuously, every frame, e.g. moving in a circle within the perimeter of the floor in the scene. So far, the agent prefers to learn a lazy way to 'farm' the reward: it figures out a stationary position that the target passes through most frequently and simply heads there at the beginning of every episode.
     
  4. christophergoy


    Unity Technologies

    You could try penalizing the agent for "being lazy" with a small negative reward for every step it isn't moving.
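A minimal sketch of such an idle penalty, in plain C# so that it runs outside Unity. `IdlePenalty`, `MinSpeed` and `PenaltyPerStep` are hypothetical names and values chosen for illustration, not part of ML-Agents, and `System.Numerics.Vector3` stands in for `UnityEngine.Vector3`.

```csharp
using System.Numerics;

// Hypothetical helper (not part of ML-Agents): returns a small negative
// reward for any step in which the agent is effectively standing still.
public static class IdlePenalty
{
    // Assumed threshold below which the agent counts as "not moving".
    public const float MinSpeed = 0.1f;

    // Assumed penalty size; it would need tuning against the main reward.
    public const float PenaltyPerStep = -0.001f;

    public static float Compute(Vector3 velocity)
    {
        // Length() is the speed; UnityEngine.Vector3 exposes it as .magnitude.
        return velocity.Length() < MinSpeed ? PenaltyPerStep : 0f;
    }
}
```

Inside an agent, the result would presumably be passed to `AddReward` once per step, using the velocity of whatever moves the agent (for example its Rigidbody).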
     
  5. DeathScytheU

    Thank you for the reply, but I have tried this already, unfortunately to no avail. Since that didn't work out, I even tried a penalty for velocity lower than a certain value, simply to prevent the agent from camping in one place, but then it just learns to 'jiggle' around without really covering any ground.
     
  6. mbaske

    You could try using a waypoint queue. Enqueue your agent's position at some regular time interval. Set a maximum queue size and dequeue old positions. At each agent step, measure the distance between the newest and oldest position in the queue and set a reward proportional to that distance.
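The waypoint queue described above could be sketched roughly as follows, in plain C# so that it runs outside Unity. `WaypointQueue`, `Record`, `Reward` and the `rewardScale` parameter are hypothetical names for illustration, and `System.Numerics.Vector3` stands in for `UnityEngine.Vector3`.

```csharp
using System.Collections.Generic;
using System.Numerics;

// Hypothetical helper: keeps the most recent positions and rewards the
// net displacement between the oldest and newest of them.
public class WaypointQueue
{
    private readonly Queue<Vector3> positions = new Queue<Vector3>();
    private readonly int maxSize;
    private readonly float rewardScale;
    private Vector3 newest;

    public WaypointQueue(int maxSize, float rewardScale)
    {
        this.maxSize = maxSize;
        this.rewardScale = rewardScale;
    }

    // Call at a regular interval, e.g. every few agent steps.
    public void Record(Vector3 position)
    {
        positions.Enqueue(position);
        newest = position;
        // Drop old positions once the queue exceeds its maximum size.
        while (positions.Count > maxSize)
        {
            positions.Dequeue();
        }
    }

    // Reward proportional to the distance between oldest and newest position.
    public float Reward()
    {
        if (positions.Count < 2)
        {
            return 0f;
        }
        return rewardScale * Vector3.Distance(positions.Peek(), newest);
    }
}
```

Because only net displacement over the window counts, an agent that moves back and forth and ends up where it started earns nothing, while one that keeps covering ground is rewarded. The queue size and recording interval together set how long a window the agent is judged over.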
     
  7. DeathScytheU

    Thank you for the advice! I hadn't thought of that. I'll give it a go and see if there's an improvement.
     
  8. kahabal

    Try adding more observations with AddObservation, such as the distance to the target, the direction the agent is facing, both positions, and eventually the force you are applying (I don't know how you are moving your agent).
    This worked for me:

    // Agent and target positions, local to their parent transform
    sensor.AddObservation(transform.localPosition);
    sensor.AddObservation(target.transform.localPosition);

    // Forces applied to move the agent
    sensor.AddObservation(spinningForceL);
    sensor.AddObservation(spinningForceR);
    // Distance to the target, facing direction, and unit direction to the target
    sensor.AddObservation(Vector3.Distance(transform.localPosition, target.transform.localPosition));
    sensor.AddObservation(transform.forward);
    sensor.AddObservation((target.transform.localPosition - transform.localPosition).normalized);
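When observations are added like this, the vector observation Space Size in the agent's Behavior Parameters has to match the total number of floats. A small tally for the list above, assuming `spinningForceL` and `spinningForceR` are single floats (the post does not state their types); `ObservationTally` is a hypothetical name used only for this sketch.

```csharp
// Hypothetical tally of the observation vector size for the calls above.
public static class ObservationTally
{
    public const int FloatsPerVector3 = 3;
    public const int FloatsPerScalar = 1;

    public static int Total()
    {
        // Vector3 observations: agent position, target position,
        // forward direction, normalized direction to the target.
        int vectors = 4 * FloatsPerVector3;

        // Scalar observations: spinningForceL, spinningForceR (assumed floats)
        // and the distance to the target.
        int scalars = 3 * FloatsPerScalar;

        return vectors + scalars;
    }
}
```

Under that assumption the total is 15 floats; if the two forces are vectors rather than scalars, the count changes accordingly.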
     
  9. MirrorsK

    I have the same question. Have you solved it?