Question: Making an ML agent handle collisions in a racing game.

Discussion in 'ML-Agents' started by nscrivanich112325, Mar 25, 2021.

  1. nscrivanich112325

    nscrivanich112325

    Joined:
    Aug 14, 2018
    Posts:
    12
    Hello,

    I'm currently using ML-Agents for a car racing game where the AI races against the player around the track. I'm having trouble getting the agents to properly handle collisions: whenever they drive head-on into a barrier during training, they get stuck and never reverse to free themselves. I have a picture below:

    upload_2021-3-25_17-31-50.png

    I tried placing a trigger at the front of the car and giving the AI a penalty for hitting the throttle instead of the brake while touching a barrier (brake and reverse are mapped to the same command).

    Code (CSharp):
    AddReward((-AIThrottle + AIBrake) * 0.5f);
    However, this was unsuccessful and resulted in the AI not moving at all in the later stages of training.

    Does anybody know a good way to implement this behavior with ML-Agents, or have any advice on how to properly handle collisions using ML-Agents?

    Here are the rewards/penalties I'm currently using:
    • Reward to encourage faster driving:
      Code (CSharp):
      AddReward(Mathf.Clamp01(carController.speed / 200f) * 0.1f);
    • Penalty when initially colliding with a barrier.
    • Reward for going through a checkpoint.
    • Penalty for facing the wrong direction.
    An episode ends when a car does not make it through a checkpoint within a certain time frame or when the car finishes the race (3 laps).
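
    For reference, the two dense per-step terms from the list above (the speed reward and the throttle-into-wall penalty) can be collected into one pure function. This is only a sketch: the class and parameter names are illustrative, not from the project, though the constants are the ones quoted in this post. Factoring it out of the MonoBehaviour makes the shaping testable outside Unity.

    ```csharp
    using System;

    // Hypothetical helper consolidating the dense shaping rewards described above.
    public static class RewardShaping
    {
        public static float StepReward(float speed, bool frontCol, float throttle, float brake)
        {
            float r = 0f;

            // Speed shaping: only above 15 units, clamped so it never exceeds +0.1 per step.
            if (speed > 15f)
                r += Math.Clamp(speed / 200f, 0f, 1f) * 0.1f;

            // Nose-against-wall shaping: penalize throttle, reward brake/reverse,
            // while the front trigger overlaps a barrier.
            if (frontCol)
                r += (-throttle + brake) * 0.5f;

            return r;
        }
    }
    ```

    Inside the agent this would be called once per decision step, e.g. `AddReward(RewardShaping.StepReward(carController.speed, frontCol, AIThrottle, AIBrake));`.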


    Below is the configuration for the hyperparameters and the AIAgent code:

    Hyperparameters:

    behaviors:
      Race:
        trainer_type: ppo
        hyperparameters:
          batch_size: 1024
          buffer_size: 20480
          learning_rate: 3.0e-4
          learning_rate_schedule: linear
        network_settings:
          normalize: false
          hidden_units: 128
          num_layers: 2
        reward_signals:
          extrinsic:
            gamma: 0.99
            strength: 1.0
        max_steps: 10000000
        time_horizon: 64
        summary_freq: 1000000

    Code (CSharp):

    private void Awake()
    {
        carController = GetComponent<RCC_CarControllerV3>();
        curReset = resets[Random.Range(0, (resets.Count - 1))];
    }

    private void Start()
    {
        checkPointScript.OnCarCorrectCheckPoint += OnCorrectCheckPoint;
        checkPointScript.OnCarWrongCheckPoint += OnWrongCheckPoint;
    }

    void OnCorrectCheckPoint(object sender, TrackCheckPoints.CheckPointSystemArgs e)
    {
        if (e.CarTransform == carCollider)
        {
            AddReward(1f);
            secondsCount = 0;

            if (e.last)
            {
                laps++;
                if (laps == 3)
                {
                    CheckEndEpisode();
                }
            }
        }
    }

    private void FixedUpdate()
    {
        secondsCount += Time.fixedDeltaTime;

        if (secondsCount >= maxEpisodeTime)
        {
            CheckEndEpisode();
        }
        speed = carController.speed;
    }

    public override void CollectObservations(VectorSensor sensor)
    {
        Vector3 checkPos = checkPointScript.getNextCheckpoint(carCollider).position;
        Vector3 dirToTarget = (checkPos - transform.position).normalized;
        float dirDot = Vector3.Dot(transform.forward, dirToTarget);

        if (dirDot < 0.1f)
        {
            AddReward(-1f);
        }

        sensor.AddObservation(dirDot);
        sensor.AddObservation(dirToTarget);
        sensor.AddObservation(frontCol);
        sensor.AddObservation(Mathf.Clamp01(carController.speed / 200f));
        sensor.AddObservation(transform.forward);
        sensor.AddObservation(leftSteerAngle / 64f);
        sensor.AddObservation(rightSteerAngle / 64f);
    }

    public override void OnActionReceived(float[] vectorAction)
    {
        AIThrottle = vectorAction[0];
        if (AIThrottle < 0.0f)
        {
            AIThrottle = 0.0f;
        }

        AIBrake = vectorAction[1];
        if (AIBrake < 0.0f)
        {
            AIBrake = 0.0f;
        }

        AISteer = vectorAction[2];

        if (frontCol)
        {
            AddReward((-AIThrottle + AIBrake) * 0.5f);
        }

        if (carController.speed > 15f)
        {
            AddReward(Mathf.Clamp01(carController.speed / 200f) * 0.1f);
        }
    }

    public override void Heuristic(float[] actionsOut)
    {
        actionsOut[0] = Input.GetAxis(RCC_Settings.Instance.Xbox_triggerRightInput);
        actionsOut[1] = Input.GetAxis(RCC_Settings.Instance.Xbox_triggerLeftInput);
        actionsOut[2] = Input.GetAxis(RCC_Settings.Instance.Xbox_horizontalInput);
    }

    void OnCollisionEnter(Collision collision)
    {
        if (collision.gameObject.tag == "Wall" || collision.gameObject.tag == "AICar" || collision.gameObject.tag == "Player")
        {
            AddReward(-1f);
        }
    }

    private void OnTriggerEnter(Collider other)
    {
        if (other.gameObject.CompareTag("Wall"))
        {
            frontCol = true;
        }
    }

    private void OnTriggerExit(Collider other)
    {
        if (other.gameObject.CompareTag("Wall"))
        {
            frontCol = false;
        }
    }

    public override void OnEpisodeBegin()
    {
        this.transform.position = curReset.position;
        this.transform.rotation = curReset.rotation;
        checkPointScript.ResetCheckPoints(carCollider);
        secondsCount = 0;
        laps = 0;
    }

    void CheckEndEpisode()
    {
        curReset = resets[Random.Range(0, (resets.Count - 1))];
        if (curReset.GetComponent<ResetSpawn>().CheckSpawn())
        {
            EndEpisode();
        }
    }

    Any help on this is much appreciated. Thank you.
     
    Last edited: Mar 30, 2021
  2. christophergoy

    christophergoy

    Unity Technologies

    Joined:
    Sep 16, 2015
    Posts:
    735
    Hi, have you tried using ray cast sensors to detect the other cars and walls in the game?
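
    For later readers: attaching one is usually just a matter of adding the RayPerceptionSensorComponent3D component and listing the tags it should detect. A minimal runtime sketch, assuming the PascalCase property names of recent ML-Agents releases (these vary across versions, and in practice the component is normally configured in the Inspector instead):

    ```csharp
    using System.Collections.Generic;
    using Unity.MLAgents.Sensors;
    using UnityEngine;

    // Hypothetical setup script; the values below are illustrative, not tuned.
    public class RaySensorSetup : MonoBehaviour
    {
        void Awake()
        {
            var rays = gameObject.AddComponent<RayPerceptionSensorComponent3D>();
            rays.RayLength = 30f;        // how far each ray reaches
            rays.RaysPerDirection = 5;   // rays fanned out to each side of center
            rays.DetectableTags = new List<string> { "Wall", "AICar", "Player" };
        }
    }
    ```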
     
  3. nscrivanich112325

    nscrivanich112325

    Joined:
    Aug 14, 2018
    Posts:
    12
    Hello,

    Yes, I forgot to mention that I'm using the Ray Perception sensor 3D component.

    upload_2021-3-25_20-21-59.png
     
  4. nscrivanich112325

    nscrivanich112325

    Joined:
    Aug 14, 2018
    Posts:
    12
     

    Attached Files:

  5. christophergoy

    christophergoy

    Unity Technologies

    Joined:
    Sep 16, 2015
    Posts:
    735
    Hey,
    there are a few other things in your code that I'm not quite sure I understand.

    You are adding reward values that are really high. You usually want to keep them within the -1 to +1 range.
    I also see you are setting the speed in FixedUpdate. Do you want to set that from your action buffer instead?
     
    nscrivanich112325 likes this.
  6. nscrivanich112325

    nscrivanich112325

    Joined:
    Aug 14, 2018
    Posts:
    12
    Hey, thanks for the response. I did try changing the code to keep the value of the rewards between 0 and 1 (the issue still persists). I updated the code and the hyperparameters in the original post. The speed variable that you see in FixedUpdate is just a public variable that does nothing; I only have it there so I can see the speed of each agent in the Inspector during training.

    Despite having a penalty for facing the wrong way, the agents cannot grasp the fact that they should only go one way. Is there anything I'm missing regarding the observations? I passed in the dot product of the car's forward vector and the vector from the car to the next checkpoint, so the agent can observe the correct direction to go in. Perhaps I just need to train them more, although they don't seem to be improving at this point. :(
     
    Last edited: Mar 30, 2021
  7. christophergoy

    christophergoy

    Unity Technologies

    Joined:
    Sep 16, 2015
    Posts:
    735
    It looks like you have a pretty complex reward function. Perhaps a better reward for direction would be to reward the agent for how well it is pointing in the right direction, instead of penalizing it for not pointing in the right direction.

    For example I'd remove:
    Code (CSharp):
    if (dirDot < 0.1f)
    {
        AddReward(-1f);
    }
    in favor of:
    Code (CSharp):
    AddReward(dirDot);
    or something similar.

    This is much easier for the neural network to maximize than getting no reward signal at all for going the right direction and then being penalized once it crosses an arbitrary threshold you set.
    Also, if you ever decide to change that if statement, you'll need to retrain your model.

    I'd also remove the penalty for colliding with anything.

    Try to simplify your reward function:
    - Try to avoid conditional rewards unless absolutely necessary
    - Try to give rewards that reflect how well the network is doing
    - For example, it gets a higher reward for facing the checkpoint more directly, and a lower reward for not.
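
    The dense alignment reward suggested above can be written as a tiny pure function. One caveat worth flagging as an assumption: since dirDot is the dot product of two unit vectors it lies in [-1, 1], and an unscaled per-step reward can accumulate far beyond the recommended [-1, +1] range over a long episode, so a small scale factor (0.01f here, purely illustrative) is often applied:

    ```csharp
    // Sketch of a scaled dense alignment reward: +scale when the car faces the next
    // checkpoint dead-on, -scale when it faces directly away, linear in between.
    public static class AlignmentShaping
    {
        public static float AlignmentReward(float dirDot, float scale = 0.01f)
        {
            return dirDot * scale;
        }
    }
    ```

    In the agent this would replace the thresholded penalty, e.g. `AddReward(AlignmentShaping.AlignmentReward(dirDot));` each decision step.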

    Let me know if that helps.
     
    dschu and nscrivanich112325 like this.