
Question: ML-Agents is not working even after several attempts. Please help

Discussion in 'ML-Agents' started by onebyonegames, Apr 30, 2021.

  1. onebyonegames

    onebyonegames

    Joined:
    Jan 17, 2019
    Posts:
    5
    Practising ML-Agents.

    This is to train the green cube to reach the red cube.

    My CollectObservations and OnActionReceived are not working, so my player cube (green) is not moving at all.

    The movement is based on the vector observation size and continuous actions as given in the tutorial (documentation).

    https://github.com/Unity-Technologi..._docs/docs/Learning-Environment-Create-New.md

    When I started my training, the mean reward and standard deviation of reward were all 0.

    Packages installed:
    • ML-Agents: Release 17
    • Python package: 0.26.0
    • ML-Agents (com.unity.ml-agents): 2.0.0
    • ML-Agents Extensions: 0.3.1
    • Unity: 2020.3.6f1
    Here is my code:

    using System.Collections;
    using System.Collections.Generic;
    using UnityEngine;
    using Unity.MLAgents;
    using Unity.MLAgents.Actuators;
    using Unity.MLAgents.Sensors;

    public class playerscript : Agent
    {
        Rigidbody rBody;
        public Transform Target;
        public float forceMultiplier = 10;
        public Vector3 playerpos;

        public override void Initialize()
        {
            rBody = GetComponent<Rigidbody>();
            // Remember the agent's starting position so it can be restored each episode.
            playerpos = transform.position;
        }

        public override void OnEpisodeBegin()
        {
            // Reset the agent to its starting position.
            transform.position = playerpos;
            // Move the target to a new spot.
            Target.localPosition = new Vector3(Random.Range(-1, 4), 0.51f, 14.54f);
            // this.rBody.velocity = new Vector3(0, 0.5f, 1.5f);
        }

        public override void CollectObservations(VectorSensor sensor)
        {
            // Target and agent positions
            sensor.AddObservation(Target.localPosition);
            sensor.AddObservation(this.transform.localPosition);

            // Agent velocity
            sensor.AddObservation(rBody.velocity.x);
            sensor.AddObservation(rBody.velocity.z);
        }

        public override void OnActionReceived(ActionBuffers actionBuffers)
        {
            // Map the two continuous actions to velocity on the x/z plane.
            Vector3 controlSignal = Vector3.zero;
            controlSignal.x = actionBuffers.ContinuousActions[0];
            controlSignal.z = actionBuffers.ContinuousActions[1];
            this.rBody.velocity = new Vector3(controlSignal.x, 0, controlSignal.z);

            print("onaction x: " + controlSignal.x);
            print("onaction z: " + controlSignal.z);

            // Rewards
            float distanceToTarget = Vector3.Distance(this.transform.localPosition, Target.localPosition);

            // Reached target
            if (distanceToTarget < 1.42f)
            {
                SetReward(1.0f);
                EndEpisode();
            }
            // Fell off platform
            else if (this.transform.localPosition.y < 0)
            {
                EndEpisode();
            }
        }

        public override void Heuristic(in ActionBuffers actionsOut)
        {
            // Manual control for testing: the Horizontal/Vertical input axes drive the agent.
            var continuousActionsOut = actionsOut.ContinuousActions;
            continuousActionsOut[0] = Input.GetAxis("Horizontal");
            continuousActionsOut[1] = Input.GetAxis("Vertical");
        }
    }


    Here are my screenshots, code, and error messages.
    error1.png error2.png error3.png error4.png
     
    Last edited: Apr 30, 2021
  2. christophergoy

    christophergoy

    Unity Technologies

    Joined:
    Sep 16, 2015
    Posts:
    735
    Your MaxStep is set to 1, which means your agent resets every step. Did you mean to set it to that?
     
  3. onebyonegames

    onebyonegames

    Joined:
    Jan 17, 2019
    Posts:
    5
    Thanks for the response.

    Even when I set max steps to 10 or 100, the result remains the same.
     
  4. onebyonegames

    onebyonegames

    Joined:
    Jan 17, 2019
    Posts:
    5
    @christophergoy,

    I was able to train the example environments. I have just now trained the 3DBall example. It trains fine, with the mean reward and standard deviation increasing.
    error5.png
     
  5. onebyonegames

    onebyonegames

    Joined:
    Jan 17, 2019
    Posts:
    5
    Can someone help me here?
     
  6. onebyonegames

    onebyonegames

    Joined:
    Jan 17, 2019
    Posts:
    5
    I updated my code and tried it too. Still the same result. The cube is not moving at all.
    gameassets.zip
     
  7. christophergoy

    christophergoy

    Unity Technologies

    Joined:
    Sep 16, 2015
    Posts:
    735
    Hi @onebyonegames,
    Even with a MaxStep of 100, your agent's episode only lasts about 2 seconds of real time (at the default 50 fixed updates per second).

    I also see that your Heuristic method is being called. Are you sure you are correctly connecting to the trainer?
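    For reference, "connecting to the trainer" means launching mlagents-learn with a trainer configuration before pressing Play in the Editor; if no trainer is connected, the Heuristic method is used for actions instead. A minimal PPO config sketch for ML-Agents Release 17 is shown below — the behavior name PlayerBehavior is a placeholder and must match the Behavior Name on your agent's Behavior Parameters component:

    ```yaml
    behaviors:
      PlayerBehavior:        # placeholder; must match the Behavior Name in the Inspector
        trainer_type: ppo
        hyperparameters:
          batch_size: 64
          buffer_size: 2048
          learning_rate: 3.0e-4
        network_settings:
          hidden_units: 128
          num_layers: 2
        reward_signals:
          extrinsic:
            gamma: 0.99
            strength: 1.0
        max_steps: 500000
        time_horizon: 64
    ```

    Save it as, e.g., config.yaml, run `mlagents-learn config.yaml --run-id=reach_target`, and press Play once the console says it is waiting for the Editor; OnActionReceived should then receive actions from the trainer rather than from Heuristic.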
     
    Last edited: May 3, 2021
  8. yasharora

    yasharora

    Joined:
    Aug 8, 2022
    Posts:
    1
    Set MaxStep to 0; then the episode will run until you call EndEpisode().