Question MLagent Warehouse robot not learning at all

Discussion in 'ML-Agents' started by Braeze, Dec 6, 2022.

  1. Braeze (Joined: Mar 4, 2019; Posts: 2)

    Hello, we're having trouble with our environment and have tried everything we could think of.

    Environment:
    The agent should pick up a package (by touching one of the coloured racks) and then deliver it to the corresponding port at the end of the building. The problem is that the agent doesn't seem to learn anything! We started with pure PPO but have since added imitation learning, as we thought the task might be too complex. So far we have tried tuning various parameters, changing the strength of GAIL, and adding behavioural cloning.

    Rewards:
    The agent gets a reward of 0.5 for picking up a package and another 0.5 for delivering it. It can also receive a negative reward of up to -1, depending on how many steps it takes, and we have a max step of 5000.
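For illustration, the reward scheme described above can be worked through as plain arithmetic. This is only a sketch, not the code from the linked agent script: it assumes the -1 is spread evenly as a per-step penalty of -1/5000, and the names `STEP_PENALTY` and `episode_return` are hypothetical.

```python
MAX_STEP = 5000        # max step stated in the post
PICKUP_REWARD = 0.5    # reward for picking up a package (from the post)
DELIVERY_REWARD = 0.5  # reward for delivering the package (from the post)

# Assumption: the "-1 based on how many steps it takes" is applied as a
# small penalty on every step, so a full-length episode sums to -1.
STEP_PENALTY = -1.0 / MAX_STEP


def episode_return(steps_taken, picked_up, delivered):
    """Total reward for one episode under the assumed scheme."""
    steps = min(steps_taken, MAX_STEP)  # the episode is cut off at MAX_STEP
    total = steps * STEP_PENALTY
    if picked_up:
        total += PICKUP_REWARD
    if delivered:
        total += DELIVERY_REWARD
    return total


if __name__ == "__main__":
    # A fast, successful episode versus one where nothing is found.
    print(episode_return(1000, picked_up=True, delivered=True))
    print(episode_return(MAX_STEP, picked_up=False, delivered=False))
```

Under these assumptions, an episode in which the agent never reaches a rack is pure penalty, so early in training almost every episode looks equally bad to the agent.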

    Vision/Observations:
    Our observations for the agent itself include a Boolean for whether it has picked up a package, an integer equal to the value of the package it picked up (defaults to -1 if it is holding no package), and lastly the agent's velocity in local space: sensor.AddObservation(transform.InverseTransformDirection(m_AgentRb.velocity))
    Then there are the objects it needs to interact with. We have tried many different approaches: first giving the agent the target locations of packages and ports, then giving the distance between agent and package. Now we have switched to raycasts (RayPerceptionSensor). One sensor detects walls, and another detects racks (with packages) and ports. They work on different layers so it can see through the racks.
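To make the agent's own observations concrete, here is a rough sketch of the vector they add up to. It is not taken from the linked script. It assumes the agent only rotates around the vertical (Y) axis, and it imitates what `transform.InverseTransformDirection` does for that case in Unity's left-handed, Y-up coordinates. The function names are hypothetical.

```python
import math


def world_to_local_direction(world_vec, yaw_degrees):
    """Express a world-space direction in the agent's local frame.

    Assumes a yaw-only rotation (around Y), Unity-style axes:
    yaw 0 faces +Z, yaw 90 faces +X.
    """
    x, y, z = world_vec
    yaw = math.radians(yaw_degrees)
    forward = (math.sin(yaw), math.cos(yaw))   # agent's forward in world (x, z)
    right = (math.cos(yaw), -math.sin(yaw))    # agent's right in world (x, z)
    local_x = x * right[0] + z * right[1]      # sideways component
    local_z = x * forward[0] + z * forward[1]  # forward component
    return (local_x, y, local_z)


def build_observation(has_package, package_id, world_velocity, yaw_degrees):
    """Boolean flag, package value (-1 when empty), then local velocity."""
    if not has_package:
        package_id = -1
    vx, vy, vz = world_to_local_direction(world_velocity, yaw_degrees)
    return [1.0 if has_package else 0.0, float(package_id), vx, vy, vz]


if __name__ == "__main__":
    # Agent faces +X and moves along +X: locally that is "straight ahead".
    print(build_observation(True, 2, (2.0, 0.0, 0.0), 90.0))
```

The point of the local-space velocity is that "moving forward" looks the same to the agent regardless of which way it is facing in the warehouse.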
    GitHub link to the agent script: https://github.com/WeAreVR/UnityRL/blob/main/Assets/Warehouse/Scripts/AgentMover.cs
    The latest config and graph can be seen below.
    [Attachments: Screenshot 2022-12-06 155048.png, image.png]
    Last edited: Dec 6, 2022
  2. Braeze (Joined: Mar 4, 2019; Posts: 2)

    After a week of trying different things, I figured out that the problem was the config; it is learning now. I changed a few things, but adding curiosity had the biggest impact.
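For readers wondering what enabling curiosity looks like, a curiosity reward signal in an ML-Agents trainer config is typically a fragment along these lines. This is an illustrative sketch only, not the config that fixed this environment: the behaviour name `WarehouseAgent` and every value below are assumptions.

```yaml
behaviors:
  WarehouseAgent:            # hypothetical behaviour name
    trainer_type: ppo
    max_steps: 5000000       # illustrative value, not from the thread
    reward_signals:
      extrinsic:             # the environment's own rewards
        gamma: 0.99
        strength: 1.0
      curiosity:             # intrinsic reward for reaching novel states
        gamma: 0.99
        strength: 0.02       # illustrative; usually kept small relative to extrinsic
        learning_rate: 3.0e-4
```

Curiosity tends to help most when the extrinsic reward is sparse, as here, where the agent gets nothing positive until it first touches a rack.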
  3. GamerLordMat (Joined: Oct 10, 2019; Posts: 126)

    Could you elaborate? I am curious (pun intended).
    So first off, your large negative reward seems to just hinder learning. If anything, the agent should get points for holding the package, because that is the goal, right (holding it and putting it in place)? I would give it reward = 1 - (timeItNeeded / MaxSteps) when it reaches the goal, combined with curiosity. Also ask yourself whether you need ML-Agents for such a simple problem. I know you are just learning, but try to pick something you would have no chance of programming by hand (like moving ragdolls or simpler AI enemies; anything with physics works out fine).
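The suggested goal reward is simple to work out by hand. A minimal sketch, assuming the reward is granted once on delivery and that MaxSteps is the 5000 mentioned in the first post (the function name `delivery_reward` is hypothetical):

```python
def delivery_reward(steps_needed, max_steps=5000):
    """Reward on reaching the goal: 1 - (timeItNeeded / MaxSteps).

    Faster deliveries score closer to 1; a delivery on the very last
    step scores 0, so the reward is never negative.
    """
    if not 0 <= steps_needed <= max_steps:
        raise ValueError("steps_needed must be between 0 and max_steps")
    return 1.0 - steps_needed / max_steps


if __name__ == "__main__":
    for steps in (500, 2500, 5000):
        print(steps, delivery_reward(steps))
```

Unlike a standalone step penalty, this keeps the time pressure but only applies it to episodes that actually succeed.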