
Question: Ecosystem Simulation ML-Agents - beginner

Discussion in 'ML-Agents' started by ZRogers, Apr 22, 2021.

  1. ZRogers

     Joined: Jun 4, 2019
     Posts: 1
    Hi, I'm new to ML-Agents (and ML in general). I'm trying to create a small ecosystem with some rabbits, foxes, and carrots. The idea is that the rabbits will explore the area looking for carrots; when they eat a carrot they receive some energy, and if they have enough energy and they see another rabbit they can mate, all whilst avoiding the foxes and the player.

    Environment:
    I have a plane with a NavMesh baked (this is the method of movement I plan to use, as I would like to implement it on a terrain; I'll cover this at the end as it's less important). I have tried using walls and no walls; without walls the agents just walk towards the edge.

    Observations:
    For each agent I am using the Ray Perception Sensor 3D along with the local rotation, local position, and energy. Tags: Fox, Carrot (not for the Fox), Rabbit, Wall.
    I originally added the observations as raw floats, but I have since normalized them: for the rotation I divided the local rotation eulerAngles by 360, the local position I added as separate floats using the formula normalizedValue = (currentValue - minValue) / (maxValue - minValue), and finally I divided the energy by 100. Roughly, my CollectObservations looks like the sketch below.
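    A simplified sketch of the observation code (minPos/maxPos are just the corners of my plane and energy is a field on the agent; I'm on the ActionBuffers-era API):

    Code (CSharp):
    using Unity.MLAgents;
    using Unity.MLAgents.Sensors;
    using UnityEngine;

    public class RabbitAgent : Agent
    {
        public float energy = 75f;
        public Vector3 minPos;  // one corner of the plane
        public Vector3 maxPos;  // opposite corner

        public override void CollectObservations(VectorSensor sensor)
        {
            // rotation: eulerAngles / 360 -> three floats in [0, 1]
            sensor.AddObservation(transform.localRotation.eulerAngles / 360f);

            // position: min-max normalized per axis
            Vector3 p = transform.localPosition;
            sensor.AddObservation((p.x - minPos.x) / (maxPos.x - minPos.x));
            sensor.AddObservation((p.z - minPos.z) / (maxPos.z - minPos.z));

            // energy: 0-100 -> [0, 1]
            sensor.AddObservation(energy / 100f);
        }
    }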

    Actions: Nothing, Forward, Rotate Left, and Rotate Right (a single discrete branch; see the sketch below).
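    How I apply them (simplified; I move through the NavMeshAgent so the rabbit stays on the mesh, and moveSpeed/turnSpeed are just my own placeholder fields):

    Code (CSharp):
    using Unity.MLAgents.Actuators;
    using UnityEngine;
    using UnityEngine.AI;

    // continued inside the RabbitAgent class above
    public NavMeshAgent navAgent;
    public float moveSpeed = 2f;   // placeholder values
    public float turnSpeed = 120f;

    public override void OnActionReceived(ActionBuffers actions)
    {
        switch (actions.DiscreteActions[0])
        {
            case 0: // Nothing
                break;
            case 1: // Forward
                navAgent.Move(transform.forward * moveSpeed * Time.fixedDeltaTime);
                AddReward(0.0001f); // the small shaping reward I tested at 0.001 / 0.0001
                break;
            case 2: // Rotate left
                transform.Rotate(0f, -turnSpeed * Time.fixedDeltaTime, 0f);
                break;
            case 3: // Rotate right
                transform.Rotate(0f, turnSpeed * Time.fixedDeltaTime, 0f);
                break;
        }
    }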

    Rewards:
    My original plan for rewards was -1 for hitting the wall, the player, or a fox (for rabbits), and for energy reaching 0; +0.25 for a carrot; and +1 for finding a mate. I also ran tests with a small reward for moving forward (to reduce the amount of turning side to side); this was tested at no reward, 0.001, and 0.0001. I also tried a survival reward to help the agent survive until it ran out of energy, but I found this not to be necessary. The collision rewards are handled in a trigger, roughly as below.
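    A sketch of the trigger handling (simplified; the carrot energy amount is a placeholder):

    Code (CSharp):
    // continued inside RabbitAgent
    void OnTriggerEnter(Collider other)
    {
        if (other.CompareTag("Wall") || other.CompareTag("Fox") || other.CompareTag("Player"))
        {
            AddReward(-1f);
            EndEpisode();
        }
        else if (other.CompareTag("Carrot"))
        {
            AddReward(0.25f);
            energy += 25f; // placeholder amount
            other.gameObject.SetActive(false);
        }
        // the mate case is shown further down
    }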

    MaxSteps:
    0

    Episode ends:
    The agent's energy is reduced over time, so when it reaches 0 the episode ends. The episode also ends if the agent hits the wall, a fox (for rabbits), or the player, and finally if it finds a mate. The energy drain looks roughly like the sketch below.
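    The drain itself (simplified; drainPerSecond is a placeholder name):

    Code (CSharp):
    // continued inside RabbitAgent
    public float drainPerSecond = 1f; // placeholder rate

    void FixedUpdate()
    {
        energy -= drainPerSecond * Time.fixedDeltaTime;
        if (energy <= 0f)
        {
            AddReward(-1f); // starving counts as a failure
            EndEpisode();
        }
    }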

    Training:
    I'm using the default PPO config, but with max_steps at 1 million (I was planning to adjust parameters once I had it working). For reference, the config is below.
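    This is essentially the default PPO config from the docs with max_steps bumped ("Rabbit" is just my behavior name):

    Code (YAML):
    behaviors:
      Rabbit:
        trainer_type: ppo
        hyperparameters:
          batch_size: 1024
          buffer_size: 10240
          learning_rate: 3.0e-4
          beta: 5.0e-3
          epsilon: 0.2
          lambd: 0.95
          num_epoch: 3
          learning_rate_schedule: linear
        network_settings:
          normalize: false
          hidden_units: 128
          num_layers: 2
        reward_signals:
          extrinsic:
            gamma: 0.99
            strength: 1.0
        max_steps: 1000000
        time_horizon: 64
        summary_freq: 10000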

    Episode begin:
    The agent's energy is set to 75 (it needs to be above 80 to mate), roughly as below.
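    The reset (simplified; how I respawn position/rotation is omitted):

    Code (CSharp):
    // continued inside RabbitAgent
    public override void OnEpisodeBegin()
    {
        energy = 75f; // below the 80 needed to mate, so it has to eat first
        // (position/rotation respawn omitted here)
    }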

    Issues: Initially I was having issues with the rabbit finding carrots, so I decided to remove the other objects and train the explore/hunt behaviour on its own. I tried several things, including increasing the reward, changing the size of the collider (reducing the size and retraining, initializing from the previous run), and adding extra carrots. Having more carrots and then reducing back down to one did help with it moving towards carrots when it saw them; however, it would often go round in a circle and not explore the whole environment. Another issue is that the agent can train to find a mate, but it doesn't seem to learn that it can only do that when its energy is above a certain amount (80), and it keeps trying to get the reward but can't. Is this something that would disappear with longer training? (I'm using triggers to detect collisions and have added && energy > 80 to the if statement, roughly as below; I don't know if this is how it's done for ML-Agents.)
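    The mate case (simplified):

    Code (CSharp):
    // continued inside the OnTriggerEnter from the rewards sketch above
    if (other.CompareTag("Rabbit") && energy > 80f) // can only mate above 80 energy
    {
        AddReward(1f);
        EndEpisode();
    }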

    I would appreciate any advice anyone could give on how to go about producing an agent that will explore an area, and whether my plan for rewards is any good. Would it be best to train the agents separately and then together? Likewise, is it best to train each agent to hunt first and, once it has that down, train it to look for a mate, or both together? Also, should I be normalizing the observations the way I am? I hope I have included everything that is helpful, but I can answer any questions. Thanks :)

    NavMesh issue:
    For some reason I'm unable to get my agents to move on a custom terrain; they move perfectly fine when using a NavMesh on a plane. The same movement code works in my other scenes using the same terrain, just not with ML-Agents. This isn't the end of the world, as I can change the others to a flat terrain too.
     
    Last edited: Apr 22, 2021