Resetting Environment

Discussion in 'ML-Agents' started by Claytonious, Apr 25, 2020.

  1. Claytonious

    Joined:
    Feb 16, 2009
    Posts:
    904
    How do we reset the environment during training?

    Looking at these docs and elsewhere, I can see how to implement OnEpisodeBegin() on my agents and how to call EndEpisode() on each of them as it finishes. I can also see how to subscribe to the Academy.Instance.OnEnvironmentReset event so that I can reset the overall environment when it actually does reset. However, I cannot see how to cause the environment to reset.

    My current use case is that I have up to several agents on a "team". When all on a team have died, then the game ends and I need to reset the environment. As each of them individually dies, then his own EndEpisode() is called and that's fine. But once they have all died, I need to reset everything (including non-agent objects in the environment to put things back to how they should be at the start of a game).

    It looks like Academy used to have something like a Done() method on it. I'm using 0.15.1 and don't see anything like that.

    So how does one reset the environment now?

    Thanks.
     
  2. Claytonious

    Upon further inspection of the examples and docs, it looks like maybe ml-agents simply isn't equipped for this?

    To simplify the description of the problem: if I have several agents in a scene learning at the same time, but I only want to reset the environment when *all* of their episodes have ended, can I do that?

    The environment in question is large and expensive (a fully populated terrain) - it doesn't seem practical for me to have several instances of this in a scene (like the examples do with lots of tiny tennis boards, for example). On top of that, I actually do want the agents to learn how to cooperate eventually, so I don't want them isolated anyway (at least in the long run).

    So is this idea of only resetting the environment when *all* agents are finished supported?

    Thanks!
     
  3. vincentpierre

    Joined:
    May 5, 2017
    Posts:
    160
    Hi,

    The environment will not reset when all the agents are Done. Just like when making a game, it is the responsibility of the environment to keep track of the "players". If all the players are dead, the game should restart on its own. Note that this can be done by calling "Academy.Instance.OnEnvironmentReset.Invoke()" directly.
    Academy.Instance.OnEnvironmentReset will be called by Python when using certain features that require the whole environment to reset (curriculum learning, for example). It is also called when you call the UnityEnvironment.reset method in Python (if you are using our environment API directly or the gym wrapper).

    The Academy.Instance.OnEnvironmentReset event is a tool that allows Python to restart the game when it wants. But it will not be called automatically when all the agents have ended their episodes.
     
  4. Claytonious

    When I restart the game, though, do I need to call something on the agents to "reset" their learning? Otherwise, they are learning that by all dying together they can cause a reset, which might have reward benefits in some scenarios, right?

    (Or conversely, to STOP the learning of the dead ones until they are all dead.)

    For example, if each agent has OnEpisodeBegin() called right after he dies, how do I make them wait until the entire team is dead before they start learning again? Obviously I can control my game logic itself to make them do nothing until all of the agents have reset, but then, by forcing them to do nothing (ignoring input), they are learning that their attempts to provide input values are not doing anything for some period of time, right?
     
  5. vincentpierre

    Ah, this sounds like a great use case for NOT resetting agents but instead destroying them and re-spawning them.

    Instead of calling "EndEpisode" when the agent dies, simply Destroy the Agent.
    Destroy(AgentGameObject) will automatically tell Python that the Agent terminated the task (either in success or in failure), and the Agent will not reset.
    When resetting the environment, destroy the remaining agents (if any) and create new Agents.
    If destroying Agents is too costly, disabling and re-enabling Agents should do the trick as well:
    https://docs.unity3d.com/ScriptReference/GameObject.SetActive.html
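
    A minimal sketch of the disable/re-enable variant, for illustration only. TeamManager, OnAgentDied, ResetGame and the teamAgents list are hypothetical names, not part of ML-Agents, and the sketch assumes (as described above) that deactivating the agent's GameObject is enough to end its episode:

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical helper (not part of ML-Agents): tracks which team members are still alive
// and resets the game only once all of them have died.
public class TeamManager : MonoBehaviour
{
    // Assumed to be filled in the Inspector with the GameObjects that carry the Agent components.
    [SerializeField] private List<GameObject> teamAgents = new List<GameObject>();

    private int aliveCount;

    private void Start()
    {
        aliveCount = teamAgents.Count;
    }

    // Call this from game logic when an agent dies, instead of calling EndEpisode() on it.
    public void OnAgentDied(GameObject agentObject)
    {
        if (!agentObject.activeSelf)
        {
            return; // this agent was already counted as dead
        }

        agentObject.SetActive(false); // disable rather than Destroy, to avoid re-instantiation cost
        aliveCount--;

        if (aliveCount <= 0)
        {
            ResetGame();
        }
    }

    private void ResetGame()
    {
        // Restore non-agent objects (terrain state, pickups, and so on) here.

        foreach (GameObject agentObject in teamAgents)
        {
            agentObject.SetActive(true); // bring every team member back for the next game
        }

        aliveCount = teamAgents.Count;
    }
}
```

    With this layout, dead agents stay inactive until the whole team is gone, so they do not keep acting (or being forced to idle) while the rest of the team is still playing.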

    Would that solve your issue?
     
  6. 0rigin93

    Joined:
    Dec 11, 2016
    Posts:
    8
    Isn't OnEnvironmentReset an event, meaning it is not possible to Invoke() it from anywhere but the Academy class? Also, I cannot find EndEpisode() as a callable method in the Agent class. Did something change? Edit: Is EndEpisode() called Done() in some versions?
     
    Last edited: Apr 28, 2020
  7. Claytonious

    Yes, Vincent, that sounds like the perfect solution. I didn't know that destroying them did this. Sounds great so I will give it a try. Thank you for your support.
     
  8. Claytonious

    EndEpisode is definitely there in the latest release (0.15.1).
     
  9. isefno

    Joined:
    May 17, 2020
    Posts:
    1
    I have the same configuration as Claytonious: I have 2 teams of agents (sharing the same brain) battling each other, resetting when one team gets `killed`. I have tried calling
    Code (CSharp):
    Academy.Instance.OnEnvironmentReset.Invoke()
    but without success, as I suppose we cannot invoke this event outside the Academy class. I have also tried the solution given by vincentpierre. Destroying all agents and then instantiating them again did not call my custom resetting method
    Code (CSharp):
    EnvironmentReset()
    which is added to the Academy event like so:
    Code (CSharp):
    Academy.Instance.OnEnvironmentReset += EnvironmentReset;
    I am running the new release, 1.0. Can anyone enlighten me on what could have gone wrong with my logic? Thanks
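
    On the event question: in C#, code outside the declaring class can only subscribe to (+=) or unsubscribe from (-=) an event, and only the declaring class itself can raise it. The sketch below illustrates that language rule with a hypothetical Publisher class standing in for a class like Academy. It does not use ML-Agents at all.

```csharp
using System;

// Hypothetical stand-in for a class that exposes an event, the way Academy exposes OnEnvironmentReset.
public class Publisher
{
    public event Action OnReset;

    // Only code inside the declaring class may raise the event.
    public void RaiseReset()
    {
        OnReset?.Invoke();
    }
}

public static class Demo
{
    public static void Main()
    {
        var publisher = new Publisher();
        int calls = 0;

        // Subscribing from outside the class is allowed.
        publisher.OnReset += () => calls++;

        // Raising it from outside is not. The next line, if uncommented, fails with compiler error CS0070:
        // publisher.OnReset.Invoke();

        publisher.RaiseReset();
        Console.WriteLine(calls); // prints 1
    }
}
```

    This would explain why the direct Invoke() call is rejected when written in user code.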
     
  10. vincentpierre

    Hi,
    I think your logic is correct, but the goal of OnEnvironmentReset is to reset the simulation or game from Python.
    I think what you should do is have a method to reset the environment (called EnvironmentReset(), like in your example).
    When the game needs to reset (because too many agents died, for instance), call EnvironmentReset() manually.
    This is regular simulation behavior: it resets on its own when the conditions are met.
    In addition, you should use Academy.Instance.OnEnvironmentReset += EnvironmentReset
    so Python can reset the environment without having to wait for the conditions of a reset to be met.
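
    The two steps above could be sketched roughly as follows. GameResetter, NotifyTeamWiped, resettableObjects and startPositions are hypothetical names chosen for illustration. The Unity.MLAgents namespace is the one used by release 1.0 (earlier releases, such as 0.15.1, used a different namespace), and the Academy.IsInitialized guard is an assumption about the 1.0 API that is worth checking against your installed version.

```csharp
using Unity.MLAgents; // release 1.0 namespace; adjust for older releases
using UnityEngine;

// Hypothetical component: owns the game's reset logic and exposes it to both the game and Python.
public class GameResetter : MonoBehaviour
{
    // Hypothetical: non-agent objects whose starting position should be restored on reset.
    [SerializeField] private Transform[] resettableObjects = new Transform[0];

    private Vector3[] startPositions;

    private void Awake()
    {
        // Remember where everything started so the reset can put it back.
        startPositions = new Vector3[resettableObjects.Length];
        for (int i = 0; i < resettableObjects.Length; i++)
        {
            startPositions[i] = resettableObjects[i].position;
        }

        // Lets Python trigger the same reset (for example via UnityEnvironment.reset).
        Academy.Instance.OnEnvironmentReset += EnvironmentReset;
    }

    private void OnDestroy()
    {
        // Assumed guard: avoid creating the Academy again during shutdown just to unsubscribe.
        if (Academy.IsInitialized)
        {
            Academy.Instance.OnEnvironmentReset -= EnvironmentReset;
        }
    }

    // Call this from game logic when the reset condition is met (for example, a whole team has died).
    public void NotifyTeamWiped()
    {
        EnvironmentReset();
    }

    private void EnvironmentReset()
    {
        for (int i = 0; i < resettableObjects.Length; i++)
        {
            resettableObjects[i].position = startPositions[i];
        }
        // Re-spawn or re-enable agents here as well.
    }
}
```

    Keeping one EnvironmentReset() method and reaching it through both paths means a game-triggered reset and a Python-triggered reset leave the scene in the same state.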
     
  11. guidosalimbeni

    Joined:
    Nov 10, 2017
    Posts:
    17
    Hi,
    I have a similar dilemma. In my case, when all the agents reach the goal, I want to change the game object that the agent was controlling. The handler registered with Academy.Instance.OnEnvironmentReset += EnvironmentReset seemed to be called only once, at the beginning of training, and not at the start of each new episode. The solution was to maintain a list that each agent updates from its episode-end call, removing the first entry and adding the agent that most recently reset.