How to go about environment/agent resets that take longer than a frame?

Discussion in 'ML-Agents' started by Xiromtz, Feb 13, 2020.

  1. Xiromtz

    Joined: Feb 1, 2015. Posts: 65.
    Hi,
    The game I'm training with ML-Agents procedurally generates a new level for every episode. The current implementation waits for the old level to be destroyed before setting up the new one, which takes about two frames.

    I want to call Agent.Done() the moment the player dies, before a new level is generated. In the same frame in which I call Done(), the AgentReset function is called, which in turn runs the episode reset logic.
    The big problem is that the reset logic takes longer than one frame, but the agent's observation/action loop still runs in the same frame in which I call Done(). This produces an extra, faulty observation from an environment that isn't set up yet, which might ruin the learning process.

    My solution was to call "DisableAutomaticStepping" on the academy, which according to the documentation stops the observation/action loop until I call "EnvironmentStep". I don't know if this is a bug, but after doing this, neither "AgentReset" nor "EnvironmentReset" is called at the beginning of training. The only way to trigger either of them is to call "EnvironmentStep" manually, which in turn starts a whole observation/action loop and leads to the same issue I had before.

    I am currently using version 0.14.0 of mlagents.

    I would love some help here, since I can't think of any other solution right now.
    Thanks for the help in advance.
     
  2. mbaske

    Joined: Dec 31, 2017. Posts: 473.
    I'm facing a similar problem, except that my reset pause applies to single agents rather than the whole environment. Individual agents can reset at different points in time. My ugly hack for solving this is to set a bool flag in AgentReset. As long as it's true, CollectObservations and AgentAction exit early and the agent receives null (all-zero) observations.
    Code (CSharp):
    public override void CollectObservations()
    {
        // While this agent is still resetting, send placeholder observations
        // instead of reading the half-built environment.
        if (isResetting)
        {
            SetNullObservations();
            return;
        }
        ...
    }

    // Adds an all-zero vector of the expected observation size.
    protected void SetNullObservations()
    {
        collectObservationsSensor.Update();
        AddVectorObs(
            new float[SensorExtensions.ObservationSize(collectObservationsSensor)]);
    }
    The assumption being that a few null observations here and there won't mess up the training too much. But it's far from ideal... Is there any way to pause the agent loop?
     
  3. Xiromtz

    With "DisableAutomaticStepping()", I simply keep a bool that I set to false whenever I don't want the agent to be updated. While it is false, I don't call "EnvironmentStep()" in Update/FixedUpdate. This works well; the only problem is that AgentReset() isn't called.
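
    The gating described in this post could look roughly like the sketch below. It is only an illustration under assumptions: the `ManualStepper` class and its `IsResetting` flag are hypothetical names, and access through `Academy.Instance` in the `MLAgents` namespace is assumed for version 0.14. Only "DisableAutomaticStepping" and "EnvironmentStep" are named in the thread. The flag here is inverted relative to the description above (true means "pause").

    ```csharp
    using MLAgents;
    using UnityEngine;

    // Hypothetical helper: steps the academy manually and pauses while a level is rebuilt.
    public class ManualStepper : MonoBehaviour
    {
        // Set to true by the level generator while the old level is being
        // destroyed and the new one is being built (assumed hook, not from the thread).
        public bool IsResetting { get; set; }

        void Awake()
        {
            // Assumption: in 0.14 the academy is reached via Academy.Instance.
            Academy.Instance.DisableAutomaticStepping();
        }

        void FixedUpdate()
        {
            // Skip the observation/action loop for every frame the reset is in progress.
            if (!IsResetting)
            {
                Academy.Instance.EnvironmentStep();
            }
        }
    }
    ```

    This does not address the issue raised above that AgentReset() is not called until the first manual step; it only shows where the pause would sit.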
     
  4. Xiromtz

    Also, EnvironmentStep is called every FixedUpdate by default, so if you call it manually there, the behavior is equivalent to the default.
     
  5. Xiromtz

    Ah, my bad, that won't work for you, since you're resetting on a per-agent basis.
    I guess your solution is the only one I can think of for that case too.
     
  6. mbaske

    Oh well, I should just use on-demand decisions. I totally forgot about that option... It makes sense that they're the default in 0.14.
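
    A per-agent pause with on-demand decisions might be sketched as follows. This is a rough illustration, not a tested implementation: `PausableAgent` and `ResetOverSeveralFrames` are hypothetical names, the two-frame wait stands in for whatever the real reset does, and it assumes that nothing else (such as a periodic decision-requesting component) is requesting decisions for this agent. `Agent`, `AgentReset` and `RequestDecision` are assumed to be available as in ML-Agents 0.14.

    ```csharp
    using System.Collections;
    using MLAgents;
    using UnityEngine;

    // Hypothetical agent that only asks for decisions while it is not resetting.
    public class PausableAgent : Agent
    {
        bool isResetting;

        public override void AgentReset()
        {
            // Start a reset that may span several frames.
            StartCoroutine(ResetOverSeveralFrames());
        }

        IEnumerator ResetOverSeveralFrames()
        {
            isResetting = true;
            // Placeholder for the real multi-frame reset work (assumption: two frames).
            yield return null;
            yield return null;
            isResetting = false;
        }

        void FixedUpdate()
        {
            // No decision request means no observation is collected for this agent,
            // so no placeholder observations are needed during the reset.
            if (!isResetting)
            {
                RequestDecision();
            }
        }

        // CollectObservations and AgentAction overrides are omitted here.
    }
    ```

    Compared with the null-observation flag, this keeps placeholder data out of the training buffer, at the cost of the agent having to drive its own decision requests.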