When is the best time to call EndEpisode?

Discussion in 'ML-Agents' started by Phong, Oct 10, 2020.

  1. Phong

    I have been calling "EndEpisode" at the end of "OnActionReceived", which is what most of the examples do. However, this triggers an extra call to "CollectObservations", which is reached through this call chain:
    • OnActionReceived
      • EndEpisode
        • EndEpisodeAndReset
          • NotifyAgentDone
            • ...
            • CollectObservations
    The resulting stack trace has a CollectObservations call nested inside OnActionReceived. I don't like this extra CollectObservations call because it happens after my agent has applied the actions for this step to the environment, so the observations now differ from those that produced this set of actions.

    So when is the best time to check if the Episode should terminate and call EndEpisode?

    I have considered using Academy.PreAgentStep. But that doesn't feel right either because I would end up with a code flow like this:
    • OnActionReceived
      • EndEpisode
        • ...
        • CollectObservations
        • ...
        • OnEpisodeBegin
          • Reset or generate new start conditions
      • CollectObservations
      • OnActionReceived
    Is there a recommended best practice for when to call EndEpisode?
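
    A minimal sketch of the pre-step idea mentioned above, for illustration only. It assumes the public event Academy.Instance.AgentPreStep is what "Academy.PreAgentStep" refers to. The EpisodeTerminator component, its fields, and the distance-based termination rule are all hypothetical. It only moves the termination check out of OnActionReceived; EndEpisode may still trigger the final CollectObservations call described above.

```csharp
using Unity.MLAgents;
using UnityEngine;

// Hypothetical helper component (not part of ML-Agents): checks a termination
// condition before each agent step instead of inside OnActionReceived.
public class EpisodeTerminator : MonoBehaviour
{
    public Agent agent;               // assumed to be assigned in the Inspector
    public Transform target;          // hypothetical goal object
    public float doneDistance = 1.0f; // hypothetical termination threshold

    void OnEnable()
    {
        // AgentPreStep passes the Academy step count to its subscribers.
        Academy.Instance.AgentPreStep += CheckTermination;
    }

    void OnDisable()
    {
        // Avoid touching Academy.Instance if the Academy is not initialized.
        if (Academy.IsInitialized)
        {
            Academy.Instance.AgentPreStep -= CheckTermination;
        }
    }

    void CheckTermination(int academyStepCount)
    {
        float distance = Vector3.Distance(agent.transform.position, target.position);
        if (distance < doneDistance)
        {
            agent.AddReward(1.0f);
            agent.EndEpisode();
        }
    }
}
```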
     
  2. mbaske

    Were you able to solve this? I'm having a related issue where an observed object is destroyed on EndEpisode, which results in a null reference exception when that extra CollectObservations call can no longer find the object.

    EDIT: Never mind, I think my method call order was off. Still, it would be good to know whether there is a best practice for calling EndEpisode.
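
    For the destroyed-object case, one defensive option is to guard the observation code, sketched below. The GuardedAgent class and its observedObject field are hypothetical, and the sketch assumes a vector observation size of 3 in the Behavior Parameters. Fixing the call order, as in the edit above, is probably the cleaner approach; this only prevents the exception.

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Sensors;
using UnityEngine;

// Hypothetical agent that tolerates its observed object having been destroyed.
public class GuardedAgent : Agent
{
    public Transform observedObject; // hypothetical; may be destroyed at episode end

    public override void CollectObservations(VectorSensor sensor)
    {
        // Unity overloads ==, so a destroyed object compares equal to null.
        if (observedObject == null)
        {
            // Keep the observation size constant (3 floats) with a placeholder.
            sensor.AddObservation(Vector3.zero);
            return;
        }

        sensor.AddObservation(observedObject.localPosition);
    }
}
```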
     
    Last edited: Mar 24, 2021
  3. Phong

    I think that extra call is still there. I have resorted to caching the observations: when EndEpisode is called, I detect the extra CollectObservations call and send the cached observations. Not an ideal solution, but it is working.
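
    The caching workaround might look roughly like the sketch below. This is not the actual code from this thread. The class name, the m_Ending flag, the observed values, and the ShouldTerminate rule are all hypothetical. It assumes a package version whose OnActionReceived takes ActionBuffers (older releases pass a float[] instead), at least one continuous action, and a vector observation size of 3.

```csharp
using System.Collections.Generic;
using Unity.MLAgents;
using Unity.MLAgents.Actuators;
using Unity.MLAgents.Sensors;
using UnityEngine;

// Hypothetical agent that replays its last observations during EndEpisode.
public class CachedObservationAgent : Agent
{
    readonly List<float> m_Cached = new List<float>();
    bool m_Ending; // true only while EndEpisode is running

    public override void CollectObservations(VectorSensor sensor)
    {
        if (!m_Ending)
        {
            // Normal step: refresh the cache from the current state.
            m_Cached.Clear();
            Vector3 p = transform.localPosition; // hypothetical observation
            m_Cached.Add(p.x);
            m_Cached.Add(p.y);
            m_Cached.Add(p.z);
        }

        // Normal step: sends fresh values. Extra call during EndEpisode:
        // sends the values that produced the current set of actions.
        foreach (float value in m_Cached)
        {
            sensor.AddObservation(value);
        }
    }

    public override void OnActionReceived(ActionBuffers actions)
    {
        // Hypothetical action handling: move along x by the first continuous action.
        float move = actions.ContinuousActions[0];
        transform.Translate(Vector3.right * move * Time.deltaTime);

        if (ShouldTerminate())
        {
            m_Ending = true;
            try
            {
                EndEpisode();
            }
            finally
            {
                m_Ending = false;
            }
        }
    }

    // Hypothetical termination rule: the agent fell below the floor.
    bool ShouldTerminate()
    {
        return transform.localPosition.y < 0f;
    }
}
```

    The flag is reset in a finally block so that a failure inside EndEpisode does not leave the agent permanently replaying stale observations.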
     
  4. christophergoy

    Hi, could you create a GitHub issue out of this? Thanks!