Hi there, I have a custom algorithm as well as a custom environment that uses MLAgents. I want to know whether signalling a reset should be handled on the Python side or the Unity side? For example my agent reaches the end of the episode, designated by it stepping the pre determined number of times inside the Unity environment. Should the proces then be: Agent reaches max number of steps EndEpisode() is called Agent is therefore in Terminal Steps on the Python side Python sees agent in terminal steps, sets done=True Python calls env.reset() env.reset() calls Academy.OnEnvironmentReset in the unity environment, which in turn calls the ResetScene() function Everything is reset, the next episode starts Or should it be Agent reaches max number of steps FixedUpdate contains a check like so (psuedocode): if agentStep >= maxEpisodeStep: EndEpisode() ResetScene() Python sees agent in TerminalSteps Python calls env.reset() 6 and 7 of the previous happens Essentially, my questions comes down to the handling of ResetScene() between python and unity. Should ResetScene() ONLY be called when env.reset() is done from Python's side? And env.reset() is called when an agent is in terminalSteps, which is determined by whether EndEpisode was called on the agent. I hope my question was somewhat clear.