Training slow down once EndEpisode() called manually

Discussion in 'ML-Agents' started by Aurel33, Jul 10, 2021.

  1. Aurel33 (joined Apr 12, 2018; 10 posts)

    Hi everyone!

    I'm facing a strange issue: training slows down once EndEpisode() starts being called manually.
    I'm posting some screenshots below.
    To explain the setup:
    I'm training an agent to reach 4 targets. The agent ends its episode after 5000 steps (automatic timeout) or when it has reached all 4 targets (a manual call to EndEpisode() in my code, before the automatic timeout).
    The training environment is cloned, so I have 10 agents learning in parallel in the same Unity scene.

    Here are the curves:
    From 0 to about 2M steps, the agent doesn't succeed in reaching the 4 targets, so the timeout hits, Episode Length remains constant, and EndEpisode() is never called manually. After 2M steps, the agent starts to reach its 4 targets, so EndEpisode() is called manually and Episode Length decreases.
    upload_2021-7-10_17-1-31.png
    Note that it takes about 1 hour to reach the first 2M steps.

    Then almost 4 hours for the next 2M steps:
    upload_2021-7-10_17-5-10.png

    And the slowdown remains constant: 4 hours for the following 2M steps.
    upload_2021-7-10_17-5-54.png


    Profiling further shows the same problem: CPU usage jumps inside "DecideAction" as soon as EndEpisode() starts being called:
    upload_2021-7-10_17-6-56.png


    upload_2021-7-10_17-8-36.png


    (OK, I have to split my message since I can't upload more than 5 images, to be continued... :))
     
  2. Aurel33 (joined Apr 12, 2018; 10 posts)

    ....

    Analysing the call stack, it turns out that the expensive function is the one underlined below:

    (Before EndEpisode manual calls)
    upload_2021-7-10_17-13-1.png

    (After EndEpisode manual calls)
    upload_2021-7-10_17-13-6.png



    My agent code seems to be "normal". Here is the FixedUpdate loop:
    Code (CSharp):
    protected void FixedUpdate()
    {
        if (_initStepCounter < 5)
        {
            // Let the agent settle for a few fixed steps after each reset.
            _initStepCounter++;
        }
        else
        {
            // Request a decision every 10 fixed steps.
            if (_stepCounter == 0)
            {
                RequestDecision();
            }

            // Fire the reward-compute event every fixed step (Default behavior type only).
            if (_BehaviorParameters.BehaviorType == BehaviorType.Default)
            {
                _EventManagerLocal.triggerEvent(E_EVENTS.EVT_AGENT_PA_REWARD_COMPUTE, new EVT_AGENT_PA_REWARD_COMPUTE_DATA
                {
                    _agent = this
                });
            }

            _stepCounter = (_stepCounter + 1) % 10;
        }
    }

    So I manually call RequestDecision() every 10 fixed steps and fire an event requesting the reward every FixedUpdate. (The _initStepCounter part gives the agent time to stabilize after each reset.)

    EndEpisode() is called from a Rewarder class. In this class, a function applies the reward to the agent whenever the EVT_AGENT_PA_REWARD_COMPUTE event is received. That function also sets a flag when the 4 targets are reached. In the Rewarder's FixedUpdate I check this flag; if it is true, I call EndEpisode() and reset the flag to false.
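
    For readers following along, the Rewarder pattern described above could look roughly like the sketch below. This is an illustration only, not the actual project code: the class layout, the OnRewardCompute method, its parameters, and the field names are all assumptions, and the wiring to the local event manager is left out. Only Agent.AddReward() and Agent.EndEpisode() are real ML-Agents calls.

    ```csharp
    using Unity.MLAgents;
    using UnityEngine;

    // Hypothetical sketch of the Rewarder described in the post (names are assumptions).
    public class Rewarder : MonoBehaviour
    {
        [SerializeField] private Agent _agent;   // the agent of this cloned environment

        private const int TargetCount = 4;       // the post mentions 4 targets
        private int _targetsReached;
        private bool _allTargetsReached;         // flag polled in FixedUpdate

        // Assumed to be invoked by the local handler of EVT_AGENT_PA_REWARD_COMPUTE.
        public void OnRewardCompute(float reward, bool reachedNewTarget)
        {
            _agent.AddReward(reward);

            if (reachedNewTarget)
            {
                _targetsReached++;
                if (_targetsReached >= TargetCount)
                {
                    _allTargetsReached = true;
                }
            }
        }

        private void FixedUpdate()
        {
            if (!_allTargetsReached)
            {
                return;
            }

            // Reset the bookkeeping first, then end the episode manually.
            _allTargetsReached = false;
            _targetsReached = 0;
            _agent.EndEpisode();
        }
    }
    ```

    With this layout the manual EndEpisode() call happens in the Rewarder's FixedUpdate rather than inside the event callback, which matches the description above; whether that ordering relative to the agent's own FixedUpdate matters for the slowdown is not established in this thread.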

    So nothing really special.

    Note that my event system is local to each cloned environment (an event fired in one training environment cannot be caught in another cloned environment).


    Note also that the slowdown doesn't appear when I train only one environment (without clones).


    Sorry for the long description. Does anyone have a clue?

    Thanks
     


  3. TreyK-47 (Unity Technologies; joined Oct 22, 2019; 1,796 posts)

    I'll see if the team has any insight!