Question: Training stops with RuntimeError: dictionary changed size during iteration

Discussion in 'ML-Agents' started by MarkTension, Jul 6, 2020.

  1. MarkTension

    MarkTension

    Joined:
    Aug 17, 2019
    Posts:
    42
    Hi, sometimes my training stops with this "dictionary changed size during iteration" error. Is anyone familiar with it? I'm using concurrent environments and ML-Agents Release 2.

    This is from the command prompt:
    File "c:\users\hello\desktop\project\ml-agents-release_2\ml-agents\mlagents\trainers\stats.py", line 344, in write_stats
    for key in StatsReporter.stats_dict[self.category]:
    RuntimeError: dictionary changed size during iteration

    This was from my build's debug log:
    Unable to save timers to file C:/Users/hello/Desktop/project/builds/7_3_2/agents2_Data\ML-Agents\Timers\Clay3D_timers.json
    (Filename: C:\buildslave\unity\build\Runtime/Export/Debug/Debug.bindings.h Line: 35)

    Any idea what's happening, or ways to stop this behavior?
     
  2. celion_unity

    celion_unity

    Unity Technologies

    Joined:
    Jun 12, 2019
    Posts:
    289
    The stats.py error is really weird. I understand that modifying a dictionary while iterating over it is bad, but I don't see how that could be happening here. Could you open a GitHub issue with the full call stack (and maybe some more info about your Python version)?

    The "Unable to save timers to file" message should be harmless.
     
  3. BotAcademy

    BotAcademy

    Joined:
    May 15, 2020
    Posts:
    32
    I had the same issue using Release 3 and the default 3DBall environment with SAC. I only changed max_steps to 1 million, and the error occurred at around 700k steps, so it should (hopefully) be easy to reproduce.
     
  4. celion_unity

    celion_unity

    Unity Technologies

    Joined:
    Jun 12, 2019
    Posts:
    289
    I tried a few times but can't reproduce the problem (3DBall, release 3, SAC, max_steps=1000000). Can you please post the full call stack of the error, the command line args you're using to run, and the output from "python --version"?
     
  5. celion_unity

    celion_unity

    Unity Technologies

    Joined:
    Jun 12, 2019
    Posts:
    289
    Still can't reproduce it, but I have a theory - I think StatsReporter is getting called from different threads simultaneously, so one thread causes a new key to be added (via add_stat or set_stat) while write_stats is being called.
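
    To illustrate the theory, here is a minimal sketch of how the error arises. It is not the ML-Agents code: the `stats` dictionary and the `iterate_while_adding` helper are hypothetical stand-ins, and the "other thread" is simulated by adding a key inside the loop so the failure is deterministic.

```python
# Hypothetical stand-in for StatsReporter.stats_dict[category]; not the real ML-Agents data.
stats = {"Environment/Cumulative Reward": [1.0], "Policy/Entropy": [0.5]}


def iterate_while_adding(d):
    """Iterate over the live dict while a new key is inserted mid-loop.

    The insertion stands in for another thread calling something like
    add_stat or set_stat while write_stats is still looping.
    """
    try:
        for key in d:
            # Simulated concurrent write: a brand-new key changes the dict's size.
            d["Losses/Value Loss"] = [0.1]
    except RuntimeError as err:
        return str(err)
    return None


message = iterate_while_adding(stats)
print(message)
```

    With real threads the insertion only occasionally lands in the middle of the loop, which would explain why the crash shows up late in training and is hard to reproduce on demand.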
     
  6. celion_unity

    celion_unity

    Unity Technologies

    Joined:
    Jun 12, 2019
    Posts:
    289
  7. MarkTension

    MarkTension

    Joined:
    Aug 17, 2019
    Posts:
    42
    Great! Happy it got solved
     
  8. celion_unity

    celion_unity

    Unity Technologies

    Joined:
    Jun 12, 2019
    Posts:
    289
    If this is causing a problem for training, and you're comfortable modifying the python code, a simpler workaround is to convert the loop in question to
    for key in tuple(StatsReporter.stats_dict[self.category].keys()):


    The fix will be in the next release, tentatively scheduled for next week.
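
    As a rough sketch of why the workaround helps, the example below loops over a tuple snapshot of the keys instead of the live dictionary. The `stats` dictionary and the `summarize` helper are hypothetical and only mimic the shape of the loop in stats.py; the mid-loop insertion again simulates a write from another thread.

```python
# Hypothetical stand-in for StatsReporter.stats_dict[category]; not the real ML-Agents data.
stats = {"Environment/Cumulative Reward": [1.0], "Policy/Entropy": [0.5]}


def summarize(d):
    """Visit the keys that existed when the loop started, tolerating new keys."""
    visited = []
    # tuple(...) copies the keys up front, so the loop no longer iterates the live dict.
    for key in tuple(d.keys()):
        visited.append(key)
        # Simulated concurrent write: adding a key here no longer breaks the loop.
        d.setdefault("Losses/Value Loss", []).append(0.1)
    return visited


visited = summarize(stats)
print(visited)
```

    Keys added after the snapshot are simply skipped until the next pass. Note that this only avoids the iteration error; it does not synchronize access to the values themselves, so a lock would be needed if consistent reads across threads matter.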
     