ML-Agents: Multiple environments (num-envs)

Discussion in 'ML-Agents' started by dracolytch, Jan 9, 2020.

  1. dracolytch

    Hey folks,

    I'm getting into training with multiple simultaneous environments. Right now I have 8 environments going, and it's only giving me a 2x speedup. CPU is at 15%, memory at 40%, and disk, network, and GPU usage are all negligible. Any thoughts as to what's causing the bottleneck here? Also, below is a screenshot of the reward: orange is 1 instance, blue is 8 instances. The training problem is the same in both runs. Any clues as to why it's gone all... saw-tooth-y?

    [Attached image: upload_2020-1-8_20-30-8.png]
     
  2. SmartMediaNL

    I'd like to know as well. On my system it seems only two cores (of 8 cores / 16 threads in total) are working hard. Increasing the number of environments does not help much: going from 8 to 16 only eats up more RAM, nothing more. CPU usage keeps bouncing between 25% and 33%, and the SSD is idle. I tried increasing the buffer sizes in the config file, but nothing seems to help. I noticed 2 instances of Python running, which would explain the 2-core load. I have no idea how to increase that to eight.
     
  3. C0dingschmuser

    That's because PyTorch is only configured to use at most 4 threads at once. You can change that in venv\Lib\site-packages\mlagents\torch_utils\cpu_utils.py, in "get_num_threads_to_use()".

    In the line "return max(min(num_cpus // 2, 4), 1) if num_cpus is not None else None", change 4 to the number of threads you want.
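    To see what the quoted expression evaluates to for different CPU counts, here is a minimal standalone sketch. It is not the ML-Agents source: the function name threads_to_use and the cap parameter are hypothetical, introduced only to expose the hard-coded 4 from the quoted line.

    ```python
    from typing import Optional


    def threads_to_use(num_cpus: Optional[int], cap: int = 4) -> Optional[int]:
        """Evaluate the expression quoted above for a given CPU count.

        `cap` is a hypothetical parameter standing in for the hard-coded 4
        in the quoted line; it is not part of ML-Agents.
        """
        if num_cpus is None:
            return None
        # Half the reported CPUs, at most `cap`, and never fewer than 1.
        return max(min(num_cpus // 2, cap), 1)


    if __name__ == "__main__":
        # Columns: reported CPUs, result with cap=4, result with cap=8.
        for cpus in (1, 2, 4, 8, 16):
            print(cpus, threads_to_use(cpus), threads_to_use(cpus, cap=8))
    ```

    With 16 reported CPUs the expression yields 4 threads, and raising the cap to 8 yields 8. On machines reporting 8 or fewer CPUs, the "num_cpus // 2" term already keeps the result at 4 or below, so changing the cap alone would not raise the thread count there.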
     
  4. ervteng_unity

    Unity Technologies

    A note about PyTorch and CPU threads: for the small networks we're using in ML-Agents, increasing the number of threads that PyTorch uses will increase your CPU usage, but it won't actually make training much faster o_O. This is because parallelizing small ops is less beneficial than parallelizing large ops (e.g. in the case of CNNs).

    As for the sawtooth problem: you're likely going to have to increase your summary frequency. It looks like many more short episodes are completing between each summary write because of the increase in environments.