Question: How to improve training speed?

Discussion in 'ML-Agents' started by TTrope, Mar 29, 2021.

  1. TTrope

    TTrope

    Joined:
    Dec 7, 2014
    Posts:
    20
    Hello,

I'm currently trying to improve the training time of my agent: a knight that needs to beat enemies. It has a discrete action space (movement, different abilities, roll).

I can see two types of enhancement for training speed: "methodology" and "compute power".
For methodology, I'm currently trying hyperparameter tuning, curriculum learning, and reward shaping. (If you see any others, I'm interested too :) )
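For concreteness, here is a minimal sketch of where those methodology knobs live in an ML-Agents trainer config (the YAML format used since Release 10 or so). The behavior name, curriculum parameter name, and all values below are made up — just starting points, not recommendations:

```yaml
behaviors:
  Knight:                      # hypothetical behavior name
    trainer_type: ppo
    hyperparameters:           # the usual hyperparameter-tuning targets
      batch_size: 1024
      buffer_size: 10240
      learning_rate: 3.0e-4
    network_settings:
      hidden_units: 128
      num_layers: 2
    reward_signals:            # reward shaping lives here and in C# AddReward calls
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 5.0e5

environment_parameters:
  enemy_count:                 # hypothetical curriculum parameter
    curriculum:
      - name: Lesson0
        completion_criteria:
          measure: reward
          behavior: Knight
          threshold: 0.8
        value: 1
      - name: Lesson1          # final lesson needs no completion criteria
        value: 3
```

The curriculum block gradually raises `enemy_count` as the agent's reward improves, which is usually easier to tune than hand-shaping rewards for the hardest setting directly.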

However, I'd also like to go faster on the compute side.
Unfortunately, ML-Agents doesn't really leverage the GPU, and on the CPU I can't really scale my training speed: if I add fields/agents, or increase num_envs, it doesn't matter much, because there is a bottleneck on the "training thread" — the one that collects experiences and trains the network.

Basically, my Unity instances take 3-4% CPU each, the Python processes associated with those take 1-2%, and one big Python process (likely the master one) takes 50-60% CPU. I guess that's the one doing the training.
(And the training speed doesn't improve on CPUs with more cores.)

How could I improve this, apart from getting a CPU with better single-thread performance?

Thank you!
     
    Last edited: Mar 30, 2021
  2. ruoping_unity

    ruoping_unity

    Unity Technologies

    Joined:
    Jul 10, 2020
    Posts:
    134
In terms of speeding up training with compute power, if you're not able to upgrade your hardware, then the only thing you can do is better utilize all the resources on your machine.

One thing I would push back on in your description is:
    > if I add fields/agents, or try to add multiple num_envs, it doesn't matter much because there is a bottleneck on the "training thread"
Though it depends on your machine setup and game settings, when training with reinforcement learning the bottleneck is generally the time to run the simulation and collect data, not the training itself. Running simulations and collecting data is exactly what the Unity instances taking 3-4% CPU are doing. They don't use much in resources, but it takes time to run many iterations to collect enough data for training. So what you can do is duplicate more training fields or run with a larger num_envs to parallelize data collection, and that usually helps a lot.

But if you're training with a really large network or a huge batch size, then you could get bottlenecked on the training thread. If you ever get to such a heavy training configuration, the best advice I can give is to leverage the GPU, or adjust your training settings to use fewer parameters.
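For reference, both of those levers are exposed on the `mlagents-learn` command line. The config path and run-id below are placeholders, and `--torch-device` is only available on recent (PyTorch-based) ML-Agents releases:

```shell
# Launch 8 parallel environment instances (headless) to parallelize data
# collection, and put the trainer network on the GPU if one is available.
# Paths and run-id are placeholders.
mlagents-learn config/knight.yaml --run-id=knight_01 \
    --num-envs=8 --no-graphics --torch-device=cuda
```

With multiple copies of the training field inside each environment as well, the experience throughput multiplies without touching the training thread.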
     
  3. TTrope

    TTrope

    Joined:
    Dec 7, 2014
    Posts:
    20
    Hello,
    Thanks for the clarifications.
I think in my game, the observation collection takes quite a lot of time.
However:
- with 64 hidden units × 2 hidden layers, a 10K-step training takes 60 seconds,
while
- with 256 hidden units × 2 layers, it takes 220 seconds.

That's more than 3 times longer. This isn't related to the data collection part, right?
In that case, what should I do to train faster?
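A back-of-the-envelope parameter count shows why widening the network hurts so much: the hidden-to-hidden weight matrix grows quadratically with the layer width. The observation and action sizes below are made up for illustration — only the ratio matters:

```python
# Rough parameter count of a 2-hidden-layer MLP policy, comparing the
# 64-unit and 256-unit networks from the timings above.
# obs_size / act_size are hypothetical; they barely affect the ratio.

def mlp_params(obs_size: int, hidden: int, layers: int, act_size: int) -> int:
    """Total weights + biases of an MLP: obs -> hidden x layers -> act."""
    sizes = [obs_size] + [hidden] * layers + [act_size]
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

small = mlp_params(obs_size=50, hidden=64, layers=2, act_size=10)
large = mlp_params(obs_size=50, hidden=256, layers=2, act_size=10)
print(small, large, round(large / small, 1))  # roughly a 10x parameter jump
```

So a ~10× jump in parameters only costs ~3.7× in wall time here, which suggests part of the loop (simulation and data collection) is unchanged and the extra time is indeed the training thread doing bigger matrix multiplies.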
     
    Last edited: Apr 20, 2021