
Feedback: More info needed in the docs regarding torch_settings: device:

Discussion in 'ML-Agents' started by NanushTol, Jan 26, 2023.

  1. NanushTol

    NanushTol

    Joined:
    Jan 9, 2018
    Posts:
    131
    EDIT:
    I found the reason, but couldn't find any reference to this in the documentation on the git repo. If any Unity dev sees this, could you please shed some light on it or add more info to the docs?
    So I'm changing the title tag to Feedback.
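
    For anyone who finds this later: the setting in question lives in the trainer configuration YAML. A rough sketch of where it goes (field names as I understand them from the ML-Agents training configuration docs; the behavior name below is just a placeholder):

    Code (YAML):
    torch_settings:
      device: cpu          # force training onto the CPU; "cuda" or "cuda:0" should select the GPU

    behaviors:
      MyBehavior:          # placeholder behavior name
        trainer_type: ppo
        # ...the usual hyperparameters...

    I believe the same thing can also be passed on the command line through mlagents-learn's --torch-device option.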

    ORIGINAL:
    I'm training with a server build (on my PC) with the --no-graphics command, but my GPU is still being utilized at 90%+.
    I don't have visual observations.
    I didn't set ML Agents to train on the GPU, and my agents are set up to use Burst.

    Is this normal?
    Is ML Agents training on the GPU automatically?
     
    Last edited: Jan 26, 2023
  2. hughperkins

    hughperkins

    Joined:
    Dec 3, 2022
    Posts:
    191
    I'm not sure why you're getting GPU usage on a dedicated server build, but one thing I have noticed is that the Update rate is really high on my own dedicated server builds (200-300 fps) and uses up all available CPU. I usually drop the Update rate to about 10 fps by setting Application.targetFrameRate, e.g. see https://github.com/hughperkins/peac...76a46eb53457e64/PeacefulPie/Simulation.cs#L87 (this code also shows one way to detect when running as a dedicated server, i.e. https://github.com/hughperkins/peac...76a46eb53457e64/PeacefulPie/Simulation.cs#L61 )
     
  3. NanushTol

    NanushTol

    Joined:
    Jan 9, 2018
    Posts:
    131
  4. hughperkins

    hughperkins

    Joined:
    Dec 3, 2022
    Posts:
    191
    Ah, torch is the library that's used for running the neural network. You probably want that to be using the gpu, if you have one: it should run faster.

    I'm not sure what information you are looking for here, but e.g. the pytorch doc on device is here: https://pytorch.org/docs/stable/tensor_attributes.html#torch.device. I think that setting it to null to make it use the gpu is probably a Unity thing (normally torch defaults to cpu, I think, but defaulting to gpu definitely makes sense for best performance).
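
    In plain pytorch it looks something like this (just an illustration of what the device attribute controls, not ML-Agents internals):

    Code (Python):
    import torch

    # plain pytorch defaults to the cpu unless you explicitly ask for cuda
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # the model and its input tensors have to live on the same device
    model = torch.nn.Linear(8, 2).to(device)
    x = torch.randn(4, 8, device=device)
    print(model(x).device)  # cuda:0 if a gpu was found, otherwise cpu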
     
  5. NanushTol

    NanushTol

    Joined:
    Jan 9, 2018
    Posts:
    131
  6. hughperkins

    hughperkins

    Joined:
    Dec 3, 2022
    Posts:
    191
    Depends on the size of your network. But yeah, with a few rays as input, and using a small stack of Linear layers for the network, gpu is not going to change much.

    If you start feeding images into your network, and you start using convolutional layers, then gpu becomes more useful.

    Not that you will get better results using images - in fact, everything will just learn much more slowly - but it depends on what you are trying to do.
     
  7. hughperkins

    hughperkins

    Joined:
    Dec 3, 2022
    Posts:
    191
    Last edited: Jan 27, 2023
  8. NanushTol

    NanushTol

    Joined:
    Jan 9, 2018
    Posts:
    131
    I'm updating this again because I did some tests to see whether it is faster with the CPU or GPU setting.
    I noticed a significant increase in performance with the GPU; the CPU setting was much slower to update the policy (I didn't save statistics).

    Just for reference, the agent network size is 400 units & 2 hidden layers.
    It has a total of 649 observation inputs,
    and outputs 1 bool & 2 Vector3 values.
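
    For scale, the body of that network is roughly the following in plain pytorch terms (just a sketch; the actual ML-Agents policy adds normalization, value and action heads, a different activation, etc.):

    Code (Python):
    import torch.nn as nn

    # roughly "649 observation inputs, 2 hidden layers of 400 units"
    body = nn.Sequential(
        nn.Linear(649, 400),
        nn.ReLU(),
        nn.Linear(400, 400),
        nn.ReLU(),
    )
    continuous_head = nn.Linear(400, 6)  # 2x Vector3 = 6 continuous outputs
    discrete_head = nn.Linear(400, 2)    # the bool output as a 2-way discrete branch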
     
  9. hughperkins

    hughperkins

    Joined:
    Dec 3, 2022
    Posts:
    191
    Interesting. Glad that GPU does work on mlagents PPO :)

    My own experience is that for small networks, yes, the learning phase of the policy will be slightly faster on GPU than on CPU, but the difference is still fairly small relative to the time it takes to run the game. With Nature-CNN sized networks, e.g. https://github.com/DLR-RM/stable-ba...ble_baselines3/common/torch_layers.py#L85-L92 , running on CPU becomes prohibitively slow (3-5 minute pauses per learning phase...), and GPU becomes vastly preferable.
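
    For reference, the Nature CNN in that link is roughly the following (paraphrased from the stable-baselines3 source, for 84x84 Atari-style inputs with 4 stacked frames):

    Code (Python):
    import torch.nn as nn

    nature_cnn = nn.Sequential(
        nn.Conv2d(4, 32, kernel_size=8, stride=4),
        nn.ReLU(),
        nn.Conv2d(32, 64, kernel_size=4, stride=2),
        nn.ReLU(),
        nn.Conv2d(64, 64, kernel_size=3, stride=1),
        nn.ReLU(),
        nn.Flatten(),
        nn.Linear(64 * 7 * 7, 512),  # 3136 -> 512 for 84x84 inputs
        nn.ReLU(),
    )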
     
  10. hughperkins

    hughperkins

    Joined:
    Dec 3, 2022
    Posts:
    191
    (actually, re-reading this, I guess 649 inputs is quite a lot :) )