
Question: Why does training with the CPU use GPU VRAM and cause [CUDA out of memory]?

Discussion in 'ML-Agents' started by genesiz20898, Apr 11, 2022.

  1. genesiz20898

    genesiz20898

    Joined:
    Jul 12, 2021
    Posts:
    1
    Hello
    I tried using ML-Agents with the CPU (as the docs recommend, since CPU inference/training is normally faster than GPU inference).

    However, when I inspected my resources in Task Manager, it showed that my GPU VRAM was being used by ML-Agents.

    I have no problem with that when the hidden units for the neural network are set to 256. However, if I increase them to 512, ML-Agents uses up all my GPU VRAM and produces this error:

    RuntimeError: CUDA out of memory. Tried to allocate 512.00 MiB (GPU 0; 6.00 GiB total capacity; 4.61 GiB already allocated; 24.62 MiB free; 4.61 GiB reserved in total by PyTorch)

    Why does CPU inference/training require my GPU VRAM and lead to that error? Is there any way to solve it?
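    For reference, a quick way to see whether the installed PyTorch build can see the GPU at all (a minimal snippet, nothing ML-Agents specific, just standard PyTorch calls):

        import torch

        print(torch.__version__)             # e.g. 1.8.2+cu111 (CUDA build) or 1.8.2+cpu
        print(torch.cuda.is_available())     # True means PyTorch can use the GPU and will allocate VRAM when asked to
        if torch.cuda.is_available():
            print(torch.cuda.get_device_name(0))   # e.g. GeForce GTX 1060 6GB

    My guess (I have not checked the trainer code, so this is just an assumption) is that ML-Agents defaults to CUDA whenever a CUDA-enabled PyTorch build is installed, which would explain the VRAM usage even though the docs recommend CPU.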

    I tried uninstalling PyTorch with CUDA (torch==1.8.2+cu111) and reinstalling the version without CUDA (torch==1.8.2+cpu). With the CPU-only PyTorch, ML-Agents no longer uses my GPU VRAM, but the training time per step increased about 5x (I don't know whether that is normal or not, since the docs say that CPU inference is normally faster than GPU inference).
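    (A hedged side note: recent ML-Agents releases also expose a torch device option, so it may be possible to keep the CUDA build of PyTorch installed and still force training onto the CPU with something like:

        mlagents-learn config.yaml --run-id=MyRun --torch-device cpu

    where MyRun is just a placeholder run id and config.yaml stands for the config file below. Check your release's documentation to confirm the option is available in the version you are using.)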

    Here are my Behavior Parameters settings:
    [screenshot attachment: upload_2022-4-11_12-54-40.png]

    And here is my config file:
    [screenshot attachment: upload_2022-4-11_12-58-51.png]

    My system is Windows 10, Intel i5-6500 CPU with 16 GB RAM, and an NVIDIA GeForce GTX 1060 6GB.

    Maybe I did something wrong? I am really new to this area and really need some help. Thank you.
     
    Last edited: Apr 12, 2022
  2. xcao65

    xcao65

    Unity Technologies

    Joined:
    Jan 13, 2022
    Posts:
    2
  3. GamerLordMat

    GamerLordMat

    Joined:
    Oct 10, 2019
    Posts:
    185
    Hello,

    Stupid question, but are you mixing up inference and training? (GPU inference has nothing to do with the training process.) Or are you training while also running inference?
    If you have a camera sensor, I think it uses your graphics card for rendering anyway.