
Connection error running 3dball example

Discussion in 'ML-Agents' started by guidosalimbeni, May 30, 2020.

  1. guidosalimbeni

    guidosalimbeni

    Joined:
    Nov 10, 2017
    Posts:
    17
    Hi,
    I updated to release v1 of ML-Agents and followed the instructions to run the 3D Ball example. I get the following error message when I try to start training with mlagents-learn, and I wonder if you can help me.



    (mlagents) C:\python-envs\mlagents\Scripts>mlagents-learn D:\ml-agents-release_1\ml-agents-release_1\config\trainer_config.yaml --run-id=AVACA06
    2020-05-30 13:13:41.423464: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
    2020-05-30 13:13:41.428574: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
    WARNING:tensorflow:From c:\python-envs\mlagents\lib\site-packages\tensorflow\python\compat\v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
    Instructions for updating:
    non-resource variables are not supported in the long term


    [Unity logo ASCII art]


    Version information:
    ml-agents: 0.16.1,
    ml-agents-envs: 0.16.1,
    Communicator API: 1.0.0,
    TensorFlow: 2.2.0
    2020-05-30 13:13:44.102752: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
    2020-05-30 13:13:44.109765: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
    WARNING:tensorflow:From c:\python-envs\mlagents\lib\site-packages\tensorflow\python\compat\v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
    Instructions for updating:
    non-resource variables are not supported in the long term
    2020-05-30 13:13:45 INFO [environment.py:201] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
    2020-05-30 13:13:57 INFO [environment.py:111] Connected to Unity environment with package version 1.0.0-preview and communication version 1.0.0
    2020-05-30 13:13:57 INFO [environment.py:342] Connected new brain:
    3DBall?team=0
    2020-05-30 13:13:57.199867: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
    2020-05-30 13:13:57.214726: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x8ad48db2f0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
    2020-05-30 13:13:57.220712: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
    2020-05-30 13:13:57.254463: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
    2020-05-30 13:13:57.464626: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
    pciBusID: 0000:01:00.0 name: GeForce GTX 860M computeCapability: 5.0
    coreClock: 1.0195GHz coreCount: 5 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 74.65GiB/s
    2020-05-30 13:13:57.475064: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
    2020-05-30 13:13:57.482419: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cublas64_10.dll'; dlerror: cublas64_10.dll not found
    2020-05-30 13:13:57.490187: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found
    2020-05-30 13:13:57.497016: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found
    2020-05-30 13:13:57.504378: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cusolver64_10.dll'; dlerror: cusolver64_10.dll not found
    2020-05-30 13:13:57.510779: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cusparse64_10.dll'; dlerror: cusparse64_10.dll not found
    2020-05-30 13:13:57.516380: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudnn64_7.dll'; dlerror: cudnn64_7.dll not found
    2020-05-30 13:13:57.523382: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1598] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
    Skipping registering GPU devices...
    2020-05-30 13:13:57.535085: F tensorflow/stream_executor/cuda/cuda_driver.cc:391] Check failed: CUDA_SUCCESS == cuDevicePrimaryCtxGetState(device, &former_primary_context_flags, &former_primary_context_is_active) (0 vs. 303)
    Process Process-1:
    Traceback (most recent call last):
      File "C:\Users\Proprietario\AppData\Local\Programs\Python\Python37\lib\multiprocessing\connection.py", line 312, in _recv_bytes
        nread, err = ov.GetOverlappedResult(True)
    BrokenPipeError: [WinError 109] The pipe has been ended

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "C:\Users\Proprietario\AppData\Local\Programs\Python\Python37\lib\multiprocessing\process.py", line 297, in _bootstrap
        self.run()
      File "C:\Users\Proprietario\AppData\Local\Programs\Python\Python37\lib\multiprocessing\process.py", line 99, in run
        self._target(*self._args, **self._kwargs)
      File "c:\python-envs\mlagents\lib\site-packages\mlagents\trainers\subprocess_env_manager.py", line 151, in worker
        req: EnvironmentRequest = parent_conn.recv()
      File "C:\Users\Proprietario\AppData\Local\Programs\Python\Python37\lib\multiprocessing\connection.py", line 250, in recv
        buf = self._recv_bytes()
      File "C:\Users\Proprietario\AppData\Local\Programs\Python\Python37\lib\multiprocessing\connection.py", line 321, in _recv_bytes
        raise EOFError
    EOFError
     
  2. TreyK-47

    TreyK-47

    Unity Technologies

    Joined:
    Oct 22, 2019
    Posts:
    1,820
    I'll forward this to the team for some guidance.
     
  3. ilaydanil

    ilaydanil

    Joined:
    May 17, 2020
    Posts:
    20
    Hi, were you able to solve this? I get the same error. I even disabled the firewall, but it still did not work.
     
  4. ervteng_unity

    ervteng_unity

    Unity Technologies

    Joined:
    Dec 6, 2018
    Posts:
    150
    Hi, this appears to be a CUDA/TensorFlow error and not related to ML-Agents. The install guide is here: https://www.tensorflow.org/install/gpu

    If the install is successful, you should be able to open a Python terminal (type python), import TensorFlow using import tensorflow as tf, and open a new session by typing sess = tf.Session(). On TensorFlow 2.x, which the log above reports (2.2.0), tf.Session is no longer available at the top level, so type sess = tf.compat.v1.Session() instead.
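
The check described in the reply above could be scripted roughly as follows. This is only a sketch under stated assumptions, not an official ML-Agents or TensorFlow tool: the DLL names are copied from the warnings in the log, the helper names missing_dlls and open_session are made up for illustration, and it assumes the CUDA libraries are expected to sit in a directory on PATH, as the TensorFlow GPU install guide describes for Windows.

```python
import os

# DLL names copied from the "Could not load dynamic library" warnings in the log.
CUDA_DLLS = [
    "cudart64_101.dll",
    "cublas64_10.dll",
    "cufft64_10.dll",
    "curand64_10.dll",
    "cusolver64_10.dll",
    "cusparse64_10.dll",
    "cudnn64_7.dll",
]


def missing_dlls(names, search_path=None):
    """Return the names that are not found in any directory on search_path.

    search_path is an os.pathsep-separated string; it defaults to PATH.
    (Hypothetical helper, written for this example.)
    """
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    dirs = [d for d in search_path.split(os.pathsep) if d]
    return [
        name
        for name in names
        if not any(os.path.isfile(os.path.join(d, name)) for d in dirs)
    ]


def open_session():
    """Open a TF1-style session on either TensorFlow 1.x or 2.x.

    tf.Session exists only on 1.x; on 2.x the same class is available
    as tf.compat.v1.Session. (Hypothetical helper, written for this example.)
    """
    import tensorflow as tf  # imported lazily so the DLL check works without it

    session_cls = getattr(tf, "Session", None) or tf.compat.v1.Session
    return session_cls()


if __name__ == "__main__":
    print("CUDA DLLs not found on PATH:", missing_dlls(CUDA_DLLS))
```

Running the script inside the mlagents environment lists which of the libraries from the log are still not on PATH. Calling open_session() afterwards from a Python prompt repeats the session test from the reply. If the session opens without the fatal CUDA check failure, the TensorFlow install is probably in a state that mlagents-learn can use.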