BrokenPipeError and EOFError while training

Discussion in 'ML-Agents' started by ninazrdl, Apr 18, 2020.

  1. ninazrdl

    Joined:
    Feb 8, 2020
    Posts:
    3
    Training was running fine and the first few episodes completed normally, but then it suddenly threw the errors below.
    I ran this command from Anaconda Prompt, inside a virtual env:
    mlagents-learn trainer_config.yaml --env=UnityProject.exe --run-id=1 --train

    Here is my log:
    2020-04-18 07:53:22 INFO [trainer.py:214] 2: Brain: Step: 110000. Time Elapsed: 6598.126 s Mean Reward: 1.000. Std of Reward: 0.000. Training.
    (UnityMLAVnv) (base) C:\Users\User\Desktop>Process Process-1:
    Traceback (most recent call last):
    File "c:\programdata\anaconda3\lib\multiprocessing\connection.py", line 312, in _recv_bytes
    nread, err = ov.GetOverlappedResult(True)
    BrokenPipeError: [WinError 109] The pipe has been ended.
    During handling of the above exception, another exception occurred:
    Traceback (most recent call last):
    File "c:\programdata\anaconda3\lib\multiprocessing\process.py", line 297, in _bootstrap
    self.run()
    File "c:\programdata\anaconda3\lib\multiprocessing\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
    File "c:\users\user\unitymlavnv\lib\site-packages\mlagents\trainers\subprocess_env_manager.py", line 120, in worker
    cmd: EnvironmentCommand = parent_conn.recv()
    File "c:\programdata\anaconda3\lib\multiprocessing\connection.py", line 250, in recv
    buf = self._recv_bytes()
    File "c:\programdata\anaconda3\lib\multiprocessing\connection.py", line 321, in _recv_bytes
    raise EOFError
    EOFError
    Environment:
    - OS: Windows 10
    - ML-Agents 0.15.1
    - TensorFlow 2.0.1
     
  2. TreyK-47

    Unity Technologies

    Joined:
    Oct 22, 2019
    Posts:
    1,822
    I'll forward this for the team to review. Which versions of C# & Python are you running?
     
  3. harpj

    Unity Technologies

    Joined:
    Jun 20, 2017
    Posts:
    6
    Hi @ninazrdl, it seems the training process shut down without cleanly shutting down the Unity environment worker. A few questions:

    • Was this the end of training / had it reached the maximum number of steps?
    • Did you try to end training early?
    • Was the resulting .nn file written out?
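
    For context, the EOFError in the log is what Python's multiprocessing raises when recv() is called on a pipe whose other end has already been closed. Below is a minimal sketch of that mechanism, not the ML-Agents code: worker_loop, demo, and the "step" and "close" commands are hypothetical names chosen for illustration, and both ends of the pipe live in one process to keep the example short.

    ```python
    from multiprocessing import Pipe


    def worker_loop(conn):
        """Handle commands until told to close or until the other end disappears."""
        handled = []
        while True:
            try:
                cmd = conn.recv()
            except EOFError:
                # The other end was closed without ever sending "close".
                return handled, "parent closed the pipe"
            if cmd == "close":
                return handled, "clean shutdown"
            handled.append(cmd)


    def demo(send_close):
        """Send one command, optionally a clean "close", then drop the parent end."""
        parent_conn, child_conn = Pipe()
        try:
            parent_conn.send("step")
            if send_close:
                parent_conn.send("close")
            parent_conn.close()
            return worker_loop(child_conn)
        finally:
            child_conn.close()


    if __name__ == "__main__":
        print(demo(send_close=True))
        print(demo(send_close=False))
    ```

    In the second call the loop ends on the EOFError branch, which is the situation the traceback suggests: the worker was still waiting for a command when the trainer side went away.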
     
  4. maximilian_92

    Joined:
    May 22, 2020
    Posts:
    3
    I have the same issue. Have you figured out how to solve it? Thanks!