Couldn't connect to trainer on port 5004 using API version 0.15.0

Discussion in 'ML-Agents' started by tapleya, Apr 20, 2020.

  1. tapleya

    Joined: Oct 24, 2018
    Posts: 2
    Copying this over from the GitHub issue page, per request.

    Describe the bug
    I am getting the above error when trying to train my own custom environment.

    To Reproduce
    I have been getting this error numerous times, even after cloning a new repo and reinstalling.

    I cloned the repo (latest release) onto my laptop and installed the dependencies into an Anaconda env, following the development instructions in the installation doc. I then created my own environment in Unity and was able to train it with no issues. Next, I wanted to save the images from my visual observations to my laptop, so I installed OpenCV with "conda install opencv" and added a line to base_env.py (in ml-agents-envs) to write the images out. When I tried to start training again, I got the error. Even after deleting that line from base_env.py, I still get the error and cannot train.

    In the past, I have deleted the repo and recloned it, and I am usually able to run the examples without any issues. Usually, once I run my own environment, I run into the issue where the trainer cannot connect. I was finally able to get it to run this morning without the issue (somehow) but it came back once I attempted to add a line to the file.

    Console logs / stack traces

    [image: log excerpt]
    This is from the ~/Library/Logs/Unity/upm.log file.

    Screenshots

    [image: screenshot]
    This is what shows up when I run mlagents-learn.

    [image: screenshot]
    This is what I get in the console inside of Unity.

    [image: screenshot]
    This is the error I get on the terminal side.

    Environment (please complete the following information):

    • OS + version: macOS 10.15.3 (Catalina)
    • ML-Agents version: latest release (0.15.1)
    • TensorFlow version: 2.0.1
    • Environment: 3D-Ball, or any environment actually


    Whenever I get the error, deleting the repo from my machine, re-cloning it from GitHub, and re-installing the dependencies from the ml-agents-envs and ml-agents directories makes it work again. However, every time I make a change to my project, the error comes back and I have to redo everything. It looks like an installation error, but I am following the directions in the installation docs each time. Has anyone had a similar issue? I'm installing the dependencies with 'pip3 install -e ./ml-agents-envs' and 'pip3 install -e ./ml-agents' rather than installing just ml-agents, since those are the instructions for anyone planning to make changes within the repo.
     
  2. anupambhatnagar

    Unity Technologies
    Joined: Jun 22, 2017
    Posts: 4
    Can you share the complete command you use to launch ml-agents?
     
  3. tapleya

    Joined: Oct 24, 2018
    Posts: 2
    So I first clone the repo onto my machine with

    git clone --branch latest_release https://github.com/Unity-Technologies/ml-agents.git

    I then run

    python -m venv ./<env_name>
    source ./<env_name>/bin/activate
    pip3 install --upgrade pip
    pip3 install --upgrade setuptools

    to set up my virtual environment. Once in here, I cd into the ml-agents repo and run

    pip3 install -e ./ml-agents-envs/
    pip3 install -e ./ml-agents/

    At this point, I am able to train my model in my environment. However, once I make a small change to the script or the .yaml file or anything related to the environment, it breaks and I get the error.
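One thing that may be worth ruling out when the error comes back is whether anything is actually listening on port 5004 at the moment Play is pressed in the Editor. The sketch below is only a diagnostic idea, not part of ML-Agents: `trainer_is_listening` is a hypothetical helper built on the Python standard library, and it assumes the trainer runs on the same machine (127.0.0.1) and uses the port named in the error message.

```python
import socket


def trainer_is_listening(port: int = 5004,
                         host: str = "127.0.0.1",
                         timeout_s: float = 1.0) -> bool:
    """Return True if some process accepts TCP connections on host:port.

    Hypothetical helper for diagnosis only; 5004 is the port from the
    error message, and the host is assumed to be the local machine.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout_s)
        # connect_ex returns 0 on success and an errno value otherwise,
        # so a refused connection does not raise an exception.
        return probe.connect_ex((host, port)) == 0
```

Calling `trainer_is_listening()` from a second terminal while mlagents-learn is waiting for the Editor should return True; if it returns False, the trainer has probably exited or timed out before Unity tried to connect.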
     
  4. Yunkis

    Joined: Apr 20, 2020
    Posts: 1
    I had the same problem. In my case the cause was that my laptop is slow, so after I press Play it takes too long for the scene to actually start playing.
    The workaround is to make ML-Agents wait longer for a reply from Unity. I went into the ML-Agents code and disabled the timeout. That worked for me.
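To illustrate why a longer wait helps on a slow machine, here is a minimal, generic sketch of a listener that gives up after a configurable timeout. It is not the ML-Agents implementation; `wait_for_client` and its parameters are hypothetical names, and the code only uses the Python standard library. Passing `None` as the timeout corresponds to disabling the timeout altogether.

```python
import socket
from typing import Optional


def wait_for_client(port: int, timeout_s: Optional[float]) -> bool:
    """Listen on 127.0.0.1:port and report whether a client connected in time.

    Hypothetical illustration only. timeout_s=None waits indefinitely,
    which mirrors "disabling the timeout".
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        # Allow quick restarts on the same port between runs.
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind(("127.0.0.1", port))
        server.listen(1)
        server.settimeout(timeout_s)
        try:
            conn, _addr = server.accept()
        except socket.timeout:
            # The client (e.g. a slow-starting Editor) did not connect in time.
            return False
        conn.close()
        return True
```

With a short timeout, a client that needs longer to start is reported as a failure even though nothing is wrong with the installation; raising the timeout gives it time to connect. Rather than editing the library source, it may be worth checking whether your installed version of the Python API already exposes a wait-timeout setting on the environment class (I believe `UnityEnvironment` has a `timeout_wait` argument, but treat that as an assumption and verify it against your version).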