Training stops immediately with error: SubprocessEnvManager had workers that didn't signal shutdown

Discussion in 'ML-Agents' started by Hjalte, Mar 23, 2022.

  1. Hjalte

    Joined:
    Apr 4, 2014
    Posts:
    3
    Hi,

    When I run 'mlagents-learn --force', I get the Unity logo as expected, but once I press Play, it loads for a couple of seconds and then gives me an error (see the full printout below).

    I am using the 'ML Agents 2.2.1-exp.1' package.
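
    Since I am not passing a trainer configuration file, mlagents-learn falls back to its built-in defaults (hence the "A default configuration will be used" warning in the printout below). For reference, an explicit config for my behavior would presumably look something like the sketch below, mirroring the default hyperparameters from the printout; the file name and layout are just an example:

    Code (yaml):
    # Hypothetical trainer config (MoveToGoal.yaml) for the MoveToGoal behavior.
    # Values mirror the defaults that mlagents-learn prints when no config is given.
    behaviors:
      MoveToGoal:
        trainer_type: ppo
        hyperparameters:
          batch_size: 1024
          buffer_size: 10240
          learning_rate: 0.0003
        network_settings:
          hidden_units: 128
          num_layers: 2
        reward_signals:
          extrinsic:
            gamma: 0.99
            strength: 1.0
        max_steps: 500000
        time_horizon: 64
        summary_freq: 50000
    It would then be passed on the command line, e.g. 'mlagents-learn MoveToGoal.yaml --force'.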

    Full printout:
    Code (CSharp):
    (venv) C:\Users\hjalte\Desktop\Szrot\Szrot ML>mlagents-learn --force

        [Unity ML-Agents ASCII-art logo]

    Version information:
      ml-agents: 0.28.0,
      ml-agents-envs: 0.28.0,
      Communicator API: 1.5.0,
      PyTorch: 1.7.1+cu110
    [INFO] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
    [INFO] Connected to Unity environment with package version 2.2.1-exp.1 and communication version 1.5.0
    [INFO] Connected new brain: MoveToGoal?team=0
    [WARNING] Behavior name MoveToGoal does not match any behaviors specified in the trainer configuration file. A default configuration will be used.
    [WARNING] Deleting TensorBoard data events.out.tfevents.1648050845.DESKTOP-FHQ6GMJ.25336.0 that was left over from a previous run.
    [INFO] Hyperparameters for behavior name MoveToGoal:
            trainer_type:   ppo
            hyperparameters:
              batch_size:   1024
              buffer_size:  10240
              learning_rate:        0.0003
              beta: 0.005
              epsilon:      0.2
              lambd:        0.95
              num_epoch:    3
              learning_rate_schedule:       linear
              beta_schedule:        linear
              epsilon_schedule:     linear
            network_settings:
              normalize:    False
              hidden_units: 128
              num_layers:   2
              vis_encode_type:      simple
              memory:       None
              goal_conditioning_type:       hyper
              deterministic:        False
            reward_signals:
              extrinsic:
                gamma:      0.99
                strength:   1.0
                network_settings:
                  normalize:        False
                  hidden_units:     128
                  num_layers:       2
                  vis_encode_type:  simple
                  memory:   None
                  goal_conditioning_type:   hyper
                  deterministic:    False
            init_path:      None
            keep_checkpoints:       5
            checkpoint_interval:    500000
            max_steps:      500000
            time_horizon:   64
            summary_freq:   50000
            threaded:       False
            self_play:      None
            behavioral_cloning:     None
    [INFO] Exported results\ppo\MoveToGoal\MoveToGoal-0.onnx
    [INFO] Copied results\ppo\MoveToGoal\MoveToGoal-0.onnx to results\ppo\MoveToGoal.onnx.
    [ERROR] SubprocessEnvManager had workers that didn't signal shutdown
    [ERROR] A SubprocessEnvManager worker did not shut down correctly so it was forcefully terminated.
    Traceback (most recent call last):
      File "C:\Users\hjalte\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "C:\Users\hjalte\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "C:\Users\hjalte\Desktop\Szrot\Szrot ML\venv\Scripts\mlagents-learn.exe\__main__.py", line 7, in <module>
      File "C:\Users\hjalte\Desktop\Szrot\Szrot ML\venv\lib\site-packages\mlagents\trainers\learn.py", line 260, in main
        run_cli(parse_command_line())
      File "C:\Users\hjalte\Desktop\Szrot\Szrot ML\venv\lib\site-packages\mlagents\trainers\learn.py", line 256, in run_cli
        run_training(run_seed, options, num_areas)
      File "C:\Users\hjalte\Desktop\Szrot\Szrot ML\venv\lib\site-packages\mlagents\trainers\learn.py", line 132, in run_training
        tc.start_learning(env_manager)
      File "C:\Users\hjalte\Desktop\Szrot\Szrot ML\venv\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
        return func(*args, **kwargs)
      File "C:\Users\hjalte\Desktop\Szrot\Szrot ML\venv\lib\site-packages\mlagents\trainers\trainer_controller.py", line 176, in start_learning
        n_steps = self.advance(env_manager)
      File "C:\Users\hjalte\Desktop\Szrot\Szrot ML\venv\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
        return func(*args, **kwargs)
      File "C:\Users\hjalte\Desktop\Szrot\Szrot ML\venv\lib\site-packages\mlagents\trainers\trainer_controller.py", line 234, in advance
        new_step_infos = env_manager.get_steps()
      File "C:\Users\hjalte\Desktop\Szrot\Szrot ML\venv\lib\site-packages\mlagents\trainers\env_manager.py", line 124, in get_steps
        new_step_infos = self._step()
      File "C:\Users\hjalte\Desktop\Szrot\Szrot ML\venv\lib\site-packages\mlagents\trainers\subprocess_env_manager.py", line 417, in _step
        step: EnvironmentResponse = self.step_queue.get_nowait()
      File "C:\Users\hjalte\AppData\Local\Programs\Python\Python37\lib\multiprocessing\queues.py", line 126, in get_nowait
        return self.get(False)
      File "C:\Users\hjalte\AppData\Local\Programs\Python\Python37\lib\multiprocessing\queues.py", line 109, in get
        self._sem.release()
    ValueError: semaphore or lock released too many times
    I had some issues in the past, so I erased all of my Python and ML-Agents installs; this should be a perfectly clean install, and it's even an entirely new project.

    This is the agent I'm trying to train (it worked perfectly with ML-Agents 16.0):

    Code (CSharp):
    using Unity.MLAgents;
    using Unity.MLAgents.Actuators;
    using Unity.MLAgents.Sensors;
    using UnityEngine;

    public class TestAgent : Agent
    {
        public Transform target;
        public float speed = 5f;

        private MeshRenderer floorRender;
        private MaterialPropertyBlock mpb;

        private void Start()
        {
            floorRender = transform.parent.Find("Ground").GetComponent<MeshRenderer>();
            mpb = new MaterialPropertyBlock();
        }

        public override void OnEpisodeBegin()
        {
            base.OnEpisodeBegin();

            // Randomize the agent and target positions at the start of each episode.
            transform.localPosition = new Vector3(Random.Range(-3f, 3f), 0, Random.Range(-3f, 3f));
            target.localPosition    = new Vector3(Random.Range(-3f, 3f), 0, Random.Range(-3f, 3f));
        }

        public override void CollectObservations(VectorSensor sensor)
        {
            base.CollectObservations(sensor);

            // Two Vector3 observations = 6 floats total.
            sensor.AddObservation(transform.localPosition);
            sensor.AddObservation(target.localPosition);
        }

        public override void OnActionReceived(ActionBuffers actions)
        {
            base.OnActionReceived(actions);

            // Two continuous actions drive movement on the X and Z axes.
            float moveX = actions.ContinuousActions[0];
            float moveZ = actions.ContinuousActions[1];

            transform.localPosition += new Vector3(moveX, 0, moveZ) * Time.deltaTime * speed;
        }

        public override void Heuristic(in ActionBuffers actionsOut)
        {
            // Manual control for testing: map keyboard axes to the continuous actions.
            ActionSegment<float> contActions = actionsOut.ContinuousActions;
            contActions[0] = Input.GetAxisRaw("Horizontal");
            contActions[1] = Input.GetAxisRaw("Vertical");
        }

        private void OnTriggerEnter(Collider other)
        {
            // Punish hitting a wall, reward reaching the goal, and tint the floor as feedback.
            if (other.CompareTag("Wall"))
            {
                SetReward(-1f);
                mpb.SetColor("_Color", Color.red);
                floorRender.SetPropertyBlock(mpb);
                EndEpisode();
            }
            if (other.CompareTag("Goal"))
            {
                SetReward(1f);
                mpb.SetColor("_Color", Color.green);
                floorRender.SetPropertyBlock(mpb);
                EndEpisode();
            }
        }
    }
  2. KandepateJohnson

    Joined:
    Mar 4, 2022
    Posts:
    2
    Any update on this?