Question Error and failure of saving a model after training

Discussion in 'ML-Agents' started by dominik_rice, Oct 31, 2023.

  1. dominik_rice

    Hello!

    After installing ML-Agents, I was able to train a model. However, whenever training finishes, I get an error and the model is not saved. Can someone help me?

    My setup:
    macOS Sonoma 14.1 / M1
    onnx = 1.12.0
    protobuf = 3.19.6
    torch = 2.1.0
    python = 3.10.12
    mlagents = 1.0.0
    Unity = 2022.3.11f1

    Here is the error:

    Code (CSharp):
    [INFO] MoveToGoal. Step: 10000. Time Elapsed: 22.075 s. Mean Reward: -2.000. Std of Reward: 2.714. Training.
    [WARNING] Restarting worker[0] after 'Communicator has exited.'
    /Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/torch/__init__.py:614: UserWarning: torch.set_default_tensor_type() is deprecated as of PyTorch 2.1, please use torch.set_default_dtype() and torch.set_default_device() as alternatives. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/tensor/python_tensor.cpp:453.)
      _C._set_default_tensor_type(t)
    [INFO] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
    Traceback (most recent call last):
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents/trainers/trainer_controller.py", line 175, in start_learning
        n_steps = self.advance(env_manager)
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents_envs/timers.py", line 305, in wrapped
        return func(*args, **kwargs)
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents/trainers/trainer_controller.py", line 233, in advance
        new_step_infos = env_manager.get_steps()
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents/trainers/env_manager.py", line 124, in get_steps
        new_step_infos = self._step()
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents/trainers/subprocess_env_manager.py", line 420, in _step
        self._restart_failed_workers(step)
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents/trainers/subprocess_env_manager.py", line 328, in _restart_failed_workers
        self.reset(self.env_parameters)
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents/trainers/env_manager.py", line 68, in reset
        self.first_step_infos = self._reset_env(config)
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents/trainers/subprocess_env_manager.py", line 446, in _reset_env
        ew.previous_step = EnvironmentStep(ew.recv().payload, ew.worker_id, {}, {})
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents/trainers/subprocess_env_manager.py", line 101, in recv
        raise env_exception
    mlagents_envs.exception.UnityTimeOutException: The Unity environment took too long to respond. Make sure that :
        The environment does not need user interaction to launch
        The Agents' Behavior Parameters > Behavior Type is set to "Default"
        The environment and the Python interface have compatible versions.
        If you're running on a headless server without graphics support, turn off display by either passing --no-graphics option or build your Unity executable as server build.

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/torch/onnx/_internal/onnx_proto_utils.py", line 221, in _add_onnxscript_fn
        import onnx
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/onnx/__init__.py", line 5, in <module>
        from .onnx_cpp2py_export import ONNX_ML
    ImportError: dlopen(/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/onnx/onnx_cpp2py_export.cpython-310-darwin.so, 0x0002): Library not loaded: @rpath/libprotobuf.31.dylib
      Referenced from: <4A0F8F41-B487-3758-8FD3-BD8580182670> /Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/onnx/onnx_cpp2py_export.cpython-310-darwin.so
      Reason: tried: '/Users/.../opt/anaconda3/envs/ml-agents/lib/libprotobuf.31.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/Users/.../opt/anaconda3/envs/ml-agents/lib/libprotobuf.31.dylib' (no such file), '/Users/.../opt/anaconda3/envs/ml-agents/lib/libprotobuf.31.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/Users/.../opt/anaconda3/envs/ml-agents/lib/libprotobuf.31.dylib' (no such file), '/Users/.../opt/anaconda3/envs/mlagents/bin/../lib/libprotobuf.31.dylib' (no such file), '/Users/.../opt/anaconda3/envs/mlagents/bin/../lib/libprotobuf.31.dylib' (no such file)

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "/Users/.../opt/anaconda3/envs/mlagents/bin/mlagents-learn", line 8, in <module>
        sys.exit(main())
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents/trainers/learn.py", line 267, in main
        run_cli(parse_command_line())
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents/trainers/learn.py", line 263, in run_cli
        run_training(run_seed, options, num_areas)
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents/trainers/learn.py", line 137, in run_training
        tc.start_learning(env_manager)
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents_envs/timers.py", line 305, in wrapped
        return func(*args, **kwargs)
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents/trainers/trainer_controller.py", line 200, in start_learning
        self._save_models()
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents_envs/timers.py", line 305, in wrapped
        return func(*args, **kwargs)
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents/trainers/trainer_controller.py", line 80, in _save_models
        self.trainers[brain_name].save_model()
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents/trainers/trainer/rl_trainer.py", line 172, in save_model
        model_checkpoint = self._checkpoint()
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents_envs/timers.py", line 305, in wrapped
        return func(*args, **kwargs)
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents/trainers/trainer/rl_trainer.py", line 144, in _checkpoint
        export_path, auxillary_paths = self.model_saver.save_checkpoint(
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents/trainers/model_saver/torch_model_saver.py", line 60, in save_checkpoint
        self.export(checkpoint_path, behavior_name)
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents/trainers/model_saver/torch_model_saver.py", line 65, in export
        self.exporter.export_policy_model(output_filepath)
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/mlagents/trainers/torch_entities/model_serialization.py", line 164, in export_policy_model
        torch.onnx.export(
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/torch/onnx/utils.py", line 516, in export
        _export(
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/torch/onnx/utils.py", line 1670, in _export
        proto = onnx_proto_utils._add_onnxscript_fn(
      File "/Users/.../opt/anaconda3/envs/mlagents/lib/python3.10/site-packages/torch/onnx/_internal/onnx_proto_utils.py", line 223, in _add_onnxscript_fn
        raise errors.OnnxExporterError("Module onnx is not installed!") from e
    torch.onnx.errors.OnnxExporterError: Module onnx is not installed!
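
    Reading the traceback, the final "Module onnx is not installed!" message wraps an earlier ImportError: the onnx package is found, but its compiled extension fails to load libprotobuf.31.dylib. Below is a minimal sketch for reproducing that import on its own, outside of mlagents-learn. It assumes it is run with the Python interpreter of the same conda environment; check_import is a hypothetical helper name, not part of ML-Agents, PyTorch, or onnx.

```python
import importlib


def check_import(module_name):
    """Try to import a module and report the outcome as (ok, detail).

    check_import is a hypothetical helper written for this post; it is
    not part of ML-Agents, PyTorch, or onnx.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # Keep the original error text (for example a dlopen failure)
        # instead of collapsing it into "module is not installed".
        return False, f"{type(exc).__name__}: {exc}"
    # Not every module defines __version__, so fall back to a placeholder.
    version = getattr(module, "__version__", "unknown")
    return True, f"version {version}"


if __name__ == "__main__":
    # These are the two packages involved in the failing import chain above.
    for name in ("onnx", "google.protobuf"):
        ok, detail = check_import(name)
        status = "OK" if ok else "FAILED"
        print(f"{name}: {status} ({detail})")
```

    If the onnx line reports FAILED with the same dlopen message, the problem is reproducible without Unity or a training run, which narrows it down to the onnx/protobuf installation in that environment.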
     
  2. dominik_rice

    I found this thread, but the issue still occurs. It does not seem to be resolved...