issue with ml version 14

Discussion in 'ML-Agents' started by mamaorha, Mar 17, 2021.

  1. mamaorha

    Joined:
    Jun 16, 2015
    Posts:
    44
    After training for a while, I get this error in Python and Unity stops:
    Code (CSharp):
    Traceback (most recent call last):
      File "d:\python\python37\lib\site-packages\mlagents\trainers\trainer_controller.py", line 175, in start_learning
        n_steps = self.advance(env_manager)
      File "d:\python\python37\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
        return func(*args, **kwargs)
      File "d:\python\python37\lib\site-packages\mlagents\trainers\trainer_controller.py", line 250, in advance
        trainer.advance()
      File "d:\python\python37\lib\site-packages\mlagents\trainers\ghost\trainer.py", line 243, in advance
        self.trainer.advance()
      File "d:\python\python37\lib\site-packages\mlagents\trainers\trainer\rl_trainer.py", line 274, in advance
        self._process_trajectory(t)
      File "d:\python\python37\lib\site-packages\mlagents\trainers\ppo\trainer.py", line 67, in _process_trajectory
        super()._process_trajectory(trajectory)
      File "d:\python\python37\lib\site-packages\mlagents\trainers\trainer\rl_trainer.py", line 232, in _process_trajectory
        self._maybe_save_model(self.get_step + len(trajectory.steps))
      File "d:\python\python37\lib\site-packages\mlagents\trainers\trainer\rl_trainer.py", line 257, in _maybe_save_model
        self._checkpoint()
      File "d:\python\python37\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
        return func(*args, **kwargs)
      File "d:\python\python37\lib\site-packages\mlagents\trainers\trainer\rl_trainer.py", line 155, in _checkpoint
        checkpoint_path = self.model_saver.save_checkpoint(self.brain_name, self.step)
      File "d:\python\python37\lib\site-packages\mlagents\trainers\model_saver\torch_model_saver.py", line 56, in save_checkpoint
        torch.save(state_dict, os.path.join(self.model_path, "checkpoint.pt"))
      File "d:\python\python37\lib\site-packages\torch\serialization.py", line 369, in save
        with _open_file_like(f, 'wb') as opened_file:
      File "d:\python\python37\lib\site-packages\torch\serialization.py", line 230, in _open_file_like
        return _open_file(name_or_buffer, mode)
      File "d:\python\python37\lib\site-packages\torch\serialization.py", line 211, in __init__
        super(_open_file, self).__init__(open(name, mode))
    OSError: [Errno 22] Invalid argument: 'results\\one\\Balance\\checkpoint.pt'

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "d:\python\python37\lib\runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "d:\python\python37\lib\runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "D:\Python\Python37\Scripts\mlagents-learn.exe\__main__.py", line 7, in <module>
      File "d:\python\python37\lib\site-packages\mlagents\trainers\learn.py", line 250, in main
        run_cli(parse_command_line())
      File "d:\python\python37\lib\site-packages\mlagents\trainers\learn.py", line 246, in run_cli
        run_training(run_seed, options)
      File "d:\python\python37\lib\site-packages\mlagents\trainers\learn.py", line 125, in run_training
        tc.start_learning(env_manager)
      File "d:\python\python37\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
        return func(*args, **kwargs)
      File "d:\python\python37\lib\site-packages\mlagents\trainers\trainer_controller.py", line 200, in start_learning
        self._save_models()
      File "d:\python\python37\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
        return func(*args, **kwargs)
      File "d:\python\python37\lib\site-packages\mlagents\trainers\trainer_controller.py", line 80, in _save_models
        self.trainers[brain_name].save_model()
      File "d:\python\python37\lib\site-packages\mlagents\trainers\ghost\trainer.py", line 320, in save_model
        self.trainer.save_model()
      File "d:\python\python37\lib\site-packages\mlagents\trainers\trainer\rl_trainer.py", line 181, in save_model
        model_checkpoint = self._checkpoint()
      File "d:\python\python37\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
        return func(*args, **kwargs)
      File "d:\python\python37\lib\site-packages\mlagents\trainers\trainer\rl_trainer.py", line 155, in _checkpoint
        checkpoint_path = self.model_saver.save_checkpoint(self.brain_name, self.step)
      File "d:\python\python37\lib\site-packages\mlagents\trainers\model_saver\torch_model_saver.py", line 56, in save_checkpoint
        torch.save(state_dict, os.path.join(self.model_path, "checkpoint.pt"))
      File "d:\python\python37\lib\site-packages\torch\serialization.py", line 369, in save
        with _open_file_like(f, 'wb') as opened_file:
      File "d:\python\python37\lib\site-packages\torch\serialization.py", line 230, in _open_file_like
        return _open_file(name_or_buffer, mode)
      File "d:\python\python37\lib\site-packages\torch\serialization.py", line 211, in __init__
        super(_open_file, self).__init__(open(name, mode))
    OSError: [Errno 22] Invalid argument: 'results\\one\\Balance\\checkpoint.pt'
     
  2. TreyK-47

    Unity Technologies

    Joined:
    Oct 22, 2019
    Posts:
    1,822
    I'll bounce this off the team for some guidance! Hang tight.
     
  3. mamaorha

    Joined:
    Jun 16, 2015
    Posts:
    44
    I adjusted the rewards so that I no longer get large negative values, and the crash has not reproduced since. I'm not sure whether that is related or whether the crash was just a random occurrence.
     
  4. christophergoy

    Unity Technologies

    Joined:
    Sep 16, 2015
    Posts:
    735
    Hi, please post more information, such as:
    - Training configuration
    - Unity logs
    - Action/observation space
    - Actuators/sensors you are using
    - The command you used to invoke mlagents-learn

    Thank you!
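
Note for readers hitting the same error: the traceback in the first post fails inside open(name, 'wb') on the relative path results\one\Balance\checkpoint.pt. The cause of the Errno 22 is not established in this thread. As a rough diagnostic sketch only (not a fix), the hypothetical helper probe_checkpoint_path below tries the same kind of binary-write open on a sibling file and reports the errno it gets. It is not part of ML-Agents or PyTorch, and it assumes you run it from the directory where mlagents-learn was started.

```python
import errno
import os


def probe_checkpoint_path(path):
    """Check whether a file next to `path` can be opened with mode 'wb'.

    Hypothetical helper for illustration; not part of ML-Agents or PyTorch.
    Returns a tuple (ok, detail).
    """
    directory = os.path.dirname(os.path.abspath(path))
    if not os.path.isdir(directory):
        return False, "directory does not exist: " + directory

    # Write to a sibling file so an existing checkpoint is never touched.
    probe = path + ".probe"
    try:
        with open(probe, "wb") as handle:
            handle.write(b"ok")
    except OSError as exc:
        name = errno.errorcode.get(exc.errno, "unknown")
        return False, "open failed: errno={} ({}): {}".format(
            exc.errno, name, exc.strerror
        )

    os.remove(probe)
    return True, "writable"


if __name__ == "__main__":
    # Path components copied from the error message in the first post.
    target = os.path.join("results", "one", "Balance", "checkpoint.pt")
    print(probe_checkpoint_path(target))
```

If the probe succeeds while training still crashes intermittently, the path itself is probably fine and the details requested above (configuration, logs, command line) become more important for narrowing things down.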