
Question Pytorch error: Class values must be smaller than num_classes

Discussion in 'ML-Agents' started by reachout, Apr 8, 2021.

  1. reachout


    Joined:
    Jan 15, 2021
    Posts:
    10
    After about 20 seconds of training my agent, I get this error in the Anaconda command prompt. The Unity Editor and game keep running, but I don't know what kind of training is still being done. Either way, I'd like to find out what the error means. I tried performing different actions within the game to trigger it, but it seems to happen no matter what at around the 20-second mark.

    Even when I stop the game within the Editor, python.exe keeps running and has to be closed from Task Manager. The trace below is from after I stopped python; if I hadn't, it wouldn't get past the 'RuntimeError' line.

    I'm using ML-Agents 1.9.0, Windows 7 64-bit, Anaconda3, and Unity 2020.3.0f1. I'm also a noob.

    The full terminal trace is:

    Code (CSharp):
    (mlagents-r8) E:\Programming\Unity Hub\My projects\Rocket Ball wAI>mlagents-learn Configs\PlayRocketBallConfig.yaml --time-scale 0.02 --target-frame-rate 60 --run-id P2AIAtt12 --force

    (Unity logo)

    Version information:
      ml-agents: 0.25.0,
      ml-agents-envs: 0.25.0,
      Communicator API: 1.5.0,
      PyTorch: 1.8.0
    2021-04-08 14:15:08 INFO [learn.py:245] run_seed set to 7998
    2021-04-08 14:15:08 INFO [torch.py:58] default Torch device: cuda
    2021-04-08 14:15:10 INFO [environment.py:210] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
    2021-04-08 14:15:16 INFO [environment.py:112] Connected to Unity environment with package version 1.9.0-preview and communication version 1.5.0
    2021-04-08 14:15:16 INFO [environment.py:282] Connected new brain:
    P2AIAtt?team=0
    2021-04-08 14:15:16 WARNING [stats.py:237] events.out.tfevents.1617882358.TurName-PC.6608.0 was left over from a previous run. Deleting.
    2021-04-08 14:15:16 INFO [stats.py:186] Hyperparameters for behavior name P2AIAtt:
            trainer_type:   ppo
            hyperparameters:
              batch_size:   10
              buffer_size:  100
              learning_rate:        0.0003
              beta: 0.0005
              epsilon:      0.2
              lambd:        0.99
              num_epoch:    3
              learning_rate_schedule:       linear
            network_settings:
              normalize:    False
              hidden_units: 128
              num_layers:   2
              vis_encode_type:      simple
              memory:       None
            reward_signals:
              extrinsic:
                gamma:      0.99
                strength:   1.0
                network_settings:
                  normalize:        False
                  hidden_units:     128
                  num_layers:       2
                  vis_encode_type:  simple
                  memory:   None
              gail:
                gamma:      0.99
                strength:   0.5
                network_settings:
                  normalize:        False
                  hidden_units:     128
                  num_layers:       2
                  vis_encode_type:  simple
                  memory:   None
                learning_rate:      0.0003
                encoding_size:      None
                use_actions:        False
                use_vail:   False
                demo_path:  Demos\PlayRocketBall_0.demo
            init_path:      None
            keep_checkpoints:       5
            checkpoint_interval:    500000
            max_steps:      500000
            time_horizon:   64
            summary_freq:   10000
            threaded:       True
            self_play:      None
            behavioral_cloning:
              demo_path:    Demos\PlayRocketBall_0.demo
              steps:        0
              strength:     0.5
              samples_per_update:   0
              num_epoch:    None
              batch_size:   None
    Exception in thread Thread-2:
    Traceback (most recent call last):
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\threading.py", line 932, in _bootstrap_inner
        self.run()
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\threading.py", line 870, in run
        self._target(*self._args, **self._kwargs)
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\site-packages\mlagents\trainers\trainer_controller.py", line 297, in trainer_update_func
        trainer.advance()
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\site-packages\mlagents\trainers\trainer\rl_trainer.py", line 297, in advance
        if self._update_policy():
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\site-packages\mlagents\trainers\ppo\trainer.py", line 213, in _update_policy
        update_stats = self.optimizer.bc_module.update()
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\site-packages\mlagents\trainers\torch\components\bc\module.py", line 93, in update
        run_out = self._update_batch(mini_batch_demo, self.n_sequences)
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\site-packages\mlagents\trainers\torch\components\bc\module.py", line 175, in _update_batch
        bc_loss = self._behavioral_cloning_loss(
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\site-packages\mlagents\trainers\torch\components\bc\module.py", line 116, in _behavioral_cloning_loss
        one_hot_expert_actions = ModelUtils.actions_to_onehot(
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\site-packages\mlagents\trainers\torch\utils.py", line 275, in actions_to_onehot
        onehot_branches = [
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\site-packages\mlagents\trainers\torch\utils.py", line 276, in <listcomp>
        torch.nn.functional.one_hot(_act.T, action_size[i]).float()
    RuntimeError: Class values must be smaller than num_classes.
    2021-04-08 14:15:54 INFO [subprocess_env_manager.py:220] UnityEnvironment worker 0: environment stopping.
    2021-04-08 14:15:54 INFO [trainer_controller.py:187] Learning was interrupted. Please wait while the graph is generated.
    2021-04-08 14:15:56 INFO [model_serialization.py:183] Converting to results\P2AIAtt12\P2AIAtt\P2AIAtt-128.onnx
    2021-04-08 14:16:00 INFO [model_serialization.py:195] Exported results\P2AIAtt12\P2AIAtt\P2AIAtt-128.onnx
    2021-04-08 14:16:00 INFO [torch_model_saver.py:116] Copied results\P2AIAtt12\P2AIAtt\P2AIAtt-128.onnx to results\P2AIAtt12\P2AIAtt.onnx.
    2021-04-08 14:16:00 INFO [trainer_controller.py:81] Saved Model
    Update:
    If I keep playing in the Editor despite the error, the Editor freezes after about 2 minutes of play. If I kill python.exe again in Task Manager, I get control back, and the Anaconda prompt shows these additional errors, which begin right after the 'RuntimeError' line above:

    Code (CSharp):
    2021-04-08 17:40:54 ERROR [subprocess_env_manager.py:226] UnityEnvironment worker 0: environment raised an unexpected exception.
    Traceback (most recent call last):
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\multiprocessing\connection.py", line 312, in _recv_bytes
        nread, err = ov.GetOverlappedResult(True)
    BrokenPipeError: [WinError 109] The pipe has been ended

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\site-packages\mlagents\trainers\subprocess_env_manager.py", line 172, in worker
        req: EnvironmentRequest = parent_conn.recv()
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\multiprocessing\connection.py", line 250, in recv
        buf = self._recv_bytes()
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\multiprocessing\connection.py", line 321, in _recv_bytes
        raise EOFError
    EOFError
    Process Process-1:
    Traceback (most recent call last):
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\multiprocessing\connection.py", line 312, in _recv_bytes
        nread, err = ov.GetOverlappedResult(True)
    BrokenPipeError: [WinError 109] The pipe has been ended

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\site-packages\mlagents\trainers\subprocess_env_manager.py", line 172, in worker
        req: EnvironmentRequest = parent_conn.recv()
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\multiprocessing\connection.py", line 250, in recv
        buf = self._recv_bytes()
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\multiprocessing\connection.py", line 321, in _recv_bytes
        raise EOFError
    EOFError

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\multiprocessing\process.py", line 315, in _bootstrap
        self.run()
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\multiprocessing\process.py", line 108, in run
        self._target(*self._args, **self._kwargs)
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\site-packages\mlagents\trainers\subprocess_env_manager.py", line 232, in worker
        _send_response(EnvironmentCommand.ENV_EXITED, ex)
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\site-packages\mlagents\trainers\subprocess_env_manager.py", line 147, in _send_response
        parent_conn.send(EnvironmentResponse(cmd_name, worker_id, payload))
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\multiprocessing\connection.py", line 206, in send
        self._send_bytes(_ForkingPickler.dumps(obj))
      File "e:\programming\anaconda3\envs\mlagents-r8\lib\multiprocessing\connection.py", line 280, in _send_bytes
        ov, err = _winapi.WriteFile(self._handle, buf, overlapped=True)
    BrokenPipeError: [WinError 232] The pipe is being closed
    In the VSCode debugger, if I hit pause while the Editor is frozen, it takes me to line 90 of E:\Programming\Unity Hub\My projects\Rocket Ball wAI\Library\PackageCache\com.unity.ml-agents@1.9.0-preview\Runtime\Grpc\CommunicatorObjects\UnityToExternalGrpc.cs. The whole function is the following:
    Code (CSharp):
    public virtual global::Unity.MLAgents.CommunicatorObjects.UnityMessageProto Exchange(global::Unity.MLAgents.CommunicatorObjects.UnityMessageProto request, grpc::CallOptions options)
    {
        return CallInvoker.BlockingUnaryCall(__Method_Exchange, null, options, request);
    }
     
    Last edited: Apr 8, 2021
  2. reachout


    Joined:
    Jan 15, 2021
    Posts:
    10
    So, I have narrowed it down to the Demonstration Recorder and the buffer_size parameter. When I don't have behavioral_cloning or gail in the config file, I don't get the error, and when I increase buffer_size, the error takes longer to occur. I don't understand buffer sizes well enough to stop the error completely, though.
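    One thing worth checking, since both behavioral_cloning and gail read the .demo file: whether the recorded expert actions actually fit inside the branch sizes the behavior currently declares (e.g. if a Branch Size in BehaviorParameters was reduced after the demo was recorded, old demo actions would be out of range). A hypothetical sanity check, assuming two discrete branches of size 3 and made-up demo actions (not pulled from the real .demo file):

    ```python
    import torch

    # Hypothetical branch sizes -- replace with the Branch Sizes set on the
    # agent's BehaviorParameters in the Inspector.
    branch_sizes = [3, 3]

    # Expert actions as they might appear in a mini-batch drawn from a demo.
    # The value 3 in the second column is invalid for a branch of size 3.
    expert_actions = torch.tensor([[0, 2],
                                   [1, 3]])

    for i, size in enumerate(branch_sizes):
        max_act = int(expert_actions[:, i].max())
        if max_act >= size:
            print(f"branch {i}: demo action {max_act} >= branch size {size}")
    ```

    Any branch flagged by a check like this would trip the `one_hot` call in the trace the moment that sample is drawn, which could also explain why a larger buffer_size merely delays the error rather than preventing it.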
     
  3. andrewcoh_unity


    Unity Technologies

    Joined:
    Sep 5, 2019
    Posts:
    162
    Hi @reachout

    I am not sure where this issue is starting. Can you clarify what you mean by triggering different actions within the game? Are you trying to record demos to train the agent while you are training? The demos should be recorded before training begins.

    Would you mind explaining how you narrowed it down to the demo recorder/buffer size and why you think this might be the culprit?