First ML-Agents project divide by zero error

Discussion in 'ML-Agents' started by unity_294AC9523AC4C06AE218, Feb 2, 2022.

  1. unity_294AC9523AC4C06AE218

    Joined:
    Jan 21, 2022
    Posts:
    1
    I am following the Code Monkey tutorial and my code is minimal at this point; however, I'm getting this error when I press play:
    File "c:\users\USER\anaconda3\lib\site-packages\torch\nn\init.py", line 381, in kaiming_uniform_
    std = gain / math.sqrt(fan)
    ZeroDivisionError: float division by zero

    The error message seems strange, and I don't know whether it's a problem with my installation of ml-agents or my project setup. Thanks for the help.


    Full trace:
    [WARNING] Trainer has no policies, not saving anything.
    Traceback (most recent call last):
    File "c:\users\lmcck\anaconda3\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
    File "c:\users\lmcck\anaconda3\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
    File "C:\Users\lmcck\anaconda3\Scripts\mlagents-learn.exe\__main__.py", line 7, in <module>
    File "c:\users\lmcck\anaconda3\lib\site-packages\mlagents\trainers\learn.py", line 260, in main
    run_cli(parse_command_line())
    File "c:\users\lmcck\anaconda3\lib\site-packages\mlagents\trainers\learn.py", line 256, in run_cli
    run_training(run_seed, options, num_areas)
    File "c:\users\lmcck\anaconda3\lib\site-packages\mlagents\trainers\learn.py", line 132, in run_training
    tc.start_learning(env_manager)
    File "c:\users\lmcck\anaconda3\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
    File "c:\users\lmcck\anaconda3\lib\site-packages\mlagents\trainers\trainer_controller.py", line 173, in start_learning
    self._reset_env(env_manager)
    File "c:\users\lmcck\anaconda3\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
    File "c:\users\lmcck\anaconda3\lib\site-packages\mlagents\trainers\trainer_controller.py", line 107, in _reset_env
    self._register_new_behaviors(env_manager, env_manager.first_step_infos)
    File "c:\users\lmcck\anaconda3\lib\site-packages\mlagents\trainers\trainer_controller.py", line 268, in _register_new_behaviors
    self._create_trainers_and_managers(env_manager, new_behavior_ids)
    File "c:\users\lmcck\anaconda3\lib\site-packages\mlagents\trainers\trainer_controller.py", line 166, in _create_trainers_and_managers
    self._create_trainer_and_manager(env_manager, behavior_id)
    File "c:\users\lmcck\anaconda3\lib\site-packages\mlagents\trainers\trainer_controller.py", line 137, in _create_trainer_and_manager
    policy = trainer.create_policy(
    File "c:\users\lmcck\anaconda3\lib\site-packages\mlagents\trainers\trainer\rl_trainer.py", line 119, in create_policy
    return self.create_torch_policy(parsed_behavior_id, behavior_spec)
    File "c:\users\lmcck\anaconda3\lib\site-packages\mlagents\trainers\ppo\trainer.py", line 227, in create_torch_policy
    policy = TorchPolicy(
    File "c:\users\lmcck\anaconda3\lib\site-packages\mlagents\trainers\policy\torch_policy.py", line 57, in __init__
    self.actor = SimpleActor(
    File "c:\users\lmcck\anaconda3\lib\site-packages\mlagents\trainers\torch\networks.py", line 606, in __init__
    self.network_body = NetworkBody(observation_specs, network_settings)
    File "c:\users\lmcck\anaconda3\lib\site-packages\mlagents\trainers\torch\networks.py", line 212, in __init__
    self._body_endoder = LinearEncoder(
    File "c:\users\lmcck\anaconda3\lib\site-packages\mlagents\trainers\torch\layers.py", line 148, in __init__
    linear_layer(
    File "c:\users\lmcck\anaconda3\lib\site-packages\mlagents\trainers\torch\layers.py", line 49, in linear_layer
    layer = torch.nn.Linear(input_size, output_size)
    File "c:\users\lmcck\anaconda3\lib\site-packages\torch\nn\modules\linear.py", line 83, in __init__
    self.reset_parameters()
    File "c:\users\lmcck\anaconda3\lib\site-packages\torch\nn\modules\linear.py", line 86, in reset_parameters
    init.kaiming_uniform_(self.weight, a=math.sqrt(5))
    File "c:\users\lmcck\anaconda3\lib\site-packages\torch\nn\init.py", line 381, in kaiming_uniform_
    std = gain / math.sqrt(fan)
    ZeroDivisionError: float division by zero
     
  2. ChillX

    Joined:
    Jun 16, 2016
    Posts:
    145
    Which version of ML-Agents are you using, and which version of com.unity.ml-agents?
    For example I am using:
    Version information:
    ml-agents: 0.28.0,
    ml-agents-envs: 0.28.0,
    Communicator API: 1.5.0,
    PyTorch: 1.7.1+cu110

    and on the Unity Side:
    com.unity.ml-agents@2.2.1-exp.1
    com.unity.ml-agents.extensions@0.6.1-preview

    If the Unity-side version and the Python-side version are mismatched, weird things can sometimes happen.

    Note: if you followed the installation docs, cloned the GitHub repository recently, and got release 19, then check that in Unity you're not using v2.1.0-exp.1 instead of 2.2.1-exp.1.

    Otherwise, check the sensors you're using.

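    Just reading the stack trace, the specific error is:
    ML-Agents is trying to initialize an NN layer (a linear layer). The trace shows kaiming_uniform_, which is He (Kaiming) initialization rather than a Glorot/Xavier variant. The initializer uses a fan parameter that depends on the shape of the layer's weight tensor. In this case fan is zero, so its square root is also zero and the division gain / math.sqrt(fan) fails. With the default fan-in mode, that suggests torch.nn.Linear(input_size, output_size) was called with an input_size of zero, i.e. the network is being built with no inputs, which would be the case if the agent reports no observations. Since fan is derived from the tensor shape, check your config YAML file. Also check that your sensors don't have an invalid configuration.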
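
    To illustrate the point about fan, here is a minimal pure-Python sketch of the arithmetic behind the failing line. It is not PyTorch's actual source: the function name kaiming_uniform_bound is made up for illustration, and it assumes the default fan-in mode with the leaky-ReLU gain that torch.nn.Linear requests via a=math.sqrt(5). Newer PyTorch releases may handle zero-element tensors differently, so the sketch avoids importing torch.

```python
import math


def kaiming_uniform_bound(weight_shape, a=math.sqrt(5)):
    """Return the uniform sampling bound for a He-style init (illustrative only).

    weight_shape is (output_size, input_size), the layout torch.nn.Linear uses.
    """
    fan_in = weight_shape[1]                 # number of inputs feeding the layer
    gain = math.sqrt(2.0 / (1.0 + a ** 2))   # leaky-ReLU gain
    std = gain / math.sqrt(fan_in)           # same expression as in the traceback
    return math.sqrt(3.0) * std


# A layer with 8 inputs initializes without trouble.
print(kaiming_uniform_bound((128, 8)))

# A layer with 0 inputs divides by sqrt(0) == 0.0 and fails.
try:
    kaiming_uniform_bound((128, 0))
except ZeroDivisionError as err:
    print("ZeroDivisionError:", err)
```

    In other words, the hidden layer size from the config is the output side of the first layer; it is the input side, which comes from the agent's observations, that has to be non-zero.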
     
  3. warnerjonn

    Joined:
    Jun 29, 2020
    Posts:
    2
    ZeroDivisionError is a subclass of ArithmeticError. It is raised when the second argument of a division or modulo operation is zero. In mathematics, division by zero is undefined (it is not "an infinite number"), so Python raises "ZeroDivisionError: division by zero" instead of returning a result. Whenever your own program logic performs a division, handle ZeroDivisionError (or ArithmeticError) so that the program does not terminate.

    Code (Python):
    try:
        z = x / y
    except ZeroDivisionError:
        z = 0
    Or check before you do the division:

    Code (Python):
    if y == 0:
        z = 0
    else:
        z = x / y
    The latter can be reduced to:

    Code (Python):
    z = 0 if y == 0 else (x / y)
     
  4. almostgiants

    Joined:
    Nov 10, 2021
    Posts:
    5
    This was happening to me when I had my sensor component (Grid) and Agent Script component attached to two different game objects. When I put them on the same game object the problem was fixed.
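
    One plausible reading of this fix is that an agent whose sensor lives on a different game object reports no observations at all, so the trainer tries to build a network with zero inputs. The sketch below shows a rough sanity check for that situation. The helper names and the example shapes are hypothetical, and in a real setup the shapes would have to come from whatever observation specs the environment reports.

```python
from math import prod


def total_observation_size(observation_shapes):
    """Sum the flattened sizes of all observation shapes an agent reports."""
    return sum(prod(shape) for shape in observation_shapes)


def check_observations(observation_shapes):
    """Raise a readable error rather than a ZeroDivisionError deep in the trainer."""
    size = total_observation_size(observation_shapes)
    if size == 0:
        raise ValueError(
            "Agent reports no observations; check that its sensor components "
            "are attached to the agent's own game object."
        )
    return size


# Hypothetical shapes: a 20x20 grid sensor with 3 channels plus 4 vector values.
print(check_observations([(20, 20, 3), (4,)]))  # 1204
```

    Calling check_observations([]) raises the ValueError, which corresponds to the case where no sensor is found on the agent.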