
Error with curriculum change 14.1

Discussion in 'ML-Agents' started by wwaero, Apr 8, 2020.

  1. wwaero

    wwaero

    Joined:
    Feb 18, 2020
    Posts:
    42
    INFO:mlagents_envs:Connected new brain:
    Orbit?team=0
    INFO:mlagents.trainers:Hyperparameters for the PPOTrainer of brain Orbit:
    trainer: ppo
    batch_size: 1024
    beta: 0.005
    buffer_size: 10240
    epsilon: 0.2
    hidden_units: 128
    lambd: 0.95
    learning_rate: 0.0003
    learning_rate_schedule: linear
    max_steps: 1.0e8
    memory_size: 256
    normalize: False
    num_epoch: 3
    num_layers: 2
    time_horizon: 64
    sequence_length: 64
    summary_freq: 10000
    use_recurrent: False
    vis_encode_type: simple
    reward_signals:
        extrinsic:
            strength: 1.0
            gamma: 0.99
    summary_path: orbit_0_Orbit
    model_path: ./models/orbit_0/Orbit
    keep_checkpoints: 5
    2020-04-07 17:21:52.478674: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
    INFO:mlagents.trainers:Orbit lesson changed. Now in lesson 1: platformDifficulty -> 1.1
    Traceback (most recent call last):
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
    File "C:\Users\ww\AppData\Local\Continuum\anaconda3\envs\ml-agents-14-1\Scripts\mlagents-learn.exe\__main__.py", line 7, in <module>
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\learn.py", line 479, in main
    run_cli(parse_command_line())
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\learn.py", line 475, in run_cli
    run_training(run_seed, options)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\learn.py", line 320, in run_training
    tc.start_learning(env_manager)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\trainer_controller.py", line 218, in start_learning
    self.reset_env_if_ready(env_manager, global_step)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\trainer_controller.py", line 270, in reset_env_if_ready
    self.end_trainer_episodes(env, lessons_incremented)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\trainer_controller.py", line 236, in end_trainer_episodes
    self._reset_env(env)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\trainer_controller.py", line 151, in _reset_env
    env.reset(config=sampled_reset_param)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\env_manager.py", line 54, in reset
    manager.end_episode()
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\agent_processor.py", line 208, in end_episode
    self._clean_agent_data(_gid)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\agent_processor.py", line 187, in _clean_agent_data
    del self.episode_rewards[global_id]
    KeyError: '$0-4'

    Orbit.yaml

    Orbit:
      measure: progress
      thresholds: [-3.0, -3.0, -3.0]
      min_lesson_length: 20
      signal_smoothing: true
      parameters:
        platformDifficulty: [0.0, 1.1, 2.2, 3.3]

    my command
    mlagents-learn config/trainer_config.yaml --run-id=orbit_0 --train --curriculum=config/curricula/Orbit.yaml
     
    Last edited: Apr 8, 2020
  2. Xinzz

    Xinzz

    Joined:
    Jun 28, 2015
    Posts:
    67
    thresholds: [-3.0, -3.0, -3.0]

    thresholds (float array) - Points in value of measure where lesson should be increased.
     
  3. andrewcoh_unity

    andrewcoh_unity

    Unity Technologies

    Joined:
    Sep 5, 2019
    Posts:
    162
    Hi,

    As pointed out above, your threshold values are negative. Since you are using the progress measure, thresholds should be ratios of elapsed steps to max_steps, i.e. values between 0 and 1. For example, if you are running for a total of 10 million timesteps and want your lessons to change at 1 million and 2 million timesteps, your thresholds would be:

    thresholds: [.1, .2]

    Did you mean to use reward instead of progress? Please see the documentation here: https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Training-Curriculum-Learning.md
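
    In code terms, the progress measure works roughly like this (a simplified sketch of the behavior described above, not ML-Agents' actual implementation; `lesson_for_progress` is a hypothetical helper):

    ```python
    def lesson_for_progress(step, max_steps, thresholds):
        """Return the lesson index implied by training progress.

        progress = step / max_steps; each threshold crossed advances
        one lesson. Simplified illustration, not the real trainer code.
        """
        progress = step / max_steps
        lesson = 0
        for t in thresholds:
            if progress > t:
                lesson += 1
        return lesson

    # With max_steps = 10 million and thresholds [.1, .2], lessons
    # change at 1 million and 2 million steps:
    print(lesson_for_progress(500_000, 10_000_000, [.1, .2]))    # 0
    print(lesson_for_progress(1_500_000, 10_000_000, [.1, .2]))  # 1
    print(lesson_for_progress(2_500_000, 10_000_000, [.1, .2]))  # 2
    ```

    This also shows why negative thresholds misbehave: progress is always >= 0, so every lesson threshold is "crossed" immediately at step 0.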
     
  4. wwaero

    wwaero

    Joined:
    Feb 18, 2020
    Posts:
    42
    Crap, that explains it perfectly! Thanks!
     
  5. wwaero

    wwaero

    Joined:
    Feb 18, 2020
    Posts:
    42
    Even after changing to reward, I'm still hitting the error.

    Orbit:
      measure: reward
      thresholds: [5.0, 10.0, 15.0, 20.0]
      min_lesson_length: 5
      signal_smoothing: true
      parameters:
        platformDifficulty: [0.0, 1.0, 2.0, 3.0, 4.0]

    (ml-agents-14-1) D:\Documents\GitHub\ml-agents>mlagents-learn config/trainer_config.yaml --run-id=orbit_0 --train --curriculum=config/curricula/Orbit.yaml
    WARNING:tensorflow:From c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\tensorflow_core\python\compat\v2_compat.py:65: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
    Instructions for updating:
    non-resource variables are not supported in the long term


    ▄▄▄▓▓▓▓
    ╓▓▓▓▓▓▓█▓▓▓▓▓
    ,▄▄▄m▀▀▀' ,▓▓▓▀▓▓▄ ▓▓▓ ▓▓▌
    ▄▓▓▓▀' ▄▓▓▀ ▓▓▓ ▄▄ ▄▄ ,▄▄ ▄▄▄▄ ,▄▄ ▄▓▓▌▄ ▄▄▄ ,▄▄
    ▄▓▓▓▀ ▄▓▓▀ ▐▓▓▌ ▓▓▌ ▐▓▓ ▐▓▓▓▀▀▀▓▓▌ ▓▓▓ ▀▓▓▌▀ ^▓▓▌ ╒▓▓▌
    ▄▓▓▓▓▓▄▄▄▄▄▄▄▄▓▓▓ ▓▀ ▓▓▌ ▐▓▓ ▐▓▓ ▓▓▓ ▓▓▓ ▓▓▌ ▐▓▓▄ ▓▓▌
    ▀▓▓▓▓▀▀▀▀▀▀▀▀▀▀▓▓▄ ▓▓ ▓▓▌ ▐▓▓ ▐▓▓ ▓▓▓ ▓▓▓ ▓▓▌ ▐▓▓▐▓▓
    ^█▓▓▓ ▀▓▓▄ ▐▓▓▌ ▓▓▓▓▄▓▓▓▓ ▐▓▓ ▓▓▓ ▓▓▓ ▓▓▓▄ ▓▓▓▓`
    '▀▓▓▓▄ ^▓▓▓ ▓▓▓ └▀▀▀▀ ▀▀ ^▀▀ `▀▀ `▀▀ '▀▀ ▐▓▓▌
    ▀▀▀▀▓▄▄▄ ▓▓▓▓▓▓, ▓▓▓▓▀
    `▀█▓▓▓▓▓▓▓▓▓▌
    ¬`▀▀▀█▓


    Version information:
    ml-agents: 0.14.1,
    ml-agents-envs: 0.14.1,
    Communicator API: API-14,
    TensorFlow: 2.0.1
    WARNING:tensorflow:From c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\tensorflow_core\python\compat\v2_compat.py:65: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
    Instructions for updating:
    non-resource variables are not supported in the long term
    INFO:mlagents_envs:Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
    INFO:mlagents_envs:Connected new brain:
    Orbit?team=0
    INFO:mlagents.trainers:Hyperparameters for the PPOTrainer of brain Orbit:
    trainer: ppo
    batch_size: 1024
    beta: 0.005
    buffer_size: 10240
    epsilon: 0.2
    hidden_units: 128
    lambd: 0.95
    learning_rate: 0.0003
    learning_rate_schedule: linear
    max_steps: 1.0e8
    memory_size: 256
    normalize: False
    num_epoch: 3
    num_layers: 2
    time_horizon: 64
    sequence_length: 64
    summary_freq: 10000
    use_recurrent: False
    vis_encode_type: simple
    reward_signals:
        extrinsic:
            strength: 1.0
            gamma: 0.99
    summary_path: orbit_0_Orbit
    model_path: ./models/orbit_0/Orbit
    keep_checkpoints: 5
    2020-04-09 16:13:49.288449: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
    INFO:mlagents.trainers:Orbit lesson changed. Now in lesson 1: platformDifficulty -> 1.0
    Traceback (most recent call last):
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
    File "C:\Users\ww\AppData\Local\Continuum\anaconda3\envs\ml-agents-14-1\Scripts\mlagents-learn.exe\__main__.py", line 7, in <module>
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\learn.py", line 479, in main
    run_cli(parse_command_line())
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\learn.py", line 475, in run_cli
    run_training(run_seed, options)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\learn.py", line 320, in run_training
    tc.start_learning(env_manager)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\trainer_controller.py", line 218, in start_learning
    self.reset_env_if_ready(env_manager, global_step)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\trainer_controller.py", line 270, in reset_env_if_ready
    self.end_trainer_episodes(env, lessons_incremented)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\trainer_controller.py", line 236, in end_trainer_episodes
    self._reset_env(env)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\trainer_controller.py", line 151, in _reset_env
    env.reset(config=sampled_reset_param)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\env_manager.py", line 54, in reset
    manager.end_episode()
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\agent_processor.py", line 208, in end_episode
    self._clean_agent_data(_gid)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\agent_processor.py", line 187, in _clean_agent_data
    del self.episode_rewards[global_id]
    KeyError: '$0-4'
     
  6. andrewcoh_unity

    andrewcoh_unity

    Unity Technologies

    Joined:
    Sep 5, 2019
    Posts:
    162
    Your thresholds should be between 0 and 1.
     
  7. wwaero

    wwaero

    Joined:
    Feb 18, 2020
    Posts:
    42
    I'm still crashing even with thresholds between 0 and 1.

    Orbit:
      measure: progress
      thresholds: [.2, .3, .4, .5]
      min_lesson_length: 5
      signal_smoothing: true
      parameters:
        platformDifficulty: [0.0, 1.0, 2.0, 3.0, 4.0]
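
    One structural invariant the configs in this thread all satisfy, for reference: each parameter list needs exactly one more value than there are thresholds, since lesson 0 uses the first value before any threshold is crossed. A throwaway sanity check (`check_curriculum` is an illustrative helper, not part of ML-Agents):

    ```python
    def check_curriculum(curriculum):
        """Verify that every parameter list has len(thresholds) + 1 values.

        Uses the 0.14-era curriculum shape: a dict with 'thresholds'
        (list of floats) and 'parameters' (name -> list of lesson values).
        """
        n = len(curriculum["thresholds"])
        for name, values in curriculum["parameters"].items():
            if len(values) != n + 1:
                raise ValueError(
                    f"{name}: expected {n + 1} values for {n} thresholds, "
                    f"got {len(values)}"
                )

    orbit = {
        "measure": "progress",
        "thresholds": [0.2, 0.3, 0.4, 0.5],
        "min_lesson_length": 5,
        "signal_smoothing": True,
        "parameters": {"platformDifficulty": [0.0, 1.0, 2.0, 3.0, 4.0]},
    }
    check_curriculum(orbit)  # passes: 5 parameter values for 4 thresholds
    ```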

    D:\Documents\GitHub\ml-agents>mlagents-learn config/trainer_config.yaml --run-id=orbit_0 --train --curriculum=config/curricula/Orbit.yaml
    WARNING:tensorflow:From c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\tensorflow_core\python\compat\v2_compat.py:65: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
    Instructions for updating:
    non-resource variables are not supported in the long term




    Version information:
    ml-agents: 0.14.1,
    ml-agents-envs: 0.14.1,
    Communicator API: API-14,
    TensorFlow: 2.0.1
    WARNING:tensorflow:From c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\tensorflow_core\python\compat\v2_compat.py:65: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
    Instructions for updating:
    non-resource variables are not supported in the long term
    INFO:mlagents_envs:Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
    INFO:mlagents_envs:Connected new brain:
    Orbit?team=0
    INFO:mlagents.trainers:Hyperparameters for the PPOTrainer of brain Orbit:
    trainer: ppo
    batch_size: 1024
    beta: 0.005
    buffer_size: 10240
    epsilon: 0.2
    hidden_units: 128
    lambd: 0.95
    learning_rate: 0.0003
    learning_rate_schedule: linear
    max_steps: 1.0e5
    memory_size: 256
    normalize: False
    num_epoch: 3
    num_layers: 2
    time_horizon: 64
    sequence_length: 64
    summary_freq: 10000
    use_recurrent: False
    vis_encode_type: simple
    reward_signals:
        extrinsic:
            strength: 1.0
            gamma: 0.99
    summary_path: orbit_0_Orbit
    model_path: ./models/orbit_0/Orbit
    keep_checkpoints: 5
    2020-04-16 13:35:43.846804: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
    INFO:mlagents.trainers: orbit_0: Orbit: Step: 10000. Time Elapsed: 137.540 s Mean Reward: -16.265. Std of Reward: 19.271. Training.
    INFO:mlagents.trainers: orbit_0: Orbit: Step: 20000. Time Elapsed: 272.975 s Mean Reward: -12.294. Std of Reward: 19.108. Training.
    INFO:mlagents.trainers:Orbit lesson changed. Now in lesson 1: platformDifficulty -> 1.0
    Traceback (most recent call last):
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
    File "C:\Users\ww\AppData\Local\Continuum\anaconda3\envs\ml-agents-14-1\Scripts\mlagents-learn.exe\__main__.py", line 7, in <module>
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\learn.py", line 479, in main
    run_cli(parse_command_line())
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\learn.py", line 475, in run_cli
    run_training(run_seed, options)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\learn.py", line 320, in run_training
    tc.start_learning(env_manager)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\trainer_controller.py", line 218, in start_learning
    self.reset_env_if_ready(env_manager, global_step)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\trainer_controller.py", line 270, in reset_env_if_ready
    self.end_trainer_episodes(env, lessons_incremented)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\trainer_controller.py", line 236, in end_trainer_episodes
    self._reset_env(env)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\trainer_controller.py", line 151, in _reset_env
    env.reset(config=sampled_reset_param)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\env_manager.py", line 54, in reset
    manager.end_episode()
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\agent_processor.py", line 208, in end_episode
    self._clean_agent_data(_gid)
    File "c:\users\ww\appdata\local\continuum\anaconda3\envs\ml-agents-14-1\lib\site-packages\mlagents\trainers\agent_processor.py", line 187, in _clean_agent_data
    del self.episode_rewards[global_id]
    KeyError: '$0-4'
     
  8. wwaero

    wwaero

    Joined:
    Feb 18, 2020
    Posts:
    42
    Is anyone else having this error with curriculum learning?
     
  9. vincentpierre

    vincentpierre

    Joined:
    May 5, 2017
    Posts:
    160
    Is there a reason you are using v0.14.1? I am looking at the code for version 0.15.1, and it seems this part of the code changed between 0.15.0 and 0.15.1 (exactly on the line you are reporting).
    I think that when using a curriculum, agents are deleted when Python sends a reset command to the environment; if an agent then sends a "Done" signal to Python afterwards, Python may try to delete the agent AGAIN even though it was already deleted during the previous reset, causing this error.
    I would recommend updating to 0.15.1 or, if that is not possible, modifying this method directly to safely clean the agent data:
    https://github.com/Unity-Technologi...nts/mlagents/trainers/agent_processor.py#L179
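
    For anyone patching 0.14.1 locally, a defensive version of that cleanup could look roughly like this (a hypothetical sketch of the fix described above; `stores` stands in for the processor's per-agent dicts such as `episode_rewards` and `episode_steps`):

    ```python
    def _clean_agent_data_safe(stores, global_id):
        """Remove an agent's bookkeeping entries without raising if the
        agent was already cleaned during a previous reset.

        dict.pop with a default is a no-op for missing keys, so a second
        "Done" arriving after a reset no longer triggers a KeyError.
        """
        for store in stores:
            store.pop(global_id, None)  # no KeyError if already deleted

    rewards = {"$0-4": 1.5}
    steps = {"$0-4": 64}
    _clean_agent_data_safe([rewards, steps], "$0-4")
    _clean_agent_data_safe([rewards, steps], "$0-4")  # second call is a no-op
    ```

    The same effect can be had with `try: del ... except KeyError: pass` around each deletion, which is closer in shape to the original `_clean_agent_data`.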
     
  10. rudehouse

    rudehouse

    Joined:
    Feb 16, 2019
    Posts:
    4
  11. vincentpierre

    vincentpierre

    Joined:
    May 5, 2017
    Posts:
    160