
Question Behavior Name DOES match the one in the trainer configuration file but default config is still used?

Discussion in 'ML-Agents' started by theriser777, Nov 1, 2022.

  1. theriser777

    theriser777

    Joined:
    Feb 2, 2020
    Posts:
    12
    Sorry for the weird phrasing in the title; I had to keep it short.

    I want to change the max_step parameter in the configurations, and despite the Behavior Name in Unity matching the one specified in configuration.yaml (that name being "WarshAgent"), MLAgents still insists on hitting me with this:

    [WARNING] Behavior name WarshAgent does not match any behaviors specified in the trainer configuration file. A default configuration will be used.

    It then uses a default configuration, so every time I changed max_step, it was reset back to the default value.

    I did try running the command "python -m mlagents.trainers.upgrade_config configuration.yaml WarshAgent.yaml", but all that did was copy the contents of configuration.yaml into WarshAgent.yaml. Even when I deleted configuration.yaml, it just created a new one and completely ignored WarshAgent.yaml.
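    For reference, this is the nesting I understand mlagents-learn expects (a minimal sketch based on the docs; only the keys relevant here, with my own values):

    ```yaml
    behaviors:
      WarshAgent:            # behavior name must be a key nested under behaviors:
        trainer_type: ppo
        max_steps: 900000000 # the value I actually want used
    ```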

    Here are more details:

    Behavior Parameters: [screenshot of the agent's Behavior Parameters component, not preserved]

    configuration.yaml:
    default_settings: null
    behaviors:
      WarshAgent:
        trainer_type: ppo
        hyperparameters:
          batch_size: 1024
          buffer_size: 10240
          learning_rate: 0.0003
          beta: 0.005
          epsilon: 0.2
          lambd: 0.95
          num_epoch: 3
          learning_rate_schedule: linear
          beta_schedule: linear
          epsilon_schedule: linear
        network_settings:
          normalize: false
          hidden_units: 128
          num_layers: 2
          vis_encode_type: simple
          memory: null
          goal_conditioning_type: hyper
          deterministic: false
        reward_signals:
          extrinsic:
            gamma: 0.99
            strength: 1.0
            network_settings:
              normalize: false
              hidden_units: 128
              num_layers: 2
              vis_encode_type: simple
              memory: null
              goal_conditioning_type: hyper
              deterministic: false
        init_path: null
        keep_checkpoints: 5
        checkpoint_interval: 500000
        max_steps: 500000
        time_horizon: 64
        summary_freq: 50000
        threaded: false
        self_play: null
        behavioral_cloning: null
    env_settings:
      env_path: null
      env_args: null
      base_port: 5005
      num_envs: 1
      num_areas: 1
      seed: -1
      max_lifetime_restarts: 10
      restarts_rate_limit_n: 1
      restarts_rate_limit_period_s: 60
    engine_settings:
      width: 84
      height: 84
      quality_level: 5
      time_scale: 20
      target_frame_rate: -1
      capture_frame_rate: 60
      no_graphics: false
    environment_parameters: null
    checkpoint_settings:
      run_id: ppo
      initialize_from: null
      load_model: false
      resume: true
      force: false
      train_model: false
      inference: false
      results_dir: results
    torch_settings:
      device: null
    debug: false

    WarshAgent.yaml:
    behaviors:
      WarshAgent:
        trainer_type: ppo
        hyperparameters:
          batch_size: 1024
          buffer_size: 10240
          learning_rate: 0.0003
          beta: 0.005
          epsilon: 0.2
          lambd: 0.95
          num_epoch: 3
          learning_rate_schedule: linear
          beta_schedule: linear
          epsilon_schedule: linear
        network_settings:
          normalize: false
          hidden_units: 128
          num_layers: 2
          vis_encode_type: simple
          goal_conditioning_type: hyper
          deterministic: false
        reward_signals:
          extrinsic:
            gamma: 0.99
            strength: 1.0
            network_settings:
              normalize: false
              hidden_units: 128
              num_layers: 2
              vis_encode_type: simple
              goal_conditioning_type: hyper
              deterministic: false
        keep_checkpoints: 5
        checkpoint_interval: 500000
        max_steps: 900000000
        time_horizon: 64
        summary_freq: 50000
        threaded: false
    env_settings:
      base_port: 5005
      num_envs: 1
      num_areas: 1
      seed: -1
      max_lifetime_restarts: 10
      restarts_rate_limit_n: 1
      restarts_rate_limit_period_s: 60
    engine_settings:
      width: 84
      height: 84
      quality_level: 5
      time_scale: 20
      target_frame_rate: -1
      capture_frame_rate: 60
      no_graphics: false
    checkpoint_settings:
      run_id: ppo
      load_model: false
      resume: true
      force: false
      train_model: false
      inference: false
      results_dir: results
    torch_settings: {}
    debug: false

    What am I doing wrong? Or is my understanding of how this works just wrong?
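    If it helps anyone reproduce this, here is a rough check I'd expect to show the problem. It is hand-rolled and only looks at indentation (not a real YAML parser), and `behavior_names` is just a name I made up:

    ```python
    def behavior_names(config_text):
        """List keys nested directly under a top-level 'behaviors:' key,
        based purely on indentation. A crude approximation of what
        mlagents-learn would find, not a real YAML parser."""
        names = []
        in_behaviors = False
        child_indent = None
        for line in config_text.splitlines():
            stripped = line.strip()
            if not stripped or stripped.startswith("#"):
                continue
            indent = len(line) - len(line.lstrip())
            if indent == 0:
                # any new top-level key ends the behaviors block
                in_behaviors = stripped == "behaviors:"
                child_indent = None
                continue
            if in_behaviors:
                if child_indent is None:
                    child_indent = indent  # first nested level defines children
                if indent == child_indent and stripped.endswith(":"):
                    names.append(stripped[:-1])
        return names

    # A correctly nested file is found:
    good = "behaviors:\n  WarshAgent:\n    trainer_type: ppo\n"
    print(behavior_names(good))  # ['WarshAgent']

    # If the nesting is lost, the name is not found, which would
    # explain a "does not match any behaviors" warning:
    flat = "behaviors:\nWarshAgent:\ntrainer_type: ppo\n"
    print(behavior_names(flat))  # []
    ```
    
    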
     
  2. B1aster

    B1aster

    Joined:
    Nov 29, 2021
    Posts:
    1
    I'm having the same issue.
     
  3. firdiar

    firdiar

    Joined:
    Aug 2, 2017
    Posts:
    25
  4. Peanut_Butcher

    Peanut_Butcher

    Joined:
    Nov 18, 2020
    Posts:
    4
    I found out that running the command "mlagents-learn <path_to_your_config_file> --run-id=YourSimulationID" works.


    For example, for me it was: "mlagents-learn results/Test/MoveToTarget.yaml --run-id=Test --force" (I added "--force" to reset the neural network).