Question ML_AGENT POCA Training, One side learning well, other side bad

Discussion in 'ML-Agents' started by InquisitorTR, Mar 21, 2024.

  1. InquisitorTR

    Joined:
    Nov 28, 2022
    Posts:
    5
    Hi, I am trying to train a competitive 2 vs 2 hockey game. The game is symmetrical and I am using MA-POCA. My problem is that one of the teams learns very well while the other side plays very badly. How can this happen? MA-POCA here uses self-play, so if one side plays well, the other team should also play well, because it is using an older version of the same policy. But this is not happening.

    I also want to ask one more thing. I added a "footballer beginning side" input, which is 0 or 1. I did this because an agent can start on the right side or the left side, so it should know which goal it has to score in. Should I remove this kind of input? Thanks for any answers.

    I checked my team_id values. They should be different for the two teams, so I set them up that way.

    I am sharing my config file:

    behaviors:
      SoccerAgent:
        trainer_type: poca
        hyperparameters:
          batch_size: 1024
          buffer_size: 10240
          learning_rate: 0.0003
          beta: 0.005
          epsilon: 0.2
          lambd: 0.95
          num_epoch: 3
          learning_rate_schedule: constant
        network_settings:
          normalize: false
          hidden_units: 512
          num_layers: 2
          vis_encode_type: simple
          goal_conditioning_type: none
        reward_signals:
          extrinsic:
            gamma: 0.99
            strength: 1.0
        keep_checkpoints: 5
        max_steps: 5000000
        time_horizon: 1024
        summary_freq: 20000
        self_play:
          save_steps: 100000
          team_change: 500000
          swap_steps: 10000
          window: 10
          play_against_latest_model_ratio: 0.5
          initial_elo: 1200.0
     
  2. smallg2023

    Joined:
    Sep 2, 2018
    Posts:
    153
    If one side is able to train but the other isn't, it sounds like the observations are not quite right. Did you base your project on the demos?
     
  3. InquisitorTR

    Joined:
    Nov 28, 2022
    Posts:
    5
    Thank you, brother. As you said, the observation order had to change for the second team. I changed that and now it is working.
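
The thread does not show the actual fix, so the following is only a rough sketch of what "changing the observation order for the second team" could look like. It is written in Python for illustration (a real ML-Agents agent would collect observations in C#), and every name in it (`WorldState`, `to_team_frame`, `build_observation`) is hypothetical. It assumes a flat rink with the origin at the centre, team 0 defending the goal at negative x, and a field that is symmetric under a 180 degree rotation. The idea is that both teams share one policy, so each agent should see the world in a team-relative layout: itself first, then teammates, then opponents, then its own goal, then the opponent's goal.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec2 = Tuple[float, float]  # (x, z) on the rink plane, origin at the centre


@dataclass
class WorldState:
    puck: Vec2
    goals: List[Vec2]          # goals[t] is the goal that team t defends
    players: List[List[Vec2]]  # players[t] holds the positions of team t


def to_team_frame(p: Vec2, team_id: int) -> Vec2:
    """Rotate a point 180 degrees about the centre for team 1, so that
    both teams see themselves attacking in the same direction."""
    x, z = p
    return (x, z) if team_id == 0 else (-x, -z)


def build_observation(state: WorldState, team_id: int, player_idx: int) -> List[float]:
    """Flatten positions in a team-relative order:
    self, teammates, opponents, own goal, opponent goal, puck."""
    other = 1 - team_id
    own = state.players[team_id]
    ordered = [own[player_idx]]
    ordered += [p for i, p in enumerate(own) if i != player_idx]
    ordered += state.players[other]
    ordered += [state.goals[team_id], state.goals[other], state.puck]
    obs: List[float] = []
    for p in ordered:
        obs.extend(to_team_frame(p, team_id))
    return obs


# A mirrored situation: team 1 is team 0 rotated by 180 degrees.
state = WorldState(
    puck=(0.0, 0.0),
    goals=[(-10.0, 0.0), (10.0, 0.0)],
    players=[[(-3.0, 1.0), (-5.0, -2.0)], [(3.0, -1.0), (5.0, 2.0)]],
)
obs_team0 = build_observation(state, team_id=0, player_idx=0)
obs_team1 = build_observation(state, team_id=1, player_idx=0)
same_view = obs_team0 == obs_team1  # both agents get identical inputs here
```

With a layout like this, mirrored situations produce identical observation vectors for both teams, which is what a shared self-play policy needs. Under that assumption, an extra "starting side" flag would probably carry no additional information, although whether to drop it depends on how the rest of the observations are built.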