How does the self-play training loop work?

Discussion in 'ML-Agents' started by dhyeythumar, Jan 16, 2021.

  1. dhyeythumar

    Joined:
    Mar 15, 2020
    Posts:
    7
    I am trying to understand how the self-play training loop works, so I created the diagram below. Can anyone confirm that it is correct for the following example values?

    self_play:
      window: 10
      play_against_latest_model_ratio: 0.5
      save_steps: 50000
      swap_steps: 50000
      team_change: 200000
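
To make the numbers concrete, here is a rough sketch of what these values imply for the schedule. It is not ML-Agents code: the function names and the "latest" label are made up for illustration, and it assumes both teams have the same number of agents, so that ghost steps (which `swap_steps` counts) advance at the same rate as trainer steps (which `save_steps` and `team_change` count).

```python
import random

# Values copied from the self_play config in the post.
WINDOW = 10
PLAY_AGAINST_LATEST_MODEL_RATIO = 0.5
SAVE_STEPS = 50_000
SWAP_STEPS = 50_000
TEAM_CHANGE = 200_000


def schedule_summary(save_steps, swap_steps, team_change, window):
    """Derive how often each self-play event fires per team change.

    Assumption: symmetric teams, so ghost steps and trainer steps
    can be compared one-to-one.
    """
    return {
        "snapshots_per_team_change": team_change // save_steps,
        "swaps_per_team_change": team_change // swap_steps,
        "trainer_steps_covered_by_window": window * save_steps,
    }


def sample_opponent(snapshots, latest_ratio, rng):
    """Pick an opponent for the next swap (hypothetical helper).

    With probability latest_ratio the opponent is the current policy,
    labelled "latest" here; otherwise it is a uniformly random snapshot
    from the window of saved policies.
    """
    if not snapshots or rng.random() < latest_ratio:
        return "latest"
    return rng.choice(snapshots)


summary = schedule_summary(SAVE_STEPS, SWAP_STEPS, TEAM_CHANGE, WINDOW)
print(summary)
```

Under those assumptions, each team-change period of 200000 steps contains 4 snapshot saves and 4 opponent swaps, and a full window of 10 snapshots spans 500000 trainer steps. With asymmetric teams the swap count would differ, because ghost steps and trainer steps no longer line up.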

    Diagram:
     
  2. andrewcoh_unity

    Unity Technologies

    Joined:
    Sep 5, 2019
    Posts:
    162
    Hi @dhyeythumar

    This looks correct to me. Cool diagram!
     