Curriculum Learning Logic

Discussion in 'ML-Agents' started by mertekin, Feb 9, 2021.

Is the Wall Jump curriculum agent as it should be?

  1. YES

    0 vote(s)
    0.0%
  2. NO

    1 vote(s)
    33.3%
  3. Partly

    2 vote(s)
    66.7%
  1. mertekin

    mertekin

    Joined:
    Nov 16, 2016
    Posts:
    2
    Hello,

    I'm trying to use curriculum learning. Unity has a Wall Jump example, which is explained in this doc. As I understand it, curriculum learning is about teaching the environment to your agent step by step. The curriculum part of the YAML config file has settings for every step (lesson), each with a value field. The Wall Jump agent chooses a random environment configuration on every episode and uses a different NNModel for each one. I think the environment should be selected or created using the curriculum value field, and there should be a single NNModel. Am I wrong? Is it possible to get that value? How can I do that?

    Thank you all for your responses.
     
  2. awjuliani

    awjuliani

    Unity Technologies

    Joined:
    Mar 1, 2017
    Posts:
    69
  3. mertekin

    mertekin

    Joined:
    Nov 16, 2016
    Posts:
    2
    Hello awjuliani, thank you for your answer. As I understand it, GetWithDefault("small_wall_height") returns the value field of the current curriculum lesson. What I wonder is: which specific lesson am I in during the current episode? It seems we need a field like name to identify the current curriculum step.
     
    Last edited: Feb 10, 2021
  4. m4l4

    m4l4

    Joined:
    Jul 28, 2020
    Posts:
    81
    Given a simple config example:

    Code (YAML):
    # Nested under environment_parameters: in the trainer config file.
    Lesson_number:
        curriculum:
          - name: Lesson0 # The '-' is important as this is a list
            completion_criteria:
              measure: reward
              behavior: My_B
              signal_smoothing: true
              min_lesson_length: 100
              threshold: 2500
            value: 0
          - name: Lesson1 # This is the start of the second lesson
            completion_criteria:
              measure: reward
              behavior: My_B
              signal_smoothing: true
              min_lesson_length: 100
              threshold: 2500
            value: 1
    You can get the lesson number with:

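    Code (CSharp):
    // Requires: using Unity.MLAgents;
    // Fetch the parameters once, e.g. in the agent's Initialize().
    EnvironmentParameters m_ResetParams = Academy.Instance.EnvironmentParameters;

    // Value of the current lesson; returns the default (0) if the parameter has not been set.
    float lessonNum = m_ResetParams.GetWithDefault("Lesson_number", 0f);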
    That way, after training, you can also manually set the lesson you want to test (for example, by changing the default value passed to GetWithDefault).
     
  5. ademord

    ademord

    Joined:
    Mar 22, 2021
    Posts:
    49
    Hello, sorry for joining the question party late.
    What do
    - measure: progress
    - threshold: 0.1

    or
    - measure: reward
    - threshold: 2500

    mean? Does it mean that after 10% of max_steps it will go to the next lesson, and that after getting a reward of 2500 or higher on the last 100 lessons it will go to the next lesson?
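
A rough sketch of how these two settings are described in the ML-Agents training configuration docs, as far as I understand them: with measure: progress the lesson changes once the ratio of steps to max_steps passes threshold, and with measure: reward it changes once the mean cumulative reward of the last min_lesson_length episodes passes threshold. The Python below only illustrates that rule and is not the trainer's actual code. The CompletionCriteria class and should_advance function are hypothetical names, signal_smoothing is deliberately left out, and the max_steps value of 500000 is made up for the example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass
class CompletionCriteria:
    """Hypothetical stand-in for one lesson's completion_criteria block."""
    measure: str                 # "progress" or "reward"
    threshold: float
    min_lesson_length: int = 0   # minimum number of finished episodes


def should_advance(criteria: CompletionCriteria, step: int, max_steps: int,
                   episode_rewards: Sequence[float]) -> bool:
    """Return True when the current lesson may hand over to the next one."""
    # No lesson change before enough episodes have finished.
    if len(episode_rewards) < criteria.min_lesson_length:
        return False
    if criteria.measure == "progress":
        # Fraction of the training run that has elapsed so far.
        return step / max_steps > criteria.threshold
    if criteria.measure == "reward":
        n = criteria.min_lesson_length
        recent = list(episode_rewards[-n:]) if n > 0 else list(episode_rewards)
        # Mean cumulative reward over the most recent episodes.
        return bool(recent) and mean(recent) > criteria.threshold
    raise ValueError(f"unknown measure: {criteria.measure!r}")


progress = CompletionCriteria(measure="progress", threshold=0.1)
print(should_advance(progress, 40_000, 500_000, []))   # False: 8% of max_steps
print(should_advance(progress, 60_000, 500_000, []))   # True: 12% of max_steps

reward = CompletionCriteria(measure="reward", threshold=2500, min_lesson_length=100)
print(should_advance(reward, 0, 500_000, [2600.0] * 99))    # False: only 99 episodes
print(should_advance(reward, 0, 500_000, [2600.0] * 100))   # True: mean 2600 > 2500
```

Under this reading, threshold: 0.1 with measure: progress means roughly "after 10% of max_steps", provided at least min_lesson_length episodes have also finished.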

    Also, in WallJumpAgent.cs there are three variables that are then assigned on ModelOverride. How do these work? Are these just different ONNX files that will be saved under different names, or do we have to provide something else, such as a network trained somewhere else?

    public NNModel noWallBrain;
    public NNModel smallWallBrain;
    public NNModel bigWallBrain;
    string m_NoWallBehaviorName = "SmallWallJump";
    string m_SmallWallBehaviorName = "SmallWallJump";
    string m_BigWallBehaviorName = "BigWallJump";