Continue at last step for loaded training?

Discussion in 'ML-Agents' started by JPhilipp, Jan 31, 2020.

  1. JPhilipp

    Joined:
    Oct 7, 2014
    Posts:
    56
    I have three different runs (with different Unity and YAML config settings) for my agent training. When I continue a past run (using "... --train --load" on the blue run as pictured; it uses linear_rate, by the way), how can I ensure that training continues from the last step where it stopped, instead of jumping back to the far left of the graph?



    Thanks!
     
  2. mbaske

    Joined:
    Dec 31, 2017
    Posts:
    473
  3. JPhilipp

    Joined:
    Oct 7, 2014
    Posts:
    56
    The latest checkpoint file now contains:

    Code (CSharp):
    model_checkpoint_path: "model-135397.cptk"
    all_model_checkpoint_paths: "model-50000.cptk"
    all_model_checkpoint_paths: "model-100000.cptk"
    all_model_checkpoint_paths: "model-135397.cptk"
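
To check which step a saved model actually corresponds to, the checkpoint file above can be read with a few lines of Python. This is only a minimal sketch: the helper names (parse_checkpoint_index, step_of) are made up for illustration, the text is pasted in as a string rather than read from disk, and it assumes the step count is the number embedded in each file name, as the listing suggests.

```python
import re

# Contents of the checkpoint file as quoted in the post above.
CHECKPOINT_TEXT = '''\
model_checkpoint_path: "model-135397.cptk"
all_model_checkpoint_paths: "model-50000.cptk"
all_model_checkpoint_paths: "model-100000.cptk"
all_model_checkpoint_paths: "model-135397.cptk"
'''

# One entry per line: key: "value"
LINE_RE = re.compile(r'^(\w+):\s*"(.*)"\s*$')
# Assumption: the step count is the number between "-" and the extension.
STEP_RE = re.compile(r'-(\d+)\.')


def parse_checkpoint_index(text):
    """Return (latest_path, all_paths) from the checkpoint file text."""
    latest = None
    all_paths = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank or unrecognized lines
        key, value = match.groups()
        if key == "model_checkpoint_path":
            latest = value
        elif key == "all_model_checkpoint_paths":
            all_paths.append(value)
    return latest, all_paths


def step_of(path):
    """Extract the step number from a name like model-135397.cptk."""
    match = STEP_RE.search(path)
    return int(match.group(1)) if match else None


latest, all_paths = parse_checkpoint_index(CHECKPOINT_TEXT)
print("latest checkpoint step:", step_of(latest))
print("all saved steps:", [step_of(p) for p in all_paths])
```

If the latest step printed here matches where the previous run stopped, the weights on disk are current, and the reset is more likely in how the step counter is restored than in what was saved.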
     
  4. JPhilipp

    Joined:
    Oct 7, 2014
    Posts:
    56
    I'm still having problems with this. Does anyone know how to continue the training exactly where it left off?

    As it is, the Steps counter resets to zero every time I use load, even though I know it does load the neural network (based on its performance level). Picking "relative" in TensorBoard helps a bit (at least it displays the lines chained side by side), but it still feels like the cumulative training success sometimes takes a brief but heavy fall before it recovers. I reckon that might be because it measures the training rate differently, as it thinks it's on step 0 again and not, say, 100k.
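
The suspicion about the training rate can be illustrated with a generic linear decay schedule. This is a sketch under assumptions, not ML-Agents' actual code: linear_decay is a hypothetical helper, and the initial rate (3e-4) and the maximum step count (500,000) are made-up values chosen only to show the effect of a reset step counter.

```python
def linear_decay(initial_lr, step, max_steps, final_lr=0.0):
    """Generic linear schedule: initial_lr at step 0, final_lr at max_steps."""
    fraction = min(max(step / max_steps, 0.0), 1.0)  # clamp progress to [0, 1]
    return initial_lr + (final_lr - initial_lr) * fraction


# Hypothetical settings, not taken from the thread.
INITIAL_LR = 3e-4
MAX_STEPS = 500_000

# Rate if training resumes at the step recorded in the checkpoint file.
lr_resumed = linear_decay(INITIAL_LR, 135_397, MAX_STEPS)
# Rate if the step counter is reset to zero on load.
lr_reset = linear_decay(INITIAL_LR, 0, MAX_STEPS)

print(f"resumed at step 135397: {lr_resumed:.3e}")
print(f"reset to step 0:        {lr_reset:.3e}")
```

Under these assumptions, a counter that restarts at zero puts the already-trained network back on the full initial rate, which is larger than the rate it had reached before stopping. That would be consistent with a brief drop after loading, though it does not confirm that this is the cause.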
     
  5. JPhilipp

    Joined:
    Oct 7, 2014
    Posts:
    56
  6. MaksChojniak

    Joined:
    Feb 3, 2022
    Posts:
    2
    When creating an agent with ML-Agents, everything went well for the first 9 million steps, but after that the agent became even worse than before. How can I go back to a certain step? My chart looks like this: https://ibb.co/0VpY1b4