Question: Resume training from a specific checkpoint with PyTorch?

Discussion in 'ML-Agents' started by mbaske, Jan 7, 2021.

  1. mbaske

    Hi. With TensorFlow, I simply edited the checkpoints text file when I wanted to resume training from a specific checkpoint. What do I do when using PyTorch? (checkpoint.pt is a binary file, so I can't edit it the same way.) Thanks!
     
  2. celion_unity

    It looks like we always load from "checkpoint.pt", so you should be able to copy (or symlink) the checkpoint you want to resume from to that name.
    https://github.com/Unity-Technologi...ners/model_saver/torch_model_saver.py#L82-L83
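
The copy step described above could be scripted roughly as follows. This is a minimal sketch, not part of ML-Agents: the helper name `promote_checkpoint`, the example directory, and the checkpoint file name are hypothetical, and it assumes the saved checkpoints sit in the same folder as "checkpoint.pt". It keeps a backup of the current "checkpoint.pt" so the swap can be undone.

```python
import shutil
from pathlib import Path


def promote_checkpoint(behavior_dir, checkpoint_file, target_name="checkpoint.pt"):
    """Copy a chosen checkpoint over the file that is loaded on resume.

    behavior_dir    -- folder holding the .pt files (assumed layout)
    checkpoint_file -- name of the checkpoint to resume from
    target_name     -- file name the trainer loads, per the reply above
    """
    behavior_dir = Path(behavior_dir)
    source = behavior_dir / checkpoint_file
    target = behavior_dir / target_name

    if not source.is_file():
        raise FileNotFoundError(f"No such checkpoint: {source}")
    if source.resolve() == target.resolve():
        raise ValueError("Source and target are the same file")

    # Keep the previous "latest" checkpoint so the swap can be reverted.
    if target.exists():
        shutil.copy2(target, target.with_name(target_name + ".bak"))
    # Remove the old target (or a stale symlink) so the copy below
    # never writes through a link into another checkpoint file.
    if target.exists() or target.is_symlink():
        target.unlink()

    shutil.copy2(source, target)
    return target


# Hypothetical usage; adjust the folder and file name to your own run:
# promote_checkpoint("results/my_run/MyBehavior", "MyBehavior-500000.pt")
```

After the copy, resuming the run as usual (for example with the `--resume` option of `mlagents-learn` and the same run ID) should pick up the chosen checkpoint, assuming the loading behavior is as described in the reply.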

    We have MLA-1517 in our internal tracker to provide a better interface for this on the command line. It should be a pretty straightforward PR (I think) if anybody reading this wants to try to implement it.
     