Policy in Inference mode

Discussion in 'ML-Agents' started by alowcosta96, Jan 4, 2021.

  1. alowcosta96


    Nov 26, 2020
    Good morning.

    I trained a model with SAC. Training converged and everything looks good: my agent completes its task in around 99.5% of cases. I wanted to try the model in inference mode to see whether it reaches a higher success rate.

    In inference mode, which action does the policy network choose? During training we train a stochastic policy, but does inference mode produce a deterministic action? If so, which one: the mean of the distribution described by the stochastic policy, or its mode?

    Thanks a lot
  2. ervteng_unity


    Unity Technologies

    Dec 6, 2018
    Inference is exactly the same as training: it uses a stochastic policy, so actions are still sampled from the policy's distribution. This is mainly so we don't get different or unexpected behavior when we export the model to use in a game.
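
To make the distinction in this thread concrete, here is a minimal sketch of the two options for a continuous-action policy. It assumes a tanh-squashed Gaussian policy, which is common for SAC. It is not the ML-Agents implementation, and the names `sample_action`, `deterministic_action`, `mean`, and `log_std` are hypothetical, chosen only for illustration.

```python
import math
import random


def sample_action(mean, log_std, rng):
    """Stochastic action: draw from N(mean, std^2), then squash into (-1, 1)."""
    std = math.exp(log_std)
    raw = rng.gauss(mean, std)
    return math.tanh(raw)


def deterministic_action(mean):
    """Deterministic alternative: squash the Gaussian mean without sampling."""
    return math.tanh(mean)


# Hypothetical policy-network outputs for a single observation (assumed values).
mean, log_std = 0.5, -1.0
rng = random.Random(0)  # seeded so the run is reproducible

# Stochastic policy: repeated calls on the same observation give different actions.
stochastic = [sample_action(mean, log_std, rng) for _ in range(5)]

# Deterministic policy: the same observation always gives the same action.
greedy = deterministic_action(mean)

print("sampled:", stochastic)
print("deterministic:", greedy)
```

According to the reply above, the exported model behaves like `sample_action` here, so repeated runs from the same state can produce different actions.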