Policy in Inference mode

Discussion in 'ML-Agents' started by alowcosta96, Jan 4, 2021.

  1. alowcosta96

    Good morning.

    I trained a model with SAC. Training converged and everything looks good: my agent completes its task in about 99.5% of cases. I wanted to run the model in inference mode to see whether it reaches a higher success rate.

    In inference mode, which action does the policy network choose? During training we train a stochastic policy, but does inference mode produce a deterministic action? If so, which one: the mean of the distribution described by the stochastic policy, or its mode?

    Thanks a lot
     
  2. ervteng_unity

    Unity Technologies

    Inference works exactly the same way as training: it uses a stochastic policy, so the action is sampled from the distribution rather than replaced by its mean or mode. This is mainly so we don't get different or unexpected behavior when we export the model to use in a game.
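
    To illustrate the difference being discussed, here is a minimal Python sketch. It is not ML-Agents code: it assumes a one-dimensional continuous action drawn from a tanh-squashed Gaussian (the usual SAC formulation), and the names sample_action, deterministic_action, mean and log_std are hypothetical, chosen only for this example.

```python
import math
import random


def sample_action(mean, log_std, rng):
    """Stochastic choice: draw from N(mean, std^2), then squash into (-1, 1)."""
    std = math.exp(log_std)
    raw = rng.gauss(mean, std)
    return math.tanh(raw)


def deterministic_action(mean):
    """Deterministic alternative: squash the Gaussian mean, with no sampling."""
    return math.tanh(mean)


# Hypothetical policy output for one observation (assumed values).
mean, log_std = 0.5, -1.0
rng = random.Random(0)

# Same observation five times: sampled actions vary, the deterministic one does not.
samples = [sample_action(mean, log_std, rng) for _ in range(5)]
fixed = [deterministic_action(mean) for _ in range(5)]

print("sampled:      ", samples)
print("deterministic:", fixed)
```

    Under these assumptions, the answer above corresponds to the sample_action path: the same observation can yield different actions from one step to the next, both in training and in inference.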