Question Why does Inference act differently than Training?

Discussion in 'ML-Agents' started by ohclipe, Mar 6, 2024.

  1. ohclipe

    I am trying to teach an agent to follow markings on the floor, which should be simple enough. However, the resulting policy is going to be used in the real world, with a real sensor (a camera). Because of that, I didn't use Unity's physics to simulate the episodes; instead, I wrote code that computes the expected outcome of each action. It should work in this order:
    1 - Get sensory input.
    2 - Decide which action to take.
    3 - Literally teleport to next state.
    4 - Repeat.
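    To make that loop concrete, here is a minimal Unity-free sketch. It is not my actual environment code: `TeleportLoop`, `NextState`, and the one-dimensional offset state are hypothetical names and simplifications chosen only for illustration.

```csharp
using System;

// Hypothetical, Unity-free sketch of the loop above. The state is a single
// lateral offset from the floor marking, and "teleporting" means the next
// state is computed directly instead of being simulated by a physics engine.
public static class TeleportLoop
{
    // Step 3: compute the next state directly from the chosen action.
    public static double NextState(double offset, double action)
    {
        return offset + action;
    }

    // Steps 1-4: observe, decide, teleport, and repeat for a fixed number of steps.
    public static double Run(double startOffset, Func<double, double> policy, int steps)
    {
        double offset = startOffset;
        for (int i = 0; i < steps; i++)
        {
            double observation = offset;          // 1 - get sensory input
            double action = policy(observation);  // 2 - decide which action to take
            offset = NextState(offset, action);   // 3 - teleport to the next state
        }                                         // 4 - repeat
        return offset;
    }
}
```

    For example, `TeleportLoop.Run(1.0, o => -0.5 * o, 3)` halves the offset on every step and returns 0.125.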
    The agent performs perfectly during training, but when I export the .onnx and test it, the agent's behavior is almost completely different from training (training reaches 100% accuracy, inference only ~30%).

    While trying to solve this issue, I read on this forum that the problem might come down to timing, and possibly only timing, so I tried to control the Academy steps by calling them manually. For that I created Overseer.cs.
    Code (CSharp):
    using Unity.MLAgents;
    using UnityEngine;

    public class Overseer : MonoBehaviour
    {
        // Singleton instance shared by all agents in the scene.
        public static Overseer Instance { get; private set; }
        private int _mAgents = 0; // number of registered agents
        private int _mReady = 0;  // number of agents ready for the next step

        private void Awake()
        {
            if (Instance == null)
            {
                Instance = this;
            }
        }

        public void Update()
        {
            if (Instance == this)
            {
                Debug.Log("agents=" + Instance._mAgents + " ready=" + Instance._mReady);
                // Step the Academy only once every agent has reported ready.
                if (Instance._mReady == Instance._mAgents)
                {
                    Instance._mReady = 0;
                    Academy.Instance.EnvironmentStep();
                }
            }
        }

        public void AddAgent()
        {
            Instance._mAgents += 1;
        }

        public void ReadyAgent()
        {
            Instance._mReady += 1;
        }
    }
    For context, here are the agent's Start method and the end of OnActionReceived:
    Code (CSharp):
    private void Start()
    {
        // Disable automatic stepping so that Academy steps are called manually.
        Academy.Instance.AutomaticSteppingEnabled = false;
        RequestDecision();
        overseer.AddAgent();
        overseer.ReadyAgent();
    }

    public override void OnActionReceived(ActionBuffers actions)
    {
        // code

        RequestDecision();
        overseer.ReadyAgent();
    }
    With the DecisionRequester's Decision Period set to 2, so that two Academy.Instance.EnvironmentStep() calls are required for the agent to move, I see the following:

    • Running inference in my 8-agent scene, the console logs "agents=8 ready=8" every time Update is called.
    • Running training, the console logs "agents=8 ready=8" once (at the start, all agents are ready) and then endlessly prints "agents=8 ready=0", as expected.
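    The two log patterns can be reproduced with a small Unity-free model of the Overseer counter. This is only a sketch of my counter logic, not of ML-Agents internals: `BarrierModel` and the `agentsReportDuringStep` flag are hypothetical, and whether agents really call ReadyAgent() again before EnvironmentStep() returns in inference is an assumption I have not verified.

```csharp
using System;
using System.Collections.Generic;

// Unity-free model of the Overseer counter. All names are hypothetical;
// this does not reproduce ML-Agents internals.
public class BarrierModel
{
    private readonly int _agents;
    private int _ready;

    public int StepsTaken { get; private set; }
    public List<string> Log { get; } = new List<string>();

    public BarrierModel(int agents)
    {
        _agents = agents;
        _ready = agents; // every agent reports ready once in Start()
    }

    // Models one Update() call. agentsReportDuringStep models whether every
    // agent calls ReadyAgent() again before the step returns (an assumption).
    public void Update(bool agentsReportDuringStep)
    {
        Log.Add("agents=" + _agents + " ready=" + _ready);
        if (_ready == _agents)
        {
            _ready = 0;
            StepsTaken++; // stands in for Academy.Instance.EnvironmentStep()
            if (agentsReportDuringStep)
            {
                _ready = _agents;
            }
        }
    }
}
```

    Calling `Update(true)` repeatedly on `new BarrierModel(8)` logs "agents=8 ready=8" every time, which matches what I see in inference. Calling `Update(false)` repeatedly logs "agents=8 ready=8" once and then "agents=8 ready=0", which matches what I see in training.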

    What's going on? Why does inference apparently not care about Academy steps? And what could I do to stabilize inference (its actions look quite random compared to training)?

    Thank you for reading!