Question time_horizon configuration settings for my case

Discussion in 'ML-Agents' started by mcdenyer, Jul 27, 2021.

  1. mcdenyer

    mcdenyer

    Joined:
    Mar 4, 2014
    Posts:
    48
    I have been working with ML-Agents on my 2D game, where the agent simply has to get from the start point to the finish.

    The mechanics in this game are such that your movement at the very start of the level can affect your position and velocity at the end of the level and at every moment in between (swinging/grappling into projectile motion, followed by more grappling/swinging that depends on the previous actions).

    Due to this mechanic I have been zeroing in on the time_horizon variable in the configuration and am trying to optimize this setting for my game.

    1. Am I correct in focusing on this parameter as the way to tell the agent how far back in its sequence of actions to look, given that in my game the agent's actions from as early as the very start of the level can affect its success much later in the level?

    2. Assuming my understanding of time_horizon is correct, how do I calculate the proper value for it, and will it need to be adjusted for levels of varying length?
     
  2. ruoping_unity

    ruoping_unity

    Unity Technologies

    Joined:
    Jul 10, 2020
    Posts:
    134
    time_horizon affects the value estimate (how you calculate the expected total future reward) during training; a more detailed explanation is here. It should be set to a number large enough to capture all the important behavior within a sequence of actions. However, this config doesn't explicitly enable "memory" for the agents, if that's what you're looking for. If you want to tell the agent explicitly how many previous steps to consider, you would probably want to look at the LSTM settings. Be cautious: if you enable LSTM with too large a sequence_length, training can be very slow and very unstable.
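
    To illustrate the point about the value estimate, here is a minimal Python sketch of what cutting a trajectory at time_horizon does to the return targets. It is not ML-Agents code: the function names, the toy reward sequence, and gamma = 0.5 (picked only to keep the arithmetic easy) are all assumptions for illustration, and the PPO trainer additionally uses GAE (the lambd setting), which is left out here. The idea is that when an episode is longer than time_horizon, the reward that lies beyond the cutoff is replaced by the critic's value estimate at the cutoff.

```python
from typing import List


def split_into_chunks(rewards: List[float], time_horizon: int) -> List[List[float]]:
    """Cut one episode's rewards into consecutive chunks of at most time_horizon steps."""
    return [rewards[i:i + time_horizon] for i in range(0, len(rewards), time_horizon)]


def bootstrapped_returns(rewards: List[float], value_at_cutoff: float,
                         gamma: float) -> List[float]:
    """Discounted return for each step of a chunk.

    value_at_cutoff stands in for everything that happens after the chunk ends:
    0.0 if the episode really ended there, otherwise the critic's estimate
    (a hypothetical number in this sketch).
    """
    returns = [0.0] * len(rewards)
    running = value_at_cutoff
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Toy episode: six steps, the only reward is for reaching the finish.
episode = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
first, second = split_into_chunks(episode, time_horizon=3)

# The second chunk ends with the episode, so nothing is bootstrapped.
print(bootstrapped_returns(second, value_at_cutoff=0.0, gamma=0.5))

# The first chunk never sees the finish reward directly. Its targets depend
# entirely on how good the value estimate at the cutoff is.
print(bootstrapped_returns(first, value_at_cutoff=0.0, gamma=0.5))   # untrained critic
print(bootstrapped_returns(first, value_at_cutoff=0.25, gamma=0.5))  # accurate critic
```

    With an accurate value estimate at the cutoff, the early steps get the same targets they would get from the full episode; with a poor one, they get no signal from the finish reward until the critic improves. That is the sense in which a larger time_horizon captures more of the important behavior, and it is separate from giving the policy memory of past observations.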
     
  3. mcdenyer

    mcdenyer

    Joined:
    Mar 4, 2014
    Posts:
    48
    The definition of time_horizon says: "This number should be large enough to capture all the important behavior within a sequence of an agent's actions." This is what made me focus on time_horizon, because due to the mechanics of my game everything is sequential in terms of projectile motion.

    When you initiate a new swing, the agent depends on its previous swings, because they determine its location and velocity going into the next swing. My intuition is that this is not so much explicit memory as a sequence of actions, and that LSTM is more appropriate for a puzzle where, for example, you choose option A at point X and need to remember that when you get to point Y so you can choose option B. Does my intuition seem accurate, or am I still off?

    Are there example ml-agents projects that demonstrate the use of varying time_horizon and LSTM settings?