
Question about Stacking observation order

Discussion in 'ML-Agents' started by shizsun0609tw, Mar 9, 2022.

  1. shizsun0609tw

    shizsun0609tw

    Joined:
    Nov 1, 2018
    Posts:
    2
    Hi,
    I checked the doc on stacking observations before using it to collect visual observations in Python.

    https://github.com/Unity-Technologies/ml-agents/blob/main/docs/Learning-Environment-Design-Agents.md

    #### Stacking
    Stacking refers to repeating observations from previous steps as part of a
    larger observation. For example, consider an Agent that generates these
    observations in four steps

    step 1: 0.1
    step 2: 0.2
    step 3: 0.3
    step 4: 0.4

    If we use a stack size of 3, the observations would instead be:

    step 1: [0.1, 0.0, 0.0]
    step 2: [0.2, 0.1, 0.0]
    step 3: [0.3, 0.2, 0.1]
    step 4: [0.4, 0.3, 0.2]

    The doc says the stacking order is (t, t-1, t-2, ...), but the order I get in Python is (..., t-2, t-1, t).
    I checked StackingSensors.cs in the package, and I think the order there is actually (..., t-2, t-1, t) as well.
    I'm confused about the order the doc describes. Does anyone have the same question?
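
    For reference, here is a minimal Python sketch of the two orderings in question (illustrative only, with made-up function names, not the actual package code):

    from collections import deque

    def stack_newest_first(history, stack_size):
        # Doc order: (t, t-1, t-2, ...), zero-padded at the end for missing steps
        padded = list(history)[::-1] + [0.0] * (stack_size - len(history))
        return padded[:stack_size]

    def stack_newest_last(history, stack_size):
        # Order I see in Python: (..., t-2, t-1, t), zero-padded at the front
        padded = [0.0] * (stack_size - len(history)) + list(history)
        return padded[-stack_size:]

    history = deque(maxlen=3)
    for step, obs in enumerate([0.1, 0.2, 0.3, 0.4], start=1):
        history.append(obs)
        print(step, stack_newest_first(history, 3), stack_newest_last(history, 3))
    # step 1: [0.1, 0.0, 0.0]  vs  [0.0, 0.0, 0.1]
    # step 4: [0.4, 0.3, 0.2]  vs  [0.2, 0.3, 0.4]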
     
  2. ChillX

    ChillX

    Joined:
    Jun 16, 2016
    Posts:
    145
    The order in which the stacked observations are sent to Python may not necessarily match the order in which Python processes them. It could be that the tensors are stacked differently on the Python side, possibly because of internal implementation details such as support for hyper conditioning, etc. No idea exactly why.

    Having said that, either way, the order in which they are stacked has little if any impact on the neural network itself.
    You could take an input tensor in PyTorch, slice it into multiple parts, and then concat it back in a totally different order, and it would make zero difference as long as the layout remains consistent throughout.
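
    A quick PyTorch sketch of that point (the sizes here are arbitrary):

    import torch

    # A "stacked" observation of 3 frames of size 4, e.g. (t, t-1, t-2)
    obs = torch.arange(12, dtype=torch.float32)
    frames = obs.split(4)                  # slice into the 3 stacked frames
    reordered = torch.cat(frames[::-1])    # concat back as (t-2, t-1, t)

    # A layer can learn either layout equally well, as long as the layout
    # stays the same for every sample it ever sees.
    layer = torch.nn.Linear(12, 8)
    out_a = layer(obs)
    out_b = layer(reordered)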

    Note: I find that memory-enhanced agents usually give better results than stacking. Consider that stacking is brute-force memory, while the LSTM from the memory-enhanced feature is smart memory.
     
    shizsun0609tw likes this.
  3. shizsun0609tw

    shizsun0609tw

    Joined:
    Nov 1, 2018
    Posts:
    2
    So the stacked observations may end up in a different order once I wrap the environment as a Gym environment.
    The stacked observations only keep the order from the doc when using mlagents directly.

    I used mlagents to build the environment, wrapped it as a gym environment, and then used another imitation learning package (ilpyt) to train it. I want to build the neural network model with a Transformer.

    In the end, I think I need to pack the observations in the correct order myself. Thanks for your reply!
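
    For anyone hitting the same issue, a rough sketch of reversing the stacked frames yourself (this assumes a flat stacked vector and that you know stack_size and obs_size; the names are placeholders, not part of any package API):

    import numpy as np

    def reverse_stacked(obs, stack_size, obs_size):
        # obs is assumed to be a 1-D array of length stack_size * obs_size,
        # laid out frame by frame; the result has the frames in reverse order.
        frames = np.asarray(obs).reshape(stack_size, obs_size)
        return frames[::-1].reshape(-1)

    # Example: 3 stacked frames of 2 values each
    obs = np.array([0.4, 0.4, 0.3, 0.3, 0.2, 0.2])
    print(reverse_stacked(obs, stack_size=3, obs_size=2))
    # -> [0.2 0.2 0.3 0.3 0.4 0.4]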