
Question Buffer Size and Sensor observations

Discussion in 'ML-Agents' started by seboz123, Aug 14, 2020.

  1. seboz123

    seboz123

    Joined:
    Mar 7, 2020
    Posts:
    24
    Hi,

I've got two separate questions:

First one regarding PPO: why does the buffer size matter for learning?
For example: buffer size 10k and batch size 1k -> the model is updated with 10 minibatches once 10k experiences are collected. But with buffer size 1k and batch size 1k -> the model just gets updated more frequently (every 1k steps, so after 10k steps the same 10k different observations should also have been used for updates)?
Also, why should you multiply it by num_envs? More parallel envs should only mean faster generation of samples, not anything different about the buffer.

Second question is regarding the observations in the Agent script. Is there a way to access the observations of the VectorSensor component? I can add them in the CollectObservations function, but can I access them in a different function?

    Thanks in advance!
     
  2. jeffrey_unity538

    jeffrey_unity538

    Unity Technologies

    Joined:
    Feb 15, 2018
    Posts:
    59
hi seboz123. Buffer size matters in order to store enough experience for training. Depending on the scenario, if you need to train on more "recent" experience, then lowering the buffer size might make more sense. And vice versa. However, there is a memory (and hence training-speed) tradeoff if the buffer sizes are really large. Larger batches may reduce the number of training steps required for the behavior to converge to a minimum, but a large buffer could contain old experiences which are no longer useful for updating the agent's policy, so more training steps may be required.
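For reference, both settings live in the trainer config. A sketch in the newer behaviors-style YAML (the behavior name and values here are illustrative, not from this thread):

```yaml
behaviors:
  MyBehavior:            # hypothetical behavior name
    trainer_type: ppo
    hyperparameters:
      batch_size: 1024   # size of each gradient-update minibatch
      buffer_size: 10240 # experiences collected before an update;
                         # each update shuffles the buffer and runs
                         # num_epoch passes of buffer_size / batch_size
                         # minibatches, then the buffer is discarded
      num_epoch: 3
```

So with buffer_size 10k the policy trains on minibatches drawn from one large, shuffled, mostly on-policy batch, whereas buffer_size 1k updates ten times as often on much smaller (and noisier) batches.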

    Checking on your second question.
     
  3. jeffrey_unity538

    jeffrey_unity538

    Unity Technologies

    Joined:
    Feb 15, 2018
    Posts:
    59
    for your 2nd question, are you trying to access this via the heuristic function?
     
  4. seboz123

    seboz123

    Joined:
    Mar 7, 2020
    Posts:
    24
    Thanks for your fast response.
Thanks for the clarification! So the buffer does store all experiences, right? Not only finished trajectories.
No, I was trying to access it in the OnActionReceived function -> I want to create a text overlay with all the observations, so I can see them during training. It doesn't really matter to me which function I access them in, though. I could store them manually when setting them in the CollectObservations function, but I was looking for a more elegant solution.
     
  5. Luke-Houlihan

    Luke-Houlihan

    Joined:
    Jun 26, 2007
    Posts:
    303
There is a function called GetObservations() that you can call on the agent. Here's the docs description of it:

ReadOnlyCollection<float> GetObservations()
Returns a read-only view of the observations that were generated in CollectObservations(VectorSensor).

This is mainly useful inside of a Heuristic(float[]) method to avoid recomputing the observations.
Returns: A read-only view of the observations list.
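For the overlay use case from post #4, something like the following should work (a minimal sketch; DebugAgent and overlayText are made-up names, and the overlay assumes a UnityEngine.UI.Text component assigned in the Inspector):

```csharp
using System.Collections.Generic;
using System.Text;
using Unity.MLAgents;
using UnityEngine;

public class DebugAgent : Agent
{
    // Hypothetical UI text field used to display the observations.
    public UnityEngine.UI.Text overlayText;

    public override void OnActionReceived(float[] vectorAction)
    {
        // GetObservations() returns the values collected in the most
        // recent CollectObservations(VectorSensor) call, so nothing
        // extra needs to be stored in CollectObservations itself.
        IReadOnlyList<float> obs = GetObservations();

        var sb = new StringBuilder();
        for (int i = 0; i < obs.Count; i++)
        {
            sb.AppendLine($"obs[{i}] = {obs[i]:F3}");
        }

        if (overlayText != null)
        {
            overlayText.text = sb.ToString();
        }

        // ... act on vectorAction as usual ...
    }
}
```

Since CollectObservations runs before OnActionReceived each decision step, the overlay always shows the observations the current action was computed from.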