What is the meaning of "network_settings -> normalize" in Configurations

Discussion in 'ML-Agents' started by biu_biubiu, Sep 29, 2020.

  1. biu_biubiu

    Joined:
    Dec 24, 2017
    Posts:
    5
    For example, in the Walker environment, if normalization is set to false, the agent learns nothing. In the C# code, the input is not scaled to between 0 and 1, so the normalization must be done by network_settings -> normalize, right? That's where I'm confused.
    If normalize is true, does that mean the network starts with

    torch.nn.BatchNorm1d(obs_features)
    torch.nn.ReLU() ?

    I'm using my own RL code and am stuck here. If it means BatchNorm1d(), then because the network has to interact with the environment, it only receives one observation at a time, so the running mean and variance computed this way will have a very large error, especially at the beginning. I tried it without success.

    Did I misunderstand, or does the official code first take random actions to get a more reasonable running mean and variance? (I didn't fully understand the official code.)
     
  2. henrypeteet

    Unity Technologies

    Joined:
    Aug 19, 2020
    Posts:
    37
    Thanks for the question. The normalization we use is based on a running average across all previous data in a session (not just one batch). I have included some links below to where and how the normalization is done but please let me know if you have other questions.

    Config docs
    From: https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Training-Configuration-File.md
    "normalization is based on the running average and variance of the vector observation".

    Specific implementations

    Tensorflow: https://github.com/Unity-Technologi...ents/mlagents/trainers/tf/models.py#L230-L273

    Torch: https://github.com/Unity-Technologi...ts/mlagents/trainers/torch/encoders.py#L8-L43

    When we process the data
    [PPO] https://github.com/Unity-Technologi...l-agents/mlagents/trainers/ppo/trainer.py#L70
    [SAC] https://github.com/Unity-Technologi...-agents/mlagents/trainers/sac/trainer.py#L133
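
    To illustrate the idea of normalizing with statistics accumulated over all previous data rather than a single batch, here is a minimal sketch in plain Python. It is not the ML-Agents implementation: the class name RunningNormalizer, the clip range of 5.0, and the eps term are assumptions made for this example, and it uses Welford's online algorithm to update the mean and variance one observation at a time.

```python
import math


class RunningNormalizer:
    """Hypothetical running mean/variance normalizer (Welford's algorithm)."""

    def __init__(self, size, clip=5.0, eps=1e-8):
        self.count = 0                # number of observations seen so far
        self.mean = [0.0] * size      # running mean per feature
        self.m2 = [0.0] * size        # running sum of squared deviations
        self.clip = clip              # assumed clip range, not from the thread
        self.eps = eps                # avoids division by zero

    def update(self, obs):
        """Fold one observation vector into the running statistics."""
        self.count += 1
        for i, x in enumerate(obs):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (x - self.mean[i])

    def normalize(self, obs):
        """Return (obs - mean) / std per feature, clipped to [-clip, clip]."""
        out = []
        for i, x in enumerate(obs):
            var = self.m2[i] / self.count if self.count > 0 else 1.0
            z = (x - self.mean[i]) / math.sqrt(var + self.eps)
            out.append(max(-self.clip, min(self.clip, z)))
        return out


# Usage: update on every observation received from the environment,
# then normalize before feeding the observation to the network.
norm = RunningNormalizer(size=2)
for obs in ([1.0, 10.0], [2.0, 20.0], [3.0, 30.0]):
    norm.update(obs)
scaled = norm.normalize([2.0, 20.0])
```

    Because the statistics are updated per observation and kept for the whole session, this works even when the environment only returns one observation per step, which is the case where a batch-statistics layer such as BatchNorm1d struggles.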
     
  3. biu_biubiu

    Joined:
    Dec 24, 2017
    Posts:
    5
    Thank you for your help. I see.
     
  4. TulioMMo

    Joined:
    Dec 30, 2020
    Posts:
    29
    During inference, is the "test data" normalized as well? Does the ONNX file keep a record of the mean and variance from the training data and use them? Thanks!