Question Varying System Resource Utilization Results in Different Learning Curves

Discussion in 'ML-Agents' started by batuaytemiz, May 14, 2021.

  1. batuaytemiz

    Here is the environment I am working on to give you an idea:
    Jump.gif (now the jump flying is fixed :D)

    I realized during a hyperparameter sweep that the number of agents and Time.timeScale affect the final results immensely, with reward performance suffering when either one is too large.

    This suggested to me that resource constraints on the machine might affect training. That went against my intuition, so I decided to test it.

    Below are two groups of runs. SAC_SingleProcess means that I am running only one Python training instance at a time (3 seeds are run sequentially), and SAC_MultiProcessOverTaxes is where I am running 8 Python instances in parallel. The environment and the hyperparameters are identical.
    upload_2021-5-14_14-22-57.png
    _MeanTrainingSuccessRate is the average success (defined as agent reaching the goal instead of timing out) of the last 250 episodes, and the final success rate is the average success rate of 1000 episodes with deterministic sampling. The timescale is set to a reasonable 5, and there are 8 agents in the scene.
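
    To make the training metric concrete, here is a minimal sketch of a rolling success rate over the most recent episodes. It is only an illustration under assumptions: the class `RollingSuccessRate` and its methods are hypothetical names, not part of ML-Agents, and an episode counts as a success when the agent reached the goal rather than timing out.

```python
from collections import deque


class RollingSuccessRate:
    """Mean success over the most recent `window` episodes (hypothetical helper)."""

    def __init__(self, window: int = 250) -> None:
        # A bounded deque silently discards the oldest outcome once it is full.
        self._outcomes = deque(maxlen=window)

    def record(self, reached_goal: bool) -> None:
        # 1.0 for reaching the goal, 0.0 for timing out.
        self._outcomes.append(1.0 if reached_goal else 0.0)

    def value(self) -> float:
        # Report 0.0 until at least one episode has been recorded.
        if not self._outcomes:
            return 0.0
        return sum(self._outcomes) / len(self._outcomes)
```

    Calling `record()` at the end of every episode and logging `value()` would give a curve comparable in spirit to the one plotted above, assuming the window is 250.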

    I am not using a physics-based character controller (I know the physics simulation can go wild at high timescales). Instead, my agent uses the controller provided in the MicroFPS template, which in turn uses CharacterController.Move().

    Based on this observation I have a few questions:
    - Is it possible that some of the DecisionRequests are being dropped, for example because the Python process is too slow? I believe the UnityEnv waits for the response from the Python side before continuing the simulation, but I want to confirm my understanding.

    - Is it possible that CharacterController.Move() acts unexpectedly when the process is resource-starved?

    - Should I be multiplying characterVelocity by Time.deltaTime when calling
    m_Controller.Move(characterVelocity * Time.deltaTime); as shown here? (And why is gravity multiplied twice...?)

    - Any other aspects that you think I should check?
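
    A small sketch of the arithmetic behind the third question: gravity is an acceleration, so it is multiplied by the time step once to update the velocity, and the velocity is multiplied by the time step again to get a displacement, which is presumably why the time step appears twice. The Python below is not Unity code. The function `simulate_fall` and its parameters are made up for illustration, and it uses plain semi-implicit Euler integration rather than whatever the MicroFPS controller actually does.

```python
def simulate_fall(duration: float, dt: float, gravity: float = -9.81) -> float:
    """Return the height change after `duration` seconds using steps of size `dt`."""
    velocity = 0.0
    height = 0.0
    steps = round(duration / dt)
    for _ in range(steps):
        velocity += gravity * dt  # acceleration -> velocity (first multiplication by dt)
        height += velocity * dt   # velocity -> displacement (second multiplication by dt)
    return height


if __name__ == "__main__":
    # The same one-second fall, integrated with a small and a large step.
    for dt in (0.02, 0.1):
        print(dt, simulate_fall(1.0, dt))
```

    The two step sizes give noticeably different heights after the same simulated second, so if the effective time step varies between runs, the agent's motion could differ too. Whether that is what happens here is only a guess.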

    Thank you for the cool framework and sorry for the flurry of questions!
     
  2. batuaytemiz

    Further experiments confirmed that no frames are being dropped, but the problem isn't solved. If anyone has had a similar experience, I would be glad to hear about it!