Unity ML && Python

Discussion in 'ML-Agents' started by unity_5kcJqPVlb9nWrA, Apr 18, 2021.

  1. unity_5kcJqPVlb9nWrA

    Joined:
    Apr 15, 2020
    Posts:
    27
    Hello, Unity Team!

    I have started playing a bit with Unity ML and I have two questions. First of all, I apologize if this is not the right place to ask them; I couldn't find a better one, so feel free to redirect me if this isn't appropriate.

    Question 1. The Unity ML package talks to a Python script, and that Python script uses PyTorch for the computational graph and sends information back to Unity. What is the IPC mechanism here: shared memory or kernel message passing? How can I find out? I think it is shared memory, because going through the kernel would be much slower. Presumably, when we start training, the kernel allocates a block that serves as the shared memory between our two processes?

    Question 2. The Unity documentation says to run several training instances at the same time when training, for example 9 instead of 1. Why? How does this speed things up at all? We are not using multiple threads or processes; aren't we just searching 9 times for an optimal policy? I have trouble understanding why that would be faster.

    Wishing you a great and productive day/night!

    Ivailo.M
     
  2. ruoping_unity

    Unity Technologies

    Joined:
    Jul 10, 2020
    Posts:
    134
    First of all, this is the best place for asking ML-Agents questions :)

    To your questions:
    1. The communicator uses gRPC to send and receive messages, so no, it's not shared memory.

    2. Training time is spent on two major parts: simulation (running the game with the current model to collect data) and update (updating the neural network model using the data collected in the simulation stage). In reinforcement learning the former is even more of a bottleneck, since with one game instance there is no way to parallelize it: you have to wait for the game episodes to run and finish until you have enough data for training. So the logic is that with N training instances running at the same time instead of 1, we can collect data about N times faster, hence the speed-up. During the update, we use the data from all training instances to update the model once, just as with 1 training instance.
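
The argument in point 2 can be illustrated with a minimal sketch in plain Python. This is not the ML-Agents API: `ToyEnv`, `collect`, and `update` are hypothetical names, the "environment" is a random-number stub, and a "tick" is assumed to cost about the same no matter how many instances are stepped in it (only approximately true in a real scene). Under those assumptions, N instances fill the same buffer in about 1/N as many ticks, and the model is still updated once on the pooled data.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for one training instance; not an ML-Agents class."""

    def __init__(self, seed):
        self.rng = random.Random(seed)

    def step(self):
        # One simulation step yields one (observation, reward) sample.
        return (self.rng.random(), self.rng.random())


def collect(num_envs, buffer_size):
    """Step every instance once per tick until the buffer is full.

    Returns the pooled buffer and the number of ticks, used here as a
    rough proxy for wall-clock simulation time.
    """
    envs = [ToyEnv(seed=i) for i in range(num_envs)]
    buffer = []
    ticks = 0
    while len(buffer) < buffer_size:
        for env in envs:
            buffer.append(env.step())
        ticks += 1
    return buffer[:buffer_size], ticks


def update(buffer):
    """Placeholder for a single model update on the pooled data."""
    return sum(reward for _, reward in buffer) / len(buffer)


if __name__ == "__main__":
    for n in (1, 9):
        data, ticks = collect(num_envs=n, buffer_size=900)
        print(f"{n} instance(s): {ticks} ticks, one update on {len(data)} samples")
```

With a buffer of 900 samples, one instance needs 900 ticks and nine instances need 100, while `update` runs once in both cases. The nine instances are not nine separate searches for a policy; they all feed experience to the same model.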

    Hope that answers your questions.
     
  3. unity_5kcJqPVlb9nWrA

    Joined:
    Apr 15, 2020
    Posts:
    27
    Oh my! This is the best support I have ever seen.

    Thank you, ruoping_unity. I appreciate it a lot!
     