Loading an ML-Agents Trained Frozen Graph Into TensorFlow

Discussion in 'ML-Agents' started by billwaahau, Jul 15, 2020.

  1. billwaahau


    Jan 14, 2017
    Hi, I am trying to load a neural network trained with ML-Agents back into Python to run inference. I assume the TensorFlow frozen graph is the best approach? However, I am running into some difficulty, since I am not sure what some of the items in the graph are.

    By experiment, these are the input and output tensors. I am not exactly sure what action_masks is, but TensorFlow keeps giving me an error unless I include it.

    Code (Python):
    # Input tensors: the observation vector and the action mask
    input0 = graph.get_tensor_by_name('prefix/vector_observation:0')
    input1 = graph.get_tensor_by_name('prefix/action_masks:0')
    # Output tensor: the action values
    output = graph.get_tensor_by_name('prefix/action:0')


    My original neural network has 52 observations and the output has 2 branches with 3 possibilities each.

    Input = [52 items]
    Output = [ 0, 1 or 2 ] ; [ 0, 1 or 2 ] = [out1] ; [out2]

    Code:
    Vertical Movement   = [0, 1, 2]    0 - no action    1 - forward      2 - backward
    Horizontal Movement = [0, 1, 2]    0 - no action    1 - turn left    2 - turn right

    This is the code I used to run inference. I am not sure how to make sense of the action mask and the output.

    Code (Python):
    # x and x1 are the vector_observation and action_masks tensors fetched above;
    # output is the action tensor
    with tf.Session(graph=graph) as sess:
        y_out = sess.run(output, feed_dict={
            x: [[0, 0, 30, 50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50, False, False, False, False, False, False,True,False,False,False, True, False, False]],
            x1: [[0,1,2,0,1,2]]
        })
    Output:
    array([[-1.6118095e+01, -1.5712630e+01,  1.1920928e-07, -1.6118095e+01,
           -1.7318338e-02, -4.0646248e+00]], dtype=float32)

    Any idea how to make this work? Thanks!
  2. vincentpierre


    May 5, 2017
    Running trained models outside of Unity is not a supported feature. These models are made to work with the Unity Inference Engine, which is why they are hard to read from Python.
    Action masks are for masking some of the actions. The values can be zeros or ones. If you don't want to mask anything, make this tensor all zeros (or all ones, I do not remember).
    Regarding the actions: for legacy reasons, the output corresponds to the log probabilities of the actions for each branch. Since you have 2 branches with 3 possibilities each, the first 3 numbers are the log probabilities of the first branch and the last 3 are those of the second.
    To sample from them, you need to exponentiate them and sample using a multinomial distribution. For example:

    Code:
    array([[-1.6118095e+01, -1.5712630e+01,  1.1920928e-07, -1.6118095e+01, -1.7318338e-02, -4.0646248e+00]], dtype=float32)

    means that the first branch has log probabilities -1.6118095e+01, -1.5712630e+01, 1.1920928e-07.
    Note that exp(-1.6118095e+01) + exp(-1.5712630e+01) + exp(1.1920928e-07) ≈ 1.
    The second branch has log probabilities -1.6118095e+01, -1.7318338e-02, -4.0646248e+00.
    Also note that exp(-1.6118095e+01) + exp(-1.7318338e-02) + exp(-4.0646248e+00) ≈ 1.

    If we look only at the first branch, -1.6118095e+01, -1.5712630e+01, 1.1920928e-07 become 1.00000065e-7, 1.50000081e-7 and approximately 1.00000012 (that is, 1 up to float32 rounding) after exponentiation, so for the first branch the selected action is 2 (with very high certainty).
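
The steps described above (split the output per branch, exponentiate, then sample) can be sketched with NumPy roughly as follows. This is only an illustration under stated assumptions, not part of ML-Agents: the helper names `split_branches`, `sample_actions` and `greedy_actions` are made up for this example, `branch_sizes = [3, 3]` is taken from the question, and the renormalization step is an added guard against float32 rounding.

```python
import numpy as np


def split_branches(log_probs, branch_sizes):
    """Split a flat log-probability vector into one array per action branch."""
    log_probs = np.asarray(log_probs, dtype=np.float64)
    split_points = np.cumsum(branch_sizes)[:-1]
    return np.split(log_probs, split_points)


def sample_actions(log_probs, branch_sizes, rng):
    """Sample one action index per branch from the exponentiated log probabilities."""
    actions = []
    for branch in split_branches(log_probs, branch_sizes):
        probs = np.exp(branch)
        probs /= probs.sum()  # assumption: renormalize to absorb rounding error
        actions.append(int(rng.choice(len(probs), p=probs)))
    return actions


def greedy_actions(log_probs, branch_sizes):
    """Pick the most likely action per branch instead of sampling."""
    return [int(np.argmax(b)) for b in split_branches(log_probs, branch_sizes)]


# Output array from the question: 2 branches with 3 possibilities each.
y_out = np.array([[-1.6118095e+01, -1.5712630e+01, 1.1920928e-07,
                   -1.6118095e+01, -1.7318338e-02, -4.0646248e+00]],
                 dtype=np.float32)
branch_sizes = [3, 3]

rng = np.random.default_rng(0)
sampled = sample_actions(y_out[0], branch_sizes, rng)
greedy = greedy_actions(y_out[0], branch_sizes)
print("sampled:", sampled)
print("greedy:", greedy)  # greedy: [2, 1]
```

Sampling matches what was described in the reply; the greedy variant is an alternative if deterministic behaviour is preferred at inference time. With the mapping from the question, the greedy result corresponds to "backward" on the vertical branch and "turn left" on the horizontal branch.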