Help Wanted Can't load trained model for inference

Discussion in 'ML-Agents' started by justkittenaround, May 26, 2021.

  1. justkittenaround

    Joined: Sep 28, 2020
    Posts: 19
    Alright, so I have done a lot of Unity ML-Agents projects and I haven't run into this bad boy... until today. I made a custom environment, very simple. The agent has to jump and collide with a target GameObject. Everything works well. I use heuristic, it works. I then trained 10 different setups: PPO, SAC, extrinsic vs. curiosity, with and without memory, etc. Spent two whole days figuring out the hyperparameters.

    I go to load any one of the 10 .onnx files into the Behavior Parameters, dragging and dropping the file in like usual. I get an immediate error, and the file icon doesn't switch to the model graphic; it still looks like a plain white sheet of paper. I can't drag it into the Model field, and I can't select it from the pop-up menu.

    Literally nothing has changed in the environment or any files since I finished training. Did this same process three days ago with a different project with no problems. Please help!

    This is the error I received:

    Code (CSharp):
    Asset import failed, "Assets/results/search.onnx" > OnnxImportException: Unexpected error while parsing layer 56 of type Gemm.
    Assertion failure. Values are not equal.
    Expected: 16512 == 16768

    Json: { "input": [ "55", "network_body.linear_encoder.seq_layers.0.weight", "network_body.linear_encoder.seq_layers.0.bias" ], "output": [ "56" ], "name": "Gemm_12", "opType": "Gemm", "attribute": [ { "name": "alpha", "f": 1, "type": "FLOAT" }, { "name": "beta", "f": 1, "type": "FLOAT" }, { "name": "transB", "i": "1", "type": "INT" } ] }
      at UnityEngine.Assertions.Assert.Fail (System.String message, System.String userMessage) [0x0003c] in /home/bokken/buildslave/unity/build/Runtime/Export/Assertions/Assert/AssertBase.cs:29
      at UnityEngine.Assertions.Assert.AreEqual[T] (T expected, T actual, System.String message, System.Collections.Generic.IEqualityComparer`1[T] comparer) [0x0004d] in /home/bokken/buildslave/unity/build/Runtime/Export/Assertions/Assert/AssertGeneric.cs:31
      at UnityEngine.Assertions.Assert.AreEqual[T] (T expected, T actual, System.String message) [0x00001] in /home/bokken/buildslave/unity/build/Runtime/Export/Assertions/Assert/AssertGeneric.cs:19
      at UnityEngine.Assertions.Assert.AreEqual (System.Int32 expected, System.Int32 actual) [0x0000c] in /home/bokken/buildslave/unity/build/Runtime/Export/Assertions/Assert/AssertPrimitiveTypes.cs:176
      at Unity.Barracuda.Tensor.Reshape (Unity.Barracuda.TensorShape newShape, System.String newName) [0x00001] in /home/whale/Documents/Unity/Bananna/Library/PackageCache/com.unity.barracuda@1.0.4/Barracuda/Runtime/Core/Tensor.cs:810
      at Unity.Barracuda.ONNXModelImporter.SwapSpatialDimensionsAndFeaturesInMatMulWeights (Unity.Barracuda.Tensor weights, System.Int32 featureCount, Unity.Barracuda.VariableTensor+Layout layout) [0x00051] in /home/whale/Documents/Unity/Bananna/Library/PackageCache/com.unity.barracuda@1.0.4/Barracuda/Editor/ONNXModelImporter.cs:843
      at Unity.Barracuda.ONNXModelImporter.<.ctor>b__14_58 (Unity.Barracuda.ModelBuilder net, Unity.Barracuda.ONNXNodeWrapper node) [0x00074] in /home/whale/Documents/Unity/Bananna/Library/PackageCache/com.unity.barracuda@1.0.4/Barracuda/Editor/ONNXModelImporter.cs:623
      at Unity.Barracuda.ONNXModelImporter.ConvertOnnxModel (Onnx.ModelProto onnxModel) [0x00367] in /home/whale/Documents/Unity/Bananna/Library/PackageCache/com.unity.barracuda@1.0.4/Barracuda/Editor/ONNXModelImporter.cs:1088

    Unity.Barracuda.ONNXModelImporter.Err (Unity.Barracuda.Model model, System.String layerName, System.String message, System.String extendedMessage, System.String debugMessage) (at Library/PackageCache/com.unity.barracuda@1.0.4/Barracuda/Editor/ONNXModelImporter.cs:1404)
    Unity.Barracuda.ONNXModelImporter.ConvertOnnxModel (Onnx.ModelProto onnxModel) (at Library/PackageCache/com.unity.barracuda@1.0.4/Barracuda/Editor/ONNXModelImporter.cs:1097)
    Unity.Barracuda.ONNXModelImporter.OnImportAsset (UnityEditor.Experimental.AssetImporters.AssetImportContext ctx) (at Library/PackageCache/com.unity.barracuda@1.0.4/Barracuda/Editor/ONNXModelImporter.cs:1005)
    UnityEditor.Experimental.AssetImporters.ScriptedImporter.GenerateAssetData (UnityEditor.Experimental.AssetImporters.AssetImportContext ctx) (at /home/bokken/buildslave/unity/build/Modules/AssetPipelineEditor/Public/ScriptedImporter.cs:22)
    UnityEditorInternal.InternalEditorUtility:ProjectWindowDrag(HierarchyProperty, Boolean)
    UnityEngine.GUIUtility:ProcessEvent(Int32, IntPtr) (at /home/bokken/buildslave/unity/build/Modules/IMGUI/GUIUtility.cs:197)

    UnityEditorInternal.InternalEditorUtility:ProjectWindowDrag(HierarchyProperty, Boolean)
    UnityEngine.GUIUtility:ProcessEvent(Int32, IntPtr)
     
  2. vincentpierre

    Unity Technologies
    Joined: May 5, 2017
    Posts: 142
    This looks like an inference engine issue. You might have hit a scenario that generates a model that does not import properly. Since the error happens regardless of the training algorithm, I suspect the error is in the encoder of the observation. Could you post the model so we can inspect it?
    It might also be an issue with the version of Barracuda you are using. Which version do you have?
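
The `Json:` line in the import error already names the node that failed, which is one way to check the encoder theory without opening the model. Below is a minimal Python sketch (standard library only) that pulls the parameter names out of that line. The helper name `failing_node` is made up for illustration; the JSON string is copied from the error message in the first post.

```python
import json

# Node description copied from the "Json:" line of the import error.
NODE_JSON = (
    '{ "input": [ "55", "network_body.linear_encoder.seq_layers.0.weight", '
    '"network_body.linear_encoder.seq_layers.0.bias" ], "output": [ "56" ], '
    '"name": "Gemm_12", "opType": "Gemm", "attribute": [ '
    '{ "name": "alpha", "f": 1, "type": "FLOAT" }, '
    '{ "name": "beta", "f": 1, "type": "FLOAT" }, '
    '{ "name": "transB", "i": "1", "type": "INT" } ] }'
)


def failing_node(node_json):
    """Hypothetical helper: summarize an ONNX node that was dumped as JSON."""
    node = json.loads(node_json)
    # In this dump, integer attributes sit under "i" and floats under "f".
    attrs = {a["name"]: a.get("i", a.get("f")) for a in node["attribute"]}
    return {
        "op": node["opType"],
        "name": node["name"],
        # Keep inputs that look like parameter names, not numeric tensor ids.
        "params": [i for i in node["input"] if not i.isdigit()],
        "attrs": attrs,
    }


summary = failing_node(NODE_JSON)
print(summary["op"], summary["name"])
for param in summary["params"]:
    print("  parameter:", param)
```

Both parameters belong to `network_body.linear_encoder.seq_layers.0`, which by its name looks like the first linear layer of the encoder. That is consistent with the suspicion above, though it does not by itself say why the shapes disagree.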
     
  3. justkittenaround

    Joined: Sep 28, 2020
    Posts: 19

    Hi, thank you for the reply! Sorry I did not respond sooner; for some reason I was expecting an email notification for replies, and it never came. Anyways...

    How would I check the Barracuda version?
    These are my other package versions
    Code (CSharp):
    mlagents                      0.23.0
    mlagents-envs                 0.23.0
    python                        3.6
    tensorboard                   1.15.0
    tensorboard-plugin-wit        1.7.0
    tensorflow                    1.15.0
    tensorflow-estimator          1.15.1
    tensorflow-tensorboard        0.4.0
    torch                         1.7.0+cu101
    torchaudio                    0.7.0
    torchfile                     0.1.0
    torchtext                     0.7.0
    torchvision                   0.8.1+cu101
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2018 NVIDIA Corporation
    Built on Sat_Aug_25_21:08:01_CDT_2018
    Cuda compilation tools, release 10.0, V10.0.130
    NVIDIA-SMI 450.119.03   Driver Version: 450.119.03   CUDA Version: 11.0
    And my script on the agent.
    Code (CSharp):
    using System.Collections;
    using System.Collections.Generic;
    using UnityEngine;
    using Unity.MLAgents;
    using Unity.MLAgents.Sensors;

    public class sheepMelon : Agent
    {
        private Rigidbody rBody;
        private AudioSource audioSource;
        public GameObject cubeTool;
        public GameObject Target;
        private GameObject toolprefab;
        private GameObject targetprefab;

        // Start is called before the first frame update
        void Start()
        {
            rBody = GetComponent<Rigidbody>();
            audioSource = GetComponent<AudioSource>();
        }

        // Called by ML-Agents at the start of each episode
        public override void OnEpisodeBegin()
        {
            this.transform.position = new Vector3(Random.Range(-6.5f,1.0f), 0f, Random.Range(-19.5f,-10.5f));
            int R = Random.Range(-180,180);
            transform.rotation = Quaternion.Euler(0, R, 0);
            rBody.velocity = new Vector3(0,0,0);

            toolprefab = Instantiate(cubeTool, new Vector3(-3.0f, 0f, -15.0f), Quaternion.identity);
            targetprefab = Instantiate(Target, new Vector3(Random.Range(-6.5f,1.0f), 3.0f, Random.Range(-19.5f,-10.5f)), Quaternion.identity);

            if (toolprefab.transform.position.y < -5)
            {
                EndEpisode();
            }
        }

        // 6 floats in total: local position (3) + velocity (3)
        public override void CollectObservations(VectorSensor sensor)
        {
            sensor.AddObservation(this.transform.localPosition);
            sensor.AddObservation(rBody.velocity);
        }

        private bool canJump;
        public float speed = 4;
        public float jumpForce = 70f;
        public Vector3 rotateSpeed = new Vector3(0,120,0);

        // vectorAction: [0] forward/back, [1] turn, [2] jump
        public override void OnActionReceived(float[] vectorAction)
        {
            // AddReward(-0.01f);

            Vector3 controlSignal = Vector3.zero;
            controlSignal.x = vectorAction[0];
            controlSignal.z = vectorAction[1];
            controlSignal.Normalize();

            Quaternion deltaRotation = Quaternion.Euler(controlSignal.z * rotateSpeed * Time.deltaTime);
            rBody.MoveRotation(rBody.rotation * deltaRotation);
            rBody.MovePosition(rBody.position + transform.forward * speed * controlSignal.x * Time.deltaTime);

            float isJump = vectorAction[2];
            if (isJump > 0 && canJump==true)
            {
                rBody.AddForce(Vector3.up*jumpForce);
                canJump = false;
            }

            // Fell off the floor: penalize and restart
            if (this.transform.position.y < -10f)
            {
                SetReward(-1.0f);
                Destroy(toolprefab);
                Destroy(targetprefab);
                EndEpisode();
            }
            // Went too high: restart without a reward
            if (this.transform.position.y > 10f)
            {
                Destroy(toolprefab);
                Destroy(targetprefab);
                EndEpisode();
            }
        }

        private void OnCollisionStay(Collision collided)
        {
            // Debug.Log(collided.gameObject.tag);
            if (collided.gameObject.tag == "floor")
            {
                canJump = true;
            }
        }

        private void OnTriggerEnter(Collider other)
        {
            if (other.gameObject.CompareTag("melon"))
            {
                SetReward(1.0f);
                Debug.Log("TARGET ACQUIRED!!");
                audioSource.Play();
                Destroy(toolprefab);
                Destroy(targetprefab);
                EndEpisode();
            }
        }

        public override void Heuristic(float[] actionsOut)
        {
            actionsOut[0] = Input.GetAxis("Vertical");
            actionsOut[1] = -Input.GetAxis("Horizontal");
            actionsOut[2] = Input.GetAxis("jumpKey");
        }
    }
    And there is a YouTube video of the environment I made here
     
  4. vincentpierre

    Unity Technologies
    Joined: May 5, 2017
    Posts: 142
    What is the Barracuda version in Unity? You can see it in the Package Manager or in Project/Packages/manifest.json. Can you also post the model you created?
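
For anyone who prefers a script to clicking through the Package Manager, here is a rough Python sketch of the same lookup. It assumes the usual Unity layout: `Packages/manifest.json` maps package names to plain version strings, while `Packages/packages-lock.json` (if the project has one) stores an object with a `version` field and also lists indirect dependencies. The function name `package_version` and the sample data are illustrative only; `1.0.4` is simply the version visible in the stack trace paths above.

```python
import json


def package_version(packages, name):
    """Return the version recorded for `name`, or None if it is not listed."""
    entry = packages.get("dependencies", {}).get(name)
    if isinstance(entry, dict):   # packages-lock.json style entry
        return entry.get("version")
    return entry                  # manifest.json style: a plain version string


# Sample data shaped like the two file formats (not read from a real project).
LOCK_STYLE = json.loads("""
{
  "dependencies": {
    "com.unity.barracuda": {
      "version": "1.0.4",
      "depth": 1,
      "source": "registry",
      "dependencies": { "com.unity.burst": "1.3.4" }
    }
  }
}
""")
MANIFEST_STYLE = json.loads('{"dependencies": {"com.unity.barracuda": "1.0.4"}}')

for pkg in ("com.unity.barracuda", "com.unity.ml-agents"):
    print(pkg, "->", package_version(LOCK_STYLE, pkg))

# With a real project, load the file instead of the sample data, e.g.:
#   with open("Packages/packages-lock.json") as f:
#       packages = json.load(f)
```

Barracuda is normally pulled in as a dependency of the ML-Agents package rather than added directly, so it may be missing from `manifest.json` and only show up in the lock file.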
     
  5. justkittenaround

    Joined: Sep 28, 2020
    Posts: 19

    The Barracuda version is:
    Code (CSharp):
      "dependencies": {
        "com.unity.barracuda": {
          "version": "1.0.4",
          "depth": 1,
          "source": "registry",
          "dependencies": {
            "com.unity.burst": "1.3.4"
    In the previous post I posted my code; it's just that one file and the associated YAML file, which I'll attach here. What else can I provide? Thank you again for looking into this!




    Code (CSharp):
    behaviors:
      search:
        trainer_type: ppo
        hyperparameters:
          batch_size: 2304
          buffer_size: 10240
          # batch_size: 5120
          # buffer_size: 409600
          learning_rate: 0.0003
          beta: 0.00005
          epsilon: 0.2
          lambd: 0.9
          num_epoch: 10
          learning_rate_schedule: linear
        network_settings:
          normalize: true
          hidden_units: 128
          num_layers: 2
          vis_encode_type: simple
          # memory:
          #   memory_size: 128
          #   sequence_length: 64
        reward_signals:
          extrinsic:
            gamma: 0.9
            strength: 1.0
          # curiosity:
          #   strength: .1
          #   gamma: .9
          #   learning_rate: .0001
        keep_checkpoints: 5
        max_steps: 500000
        time_horizon: 1000
        summary_freq: 12000
        threaded: true
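
A quick arithmetic check ties this config back to the two element counts in the import error (`Expected: 16512 == 16768`). The sketch below assumes, and this is a guess the thread does not confirm, that the failing Gemm weight has `hidden_units` = 128 rows. The helper name `input_width` is hypothetical.

```python
HIDDEN_UNITS = 128                # network_settings.hidden_units in the YAML
COUNT_A, COUNT_B = 16512, 16768   # the two values in "Expected: 16512 == 16768"


def input_width(element_count, rows):
    """Columns of a rows x cols matrix, or None if rows does not divide evenly."""
    cols, remainder = divmod(element_count, rows)
    return cols if remainder == 0 else None


width_a = input_width(COUNT_A, HIDDEN_UNITS)
width_b = input_width(COUNT_B, HIDDEN_UNITS)
print(width_a, width_b, width_b - width_a)
```

Under that assumption both counts divide evenly by 128, giving input widths of 129 and 131, so the two sides of the assertion would disagree by two input features. Which count belongs to the exported weights and which to the importer's expectation is not clear from the message alone.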
     
  6. vincentpierre

    Unity Technologies
    Joined: May 5, 2017
    Posts: 142
    Your Barracuda version is 1.0.4, which is very old. I think you might be using a version of the ML-Agents package that is too old for the trainers you use. I would recommend updating the C# version of ML-Agents to a version compatible with the Python version 0.23.0. When training starts, is there a warning in the console informing you that your versions are not compatible?
    Please provide:
    The model you are trying to use.
    The version of the C# ML-Agents package you are using.
    A printout of the console output at the start of training.

    If you want, you can also post an issue on GitHub, since this seems to be a bug.
     
  7. justkittenaround

    Joined: Sep 28, 2020
    Posts: 19

    I see. Thank you. Please let me know if I should still post on GitHub after this. What is concerning, though, is that I have trained many other vision-based environments/agents using the same techniques with no problems. So I'm not sure why there is an error with this particular environment.

    I'm using a PPO model and mlagents/mlagents-envs version 0.23.0 (which I know is older).

    Below are the console and terminal when I start training.

    training.png

    Below is what happens when I try to drag the saved model "search" into the assets and use it in the Behavior Parameters component for inference mode. In this picture, you can see the .onnx file in the asset window to the left. Something is wrong with it, since normally there is a wire ball icon instead of a paper icon. To be clear, the error occurs when I import the file.

    testing.png
     
  8. vincentpierre

    Unity Technologies
    Joined: May 5, 2017
    Posts: 142
    OK, but what is the version of the ML-Agents C# package?
    Go to the Package Manager and look for ML-Agents; it should show the version you are on. Please also do the same for the Barracuda package. My guess is that your C# version of ML-Agents is incompatible with ML-Agents Python 0.23.0.
    Can you also send the model so I can inspect what is in it?
     