
Bug Freezes/crashes when Using Barracuda on XR, XS Max

Discussion in 'Barracuda' started by apiotuch_unity, Feb 23, 2021.

  1. apiotuch_unity

    apiotuch_unity

    Joined:
    Jun 28, 2019
    Posts:
    75
    We have Barracuda running for image recognition. The app frequently freezes and then crashes. It only happens on newer devices such as the iPhone XR and XS Max and the Pixel 3 XL; on the iPhone 7 and 8 it seems fine. We have no idea what causes it: there are no error logs, and nothing is reported to analytics. We have been able to crash it locally and through TestFlight, and our current release has the problem as well. It only occurs while Barracuda is active. We are using Unity 2019.4.5f1 and have also tried 2019.4.16f1 and 2019.4.20f1, but the issue persists. We have not been able to isolate the problem. We are currently using Barracuda 1.0.4, Burst 1.2.3, and ML-Agents 1.0.6.

    Update:

    More testing shows that the iPhone 8 also freezes and crashes. Sometimes it happens right away, other times after a few tries, and sometimes only after a couple of separate sessions. The iPhone 7 and iPad mini 4 don't seem to exhibit the issue.
     
    Last edited: Feb 25, 2021
  2. alexandreribard_unity

    alexandreribard_unity

    Unity Technologies

    Joined:
    Sep 18, 2019
    Posts:
    25
    Could you try using version 1.3.0?
    Could you also provide the model you are trying to run? I can test it on my end to see if I can repro it.
     
  3. apiotuch_unity

    @alexandreribard_unity We narrowed it down: a development build works without issue, while a non-development build always crashes.
     
  4. apiotuch_unity

  5. apiotuch_unity

  6. alexandreribard_unity

    Barracuda v1.3.0
     
  7. apiotuch_unity


    Attached Files:

  8. apiotuch_unity

    @alexandreribard_unity We have tried using Burst 1.2.3 and 1.3.9 but we still experience issues. We will try Barracuda 1.3.0 and will get back to you.
     
  9. nlacroixAOD

    nlacroixAOD

    Joined:
    Jul 20, 2018
    Posts:
    18
    We'll be trying 1.3.0, but we noticed that 1.3.1 is available. Is 1.3.1 not as stable as 1.3.0? Would it be better to try 1.3.1?
     
  10. nlacroixAOD

    Still have issues on Barracuda 1.3.0 with Burst 1.3.9
     
    Last edited: Feb 25, 2021
  11. apiotuch_unity

    @alexandreribard_unity We recently were able to get a crash to log to Unity Analytics. This message came up in the log:

    Internal: JobTempAlloc has allocations that are more than 4 frames old - this is not allowed and likely a leak

    We found another forum thread with more information for this warning:
    https://forum.unity.com/threads/job...e-more-than-4-frames-old.513124/#post-3366232
    That thread is a couple of years old, so we don't know atm whether there is a newer solution. We also don't know whether this is an internal problem or whether some calls in our code need updating. We will investigate further on our end and get back to you if we find anything more.

    Does Barracuda need further updating to handle race conditions?
     
  12. apiotuch_unity

    @alexandreribard_unity We were able to get it to crash again, similar chain of logs but the last "Internal: JobTempAlloc" warning wasn't there. My guess is it crashed before being able to write to log. Here is a more complete list of logs:

    Log Message Feb 25, 2021, 23:53:34.798 2961 StopLocationServices Location Services, Stopped
    Log Message Feb 25, 2021, 23:53:34.836 2961 Probing Unity.Barracuda.iOSBLAS
    Log Message Feb 25, 2021, 23:53:34.842 2961 Probing Unity.Barracuda.iOSBLAS
    Log Message Feb 25, 2021, 23:53:34.860 2961 set data for: ImageRecognition
    Log Message Feb 25, 2021, 23:53:35.257 2961 <color=blue> AR enabled: True
    Warning Feb 25, 2021, 23:53:35.339 2961 REARCAM is front facing:False
    Warning Feb 25, 2021, 23:53:38.154 3043 Internal: JobTempAlloc has allocations that are more than 4 frames old - this is not allowed and likely a leak.

    I believe it is coming from PluginInterface.cs, part of the Unity.Barracuda namespace.
     
  13. apiotuch_unity

    @alexandreribard_unity We were able to get more info after uploading symbols:

    Thread 3 (crashed)
    0 UnityFramework 0x00000001087aa8ec JobQueue::EnqueueAllInternal(JobGroup*, JobGroup*, AtomicQueue*, int*)
    1 UnityFramework 0x00000001087aaf8c JobQueue::ScheduleGroups(JobGroup*, JobGroup*)
    2 UnityFramework 0x00000001087a8660 JobBatchDispatcher::KickJobs()
    3 UnityFramework 0x0000000108913fa8 JobHandle_CUSTOM_ScheduleBatchedJobsAndComplete(JobFence&)
    4 UnityFramework 0x000000010a1e6db0 BurstTensorData_CompleteAllPendingOperations_mF7193A785CFD260E47CFC465C4608716B0D141D5 (Unity.Barracuda.cpp:0)
    5 UnityFramework 0x000000010a1e6ce8 BurstTensorData_Dispose_m8088AC977044D887B1174773FD8AA14908E9080E (Unity.Barracuda.cpp:24172)
    6 UnityFramework 0x000000010a25099c UnsafeArrayTensorData_Finalize_m433E0F5E4FE8B97DDAC2D2C5CA8CCE7C099BC68F (Unity.Barracuda4.cpp:29218)
    7 UnityFramework 0x0000000108110ab0 RuntimeInvoker_TrueVoid_t22962CB4C05B1D89B55A6E1139F0E87A90987017(void (*)(), MethodInfo const*, void*, void**) (Il2CppInvokerTable.cpp:69712)
    8 UnityFramework 0x0000000108f584a0 il2cpp::vm::Runtime::Invoke(MethodInfo const*, void*, void**, Il2CppException**)
    9 UnityFramework 0x0000000108f097bc il2cpp::gc::GarbageCollector::RunFinalizer(void*, void*)
    10 UnityFramework 0x0000000108f00464 GC_invoke_finalizers
    11 UnityFramework 0x0000000108f0970c il2cpp::gc::FinalizerThread(void*)
    12 UnityFramework 0x0000000108f28ad4 il2cpp::os::Thread::RunWrapper(void*)
    13 UnityFramework 0x0000000108f2b2a8 il2cpp::os::ThreadImpl::ThreadStartWrapper(void*)
    14 libsystem_pthread.dylib 0x00000001e8547cb0 <system symbols missing>
    15 libsystem_pthread.dylib 0xb701e081e8550778 <system symbols missing>


    We seem to get this trace only when the app crashes outright. If it freezes and then crashes, no log is uploaded to Unity Analytics.
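
Read from the bottom up, the trace above starts on il2cpp's finalizer thread and passes through UnsafeArrayTensorData.Finalize and BurstTensorData.Dispose before crashing while scheduling jobs. That suggests, without proving it, that at least some tensors are being cleaned up by the garbage collector rather than disposed explicitly. Below is a minimal sketch of deterministic disposal, assuming a hypothetical MonoBehaviour named RecognitionRunner and a 3-channel camera texture as input. Only the Unity.Barracuda types and calls are real; every other name is illustrative.

```csharp
using Unity.Barracuda;
using UnityEngine;

// Hypothetical component: names are illustrative, not taken from the project in this thread.
public class RecognitionRunner : MonoBehaviour
{
    public NNModel modelAsset;   // assigned in the Inspector
    IWorker worker;

    void Start()
    {
        worker = WorkerFactory.CreateWorker(
            WorkerFactory.Type.CSharpBurst, ModelLoader.Load(modelAsset));
    }

    public float[] Predict(Texture2D texture)
    {
        // 'using' disposes the input tensor on this thread once the prediction
        // is done, so it is not left for the finalizer thread to clean up.
        using (var input = new Tensor(texture, 3))
        {
            worker.Execute(input);

            // The tensor returned by PeekOutput stays owned by the worker,
            // so copy the values out instead of keeping a reference to it.
            Tensor output = worker.PeekOutput();
            var scores = new float[output.length];
            for (int i = 0; i < scores.Length; i++)
                scores[i] = output[i];
            return scores;
        }
    }

    void OnDestroy()
    {
        // Release the worker and the tensors it owns.
        worker?.Dispose();
    }
}
```

Whether this avoids the crash reported here is untested; it only keeps tensor disposal off the finalizer thread.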
     
  14. alexandreribard_unity

    OK, thanks a lot. We're looking into it.
    @Aurimasp will build a test app for it and we'll investigate.
    Our initial assumption is an out-of-memory condition (a memory leak or the model being too big).
    We'll keep you posted.
     
  15. nlacroixAOD

    We've tested a bit with smaller models we have and the same issue still seems to happen @alexandreribard_unity
     
  16. nlacroixAOD

    @alexandreribard_unity any update from your side? I think we may have resolved some other issue we had going on but now we seem to be getting pretty consistent logs with some kind of network issue. It still freezes up when we're attempting to do the recognition but it will start spitting out the following logs upon freezing:

    Code (CSharp):
    Updating user data
    (Filename: ./Runtime/Export/Debug/Debug.bindings.h Line: 39)

    StopLocationServices Location Services, Stopped
    set data for: ImageRecognition
    <color=blue> AR enabled: True
    REARCAM is front facing:False
    (Filename: ./Runtime/Export/Debug/Debug.bindings.h Line: 39)

    2021-03-01 11:48:25.600431-0800 aod[797:251985] [tcp] tcp_input [C5.1:3] flags=[R] seq=2559442872, ack=0, win=0 state=LAST_ACK rcv_nxt=2559442872, snd_una=154208125
    2021-03-01 11:48:25.603794-0800 aod[797:251985] [tcp] tcp_input [C5.1:3] flags=[R] seq=2559442872, ack=0, win=0 state=CLOSED rcv_nxt=2559442872, snd_una=154208125
    2021-03-01 11:48:25.604633-0800 aod[797:251985] [tcp] tcp_input [C5.1:3] flags=[R] seq=2559442872, ack=0, win=0 state=CLOSED rcv_nxt=2559442872, snd_una=154208125
    2021-03-01 11:48:33.571040-0800 aod[797:251604] [] nw_path_evaluator_create_flow_inner NECP_CLIENT_ACTION_ADD_FLOW 66F2E2FC-A2CB-4092-B960-417EE2C2EFF3 [22: Invalid argument]
    2021-03-01 11:48:33.571665-0800 aod[797:251604] [connection] nw_endpoint_flow_setup_channel [C17.1 0.0.0.0:443 in_progress channel-flow (satisfied (Path is satisfied), interface: en0, ipv4, dns)] failed to request add nexus flow
    2021-03-01 11:48:33.573824-0800 aod[797:251604] Connection 17: received failure notification
    2021-03-01 11:48:33.574107-0800 aod[797:251604] Connection 17: failed to connect 1:22, reason -1
    2021-03-01 11:48:33.574268-0800 aod[797:251604] Connection 17: encountered error(1:22)
    2021-03-01 11:48:33.576445-0800 aod[797:251898] Task <7BDBBF75-47A9-4D76-8EE0-3CEA6B9B7015>.<6> HTTP load failed, 0/0 bytes (error code: 22 [1:22])
    2021-03-01 11:48:33.577406-0800 aod[797:251898] Task <7BDBBF75-47A9-4D76-8EE0-3CEA6B9B7015>.<6> finished with error [22] Error Domain=NSPOSIXErrorDomain Code=22 "Invalid argument" UserInfo={_NSURLErrorFailingURLSessionTaskErrorKey=LocalUploadTask <7BDBBF75-47A9-4D76-8EE0-3CEA6B9B7015>.<6>, _kCFStreamErrorDomainKey=1, _NSURLErrorRelatedURLSessionTaskErrorKey=(
        "LocalUploadTask <7BDBBF75-47A9-4D76-8EE0-3CEA6B9B7015>.<6>"
    ), _kCFStreamErrorCodeKey=22}
    2021-03-01 11:48:58.901803-0800 aod[797:252108] [] nw_path_evaluator_create_flow_inner NECP_CLIENT_ACTION_ADD_FLOW 50A0685E-7520-4E16-A0D4-0B09E5E62BC3 [22: Invalid argument]
    2021-03-01 11:48:58.902242-0800 aod[797:252108] [connection] nw_endpoint_flow_setup_channel [C18.1 0.0.0.0:443 in_progress channel-flow (satisfied (Path is satisfied), interface: en0, ipv4, dns)] failed to request add nexus flow
    2021-03-01 11:48:58.918738-0800 aod[797:252108] Connection 18: received failure notification
    2021-03-01 11:48:58.918966-0800 aod[797:252108] Connection 18: failed to connect 1:22, reason -1
    2021-03-01 11:48:58.919111-0800 aod[797:252108] Connection 18: encountered error(1:22)
    2021-03-01 11:48:58.921718-0800 aod[797:252108] Task <CC3A5D8B-52CF-4690-8678-8BA634F818A2>.<7> HTTP load failed, 0/0 bytes (error code: 22 [1:22])
    2021-03-01 11:48:58.923130-0800 aod[797:252108] Task <CC3A5D8B-52CF-4690-8678-8BA634F818A2>.<7> finished with error [22] Error Domain=NSPOSIXErrorDomain Code=22 "Invalid argument" UserInfo={_NSURLErrorFailingURLSessionTaskErrorKey=LocalUploadTask <CC3A5D8B-52CF-4690-8678-8BA634F818A2>.<7>, _kCFStreamErrorDomainKey=1, _NSURLErrorRelatedURLSessionTaskErrorKey=(
        "LocalUploadTask <CC3A5D8B-52CF-4690-8678-8BA634F818A2>.<7>"
    ), _kCFStreamErrorCodeKey=22}
    2021-03-01 11:49:33.482460-0800 aod[797:252108] [] nw_path_evaluator_create_flow_inner NECP_CLIENT_ACTION_ADD_FLOW AF8067ED-018F-4E5A-B014-E0A5CD447541 [22: Invalid argument]
    2021-03-01 11:49:33.482662-0800 aod[797:252108] [connection] nw_endpoint_flow_setup_channel [C19.1 0.0.0.0:443 in_progress channel-flow (satisfied (Path is satisfied), interface: en0, ipv4, dns)] failed to request add nexus flow
    2021-03-01 11:49:33.498156-0800 aod[797:252108] Connection 19: received failure notification
    2021-03-01 11:49:33.498320-0800 aod[797:252108] Connection 19: failed to connect 1:22, reason -1
    2021-03-01 11:49:33.498452-0800 aod[797:252108] Connection 19: encountered error(1:22)
    2021-03-01 11:49:33.500391-0800 aod[797:252108] Task <5CAB56C6-6777-42BF-AEF1-C64750948870>.<8> HTTP load failed, 0/0 bytes (error code: 22 [1:22])
    2021-03-01 11:49:33.501435-0800 aod[797:252108] Task <5CAB56C6-6777-42BF-AEF1-C64750948870>.<8> finished with error [22] Error Domain=NSPOSIXErrorDomain Code=22 "Invalid argument" UserInfo={_NSURLErrorFailingURLSessionTaskErrorKey=LocalUploadTask <5CAB56C6-6777-42BF-AEF1-C64750948870>.<8>, _kCFStreamErrorDomainKey=1, _NSURLErrorRelatedURLSessionTaskErrorKey=(
        "LocalUploadTask <5CAB56C6-6777-42BF-AEF1-C64750948870>.<8>"
    ), _kCFStreamErrorCodeKey=22}
    2021-03-01 11:50:03.522524-0800 aod[797:252204] [] nw_path_evaluator_create_flow_inner NECP_CLIENT_ACTION_ADD_FLOW EDD24596-EC16-4F14-8626-AD53124B144C [22: Invalid argument]
    2021-03-01 11:50:03.522895-0800 aod[797:252204] [connection] nw_endpoint_flow_setup_channel [C20.1 0.0.0.0:443 in_progress channel-flow (satisfied (Path is satisfied), interface: en0, ipv4, dns)] failed to request add nexus flow
    2021-03-01 11:50:03.525550-0800 aod[797:252204] Connection 20: received failure notification
    2021-03-01 11:50:03.525833-0800 aod[797:252204] Connection 20: failed to connect 1:22, reason -1
    2021-03-01 11:50:03.525980-0800 aod[797:252204] Connection 20: encountered error(1:22)
    2021-03-01 11:50:03.528774-0800 aod[797:252204] Task <1C77EB3E-C9D3-4746-A526-A180C35541E6>.<9> HTTP load failed, 0/0 bytes (error code: 22 [1:22])
    2021-03-01 11:50:03.530117-0800 aod[797:252284] Task <1C77EB3E-C9D3-4746-A526-A180C35541E6>.<9> finished with error [22] Error Domain=NSPOSIXErrorDomain Code=22 "Invalid argument" UserInfo={_NSURLErrorFailingURLSessionTaskErrorKey=LocalUploadTask <1C77EB3E-C9D3-4746-A526-A180C35541E6>.<9>, _kCFStreamErrorDomainKey=1, _NSURLErrorRelatedURLSessionTaskErrorKey=(
        "LocalUploadTask <1C77EB3E-C9D3-4746-A526-A180C35541E6>.<9>"
    ), _kCFStreamErrorCodeKey=22}
    2021-03-01 11:50:33.536750-0800 aod[797:252109] [] nw_path_evaluator_create_flow_inner NECP_CLIENT_ACTION_ADD_FLOW 9291E229-59B8-4D79-9860-5E53136D57E7 [22: Invalid argument]
    2021-03-01 11:50:33.537040-0800 aod[797:252109] [connection] nw_endpoint_flow_setup_channel [C21.1 0.0.0.0:443 in_progress channel-flow (satisfied (Path is satisfied), interface: en0, ipv4, dns)] failed to request add nexus flow
    2021-03-01 11:50:33.552884-0800 aod[797:252387] Connection 21: received failure notification
    2021-03-01 11:50:33.553153-0800 aod[797:252387] Connection 21: failed to connect 1:22, reason -1
    2021-03-01 11:50:33.553384-0800 aod[797:252387] Connection 21: encountered error(1:22)
    2021-03-01 11:50:33.557302-0800 aod[797:252387] Task <378FB7F8-9B5B-4629-83C0-B54A2EAD4E64>.<10> HTTP load failed, 0/0 bytes (error code: 22 [1:22])
    2021-03-01 11:50:33.558919-0800 aod[797:252109] Task <378FB7F8-9B5B-4629-83C0-B54A2EAD4E64>.<10> finished with error [22] Error Domain=NSPOSIXErrorDomain Code=22 "Invalid argument" UserInfo={_NSURLErrorFailingURLSessionTaskErrorKey=LocalUploadTask <378FB7F8-9B5B-4629-83C0-B54A2EAD4E64>.<10>, _kCFStreamErrorDomainKey=1, _NSURLErrorRelatedURLSessionTaskErrorKey=(
        "LocalUploadTask <378FB7F8-9B5B-4629-83C0-B54A2EAD4E64>.<10>"
    ), _kCFStreamErrorCodeKey=22}
    2021-03-01 11:51:03.481215-0800 aod[797:252468] [] nw_path_evaluator_create_flow_inner NECP_CLIENT_ACTION_ADD_FLOW 48556F12-2875-416C-9A2C-D96788F474CB [22: Invalid argument]
    2021-03-01 11:51:03.481460-0800 aod[797:252468] [connection] nw_endpoint_flow_setup_channel [C22.1 0.0.0.0:443 in_progress channel-flow (satisfied (Path is satisfied), interface: en0, ipv4, dns)] failed to request add nexus flow
    2021-03-01 11:51:03.500982-0800 aod[797:252468] Connection 22: received failure notification
    2021-03-01 11:51:03.501150-0800 aod[797:252468] Connection 22: failed to connect 1:22, reason -1
    2021-03-01 11:51:03.501248-0800 aod[797:252468] Connection 22: encountered error(1:22)
    2021-03-01 11:51:03.503009-0800 aod[797:252468] Task <EAECC8AF-E77F-493A-BE6C-270A647AE7F6>.<11> HTTP load failed, 0/0 bytes (error code: 22 [1:22])
    2021-03-01 11:51:03.504078-0800 aod[797:252468] Task <EAECC8AF-E77F-493A-BE6C-270A647AE7F6>.<11> finished with error [22] Error Domain=NSPOSIXErrorDomain Code=22 "Invalid argument" UserInfo={_NSURLErrorFailingURLSessionTaskErrorKey=LocalUploadTask <EAECC8AF-E77F-493A-BE6C-270A647AE7F6>.<11>, _kCFStreamErrorDomainKey=1, _NSURLErrorRelatedURLSessionTaskErrorKey=(
        "LocalUploadTask <EAECC8AF-E77F-493A-BE6C-270A647AE7F6>.<11>"
    ), _kCFStreamErrorCodeKey=22}
    2021-03-01 11:51:33.581457-0800 aod[797:252469] [] nw_path_evaluator_create_flow_inner NECP_CLIENT_ACTION_ADD_FLOW E19B56B4-14FE-4B88-B4F2-7F512F4B4634 [22: Invalid argument]
    2021-03-01 11:51:33.581922-0800 aod[797:252469] [connection] nw_endpoint_flow_setup_channel [C23.1 0.0.0.0:443 in_progress channel-flow (satisfied (Path is satisfied), interface: en0, ipv4, dns)] failed to request add nexus flow
    2021-03-01 11:51:33.583785-0800 aod[797:252469] Connection 23: received failure notification
    2021-03-01 11:51:33.584022-0800 aod[797:252469] Connection 23: failed to connect 1:22, reason -1
    2021-03-01 11:51:33.584168-0800 aod[797:252469] Connection 23: encountered error(1:22)
    2021-03-01 11:51:33.586601-0800 aod[797:252469] Task <DDD8C917-133F-414B-A1AD-B3655F0B32A4>.<12> HTTP load failed, 0/0 bytes (error code: 22 [1:22])
    2021-03-01 11:51:33.588043-0800 aod[797:252469] Task <DDD8C917-133F-414B-A1AD-B3655F0B32A4>.<12> finished with error [22] Error Domain=NSPOSIXErrorDomain Code=22 "Invalid argument" UserInfo={_NSURLErrorFailingURLSessionTaskErrorKey=LocalUploadTask <DDD8C917-133F-414B-A1AD-B3655F0B32A4>.<12>, _kCFStreamErrorDomainKey=1, _NSURLErrorRelatedURLSessionTaskErrorKey=(
        "LocalUploadTask <DDD8C917-133F-414B-A1AD-B3655F0B32A4>.<12>"
    ), _kCFStreamErrorCodeKey=22}
    2021-03-01 11:51:43.294646-0800 aod[797:252563] XPC connection interrupted
    2021-03-01 11:52:03.545952-0800 aod[797:252679] [] nw_path_evaluator_create_flow_inner NECP_CLIENT_ACTION_ADD_FLOW 2E18148D-2934-4A22-BF84-E373231D8C45 [22: Invalid argument]
    2021-03-01 11:52:03.546257-0800 aod[797:252679] [connection] nw_endpoint_flow_setup_channel [C24.1 0.0.0.0:443 in_progress channel-flow (satisfied (Path is satisfied), interface: en0, ipv4, dns)] failed to request add nexus flow
    2021-03-01 11:52:03.561651-0800 aod[797:252679] Connection 24: received failure notification
    2021-03-01 11:52:03.561925-0800 aod[797:252679] Connection 24: failed to connect 1:22, reason -1
    2021-03-01 11:52:03.562157-0800 aod[797:252679] Connection 24: encountered error(1:22)
    2021-03-01 11:52:03.565764-0800 aod[797:252679] Task <3E8386B3-0427-4DDD-823F-DFB35C4B47AC>.<13> HTTP load failed, 0/0 bytes (error code: 22 [1:22])
    2021-03-01 11:52:03.566944-0800 aod[797:252469] Task <3E8386B3-0427-4DDD-823F-DFB35C4B47AC>.<13> finished with error [22] Error Domain=NSPOSIXErrorDomain Code=22 "Invalid argument" UserInfo={_NSURLErrorFailingURLSessionTaskErrorKey=LocalUploadTask <3E8386B3-0427-4DDD-823F-DFB35C4B47AC>.<13>, _kCFStreamErrorDomainKey=1, _NSURLErrorRelatedURLSessionTaskErrorKey=(
        "LocalUploadTask <3E8386B3-0427-4DDD-823F-DFB35C4B47AC>.<13>"
    ), _kCFStreamErrorCodeKey=22}
    (lldb)

    I'm wondering if this is related to our previous issue we mentioned here:
    https://forum.unity.com/threads/uni...finitely-even-with-timeout-set.1012276/page-3

    One of the most recent comments in that thread mentions something similar to what we noticed: the issue happens less often (or not at all) when we have Development Build checked.
     
  17. nlacroixAOD

    BTW, from that same log it appears we're getting the error at startup and throughout the app's run as well; it just seems to actually freeze when attempting IR. Below is what I see just after app startup (after hitting play in Xcode to run the build):

    Code (CSharp):
    App start

    2021-03-01 11:45:58.161668-0800 aod[797:250930] FCM: Loading UIApplication FIRFCM category
    CrashReporter: initialized
    2021-03-01 11:45:58.363738-0800 aod[797:250930] Built from '2019.4/staging' branch, Version '2019.4.21f1 (b76dac84db26)', Build type 'Release', Scripting Backend 'il2cpp'
    2021-03-01 11:45:58.364551-0800 aod[797:250930] MemoryManager: Using 'Default' Allocator.
    2021-03-01 11:45:58.542257-0800 aod[797:250930] Setting up iOS 10 message delegate.
    -> applicationDidFinishLaunching()
    2021-03-01 11:45:58.834207-0800 aod[797:251059] [] nw_path_evaluator_create_flow_inner NECP_CLIENT_ACTION_ADD_FLOW 44BC6AEC-F36B-40C5-AEFE-F9B67869722F [22: Invalid argument]
    2021-03-01 11:45:58.834446-0800 aod[797:251059] [connection] nw_endpoint_flow_setup_channel [C1.1 0.0.0.0:443 in_progress channel-flow (satisfied (Path is satisfied), interface: en0, ipv4, dns)] failed to request add nexus flow
    2021-03-01 11:45:58.835584-0800 aod[797:251059] Connection 1: received failure notification
    2021-03-01 11:45:58.835679-0800 aod[797:251059] Connection 1: failed to connect 1:22, reason -1
    2021-03-01 11:45:58.835739-0800 aod[797:251059] Connection 1: encountered error(1:22)
    2021-03-01 11:45:58.837323-0800 aod[797:251056] Task <EFC26F09-785A-4F13-96A7-C680FC175EB5>.<1> HTTP load failed, 0/0 bytes (error code: 22 [1:22])
    2021-03-01 11:45:58.838659-0800 aod[797:251056] Task <EFC26F09-785A-4F13-96A7-C680FC175EB5>.<1> finished with error [22] Error Domain=NSPOSIXErrorDomain Code=22 "Invalid argument" UserInfo={_NSURLErrorFailingURLSessionTaskErrorKey=LocalUploadTask <EFC26F09-785A-4F13-96A7-C680FC175EB5>.<1>, _kCFStreamErrorDomainKey=1, _NSURLErrorRelatedURLSessionTaskErrorKey=(
        "LocalUploadTask <EFC26F09-785A-4F13-96A7-C680FC175EB5>.<1>"
    ), _kCFStreamErrorCodeKey=22}
     
  18. nlacroixAOD

    May have found the cause for the above errors I mentioned but still seem to get this at some point after freezing:
    Code (CSharp):
    2021-03-01 17:28:53.942860-0800 aod[1335:320705] [tcp] tcp_input [C5.1:3] flags=[R] seq=780464410, ack=0, win=0 state=LAST_ACK rcv_nxt=780464410, snd_una=2317917624
    2021-03-01 17:28:53.945628-0800 aod[1335:320705] [tcp] tcp_input [C5.1:3] flags=[R] seq=780464410, ack=0, win=0 state=CLOSED rcv_nxt=780464410, snd_una=2317917624
    2021-03-01 17:28:53.947749-0800 aod[1335:320705] [tcp] tcp_input [C5.1:3] flags=[R] seq=780464410, ack=0, win=0 state=CLOSED rcv_nxt=780464410, snd_una=2317917624
    We saw the same thing logged when experiencing the 14.2 issue I linked so could be that something is still going on there even with 2019.4.21.
     
  19. alexandreribard_unity

    So I've tested running the model with different input sizes on a OnePlus A6010 with a GPU worker.
    It's not that recent a device, so that may be a factor.

    With 256x256 input it runs fine.
    With 719x1280 input like you showed me, the app doesn't crash but it lags pretty heavily.
    I can see how it could run out of memory.
    Can you try with a lower image size?

    Also, can you use an ONNX model as input? We optimize away some layers, so it could help.

    We'll try on newer phones and Apple devices and get back to you.
     
  20. alexandreribard_unity

    Second update, after checking the issue on iOS: we were able to reproduce it :)
    We have a guess as to what the issue might be.
     
  21. apiotuch_unity

    Which version of Barracuda/Burst has onnx? Barracuda 1.3.1?
     
  22. nlacroixAOD

    @alexandreribard_unity Glad to hear you were able to reproduce something! I'm not sure that's quite what we were doing here: the images we provided get picked up from the device camera, and from there we do the recognition. I believe we size the texture from the camera down a bit before turning it into a Tensor and then doing the recognition. It's possible the texture it's taking in is still a bit large. Right now, from the logs, it appears we're getting stuck at 'PeekOutput'. I figured there might be an issue with that and tried 'CopyOutput' as well, and it still seems to get caught at that point. It's possible it's actually getting stuck at a different point (we typically have a couple of Workers going that are doing prediction) and just happens to get caught there in the logs.

    Can you provide some more details about what it is your team was able to see/reproduce? I will try messing with our texture/input a bit in the meantime. Please let us know as well if there is any workaround we could do in the meantime.

    Just for specific info these are the devices we've encountered the issues with so far:
    iPhone XS Max, iPhone XR, iPhone 11 Pro Max, iPhone 8 Plus

    Devices without any issues:
    iPad Mini 4 and iPhone 7; Android devices seem to be fine (we haven't noticed any crashing recently on Nexus 5 or Pixel 3 XL, but we've mainly been focused on the iOS side)

    As a side note, I believe those network issues I was seeing before are just a symptom of the problem. It looks like every time it gets stuck the network connection eventually gets closed soon after and causes some infinite loading. I'm guessing once the issue is resolved that should go away. I should note as well that it doesn't always freeze necessarily, sometimes it will let the app continue but that's when we see the infinite load issue.
     
  23. alexandreribard_unity

    Yeah, testing on 1.3.1.
    On iPhone X, with 719x1280 input, we got:
    Execution of the command buffer was aborted due to an error during execution. Caused GPU Hang Error (IOAF code 3)
    We also saw this with 256x256 input.
    So atm we are betting on an out-of-bounds read/write in one of the kernels.

    Once we know for sure what the problem is we'll give you a workaround
     
  24. nlacroixAOD

    @alexandreribard_unity Interesting. I don't think I've seen any mention of that message specifically, but it might be that I just can't get messages from that far in. The textures we're creating a tensor from are 224x224 after sizing down, and they come in at a somewhat larger size initially (about 640x480 from what I remember). But if you're seeing that at even smaller sizes, then it's definitely possible that this is what we're encountering.
     
  25. lkuich

    lkuich

    Joined:
    Nov 6, 2015
    Posts:
    1
    Awesome. Thank you guys. Just weighing in here, I've been testing on an iPhone 8 running 14.3, after having the scanner run for a while it eventually crashes. This Barracuda model accepts 224x224 inputs, it seems the actual prediction is what's eventually causing a crash. Just wanted to share our prediction block here, though it may have already been shared.

    Code (CSharp):
    public Tensor ExecuteAsync(Texture2D texture, int syncEveryNthLayer = 5)
    {
        this.Texture = texture;
        this.Tensor = ImageUtil.TensorFromTexture(texture, Model.ImageDims);

        // Schedule the network layer by layer instead of all at once.
        var executor = Worker.StartManualSchedule(this.Tensor);
        var it = 0;
        bool hasMoreWork;

        do
        {
            hasMoreWork = executor.MoveNext();
            // Flush the queued work every Nth layer.
            if (++it % syncEveryNthLayer == 0)
                Worker.FlushSchedule();

        } while (hasMoreWork);

        // The tensor returned by PeekOutput is still owned by the worker.
        return Worker.PeekOutput();
    }
     
    Last edited: Mar 3, 2021
  26. nlacroixAOD

    @alexandreribard_unity Quick update, it seems like I've somewhat solved our issue for now. It looks like our issue is specific to WorkerFactory.Type.CSharpBurst. I've changed our type to just CSharp and we no longer seem to have the freezing and following network errors. I would still like to use CSharpBurst eventually as there was definitely a lot of improvement with older devices from what I saw. Definitely keep us updated on a potential workaround for the issue you've found and we'll see if that resolves it for us. Thanks for all your help!
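
The backend switch described above comes down to the type passed to WorkerFactory.CreateWorker. A minimal sketch, assuming a hypothetical helper class RecognitionWorkers with a UseBurst toggle (neither name is from the thread), that keeps the choice in one place so the Burst backend is easy to retest later:

```csharp
using Unity.Barracuda;

// Hypothetical helper: only WorkerFactory, Model and IWorker come from Barracuda.
public static class RecognitionWorkers
{
    // Set back to true to retest the Burst backend, for example after a package update.
    public static bool UseBurst = false;

    public static IWorker Create(Model model)
    {
        // CSharp is the plain CPU backend; CSharpBurst is the Burst-compiled one.
        var type = UseBurst ? WorkerFactory.Type.CSharpBurst
                            : WorkerFactory.Type.CSharp;
        return WorkerFactory.CreateWorker(type, model);
    }
}
```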
     
  27. Mantas-Puida

    Mantas-Puida

    Unity Technologies

    Joined:
    Nov 13, 2008
    Posts:
    1,860
    This model does not look resolution independent, as it has a Dense layer at the end. I also see an assert error in the editor console when running it with 256x256 input: "Assertion failure. Values are not equal. Expected: 1280 == 5120". Running it against 224x224 input works fine. I think we could do a better job at catching cases where the supplied input shape does not match the expectations imposed by the internal network structure.
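
Until the library catches this itself, a guard on the application side can reject a mismatched input before it reaches the worker. The sketch below is plain C# with no Barracuda dependency; the helper name InputGuard.RequireSize and the choice of ArgumentException are assumptions for illustration, and the 224x224 default is the size reported to work in this thread.

```csharp
using System;

static class InputGuard
{
    // Throws early if the supplied image size differs from what the network was built for.
    // Hypothetical helper: the names and the exception type are illustrative only.
    public static void RequireSize(int width, int height,
                                   int expectedWidth = 224, int expectedHeight = 224)
    {
        if (width != expectedWidth || height != expectedHeight)
            throw new ArgumentException(
                $"Input is {width}x{height}, but the model expects {expectedWidth}x{expectedHeight}.");
    }

    static void Main()
    {
        RequireSize(224, 224);            // accepted, nothing happens

        try { RequireSize(256, 256); }    // rejected before any tensor is created
        catch (ArgumentException e) { Console.WriteLine(e.Message); }
    }
}
```

Calling such a check right before the texture is turned into a Tensor turns a silent hang or crash into an error message that names both sizes.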
     
  28. alexandreribard_unity

    In more detail (looking at the end of your network):
    upload_2021-3-5_10-28-40.png
    AvgPool2D is not a global pool operator. It has a fixed kernel size (in this case 7). Unfortunately, that means the network is resolution dependent if you want to end up with a [N, 1, 1, 1280] tensor so that the Dense layer can operate as intended.
    In this case the input needs to be a 224x224 image. Anything bigger will leave some residual H and W dimensions after the AvgPool2D, resulting in a crash.
    I would recommend swapping this layer for a GlobalAvgPool, which is resolution independent (it pools over the whole input tensor, giving you the desired shape regardless of the input resolution).

    TLDR: Your network is not resolution independent due to AvgPool2D. Change it to a GlobalAvgPool layer for your network to work with any input dimension.
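
The numbers in the assertion message can be reproduced with a short shape calculation. This is a sketch under two assumptions that are not stated in the thread: the layers before the pool shrink the image by a factor of 32, and the AvgPool2D uses stride 1 with no padding. The class and method names are hypothetical.

```csharp
using System;

static class PoolShapeSketch
{
    // Standard output size of a pooling window along one axis (integer division floors).
    static int PoolOut(int size, int kernel, int stride, int pad) =>
        (size + 2 * pad - kernel) / stride + 1;

    // Number of values the Dense layer would receive for a square input.
    // ASSUMPTIONS (not from the thread): downsampling factor 32 before the pool,
    // AvgPool2D with stride 1 and no padding.
    public static int DenseInputLength(int inputSize, int channels = 1280,
                                       int downsample = 32, int kernel = 7)
    {
        int feature = inputSize / downsample;         // 224 -> 7, 256 -> 8
        int pooled = PoolOut(feature, kernel, 1, 0);  // 7 -> 1, 8 -> 2
        return pooled * pooled * channels;
    }

    static void Main()
    {
        Console.WriteLine(DenseInputLength(224)); // 1280
        Console.WriteLine(DenseInputLength(256)); // 5120
    }
}
```

Under these assumptions a 224x224 input leaves a 1x1 map with 1280 values, while a 256x256 input leaves a 2x2 map with 5120 values, which lines up with "Expected: 1280 == 5120". A global pool always collapses H and W to 1x1, so the Dense layer would see 1280 values at either resolution.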
     