
Bug Batch mode occasionally hangs indefinitely when attempting to download cached assets

Discussion in 'Unity Accelerator' started by RageAgainstThePixel, Dec 15, 2022.

  1. RageAgainstThePixel

    RageAgainstThePixel

    Joined:
    Mar 11, 2020
    Posts:
    66
    This error appears in the accelerator dashboard every time our batch builds stall and hang forever.
    Subsequent runs hang as well, so it seems the cache may be getting corrupted.

    I'm not sure what the cause is. Any help or insight would be helpful.

    Code (CSharp):
    {
      "level": "error",
      "ts": "2022-12-14T19:11:37.410-0500",
      "msg": "error reading size preamble",
      "agent_id": "DESKTOP_id",
      "agent_name": "Desktop",
      "component": "pbservice",
      "subprocess_id": 6,
      "conn_id": 2,
      "remote": "127.0.0.1:60625",
      "error": "read tcp 127.0.0.1:10080->127.0.0.1:60625: wsarecv: An existing connection was forcibly closed by the remote host."
    }
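
For anyone who wants to check whether their own accelerator log contains the same error, here is a minimal Python sketch. It assumes the log is written as one JSON object per line (the entry above is pretty-printed for readability) and that the field names match the ones shown; `find_preamble_errors` and the sample lines are hypothetical names made up for this illustration, not part of the accelerator.

```python
import json

def find_preamble_errors(lines):
    """Return parsed entries with level "error" whose msg mentions the size preamble."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(entry, dict):
            continue
        if entry.get("level") == "error" and "size preamble" in entry.get("msg", ""):
            hits.append(entry)
    return hits

# Sample lines modeled on the fields of the entries quoted in this thread (assumed layout).
sample = [
    '{"level": "error", "ts": "2022-12-14T19:11:37.410-0500", '
    '"msg": "error reading size preamble", "component": "pbservice", '
    '"conn_id": 2, "remote": "127.0.0.1:60625"}',
    '{"level": "debug", "ts": "2022-12-14T21:56:19.719-0500", '
    '"msg": "All outstanding requests have finished", "component": "pbservice", '
    '"conn_id": 5, "remote": "127.0.0.1:65181"}',
]

for entry in find_preamble_errors(sample):
    print(entry["ts"], entry["conn_id"], entry["remote"])
```

Comparing the timestamps of the matches against the time a build stalled is one way to confirm that the error and the hang are correlated.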
     
  2. RageAgainstThePixel

    RageAgainstThePixel

    Joined:
    Mar 11, 2020
    Posts:
    66
    I enabled debug mode, and it seems the editor is never signaled that anything has gone wrong.

    This is the last message from the accelerator before the build hangs indefinitely:

    Code (CSharp):
    {
      "level": "debug",
      "ts": "2022-12-14T21:56:19.719-0500",
      "msg": "All outstanding requests have finished",
      "agent_id": "DESKTOP_id",
      "agent_name": "Desktop",
      "component": "pbservice",
      "subprocess_id": 6,
      "conn_id": 5,
      "remote": "127.0.0.1:65181"
    }
     
  3. RageAgainstThePixel

    RageAgainstThePixel

    Joined:
    Mar 11, 2020
    Posts:
    66
    The only thing that seems to fix it is rebooting the machine that hosts the accelerator instance.
     
  4. ryanme-unity

    ryanme-unity

    Unity Technologies

    Joined:
    Nov 3, 2016
    Posts:
    3
    What operating system and version are you using?
     
  5. RageAgainstThePixel

    RageAgainstThePixel

    Joined:
    Mar 11, 2020
    Posts:
    66
    Windows, with accelerator v1.0.941+g6b39b61.
     
  6. BFS-Kyle

    BFS-Kyle

    Joined:
    Jun 12, 2013
    Posts:
    883
    I'm experiencing the same (or a very similar) problem on Windows Server 2022 with Unity 2021.3.2f1 (accelerator v1.0.941+g6b39b61). For me it's 100% reproducible: the build hangs every time. There is nothing obvious in either the accelerator or the editor logs.

    We run our automated builds on GitLab CI. However, if I run a build manually on one of our build machines in an interactive terminal (with the exact same command-line arguments), it does not hang and the build succeeds normally. Unity's logging output is also quite different when I do this.

    On the automated builds, the following message is shown repeatedly (this is the last log entry before the hang):

    Code (csharp):
    Artifact(artifact id=bd0db8a123c3cfcefaf7c3a36b67f78e, static dependencies=1aa1f3e572a60765e487d354dd0a23d6, content hash=6301c122b7790c4be84b3c2639b73426) uploaded to cacheserver

    However, on the manual builds (on the same machine, with the same arguments) I don't see the above message at all. Instead, I see the following type of message shown repeatedly:

    Code (csharp):
    ShaderCacheRemote uploaded key '98175cdba1fa03856d510daa42d7ad68'

    Some more potentially useful info:

    - I see the exact same issue on Linux (Ubuntu 20.04) when building WebGL targets with the same project, including the difference in log message output. These builds also work fine when running manually from an interactive terminal.

    - I do NOT have the issue on macOS with the same project. The build never hangs.

    My investigation so far leads me to suspect that this is an issue with the 2021.3.2f1 editor, and not with the accelerator. I have other projects which work just fine on other Unity versions (on all platforms).
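
Until the root cause is fixed, one way to keep a hung build from blocking a CI runner forever is to wrap the editor invocation in a watchdog. The Python sketch below is only an illustration under assumptions: `run_with_timeout` is a hypothetical helper, and the command used here is a stand-in that sleeps, not a real Unity command line. Substitute your own batch-mode command and a timeout that suits your project.

```python
import subprocess
import sys

def run_with_timeout(cmd, timeout_seconds):
    """Run cmd and return its exit code, or None if it was killed after the timeout."""
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()   # the process did not exit in time, so terminate it
        proc.wait()   # reap the killed process
        return None

# Stand-in for a hung build: a child process that sleeps far longer than the timeout.
hung_cmd = [sys.executable, "-c", "import time; time.sleep(30)"]
result = run_with_timeout(hung_cmd, timeout_seconds=0.5)
print("timed out" if result is None else f"exit code {result}")
```

Note that this only kills the process it started. If the editor spawns child processes, they may need to be cleaned up separately, and the watchdog does nothing about the hang itself.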
     
  7. BFS-Kyle

    BFS-Kyle

    Joined:
    Jun 12, 2013
    Posts:
    883
    I was able to take a stack trace from the editor process on a Linux machine while the process was stuck:

    Code (CSharp):
    #0  __lll_lock_wait (futex=futex@entry=0x7f4d805f2bb8, private=0) at lowlevellock.c:52
    #1  0x00007f4e48bdb131 in __GI___pthread_mutex_lock (mutex=0x7f4d805f2bb8) at ../nptl/pthread_mutex_lock.c:115
    #2  0x0000559f0646ab71 in TcpProtobufClient::CancelMessage(unsigned int) ()
    #3  0x0000559f06348696 in AcceleratorClient::CancelRequest(unsigned int) ()
    #4  0x0000559f06345770 in AcceleratorClient::CancelAllRequests() ()
    #5  0x0000559f063c6c4f in CleanUpAssetDatabaseV2() ()
    #6  0x0000559f063572f6 in AssetDatabase::CleanupAssetDatabase() ()
    #7  0x0000559f05fa18de in Application::CoreShutdown() ()
    #8  0x0000559f05f9f4f9 in Application::ParseARGVCommands() ()
    #9  0x0000559f05f9c75b in Application::FinishLoadingProject() ()
    #10 0x0000559f06040dca in InitializeUnity(void*) ()
    #11 0x0000559f06040444 in main ()
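
For others who want to capture a similar trace from a stuck editor on Linux, here is a minimal Python sketch. The post does not say which tool produced the trace above; the frame format resembles gdb output, so this assumes gdb is installed and that you have permission to attach to the process. `gdb_backtrace_command` and `capture_backtrace` are hypothetical helper names, and the PID in the example is a placeholder.

```python
import shlex
import subprocess

def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to pid, prints every thread's backtrace, and exits."""
    return ["gdb", "-p", str(int(pid)), "-batch", "-ex", "thread apply all bt"]

def capture_backtrace(pid, timeout_seconds=60):
    """Run gdb against pid and return its text output (needs gdb and ptrace permission)."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
    )
    return result.stdout

# Show the command that would be run for a placeholder PID; nothing is attached here.
print(shlex.join(gdb_backtrace_command(12345)))
```

Dumping all threads, rather than only the blocked one, can help identify which thread is holding the mutex that the shutdown path is waiting on.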
     
  8. BFS-Kyle

    BFS-Kyle

    Joined:
    Jun 12, 2013
    Posts:
    883
    In my case I was able to resolve this by upgrading to 2021.3.16f1.
     