iPhone 8 / A11 / GPU Hang Error / Command Buffer

Discussion in 'iOS and tvOS' started by thomas-weltenbauer, Oct 17, 2017.

  1. thomas-weltenbauer

    thomas-weltenbauer

    Joined:
    Oct 23, 2013
    Posts:
    72
    Hello,

    Our game currently doesn't work on the iPhone 8. After a short time playing (a few seconds to a minute), the game freezes and the device log shows these error messages:

    Execution of the command buffer was aborted due to an error during execution. Caused GPU Hang Error (IOAF code 3)

    Execution of the command buffer was aborted due to an error during execution. Ignored (for causing prior/excessive GPU errors) (IOAF code 4)

    The last message (with IOAF code 4) is then logged repeatedly without end.

    As we use command buffers to draw some meshes, my first step was to disable all of them. But even with all our command buffers disabled, the game freezes with the same error messages. I double-checked this by displaying the camera's commandBufferCount, which was zero the whole time while testing.

    Our game is currently in the App Store, built with Unity 5.5.4p3, and has this error. Internally we have updated to Unity 2017.2.0f3 and also tried 2017.3.0b4, but both produce the same error.

    We couldn't reproduce this problem on iOS devices with older GPUs. For example, the iPhone 7 works without any problems.

    Do you have any suggestions for what I could test to investigate further?
     
  2. martonekler

    martonekler

    Unity Technologies

    Joined:
    Feb 5, 2015
    Posts:
    31
    Hi Thomas!

    The error message you pasted indicates a GPU restart and could be either a driver issue or a bug in Unity. It can be due to various reasons and based on your workaround description we haven't isolated a similar problem yet.

    Could you file a bug report? If possible, include your project or a stripped-down repro case so we can try to reproduce it. If that's not possible for any reason, please include at least a console log with Metal validation turned on. To do this, in Xcode go to Scheme settings -> Run -> Options and set "GPU Frame Capture: Metal" and "Metal API Validation: Extended".

    Thanks
    Marton
     
  3. thomas-weltenbauer

    thomas-weltenbauer

    Joined:
    Oct 23, 2013
    Posts:
    72
    Thank you for your answer!

    If I enable Metal API Validation, I get a SIGABRT with the following error message while the game is starting (it occurs during the splash screen):

    Thread: UnityGfxDeviceWorker
    Method: DrawBufferRanges()
    Message: validateFunctionArguments:3379: failed assertion `Fragment Function(xlatMtlMain): The pixel format (MTLPixelFormatR16Unorm) of the texture (name:UnityAttenuation) bound at index 1 is incompatible with the data type (MTLDataTypeHalf) of the texture parameter (_LightTextureB0 [[texture(0)]]). MTLPixelFormatR16Unorm is compatible with the data type(s) (
    float
    ).'

    I added a bug report (https://fogbugz.unity3d.com/default.asp?959930_q59tm1l8k3s3tnpo).


    In the meantime: Is there anything else I could test? I will also try to reproduce this in a smaller project, as our game is too big to upload.
     
  4. martonekler

    martonekler

    Unity Technologies

    Joined:
    Feb 5, 2015
    Posts:
    31
    This seems to be unrelated to your issue.

    Please try to craft a smaller repro case so we can investigate on our side!
     
  5. martonekler

    martonekler

    Unity Technologies

    Joined:
    Feb 5, 2015
    Posts:
    31
    Meanwhile you could try creating a temporary copy of your Unity installation and replacing offending samplers in the bundle e.g. in ./Unity.app/Contents/CGIncludes/AutoLight.cginc

    change
    sampler2D _LightTextureB0;
    to
    sampler2D_float _LightTextureB0;

    to work around these validation errors and maybe catch one related to your issue.
     
  6. thomas-weltenbauer

    thomas-weltenbauer

    Joined:
    Oct 23, 2013
    Posts:
    72
    I tried sampler2D_float and sampler2D_half for both samplers with the name "_LightTextureB0" in AutoLight.cginc but I'm still getting the "incompatible with the data type" error.
     
  7. thomas-weltenbauer

    thomas-weltenbauer

    Joined:
    Oct 23, 2013
    Posts:
    72
    OK, the game is now working with extended Metal API Validation.
    I had to change sampler2D to sampler2D_float for both _LightTextureB0 in the AutoLight.cginc file and also for unity_NHxRoughness in the UnityStandardBRDF.cginc file.

    The game is now starting with "GPU Frame Capture" set to "Metal" and "Metal API Validation" set to "Extended". I'm sure it is enabled because it logs:

    Metal GPU Frame Capture Enabled
    Metal API Extended Validation Enabled


    I played the game until it froze again, but there are no additional logs besides the "GPU hang errors" mentioned above.
    Do I have to look anywhere else? Do you have any other suggestions?

    I can't find out what exactly is causing the problem, so currently I can't reproduce it in a smaller project.
    I made a build with a test scene where I added cubes with all of our materials to find out whether a specific shader causes the problem, but the game didn't freeze.

    Any other ideas?

    Thank you for your help!
     
  8. thomas-weltenbauer

    thomas-weltenbauer

    Joined:
    Oct 23, 2013
    Posts:
    72
    While driving around in the game I got another API Validation Error:

    -[MTLRenderPipelineDescriptorInternal validateWithDevice:]:2150: failed assertion `vertexFunction is associated with a different device'


     
  9. thomas-weltenbauer

    thomas-weltenbauer

    Joined:
    Oct 23, 2013
    Posts:
    72
    Any further ideas what I could test?
     
  10. CDF

    CDF

    Joined:
    Sep 14, 2013
    Posts:
    1,311
  11. thomas-weltenbauer

    thomas-weltenbauer

    Joined:
    Oct 23, 2013
    Posts:
    72
    Just as info for other developers:

    Together with Unity Support we found out that this really is a driver problem on these devices.
    The way matrices are handled in shaders has changed on newer Apple devices.

    The fix for us was to make sure that the indices used to access matrix "cells" are valid, so a simple clamp between the minimum and maximum possible values fixes the problem.

    Our small test project will be handed over to Apple for deeper investigation of the problem.
     
  12. RussBartley2

    RussBartley2

    Joined:
    May 4, 2017
    Posts:
    3
    Hi Thomas,
    I have a similar issue & would like to clamp my matrix indices.
    Where did you do this? Was it in a shader? If so which shader?

    I'd like to avoid inducing this issue if possible.

    Thanks
     
  13. thomas-weltenbauer

    thomas-weltenbauer

    Joined:
    Oct 23, 2013
    Posts:
    72
    We have very specialized, self-written shaders that use a texture array for lightmapping. The matrix defines the index of the lightmap in the texture array.
    The index is calculated in the shader based on the world position. We just make sure the calculated indices are not outside the bounds of the matrix:

    Code (CSharp):
    // iPhone 8 fix: clamp the calculated indices to the valid range [0, 3]
    gridOffsetX = max(0, min(3, gridOffsetX));
    gridOffsetZ = max(0, min(3, gridOffsetZ));
    where gridOffsetX and gridOffsetZ are the calculated indices.
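
For readers who want to see the idea outside a shader, here is a minimal sketch in plain C of the clamp described above. It is not the original shader code. The 4x4 grid size follows from the 0..3 range in the snippet, but `CELL_SIZE`, `grid_offset` and `lightmap_index` are hypothetical names and values chosen only for illustration.

```c
#define GRID_SIZE 4  /* 4x4 matrix, so valid indices are 0..3 */

/* Hypothetical: edge length of one grid cell in world units. */
static const float CELL_SIZE = 10.0f;

/* Map a world-space coordinate to a grid offset that is always in 0..3.
 * The value is clamped before the cast, so out-of-range (or NaN) inputs
 * can never produce an out-of-bounds matrix index. */
static int grid_offset(float world_coord) {
    float cell = world_coord / CELL_SIZE;
    if (!(cell >= 0.0f)) {
        return 0;                      /* negative or NaN: clamp to first cell */
    }
    if (cell > (float)(GRID_SIZE - 1)) {
        return GRID_SIZE - 1;          /* too large: clamp to last cell */
    }
    return (int)cell;                  /* truncation equals floor for cell >= 0 */
}

/* Look up the lightmap slice index stored in the matrix for a world position. */
static int lightmap_index(const int grid[GRID_SIZE][GRID_SIZE],
                          float world_x, float world_z) {
    return grid[grid_offset(world_z)][grid_offset(world_x)];
}
```

With a cell size of 10, a position far outside the grid such as x = 1000 maps to offset 3 and a negative position maps to offset 0, so the matrix is never read out of bounds.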
     
  14. Roth5child

    Roth5child

    Joined:
    Jan 25, 2018
    Posts:
    1
    Hi, I am using the Metal framework on iOS as well. I am capturing images, zipping them and saving them to the phone's documents directory, and after about 300 MB have been saved (using iOS 12), I get the error "Execution of the command buffer was aborted/GPU Error". This is probably stemming from the same issue you had. What do you think? Is there a workaround?
     
  15. martonekler

    martonekler

    Unity Technologies

    Joined:
    Feb 5, 2015
    Posts:
    31
    Hi Roth5child, please file a bug report with a repro case!
     
  16. efge

    efge

    Joined:
    Dec 13, 2007
    Posts:
    62
    Same issue on the latest tvOS when anti-aliasing is enabled.
    Workaround: Disable AA in the quality settings ;)
     
  17. martonekler

    martonekler

    Unity Technologies

    Joined:
    Feb 5, 2015
    Posts:
    31
    Hi,

    Please file bug reports for such cases. These errors mean a GPU crash, which is something we would like to investigate.

    Thanks
    Marton
     
  18. newlife

    newlife

    Joined:
    Jan 20, 2010
    Posts:
    1,080
    Hello,
    we are facing the same issue with Unity 2018.2 and iOS 12.1.1. Disabling Metal in the Graphics API settings seems to fix the issue.
    Is there any other way to fix this? We are using Cloud Build; is it possible to enable extended Metal API Validation there?
     
  19. martonekler

    martonekler

    Unity Technologies

    Joined:
    Feb 5, 2015
    Posts:
    31
    Hi newlife,

    Could you file a bug report for this with a repro case Unity project?

    Thanks,
    Marton
     
  20. Ben-BearFish

    Ben-BearFish

    Joined:
    Sep 6, 2011
    Posts:
    1,204
    @martonekler I'm getting a similar crash, but we're not using Metal in Unity 2019.2.5. Is there a reason this would happen?
     
  21. jack-fang

    jack-fang

    Joined:
    May 10, 2013
    Posts:
    17
    iOS:
    I'm getting a similar crash, using the Metal API in Unity 2018.2.21f1. Is there a reason this would happen?
    The shaders use no shadows, there is no command buffer script and no fog, and the prefabs neither cast nor receive shadows.
    Execution of the command buffer was aborted due to an error during execution. Caused GPU Hang Error (IOAF code 3)
    Execution of the command buffer was aborted due to an error during execution. Ignored (for causing prior/excessive GPU errors) (IOAF code 4)
     
  22. jack-fang

    jack-fang

    Joined:
    May 10, 2013
    Posts:
    17
    Execution of the command buffer was aborted due to an error during execution. Ignored (for causing prior/excessive GPU errors) (IOAF code 4)
    This appears every frame. Is there a reason this would happen?
     
  23. martonekler

    martonekler

    Unity Technologies

    Joined:
    Feb 5, 2015
    Posts:
    31
    Hi small-U, Ben-BearFish,

    Please file bug reports for this issue. There is no straightforward advice we can give for this kind of issue; we will need to investigate first.

    Thanks,
    Marton