Resolved Dither node (Converting float to uint) causes extreme slowdown on Mac OS.

Discussion in 'macOS' started by BasicallyGames, Jun 6, 2020.

  1. BasicallyGames

    Joined:
    Aug 31, 2018
    Posts:
    91
    I've finally gotten around to figuring out how to get my game notarized for Mac. That's going pretty well, but I've run into a very unexpected problem. Once I had signed the game and could run it, I took it for a test run to check that it works properly. It runs without errors, but *extremely* slowly once the game loads (menus are fine): less than 1 FPS in full screen. Performance improves at lower resolutions, but not enough to be playable. This is a 2014 MacBook Pro, and the game runs just fine on a significantly weaker Windows PC I have, so I have no idea what the problem is. The game uses the URP and shaders I created with Shader Graph. The scripting backend was Mono, though I don't think IL2CPP would make a difference since the issue seems to be with rendering. The game was built with Unity 2019.3.15f.

    Any advice? I really have no idea where to go from here.
     
    Last edited: Jun 6, 2020
  2. JoeStrout

    Joined:
    Jan 14, 2011
    Posts:
    9,859
    I'd say the next step is to get Unity on that Mac, and run it under the profiler. You can't know where the problem is just by looking at it from the outside.
     
  3. BasicallyGames

    Joined:
    Aug 31, 2018
    Posts:
    91
    Just did that! The issue stems from Gfx.WaitForPresentOnGfxThread. Expanding it reveals Semaphore.WaitForSignal, which accounts for all of the time spent in Gfx.WaitForPresentOnGfxThread. It's taking a couple hundred milliseconds on some frames! Any clue what could be causing that?

    Edit: To make matters even weirder, everything runs fine while I'm in the editor, but as soon as I enter play mode everything slows to a crawl again. I know it's not a script causing the slowdown because I was able to get a decent speed by disabling the layer with the most geometry in my camera's culling mask.

    Edit 2: I tried simplifying all of my shaders (Basically making it so all they do is display a texture) and that seems to have improved performance immensely. I'll look into this more tomorrow and see if I can figure out what's causing the performance issues. Hopefully this won't require any major reworking of my shaders...
     
    Last edited: Jun 8, 2020
  4. BasicallyGames

    Joined:
    Aug 31, 2018
    Posts:
    91
    Okay, so I was able to isolate the cause of the slowdown. It seems to be this dither node used in my shaders as part of a fog effect.



    This is within a fog sub-graph. When I disconnect these three nodes from the graph, performance is normal. Is there some issue with this graph that I'm not seeing, or is the slowdown a bug on Unity's end? For now I guess I'll just disable the dithering on Mac builds of my game, but I hope it's possible for this to be fixed eventually as the fog in my game looks much nicer with the dithering.

    Edit: For kicks I created a new URP project and set all the materials to a new one I made that is just the dithering node assigned to color. Same issue. This definitely seems to be a bug with Unity.
     
    Last edited: Jun 8, 2020
  5. melos_han_tani

    Joined:
    Jan 11, 2018
    Posts:
    78
    ...I have this slowdown with Shader Graph as well (Unity 2019.4), only on my Mac (a MacBook Pro from 2012). A simple scene runs at FPS but becomes 1 FPS when the dither effect is applied to my character model.

    Just... why? Was this fixed in a later version of Shader Graph? Is this only an issue on really old MacBooks?

    (Willing to admit that I don't mind that the game (using URP) is not really performant on a 2012 MacBook Pro)
     
  6. BasicallyGames

    Joined:
    Aug 31, 2018
    Posts:
    91
    This is still happening as of Unity 2020.3.11f. I was hoping the updates to the URP would fix this, but it's still an issue. Really annoying that I have to use different shaders that don't use the dithering on Mac.
     
  7. mons00n

    Joined:
    Sep 18, 2013
    Posts:
    304
    I wonder if this is part of a larger problem? I am experiencing a similar bug with volumetric fog in HDRP on Unity 2020 but not 2019 (see this thread). It feels like a slightly different manifestation of the Mac instanced indirect bug that's been around for over 2 years now. If Unity is using instanced rendering in the editor and instanced indirect in the build, that may explain the discrepancy you are seeing. Again, this is all speculation, but Semaphore.WaitForSignal under Gfx.WaitForPresentOnGfxThread is what shows up as the huge slowdown when doing any instanced indirect rendering on macOS.
     
  8. BasicallyGames

    Joined:
    Aug 31, 2018
    Posts:
    91
    I think I've found the core of the issue. I tried copying the example code of the dither node into a custom function node.
    Code (Shader):
    {
        float2 uv = ScreenPosition.xy * _ScreenParams.xy;
        float DITHER_THRESHOLDS[16] =
        {
            1.0 / 17.0,  9.0 / 17.0,  3.0 / 17.0, 11.0 / 17.0,
            13.0 / 17.0,  5.0 / 17.0, 15.0 / 17.0,  7.0 / 17.0,
            4.0 / 17.0, 12.0 / 17.0,  2.0 / 17.0, 10.0 / 17.0,
            16.0 / 17.0,  8.0 / 17.0, 14.0 / 17.0,  6.0 / 17.0
        };
        uint index = (uint(uv.x) % 4) * 4 + uint(uv.y) % 4;
        Out = In - DITHER_THRESHOLDS[index];
    }
    The first thing I tried was changing the line that computes the index to use "fmod" instead of "%". So this line
    Code (Shader):
    uint index = (uint(uv.x) % 4) * 4 + uint(uv.y) % 4;
    changed to this
    Code (Shader):
    uint index = fmod(uint(uv.x), 4) * 4 + fmod(uint(uv.y), 4);
    This didn't make any difference. However, I then changed the line to just this
    Code (Shader):
    uint index = 1;
    And performance went back to normal. Next I tried this
    Code (Shader):
    uint index = (4 % 4) * 4 + 4 % 4;
    And performance was still good. Next I tried this
    Code (Shader):
    uint index = (uv.x % 4) * 4 + uv.y % 4;
    And performance was poor again. Next I tried
    Code (Shader):
    uint index = (uint(200.5) % 4) * 4 + uint(100.555) % 4;
    And performance was good again...

    I've done a bunch of other tests, but it seems to me like the problem lies in converting the float values from uv into a uint. Which doesn't make any sense to me considering that the test where I just replace uv.x and uv.y with static floats ran very well (And also the fact that something like that is extremely basic and shouldn't cause any issues).

    I'm having a hard time coming up with a workaround because of the need to use an int to select from the dither thresholds array. I've tried a bunch of different methods to try and get uv into an int, such as flooring it, or not casting it as a uint, but it seems that as long as that value is used to calculate the index value, performance is ruined on my MacBook.

    I've managed to create a similar dithering effect by using texture sampling at least, but despite it basically using the same color values it doesn't look as good. Not sure what the problem there is but if I can't find a way to modify this code to get it to work on Mac I might continue messing with this texture sampling method.

    Anyone have any ideas on what could possibly be going on here, and how I might work around this?
     
  9. Neto_Kokku

    Joined:
    Feb 15, 2018
    Posts:
    1,751
    It does seem more like a Mac problem than a Unity problem. That major drop in performance sounds like the OS is falling back to software rendering.

    That dither function does have a red flag to it: the dither threshold array. On GPUs there really isn't any such thing as an actual local variable array: compilers will do their best to compile them into whatever can fulfill the intent of the high-level code.

    When you replace the UV with hard-coded values, the compiler skips the array indexing completely because it knows exactly what value to use at compile time.

    So the problem here is probably not converting the uv to int, but dynamically indexing into the local array, which could very well be converted into 16 branches by the shader compiler or something like that.
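
    If dynamic indexing into the local array is indeed the cause, one option would be to compute the threshold arithmetically so that no array is needed. The sketch below is a Custom Function body in the same style as the code earlier in the thread, and it assumes the same In, ScreenPosition and Out ports. The bit pattern reproduces the 16 values of DITHER_THRESHOLDS, but it has not been tested on the affected MacBooks, so whether it actually avoids the slowdown is an assumption. It also relies on integer bitwise operators, which Metal supports.

```hlsl
{
    float2 uv = ScreenPosition.xy * _ScreenParams.xy;

    // Pixel position within the repeating 4x4 tile
    // (for unsigned values, & 3 is the same as % 4).
    uint x = uint(uv.x) & 3u;
    uint y = uint(uv.y) & 3u;

    // Build the 4x4 Bayer rank (0..15) from the bits of x and y instead of
    // reading it from an array. For the table in the original code, where
    // index = x * 4 + y, the rank equals DITHER_THRESHOLDS[index] * 17 - 1.
    uint q = x ^ y;
    uint rank = ((q & 1u) << 3)   // bit 3: low bit of (x XOR y)
              | ((x & 1u) << 2)   // bit 2: low bit of x
              |  (q & 2u)         // bit 1: high bit of (x XOR y)
              |  (x >> 1);        // bit 0: high bit of x

    Out = In - (float(rank) + 1.0) / 17.0;
}
```

    For example, x = 1 and y = 0 give rank 12 and a threshold of 13/17, which matches DITHER_THRESHOLDS[4] in the original table.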
     
  10. BasicallyGames

    Joined:
    Aug 31, 2018
    Posts:
    91
    Ah, that makes a lot of sense, thanks. I'll keep looking into getting my texture sampling method right then.
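
    For the texture sampling route, a minimal sketch of a Custom Function body might look like the following. DitherTex and DitherSampler are hypothetical port names, not from the thread. The sketch assumes a 4x4 single-channel texture whose texel (x, y) stores DITHER_THRESHOLDS[x * 4 + y], with the wrap mode set to Repeat. SAMPLE_TEXTURE2D is the sampling macro from the SRP Core shader library. If the result looks different from the Dither node, the texture import settings are worth checking (point filtering, no compression, sRGB off, no mip maps), since any of those would alter the stored thresholds. That is a guess, as the thread does not say what caused the difference.

```hlsl
{
    float2 uv = ScreenPosition.xy * _ScreenParams.xy;

    // Dividing the pixel position by the texture size (4) and letting the
    // Repeat wrap mode tile the result maps each screen pixel to one texel.
    float threshold = SAMPLE_TEXTURE2D(DitherTex, DitherSampler, uv / 4.0).r;

    Out = In - threshold;
}
```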