About branching and min/max

Discussion in 'Shaders' started by jamius19, Oct 1, 2017.

  1. jamius19

    Joined:
    Mar 30, 2015
    Posts:
    96
    As we know, on mobile GPUs, branching caused by if-else statements can kill performance, depending on the situation.
    Looking at Nvidia's Cg documentation, it seems that the max() and min() functions are implemented by means of the ternary operator. See here.

    Now, as the ternary operator is basically just another form of the if statement, would that mean using the max() or min() functions will have an impact on mobile game performance (OpenGL ES 2.0)?
    Can anybody elaborate on this topic a little bit, as I'm a bit confused about this.

    If so, how can I optimize a shader that's already using the max() or min() functions?
    Any kind of direction is appreciated.

    Thanks for any help regarding this issue.
     
  2. Michal_

    Joined:
    Jan 14, 2015
    Posts:
    365
    First of all, you have to understand that there is a big difference between the source code you write and the compiled code that runs on the hardware. This means that not every 'if-else' will be compiled as a conditional branch; in fact, most of them won't. 'if-else' statements are often compiled so that both branches are executed every time and the correct result is then selected with a conditional move instruction, without any branching.
    Consider the following snippet:
    Code (CSharp):
    if (a < b)
        c = a;
    else
        c = b;
    This will probably be compiled into 2 hardware instructions (pseudo assembler):
    Code (CSharp):
    // less-than instruction; c = a < b ? 1 : 0
    lt c, a, b
    // conditional move instruction; c = c == 1 ? a : b
    movc c, c, a, b
    The above instructions are usually implemented in hardware and are therefore very fast.
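    To make the compare-then-select idea concrete, here is a minimal sketch in plain C (not shader code, and not what any particular compiler is guaranteed to emit). The function names min_branch and min_select are made up for illustration. Both functions return the same value; the second one is simply written in the shape of the two pseudo-instructions above.

```c
/* Branching form, as written in the source snippet. */
float min_branch(float a, float b)
{
    float c;
    if (a < b)
        c = a;
    else
        c = b;
    return c;
}

/* Same result, written as a comparison followed by a select.
   This mirrors the pseudo-assembler "lt" and "movc" steps;
   whether real hardware does it this way depends on the
   compiler/driver/GPU combination. */
float min_select(float a, float b)
{
    int cond = (a < b);   /* lt:   cond = a < b ? 1 : 0      */
    return cond ? a : b;  /* movc: c = cond == 1 ? a : b     */
}
```

    Since both inputs are already available, picking one of them costs very little, which is why a min() or max() written as a ternary is usually nothing to worry about.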

    So, to summarize, min/max functions and simple if-else statements are very fast on any half-decent GPU. But it all depends on your specific compiler/driver/GPU combination. The only way to be sure is to take a look at the disassembly for your target platform (different GPU vendors provide different tools for that).

    With all that being said, I wouldn't worry about micro-optimizations without profiling the shaders first. Modern GPUs are very complex machines and they sometimes behave in counter-intuitive ways. So profiling is a must, and learn to love disassembly if you really care about optimizations.
     
  3. bgolus

    Joined:
    Dec 7, 2012
    Posts:
    12,352
    The if statement gets a bad rap from people not understanding what it does in shaders, and from people who did understand telling people who didn't not to use it.

    Like @Michal_ said, it really depends on the way it's used; the if itself isn't really the problem. The biggest problem with conditionals on OpenGL ES 2.0 is people treating them like they would in C# or any other programming language and using if statements to choose between different, relatively expensive chunks of code. Doing so produces a shader that's considerably more expensive than a shader that only does one option, but that's not because of the if; it's because the shader is still calculating both chunks of code.
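    As a rough illustration in plain C (again, not real shader code), the sketch below counts how often each side is evaluated. expensive_a, expensive_b and the call counters are hypothetical stand-ins for two costly chunks of shader code. shade_as_written is what the author types; shade_flattened is approximately what you pay for when the conditional is compiled without a real branch.

```c
/* Hypothetical counters, only here to make the cost visible. */
int calls_a = 0;
int calls_b = 0;

/* Hypothetical stand-ins for two relatively expensive chunks of code. */
float expensive_a(float x) { calls_a++; return x * x + 1.0f; }
float expensive_b(float x) { calls_b++; return x * 0.5f - 1.0f; }

/* What is written: it looks as if only one side runs. */
float shade_as_written(int use_a, float x)
{
    if (use_a)
        return expensive_a(x);
    else
        return expensive_b(x);
}

/* Roughly what a flattened conditional does: evaluate both
   sides every time, then select one of the two results. */
float shade_flattened(int use_a, float x)
{
    float ra = expensive_a(x);
    float rb = expensive_b(x);
    return use_a ? ra : rb;
}
```

    The select at the end is cheap; the cost comes from both expensive calls running for every invocation.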

    In OpenGL ES 3.0 and modern desktops an if statement can cause a real branch and only execute one side of the code. In this case the if can have an impact on performance by itself depending on how it's used. If you're curious about that you should try reading up on dynamic flow control and static vs dynamic branching.
     
  4. jamius19

    Joined:
    Mar 30, 2015
    Posts:
    96
    Thanks for your detailed explanation. Much appreciated!
     
  5. jamius19

    Joined:
    Mar 30, 2015
    Posts:
    96
    Thanks for the explanation and suggestion regarding the dynamic flow control and static vs dynamic branching. :D