
Why does smoothstep get "optimised" on OpenGL ES?

Discussion in 'Shaders' started by mikesx, Mar 3, 2016.

  1. mikesx

    Joined: Oct 1, 2013, Posts: 4
    I'm trying to find out why smoothstep gets expanded on OpenGL ES:
    Code (GLSL):
    tmpvar_3 = clamp (((c_1.xyz - vec3(_min)) / (vec3(_max) - vec3(_min))), 0.0, 1.0);
    c_1.xyz = (c_1.xyz * (tmpvar_3 * (tmpvar_3 * (3.0 - (2.0 * tmpvar_3)))));
    If the GPU has a native function for it, wouldn't that be faster?
    There must be a reason for doing this?

    According to PVRShaderEditor, the shader I'm looking at goes from 11 cycles to 3 cycles when I put the smoothstep back in.
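    For reference, the two expanded lines quoted above match the definition of smoothstep in the GLSL specification: clamp the normalised input to [0, 1], then apply the Hermite polynomial t * t * (3 - 2 * t). A minimal sketch in Python (not GLSL), working on one channel at a time, that checks this numerically. The helper names clamp, expanded and original are made up for illustration, and original is only a guess at what the shader looked like before GLSL Optimizer rewrote it:

```python
def clamp(x, lo, hi):
    """Scalar stand-in for GLSL clamp(x, lo, hi)."""
    return max(lo, min(x, hi))


def smoothstep(edge0, edge1, x):
    """smoothstep as defined in the GLSL spec: clamp, then Hermite polynomial."""
    t = clamp((x - edge0) / (edge1 - edge0), 0.0, 1.0)
    return t * t * (3.0 - 2.0 * t)


def expanded(c, _min, _max):
    """Per-channel copy of the two expanded lines from the post."""
    tmpvar_3 = clamp((c - _min) / (_max - _min), 0.0, 1.0)
    return c * (tmpvar_3 * (tmpvar_3 * (3.0 - (2.0 * tmpvar_3))))


def original(c, _min, _max):
    """Assumed pre-expansion form: colour scaled by smoothstep of itself."""
    return c * smoothstep(_min, _max, c)


if __name__ == "__main__":
    # Compare both forms over a handful of channel values.
    for c in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0):
        print(c, expanded(c, 0.2, 0.8), original(c, 0.2, 0.8))
```

    The two forms only differ in how the multiplications are grouped, so any difference is at the level of floating-point rounding. This says nothing about which form a particular driver compiles better.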
     
  2. Farfarer

    Joined: Aug 17, 2010, Posts: 2,249
    I'm guessing for compatibility purposes?
     
  3. Michal_

    Joined: Jan 14, 2015, Posts: 365
    I wouldn't worry about that. GLSL Optimizer will inline anything it can to make it easier for the compiler in the driver.
    The important thing to realize is that not every built-in GLSL/HLSL function is directly supported by the GPU. Very few are, in fact. Functions like sin(), sqrt() or rcp() map directly to hw instructions (depending on the specific hw). But more complex functions like normalize(), pow() and smoothstep() don't have direct hw support and will be expanded into multiple instructions by the compiler in the driver.
    So built-in functions are more of a hint to the compiler than anything else. If you use normalize(), you're telling the compiler that you want to normalize a vector, and it will give you the best way to do it on the given hw. But that doesn't mean the hw has direct support for it. The driver will likely expand normalize() into 3 different instructions.
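    As a rough illustration of what "3 different instructions" could look like, here is a minimal sketch in Python (not GLSL) that spells normalize() out as a dot product, a reciprocal square root and a multiply. This particular split is an assumption for illustration, the actual instruction sequence depends on the hw and the driver, and the helper names are made up:

```python
import math


def dot(a, b):
    """Stand-in for GLSL dot(a, b)."""
    return sum(x * y for x, y in zip(a, b))


def inversesqrt(x):
    """Stand-in for GLSL inversesqrt(x)."""
    return 1.0 / math.sqrt(x)


def normalize_expanded(v):
    """normalize(v) written as three steps (assumed expansion, not a real driver's)."""
    length_sq = dot(v, v)                  # step 1: squared length
    inv_len = inversesqrt(length_sq)       # step 2: reciprocal square root
    return tuple(x * inv_len for x in v)   # step 3: scale each component


if __name__ == "__main__":
    print(normalize_expanded((3.0, 0.0, 4.0)))
```

    Like normalize() in GLSL, this sketch does not guard against a zero-length vector.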

    In other words, if GLSL Optimizer didn't expand smoothstep, then the compiler in the driver likely would. I even tried PVRShaderEditor with both variants and got the same estimate for both. Not sure why you get different results. You can always measure the difference if you want to be 100% sure.
     
  4. mikesx

    Joined: Oct 1, 2013, Posts: 4
    This doesn't make logical sense. Why would you expand it before the driver? The driver will know the best way to execute this function based on its hardware.

    I've just double-checked and I must have made a mistake editing in PVRShaderEditor. I do get a small gain using smoothstep rather than the expanded version, but it's more like a one-cycle difference, which is negligible.
     
  5. Michal_

    Joined: Jan 14, 2015, Posts: 365
    In an ideal world, yes. It would be better to let the driver do its job. Not so much in the real world. Driver compilers are notoriously bad. Or at least they were. That's why Unity and many others pre-optimize GLSL shaders. It was common to get several times better performance if you pre-optimized your shaders with GLSL Optimizer. Drivers were that bad!

    It is possible that PowerVR now has a decent driver and that it will give you better performance than GLSL Optimizer, but this is not the case in general.

    I don't know why GLSL Optimizer expands smoothstep, but chances are it makes the shader faster on some GPUs.