expensiveness of high powers

Discussion in 'Shaders' started by CauseMoss, Dec 4, 2019.

  1. CauseMoss

    CauseMoss

    Joined:
    Sep 16, 2017
    Posts:
    20
    Is raising to high powers (like 50) in shader code more expensive than raising to low powers? And is raising to integer powers less expensive than raising to float powers? If so, how can I avoid them? For example, when you want a smooth gradient to be sharper and "moved" closer to 1.
     
  2. Olmi

    Olmi

    Joined:
    Nov 29, 2012
    Posts:
    1,553
    Hi,
    To answer one of your questions: on PC GPUs, shaders always use full 32-bit floating point precision as far as I know, so it really doesn't make much difference whether you use float, half or fixed. But if someone knows better, correct me.

    And if you use pow, you might try to avoid it altogether if possible, for example by baking lookup textures in advance and only updating them when your value changes. Then it will be faster for sure.
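
    A minimal sketch of the lookup-texture idea, assuming Unity-style Cg/HLSL in a fragment shader. `_RampTex` and `sharpenWithLut` are hypothetical names, not something from this thread, and whether a texture fetch actually beats pow() depends on the hardware and on the rest of the shader:

    ```hlsl
    // _RampTex is a hypothetical Nx1 texture whose red channel stores the
    // pre-sharpened curve, baked on the CPU whenever the exponent changes.
    sampler2D _RampTex;

    // t is the original 0..1 gradient value.
    float sharpenWithLut(float t)
    {
        // Sample along U; V is fixed at the centre of the single row.
        return tex2D(_RampTex, float2(saturate(t), 0.5)).r;
    }
    ```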
     
  3. bgolus

    bgolus

    Joined:
    Dec 7, 2012
    Posts:
    12,339
    Yes and no.

    If you're hard coding the number in the shader, then yes. It also depends on if you're using integer or floating point powers. A pow(x, 1.99) is several times more expensive than pow(x, 2).

    If the power is a hardcoded integer, the cycle cost is roughly the number of multiplies needed to reproduce the power. With intermediate results reused, that grows roughly with the log2 of the power, rounded up to the next whole number.

    pow(x, 2) = x * x, so 1 instruction.

    pow(x, 3) = x * x * x, so 2 instructions.

    pow(x, 4) = ...
    Code (csharp):
    float pow4(float x) {
      float x2 = x * x;
      return x2 * x2;
    }
    Which is also 2 instructions.

    etc.
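
    To spell out the "etc.", here is a rough sketch of two more hardcoded integer powers written as multiplies. `pow5` and `pow8` are hypothetical helper names used only for illustration; what the compiler actually emits may differ:

    ```hlsl
    // x^5 in 3 multiplies, reusing the intermediate squares.
    float pow5(float x)
    {
        float x2 = x * x;   // x^2
        float x4 = x2 * x2; // x^4
        return x4 * x;      // x^5
    }

    // x^8 in 3 multiplies by repeated squaring.
    float pow8(float x)
    {
        float x2 = x * x;   // x^2
        float x4 = x2 * x2; // x^4
        return x4 * x4;     // x^8
    }
    ```

    Both take 3 multiplies, even though the powers differ, because each step reuses the previous result.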

    Non-integer powers, as well as any non-hardcoded power (material property, interpolated value from the vertex shader, per-vertex data, etc.), get calculated as:
    Code (csharp):
    float power(float x, float y) {
      // x^y = exp(y * log(x))
      return exp(y * log(x));
    }
    That’s three instructions, an exp, log, and mul, but those first two aren’t guaranteed to be a single cycle on all hardware. Most likely 2 cycles each. So a total of 5 cycles. Equivalent to a hardcoded pow(x, 25).
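
    As a sketch of the same identity in base 2, which is the form GPU instruction sets typically expose: `powViaExp2` is a hypothetical name, and the positive-base restriction is stated as an assumption in the comment.

    ```hlsl
    // The general pow() written out as log2, mul, exp2.
    // Assumes x > 0: log2 of zero or a negative number does not give a
    // usable result, which is why pow() with a negative base misbehaves.
    float powViaExp2(float x, float y)
    {
        return exp2(y * log2(x));
    }
    ```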

    edit: As a side note, once a hardcoded integer pow() would take more than 3 multiplies (e.g. pow(x, 10)), the shader gets compiled to the exp(y * log(x)) version, so it would seem modern shader compilers assume that's just as fast as or faster than 4 multiplies.
     
    Last edited: Dec 5, 2019
  4. CauseMoss

    CauseMoss

    Joined:
    Sep 16, 2017
    Posts:
    20
    Ok, thank you.
    And how would you tighten those dot products, like specular light or fresnel, if you want them to be sharper and closer to the edge (fresnel)? It works with high powers, but how do I avoid the expensive calculations?
     
  5. bgolus

    bgolus

    Joined:
    Dec 7, 2012
    Posts:
    12,339
    Specular powers tend to be pretty high values, and set by a non-static variable. So it's going to always be the 3 instructions (3~5 cycle) version of pow. That's fine and honestly inconsequential to the cost of the rest of the lighting calculations.
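
    For reference, a minimal sketch of a fresnel-style rim term driven by a non-static power. `_FresnelPower` and `fresnelTerm` are hypothetical names, and the inputs are assumed to be in the same space:

    ```hlsl
    // Hypothetical material property; higher values push the effect
    // closer to the silhouette edge.
    float _FresnelPower;

    float fresnelTerm(float3 normalDir, float3 viewDir)
    {
        // 1 when looking straight at the surface, 0 at grazing angles.
        float nDotV = saturate(dot(normalize(normalDir), normalize(viewDir)));
        // The power is not a compile-time constant, so this is the
        // general log/mul/exp form of pow() regardless of its value.
        return pow(1.0 - nDotV, _FresnelPower);
    }
    ```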
     