Step vs. >= operator

AlexWige · Dec 29, 2018

Hi everyone,

I know that, for performance reasons, we should avoid if statements in Cg. So, as one does, I use the step function quite a lot, but I've also seen the > and < operators used for the same kind of thing.

Code (Cg):

// Outputs 0 or 1

float test1 = (x > y);

So I was wondering, does this count as branching? In other words, can I use those operators instead of step() and get the same performance? I just think it is so much clearer at first glance, instantly readable, and it offers more options, since you can use >, <, <= or >=.

Peter77 · Dec 29, 2018

If I remember correctly, step() is being expanded to >= by Unity's shader generator. So it doesn't seem to make a difference in Unity whether you use step or >.

To be sure, fire up a profiler such as RenderDoc, PIX, or Instruments, look at the compiled shader, and compare performance.

Przemyslaw_Zaworski · Dec 29, 2018

Comparison:

Code (HLSL):

float4 SetPixelShader (float4 vertex:POSITION, float2 uv:TEXCOORD0) : SV_TARGET
{
    float k = uv.x >= 0.5;
    return k.xxxx;
}

Code (D3D11 disassembly):

-- Hardware tier variant: Tier 1
-- Fragment shader for "d3d11":
// Stats: 2 math, 1 temp registers
Shader Disassembly:
//
// Generated by Microsoft (R) D3D Shader Disassembler
//
//
// Input signature:
//
// Name                 Index   Mask Register SysValue  Format   Used
// -------------------- ----- ------ -------- -------- ------- ------
// SV_Position              0   xyzw        0      POS   float
// TEXCOORD                 0   xy          1     NONE   float   x
//
//
// Output signature:
//
// Name                 Index   Mask Register SysValue  Format   Used
// -------------------- ----- ------ -------- -------- ------- ------
// SV_TARGET                0   xyzw        0   TARGET   float   xyzw
//
ps_4_0
dcl_input_ps linear v1.x
dcl_output o0.xyzw
dcl_temps 1
0: ge r0.x, v1.x, l(0.500000)
1: and o0.xyzw, r0.xxxx, l(0x3f800000, 0x3f800000, 0x3f800000, 0x3f800000)
2: ret
// Approximately 0 instruction slots used

-----------------------------------------------------------------------------------------------------------------

Code (HLSL):

float4 SetPixelShader (float4 vertex:POSITION, float2 uv:TEXCOORD0) : SV_TARGET
{
    float k = step (0.5, uv.x);
    return k.xxxx;
}

Code (D3D11 disassembly):

-- Hardware tier variant: Tier 1
-- Fragment shader for "d3d11":
// Stats: 2 math, 1 temp registers
Shader Disassembly:
//
// Generated by Microsoft (R) D3D Shader Disassembler
//
//
// Input signature:
//
// Name                 Index   Mask Register SysValue  Format   Used
// -------------------- ----- ------ -------- -------- ------- ------
// SV_Position              0   xyzw        0      POS   float
// TEXCOORD                 0   xy          1     NONE   float   x
//
//
// Output signature:
//
// Name                 Index   Mask Register SysValue  Format   Used
// -------------------- ----- ------ -------- -------- ------- ------
// SV_TARGET                0   xyzw        0   TARGET   float   xyzw
//
ps_4_0
dcl_input_ps linear v1.x
dcl_output o0.xyzw
dcl_temps 1
0: ge r0.x, v1.x, l(0.500000)
1: and o0.xyzw, r0.xxxx, l(0x3f800000, 0x3f800000, 0x3f800000, 0x3f800000)
2: ret
// Approximately 0 instruction slots used
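
Both listings compile to the same two instructions, a ge followed by an and. In D3D assembly, ge writes an all-ones mask (0xFFFFFFFF) when the comparison is true and 0 otherwise, and 0x3f800000 is the bit pattern of 1.0f, so the and leaves either 1.0 or 0.0 in the output. A rough hand-written equivalent in HLSL, only as a sketch to make that readable (the function name GreaterEqualAsFloat is made up for illustration):

```hlsl
// Hypothetical helper that mirrors the two disassembled instructions by hand.
float GreaterEqualAsFloat(float x, float edge)
{
    // "ge": the comparison result as a bit mask,
    // all ones when true and zero when false.
    uint mask = (x >= edge) ? 0xFFFFFFFFu : 0u;

    // "and": 0x3f800000 is the IEEE 754 bit pattern of 1.0f,
    // so masking it yields the bits of either 1.0f or 0.0f.
    return asfloat(mask & 0x3f800000u);
}
```

There is no reason to write this yourself; the compiler already produces it from either `uv.x >= 0.5` or `step(0.5, uv.x)`.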

bgolus · Dec 29, 2018

Peter77 said: ↑

step() is being expanded to >= by Unity's shader generator

Not by Unity’s shader generator; that’s how step() is implemented in HLSL and GLSL. It would compile to the same code whether or not you use Unity.
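
For reference, the HLSL documentation describes step(y, x) as (x >= y) ? 1 : 0, so all four comparison operators can be written either way. A small sketch, assuming scalar float inputs with no NaNs (the function names are made up for illustration):

```hlsl
// step(edge, x) returns 1 when x >= edge, otherwise 0.
float GreaterEqual(float x, float y) { return step(y, x); }        // x >= y
float LessEqual(float x, float y)    { return step(x, y); }        // x <= y
float Greater(float x, float y)      { return 1.0 - step(x, y); }  // x >  y
float Less(float x, float y)         { return 1.0 - step(y, x); }  // x <  y
```

Writing `float g = (x > y);` directly is the shorter form of Greater: the bool result converts to 0.0 or 1.0 on assignment.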

Gloubi88 said: ↑

does this count as branching?

Not really, no. I often refer to step() and similar inline >= comparisons as “fast branches”, but really they’re just comparisons and aren’t branches at all. There’s no control flow and there are no divergent code paths; it’s just choosing one value or another. In the vast majority of situations, even more complex if statements get turned into these kinds of comparisons, with both sides of the conditional running 100% of the time and the GPU simply choosing the appropriate result afterwards.
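
As a sketch of what that looks like in source, neither function below needs a real branch. The first would typically compile to a compare followed by a conditional move, and HLSL's [flatten] attribute requests that behaviour explicitly; the second is plain arithmetic. The function names and the two color parameters are placeholders, and the exact instructions depend on the compiler and target:

```hlsl
float4 PickWithIf(float2 uv, float4 colorA, float4 colorB)
{
    float4 result;
    // [flatten] asks the compiler to evaluate both sides and select one result.
    [flatten]
    if (uv.x >= 0.5)
        result = colorA;
    else
        result = colorB;
    return result;
}

float4 PickWithLerp(float2 uv, float4 colorA, float4 colorB)
{
    // step() yields 0 or 1, so lerp() returns colorB or colorA
    // (up to floating-point rounding).
    return lerp(colorB, colorA, step(0.5, uv.x));
}
```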

An actual branch would show an if_z or if_nz in the compiled shader, unlike the above examples from @Przemyslaw_Zaworski which show a ge.
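
If you want to see a real branch in the disassembly for comparison, HLSL's [branch] attribute asks the compiler to keep the if as actual control flow. A sketch, assuming a texture and sampler named _MainTex and sampler_MainTex (placeholders, as is the function name); whether the hint is honored depends on the target and on what the branch contains:

```hlsl
Texture2D _MainTex;            // placeholder texture
SamplerState sampler_MainTex;  // placeholder sampler

float4 SampleRightHalf(float2 uv)
{
    float4 color = float4(0.0, 0.0, 0.0, 1.0);

    // [branch] requests a dynamic branch, so the texture fetch
    // can be skipped when the condition is false.
    [branch]
    if (uv.x >= 0.5)
    {
        // SampleLevel takes an explicit mip level, so it does not need
        // screen-space derivatives and is allowed inside dynamic flow control.
        color = _MainTex.SampleLevel(sampler_MainTex, uv, 0);
    }
    return color;
}
```

With the branch kept, the compiled shader should contain an if_z or if_nz block around the sample instruction instead of only a ge.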

ReadyPlayGames · Jan 2, 2019

Interesting! I didn't know this myself.
