Help with looping bottleneck in shader

AzeExMachina · Feb 25, 2019

Hi everyone,

I am creating a heatmap for my application and visualizing it over a world map, as in this example here.

I render it with a shader applied to a quad as big as my camera frustum, and that part works.

The issue is that, for every single pixel of the quad, the shader has to loop over all the points I want to visualize. That is a big bottleneck: it can be something like 100x100x300 iterations, which obviously weighs too heavily on the app's performance. Does anyone know how to avoid this?

One thing I did to try to avoid this was to divide my quad into many little quads each with its own material, positioned in a grid as big as the first quad. This way I could save on my performance but it's a bit confusing when the point is shared between quads

My shader code is below. To pass the points I'm simply using things like Material.SetVectorArray().

Code (CSharp):

// _Points, _Properties, _Points_Length and _HeatTex are declared elsewhere in the shader (not shown)

struct vertInput
{
    float4 pos : POSITION;
};

struct vertOutput
{
    float4 pos : SV_POSITION;     // clip-space position
    float3 worldPos : TEXCOORD1;  // float, not fixed: world coordinates exceed fixed's small range
};

vertOutput vert(vertInput input)
{
    vertOutput o;
    o.pos = UnityObjectToClipPos(input.pos);
    o.worldPos = mul(unity_ObjectToWorld, input.pos).xyz;
    return o;
}

half4 frag(vertOutput output) : SV_Target
{
    half h = 0;
    for (int i = 0; i < _Points_Length; i++)
    {
        // _Points[i].xyz is the point I'm passing to my shader
        float dist = distance(output.worldPos, _Points[i].xyz);
        // this is the radius of the area around the actual point
        half radi = _Properties[i].x;
        // linear falloff: 1 at the point, 0 at the radius and beyond
        half hi = 1 - saturate(dist / radi);
        // _Properties[i].y is just an intensity modifier
        h += hi * _Properties[i].y;
    }
    h = saturate(h);
    // look up the final color in the heat gradient texture
    half4 color = tex2D(_HeatTex, float2(h, 0.5));
    return color;
}

bgolus · Feb 25, 2019

AzeExMachina said:

One thing I did to try to avoid this was to divide my quad into many little quads each with its own material, positioned in a grid as big as the first quad. This way I could save on my performance but it's a bit confusing when the point is shared between quads

Yeah, this is a good approach for what you're attempting. You're roughly doing what many tiled / clustered lighting systems do, which is to cut down the number of objects processed per tile. Note that unless you're careful, each tile will still iterate over all "300" points even if you break it down into individual quads! When using WebGL or GLES you're kind of limited in what you can do, since many of the dynamic branching / iterating options that desktop GPUs have access to aren't available. Specifically, WebGL can't do dynamic iterators, so the loop is unrolled to a fixed length and does all of the work regardless of what _Points_Length is set to. The real shader then looks a bit like this:

Code (csharp):

float h = 0;

float temp0 = calcH(_Points[0]);
if (0 < _Points_Length)
    h += temp0;

float temp1 = calcH(_Points[1]);
if (1 < _Points_Length)
    h += temp1;

float temp2 = calcH(_Points[2]);
if (2 < _Points_Length)
    h += temp2;

// ...

float temp299 = calcH(_Points[299]);
if (299 < _Points_Length)
    h += temp299;
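
To make the cost concrete, here is a rough CPU-side illustration in Python (not shader code, and not what the compiler literally emits). It mimics the unrolled loop and counts how often the per-point work runs. The names `MAX_POINTS`, `shade_pixel_unrolled` and `calc_h` are made up for this sketch; 300 is simply the point count used in this thread.

```python
MAX_POINTS = 300  # assumed fixed unroll length, taken from the thread's example


def shade_pixel_unrolled(points, points_length, calc_h):
    """Mimic the unrolled loop for one pixel.

    calc_h runs MAX_POINTS times no matter what points_length is;
    only the accumulation into h is conditional.
    """
    h = 0.0
    evaluations = 0
    for i in range(MAX_POINTS):
        temp = calc_h(points[i])  # always executed
        evaluations += 1
        if i < points_length:     # only the add is skipped
            h += temp
    return h, evaluations


# Even with only 10 active points, every pixel still pays for all 300.
points = [1.0] * MAX_POINTS
h, evaluations = shade_pixel_unrolled(points, 10, lambda p: p)
```

Under these assumptions, a 100x100 pixel quad ends up doing 100 * 100 * 300 = 3,000,000 evaluations per frame regardless of how many points are actually in use.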

You might be better off creating a version with a hard-coded limit of 10 or 20 items, then binning your points by position and rendering scaled quads that cover the area those points affect, with only those 10 or 20 points per quad. Render that into an R8 format render texture and read from that when drawing the heat map.
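
A minimal sketch of the CPU-side binning, written in Python for brevity (in a Unity project this logic would live in a C# script). It assumes 2D points given as `(x, y, radius, intensity)` tuples, a hypothetical grid `cell_size`, and a hypothetical limit of 20 points per batch; `bin_points` is not a Unity API. Each batch's bounds are grown by the point radii, so quads may overlap. That only gives the right result if the quads are blended additively into the render texture, which is an assumption of this sketch rather than something stated above.

```python
from collections import defaultdict

MAX_POINTS_PER_BATCH = 20  # assumed hard-coded limit of the shader variant


def bin_points(points, cell_size, max_per_batch=MAX_POINTS_PER_BATCH):
    """Group nearby points into batches of at most max_per_batch.

    points: list of (x, y, radius, intensity) tuples.
    Returns a list of (bounds, members), where bounds is
    (min_x, min_y, max_x, max_y) and covers every member's full radius.
    """
    # Bucket points by the grid cell that contains their center.
    cells = defaultdict(list)
    for p in points:
        key = (int(p[0] // cell_size), int(p[1] // cell_size))
        cells[key].append(p)

    batches = []
    for key in sorted(cells):
        members = cells[key]
        # Split crowded cells so no batch exceeds the shader limit.
        for start in range(0, len(members), max_per_batch):
            chunk = members[start:start + max_per_batch]
            bounds = (
                min(x - r for x, y, r, _ in chunk),
                min(y - r for x, y, r, _ in chunk),
                max(x + r for x, y, r, _ in chunk),
                max(y + r for x, y, r, _ in chunk),
            )
            batches.append((bounds, chunk))
    return batches
```

Each returned batch would correspond to one quad: scale the quad to `bounds` and upload only that batch's points to the fixed-size shader before drawing it.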

You could even do it with a max count of, say, 50 and then draw to the RT over a few frames when the data changes rather than doing it all at once. Then you don't have to calculate it every update, as the heatmap will be cached.
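
The spread-over-frames idea could be sketched like this, again in Python as a stand-in for the C# script that would drive it. `HeatmapUpdater`, `set_points` and `tick` are hypothetical names; the actual drawing into the render texture is left out.

```python
class HeatmapUpdater:
    """Feeds points to the renderer a slice at a time, then goes idle."""

    def __init__(self, max_per_frame=50):
        self.max_per_frame = max_per_frame
        self.points = []
        self.cursor = 0  # index of the next point still to be drawn

    def set_points(self, points):
        """Call when the data changes; restarts the incremental redraw."""
        self.points = list(points)
        self.cursor = 0  # real code would also clear the render texture here

    def tick(self):
        """Call once per frame.

        Returns the points to draw this frame, or an empty list when the
        cached render texture is already up to date.
        """
        if self.cursor >= len(self.points):
            return []
        end = min(self.cursor + self.max_per_frame, len(self.points))
        batch = self.points[self.cursor:end]
        self.cursor = end
        return batch
```

With 300 points and a limit of 50, the texture is rebuilt over six frames; after that `tick()` returns nothing until `set_points` is called again, so the heatmap draw only has to sample the cached texture.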

AzeExMachina · Mar 8, 2019

Hey there, thank you for your answer, glad I was already on the right track. In the end I did just that: I divided all the points into buckets and also put the output on a RenderTexture so it only updates when needed. Thanks for your help!
