
Super high level questions on understanding compute shaders limitations for mesh generation

Discussion in 'Shaders' started by Luthis, Oct 13, 2021.

  1. Luthis

    Luthis

    Joined:
    May 20, 2020
    Posts:
    14
    Hi all, I have just some basic general questions to clarify my understanding. I've been doing a course on Udemy and still haven't quite fit everything into place. Here's my understanding, could anyone tell me if I'm wrong?

    1. Compute shaders: exclusively do a bunch of [parallel] generic calculations (i.e. not real on-screen things) to create data that can be used elsewhere. Run once, get data, use it in C# scripts.
    2. Frag/vert shaders: modify screen pixels (i.e. real on-screen things).
    3. C# + Editor: does real on-screen things.

    If I wanted to use the magical power of shaders for procedural terrain generation, I would use a compute shader to generate vertices/heightmaps etc., but I would still need to pass that data back to the CPU in order to actually use it, right? So to actually get something to show on screen, I would need to use the compute-generated data to create a static mesh object and instantiate it, rather than the compute shader being able to do all of that internally?

    I've been playing with a project that uses a compute shader + frag shader and Graphics.DrawProceduralIndirectNow in OnPostRender() to draw a mesh, and I can modify that, but at medium sizes it lags out heavily. My guess is that the lag is due to the generation happening over and over, when it only needs to happen once for something static.

    So to properly do this, would I use a RWStructuredBuffer in the compute to get back the vertices, and then create a mesh in C#, or is there a better way to do it?

    Lastly, is the only use case for a compute shader to do a bunch of simple calculations to save on CPU time? Or am I missing a huge chunk of knowledge, and is it possible to build an entire game in HLSL?

    Thanks!
     
  2. bgolus

    bgolus

    Joined:
    Dec 7, 2012
    Posts:
    12,352
    Let's skip this one for the moment.

    This is mostly right. Shaders that you can assign to a material, which would be vertex (&) fragment shaders, Surface Shaders, Shader Graph, etc. (those last two are vertex fragment shader generators), are used to define where and how a mesh appears when rendered. That can be "on screen", or to another render texture which may or may not directly appear on screen. Shadows, for example, are done using shadow maps, which render the mesh to a render texture that only needs a depth value (essentially the distance from the light), using a special shadow caster shader pass. Or post processing, which takes a render texture of the screen as it has already been rendered and modifies it. But yes, most commonly they're used to render "real on-screen things". These all run on the GPU and are written in a language called HLSL, which is explicitly designed for running on a GPU.

    C# runs on the CPU, which can't put anything "on screen" itself. It can ask the GPU to do things that put stuff on the screen. And generally the GPU needs the CPU to ask it to do stuff, as it can't do anything on its own. But C# scripts can do a lot without having to put things on screen. Most of the game logic, keeping track of where things are, etc., is done on the CPU, in C# or the C++ systems in Unity that back them up. C# is a general purpose programming language, and the CPU does general purpose computation.

    But that’s not to say the GPU can’t do a lot.

    Let’s go back to compute shaders. These are written in shader code, just like vertex fragment shaders, and they can actually make things show up on screen on their own if you tell them to. They are more like C# in that they’re more general purpose. If you wanted, you could use compute shaders to handle game code, to keep track of object positions, etc. You could even get that data back to the CPU if you wanted. But usually you use compute shaders to do stuff that only the GPU needs to know about, or to do something that would be extremely expensive to do on the CPU. GPUs are very good at doing a lot of the same thing over and over.
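    As a concrete illustration, here's roughly what dispatching a compute shader from C# looks like (a minimal sketch only; it assumes a compute shader asset with a kernel named "CSMain" using [numthreads(64,1,1)] and writing to a RWStructuredBuffer<float> called "Results", and it needs the Unity runtime to actually run):

    ```csharp
    using UnityEngine;

    public class ComputeExample : MonoBehaviour
    {
        public ComputeShader shader;   // assigned in the Inspector
        ComputeBuffer buffer;

        void Start()
        {
            // 1024 floats, 4 bytes each, living in GPU memory.
            buffer = new ComputeBuffer(1024, sizeof(float));

            int kernel = shader.FindKernel("CSMain");
            shader.SetBuffer(kernel, "Results", buffer);

            // 1024 threads total, assuming [numthreads(64,1,1)] in the shader.
            shader.Dispatch(kernel, 1024 / 64, 1, 1);

            // Only do this if the CPU actually needs the data --
            // GetData stalls the CPU until the GPU has finished.
            float[] results = new float[1024];
            buffer.GetData(results);
        }

        void OnDestroy() => buffer?.Release();
    }
    ```

    The GetData call at the end is exactly the expensive readback discussed below; leave it out if only the GPU needs the results.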

    The main problem is that getting data back from the GPU to the CPU takes a while. It can be several milliseconds, potentially several hundred milliseconds, depending on the amount of data. So usually, if you’re going to do something on the GPU and hand it back to the CPU, it’s something you only need to do once, like on level load or in the editor beforehand; or an expensive operation that results in a very small amount of useful data; or something you can give to the GPU and get back a few frames later.
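    That "get it back a few frames later" pattern is what Unity's AsyncGPUReadback API is for. A sketch (the buffer and the element type here are made-up placeholders; the request completes some frames after it is issued, instead of stalling the CPU like ComputeBuffer.GetData does):

    ```csharp
    using Unity.Collections;
    using UnityEngine;
    using UnityEngine.Rendering;

    public class ReadbackExample : MonoBehaviour
    {
        ComputeBuffer heightBuffer; // filled by a compute shader elsewhere

        void RequestHeights()
        {
            // Non-blocking: the callback fires once the GPU data is ready.
            AsyncGPUReadback.Request(heightBuffer, OnReadback);
        }

        void OnReadback(AsyncGPUReadbackRequest request)
        {
            if (request.hasError) return;
            NativeArray<float> heights = request.GetData<float>();
            // Use heights on the CPU here; the data is a few frames old.
        }
    }
    ```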

    In the specific case of generating terrain: if you want your CPU-side code to be able to interact with the terrain, like move a character over it and know what the height of the terrain is, then yes, you’d need to copy that data back to the CPU to use it. But for rendering, no. The compute shader can calculate the terrain, put it in a compute buffer, and then it can be rendered with a shader that knows to use the data in that buffer. The data never needs to come back to the CPU. In terms of the commands you normally call on the CPU side for this, the “compute buffer” object you’re passing between the CPU-side calls is more like a container for an identifier than something that actually moves the data around. When you issue the compute shader, the C# code assigning the compute buffer to it is just saying “write to buffer #2”. Then the buffer is assigned to a material and told to render, and that is just saying “the data you want is in buffer #2”. That’s it.
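    In code, the whole "write to buffer #2 / read from buffer #2" handoff is just a couple of calls on the CPU side (a sketch; the kernel name, shader property names, and vertex count are placeholders, the compute kernel is assumed to use [numthreads(64,1,1)], and the material's vertex shader is assumed to index into a StructuredBuffer<float3> by vertex ID):

    ```csharp
    using UnityEngine;

    public class TerrainFromCompute : MonoBehaviour
    {
        public ComputeShader terrainCompute; // writes vertex positions
        public Material terrainMaterial;     // its vertex shader reads them
        ComputeBuffer vertexBuffer;
        const int vertexCount = 256 * 256 * 6;

        void Start()
        {
            vertexBuffer = new ComputeBuffer(vertexCount, sizeof(float) * 3);

            // "Write to buffer #2": run the generation once.
            int kernel = terrainCompute.FindKernel("GenerateTerrain");
            terrainCompute.SetBuffer(kernel, "Vertices", vertexBuffer);
            terrainCompute.Dispatch(kernel, vertexCount / 64, 1, 1);

            // "The data you want is in buffer #2": no copy back to the CPU.
            terrainMaterial.SetBuffer("Vertices", vertexBuffer);
        }

        void OnRenderObject()
        {
            // Rendering each frame reuses the buffer filled in Start.
            terrainMaterial.SetPass(0);
            Graphics.DrawProceduralNow(MeshTopology.Triangles, vertexCount);
        }

        void OnDestroy() => vertexBuffer?.Release();
    }
    ```

    Note that Dispatch runs only once, in Start; the draw call each frame just reuses the data already sitting in GPU memory, which is the point made below about static data.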

    If you don’t need the data updated every frame, you only need to run the compute shader once. The data is still there and can be reused. Just like if you were to create a mesh in C# and upload the data: the vertex positions are just another set of data, no different from the data in the compute shader’s buffer.
     
  3. bgolus

    bgolus

    Joined:
    Dec 7, 2012
    Posts:
    12,352
    So the answer to “will moving this to the CPU make it faster” is … no idea. Maybe? Probably not? To me it sounds more like your game is doing some unnecessary things that could be optimized or removed entirely without causing problems. But you need to profile and find out where your slowdowns are.

    And lastly, as mostly a “here’s something to mess with your head” … all of these games are running entirely in vertex fragment shaders. Because you can do general purpose code even in those if you wanted to, with some significant caveats.
    https://shadertoyunofficial.wordpress.com/2017/11/11/playable-games-in-shadertoy/
     
  4. Luthis

    Luthis

    Joined:
    May 20, 2020
    Posts:
    14
    Thanks heaps! That's way more info than I was expecting. On the second read-through I actually got more out of it.

    ^^ That was really helpful.

    I've looked at a few things in shadertoy, a lot of it goes over my head but if I read it slowly I can understand.

    I'll get back to the Udemy course and keep learning, now that I have some context. Cheers!