
Why use compute shaders for ray marching/tracing?

Discussion in 'Shaders' started by Rs, Aug 24, 2019.

  1. Rs

    Rs

    Joined:
    Aug 14, 2012
    Posts:
    74
    A good example is this Ray Tracing tutorial from Gamasutra. I implemented some ray marching recently but without using any kernels. I simply do all the operations in the shader. My question is: isn't the shader already intrinsically parallelized by the GPU, since it's computing everything in the fragment function?
    Why would I gain anything (and how much would I gain) using a compute shader instead of a simpler fragment shader?
    Thanks
     
  2. 2youyou2

    2youyou2

    Joined:
    Oct 24, 2015
    Posts:
    1
    I have the same question. Any results?
     
    Rs likes this.
  3. andybak

    andybak

    Joined:
    Jan 14, 2017
    Posts:
    569
    I haven't delved too deeply myself but my understanding is that there are some things you can't do in a fragment shader.

    It's mainly around the lack of access to buffers, or any ability to either precalculate stuff or store intermediate calculations for later reuse. A fragment shader is dumb and only knows about its own single pixel. With a compute shader there are many possibilities for optimization that wouldn't be possible in a fragment shader.

    Search Github for "raymarch" (or "sdf" etc.) and "compute shader" - there are a few projects that might give you some ideas.
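
    To make the "precalculate and reuse" point concrete, here is a minimal sketch of a Unity-style HLSL compute kernel that pre-bakes signed-distance samples into a buffer once, so a raymarching fragment shader can read them instead of re-evaluating the scene SDF every pixel, every frame. All names here (_SdfVolume, SceneSdf, BakeSdf) are made up for illustration, and the scene is just a unit sphere:

    ```hlsl
    // Hypothetical sketch: bake a 64x64x64 grid of SDF samples into a buffer.
    // A fragment shader could later sample this instead of recomputing the SDF.
    #pragma kernel BakeSdf

    RWStructuredBuffer<float> _SdfVolume; // 64*64*64 distance samples
    float3 _VolumeOrigin;
    float  _VoxelSize;

    // Example scene: a single sphere of radius 1 at the origin.
    float SceneSdf(float3 p) { return length(p) - 1.0; }

    [numthreads(8, 8, 8)]
    void BakeSdf(uint3 id : SV_DispatchThreadID)
    {
        float3 p = _VolumeOrigin + (float3)id * _VoxelSize;
        uint index = id.x + id.y * 64 + id.z * 64 * 64;
        _SdfVolume[index] = SceneSdf(p);
    }
    ```

    On the C# side you would dispatch this with ComputeShader.Dispatch and bind the same buffer to the raymarching material with Material.SetBuffer, so the fragment stage reads the precomputed results directly - exactly the kind of reuse a fragment shader alone can't express.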
     
    Rs likes this.
  4. Invertex

    Invertex

    Joined:
    Nov 7, 2013
    Posts:
    1,539
    Actually you can bind buffers to the pixel/frag stage to read or write. You can even compute stuff in a compute shader and then bind that result to the frag to read from without transferring data around.

    Some benefits of compute off the top of my head:
    • Control over the computation resolution and hardware resource distribution instead of it simply being the pixels the triangle falls on.
    • Asynchronous or simple pre-computing of a result.
    • Can easily implement reduction computations, where the output of one compute pass is a smaller number of elements that is then fed into another compute kernel, and so on... leading to more optimal calculations.
    • Can compute arbitrary data that may not relate to a specific pixel, such as vertex data (the vertex pass only knows about its current vertex; geo/tess can only know up to 6 adjacent vertices (3 in Unity) and would be more wasteful in many circumstances) or any other computation that would benefit from highly parallel processing.
    • Is specifically designed for input/output to arbitrary buffers and allows for further optimizing through the use of thread/work groups and group-shared memory.

    And I'm sure there's much more.
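
    The reduction and group-shared memory bullets above can be sketched together in one small HLSL compute kernel. This is a hypothetical illustration (buffer and kernel names are invented): each 64-thread group sums its slice of an input buffer in groupshared memory via a tree reduction, writing one partial sum per group:

    ```hlsl
    // Hypothetical sketch: per-group sum reduction using groupshared memory.
    #pragma kernel Reduce

    StructuredBuffer<float>   _Input;
    RWStructuredBuffer<float> _PartialSums;

    groupshared float gs_data[64];

    [numthreads(64, 1, 1)]
    void Reduce(uint3 id   : SV_DispatchThreadID,
                uint3 gtid : SV_GroupThreadID,
                uint3 gid  : SV_GroupID)
    {
        // Each thread loads one element into fast group-shared memory.
        gs_data[gtid.x] = _Input[id.x];
        GroupMemoryBarrierWithGroupSync();

        // Tree reduction: halve the number of active threads each step.
        for (uint stride = 32; stride > 0; stride >>= 1)
        {
            if (gtid.x < stride)
                gs_data[gtid.x] += gs_data[gtid.x + stride];
            GroupMemoryBarrierWithGroupSync();
        }

        // Thread 0 writes this group's partial sum.
        if (gtid.x == 0)
            _PartialSums[gid.x] = gs_data[0];
    }
    ```

    You would re-dispatch on _PartialSums until one element remains. A fragment shader has no groupshared memory and no sync barriers, so this pattern simply isn't available there.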
     
    LarsTV, Rs and andybak like this.