
Question Questions on multi-sampling

Discussion in 'General Graphics' started by VictorKs, Feb 12, 2023.

  1. VictorKs

    Joined:
    Jun 2, 2013
    Posts:
    242
    So I've been reading about rasterization and sampling and have some questions on how we can achieve some things in Unity. Answer whichever you like, any help is appreciated :)

    1) How can we achieve SSAA? As far as I know we need to enable MSAA and set up per-sample pixel shader invocations, then render the scene to a multisampled render texture and finally merge the samples and copy the result to the back buffer.

    2) Do MSAA and SSAA force blit behaviour on the swapchain buffers instead of flip?

    3) With multisampling enabled we have a coverage system value in the pixel shader. What can we do with this? I guess the rasterizer informs the pixel shader of which samples our triangle covers. But why would we change those values? Is it maybe for alpha to coverage?

    4) What can we do with conservative rasterization? Can we do custom blending in the pixel shader on the extra pixels it rasterizes?

    I have a feeling that all these features are connected with multisampling but I do not know what exactly to do with them.
     
  2. VictorKs

    Joined:
    Jun 2, 2013
    Posts:
    242
    So after re-reading all of the DX11 documentation and looking around for AA implementations, I came across this very helpful DX guide (the D3D11.3 Functional Spec): https://microsoft.github.io/DirectX-Specs/d3d/archive/D3D11_3_FunctionalSpec.htm
    It goes much deeper in DirectX11 and basically answered my questions, so for anyone wondering:

    1) When enabling MSAA, modern GPUs allow fragment shader invocation per pixel or per sample. There is an HLSL semantic, SV_SampleIndex; if you use it in a fragment shader it forces per-sample invocation, thus achieving SSAA. This method is not exposed in Unity afaik, though. So an easier method is just to render the scene into a larger resolution texture and downsample it (with a custom shader or blit). This method is also more modular, since you can resize the target texture resolution with an SSAA factor; see the sketch below.
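
    For example, a rough sketch of the downsample shader, assuming a 2x SSAA factor; _SourceTex is a made-up name (Unity fills in the matching _TexelSize property for a bound texture):
    Code (csharp):
    sampler2D _SourceTex;        // assumed: scene rendered at 2x the output resolution
    float4 _SourceTex_TexelSize; // xy = 1 / source resolution

    float4 frag (float2 uv : TEXCOORD0) : SV_Target
    {
      float2 o = 0.5 * _SourceTex_TexelSize.xy;
      // average the 2x2 block of source texels covering this output pixel
      float4 c = tex2D(_SourceTex, uv + float2(-o.x, -o.y))
               + tex2D(_SourceTex, uv + float2( o.x, -o.y))
               + tex2D(_SourceTex, uv + float2(-o.x,  o.y))
               + tex2D(_SourceTex, uv + float2( o.x,  o.y));
      return c * 0.25;
    }
    (With bilinear filtering a single centered tap would average the same 2x2 block for an exact 2x factor.)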

    2) Technically yes: since we downsample our render target texture and blit it to the back buffer, we perform a manual blit. (Still not sure though, maybe a graphics guru could provide more info.)

    3) We can manually mask which samples get blended, but if we do so we disable alpha to coverage. Tbh I still haven't found a use for it. A cool thing we can do in Unity though is enabling alpha to coverage; BGolus has a great post about it:
    https://bgolus.medium.com/anti-aliased-alpha-test-the-esoteric-alpha-to-coverage-8b177335ae4f

    4) Dunno yet; maybe more accurate occlusion culling. I think it's more useful in DX12, but I don't know enough about it yet.
     
    c0d3_m0nk3y likes this.
  3. bgolus

    Joined:
    Dec 7, 2012
    Posts:
    12,238
    You can use SV_SampleIndex on the fragment input, or you can add the sample modifier to a fragment input interpolator.
    Code (csharp):
    struct v2f {
      float4 pos : SV_POSITION;
      sample float2 uv : TEXCOORD; // forces subpixel interpolation and shader execution
    };
    Note SV_SampleIndex will force all interpolators to return subpixel values. Using sample, only those interpolators will return subpixel values; all others will still use the pixel-center interpolated value.
    https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-struct
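
    For comparison, a minimal sketch of the SV_SampleIndex route (the struct and texture names are just illustrative):
    Code (csharp):
    struct v2f {
      float4 pos : SV_POSITION;
      float2 uv : TEXCOORD0;
    };

    sampler2D _MainTex; // illustrative name

    // Taking SV_SampleIndex as an input forces the fragment shader to run once
    // per subsample, and all interpolators return subpixel values.
    float4 frag (v2f i, uint sampleIndex : SV_SampleIndex) : SV_Target
    {
      return tex2D(_MainTex, i.uv); // shaded independently per MSAA subsample
    }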

    Both are supported by Unity, if you're using a vertex/fragment shader and targeting a recent enough API. They're not supported by Surface Shaders or Shader Graph, so (ab)using either of these options to get super sampling with MSAA isn't possible in those cases. It'll also only do per-subsample shading for the shaders that use either of those options.

    Using a higher resolution render texture and scaling down gives a lot more flexibility in the amount of super sampling you want, and allows you to do it without modifying every shader. However using MSAA to do it was very popular on consoles for various reasons, one of which was it essentially gave you variable rate shading at a time when that wasn't a feature hardware supported. Though consoles also give you control over the subsample pattern used by MSAA, so you could force an ordered grid that could cleanly be expanded out to a higher resolution, something that wasn't exposed until some versions of Vulkan & DX12. Otherwise the default sample patterns for MSAA aren't in a regular pattern and would cause artifacts if you treat each subsample as a single pixel in the final image.

    MSAA requires a "resolve" pass of some kind to combine the values of the subsamples into a single color. It's not quite a blit, and does not incur the same cost for the frame buffer flip. Though it does add a cost when doing a blit, or if you need to sample from an MSAA render target. By default Unity will resolve an MSAA render texture so it can be sampled like a normal texture, though the latest versions have added support for subsample sampling that avoids that extra step (though it means you have to do the resolve yourself, if it's wanted, when you enable that option).
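
    For reference, a manual resolve (the kind you'd have to write if you enable that option) might look roughly like this; the texture name and the fixed 4x count are assumptions:
    Code (csharp):
    Texture2DMS<float4, 4> _MsaaTex; // assumed name, bound to a 4x MSAA target

    float4 frag (float4 pos : SV_POSITION) : SV_Target
    {
      int2 coord = int2(pos.xy);
      float4 sum = 0;
      [unroll]
      for (int s = 0; s < 4; s++)
        sum += _MsaaTex.Load(coord, s); // read each subsample by index
      return sum / 4.0; // box average, like the default resolve
    }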

    What that coverage value actually is is confusing for a lot of people. It's a bit-packed list of bools for which subsamples a fragment should render to. For Shader Model 4.1+, the fragment shader can output a coverage value that lets you specify which subsamples to render to. For MSAA 4x, outputting 0000 (uint 0) says to render to no subsamples, equivalent to clip(-1) or discard. Outputting 1111 (uint 15) says to render to all visible subsamples. For Shader Model 5.0+ you can get the coverage as an input to know in which subsamples the current triangle is visible. Note, you cannot override whatever coverage has already been calculated by the rasterization; if you get an input coverage of 0111 (uint 7) and output 1000 (uint 8), nothing will render, because the triangle wasn't visible in that subsample to begin with and the GPU ANDs the output coverage with the coverage calculated by the rasterization.
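
    As a minimal illustration of that AND behavior (4x MSAA assumed, names made up):
    Code (csharp):
    // The GPU ANDs outputCoverage with the rasterizer's mask, so setting bits
    // for subsamples the triangle doesn't cover has no effect.
    float4 frag (float4 pos : SV_POSITION,
                 uint inputCoverage : SV_Coverage,       // e.g. 0111 (uint 7)
                 out uint outputCoverage : SV_Coverage) : SV_Target
    {
      outputCoverage = 0x3; // request only the two low subsamples (0011)
      // effective coverage will be outputCoverage & inputCoverage
      return float4(1, 1, 1, 1);
    }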

    The reason to use SV_Coverage over alpha to coverage is that it gives manual control over the coverage. Alpha to coverage takes the output alpha value and calculates a coverage mask from that. On some GPUs it's a very straightforward conversion which ends up with obvious banding on any kind of gradients, and other GPUs apply a forced dither pattern, which might look better on gradients but worse on sharp edges. Using the manual coverage means you can override the alpha to coverage and supply your own dithering, or disable dithering entirely. It also means you can output alpha values that aren't directly related to the opacity, which is especially helpful if you want to use an MSAA render texture and alpha to coverage that is then composited with another texture. Ideally you'd want the alpha value output by the fragment to always be 1.0, since the alpha to coverage is what's handling the opacity, and the resolved alpha value will end up too transparent otherwise.
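
    Something like this sketch, for example (4x MSAA and the _Color property are assumptions): the alpha drives the coverage mask directly with no dither, and the written alpha stays 1.0 so the resolved alpha isn't attenuated twice:
    Code (csharp):
    float4 _Color; // assumed material property

    float4 frag (float4 pos : SV_POSITION,
                 out uint coverage : SV_Coverage) : SV_Target
    {
      // turn alpha into a count of covered subsamples (0..4), then into a mask
      uint n = (uint)round(saturate(_Color.a) * 4.0);
      coverage = (1u << n) - 1u; // e.g. alpha 0.5 -> 0011
      // output alpha 1.0: the coverage already encodes the opacity
      return float4(_Color.rgb, 1.0);
    }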

    The main use case I know of for conservative rasterization is small detail preservation. If a triangle or line is smaller than a single pixel, it disappears. With conservative rasterization, no matter how small an element is, it'll always be at least 1 pixel on screen. That, combined with output coverage, or just alpha blending, means you can calculate an approximate opacity to simulate anti-aliasing for sub-subsample details.
     
    ekakiya, VictorKs and c0d3_m0nk3y like this.
  4. VictorKs

    Joined:
    Jun 2, 2013
    Posts:
    242
    While I spent the last months studying DX11, I somehow totally forgot about HLSL. I've got more reading to do, it seems!

    Btw SV_SampleIndex is not mentioned in the Unity docs I believe, which is why I assumed it is not supported. Actually I only recently got into vertex/fragment shaders, and yes, I used to totally abuse Surface Shaders :)

    1) Hmm, so I guess per-sample fragment invocation is the more old-school SSAA.

    2) About blitting, I was referring to SSAA by using a larger render target, but you're right, multisampled textures need to be resolved first. I thought the resolve was like sample averaging + a blit to the back buffer, so it behaves like a manual blit.

    3) Hmm, so technically the rasterizer creates this coverage bit mask, and we can further mask the samples by outputting a coverage mask from the fragment shader. And in Shader Model 5+ we also have the option to read the initial mask created by the rasterizer, is this correct? That way we can implement our own effects, most likely alpha-to-coverage.
    A2C looks like a really promising technique, especially since I usually work in forward rendering mode. Are there performance implications to using it? (I mean other than running MSAA.) The only one I can think of is that it forces late depth write, but then again alpha-tested geometry also forces late depth write. Also, is it lighter than TAA? (I believe TAA handles alpha-tested aliasing quite well.)

    Can you elaborate a little more on that use case?

    4) This sounds like a really good idea, but running all those pixels and then alpha blending them sounds really bad for performance. With A2C maybe it could work well, but 2/4/8 coverage samples don't provide enough range for smooth blending. Is it possible to somehow increase the samples in the rasterizer but not pass them to the depth test / output merger, and just send them to the fragment shader to calculate a better coverage, then instead of writing multiple samples, write just one sample along with the number of covered samples? I don't really grasp how this would work, but the main idea would be to increase the rasterizer's coverage samples only. Is this supported, or is it just a bad idea?
     
  5. bgolus

    Joined:
    Dec 7, 2012
    Posts:
    12,238
    It's not. Nor is any standard HLSL, because Unity's documentation only covers Unity-specific code, or HLSL that directly interfaces with Unity-specific code.

    Depends on the hardware, but I would assume a lot of GPUs have fixed function hardware dedicated to resolving MSAA subsamples to the framebuffer. Others might use the same ALU execution units to handle it, in which case it would be very similar to a manual blit, but without the CPU side work that comes with a blit normally.

    The implication is that you may have greater overdraw with A2C, or manual coverage outputs. Because they affect subsample coverage, with MSAAxN it's possible to have N triangles visible in every pixel, whereas clip() affects the entire pixel; and, assuming you don't have a lot of micro triangles, usually most pixels aren't seeing a lot of triangle edges (the other case where you can have N triangles visible in a pixel).

    Whether or not this is a performance problem depends on your scene. It's also another reason for using the sharpening technique I discuss in my article on A2C as it limits the areas of partial visibility.

    And no, it's much more expensive than TAA.* TAA is basically free in comparison, assuming you're already rendering out motion vectors for other reasons. MSAA is not free, and A2C is more expensive than straight alpha testing (though by how much again depends on the scene content, and also the GPU). The whole idea behind TAA was to get high sample MSAA like image quality without having to pay for the performance impact of it. TAA can reach MSAAx16 level quality under ideal situations for a cost less than even MSAAx2.

    But also TAA isn't perfect and has a lot of fail cases where it can either over blur or miss AA entirely. TAA also handles a lot of cases MSAA is incapable of anti-aliasing without going full SSAA.

    * This assumes the GPU in question is capable of doing a full screen blit without halving the framerate, which many mobile GPUs are not.

    Multi-res rendering, most commonly used for off-screen particle rendering but sometimes also used for artistic reasons, lets you render some objects at a different resolution than the main scene, with the idea that you composite them back together to produce the final image. If you're using A2C in the render texture you intend to composite on top, it'll have the wrong alpha value when sampling the resolved render texture.

    Basic example: You output an alpha of 0.5, which A2C interprets as half of the subsamples are visible, and then writes 0.5 to half the subsamples. The resolved render texture's alpha for that pixel will be 0.25, not 0.5, as only half of the subsamples were 0.5, and the rest were 0.0. What you want is for the alpha that's written to half the subsamples to be 1.0, so that the resolved alpha is 0.5.

    Well, AFAIK you can't use conservative rasterization with Unity at all as it requires setting the current rasterization state to allow for it, and Unity does not.
     
    VictorKs likes this.
  6. burningmime

    Joined:
    Jan 25, 2014
    Posts:
    845
    You can. Or at least ShaderLab says it can:
    https://docs.unity3d.com/Manual/SL-Conservative.html

    I used it in a custom SRP for light culling: https://forum.unity.com/threads/raster-based-light-culling-in-tiled-forward-renderer.1363728/
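
    For reference, it's just a pass-level state flag in ShaderLab (minimal sketch, per the manual page above):
    Code (csharp):
    Pass {
        // requires hardware/API support for conservative rasterization
        Conservative True
        // ... rest of the pass as usual
    }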
     
    VictorKs and bgolus like this.
  7. VictorKs

    Joined:
    Jun 2, 2013
    Posts:
    242
    So I guess TAA is the way to go. Btw, I looked around for increasing coverage samples, and it turns out it has already been implemented by both Nvidia and AMD (CSAA/EQAA).

    This looks really interesting, but it's still beyond my knowledge; I need a little more reading on HLSL, and soon I'll get into DX12 and SRPs.

    Thanks a lot BGolus, this was a really informative thread. Everything was a little fuzzy before, but it all makes sense now.