
Optimizing normalmap sampling

Discussion in 'Shaders' started by opel_cobalt, May 25, 2021.

  1. opel_cobalt

    opel_cobalt

    Joined:
    May 18, 2021
    Posts:
    26
    I'm sampling albedos and normalmaps with the same UVs, will I get a measurable performance gain by using the same separate sampler twice on these textures?

    Will a separate shader variant be compiled for gles2, since separate samplers aren't supported there?

    I only need RG channels of normalmaps, how good is the idea of using the other two channels for linear albedo? (Will there be compression/filtering artifacts? Will it be significantly faster than reusing the sampler?)
     
  2. bgolus

    bgolus

    Joined:
    Dec 7, 2012
    Posts:
    12,350
    I'm not sure what you mean by "same separate sampler". You mean use a single inline sampler for both? No. Theoretically that'd be slower, though in reality it's unlikely you'd be able to measure a difference.

    On older GPUs the sampler is tied to an explicit bit of hardware on the GPU, the texture mapping unit. If you have two samplers, two different physical texture mapping units can sample textures in parallel. That means you can essentially sample as many textures as you have texture units at once for the same cost as sampling one. If you have one sampler that you use for all of the textures, they all reuse the same physical texture mapping unit in serial, and sampling takes roughly as many times longer as there are textures being sampled. In practice this was never really true, as things like memory bandwidth would usually end up being a more significant factor than the actual sampling time. And on modern hardware and APIs the sampler and texture unit aren't tied directly to each other for the entire shader execution. Plus shader compilers are really good at hiding the cost of sampling textures by moving the samples as early as possible and doing the other bits of math while waiting for the data to come back.
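    For reference, "one sampler for both textures" looks something like this in DX11-style HLSL (a minimal sketch; the v2f struct, property declarations, and the rest of the pass are assumed, and this syntax won't compile for GLES 2.0):

    Texture2D _MainTex;
    Texture2D _BumpMap;
    SamplerState sampler_MainTex; // Unity binds this to _MainTex's sampler settings

    fixed4 frag (v2f i) : SV_Target
    {
        // both textures reuse _MainTex's sampler; this saves a sampler slot,
        // but don't expect a measurable speed difference either way
        fixed4 albedo = _MainTex.Sample(sampler_MainTex, i.uv);
        fixed3 normal = UnpackNormal(_BumpMap.Sample(sampler_MainTex, i.uv));
        return fixed4(albedo.rgb * saturate(normal.z), albedo.a);
    }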

    Yes and no. Unity will not compile a separate variant for GLES 2.0 automatically; you have to choose whether a shader pass gets compiled for GLES 2.0 or not, and if it doesn't, it won't run on a GLES 2.0 device. However, you can have a shader with two SubShaders, one set to run on GLES 3.0 and a second set to run on GLES 2.0. If you use Unity's macros for defining and sampling textures, you can use the same shader code for both and the macros will automagically make it all work.
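    Something like this sketch, using the macros from HLSLSupport.cginc (same caveats as above about the omitted boilerplate):

    UNITY_DECLARE_TEX2D(_MainTex);           // texture plus its own sampler
    UNITY_DECLARE_TEX2D_NOSAMPLER(_BumpMap); // texture only, no sampler of its own

    fixed4 frag (v2f i) : SV_Target
    {
        fixed4 albedo = UNITY_SAMPLE_TEX2D(_MainTex, i.uv);
        // reuses _MainTex's sampler on APIs with separate samplers,
        // and compiles down to a plain tex2D() on GLES 2.0
        fixed3 normal = UnpackNormal(UNITY_SAMPLE_TEX2D_SAMPLER(_BumpMap, _MainTex, i.uv));
        return fixed4(albedo.rgb * saturate(normal.z), albedo.a);
    }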

    You can. And this can be faster, at least on modern GLES 3.0 devices and older desktop GPUs. I don't know of any games that do this for linear albedo, but many games do it to store smoothness and some other mask (occlusion, height, anisotropic smoothness, etc.) in the normal map texture. But it will absolutely affect the compression quality. The RGB channels of most compressed texture formats, including the desktop DXT5 format, are compressed together, so any data in one channel has an effect on the accuracy of the others. Unity defaults to using the green and alpha channels of a DXT5 for normals, as do most titles that use a DXT5 texture, as the alpha is compressed separately and the green channel is the highest quality of the RGB channels due to DXT storing its color endpoints as 16 bit "RGB565" values.
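    Unpacked, that layout looks something like this (a sketch; _PackedMap and the choice of what goes in the R and B channels are hypothetical):

    fixed4 packed = tex2D(_PackedMap, i.uv);
    half2 nxy = packed.ag * 2.0 - 1.0;             // normal X/Y from the alpha and green channels
    half nz = sqrt(saturate(1.0 - dot(nxy, nxy))); // reconstruct Z, since x² + y² + z² = 1
    half3 normal = half3(nxy, nz);
    half smoothness = packed.r;                    // the spare channels hold the extra masks
    half occlusion = packed.b;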

    On GLES 3.0 devices, ASTC tends to handle this case a lot better as each block of that format can be compressed in a different way depending on what the contents are, but artifacts can still be apparent if a channel has a high contrast edge that doesn't appear in the other channel(s) and you're using all 4 of them.

    On GLES 2.0 devices, the cost of reconstructing the normal's Z component is likely to be significantly more than the cost of sampling a second texture, which is why Unity defaults to storing normal maps as-is there rather than as two-channel normals. GLES 2.0 also only guarantees ETC, which doesn't support alpha, so you only have 3 channels to play with anyway.
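    In code, that's the difference between the reconstruction above (a dot, a sqrt, and a few MADs per pixel) and the plain three-channel version, which is just the raw sample:

    half3 normal = tex2D(_BumpMap, i.uv).rgb * 2.0 - 1.0; // RGB normal used as-is, no Z reconstruction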


    The TLDR version is: Try it. If it looks bad to you, don't do it. And profile it yourself to make sure it's not slower.