Filtered VSM on mobile: which RT format to use?

Discussion in 'Shaders' started by levelappstudios, Apr 3, 2019.

  1. levelappstudios

    I have successfully implemented a filtered Variance Shadow Map (VSM) in Unity. The pipeline looks like this:

    • Create render textures 'A' and 'B' with the same parameters
    • For each frame:
    1. Render depth moments from the light's view into texture 'A' (2 x 16-bit float values per texel)
    2. Render the horizontal blur of texture 'A' into texture 'B' ("ping")
    3. Render the vertical blur of texture 'B' into texture 'A' ("pong")
    4. Render objects to the frame buffer using texture 'A' as input for the depth comparison
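
    The per-frame steps can be sketched on the CPU as follows. This is only an illustration in Python, not the actual shader code: the function names are hypothetical, the two moments are assumed to be depth and depth squared, and border texels are assumed to clamp to the edge.

```python
# Illustrative CPU sketch of the ping-pong passes; all names are hypothetical.

def depth_to_moments(depth):
    """Step 1: store (depth, depth^2) per texel, as texture 'A' would."""
    return [[(d, d * d) for d in row] for row in depth]

def blur_1d(src, dx, dy):
    """One 3-tap box-blur pass along (dx, dy), clamping at the borders."""
    h, w = len(src), len(src[0])
    dst = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            m1 = m2 = 0.0
            for k in (-1, 0, 1):
                sx = min(max(x + k * dx, 0), w - 1)
                sy = min(max(y + k * dy, 0), h - 1)
                m1 += src[sy][sx][0]
                m2 += src[sy][sx][1]
            dst[y][x] = (m1 / 3.0, m2 / 3.0)
    return dst

def filtered_moments(depth):
    a = depth_to_moments(depth)  # step 1: render moments into 'A'
    b = blur_1d(a, 1, 0)         # step 2: horizontal blur, 'A' -> 'B' ("ping")
    a = blur_1d(b, 0, 1)         # step 3: vertical blur, 'B' -> 'A' ("pong")
    return a                     # step 4 samples this result
```

    Blurring the moments rather than the result of the depth comparison is what makes this kind of shadow map filterable in the first place.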
    The render textures are created with the RGHalf format (2 x 16-bit float channels). This works great in the Unity Editor; the problems come on mobile platforms. RGHalf is often not supported on Android devices (at least it isn't on any of my 4 devices), and there is not much to choose from:

    ARGB1555: True
    ARGB2101010: True
    ARGB32: True
    ARGB4444: True
    ARGB64: False
    ARGBFloat: False
    ARGBHalf: False
    ARGBInt: True
    BGR101010_XR: False
    BGRA10101010_XR: False
    BGRA32: False
    Depth: True
    R8: True
    RFloat: False
    RG16: True
    RG32: False
    RGB111110Float: False
    RGB565: True
    RGBAUShort: True
    RGFloat: False
    RGHalf: False
    RGInt: True
    RHalf: False
    RInt: True
    Shadowmap: True

    So it seems the only option is to use the ARGB32 format to maximize compatibility, with Encode/Decode functions to convert RGBA -> float2 and vice versa. It could be a decent solution if it didn't mean decoding 9 samples per fragment in the blur shader (I'm using a 3x3 kernel) and then encoding the final result.
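
    One possible shape for such an Encode/Decode pair is sketched below in Python. It is an assumption-laden illustration, not the code used in this project: both moments are assumed to be normalized to [0, 1], each one is stored as 16-bit fixed point split across two 8-bit channels, and the names `encode_moments` and `decode_moments` are hypothetical.

```python
def encode_moments(m1, m2):
    """Pack two values in [0, 1] into four 8-bit channels (R, G, B, A)."""
    def split(v):
        q = min(max(int(round(v * 65535.0)), 0), 65535)  # 16-bit fixed point
        return q >> 8, q & 0xFF                          # high byte, low byte
    r, g = split(m1)
    b, a = split(m2)
    return r, g, b, a

def decode_moments(rgba):
    """Rebuild the two values from their high/low byte pairs."""
    r, g, b, a = rgba
    m1 = ((r << 8) | g) / 65535.0
    m2 = ((b << 8) | a) / 65535.0
    return m1, m2
```

    Packed texels cannot be averaged channel by channel, because the high and low bytes would be interpolated independently. That is why every blur tap has to be decoded before it is summed, and why hardware bilinear filtering is not usable on the packed texture.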

    Summarizing penalties:
    1. 1 x Encode in depth shader
    2. 9 x Decode + 1 x Encode in horizontal blur shader
    3. 9 x Decode + 1 x Encode in vertical blur shader
    4. 1 x Decode in final depth comparison
    Is there a better solution?
     
  2. bgolus

    ARGB2101010 might get you something usable. The alternative would be to use RGInt, which would still require some encoding / decoding but might be cheaper (certainly less math, though the float-to-int conversion might make it more expensive overall).

    Encoding float values into multiple channels of an ARGB8888 is pretty darn common on mobile though, and the decoding isn't really that expensive. Obviously, doing it that many times will start to have an impact, so you may need to reduce the amount of blurring you're doing.

    Btw, a 3x3 kernel should only be 3 taps for each horizontal and vertical pass. "9 x decode" makes me think you're doing the full 3x3 grid each time, which is counter to the whole separable pass idea.
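
    To make the tap count concrete, here is a small Python check (a sketch with hypothetical function names, assuming a box kernel and clamp-to-edge addressing) that a 3-tap horizontal pass followed by a 3-tap vertical pass gives the same result as sampling the full 3x3 grid with 9 taps:

```python
def clamp(i, n):
    return min(max(i, 0), n - 1)

def blur_full_3x3(src):
    """Non-separable reference: 9 taps per texel."""
    h, w = len(src), len(src[0])
    return [[sum(src[clamp(y + j, h)][clamp(x + i, w)]
                 for j in (-1, 0, 1) for i in (-1, 0, 1)) / 9.0
             for x in range(w)] for y in range(h)]

def blur_pass(src, dx, dy):
    """One separable pass: 3 taps per texel along (dx, dy)."""
    h, w = len(src), len(src[0])
    return [[sum(src[clamp(y + k * dy, h)][clamp(x + k * dx, w)]
                 for k in (-1, 0, 1)) / 3.0
             for x in range(w)] for y in range(h)]

def blur_separable(src):
    """Horizontal pass, then vertical pass: 3 + 3 taps per texel."""
    return blur_pass(blur_pass(src, 1, 0), 0, 1)
```

    With an encoded render target, that means 3 decodes and 1 encode per fragment in each pass instead of 9 decodes.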
     
  3. levelappstudios

    You are right, it's 3 taps in each pass; I typed it wrong :)

    Downsampling the ping-pong RTs to 1/4 of the original size works pretty well, and the final result is practically the same: 60 fps on a ZTE A610. I get some light bleeding though, so I need to rework the Chebyshev function a bit.
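
    For reference, one commonly used way to reduce light bleeding is to remap the Chebyshev upper bound so that small probabilities are cut to zero. The Python sketch below illustrates that idea under stated assumptions and is not necessarily the fix applied here: `min_variance` and `bleed_reduction` are hypothetical tuning parameters, and the moments are assumed to be depth and depth squared.

```python
def linstep(lo, hi, v):
    """Linear remap of v from [lo, hi] to [0, 1], clamped."""
    return min(max((v - lo) / (hi - lo), 0.0), 1.0)

def chebyshev_upper_bound(m1, m2, t, min_variance=1e-5, bleed_reduction=0.2):
    """Estimate how lit a receiver at depth t is, given filtered moments (m1, m2)."""
    if t <= m1:
        return 1.0  # receiver is not behind the mean occluder depth: fully lit
    variance = max(m2 - m1 * m1, min_variance)  # clamp to avoid numeric issues
    d = t - m1
    p_max = variance / (variance + d * d)       # one-tailed Chebyshev bound
    # Cut off the low tail, where bleeding shows up, and rescale the rest.
    return linstep(bleed_reduction, 1.0, p_max)
```

    A larger `bleed_reduction` removes more bleeding but also darkens and narrows the soft penumbra, so it is a trade-off to tune per scene.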

    Thanks for the help, bgolus!
     