Read high precision float from render texture

Discussion in 'Shaders' started by not_a_valid_username, Dec 9, 2018.

  1. not_a_valid_username

    Joined:
    Jul 28, 2018
    Posts:
    23
    I'm trying to store a float in the 0-1 range with at least ushort precision in a LUT render texture that is created at runtime. If I use a RenderTexture of type ARGBHalf, everything works as expected; however, my target mobile device does not support that texture format. It does support RGBAUShort, but I get all black when reading from it.

    My question is: how do I either read from an RGBAUShort texture, or store a 16-bit value across two channels of an ARGB32 texture?

    If I try to read from an RGBAUShort created via Blit() through a sampler2D, I get 0 every time. I also cannot seem to use usampler2D, even though it's mentioned in the Cg spec.

    If I use an ARGB32 texture to store the values, I think I'd have to be careful about how I do it, because I want to be able to leverage bilinear filtering on this LUT. That rules out reading two floats, converting them to uchars, and doing bit manipulation to reassemble a ushort: bilinear sampling would interpolate each channel independently and fail to interpolate the combined value correctly.
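
    To be concrete, this is the sort of packing I mean (the helper names here are just for illustration):

        // Hypothetical packing of a 0-1 float into two 8-bit channels
        // (hi/lo byte), i.e. what I'd write into two channels of an ARGB32.
        float2 EncodeUShortToRG(float v01)
        {
            float v  = round(v01 * 65535.0);   // 0..65535
            float hi = floor(v / 256.0);       // upper 8 bits, 0..255
            float lo = v - hi * 256.0;         // lower 8 bits, 0..255
            return float2(hi, lo) / 255.0;     // normalized channel values
        }

        // The problem: hardware bilinear filtering interpolates hi and lo
        // independently, so between neighboring texels holding e.g. 255 and
        // 256 the reassembled value is wildly wrong.
        float DecodeUShortFromRG(float2 rg)
        {
            return (rg.x * 255.0 * 256.0 + rg.y * 255.0) / 65535.0;
        }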
     
  2. bgolus

    Joined:
    Dec 7, 2012
    Posts:
    12,343
    If you want to store a float value in an ARGB32 with better precision than a single channel of an ARGB32 gives you, you will not be able to use hardware filtering. There just isn't a way to do it that's hardware-filtering friendly.

    UShort is an integer format; any float value between 0.0 and 0.9999 will be stored and read back as 0.

    If you want to store a float value in a UShort you'll need to multiply it by a large value like 65535 (the largest integer a UShort can hold, afaik), then divide by that value after you sample from the texture.
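
    Something like this, roughly (an untested sketch, and note that on most graphics APIs an integer format texture also has to be declared and read as an integer texture in the shader):

        // Writing: scale the 0-1 float up into the ushort range before it
        // goes into the integer render target.
        uint EncodeToUShort(float value01)
        {
            return (uint)round(saturate(value01) * 65535.0);
        }

        // Reading: integer textures are fetched with Load(), not tex2D(),
        // and get no filtering. Divide back down to 0-1 afterwards.
        Texture2D<uint4> _LutTex;

        float ReadLut(uint2 pixelCoord)
        {
            uint raw = _LutTex.Load(int3(pixelCoord, 0)).r;  // 0..65535
            return raw / 65535.0;                            // back to 0-1
        }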

    However, if I remember correctly, some mobile devices don't support bilinear filtering of integer format textures, so you may only ever get point sampled values regardless of what type of filtering you select. If that's the case, then you'll have to do the filtering yourself by sampling 4 points and lerping between them. If you're going to do that, you might as well use an ARGB32 instead, as it'll use far less memory, and those 4 samples will be cheaper to do.
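
    Doing the filtering yourself looks roughly like this (untested; the texture needs to be set to point filtering for this to be exact, and texSize is the LUT's size in texels, passed in by you):

        // Manual bilinear filtering: 4 point-filtered taps, then two lerps.
        float4 SampleBilinearManual(sampler2D lut, float2 uv, float2 texSize)
        {
            float2 pixel = uv * texSize - 0.5;               // texel space
            float2 f     = frac(pixel);                      // sub-texel offset
            float2 base  = (floor(pixel) + 0.5) / texSize;   // lower-left texel center
            float2 dx    = float2(1.0 / texSize.x, 0.0);
            float2 dy    = float2(0.0, 1.0 / texSize.y);

            // If the texels hold packed values, decode each tap to a float
            // here first, then lerp the decoded floats.
            float4 s00 = tex2D(lut, base);
            float4 s10 = tex2D(lut, base + dx);
            float4 s01 = tex2D(lut, base + dy);
            float4 s11 = tex2D(lut, base + dx + dy);

            // Lerp horizontally, then vertically, same as the hardware does.
            return lerp(lerp(s00, s10, f.x), lerp(s01, s11, f.x), f.y);
        }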

    Unity actually has functions for encoding and decoding a float into an RGBA32 (or ARGB32).

    https://docs.unity3d.com/Manual/SL-BuiltinFunctions.html
    https://aras-p.info/blog/2009/07/30/encoding-floats-to-rgba-the-final/
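
    Usage is pretty simple, something along these lines (this assumes UnityCG.cginc is included; note the encode only handles values in the [0, 1) range, so exactly 1.0 won't survive):

        #include "UnityCG.cginc"

        // Fill the LUT: spread one 0-1 float across the four 8-bit channels.
        float4 EncodeForLut(float value01)
        {
            return EncodeFloatRGBA(value01);   // value01 must be in [0, 1)
        }

        // Read it back (point sampled, per the filtering caveat above).
        float ReadLut(sampler2D lut, float2 uv)
        {
            return DecodeFloatRGBA(tex2D(lut, uv));
        }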

    One last thing: Unity doesn't use Cg anymore, and really hasn't for almost a decade, regardless of what the shader files imply and what some of the documentation says. Unity shaders are pure HLSL. There are a few things unique to Cg vs. HLSL that don't work because Unity doesn't use Cg anymore.
     
  3. not_a_valid_username

    Joined:
    Jul 28, 2018
    Posts:
    23
    Thank you very much for that insight.
    So, to summarize, you think my best option is to encode my data into the 4 channels of an ARGB32, then in my shader sample it 4 times and bilinearly filter it myself? I have to store ~10 values of at least 16-bit precision and read them back with filtering in a fragment shader for a skybox on my mobile device. I fear that sampling textures 40 times will be too much for my application to hit 60 fps.
     
  4. bgolus

    Joined:
    Dec 7, 2012
    Posts:
    12,343
    I’d confirm that UShort doesn’t do hardware filtering on the device you’re targeting first, but otherwise I’d say that’s your best bet if those are your requirements.

    Now, if you only need a 1D lookup, then you only need two samples, since linear filtering along one axis only interpolates between two texels. And if you only need 16-bit precision I'd look at packing two numbers per RGBA texel, especially if the interpolated position is the same for those values.
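
    UnityCG.cginc also has EncodeFloatRG / DecodeFloatRG for exactly that kind of two-values-per-texel packing, along these lines (untested):

        #include "UnityCG.cginc"

        // Two ~16 bit values in one ARGB32 texel: one packed into RG,
        // the other into BA.
        float4 PackTwo(float a01, float b01)
        {
            return float4(EncodeFloatRG(a01), EncodeFloatRG(b01));
        }

        void UnpackTwo(float4 texel, out float a01, out float b01)
        {
            a01 = DecodeFloatRG(texel.rg);
            b01 = DecodeFloatRG(texel.ba);
        }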

    However, I'd also say that for the hardware you're targeting, you should maybe find a way to not need ten 16-bit interpolated values. Because, yes, with 40 samples plus the math needed to unpack and interpolate ... you're not hitting 60 fps on OpenGL ES 2.0. Heck, sampling a texture just 10 times on an OpenGL ES 2.0 device and expecting 60 fps is kind of unrealistic. A max of maybe 4 or 5 independent texture samples (where the UVs come straight from the xy components of a texcoord semantic), and no more than 50 instructions, is likely the realistic upper bound for getting a consistent 60 on that level of device. More than that might get you 60 for a bit, and then the device will thermally throttle down and you'll quickly be looking at far lower frame rates.
     
  5. jvo3dc

    Joined:
    Oct 11, 2013
    Posts:
    1,520
    Are you sure that's really needed? That's way more information than you can ultimately display, so is there any way to either not store it, or store it less often than once per pixel? If this is for a skybox, I imagine it has to do with cloud formation, but I might be wrong.
     
  6. not_a_valid_username

    Joined:
    Jul 28, 2018
    Posts:
    23
    I was actually trying to store texture coordinates, to simplify some complex distortion maths. I'm now thinking that I'll just have to use vertices to accomplish what I want, and accept a subtle drop in output quality.
    @bgolus Thank you very much for the informative replies.