Could it be worthwhile for Unity to support half precision (fp16) variables?

Discussion in 'General Discussion' started by Arowx, Mar 4, 2020.

  1. Neto_Kokku

    Neto_Kokku

    Joined:
    Feb 15, 2018
    Posts:
    1,751
    On PC, XB1 and PS4, all halfs in shaders are compiled as floats. On Switch the halfs in your shaders are actually compiled as halfs, which are computed at double rate but can blow up in your face if you use them without considering the magnitude of the values you're dealing with.
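
    A minimal sketch of the magnitude problem, assuming IEEE 754 binary16 behaviour and emulating it on the CPU with the 'e' format of Python's struct module. The helpers to_half and half_length_squared are hypothetical names made up for this illustration, and mapping overflow to infinity is an assumption about how the hardware behaves; real GPUs and shader compilers may differ in the details.

```python
import math
import struct

HALF_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_half(x: float) -> float:
    """Round x to the nearest half value; overflow becomes +/-inf (assumed GPU-like)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except (OverflowError, struct.error):
        # struct refuses to pack values that do not fit; emulate overflow to infinity.
        return math.copysign(math.inf, x)


def half_length_squared(x: float, y: float) -> float:
    """Compute x*x + y*y with every intermediate result rounded to half."""
    xx = to_half(to_half(x) * to_half(x))
    yy = to_half(to_half(y) * to_half(y))
    return to_half(xx + yy)


if __name__ == "__main__":
    print(half_length_squared(3.0, 4.0))      # small values behave as expected
    print(half_length_squared(200.0, 200.0))  # 40000 + 40000 exceeds HALF_MAX
    print(to_half(2049.0))                    # integers above 2048 are no longer exact
```

    A squared length of a vector only 200 units long already overflows, which is the kind of surprise to watch for when a shader starts using real halfs.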
     
  2. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    Could this idea benefit game physics? Is there a sweet spot where FP16 is accurate enough because the movement between physics steps is small enough for most of the moving things in your game?
     
  3. MadeFromPolygons

    MadeFromPolygons

    Joined:
    Oct 5, 2013
    Posts:
    3,886
    I think your timestep would have to be so small that you would lose the performance to the sheer number of steps instead.
     
  4. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    6,469
    Moving things and computing things are different, though: you can use fp16 to store and move data, but the computation goes back to 32 bits on the GPU.

    Anyway, meshlet shaders and compute compression/generation will take care of that in the future ... I'm not sure there is any use.
     
  5. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    Are you sure? Character movement is normally under 10 m/s, or 0.01 m/ms (meters per millisecond).
    And even bullets tend to fly in the 100 to 1500 m/s velocity range, or 0.1 to 1.5 m/ms.

    So as long as your physics LOD tick happens within a precision range that fits well within FP16 then you should be able to gain from the doubling of bandwidth whilst maintaining precision.*

    *Note, you may have to tailor the speeds and LODs of your game to take best advantage of FP16
     
  6. bgolus

    bgolus

    Joined:
    Dec 7, 2012
    Posts:
    12,256
    Vertex count is irrelevant here. You animate the object’s position, not its individual vertices. For more complex objects you animate bones that drive the vertices on the GPU. A really complex skinned mesh rig might have 200 bones. More than that and you're going to start having CPU side problems. If you're doing a lot of CPU side mesh generation, you're going to be limited by the CPU performance before you're limited by bandwidth.

    Water ... this could be an example of bandwidth limitations, but not necessarily in the way you were thinking. AAA games do water on the GPU 100% of the time. If they need the calculations CPU side, it’s cheaper to do them twice than to send the data back and forth, since the CPU usually only needs a few points rather than the whole simulation. So often there's a low resolution simulation that affects buoyancy and other aspects, and a higher resolution GPU side only simulation, or just fake noise, that's added on top for the visuals. Small details are probably 100% GPU side animation driven by the high resolution sim & noise. This’ll be a problem even with much, much faster PCIe versions, and the performance is again more limited by the CPU performance itself, which is why people do it on the GPU. Even then this is more a GPU to CPU issue than the other way around. Consoles get to make use of shared memory to mostly avoid this problem.


    Yep, the Switch is a Tegra X1, which, like most mobile GPUs, supports half precision floats in hardware. I highly recommend people make use of half precision for anything they can on mobile devices. Generally that's lighting and other color related calculations. UVs sometimes, but only for textures smaller than 1024x1024, as at that resolution or above there's not enough precision in a half to avoid the texture looking point sampled. At 4096x4096 half doesn't even have enough precision to show every texel.
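
    The UV limits above can be checked with a little arithmetic. This is a rough sketch, assuming IEEE 754 binary16 (11 significant bits) and UVs in the upper half of the 0 to 1 range, where a half has the coarsest spacing. The function names are hypothetical and Python's struct 'e' format stands in for a hardware half; it is not how any particular GPU samples textures.

```python
import math
import struct


def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value (inputs here stay below 1)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def half_ulp(x: float) -> float:
    """Spacing between adjacent half values around x (11 significant bits)."""
    _, exponent = math.frexp(abs(x))   # abs(x) = m * 2**exponent with 0.5 <= m < 1
    exponent = max(exponent, -13)      # clamp into the subnormal range
    return 2.0 ** (exponent - 11)


def half_steps_per_texel(texture_size: int, u: float = 0.75) -> float:
    """How many distinct half UV values fall inside one texel near coordinate u."""
    return (1.0 / texture_size) / half_ulp(u)


def sampled_texel(texel_index: int, texture_size: int) -> int:
    """Texel actually addressed when the texel's center UV is stored as a half."""
    u = to_half((texel_index + 0.5) / texture_size)
    return int(u * texture_size)


if __name__ == "__main__":
    for size in (256, 512, 1024, 2048, 4096):
        print(size, half_steps_per_texel(size))
    print(sampled_texel(3000, 4096), sampled_texel(3001, 4096))
```

    With only a couple of representable positions per texel at 1024, bilinear filtering has almost nothing to interpolate between, and at 4096 some texels in the upper half of the texture cannot be addressed at all.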


    Physics is super susceptible to precision issues. Some physics systems do all of the actual calculation as doubles and store the results as 32-bit floats. Other systems use integer math for physics so the precision is constant across the entire space.

    The thing about physics is it's rarely just about the precision of how much something moves in a single tick relative to its previous position. It's about that position relative to many other things in some arbitrary space. Those are the calculations that are expensive, and those all need to be done with at least 32-bit precision. Basic velocity calculations are cheap.
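
    As a rough illustration of why position matters more than per-tick movement, here is a sketch that integrates a 10 m/s object with a 1 ms tick, starting 100 m from the origin. It assumes IEEE 754 binary16 emulated through Python's struct 'e' format; the function names and the numbers are hypothetical choices for the example, not taken from any real physics engine.

```python
import struct
from typing import Callable


def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value (inputs here stay small)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def integrate(x0: float, velocity: float, dt: float, steps: int,
              quantize: Callable[[float], float]) -> float:
    """Explicit Euler position update, rounding every result with quantize."""
    x = quantize(x0)
    for _ in range(steps):
        dx = quantize(velocity * dt)   # per-tick displacement, e.g. 0.01 m
        x = quantize(x + dx)           # position is stored at the chosen precision
    return x


if __name__ == "__main__":
    # One simulated second: 1000 ticks of 1 ms at 10 m/s, starting at x = 100 m.
    print(integrate(100.0, 10.0, 0.001, 1000, to_half))        # half-precision position
    print(integrate(100.0, 10.0, 0.001, 1000, lambda v: v))    # Python double reference
```

    Around 100 m, adjacent half values are 0.0625 m apart, so a 0.01 m step rounds back to the same position and the object never moves, even though the displacement itself is easily representable.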
     