
Should I pack four float properties to one float4?

Discussion in 'Shaders' started by dongch007, Apr 8, 2021.

  1. dongch007

    case 1:
    float param0;
    float param1;
    float param2;
    float param3;

    case 2:
    float4 params;

    Is case 1 faster than case 2?
    I have some ideas but don't know whether they are correct.
    For example, in GLES, case 1 calls glUniform1f four times while case 2 calls glUniform4f once, so case 2 should cost less CPU time than case 1. Is that right?
    Does case 2 use fewer registers, so that the GPU can switch between more contexts when a texture fetch stalls?

    Even if the two points above are correct, there is another consideration.
    I read the cbuffer packing rules: https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-packing-rules
    In URP with a cbuffer, case 1 and case 2 take up the same size, so there's no need to pack four floats into one float4. Is that right?
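
    As a rough sketch of the packing rules linked above (not Unity's or the HLSL compiler's actual code), the C snippet below models how scalar and vector members are laid out in a cbuffer: each member goes at the next free byte unless it would cross a 16-byte register boundary, in which case it starts at the next register. The helper names place_member and cbuffer_size are hypothetical, and the model assumes 4-byte components and members of at most 16 bytes; it ignores arrays, matrices, and structs.

    ```c
    #include <stddef.h>

    /* Hypothetical helper: returns the byte offset assigned to a member of
     * `size` bytes (assumed <= 16) and advances *cursor past that member. */
    static size_t place_member(size_t *cursor, size_t size) {
        size_t offset = *cursor;
        size_t room = 16 - (offset % 16);  /* bytes left in the current register */
        if (size > room) {
            offset += room;                /* would straddle: start the next register */
        }
        *cursor = offset + size;
        return offset;
    }

    /* Hypothetical helper: total cbuffer size in bytes for the given member
     * sizes, rounded up to a whole number of 16-byte registers. */
    static size_t cbuffer_size(const size_t *sizes, size_t count) {
        size_t cursor = 0;
        for (size_t i = 0; i < count; ++i) {
            place_member(&cursor, sizes[i]);
        }
        return (cursor + 15) / 16 * 16;
    }
    ```

    Under this model, four floats (sizes 4, 4, 4, 4) and a single float4 (size 16) both occupy one 16-byte register, which matches the observation that the two cases have the same size. A float2 followed by a float3, by contrast, takes 32 bytes because the float3 cannot straddle the register boundary.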
     
  2. Invertex

    If you can pack it yourself, it's generally not a bad idea, assuming you understand the packing rules so that you are not creating unnecessary bloat in the struct. Compiler optimizations are nice, but they should be thought of as secondary: a way to pick up the slack where the user misses something.

    When packing yourself you also clearly outline which values are going to be frequently used at the same time and get a better understanding of the data flow in your program. Otherwise you might end up with a value getting packed with a different value that's already been read once, resulting in an unnecessary address read when you then access that other value.

    GPUs are vector processing units for the most part, so working with a vector rather than a single value is generally ideal.
     
  3. Camarent

    What rules do you mean?

    In general, does this mean it is good to pack together data that is used in the same place and at roughly the same frequency, and that in other cases it is better to keep the values separate?
     