Emulating SetViewport inside the shader

Discussion in 'Shaders' started by customphase, Oct 30, 2019.

  1. customphase

    I have a shadow map atlas, and I need to render an object into each cascade of it. I could just call SetViewport and Draw for each cascade, but I'm trying to implement it using instancing, and I can't call SetViewport per instance, so I need to emulate the viewport functionality inside the shader. I managed to do it for directional lights (i.e. non-perspective lights), but for punctual lights it doesn't work. Here's the code for directional lights:

    Vert:
    Code (CSharp):
    o.position = mul(_ShadowmapAtlasViewProjMatrixArray[instanceID], float4(posWS,1));

    //Easier to work in 0;1 space, remap back into -1;1 in the end
    o.position.xy = o.position.xy*0.5+0.5;

    o.positionInsideViewport = o.position.xy;

    float4 viewport = _ShadowmapViewportOffsetMultiplierArray[instanceID];
    /*Viewport is :
    Vector4(
          shadowRequest.atlasViewport.xMin,
          shadowRequest.atlasViewport.yMin,
          (1f / atlasWidth) * (atlasWidth/ shadowRequest.atlasViewport.width),
          (1f / atlasHeight) * (atlasHeight/ shadowRequest.atlasViewport.height)
    );
    */

    //This gives us viewport.zw = (viewportSize / atlasSize)
    viewport.zw /= _ShadowmapAtlasSize.zw;
    viewport.zw = 1 / viewport.zw;

    o.position.xy *= viewport.zw;

    float2 offset = viewport.xy*_ShadowmapAtlasSize.zw;
    offset.y = 1-offset.y-viewport.w;
    o.position.xy += offset;

    //Remap back into -1;1
    o.position.xy = o.position.xy*2-1;

    And in Frag I just clip based on positionInsideViewport:
    Code (CSharp):
    clip(input.positionInsideViewport);       // Clip values below zero
    clip(1.0 - input.positionInsideViewport); // Clip values above one

    I tried doing the perspective divide before all the transformations and undoing it after them, e.g.
    Code (CSharp):
    o.position.xy /= o.position.w;
    o.position.xy = o.position.xy*0.5+0.5;
    ...transformations here...
    //Remap back into -1;1
    o.position.xy = o.position.xy*2-1;
    o.position.xy *= o.position.w;
    but this gives incorrect results. Any help is appreciated.
     
    Last edited: Oct 30, 2019
  2. bgolus

    It isn't safe to interpolate only two components of the clip space position under a perspective projection (as o.positionInsideViewport = o.position.xy does), which is part of why this works for directional lights but not for punctual lights. But....

    The perspective divide is indeed the other reason. In an orthographic projection the w component is always 1, so there's no need to do the perspective divide, or rather there's no difference between doing it or not. To make the modifications to the projection work with a perspective projection, you do need to do the perspective divide, or at least take the w component into account, when modifying the clip space xy.

    However, fixing o.positionInsideViewport is even easier. It should be using the o.position.xyw from before any modifications!
    Code (csharp):
    Shader "Unlit/ViewportModification"
    {
        Properties
        {
            _MainTex ("Texture", 2D) = "white" {}
            _ProjectionModification ("Projection Modification", Vector) = (1,1,0,0)
        }
        SubShader
        {
            Tags { "RenderType"="Opaque" }
            LOD 100

            Pass
            {
                CGPROGRAM
                #pragma vertex vert
                #pragma fragment frag
                #include "UnityCG.cginc"

                struct appdata
                {
                    float4 vertex : POSITION;
                    float2 uv : TEXCOORD0;
                };

                struct v2f
                {
                    float4 position : SV_POSITION;
                    float2 uv : TEXCOORD0;
                    float4 positionInsideViewport : TEXCOORD1;
                };

                sampler2D _MainTex;
                float4 _MainTex_ST;
                float4 _ProjectionModification;

                v2f vert (appdata v)
                {
                    v2f o;
                    o.position = UnityObjectToClipPos(v.vertex);
                    o.uv = TRANSFORM_TEX(v.uv, _MainTex);

                    o.positionInsideViewport = o.position; // could be a float3 with o.position.xyw

                    o.position.xy /= o.position.w;
                    o.position.xy = o.position.xy * _ProjectionModification.xy + _ProjectionModification.zw;
                    o.position.xy *= o.position.w;

                    return o;
                }

                fixed4 frag (v2f i) : SV_Target
                {
                    float2 screenPos = i.positionInsideViewport.xy / i.positionInsideViewport.w;
                    clip(1 - abs(screenPos));

                    fixed4 col = tex2D(_MainTex, i.uv);
                    return col;
                }
                ENDCG
            }
        }
    }
    [Image: upload_2019-10-30_14-7-13.png]
    And we're ... not done? Unfortunately there is some noise along the clip edge at the center axis that I don't fully grok, since the result is perfect if the clipped edge doesn't fall precisely on that axis. Likely some fun floating point funkiness.
    [Image: upload_2019-10-30_14-10-48.png]
    Luckily this only appears to be a problem if the render target resolution is an odd number. With even-numbered resolutions the edge is straight, as you'd expect, so in your case it might not be a problem.
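
    For the atlas case, the scale and offset passed in _ProjectionModification could be derived from the cascade's rectangle. The following is only a sketch under assumptions: viewportRect is a hypothetical parameter holding (xMin, yMin, width, height) already divided by the atlas size, with y measured from the bottom (flip it if the atlas or platform puts the origin at the top), and RemapClipToAtlasRect is an illustrative name, not a Unity function. Dividing by w, transforming, and multiplying by w again cancels out on the scale term, so only the offset needs to be multiplied by w:

    ```hlsl
    // Sketch: squeeze a clip space position into a sub-rectangle of the render target.
    // viewportRect = (xMin, yMin, width, height) / atlasSize, i.e. normalized to 0..1 (assumed layout).
    float4 RemapClipToAtlasRect(float4 clipPos, float4 viewportRect)
    {
        // NDC x in [-1, 1] must land in [2 * xMin - 1, 2 * (xMin + width) - 1], same for y.
        float2 scale  = viewportRect.zw;
        float2 offset = viewportRect.xy * 2.0 + viewportRect.zw - 1.0;

        // Equivalent to: xy /= w; xy = xy * scale + offset; xy *= w;
        clipPos.xy = clipPos.xy * scale + offset * clipPos.w;
        return clipPos;
    }
    ```

    With viewportRect = (0, 0, 1, 1) the position comes back unchanged. With (0.5, 0, 0.5, 0.5) the scale is 0.5 and the NDC offset is (0.5, -0.5), which is the bottom-right quarter of the target under the y-up assumption.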
     
  3. customphase

    Thanks! That fixed it. The odd-numbered resolutions shouldn't be a problem, since all the shadow maps are powers of 2 anyway.
     
  4. bgolus

    Not that it matters for your use case, but it finally clicked in my head why it was having problems with odd number resolutions. Floating point interpolation means the center axis pixels are going to be equal to 0.0 +/- a tiny bit. That tiny bit means it’s basically up to chance if it’s going to exactly match 0.0 or be slightly greater / less than 0.0. The best fix is probably to adjust the scale & offsets to align to pixel dimensions for each viewport, or to convert to floored pixel space in the shader to do the clip.
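
    A rough sketch of the second option, clipping in floored pixel space. It assumes the fragment shader can read pixel coordinates from its SV_POSITION input, and it introduces a hypothetical uniform _ViewportRectPx holding (xMin, yMin, width, height) of the viewport in whole render target pixels. Both names here are illustrative and not part of the shader above:

    ```hlsl
    // Hypothetical uniform: (xMin, yMin, width, height) of the viewport in whole pixels.
    float4 _ViewportRectPx;

    // Discards fragments whose pixel lies outside the viewport rectangle.
    // svPosition is the fragment shader's SV_POSITION input (pixel centers sit at x.5, y.5).
    void ClipToViewportPixels(float4 svPosition)
    {
        float2 pixel   = floor(svPosition.xy);                      // integer pixel index
        float2 rectMin = _ViewportRectPx.xy;                        // inclusive lower bound
        float2 rectMax = _ViewportRectPx.xy + _ViewportRectPx.zw;   // exclusive upper bound

        clip(pixel - rectMin);        // discard pixels before the rectangle's minimum
        clip(rectMax - 1.0 - pixel);  // discard pixels at or past the rectangle's maximum
    }
    ```

    In the shader above this would be called as ClipToViewportPixels(i.position) at the top of frag. The comparison is between whole numbers, so it doesn't depend on an interpolated value landing exactly on 0.0. Note that the y origin of pixel coordinates can differ between graphics APIs, so the rectangle may need flipping.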
     
  5. customphase

    If anyone is looking this up in the future: don't use fragment shader clipping/discarding as was initially proposed. It's terribly slow, since you're wasting threads on simply checking whether a pixel is visible or not (at least that's how I think it works). Thankfully there's a much better way: the SV_ClipDistance semantic (https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-semantics). The idea is that you output distances to arbitrary planes from your vertex shader, and the hardware automatically does the clipping based on those distances (clipping where the distance is < 0) without wasting precious threads.

    Declare an output member in your vertex output struct with the SV_ClipDistance semantic, sized to match the number of clip planes (I use 4 because I don't need near/far clips in this case, hence float4):

    Code (CSharp):

    struct v2f
    {
        ....
        float4 clipDistances : SV_ClipDistance;
        //Or, if you want more planes
        //float4 clipDistances0 : SV_ClipDistance0;
        //float4 clipDistances1 : SV_ClipDistance1;
        //etc...
    };

    Then in your vertex shader write the distances to the planes. In my case I convert world positions to view space and do a dot product against each frustum clip plane normal (my clip planes are in view space as well):

    Code (CSharp):
    v2f vert (appdata v)
    {
        v2f o;
        ...
        o.clipDistances = float4(
            dot(viewSpacePos, _ProbeViewSpaceFrustumPlanes[0].xyz),
            dot(viewSpacePos, _ProbeViewSpaceFrustumPlanes[1].xyz),
            dot(viewSpacePos, _ProbeViewSpaceFrustumPlanes[2].xyz),
            dot(viewSpacePos, _ProbeViewSpaceFrustumPlanes[3].xyz)
        );
        ...
        return o;
    }
    Switching from manual fragment discarding to this yielded about a 4-5x(!) performance improvement in my case.
     
    Last edited: Sep 8, 2021
  6. bgolus

    Small warning: this option is supposedly super bad on some hardware. Probably not worse than the clip method, but I have seen some discussions on Twitter between people complaining that it’s uselessly bad on some hardware… but I can’t remember which.
     
  7. customphase

    I see. It should be fine for any relatively modern desktop GPU though, right?
     
  8. bgolus

    My vague memory is they were complaining about AMD GCN GPUs.

    *rummages around twitter*

    Looks like AMD is fine. The tweet in the thread I remembered that called out the specific GPU they were having problems with has been deleted (specifically, the author deleted their Twitter account). So *shrug*. Seems like 2019-era AMD is fine, presumably Nvidia is fine, so maybe it was some mobile device.
     
    Last edited: Sep 8, 2021