Why is this shader slow?

Discussion in 'Shaders' started by customphase, Oct 1, 2019.

  1. customphase

    customphase

    Joined:
    Aug 19, 2012
    Posts:
    246
    I'm not a total newbie when it comes to shaders, but for the past week I've been struggling with the performance of the water shader I've been working on. There doesn't seem to be anything heavy going on in the pixel shader, yet it takes about 3 ms to render at 1080p on my 1050 Ti. The shader also uses tessellation and vertex processing, but even if I disable tessellation and bypass vertex processing it still takes the same amount of time to render. I've found this presentation (http://developer.download.nvidia.co...vents/cgdc15/CGDC2015_ocean_simulation_en.pdf), where they have even more features in their water shader, yet it renders in 0.5 ms(!) on a GTX 770 (which is just slightly faster than a 1050 Ti), so there's definitely something wrong with my code somewhere (although they don't mention the resolution those timings were taken at, so I might actually be wrong and this is adequate performance for my GPU). Here's the relevant pixel processing code:

    Code (HLSL):
    void Frag(v2f input, out float4 outColor : SV_Target0) {
        outColor = 1;
        const float3 geomTangentWS = normalize(input.tangentWS);
        const float3 geomBinormalWS = normalize(input.binormalWS);
        const float3 geomNormalWS = normalize(cross(geomBinormalWS, geomTangentWS));
        const float3x3 local2WorldTranspose = float3x3(
            geomTangentWS,
            geomBinormalWS,
            geomNormalWS
        );
        const float3 blendedPos = (input.positionWSForUV + input.positionWS)*0.5;
        const float3 V = GetWorldSpaceNormalizeViewDir(input.positionRWS);
        float2 normalVelocity = 0;
        const float3 normalRaw = GetCombinedNormals(blendedPos, V, geomNormalWS, normalVelocity);

        const float3 normal = ScaleNormalizeNormal(normalRaw, 0.12*_NormalStrength);
        const float3 normalWS = TransformNormal(normal, local2WorldTranspose);
        const float sceneDepth = LinearEyeDepth(SampleCameraDepth((input.screenPosition.xy+normalWS.xz*0.18)/input.screenPosition.w), _ZBufferParams);
        const float depthDiff = sceneDepth-input.position.w;
        const float depthNormalCurl = max(0, 1-exp2(-max(0, depthDiff)*4));

        const int probeIndexFixed = abs(_PlanarReflectionProbeIndex) - 1;
        const float3 R = reflect (V, normalize(lerp(normalWS, normalize(float3(normalWS.x, -9.0, normalWS.z)), 1-depthNormalCurl)));
        const float3 reflPos = (input.positionWSForUV + 9999999 * R);
        const float3 ndc = ComputeNormalizedDeviceCoordinatesWithZ(reflPos, _Env2DCaptureVP[probeIndexFixed]);
        const float distReflRough = 1-exp(-length(input.positionRWS.xyz)*0.0031);
        const float3 reflection = SAMPLE_TEXTURE2D_ARRAY_LOD(_Env2DTextures, s_trilinear_clamp_sampler, ndc.xy, probeIndexFixed, distReflRough*8).rgb * GetCurrentExposureMultiplier() * _ReflectionBoost;
        const float fresnel = pow(saturate(1-max(0, dot(V, normalWS))), _FresnelPower)*0.98+0.02;

        const float depthFade = max(0, 1-exp2(-max(0, depthDiff)*_DepthFactor*0.8));
        const float depthFadeDist = max(0, 1-exp2(-max(0, depthDiff)*1));
        const float3 sceneColorA = SampleCameraColor((input.screenPosition.xy+normalWS.xz*0.28*1*depthFadeDist)/input.screenPosition.w, 0) * float3(0.0,0.3,1);
        const float3 sceneColorB = SampleCameraColor((input.screenPosition.xy+normalWS.xz*0.28*1.8*depthFadeDist)/input.screenPosition.w, 0) * float3(1.0,0.7,0);
        const float depthFadeLightFilter = max(0, 1-exp2(-max(0, depthDiff)*_DepthFactor*3.5));
        const float3 sceneColor = lerp(sceneColorA + sceneColorB, (sceneColorA+sceneColorB)*(0.2+normalize(_SubsurfaceColor.rgb)*0.8), depthFadeLightFilter);
        const float3 diffuse = input.albedo*input.diffuse;
        const float3 refraction = lerp(sceneColor, diffuse, depthFade);
        outColor.rgb = lerp(refraction, reflection, fresnel);
        uint2 tileIndex = uint2(input.position.xy) / GetTileSize();
        PositionInputs posInput = GetPositionInput_Stereo(input.position.xy, _ScreenSize.zw, input.position.z, input.position.w, input.positionRWS.xyz, tileIndex, unity_StereoEyeIndex);
        float4 fog = EvaluateAtmosphericScattering(posInput, V);
        outColor.rgb = outColor.rgb * saturate((1 - fog.a) + (1-depthFade)) + fog.rgb * depthFade;
    }
    And here's the PIX analysis of this shader/draw call (I honestly don't know how to interpret it: good? bad?):
    upload_2019-10-2_2-48-54.png

    GetCombinedNormals just samples and unpacks ONE normal, nothing fancy there.

    It's for HDRP, so it uses some HDRP macros and functions, like EvaluateAtmosphericScattering, SampleCameraColor, ComputeNormalizedDeviceCoordinatesWithZ, etc. But those are not the source of the problem.

    I'm pretty much on the verge of a mental breakdown, please help :)
     
    Last edited: Oct 2, 2019
  2. BattleAngelAlita

    BattleAngelAlita

    Joined:
    Nov 20, 2016
    Posts:
    400
    You have around 10 normalizes in your shader. I think that's a bit overkill.
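
    To illustrate the point about normalizes: in the posted `Frag`, `normalize(_SubsurfaceColor.rgb)` runs on a material constant for every pixel. Below is a minimal sketch of moving that work out of the pixel shader. `_SubsurfaceTint` and `ApplyDepthLightFilter` are hypothetical names that do not appear in the original shader, and the sketch assumes the tint is computed once on the CPU and uploaded as a material property. Whether this makes a measurable difference is not established in this thread.

```hlsl
// Hypothetical property, NOT part of the original shader.
// Assumed to be set from script to: 0.2 + normalize(_SubsurfaceColor.rgb) * 0.8
// (in an SRP shader it would normally live in the material constant buffer).
float3 _SubsurfaceTint;

// Hypothetical helper: same blend as the original `sceneColor` line,
// but with the constant factor precomputed instead of normalized per pixel.
// sceneColorSum is expected to be sceneColorA + sceneColorB.
float3 ApplyDepthLightFilter(float3 sceneColorSum, float depthFadeLightFilter)
{
    return lerp(sceneColorSum, sceneColorSum * _SubsurfaceTint, depthFadeLightFilter);
}
```

    The normalizes applied to interpolated vectors such as `input.tangentWS` are a different case: interpolation can change their length, so dropping them can change the shading result.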
     
  3. customphase

    customphase

    Joined:
    Aug 19, 2012
    Posts:
    246
    Thanks. I removed pretty much all of the normalizes except one. Slightly better (-0.4 ms compared to the previous version), but it's still 2.8 ms.

    upload_2019-10-2_14-7-16.png
     
  4. customphase

    customphase

    Joined:
    Aug 19, 2012
    Posts:
    246
    Even removing all the texture reads only knocks it down to 2 ms. I'm absolutely baffled by this. Why is it performing so poorly?
     
    Last edited: Oct 2, 2019
  5. bgolus

    bgolus

    Joined:
    Dec 7, 2012
    Posts:
    12,352
    What happens if you just output a solid color?
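
    A minimal sketch of that test, assuming the same `v2f` interpolator struct and `Frag` entry point as the shader posted above (the magenta value is an arbitrary choice). With all shading work removed, the measured time should approximate the fixed cost of the draw itself, which gives a baseline to compare the full shader against.

```hlsl
// Baseline test: skip all shading work and write a constant color.
// `v2f` is assumed to be the interpolator struct from the original shader.
void Frag(v2f input, out float4 outColor : SV_Target0)
{
    // Solid magenta: no texture reads, no math on the interpolated inputs.
    outColor = float4(1.0, 0.0, 1.0, 1.0);
}
```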
     
  6. customphase

    customphase

    Joined:
    Aug 19, 2012
    Posts:
    246
    0.35 ms