Is Unity’s new standard shader inefficient in deferred?

Discussion in 'Unity 5 Pre-order Beta' started by larsbertram1, Feb 8, 2015.

  1. larsbertram1

    hi there,
    while working on lux deferred snow decals (http://forum.unity3d.com/threads/lux-snow-shader-for-unity-5-beta-22.294821) i stumbled across this line in the "UnityStandardCore.cginc" file.
    Code (csharp):
    half3 color = UNITY_BRDF_PBS (s.diffColor, s.specColor, s.oneMinusReflectivity, s.oneMinusRoughness, s.normalWorld, -s.eyeVec, gi.light, gi.indirect).rgb;
    color is then finally written to render target 3, which holds ambient diffuse lighting and ambient specular reflections:
    Code (csharp):
    outEmission = half4(color, 1);
    so if i get it right the shader performs a full pbs lighting pass (!) although it only has to calculate
    1) diffuse ambient lighting which is:
    Code (csharp):
    half3 ambient = col.rgb * ShadeSH9(half4(worldNormal, 1.0));
    and 2) specular ambient reflections which would be:
    Code (csharp):
    half3 env0 = Unity_GlossyEnvironment (UNITY_PASS_TEXCUBE(unity_SpecCube0), unity_SpecCube0_HDR, worldNormalRefl, 1-smoothness);
    half grazingTerm = saturate(smoothness + (1-oneMinusReflectivity));
    half nv = DotClamped (worldNormal, -i.eyeVec);
    ambient += env0 * FresnelLerpFast (specColor, grazingTerm, nv);
    except for the fact that my code would not let you blend between 2 different reflection probes, i do not see any need to do all the other calculations that e.g. BRDF1_Unity_PBS does.

    i may be totally missing something here, but reducing the calculations to those mentioned above gives me pretty fine results in the gbuffers as far as i can tell.
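
The reduced ambient-only calculation described above can be sketched numerically. This is only a rough Python illustration, not Unity code: `sh_irradiance` and `env_color` are hypothetical stand-ins for the results of `ShadeSH9` and `Unity_GlossyEnvironment`, and the body of `fresnel_lerp_fast` (a lerp driven by `(1 - cosA)^4`) is an assumption about what `FresnelLerpFast` does.

```python
def saturate(x):
    """Clamp x to [0, 1], like the Cg intrinsic."""
    return max(0.0, min(1.0, x))


def fresnel_lerp_fast(f0, f90, cos_a):
    # ASSUMPTION: lerp from f0 (per channel) to the scalar f90 by (1 - cos_a)^4.
    t = (1.0 - cos_a) ** 4
    return [a + (f90 - a) * t for a in f0]


def ambient_only(col, spec_color, one_minus_reflectivity, smoothness,
                 n_dot_v, sh_irradiance, env_color):
    """Ambient diffuse plus ambient specular, with no direct light term."""
    # 1) diffuse ambient: col * SH irradiance
    ambient = [c * s for c, s in zip(col, sh_irradiance)]
    # 2) specular ambient: environment color weighted by a Fresnel lerp
    grazing_term = saturate(smoothness + (1.0 - one_minus_reflectivity))
    fresnel = fresnel_lerp_fast(spec_color, grazing_term, n_dot_v)
    return [a + e * f for a, e, f in zip(ambient, env_color, fresnel)]
```

Looking straight at the surface (`n_dot_v = 1`), the specular part is weighted by `spec_color`; at grazing angles (`n_dot_v = 0`) it is weighted by `grazing_term`.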

    nevertheless i have filed a bug report: Case 670639

    cheers, lars
     
  2. larsbertram1

    here is a preview of render target 3 which holds ambient diffuse lighting and ambient specular reflections – using my simplified version of calculating the ambient output:

    decalblending.gif
     
  3. ReJ

    Unity Technologies
    The shader compiler should be able to optimize all that BRDF code out, because everything except (1) and (2) is multiplied by 0 - check out the DummyLight setup.

    However, if you see an actual performance difference (in the Unity GPU profiler or an external tool like GPA) - please scream and shout!
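
To make the "multiplied by 0" argument concrete, here is a small Python sketch. It is a simplified model, not the actual `BRDF1_Unity_PBS` code: `direct_diffuse` and `direct_specular` are hypothetical placeholders for whatever the direct-light terms evaluate to. It only shows that the result is algebraically independent of those terms once the light color is zero; whether a given compiler really strips the instructions is a separate question that needs a profiler or a look at the generated code.

```python
def combine(diff_color, light_color, direct_diffuse, direct_specular,
            gi_diffuse, gi_specular):
    """Simplified shape of the final sum: direct terms scaled by the light
    color, plus the indirect (ambient) diffuse and specular terms."""
    return [
        dc * (gd + lc * direct_diffuse) + direct_specular * lc + gs
        for dc, lc, gd, gs in zip(diff_color, light_color, gi_diffuse, gi_specular)
    ]


def ambient_reference(diff_color, gi_diffuse, gi_specular):
    """What is left when every direct-light term is dropped."""
    return [dc * gd + gs for dc, gd, gs in zip(diff_color, gi_diffuse, gi_specular)]


# With a black "dummy" light the direct terms contribute nothing,
# no matter how expensive they were to compute.
black = [0.0, 0.0, 0.0]
with_dummy_light = combine([0.5, 0.4, 0.3], black, 123.0, 456.0,
                           [1.0, 0.9, 0.8], [0.1, 0.2, 0.3])
```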
     
  4. larsbertram1

    hi there,

    thanks a lot for your answer.
    unfortunately the gpu profiler does not tell me that much as it displays pretty random values...
    so i had a look into the generated shader code (standard) – at least into one pass:
    Keywords { "DIRECTIONAL" "SHADOWS_OFF" "LIGHTMAP_ON" "DIRLIGHTMAP_OFF" "DYNAMICLIGHTMAP_OFF" }
    if i get it right this subprogram should simply pick up lighting from the lightmaps.
    but i can find some lines that look just as if they would come from the standard brdf (see examples below).
    do these lines get optimized out later on? or are they needed when using lightmaps?

    btw. the standard shader does not work for me anymore using any kind of baked lighting (RC2, mac mini, intel hd graphics 3000). see image attached at the bottom.

    lars


    example code:

    line 245:
    tmpvar_44 = ((tmpvar_40 * tmpvar_40) * unity_LightGammaCorrectionConsts.w);
    looks like:
    // Smith-Schlick derived for Beckmann
    inline half SmithBeckmannVisibilityTerm (half NdotL, half NdotV, half roughness)
    {
        // c = sqrt(2 / Pi)
        half c = unity_LightGammaCorrectionConsts_SqrtHalfPI;
        half k = roughness * roughness * c;
        return SmithVisibilityTerm (NdotL, NdotV, k);
    }
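
For reference, here is a hedged Python sketch of this visibility term. The body of `smith_visibility_term` is not quoted from the cginc file; it is reconstructed from the matching expression in the generated fragment program further down (`1.0/((NdotL * (1.0 - k) + k) * (NdotV * (1.0 - k) + k) + 0.0001)`). The default `c = sqrt(2 / pi)` is taken from the comment above and is an assumption; the real constant comes from `unity_LightGammaCorrectionConsts`.

```python
import math


def smith_visibility_term(n_dot_l, n_dot_v, k):
    # Reconstructed from the generated GLSL, not from the original Cg source.
    gl = n_dot_l * (1.0 - k) + k
    gv = n_dot_v * (1.0 - k) + k
    return 1.0 / (gl * gv + 1e-4)


def smith_beckmann_visibility_term(n_dot_l, n_dot_v, roughness,
                                   c=math.sqrt(2.0 / math.pi)):
    # ASSUMPTION: c = sqrt(2 / pi), as stated in the shader comment.
    k = roughness * roughness * c
    return smith_visibility_term(n_dot_l, n_dot_v, k)
```

The line quoted from the generated code (`tmpvar_44 = roughness * roughness * const`) corresponds to the computation of `k` here.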

    or line 248:
    tmpvar_46 = (10.0 / log2((
    ((1.0 - tmpvar_40) * 0.968)
    + 0.03)));
    which looks like:
    inline half RoughnessToSpecPower (half roughness)
    {
    #if UNITY_GLOSS_MATCHES_MARMOSET_TOOLBAG2
    // from https://s3.amazonaws.com/docs.knaldtech.com/knald/1.0.0/lys_power_drops.html
    half n = 10.0 / log2((1-roughness)*0.968 + 0.03);
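
A small Python sketch of this roughness-to-specular-power mapping, for anyone who wants to check numbers. The quoted Cg snippet is cut off, so the final squaring is taken from the generated GLSL below (`tmpvar_45 = (tmpvar_46 * tmpvar_46)`), not from the cginc source; the function name is just a Python-style rename.

```python
import math


def roughness_to_spec_power(roughness):
    """Marmoset-style mapping as it appears in the generated GLSL."""
    # log2 of a value below 1 is negative, so n is negative;
    # squaring it (as the generated code does) gives a positive power.
    n = 10.0 / math.log2((1.0 - roughness) * 0.968 + 0.03)
    return n * n


# Print the specular power for a few roughness values.
for r in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(r, roughness_to_spec_power(r))
```

The power falls monotonically as roughness goes from 0 to 1, which is the expected behaviour for a Blinn-Phong style exponent.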


    here is the complete subprogram:

    SubProgram "opengl " {
    // Stats: 150 math, 5 textures, 9 branches
    Keywords { "DIRECTIONAL" "SHADOWS_OFF" "LIGHTMAP_ON" "DIRLIGHTMAP_OFF" "DYNAMICLIGHTMAP_OFF" }
    "!!GLSL
    #ifdef VERTEX
    uniform vec3 _WorldSpaceCameraPos;

    uniform mat4 _Object2World;
    uniform mat4 _World2Object;
    uniform vec4 unity_LightmapST;
    uniform vec4 _MainTex_ST;
    uniform vec4 _DetailAlbedoMap_ST;
    uniform float _UVSec;
    varying vec4 xlv_TEXCOORD0;
    varying vec3 xlv_TEXCOORD1;
    varying vec4 xlv_TEXCOORD2;
    varying vec4 xlv_TEXCOORD2_1;
    varying vec4 xlv_TEXCOORD2_2;
    varying vec4 xlv_TEXCOORD5;
    varying vec3 xlv_TEXCOORD8;
    void main ()
    {
    vec2 tmpvar_1;
    tmpvar_1 = gl_MultiTexCoord0.xy;
    vec2 tmpvar_2;
    tmpvar_2 = gl_MultiTexCoord1.xy;
    vec4 tmpvar_3;
    vec4 tmpvar_4;
    vec4 tmpvar_5;
    vec4 tmpvar_6;
    vec4 tmpvar_7;
    tmpvar_7 = (_Object2World * gl_Vertex);
    vec3 tmpvar_8;
    tmpvar_8 = tmpvar_7.xyz;
    vec4 tmpvar_9;
    tmpvar_9 = (gl_ModelViewProjectionMatrix * gl_Vertex);
    vec4 texcoord_10;
    texcoord_10.xy = ((gl_MultiTexCoord0.xy * _MainTex_ST.xy) + _MainTex_ST.zw);
    vec2 tmpvar_11;
    if ((_UVSec == 0.0)) {
    tmpvar_11 = tmpvar_1;
    } else {
    tmpvar_11 = tmpvar_2;
    };
    texcoord_10.zw = ((tmpvar_11 * _DetailAlbedoMap_ST.xy) + _DetailAlbedoMap_ST.zw);
    vec4 v_12;
    v_12.x = _World2Object[0].x;
    v_12.y = _World2Object[1].x;
    v_12.z = _World2Object[2].x;
    v_12.w = _World2Object[3].x;
    vec4 v_13;
    v_13.x = _World2Object[0].y;
    v_13.y = _World2Object[1].y;
    v_13.z = _World2Object[2].y;
    v_13.w = _World2Object[3].y;
    vec4 v_14;
    v_14.x = _World2Object[0].z;
    v_14.y = _World2Object[1].z;
    v_14.z = _World2Object[2].z;
    v_14.w = _World2Object[3].z;
    tmpvar_3.xyz = vec3(0.0, 0.0, 0.0);
    tmpvar_4.xyz = vec3(0.0, 0.0, 0.0);
    tmpvar_5.xyz = normalize(((
    (v_12.xyz * gl_Normal.x)
    +
    (v_13.xyz * gl_Normal.y)
    ) + (v_14.xyz * gl_Normal.z)));
    tmpvar_6.xy = ((gl_MultiTexCoord1.xy * unity_LightmapST.xy) + unity_LightmapST.zw);
    tmpvar_6.zw = vec2(0.0, 0.0);
    gl_Position = tmpvar_9;
    xlv_TEXCOORD0 = texcoord_10;
    xlv_TEXCOORD1 = (tmpvar_7.xyz - _WorldSpaceCameraPos);
    xlv_TEXCOORD2 = tmpvar_3;
    xlv_TEXCOORD2_1 = tmpvar_4;
    xlv_TEXCOORD2_2 = tmpvar_5;
    xlv_TEXCOORD5 = tmpvar_6;
    xlv_TEXCOORD8 = tmpvar_8;
    }


    #endif
    #ifdef FRAGMENT
    #extension GL_ARB_shader_texture_lod : enable
    uniform sampler2D unity_Lightmap;
    uniform samplerCube unity_SpecCube0;
    uniform samplerCube unity_SpecCube1;
    uniform vec4 unity_SpecCube0_BoxMax;
    uniform vec4 unity_SpecCube0_BoxMin;
    uniform vec4 unity_SpecCube0_ProbePosition;
    uniform vec4 unity_SpecCube0_HDR;
    uniform vec4 unity_SpecCube1_BoxMax;
    uniform vec4 unity_SpecCube1_BoxMin;
    uniform vec4 unity_SpecCube1_ProbePosition;
    uniform vec4 unity_SpecCube1_HDR;
    uniform vec4 unity_ColorSpaceDielectricSpec;
    uniform vec4 unity_Lightmap_HDR;
    uniform vec4 unity_LightGammaCorrectionConsts;
    uniform vec4 _Color;
    uniform sampler2D _MainTex;
    uniform float _Metallic;
    uniform float _Glossiness;
    uniform sampler2D _OcclusionMap;
    uniform float _OcclusionStrength;
    varying vec4 xlv_TEXCOORD0;
    varying vec3 xlv_TEXCOORD1;
    varying vec4 xlv_TEXCOORD2_2;
    varying vec4 xlv_TEXCOORD5;
    varying vec3 xlv_TEXCOORD8;
    void main ()
    {
    vec4 c_1;
    vec3 tmpvar_2;
    tmpvar_2 = normalize(xlv_TEXCOORD2_2.xyz);
    vec3 tmpvar_3;
    tmpvar_3 = normalize(xlv_TEXCOORD1);
    vec3 tmpvar_4;
    tmpvar_4 = (_Color.xyz * texture2D (_MainTex, xlv_TEXCOORD0.xy).xyz);
    vec3 tmpvar_5;
    vec3 tmpvar_6;
    tmpvar_6 = mix (unity_ColorSpaceDielectricSpec.xyz, tmpvar_4, vec3(_Metallic));
    float tmpvar_7;
    tmpvar_7 = (unity_ColorSpaceDielectricSpec.w - (_Metallic * unity_ColorSpaceDielectricSpec.w));
    tmpvar_5 = (tmpvar_4 * tmpvar_7);
    float tmpvar_8;
    tmpvar_8 = ((1.0 - _OcclusionStrength) + (texture2D (_OcclusionMap, xlv_TEXCOORD0.xy).y * _OcclusionStrength));
    vec3 tmpvar_9;
    vec3 tmpvar_10;
    float tmpvar_11;
    vec3 tmpvar_12;
    vec3 tmpvar_13;
    tmpvar_13 = vec3(0.0, 0.0, 0.0);
    vec4 tmpvar_14;
    tmpvar_14 = texture2D (unity_Lightmap, xlv_TEXCOORD5.xy);
    tmpvar_12 = ((unity_Lightmap_HDR.x * pow (tmpvar_14.w, unity_Lightmap_HDR.y)) * tmpvar_14.xyz);
    tmpvar_12 = (tmpvar_12 * tmpvar_8);
    vec3 tmpvar_15;
    tmpvar_15 = (tmpvar_3 - (2.0 * (
    dot (tmpvar_2, tmpvar_3)
    * tmpvar_2)));
    vec3 worldNormal_16;
    worldNormal_16 = tmpvar_15;
    if ((unity_SpecCube0_ProbePosition.w > 0.0)) {
    vec3 tmpvar_17;
    tmpvar_17 = normalize(tmpvar_15);
    vec3 tmpvar_18;
    tmpvar_18 = ((unity_SpecCube0_BoxMax.xyz - xlv_TEXCOORD8) / tmpvar_17);
    vec3 tmpvar_19;
    tmpvar_19 = ((unity_SpecCube0_BoxMin.xyz - xlv_TEXCOORD8) / tmpvar_17);
    bvec3 tmpvar_20;
    tmpvar_20 = greaterThan (tmpvar_17, vec3(0.0, 0.0, 0.0));
    float tmpvar_21;
    if (tmpvar_20.x) {
    tmpvar_21 = tmpvar_18.x;
    } else {
    tmpvar_21 = tmpvar_19.x;
    };
    float tmpvar_22;
    if (tmpvar_20.y) {
    tmpvar_22 = tmpvar_18.y;
    } else {
    tmpvar_22 = tmpvar_19.y;
    };
    float tmpvar_23;
    if (tmpvar_20.z) {
    tmpvar_23 = tmpvar_18.z;
    } else {
    tmpvar_23 = tmpvar_19.z;
    };
    vec3 tmpvar_24;
    tmpvar_24 = ((unity_SpecCube0_BoxMax.xyz + unity_SpecCube0_BoxMin.xyz) * 0.5);
    worldNormal_16 = (((
    (tmpvar_24 - unity_SpecCube0_ProbePosition.xyz)
    + xlv_TEXCOORD8) + (tmpvar_17 *
    min (min (tmpvar_21, tmpvar_22), tmpvar_23)
    )) - tmpvar_24);
    };
    vec4 tmpvar_25;
    tmpvar_25.xyz = worldNormal_16;
    tmpvar_25.w = (pow ((1.0 - _Glossiness), 0.75) * 7.0);
    vec4 tmpvar_26;
    tmpvar_26 = textureCubeLod (unity_SpecCube0, worldNormal_16, tmpvar_25.w);
    vec3 tmpvar_27;
    tmpvar_27 = ((unity_SpecCube0_HDR.x * pow (tmpvar_26.w, unity_SpecCube0_HDR.y)) * tmpvar_26.xyz);
    if ((unity_SpecCube0_BoxMin.w < 0.99999)) {
    vec3 worldNormal_28;
    worldNormal_28 = tmpvar_15;
    if ((unity_SpecCube1_ProbePosition.w > 0.0)) {
    vec3 tmpvar_29;
    tmpvar_29 = normalize(tmpvar_15);
    vec3 tmpvar_30;
    tmpvar_30 = ((unity_SpecCube1_BoxMax.xyz - xlv_TEXCOORD8) / tmpvar_29);
    vec3 tmpvar_31;
    tmpvar_31 = ((unity_SpecCube1_BoxMin.xyz - xlv_TEXCOORD8) / tmpvar_29);
    bvec3 tmpvar_32;
    tmpvar_32 = greaterThan (tmpvar_29, vec3(0.0, 0.0, 0.0));
    float tmpvar_33;
    if (tmpvar_32.x) {
    tmpvar_33 = tmpvar_30.x;
    } else {
    tmpvar_33 = tmpvar_31.x;
    };
    float tmpvar_34;
    if (tmpvar_32.y) {
    tmpvar_34 = tmpvar_30.y;
    } else {
    tmpvar_34 = tmpvar_31.y;
    };
    float tmpvar_35;
    if (tmpvar_32.z) {
    tmpvar_35 = tmpvar_30.z;
    } else {
    tmpvar_35 = tmpvar_31.z;
    };
    vec3 tmpvar_36;
    tmpvar_36 = ((unity_SpecCube1_BoxMax.xyz + unity_SpecCube1_BoxMin.xyz) * 0.5);
    worldNormal_28 = (((
    (tmpvar_36 - unity_SpecCube1_ProbePosition.xyz)
    + xlv_TEXCOORD8) + (tmpvar_29 *
    min (min (tmpvar_33, tmpvar_34), tmpvar_35)
    )) - tmpvar_36);
    };
    vec4 tmpvar_37;
    tmpvar_37.xyz = worldNormal_28;
    tmpvar_37.w = (pow ((1.0 - _Glossiness), 0.75) * 7.0);
    vec4 tmpvar_38;
    tmpvar_38 = textureCubeLod (unity_SpecCube1, worldNormal_28, tmpvar_37.w);
    tmpvar_13 = mix (((unity_SpecCube1_HDR.x *
    pow (tmpvar_38.w, unity_SpecCube1_HDR.y)
    ) * tmpvar_38.xyz), tmpvar_27, unity_SpecCube0_BoxMin.www);
    } else {
    tmpvar_13 = tmpvar_27;
    };
    tmpvar_13 = (tmpvar_13 * tmpvar_8);
    vec3 viewDir_39;
    viewDir_39 = -(tmpvar_3);
    float tmpvar_40;
    tmpvar_40 = (1.0 - _Glossiness);
    vec3 tmpvar_41;
    tmpvar_41 = normalize((tmpvar_10 + viewDir_39));
    float tmpvar_42;
    tmpvar_42 = max (0.0, dot (tmpvar_2, viewDir_39));
    float tmpvar_43;
    tmpvar_43 = max (0.0, dot (tmpvar_10, tmpvar_41));
    float tmpvar_44;
    tmpvar_44 = ((tmpvar_40 * tmpvar_40) * unity_LightGammaCorrectionConsts.w);
    float tmpvar_45;
    float tmpvar_46;
    tmpvar_46 = (10.0 / log2((
    ((1.0 - tmpvar_40) * 0.968)
    + 0.03)));
    tmpvar_45 = (tmpvar_46 * tmpvar_46);
    float x_47;
    x_47 = (1.0 - tmpvar_11);
    float x_48;
    x_48 = (1.0 - tmpvar_42);
    float tmpvar_49;
    tmpvar_49 = (0.5 + ((
    (2.0 * tmpvar_43)
    * tmpvar_43) * tmpvar_40));
    float x_50;
    x_50 = (1.0 - tmpvar_43);
    float x_51;
    x_51 = (1.0 - tmpvar_42);
    vec3 tmpvar_52;
    tmpvar_52 = (((tmpvar_5 *
    (tmpvar_12 + (tmpvar_9 * ((
    (1.0 + ((tmpvar_49 - 1.0) * ((
    ((x_47 * x_47) * x_47)
    * x_47) * x_47)))
    *
    (1.0 + ((tmpvar_49 - 1.0) * ((
    ((x_48 * x_48) * x_48)
    * x_48) * x_48)))
    ) * tmpvar_11)))
    ) + (
    (max (0.0, ((
    ((1.0/(((
    ((tmpvar_11 * (1.0 - tmpvar_44)) + tmpvar_44)
    *
    ((tmpvar_42 * (1.0 - tmpvar_44)) + tmpvar_44)
    ) + 0.0001))) * (pow (max (0.0,
    dot (tmpvar_2, tmpvar_41)
    ), tmpvar_45) * ((tmpvar_45 + 1.0) * unity_LightGammaCorrectionConsts.y)))
    * tmpvar_11) * unity_LightGammaCorrectionConsts.x)) * tmpvar_9)
    *
    (tmpvar_6 + ((1.0 - tmpvar_6) * ((
    ((x_50 * x_50) * x_50)
    * x_50) * x_50)))
    )) + (tmpvar_13 * mix (tmpvar_6, vec3(
    clamp ((_Glossiness + (1.0 - tmpvar_7)), 0.0, 1.0)
    ), vec3(
    ((((x_51 * x_51) * x_51) * x_51) * x_51)
    ))));
    vec4 tmpvar_53;
    tmpvar_53.w = 1.0;
    tmpvar_53.xyz = tmpvar_52;
    c_1.w = tmpvar_53.w;
    c_1.xyz = tmpvar_52;
    c_1.xyz = c_1.xyz;
    vec4 xlat_varoutput_54;
    xlat_varoutput_54.xyz = c_1.xyz;
    xlat_varoutput_54.w = 1.0;
    gl_FragData[0] = xlat_varoutput_54;
    }


    #endif


    brokenstandardshader.png