Optimizing and Improving a Continuous Scroll Shader

Discussion in 'Shaders' started by dogandream, May 13, 2022.

  1. dogandream

    Hi everyone,

    For a brief intro:
    I work on a mobile match-3 game, for which I developed the shader below. It leans heavily on what I've learned from places like catlikecoding and ronja's shader tutorials, as well as Ben Golus' always insightful answers on the posts here. Our game is played on a 2D board (like any match-3), and one item in the game required a continuous texture scroll coordinated across multiple quads, including corners, which are done by sampling the same texture in polar coordinates. The shader works well enough to have made it into production, but as an intermediate-level shader person, I'm certain there are things to improve and optimize here.

    So one part of the ask in this post is a general critique of the shader. I'm open to performance gains anywhere and everywhere, as well as suggestions that help me develop a better and more nuanced understanding of shader use. I'm also aware of the less-than-ideal branching in a few places, but I had to make do within time budgets.

    The second part of the ask: as the shader below shows, I've found a way to bevel the corners that are sampled in polar coordinates. I did this with a hacky mirroring of the quad across its diagonal, taking the length from a different exponentiated UV pair on each side to get a nice outer bevel. However, this mirroring naturally creates a sharp corner on the inside of the sampled texture. (We have a texture that alpha-fades towards the lateral edges, so the inner edge of the polar sample should have a bevel matching the outer corner.) I was unable to achieve this analytically, so I went with the use-what-works approach.
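
    For reference, the corner logic described above boils down to the following, pulled out of the curved branch of frag() into standalone helpers. This is only a sketch: the function names BeveledCornerRadius and CornerArcLength are hypothetical and do not appear in the shader; uv is assumed to be the quad UV in 0..1 after the optional X flip, and bevel corresponds to _BevelScalar.

    ```hlsl
    // Hypothetical helpers (not in the shader below); same math as the curved branch of frag().

    // Radial coordinate, used as U when sampling the corner.
    float BeveledCornerRadius(float2 uv, float bevel)
    {
        float uPow = pow(uv.x, bevel);
        float vPow = pow(uv.y, bevel);

        // Mirror across the diagonal u = v: on each side, the smaller coordinate
        // is the one that gets exponentiated. With bevel = 1 both sides reduce to
        // length(uv), i.e. a plain quarter circle.
        return (uv.x < uv.y) ? length(float2(uPow, uv.y))
                             : length(float2(uv.x, vPow));
    }

    // Arc length along a quarter circle of radius 0.5, used as V.
    float CornerArcLength(float2 uv)
    {
        return atan2(uv.y, uv.x) * 0.5;
    }
    ```

    The two sides agree in value on the diagonal but not in slope, which is where the sharp inner corner comes from.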

    Lastly, I initially wrote this shader with texture arrays but had to back off because poor implementations on some mobile GPUs caused major frame drops. I then tried to pivot to an atlas-based approach, but I was unable to make the atlased UVs play nice with the fragment shader, so I went back to individual textures. My issue was not with situating the quads in atlas space, but with managing the scroll offsets that are distributed across the entire instanced path when using atlased UVs to achieve a continuous scroll.
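
    To make the atlas problem concrete: a scrolling UV has to wrap inside its own sub-rectangle before it is moved into atlas space, since hardware repeat would wrap across the whole atlas instead. A minimal sketch of that remap, assuming a hypothetical atlasRect parameter (xy = offset, zw = size, in normalized atlas UVs) and a hypothetical helper name, neither of which is part of the shader below:

    ```hlsl
    // Hypothetical helper, not part of the shader below.
    // localUV:   per-texture UV, as used with the individual textures.
    // scroll:    accumulated V offset for this layer.
    // atlasRect: xy = sub-rect offset, zw = sub-rect size (normalized atlas UVs).
    float2 ScrollInAtlas(float2 localUV, float scroll, float4 atlasRect)
    {
        localUV.y -= scroll;      // scroll in local space first
        localUV = frac(localUV);  // wrap manually inside the tile
        return atlasRect.xy + localUV * atlasRect.zw;
    }
    ```

    One caveat with this kind of remap is that frac() makes the UV derivatives jump at the wrap line, which can show up as a seam once mipmaps are involved. Sampling with tex2Dgrad and derivatives taken from the unwrapped UV, plus some padding around each tile, is one common way to deal with that.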

    As this has become rather lengthy, an early thanks to those who find the time to take a gander and comment.

    Code (CSharp):
    Shader "Royal/TextureScrollShader"
    {
        Properties
        {
            // Texture Array and Scroll Data
            [HideInInspector] _TextureGroupCount ("Texture Group Count", int) = 0
            [HideInInspector] _IsCurved ("Is Curved", int) = 0
            [HideInInspector] _XFlip ("X Flip", int) = 0
            [HideInInspector] _VOffsetMultiplier ("V Offset", Vector) = (0., 0., 0., 0.)
            [HideInInspector] _FlowSpeedArray ("Scroll Speeds", Vector) = (0., 0., 0., 0.)
            [HideInInspector] _FlowDirectionArray ("Flow Directions", Vector) = (0., 0., 0., 0.)
            [HideInInspector] _FramesPerSecond ("Frames Per Second", Vector) = (0., 0., 0., 0.)
            [HideInInspector] _RandomTimeOffset ("Random Time Offset", Float) = 0.

            // texture atlas groups
            [HideInInspector] _TextureAtlas1 ("Texture Atlas 1", 2D) = "white" {}
            [HideInInspector] _TextureAtlas2 ("Texture Atlas 2", 2D) = "white" {}
            [HideInInspector] _TextureAtlas3 ("Texture Atlas 3", 2D) = "white" {}
            [HideInInspector] _TextureAtlas4 ("Texture Atlas 4", 2D) = "white" {}
            [HideInInspector] _TextureSampleLengths ("Texture Atlas Data 1", Vector) = (0., 0., 0., 0.)

            // UV scalar for rounded corners
            _BevelScalar ("Bevel Scalar", Range(1., 19.)) = 1.

            // Head Mask and Move
            [NoScaleOffset] _MagnetMaskTex ("Magnet Mask Texture", 2D) = "white" {}
            [HideInInspector] _ScrollFadeStartTime ("Head is moving", Float) = 0.
            [HideInInspector] _FadeCell ("Fade Cell", Float) = 0.

            // Pathing
            [Toggle(PATH)] _PathingEnabled ("Enable Pathing", Float) = 0.
            [HideInInspector] _MyPathIndex ("My Path Index", Float) = 0
            [HideInInspector] _PathStartOffset ("Path Start Offset", Float) = 0.
            [HideInInspector] _PathLength ("Path Length", Float) = 0
            [HideInInspector] _PathDataA ("Path Data A", Vector) = (0., 0., 0., 0.)
            [HideInInspector] _PathDataB ("Path Data B", Vector) = (0., 0., 0., 0.)
            [HideInInspector] _PathDataC ("Path Data C", Vector) = (0., 0., 0., 0.)
            [HideInInspector] _PathDataD ("Path Data D", Vector) = (0., 0., 0., 0.)

            // FlowMap
            [Toggle(FLOWMAP)] _FlowMapEnabled ("Enable Flow Map", Float) = 0.
            [NoScaleOffset] _FlowMap ("Flow Map", 2D) = "black" {}
            _MinFlowThreshold ("Minimum Flow Threshold", Float) = 0.
        }
        SubShader
        {
            Tags
            {
                "Queue"="Transparent"
                "IgnoreProjector"="True"
                "RenderType"="Transparent"
                "PreviewType"="Quad"
                "CanUseSpriteAtlas"="True"
            }

            Lighting Off
            ZWrite Off
            Blend One OneMinusSrcAlpha

            Pass
            {
                CGPROGRAM
                #pragma vertex vert
                #pragma fragment frag
                #pragma shader_feature CURVE
                #pragma shader_feature FLOWMAP
                #pragma shader_feature PATH
                #pragma multi_compile_instancing

                #include "UnityCG.cginc"
                #include "hash.hlsl"

                struct appdata
                {
                    float4 vertex : POSITION;
                    float2 uv : TEXCOORD0;
                    UNITY_VERTEX_INPUT_INSTANCE_ID
                };

                struct v2f
                {
                    float2 uv : TEXCOORD0;
                    float4 vertex : SV_POSITION;
                    UNITY_VERTEX_INPUT_INSTANCE_ID
                };

                sampler2D _FlowMap;
                float4 _FlowMap_ST;
                float _MinFlowThreshold;

                fixed _TextureGroupCount;
                sampler2D _TextureAtlas1, _TextureAtlas2, _TextureAtlas3, _TextureAtlas4, _MagnetMaskTex;
                fixed4 _TextureSampleLengths;

                UNITY_INSTANCING_BUFFER_START(UnityPerMaterial)
                    UNITY_DEFINE_INSTANCED_PROP(fixed2, _VOffsetMultiplier)
                    UNITY_DEFINE_INSTANCED_PROP(float4, _FlowSpeedArray)
                    UNITY_DEFINE_INSTANCED_PROP(float4, _FlowDirectionArray)
                    UNITY_DEFINE_INSTANCED_PROP(fixed4, _FramesPerSecond)
                    UNITY_DEFINE_INSTANCED_PROP(fixed, _XFlip)
                    UNITY_DEFINE_INSTANCED_PROP(fixed, _IsCurved)
                    UNITY_DEFINE_INSTANCED_PROP(fixed, _BevelScalar)
                    UNITY_DEFINE_INSTANCED_PROP(fixed, _MyPathIndex)
                    UNITY_DEFINE_INSTANCED_PROP(fixed, _PathStartOffset)
                    UNITY_DEFINE_INSTANCED_PROP(fixed, _PathLength)
                    UNITY_DEFINE_INSTANCED_PROP(float, _ScrollFadeStartTime)
                    UNITY_DEFINE_INSTANCED_PROP(fixed, _FadeCell)
                    UNITY_DEFINE_INSTANCED_PROP(fixed, _RandomTimeOffset)

                    #if defined(PATH)
                        UNITY_DEFINE_INSTANCED_PROP(float4, _PathDataA)
                        UNITY_DEFINE_INSTANCED_PROP(float4, _PathDataB)
                        UNITY_DEFINE_INSTANCED_PROP(float4, _PathDataC)
                        UNITY_DEFINE_INSTANCED_PROP(float4, _PathDataD)
                    #endif

                UNITY_INSTANCING_BUFFER_END(UnityPerMaterial)

                // Flow-map distortion: xy is the offset UV, z is a triangle-wave
                // blend weight. flowB selects the second, half-phase-shifted sample.
                inline float3 FlowUVW (float2 uv, float2 flowVector, float time, float flowB)
                {
                    float phaseOffset = flowB / 2;
                    float progress = frac(time + phaseOffset);
                    float3 uvw;
                    uvw.xy = uv - flowVector * progress;
                    uvw.z = 1 - abs(1 - 2 * progress);
                    return uvw;
                }

                // Where value sits between from and to, as a 0..1 fraction (unclamped).
                float invLerp(float from, float to, float value)
                {
                    return (value - from) / (to - from);
                }

                // Maps value from the range [origFrom, origTo] to [targetFrom, targetTo].
                float remap(float origFrom, float origTo, float targetFrom, float targetTo, float value)
                {
                    float rel = invLerp(origFrom, origTo, value);
                    return lerp(targetFrom, targetTo, rel);
                }

                // Clamps value to [clipMin, clipMax].
                float ClipInBounds(float clipMin, float clipMax, float value)
                {
                    return max(clipMin, min(value, clipMax));
                }

                v2f vert (appdata v)
                {
                    v2f o;
                    UNITY_SETUP_INSTANCE_ID(v);
                    UNITY_TRANSFER_INSTANCE_ID(v,o);
                    o.vertex = UnityObjectToClipPos(v.vertex);
                    o.uv = v.uv;

                    return o;
                }

                fixed4 frag(v2f i) : SV_Target
                {
                    const float QuarterArcLength = 1.775; // V distance is greater than 1 on arc
                    const float ReverseDirectionSamplingOffset = 1.215;
                    const float MagnetHeadAnimationDuration = 0.26;
                    const float MagnetHeadRotationAnimationOffset = 0.025;
                    UNITY_SETUP_INSTANCE_ID(i);

                    // Set up instanced variables for iteration
                    float sampleLengthsArray[4] = { _TextureSampleLengths.x, _TextureSampleLengths.y, _TextureSampleLengths.z, _TextureSampleLengths.w };

                    fixed2 vOffsetMultiplier = UNITY_ACCESS_INSTANCED_PROP(UnityPerMaterial, _VOffsetMultiplier);
                    float4 scrollSpeeds = UNITY_ACCESS_INSTANCED_PROP(UnityPerMaterial, _FlowSpeedArray);
                    float scrollSpeedArray[4] = { scrollSpeeds.x, scrollSpeeds.y, scrollSpeeds.z, scrollSpeeds.w };
                    float4 flowDirections = UNITY_ACCESS_INSTANCED_PROP(UnityPerMaterial, _FlowDirectionArray);
                    float flowDirectionArray[4] = { flowDirections.x, flowDirections.y, flowDirections.z, flowDirections.w };

                    fixed isCurved = UNITY_ACCESS_INSTANCED_PROP(UnityPerMaterial, _IsCurved);
                    fixed xFlip = UNITY_ACCESS_INSTANCED_PROP(UnityPerMaterial, _XFlip);
                    fixed bevelScalar = UNITY_ACCESS_INSTANCED_PROP(UnityPerMaterial, _BevelScalar);

                    fixed myPathIndex = UNITY_ACCESS_INSTANCED_PROP(UnityPerMaterial, _MyPathIndex);
                    fixed pathLength = UNITY_ACCESS_INSTANCED_PROP(UnityPerMaterial, _PathLength);
                    fixed pathStartOffset = UNITY_ACCESS_INSTANCED_PROP(UnityPerMaterial, _PathStartOffset);
                    float randomTimeOffset = UNITY_ACCESS_INSTANCED_PROP(UnityPerMaterial, _RandomTimeOffset);
                    #if defined(PATH)
                        // Per texture group: x = length, y = speed, z = size, w = randomization
                        fixed4 pathDataA = UNITY_ACCESS_INSTANCED_PROP(UnityPerMaterial, _PathDataA);
                        fixed4 pathDataB = UNITY_ACCESS_INSTANCED_PROP(UnityPerMaterial, _PathDataB);
                        fixed4 pathDataC = UNITY_ACCESS_INSTANCED_PROP(UnityPerMaterial, _PathDataC);
                        fixed4 pathDataD = UNITY_ACCESS_INSTANCED_PROP(UnityPerMaterial, _PathDataD);
                        fixed pathLengthArray[4] = { pathDataA.x, pathDataB.x, pathDataC.x, pathDataD.x};
                        fixed pathSpeedArray[4] = { pathDataA.y, pathDataB.y, pathDataC.y, pathDataD.y};
                        fixed pathSizeArray[4] = { pathDataA.z, pathDataB.z, pathDataC.z, pathDataD.z};
                        fixed pathRandomizationArray[4] = { pathDataA.w, pathDataB.w, pathDataC.w, pathDataD.w};
                    #endif

                    float len = 0.;
                    float curveAlpha = 0.;
                    float arclen = 0.;
                    fixed flip = 1;

                    // Corner quads: convert the quad UV to polar-style coordinates
                    if(isCurved == 1)
                    {
                        if(xFlip == 1)
                        {
                            i.uv.x = 1 - i.uv.x;
                            flip = -1;
                        }

                        float uSqrd = pow(i.uv.x, bevelScalar);
                        float vSqrd = pow(i.uv.y, bevelScalar);

                        // Bevel: mirror across the diagonal and exponentiate the smaller coordinate
                        len = i.uv.x - i.uv.y < 0. ? length(float2(uSqrd, i.uv.y)) : length(float2(i.uv.x, vSqrd));
                        curveAlpha = 1 - step(1, len);

                        float r = 0.5;
                        float theta = atan2(i.uv.y, i.uv.x);
                        arclen = theta * r;
                    }

                    // Branchless select between the curved and straight UVs
                    fixed4 col = fixed4(0.,0.,0.,0.);
                    float2 baseUV;
                    baseUV.x = isCurved * len * flip + (1 - isCurved) * i.uv.x;
                    baseUV.y = isCurved * arclen + (1 - isCurved) * i.uv.y;
                    baseUV.y *= isCurved * ReverseDirectionSamplingOffset + (1 - isCurved) * 1.;

                    // Accumulate one scrolling layer per texture group
                    for(int j = 0; j < _TextureGroupCount; j++)
                    {
                        sampler2D textureAtlasArray[4] = { _TextureAtlas1, _TextureAtlas2, _TextureAtlas3, _TextureAtlas4 };
                        float timeOffset = fmod(_Time.y * scrollSpeedArray[j], 1.) * flowDirectionArray[j];
                        float2 uv = baseUV;

                        #if defined(PATH)
                            // Named segmentLength so the length() intrinsic is not shadowed
                            float segmentLength = pathLengthArray[j];
                            float speed = pathSpeedArray[j];
                            float size = pathSizeArray[j];
                            float deviation = pathRandomizationArray[j];
                            float fadeThreshold = size * 0.2;

                            float timedHash = hash11(floor(_Time.y / segmentLength * speed));
                            float direction = step(0.5, frac(timedHash)) * 2 - 1;
                            float start = fmod(floor(timedHash * 1000.), pathLength);

                            // Discard cells outside the [start, end] stretch of the path
                            clip((myPathIndex - start) * direction);
                            float end = start + segmentLength * direction;
                            end = max(0, min(end, pathLength));
                            clip((end - myPathIndex) * direction);

                            float current = start + fmod(_Time.y * speed, segmentLength) * direction;
                            float currentEnd = current + size * direction;

                            // Discard fragments outside the moving [current, currentEnd] window
                            float myPathAlphaThreshold = myPathIndex + uv.y;
                            clip((myPathAlphaThreshold - current) * direction);
                            clip((currentEnd - myPathAlphaThreshold) * direction);

                            // Soft fade at both ends of the window
                            float pathMaskAlpha = 1.;
                            pathMaskAlpha *= smoothstep(current, current + fadeThreshold * direction, myPathAlphaThreshold);
                            pathMaskAlpha *= 1 - smoothstep(currentEnd - fadeThreshold * direction, currentEnd, myPathAlphaThreshold);
                        #endif


                        #if !defined(PATH)
                        // Offset V by the cell's position on the path so the scroll is continuous across quads
                        uv.y = (myPathIndex + pathStartOffset + uv.y) / sampleLengthsArray[j];
                        #endif

                        uv.y -= timeOffset + randomTimeOffset;

                        #if defined(FLOWMAP)
                            // Single fetch: rg = flow vector, a = noise
                            float4 flowSample = tex2D(_FlowMap, baseUV);
                            float2 flowVector = flowSample.rg * 2 - 1;
                            flowVector *= _MinFlowThreshold;
                            float noise = flowSample.a;
                            float time = _Time.y + noise;
                            float3 uvwA = FlowUVW(uv, flowVector, time, 0);
                            float3 uvwB = FlowUVW(uv, flowVector, time, 1);
                            fixed4 sampleA = tex2D(textureAtlasArray[j], uvwA.xy) * uvwA.z;
                            fixed4 sampleB = tex2D(textureAtlasArray[j], uvwB.xy) * uvwB.z;
                            fixed4 layerSample = sampleA + sampleB;
                        #else
                            fixed4 layerSample = tex2D(textureAtlasArray[j], uv);
                        #endif

                        #if defined(PATH)
                            layerSample *= pathMaskAlpha;
                        #endif

                        layerSample *= layerSample.a;

                        col += layerSample;
                    }

                    // Cut the corner to its arc on curved quads; straight quads pass through
                    half isCurvedFlag = 1 == isCurved;
                    col *= curveAlpha + (1 - isCurvedFlag) * (1 - curveAlpha);
                    const half myPathZeroIndexFlag = myPathIndex == 0;
                    const half myPathLengthIndexFlag = myPathIndex == pathLength - 1 && myPathIndex != 0;

                    // Head mask applies only to the first and last cell; UVs are flipped on the last cell
                    fixed maskUvFlip = -2 * myPathLengthIndexFlag + 1;
                    float smoothStepValue = smoothstep(0.495, 0.499, tex2D(_MagnetMaskTex, myPathLengthIndexFlag + maskUvFlip * i.uv).r);
                    col *= smoothStepValue + (1 - saturate(myPathZeroIndexFlag + myPathLengthIndexFlag)) * (1 - smoothStepValue);

                    fixed fadeCell = UNITY_ACCESS_INSTANCED_PROP(UnityPerMaterial, _FadeCell);
                    float scrollFadeStartTime = UNITY_ACCESS_INSTANCED_PROP(UnityPerMaterial, _ScrollFadeStartTime);

                    // fadeCell == 0 disables the fade; a negative value reverses its direction
                    half fadeNotZero = fadeCell != 0;
                    half fadeCellStep = fadeCell != 0 && step(0, fadeCell) == 0;
                    fixed fadeFlip = -2 * fadeCellStep + 1;

                    float timerFmod = fmod((_Time.y - scrollFadeStartTime) / (MagnetHeadAnimationDuration + isCurved * MagnetHeadRotationAnimationOffset), 1.);
                    float timer = fadeCellStep + fadeFlip * timerFmod;

                    float fadeSmoothStep = smoothstep(timer, timer + 0.1, baseUV.y);
                    float fadeMultiplier = fadeCellStep + fadeFlip * fadeSmoothStep;

                    col *= fadeMultiplier + (1 - fadeNotZero) * (1 - fadeMultiplier);

                    return col;
                }
                ENDCG
            }
        }
    }

    I've attached an image of the shader in use.

    Thanks again,
    Dogan
     
