2D / 3D / 4D optimised Perlin Noise Cg/HLSL library (cginc)

Discussion in 'Shaders' started by Lex-DRL, Dec 18, 2013.

  1. Lex-DRL

    Hey guys. :)

    As many of you know, HLSL's noise() function is supposed to return Perlin noise. In practice it simply doesn't work, because GPU vendors don't support it on the average graphics card.
    But I absolutely needed it for the shader I'm currently developing. So...

    I've just finished porting the awesome simplex noise library to make it compatible with Unity's CgFx/HLSL. Basically, it's the Perlin noise we're all so familiar with, but optimized to be as fast as possible for real-time GPU rendering.
    It doesn't need any pre-computed arrays, textures or any other kind of external resources.
    It supports 2D, 3D and 4D input vectors.

    Here's a screenshot:


    And, as you might guess, I'm not here just to brag about it. ;)
    Without further ado, here's the archive:
    View attachment $noise-simplex.7z

    It contains the cginc itself and a test shader which generates animated 3D noise in world space.
    Enjoy!

    Usage is as simple as possible:
    1. Include the library:
      Code (csharp):
      #include "noiseSimplex.cginc"
    2. Then, in your fragment or vertex shader, call the function:
      Code (csharp):
      float ns = snoise(v);
      v is a float vector of any dimension (float2 / float3 / float4). The returned result is a float scalar in the [-1, 1] range.
    3. Because of the number of operations, you need at least shader model 3.0.
      So don't forget to add the following line right after CGPROGRAM:
      Code (csharp):
      #pragma target 3.0
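
    A minimal sketch of how the three steps above might fit together in a complete Unity shader. This is not the test shader from the archive: the shader name, the `_Scale` and `_Speed` properties and the `v2f` layout are assumptions made up for illustration. It uses `UnityObjectToClipPos` and `unity_ObjectToWorld`, which are the names in recent Unity versions; older versions spell these `mul(UNITY_MATRIX_MVP, v.vertex)` and `_Object2World`.

    ```hlsl
    Shader "Hypothetical/SimplexNoiseDemo"
    {
        Properties
        {
            _Scale ("Noise scale", Float) = 2.0      // assumed parameter
            _Speed ("Scroll speed", Float) = 0.5     // assumed parameter
        }
        SubShader
        {
            Pass
            {
                CGPROGRAM
                #pragma target 3.0
                #pragma vertex vert
                #pragma fragment frag
                #include "UnityCG.cginc"
                #include "noiseSimplex.cginc"

                float _Scale;
                float _Speed;

                struct v2f
                {
                    float4 pos      : SV_POSITION;
                    float3 worldPos : TEXCOORD0;
                };

                v2f vert(appdata_base v)
                {
                    v2f o;
                    o.pos = UnityObjectToClipPos(v.vertex);
                    o.worldPos = mul(unity_ObjectToWorld, v.vertex).xyz;
                    return o;
                }

                fixed4 frag(v2f i) : SV_Target
                {
                    // Sample 3D noise in world space and scroll it along Y over time.
                    float3 p = i.worldPos * _Scale;
                    p.y += _Time.y * _Speed;

                    float ns = snoise(p);      // roughly [-1, 1]
                    ns = ns * 0.5 + 0.5;       // remap to [0, 1] for display
                    return fixed4(ns, ns, ns, 1.0);
                }
                ENDCG
            }
        }
    }
    ```

    The remap on the last lines is only there because a colour output cannot show negative values; use the raw result if you need signed noise.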
     
    Last edited: Dec 18, 2013
  2. reefwirrax

    Hi there, that's very nice. I feel I should also mention that there is a version of improved Perlin noise which is very efficient on the GPU. You can get the code at www.scrawkblog.com
     
  3. Lex-DRL

    Yeah, I've seen this blog, too. And I tried the "improved perlin noise" from it.

    Maybe his system is even faster than the library I posted above. But it's also more complex. Much more.

    I personally just couldn't figure out how exactly it works.
    There's a tricky mix of shaders and scripts. As a shader writer, I was already confused by the lack of any parameters in the shaders. Yes, there are example scenes which show how to apply these shader/script pairs. But then I noticed that changing "seed" does absolutely nothing. And it's completely unclear how exactly the script interacts with the shader.
    That means that each time I need noise in a shader, I'll have to ask our programmer to write some scripts for me. And even if these scripts are pretty simple, it's usually more difficult to explain what exactly I need. And it's still extra work.

    So yes, I believe that guy's noise system could be much more performance-efficient. But from the production side, for my current task it just isn't worth it.
     
  4. reefwirrax

    Oh brilliant, you ported the Gustavson version to the GPU? Nice one. Yes, simplex noise is great because it's all in one file. There's some confusion about whether simplex noise and improved noise are the same function. Many people think simplex is a less advanced, lower-quality version of old Perlin noise because it sounds like a "simple" version, when in fact it gives better quality and is 2x faster. Perhaps improved noise is a sleepy version that was rewritten and is allegedly faster, and just has a name that gets mixed up with simplex noise! Argh! Perhaps the difference is in "Branches, LUT sizes, memory patterns, var numbers, live variables". Your version is probably faster, though, because it's all in Cg, and it's also the only one available so far. I think the Scrawk version has code for the Perlin arrays, code for passing the arrays to the GPU, and Cg code for the maths, so 5-7 different files. It may be possible to merge them into 2. I think it was written by a Cg book author. You could do a 20 MB test to compare speeds if you had to. I think there is also a simplex noise by J gotlen for Unity Cg, lol. People have compared value noise and gradient noise from libnoise: gradient noise is slower and rounder. It's pretty good after programming synthesizers, because synths are all 2D noise and 3D is all 3D noise :D
     
    Last edited: Dec 29, 2013
  5. Dolkar

    Just noticed this... and I noticed there's room for optimization there. You're not using the built-in Cg functions that are optimized by GPU manufacturers, but implementing your own instead...

    Code (csharp):
    mod289(x) = fmod(x, 289.0)
    taylorInvSqrt(x) = rsqrt(x)

    Code (csharp):
    // GLSL: lessThan(x, y) = x < y
    // HLSL: 1 - step(y, x) = x < y
    s = float4(
        1 - step(0.0, p)
    );
    p.xyz = p.xyz + (s.xyz * 2 - 1) * s.www;

    // Which is equivalent to:
    p.xyz -= sign(p.xyz) * (p.w < 0);

    Code (csharp):
    // float2 i1 = (x0.x > x0.y) ? float2(1.0, 0.0) : float2(0.0, 1.0);
    // Lex-DRL: afaik, step() in GPU is faster than if(), so:
    // step(x, y) = x <= y
    int xLessEqual = step(x0.x, x0.y); // x <= y ?
    int2 i1 =
        int2(1, 0) * (1 - xLessEqual) // x > y
        + int2(0, 1) * xLessEqual // x <= y
    ;
    float4 x12 = x0.xyxy + C.xxzz;
    x12.xy -= i1;

    // Actually, a simple conditional without branching is faster than that madness :)
    float4 x12 = x0.xyxy + C.xxzz;
    x12.xy -= (x0.x > x0.y) ? float2(1.0, 0.0) : float2(0.0, 1.0);

    ... and I just ran out of time, but you see what I mean...
     
  6. Lex-DRL

    reefwirrax,
    one important remark: it's not my version of noise. I only ported it from GLSL to Cg/HLSL. All the credits are in the file itself.

    Dolkar,
    I'm a newbie in Cg shading.
    I have quite a lot of experience writing RenderMan shaders, but some real-time shading approaches are still new to me. So yes, there may be a lot of room for optimization... especially considering that the author of the original GLSL lib admitted he didn't spend too much time optimizing it.

    Feel free to provide patches if you want... or, if you think it's necessary, I can create a project on GitHub.
     
  7. NavyFish

    For archival purposes, here's the same code with Dolkar's optimizations incorporated - with the exception of taylorInvSqrt, which to my understanding does not actually equate to rsqrt.
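
    To illustrate why the two are not interchangeable, here is a small sketch. The constants in `taylorInvSqrt` work out to the tangent line of r^(-1/2) at r = 0.7, so the function only matches the true reciprocal square root near that value. The names `invSqrtExact` and `invSqrtTangent` are hypothetical and used only for this comparison.

    ```hlsl
    // Exact reciprocal square root via the hardware intrinsic.
    float invSqrtExact(float r)
    {
        return rsqrt(r);
    }

    // Same arithmetic as taylorInvSqrt in the posted code.
    // Tangent line to f(r) = r^(-1/2) at r0 = 0.7:
    //   f(r0)  =  0.7^(-1/2)         ~=  1.19522861
    //   f'(r0) = -0.5 * 0.7^(-3/2)   ~= -0.85373472
    //   f(r0) + f'(r0) * (r - r0)    ~=  1.79284291 - 0.85373472 * r
    float invSqrtTangent(float r)
    {
        return 1.79284291400159 - 0.85373472095314 * r;
    }

    // r = 0.7 : exact ~= 1.1952, tangent ~= 1.1952
    // r = 1.0 : exact  = 1.0000, tangent ~= 0.9391
    // r = 0.5 : exact ~= 1.4142, tangent ~= 1.3660
    ```

    Swapping in `rsqrt` would therefore change the numeric output of the noise slightly, on top of any speed difference.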

    FWIW, on a GTX 580M, running a compute shader that uses 30 octaves of simplex noise (way overkill, but fine for a benchmark) on 36 batches of 224x224 vertices per frame (1.8 million verts in total), I saw an increase of about 1 FPS with the optimizations, from 38.5 to 39.5.
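
    The benchmark shader itself isn't shown in this thread, so the following is only a sketch of what "octaves" of noise usually means: a fractal sum in which each layer has a higher frequency and a lower amplitude. The helper name `fbm`, the frequency doubling and the amplitude halving are assumptions, not details taken from the benchmark.

    ```hlsl
    #include "noiseSimplex.cginc"

    // Hypothetical helper: sums `octaves` layers of 3D simplex noise.
    float fbm(float3 p, int octaves)
    {
        float sum = 0.0;
        float amplitude = 0.5;   // assumed starting weight
        float frequency = 1.0;

        for (int o = 0; o < octaves; o++)
        {
            sum += amplitude * snoise(p * frequency);
            frequency *= 2.0;    // assumed lacunarity
            amplitude *= 0.5;    // assumed gain
        }
        return sum;              // stays within roughly [-1, 1]
    }
    ```

    Each octave costs one full `snoise` call, which is why 30 octaves is a heavy load. On targets that cannot loop over a runtime count, `octaves` may need to be a compile-time constant.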

    Code (CSharp):
    #ifndef NOISE_SIMPLEX_FUNC
    #define NOISE_SIMPLEX_FUNC
    /*

    Description:
        Array- and textureless CgFx/HLSL 2D, 3D and 4D simplex noise functions.
        a.k.a. simplified and optimized Perlin noise.

        The functions have very good performance
        and no dependencies on external data.

        2D - Very fast, very compact code.
        3D - Fast, compact code.
        4D - Reasonably fast, reasonably compact code.

    ------------------------------------------------------------------

    Ported by:
        Lex-DRL
        I've ported the code from GLSL to CgFx/HLSL for Unity,
        added a couple more optimisations (to speed it up even further)
        and slightly reformatted the code to make it more readable.

    Original GLSL functions:
        https://github.com/ashima/webgl-noise
        Credits from original glsl file are at the end of this cginc.

    ------------------------------------------------------------------

    Usage:

        float ns = snoise(v);
        // v is any of: float2, float3, float4

        Return type is float.
        To generate 2 or more components of noise (colorful noise),
        call these functions several times with different
        constant offsets for the arguments.
        E.g.:

        float3 colorNs = float3(
            snoise(v),
            snoise(v + 17.0),
            snoise(v - 43.0)
        );


    Remark about those offsets from the original author:

        People have different opinions on whether these offsets should be integers
        for the classic noise functions to match the spacing of the zeroes,
        so we have left that for you to decide for yourself.
        For most applications, the exact offsets don't really matter as long
        as they are not too small or too close to the noise lattice period
        (289 in this implementation).

    */

    // 1 / 289
    #define NOISE_SIMPLEX_1_DIV_289 0.00346020761245674740484429065744f


    // ( x*34.0 + 1.0 )*x =
    // x*x*34.0 + x
    float permute(float x) {
        return fmod(
            x*x*34.0 + x,
            289.0
        );
    }

    float3 permute(float3 x) {
        return fmod(
            x*x*34.0 + x,
            289.0
        );
    }

    float4 permute(float4 x) {
        return fmod(
            x*x*34.0 + x,
            289.0
        );
    }



    float taylorInvSqrt(float r) {
        return 1.79284291400159 - 0.85373472095314 * r;
    }

    float4 taylorInvSqrt(float4 r) {
        return 1.79284291400159 - 0.85373472095314 * r;
    }



    float4 grad4(float j, float4 ip)
    {
        const float4 ones = float4(1.0, 1.0, 1.0, -1.0);
        float4 p, s;
        p.xyz = floor( frac(j * ip.xyz) * 7.0) * ip.z - 1.0;
        p.w = 1.5 - dot( abs(p.xyz), ones.xyz );

        // GLSL: lessThan(x, y) = x < y
        // HLSL: 1 - step(y, x) = x < y

        //s = float4(
        //    1 - step(0.0, p)
        //);
        //p.xyz = p.xyz + (s.xyz * 2 - 1) * s.www;

        p.xyz -= sign(p.xyz) * (p.w < 0);

        return p;
    }



    // ----------------------------------- 2D -------------------------------------

    float snoise(float2 v)
    {
        const float4 C = float4(
            0.211324865405187, // (3.0-sqrt(3.0))/6.0
            0.366025403784439, // 0.5*(sqrt(3.0)-1.0)
            -0.577350269189626, // -1.0 + 2.0 * C.x
            0.024390243902439  // 1.0 / 41.0
        );

    // First corner
        float2 i = floor( v + dot(v, C.yy) );
        float2 x0 = v - i + dot(i, C.xx);

    // Other corners
        // float2 i1 = (x0.x > x0.y) ? float2(1.0, 0.0) : float2(0.0, 1.0);
        // Lex-DRL: afaik, step() in GPU is faster than if(), so:
        // step(x, y) = x <= y

        //int xLessEqual = step(x0.x, x0.y); // x <= y ?
        //int2 i1 =
        //    int2(1, 0) * (1 - xLessEqual) // x > y
        //    + int2(0, 1) * xLessEqual // x <= y
        //;
        //float4 x12 = x0.xyxy + C.xxzz;
        //x12.xy -= i1;

        float4 x12 = x0.xyxy + C.xxzz;
        int2 i1 = (x0.x > x0.y) ? float2(1.0, 0.0) : float2(0.0, 1.0);
        x12.xy -= i1;

    // Permutations
        i = fmod(i,289.0); // Avoid truncation effects in permutation
        float3 p = permute(
            permute(
                    i.y + float3(0.0, i1.y, 1.0 )
            ) + i.x + float3(0.0, i1.x, 1.0 )
        );

        float3 m = max(
            0.5 - float3(
                dot(x0, x0),
                dot(x12.xy, x12.xy),
                dot(x12.zw, x12.zw)
            ),
            0.0
        );
        m = m*m ;
        m = m*m ;

    // Gradients: 41 points uniformly over a line, mapped onto a diamond.
    // The ring size 17*17 = 289 is close to a multiple of 41 (41*7 = 287)

        float3 x = 2.0 * frac(p * C.www) - 1.0;
        float3 h = abs(x) - 0.5;
        float3 ox = floor(x + 0.5);
        float3 a0 = x - ox;

    // Normalise gradients implicitly by scaling m
    // Approximation of: m *= inversesqrt( a0*a0 + h*h );
        m *= 1.79284291400159 - 0.85373472095314 * ( a0*a0 + h*h );

    // Compute final noise value at P
        float3 g;
        g.x = a0.x * x0.x + h.x * x0.y;
        g.yz = a0.yz * x12.xz + h.yz * x12.yw;
        return 130.0 * dot(m, g);
    }

    // ----------------------------------- 3D -------------------------------------

    float snoise(float3 v)
    {
        const float2 C = float2(
            0.166666666666666667, // 1/6
            0.333333333333333333  // 1/3
        );
        const float4 D = float4(0.0, 0.5, 1.0, 2.0);

    // First corner
        float3 i = floor( v + dot(v, C.yyy) );
        float3 x0 = v - i + dot(i, C.xxx);

    // Other corners
        float3 g = step(x0.yzx, x0.xyz);
        float3 l = 1 - g;
        float3 i1 = min(g.xyz, l.zxy);
        float3 i2 = max(g.xyz, l.zxy);

        float3 x1 = x0 - i1 + C.xxx;
        float3 x2 = x0 - i2 + C.yyy; // 2.0*C.x = 1/3 = C.y
        float3 x3 = x0 - D.yyy;      // -1.0+3.0*C.x = -0.5 = -D.y

    // Permutations
        i = fmod(i,289.0);
        float4 p = permute(
            permute(
                permute(
                        i.z + float4(0.0, i1.z, i2.z, 1.0 )
                ) + i.y + float4(0.0, i1.y, i2.y, 1.0 )
            )     + i.x + float4(0.0, i1.x, i2.x, 1.0 )
        );

    // Gradients: 7x7 points over a square, mapped onto an octahedron.
    // The ring size 17*17 = 289 is close to a multiple of 49 (49*6 = 294)
        float n_ = 0.142857142857; // 1/7
        float3 ns = n_ * D.wyz - D.xzx;

        float4 j = p - 49.0 * floor(p * ns.z * ns.z); // mod(p,7*7)

        float4 x_ = floor(j * ns.z);
        float4 y_ = floor(j - 7.0 * x_ ); // mod(j,N)

        float4 x = x_ *ns.x + ns.yyyy;
        float4 y = y_ *ns.x + ns.yyyy;
        float4 h = 1.0 - abs(x) - abs(y);

        float4 b0 = float4( x.xy, y.xy );
        float4 b1 = float4( x.zw, y.zw );

        //float4 s0 = float4(lessThan(b0,0.0))*2.0 - 1.0;
        //float4 s1 = float4(lessThan(b1,0.0))*2.0 - 1.0;
        float4 s0 = floor(b0)*2.0 + 1.0;
        float4 s1 = floor(b1)*2.0 + 1.0;
        float4 sh = -step(h, 0.0);

        float4 a0 = b0.xzyw + s0.xzyw*sh.xxyy ;
        float4 a1 = b1.xzyw + s1.xzyw*sh.zzww ;

        float3 p0 = float3(a0.xy,h.x);
        float3 p1 = float3(a0.zw,h.y);
        float3 p2 = float3(a1.xy,h.z);
        float3 p3 = float3(a1.zw,h.w);

    //Normalise gradients
        float4 norm = taylorInvSqrt(float4(
            dot(p0, p0),
            dot(p1, p1),
            dot(p2, p2),
            dot(p3, p3)
        ));
        p0 *= norm.x;
        p1 *= norm.y;
        p2 *= norm.z;
        p3 *= norm.w;

    // Mix final noise value
        float4 m = max(
            0.6 - float4(
                dot(x0, x0),
                dot(x1, x1),
                dot(x2, x2),
                dot(x3, x3)
            ),
            0.0
        );
        m = m * m;
        return 42.0 * dot(
            m*m,
            float4(
                dot(p0, x0),
                dot(p1, x1),
                dot(p2, x2),
                dot(p3, x3)
            )
        );
    }

    // ----------------------------------- 4D -------------------------------------

    float snoise(float4 v)
    {
        const float4 C = float4(
            0.138196601125011, // (5 - sqrt(5))/20 G4
            0.276393202250021, // 2 * G4
            0.414589803375032, // 3 * G4
            -0.447213595499958  // -1 + 4 * G4
        );

    // First corner
        float4 i = floor(
            v +
            dot(
                v,
                0.309016994374947451 // (sqrt(5) - 1) / 4
            )
        );
        float4 x0 = v - i + dot(i, C.xxxx);

    // Other corners

    // Rank sorting originally contributed by Bill Licea-Kane, AMD (formerly ATI)
        float4 i0;
        float3 isX = step( x0.yzw, x0.xxx );
        float3 isYZ = step( x0.zww, x0.yyz );
        i0.x = isX.x + isX.y + isX.z;
        i0.yzw = 1.0 - isX;
        i0.y += isYZ.x + isYZ.y;
        i0.zw += 1.0 - isYZ.xy;
        i0.z += isYZ.z;
        i0.w += 1.0 - isYZ.z;

        // i0 now contains the unique values 0,1,2,3 in each channel
        float4 i3 = saturate(i0);
        float4 i2 = saturate(i0-1.0);
        float4 i1 = saturate(i0-2.0);

        //    x0 = x0 - 0.0 + 0.0 * C.xxxx
        //    x1 = x0 - i1  + 1.0 * C.xxxx
        //    x2 = x0 - i2  + 2.0 * C.xxxx
        //    x3 = x0 - i3  + 3.0 * C.xxxx
        //    x4 = x0 - 1.0 + 4.0 * C.xxxx
        float4 x1 = x0 - i1 + C.xxxx;
        float4 x2 = x0 - i2 + C.yyyy;
        float4 x3 = x0 - i3 + C.zzzz;
        float4 x4 = x0 + C.wwww;

    // Permutations
        i = fmod(i,289.0);
        float j0 = permute(
            permute(
                permute(
                    permute(i.w) + i.z
                ) + i.y
            ) + i.x
        );
        float4 j1 = permute(
            permute(
                permute(
                    permute (
                        i.w + float4(i1.w, i2.w, i3.w, 1.0 )
                    ) + i.z + float4(i1.z, i2.z, i3.z, 1.0 )
                ) + i.y + float4(i1.y, i2.y, i3.y, 1.0 )
            ) + i.x + float4(i1.x, i2.x, i3.x, 1.0 )
        );

    // Gradients: 7x7x6 points over a cube, mapped onto a 4-cross polytope
    // 7*7*6 = 294, which is close to the ring size 17*17 = 289.
        const float4 ip = float4(
            0.003401360544217687075, // 1/294
            0.020408163265306122449, // 1/49
            0.142857142857142857143, // 1/7
            0.0
        );

        float4 p0 = grad4(j0, ip);
        float4 p1 = grad4(j1.x, ip);
        float4 p2 = grad4(j1.y, ip);
        float4 p3 = grad4(j1.z, ip);
        float4 p4 = grad4(j1.w, ip);

    // Normalise gradients
        float4 norm = taylorInvSqrt(float4(
            dot(p0, p0),
            dot(p1, p1),
            dot(p2, p2),
            dot(p3, p3)
        ));
        p0 *= norm.x;
        p1 *= norm.y;
        p2 *= norm.z;
        p3 *= norm.w;
        p4 *= taylorInvSqrt( dot(p4, p4) );

    // Mix contributions from the five corners
        float3 m0 = max(
            0.6 - float3(
                dot(x0, x0),
                dot(x1, x1),
                dot(x2, x2)
            ),
            0.0
        );
        float2 m1 = max(
            0.6 - float2(
                dot(x3, x3),
                dot(x4, x4)
            ),
            0.0
        );
        m0 = m0 * m0;
        m1 = m1 * m1;

        return 49.0 * (
            dot(
                m0*m0,
                float3(
                    dot(p0, x0),
                    dot(p1, x1),
                    dot(p2, x2)
                )
            ) + dot(
                m1*m1,
                float2(
                    dot(p3, x3),
                    dot(p4, x4)
                )
            )
        );
    }



    //                 Credits from source glsl file:
    //
    // Description : Array and textureless GLSL 2D/3D/4D simplex
    //               noise functions.
    //      Author : Ian McEwan, Ashima Arts.
    //  Maintainer : ijm
    //     Lastmod : 20110822 (ijm)
    //     License : Copyright (C) 2011 Ashima Arts. All rights reserved.
    //               Distributed under the MIT License. See LICENSE file.
    //               https://github.com/ashima/webgl-noise
    //
    //
    //           The text from LICENSE file:
    //
    //
    // Copyright (C) 2011 by Ashima Arts (Simplex noise)
    // Copyright (C) 2011 by Stefan Gustavson (Classic noise)
    //
    // Permission is hereby granted, free of charge, to any person obtaining a copy
    // of this software and associated documentation files (the "Software"), to deal
    // in the Software without restriction, including without limitation the rights
    // to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
    // copies of the Software, and to permit persons to whom the Software is
    // furnished to do so, subject to the following conditions:
    //
    // The above copyright notice and this permission notice shall be included in
    // all copies or substantial portions of the Software.
    //
    // THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
    // IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
    // FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
    // AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
    // LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
    // OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
    // THE SOFTWARE.
    #endif
    Please Note: There was a typo in the code, fixed on 7/7/2015 - if you copied this code prior to that point, it won't compile.
     
    Last edited: Jul 7, 2015
  8. RiokuTheSlayer

    Thanks for this! I'll definitely be using it at a later date.
     
  9. verrazuriz

    Oh i love you so so so much <3
     
  10. joergzdarsky

    An additional note to this thread: I've been using the latest code summary from NavyFish for procedural planet generation, but noticed that, depending on the XYZ input values, the noise output was sometimes erroneous (sudden big value changes), which in my case led to strange heightmap artefacts.

    I took Lex-DRL's original code and again applied two of Dolkar's optimization hints (the 2nd and 3rd), and the artefacts were gone and the output correct. I've been in contact with NavyFish and we haven't figured out what exactly the mistake in that latest code snippet is. However, in case you have issues with the noise output, you might want to try the fixed code below.
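
    The thread leaves the cause open. One documented difference between the two listings is how the modulo is computed: NavyFish's version calls `fmod`, while the listing that follows uses the floor-based `mod289` helpers. The sketch below only shows how those two operations differ for negative inputs; whether that difference (or the precision of `fmod` on a given GPU) is what produced the artefacts is not established here. The function names are hypothetical.

    ```hlsl
    // HLSL fmod: the result keeps the sign of x.
    float modTruncated(float x)
    {
        return fmod(x, 289.0);                          // x = -1.0 gives -1.0
    }

    // Floor-based modulo, the same arithmetic as mod289 and GLSL's mod():
    // the result is always in [0, 289).
    float modFloored(float x)
    {
        return x - floor(x * (1.0 / 289.0)) * 289.0;    // x = -1.0 gives 288.0
    }
    ```

    For non-negative inputs the two agree, so any mismatch can only show up where the lattice coordinates passed to the modulo are negative.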

    Code (CSharp):
    #ifndef NOISE_SIMPLEX_FUNC
    #define NOISE_SIMPLEX_FUNC
    /*

    Description:
        Array- and textureless CgFx/HLSL 2D, 3D and 4D simplex noise functions.
        a.k.a. simplified and optimized Perlin noise.

        The functions have very good performance
        and no dependencies on external data.

        2D - Very fast, very compact code.
        3D - Fast, compact code.
        4D - Reasonably fast, reasonably compact code.

    ------------------------------------------------------------------

    Ported by:
        Lex-DRL
        I've ported the code from GLSL to CgFx/HLSL for Unity,
        added a couple more optimisations (to speed it up even further)
        and slightly reformatted the code to make it more readable.

    Original GLSL functions:
        https://github.com/ashima/webgl-noise
        Credits from original glsl file are at the end of this cginc.

    ------------------------------------------------------------------

    Usage:

        float ns = snoise(v);
        // v is any of: float2, float3, float4

        Return type is float.
        To generate 2 or more components of noise (colorful noise),
        call these functions several times with different
        constant offsets for the arguments.
        E.g.:

        float3 colorNs = float3(
            snoise(v),
            snoise(v + 17.0),
            snoise(v - 43.0)
        );


    Remark about those offsets from the original author:

        People have different opinions on whether these offsets should be integers
        for the classic noise functions to match the spacing of the zeroes,
        so we have left that for you to decide for yourself.
        For most applications, the exact offsets don't really matter as long
        as they are not too small or too close to the noise lattice period
        (289 in this implementation).

    */

    // 1 / 289
    #define NOISE_SIMPLEX_1_DIV_289 0.00346020761245674740484429065744f

    float mod289(float x) {
        return x - floor(x * NOISE_SIMPLEX_1_DIV_289) * 289.0;
    }

    float2 mod289(float2 x) {
        return x - floor(x * NOISE_SIMPLEX_1_DIV_289) * 289.0;
    }

    float3 mod289(float3 x) {
        return x - floor(x * NOISE_SIMPLEX_1_DIV_289) * 289.0;
    }

    float4 mod289(float4 x) {
        return x - floor(x * NOISE_SIMPLEX_1_DIV_289) * 289.0;
    }


    // ( x*34.0 + 1.0 )*x =
    // x*x*34.0 + x
    float permute(float x) {
        return mod289(
            x*x*34.0 + x
        );
    }

    float3 permute(float3 x) {
        return mod289(
            x*x*34.0 + x
        );
    }

    float4 permute(float4 x) {
        return mod289(
            x*x*34.0 + x
        );
    }



    float taylorInvSqrt(float r) {
        return 1.79284291400159 - 0.85373472095314 * r;
    }

    float4 taylorInvSqrt(float4 r) {
        return 1.79284291400159 - 0.85373472095314 * r;
    }



    float4 grad4(float j, float4 ip)
    {
        const float4 ones = float4(1.0, 1.0, 1.0, -1.0);
        float4 p, s;
        p.xyz = floor( frac(j * ip.xyz) * 7.0) * ip.z - 1.0;
        p.w = 1.5 - dot( abs(p.xyz), ones.xyz );

        // GLSL: lessThan(x, y) = x < y
        // HLSL: 1 - step(y, x) = x < y
        s = float4(
            1 - step(0.0, p)
        );

        // Optimization hint Dolkar
        // p.xyz = p.xyz + (s.xyz * 2 - 1) * s.www;
        p.xyz -= sign(p.xyz) * (p.w < 0);

        return p;
    }



    // ----------------------------------- 2D -------------------------------------

    float snoise(float2 v)
    {
        const float4 C = float4(
            0.211324865405187, // (3.0-sqrt(3.0))/6.0
            0.366025403784439, // 0.5*(sqrt(3.0)-1.0)
            -0.577350269189626, // -1.0 + 2.0 * C.x
            0.024390243902439  // 1.0 / 41.0
        );

    // First corner
        float2 i = floor( v + dot(v, C.yy) );
        float2 x0 = v - i + dot(i, C.xx);

    // Other corners
        // float2 i1 = (x0.x > x0.y) ? float2(1.0, 0.0) : float2(0.0, 1.0);
        // Lex-DRL: afaik, step() in GPU is faster than if(), so:
        // step(x, y) = x <= y

        // Optimization hint Dolkar
        //int xLessEqual = step(x0.x, x0.y); // x <= y ?
        //int2 i1 =
        //    int2(1, 0) * (1 - xLessEqual) // x > y
        //    + int2(0, 1) * xLessEqual // x <= y
        //;
        //float4 x12 = x0.xyxy + C.xxzz;
        //x12.xy -= i1;

        float4 x12 = x0.xyxy + C.xxzz;
        int2 i1 = (x0.x > x0.y) ? float2(1.0, 0.0) : float2(0.0, 1.0);
        x12.xy -= i1;

    // Permutations
        i = mod289(i); // Avoid truncation effects in permutation
        float3 p = permute(
            permute(
                    i.y + float3(0.0, i1.y, 1.0 )
            ) + i.x + float3(0.0, i1.x, 1.0 )
        );

        float3 m = max(
            0.5 - float3(
                dot(x0, x0),
                dot(x12.xy, x12.xy),
                dot(x12.zw, x12.zw)
            ),
            0.0
        );
        m = m*m ;
        m = m*m ;

    // Gradients: 41 points uniformly over a line, mapped onto a diamond.
    // The ring size 17*17 = 289 is close to a multiple of 41 (41*7 = 287)

        float3 x = 2.0 * frac(p * C.www) - 1.0;
        float3 h = abs(x) - 0.5;
        float3 ox = floor(x + 0.5);
        float3 a0 = x - ox;

    // Normalise gradients implicitly by scaling m
    // Approximation of: m *= inversesqrt( a0*a0 + h*h );
        m *= 1.79284291400159 - 0.85373472095314 * ( a0*a0 + h*h );

    // Compute final noise value at P
        float3 g;
        g.x = a0.x * x0.x + h.x * x0.y;
        g.yz = a0.yz * x12.xz + h.yz * x12.yw;
        return 130.0 * dot(m, g);
    }

    // ----------------------------------- 3D -------------------------------------

    float snoise(float3 v)
    {
        const float2 C = float2(
            0.166666666666666667, // 1/6
            0.333333333333333333  // 1/3
        );
        const float4 D = float4(0.0, 0.5, 1.0, 2.0);

    // First corner
        float3 i = floor( v + dot(v, C.yyy) );
        float3 x0 = v - i + dot(i, C.xxx);

    // Other corners
        float3 g = step(x0.yzx, x0.xyz);
        float3 l = 1 - g;
        float3 i1 = min(g.xyz, l.zxy);
        float3 i2 = max(g.xyz, l.zxy);

        float3 x1 = x0 - i1 + C.xxx;
        float3 x2 = x0 - i2 + C.yyy; // 2.0*C.x = 1/3 = C.y
        float3 x3 = x0 - D.yyy;      // -1.0+3.0*C.x = -0.5 = -D.y

    // Permutations
        i = mod289(i);
        float4 p = permute(
            permute(
                permute(
                        i.z + float4(0.0, i1.z, i2.z, 1.0 )
                ) + i.y + float4(0.0, i1.y, i2.y, 1.0 )
            )     + i.x + float4(0.0, i1.x, i2.x, 1.0 )
        );

    // Gradients: 7x7 points over a square, mapped onto an octahedron.
    // The ring size 17*17 = 289 is close to a multiple of 49 (49*6 = 294)
        float n_ = 0.142857142857; // 1/7
        float3 ns = n_ * D.wyz - D.xzx;

        float4 j = p - 49.0 * floor(p * ns.z * ns.z); // mod(p,7*7)

        float4 x_ = floor(j * ns.z);
        float4 y_ = floor(j - 7.0 * x_ ); // mod(j,N)

    248.     float4 x = x_ *ns.x + ns.yyyy;
    249.     float4 y = y_ *ns.x + ns.yyyy;
    250.     float4 h = 1.0 - abs(x) - abs(y);
    251.  
    252.     float4 b0 = float4( x.xy, y.xy );
    253.     float4 b1 = float4( x.zw, y.zw );
    254.  
    255.     //float4 s0 = float4(lessThan(b0,0.0))*2.0 - 1.0;
    256.     //float4 s1 = float4(lessThan(b1,0.0))*2.0 - 1.0;
    257.     float4 s0 = floor(b0)*2.0 + 1.0;
    258.     float4 s1 = floor(b1)*2.0 + 1.0;
    259.     float4 sh = -step(h, 0.0);
    260.  
    261.     float4 a0 = b0.xzyw + s0.xzyw*sh.xxyy ;
    262.     float4 a1 = b1.xzyw + s1.xzyw*sh.zzww ;
    263.  
    264.     float3 p0 = float3(a0.xy,h.x);
    265.     float3 p1 = float3(a0.zw,h.y);
    266.     float3 p2 = float3(a1.xy,h.z);
    267.     float3 p3 = float3(a1.zw,h.w);
    268.  
    269. // Normalise gradients
    270.     float4 norm = taylorInvSqrt(float4(
    271.         dot(p0, p0),
    272.         dot(p1, p1),
    273.         dot(p2, p2),
    274.         dot(p3, p3)
    275.     ));
    276.     p0 *= norm.x;
    277.     p1 *= norm.y;
    278.     p2 *= norm.z;
    279.     p3 *= norm.w;
    280.  
    281. // Mix final noise value
    282.     float4 m = max(
    283.         0.6 - float4(
    284.             dot(x0, x0),
    285.             dot(x1, x1),
    286.             dot(x2, x2),
    287.             dot(x3, x3)
    288.         ),
    289.         0.0
    290.     );
    291.     m = m * m;
    292.     return 42.0 * dot(
    293.         m*m,
    294.         float4(
    295.             dot(p0, x0),
    296.             dot(p1, x1),
    297.             dot(p2, x2),
    298.             dot(p3, x3)
    299.         )
    300.     );
    301. }
    302.  
    303. // ----------------------------------- 4D -------------------------------------
    304.  
    305. float snoise(float4 v)
    306. {
    307.     const float4 C = float4(
    308.         0.138196601125011, // (5 - sqrt(5))/20 G4
    309.         0.276393202250021, // 2 * G4
    310.         0.414589803375032, // 3 * G4
    311.         -0.447213595499958 // -1 + 4 * G4
    312.     );
    313.  
    314. // First corner
    315.     float4 i = floor(
    316.         v +
    317.         dot(
    318.             v,
    319.             0.309016994374947451 // (sqrt(5) - 1) / 4
    320.         )
    321.     );
    322.     float4 x0 = v - i + dot(i, C.xxxx);
    323.  
    324. // Other corners
    325.  
    326. // Rank sorting originally contributed by Bill Licea-Kane, AMD (formerly ATI)
    327.     float4 i0;
    328.     float3 isX = step( x0.yzw, x0.xxx );
    329.     float3 isYZ = step( x0.zww, x0.yyz );
    330.     i0.x = isX.x + isX.y + isX.z;
    331.     i0.yzw = 1.0 - isX;
    332.     i0.y += isYZ.x + isYZ.y;
    333.     i0.zw += 1.0 - isYZ.xy;
    334.     i0.z += isYZ.z;
    335.     i0.w += 1.0 - isYZ.z;
    336.  
    337.     // i0 now contains the unique values 0,1,2,3 in each channel
    338.     float4 i3 = saturate(i0);
    339.     float4 i2 = saturate(i0-1.0);
    340.     float4 i1 = saturate(i0-2.0);
    341.  
    342.     //    x0 = x0 - 0.0 + 0.0 * C.xxxx
    343.     //    x1 = x0 - i1  + 1.0 * C.xxxx
    344.     //    x2 = x0 - i2  + 2.0 * C.xxxx
    345.     //    x3 = x0 - i3  + 3.0 * C.xxxx
    346.     //    x4 = x0 - 1.0 + 4.0 * C.xxxx
    347.     float4 x1 = x0 - i1 + C.xxxx;
    348.     float4 x2 = x0 - i2 + C.yyyy;
    349.     float4 x3 = x0 - i3 + C.zzzz;
    350.     float4 x4 = x0 + C.wwww;
    351.  
    352. // Permutations
    353.     i = mod289(i);
    354.     float j0 = permute(
    355.         permute(
    356.             permute(
    357.                 permute(i.w) + i.z
    358.             ) + i.y
    359.         ) + i.x
    360.     );
    361.     float4 j1 = permute(
    362.         permute(
    363.             permute(
    364.                 permute (
    365.                     i.w + float4(i1.w, i2.w, i3.w, 1.0 )
    366.                 ) + i.z + float4(i1.z, i2.z, i3.z, 1.0 )
    367.             ) + i.y + float4(i1.y, i2.y, i3.y, 1.0 )
    368.         ) + i.x + float4(i1.x, i2.x, i3.x, 1.0 )
    369.     );
    370.  
    371. // Gradients: 7x7x6 points over a cube, mapped onto a 4-cross polytope
    372. // 7*7*6 = 294, which is close to the ring size 17*17 = 289.
    373.     const float4 ip = float4(
    374.         0.003401360544217687075, // 1/294
    375.         0.020408163265306122449, // 1/49
    376.         0.142857142857142857143, // 1/7
    377.         0.0
    378.     );
    379.  
    380.     float4 p0 = grad4(j0, ip);
    381.     float4 p1 = grad4(j1.x, ip);
    382.     float4 p2 = grad4(j1.y, ip);
    383.     float4 p3 = grad4(j1.z, ip);
    384.     float4 p4 = grad4(j1.w, ip);
    385.  
    386. // Normalise gradients
    387.     float4 norm = taylorInvSqrt(float4(
    388.         dot(p0, p0),
    389.         dot(p1, p1),
    390.         dot(p2, p2),
    391.         dot(p3, p3)
    392.     ));
    393.     p0 *= norm.x;
    394.     p1 *= norm.y;
    395.     p2 *= norm.z;
    396.     p3 *= norm.w;
    397.     p4 *= taylorInvSqrt( dot(p4, p4) );
    398.  
    399. // Mix contributions from the five corners
    400.     float3 m0 = max(
    401.         0.6 - float3(
    402.             dot(x0, x0),
    403.             dot(x1, x1),
    404.             dot(x2, x2)
    405.         ),
    406.         0.0
    407.     );
    408.     float2 m1 = max(
    409.         0.6 - float2(
    410.             dot(x3, x3),
    411.             dot(x4, x4)
    412.         ),
    413.         0.0
    414.     );
    415.     m0 = m0 * m0;
    416.     m1 = m1 * m1;
    417.  
    418.     return 49.0 * (
    419.         dot(
    420.             m0*m0,
    421.             float3(
    422.                 dot(p0, x0),
    423.                 dot(p1, x1),
    424.                 dot(p2, x2)
    425.             )
    426.         ) + dot(
    427.             m1*m1,
    428.             float2(
    429.                 dot(p3, x3),
    430.                 dot(p4, x4)
    431.             )
    432.         )
    433.     );
    434. }
    435.  
    436.  
    437.  
    438. //                 Credits from source glsl file:
    439. //
    440. // Description : Array and textureless GLSL 2D/3D/4D simplex
    441. //               noise functions.
    442. //      Author : Ian McEwan, Ashima Arts.
    443. //  Maintainer : ijm
    444. //     Lastmod : 20110822 (ijm)
    445. //     License : Copyright (C) 2011 Ashima Arts. All rights reserved.
    446. //               Distributed under the MIT License. See LICENSE file.
    447. //               https://github.com/ashima/webgl-noise
    448. //
    449. //
    450. //           The text from LICENSE file:
    451. //
    452. //
    453. // Copyright (C) 2011 by Ashima Arts (Simplex noise)
    454. // Copyright (C) 2011 by Stefan Gustavson (Classic noise)
    455. //
    456. // Permission is hereby granted, free of charge, to any person obtaining a copy
    457. // of this software and associated documentation files (the "Software"), to deal
    458. // in the Software without restriction, including without limitation the rights
    459. // to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
    460. // copies of the Software, and to permit persons to whom the Software is
    461. // furnished to do so, subject to the following conditions:
    462. //
    463. // The above copyright notice and this permission notice shall be included in
    464. // all copies or substantial portions of the Software.
    465. //
    466. // THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
    467. // IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
    468. // FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
    469. // AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
    470. // LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
    471. // OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
    472. // THE SOFTWARE.
    473. #endif
     
  11. NavyFish

    NavyFish

    Joined:
    Aug 16, 2013
    Posts:
    28
    Thanks, Joerg! I'm using your version now.
     
  12. nug700

    nug700

    Joined:
    Nov 19, 2012
    Posts:
    20
    Just a quick question: how do I set the frequency and, more importantly, the seed?
     
  13. NavyFish

    NavyFish

    Joined:
    Aug 16, 2013
    Posts:
    28
    Set the frequency by multiplying your input coordinates, e.g. snoise(float2(freq*x, freq*y)).

    Currently I don't think there's a way to set the seed. You can offset your input coordinates by adding some value to them, but this only gets you so far.

    Setting a new seed requires changing a permutation table - something this implementation does without. I'm slowly working on an implementation that allows you to change the seed but retains the portability of this implementation (i.e. no requirement to upload a look-up table to the GPU). It will be slightly slower than this approach, but hopefully not too bad. Once it's working, I will share the code here!
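
    As a rough illustration of the two workarounds above, here is a minimal sketch. The wrapper name seededNoise2D and its parameters freq and seedOffset are hypothetical and not part of noiseSimplex.cginc; it only assumes the library is included first so that snoise(float2) is visible.

```hlsl
// Hypothetical wrapper, NOT part of noiseSimplex.cginc.
// Assumes #include "noiseSimplex.cginc" appears above this function.
float seededNoise2D(float2 uv, float freq, float2 seedOffset)
{
    // Scaling the input sets the frequency; adding an offset samples a
    // different region of the same fixed noise field (a pseudo-seed only).
    return snoise(uv * freq + seedOffset);
}
```

    A call might look like seededNoise2D(uv, 8.0, float2(37.0, 113.0)). The offset does not change the underlying pattern, it only moves the sampling window, and very large offsets cost floating-point precision, which is why this only goes so far.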
     
  14. JoeStrout

    JoeStrout

    Joined:
    Jan 14, 2011
    Posts:
    6,874
    Just found this, and it's great stuff. Thank you all for contributing.

    Perhaps it's a dumb question, but would it be worth including an efficient 1D implementation? I know I can just do snoise(float2(x, 0)).x, but I suspect this is working harder than it needs to.

    (I need a 1D noise function of the angle around a center point.)
     
  15. orangeNameAlreadyTakenSoIveGotToImprovise

    orangeNameAlreadyTakenSoIveGotToImprovise

    Joined:
    Dec 28, 2016
    Posts:
    3
    AFAIK, 1D Simplex noise gives very similar results to value noise (although usually the interpolation function is different).

    If that is enough for you, you can simply use the permute function from this code to evaluate the noise amplitude at point x and point x+1, and then interpolate between these results however you like (e.g. with smoothstep).
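
    A minimal sketch of that idea, under stated assumptions: the function name valueNoise1D is hypothetical and not part of the library, and it relies on the permute(float) overload that the 4D snoise in noiseSimplex.cginc already calls. The scaling to [-1, 1] assumes permute returns values in [0, 288], which follows from its mod-289 ring.

```hlsl
// Hypothetical helper, NOT part of noiseSimplex.cginc.
// Assumes #include "noiseSimplex.cginc" appears above, so that
// permute(float) is visible.
float valueNoise1D(float x)
{
    float i = floor(x);   // left lattice point
    float f = x - i;      // position inside the cell, in [0, 1)

    // Wrap the lattice index into [0, 289), as the library does
    // before permuting, to keep the arithmetic exact in floats.
    float i0 = i - floor(i / 289.0) * 289.0;

    // Pseudo-random amplitudes at x and x+1, mapped from [0, 288] to [-1, 1].
    float a = permute(i0)       / 144.0 - 1.0;
    float b = permute(i0 + 1.0) / 144.0 - 1.0;

    // Smooth interpolation between the two amplitudes.
    return lerp(a, b, smoothstep(0.0, 1.0, f));
}
```

    Because of the wrap, this sketch repeats every 289 units of x. For noise as a function of angle, the angle would need to be scaled so that a full turn spans a whole number of lattice cells, otherwise a seam appears where the angle wraps around.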
     
  16. orangeNameAlreadyTakenSoIveGotToImprovise

    orangeNameAlreadyTakenSoIveGotToImprovise

    Joined:
    Dec 28, 2016
    Posts:
    3
    Back to topic: Does anyone have a version of the 3D noise function that also calculates the analytical derivatives? I would like to use them for bump mapping.
     
  17. jason-fisher

    jason-fisher

    Joined:
    Mar 19, 2014
    Posts:
    133