# Question Calculating Normals of a Mesh in Compute Shader

Discussion in 'Shaders' started by ay_ahmet, Oct 11, 2022.

1. ### ay_ahmet

Joined: May 17, 2021 · Posts: 5
I'm trying to achieve exactly the same result as Unity's built-in `Mesh.RecalculateNormals()` method.

I can get the same result in a C# script (script attached below).

I cannot get the same result with a compute shader, even though I'm using the same algorithm (compute shader attached below).

Can someone point out what I'm missing or doing wrong? (Picture of a sample result with the compute shader attached below.)

Some notes:
• I get different results every time I dispatch the compute shader.
• I have tried both Unity's default sphere and a simple sphere created in Blender, with the same results.
• My ambient light color is black, which is why the bottom half of the sphere is completely black. It does not affect the results.

Here's how I calculate normals in C# - CPU:
Code (CSharp):
private void CalculateNormalsCPU()
{
    var sphereMesh = MeshFilter.mesh;
    var vertices = sphereMesh.vertices;
    var triangles = sphereMesh.triangles;
    var triangleCount = triangles.Length / 3;

    var normals = new Vector3[vertices.Length];

    for (var i = 0; i < triangleCount; i++)
    {
        var triangleIndex = i * 3;
        var vertex1 = vertices[triangles[triangleIndex]];
        var vertex2 = vertices[triangles[triangleIndex + 1]];
        var vertex3 = vertices[triangles[triangleIndex + 2]];

        var side1 = vertex2 - vertex1;
        var side2 = vertex3 - vertex1;

        var triangleNormal = Vector3.Normalize(Vector3.Cross(side1, side2));

        normals[triangles[triangleIndex]] += triangleNormal;
        normals[triangles[triangleIndex + 1]] += triangleNormal;
        normals[triangles[triangleIndex + 2]] += triangleNormal;
    }

    for (int i = 0; i < vertices.Length; i++)
    {
        normals[i] = normals[i].normalized;
    }
    sphereMesh.normals = normals;
}

Here is how I prepare and dispatch my Compute Shader:
Code (CSharp):
// (the method signature on the first line was lost in the forum paste)
{
    var sphereMesh = MeshFilter.mesh;
    var vertexCount = sphereMesh.vertexCount;
    var triangleCount = sphereMesh.triangles.Length / 3;
    sphereMesh.normals = new Vector3[vertexCount];

    var trianglesBuffer = new GraphicsBuffer(GraphicsBuffer.Target.Structured, sphereMesh.triangles.Length, sizeof(int));
    trianglesBuffer.SetData(sphereMesh.triangles);

    sphereMesh.vertexBufferTarget |= GraphicsBuffer.Target.Raw;
    var vertexBuffer = sphereMesh.GetVertexBuffer(0);

    // ... (lines 13-24 of the original post, which bind the buffers and
    // dispatch the kernels, were lost in the forum paste)

    vertexBuffer.Dispose();
    trianglesBuffer.Dispose();
}
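Since the middle of the dispatch method was lost in the paste, here is a sketch of what the binding and dispatch typically look like for this kind of setup. This is not the original author's code: the `NormalsCompute` field name and the thread-group size of 64 are assumptions, and the kernel names are taken from the shader below.

```csharp
// Assumed ComputeShader field name; the [numthreads] size in the shader
// must match the 64 used for the group-count math here.
var calculateKernel = NormalsCompute.FindKernel("CalculateNormals");
var normalizeKernel = NormalsCompute.FindKernel("NormalizeNormals");

NormalsCompute.SetInt("VertexCount", vertexCount);
NormalsCompute.SetInt("TriangleCount", triangleCount);
NormalsCompute.SetInt("Stride", sphereMesh.GetVertexBufferStride(0));

NormalsCompute.SetBuffer(calculateKernel, "Triangles", trianglesBuffer);
NormalsCompute.SetBuffer(calculateKernel, "VertexBuffer", vertexBuffer);
NormalsCompute.SetBuffer(normalizeKernel, "VertexBuffer", vertexBuffer);

// One thread per triangle, then one thread per vertex.
NormalsCompute.Dispatch(calculateKernel, Mathf.CeilToInt(triangleCount / 64f), 1, 1);
NormalsCompute.Dispatch(normalizeKernel, Mathf.CeilToInt(vertexCount / 64f), 1, 1);
```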

Code (CSharp):
#pragma kernel CalculateNormals
#pragma kernel NormalizeNormals

#define PI 3.14159265359
#define TAU 6.28318530718

uint VertexCount;
uint TriangleCount;
uint Stride;

// Declaration lost in the forum paste; reconstructed from the
// Load3/Store3 calls below:
RWByteAddressBuffer VertexBuffer;
StructuredBuffer<uint> Triangles;

// ([numthreads(...)] attribute lost in the forum paste)
void CalculateNormals (uint3 id : SV_DispatchThreadID)
{
    if (id.x >= TriangleCount) return;

    uint triangleIndex = id.x * 3;

    uint indexVertex1 = Triangles[triangleIndex];
    uint indexVertex2 = Triangles[triangleIndex + 1];
    uint indexVertex3 = Triangles[triangleIndex + 2];

    float3 vertex1 = asfloat(VertexBuffer.Load3(indexVertex1 * Stride));
    float3 vertex2 = asfloat(VertexBuffer.Load3(indexVertex2 * Stride));
    float3 vertex3 = asfloat(VertexBuffer.Load3(indexVertex3 * Stride));

    float3 side1 = vertex2 - vertex1;
    float3 side2 = vertex3 - vertex1;

    float3 triangleNormal = normalize(cross(side1, side2));

    float3 normalVertex1 = asfloat(VertexBuffer.Load3(indexVertex1 * Stride + 12));
    VertexBuffer.Store3(indexVertex1 * Stride + 12, asuint(normalVertex1 + triangleNormal));

    float3 normalVertex2 = asfloat(VertexBuffer.Load3(indexVertex2 * Stride + 12));
    VertexBuffer.Store3(indexVertex2 * Stride + 12, asuint(normalVertex2 + triangleNormal));

    float3 normalVertex3 = asfloat(VertexBuffer.Load3(indexVertex3 * Stride + 12));
    VertexBuffer.Store3(indexVertex3 * Stride + 12, asuint(normalVertex3 + triangleNormal));
}

// ([numthreads(...)] attribute lost in the forum paste)
void NormalizeNormals (uint3 id : SV_DispatchThreadID)
{
    if (id.x >= VertexCount) return;
    uint vid = id.x * Stride;

    float3 normal = asfloat(VertexBuffer.Load3(vid + 12));
    VertexBuffer.Store3(vid + 12, asuint(normalize(normal)));
}

2. ### burningmime

Joined: Jan 25, 2014 · Posts: 845
The threads in a compute shader run out of order, with many of them running simultaneously. Your load/add/store of the vertex normals is a read-modify-write: two threads that share a vertex can read the same old value and overwrite each other's contribution, which is why you get different results on every dispatch.

The easiest fix would be to quantize the floats to ints (e.g. multiply each component by 2^16 or so) and then use atomic operations (InterlockedAdd) to add them directly to the memory location. This would be quite fast and you wouldn't need to change your algorithm at all. For example, instead of...

Code (CSharp):
float3 normalVertex1 = asfloat(VertexBuffer.Load3(indexVertex1 * Stride + 12));
VertexBuffer.Store3(indexVertex1 * Stride + 12, asuint(normalVertex1 + triangleNormal));
float3 normalVertex2 = asfloat(VertexBuffer.Load3(indexVertex2 * Stride + 12));
VertexBuffer.Store3(indexVertex2 * Stride + 12, asuint(normalVertex2 + triangleNormal));
float3 normalVertex3 = asfloat(VertexBuffer.Load3(indexVertex3 * Stride + 12));
VertexBuffer.Store3(indexVertex3 * Stride + 12, asuint(normalVertex3 + triangleNormal));
You would write (untested):

Code (CSharp):
const float QUANTIZE_FACTOR = 32768.0;
int3 quantizedNormal = (int3) (triangleNormal * QUANTIZE_FACTOR);
int ignore;

VertexBuffer.InterlockedAdd(indexVertex1 * Stride + 12, quantizedNormal.x, ignore);
VertexBuffer.InterlockedAdd(indexVertex1 * Stride + 16, quantizedNormal.y, ignore);
VertexBuffer.InterlockedAdd(indexVertex1 * Stride + 20, quantizedNormal.z, ignore);
VertexBuffer.InterlockedAdd(indexVertex2 * Stride + 12, quantizedNormal.x, ignore);
VertexBuffer.InterlockedAdd(indexVertex2 * Stride + 16, quantizedNormal.y, ignore);
VertexBuffer.InterlockedAdd(indexVertex2 * Stride + 20, quantizedNormal.z, ignore);
VertexBuffer.InterlockedAdd(indexVertex3 * Stride + 12, quantizedNormal.x, ignore);
VertexBuffer.InterlockedAdd(indexVertex3 * Stride + 16, quantizedNormal.y, ignore);
VertexBuffer.InterlockedAdd(indexVertex3 * Stride + 20, quantizedNormal.z, ignore);
EDIT: Should be ``int``, not ``uint``, but you probably figured that out.
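One follow-up detail (my addition, not from the post above): after the accumulation pass, the normal bytes hold quantized integer sums rather than float bits, so the second pass has to reinterpret them with `asint` before normalizing. Dividing by the quantize factor is unnecessary, since `normalize()` removes any uniform scale. A sketch, assuming the same vertex layout as the original shader (position at offset 0, normal at offset 12):

```hlsl
// ([numthreads] attribute omitted; it must match the dispatch)
void NormalizeNormals (uint3 id : SV_DispatchThreadID)
{
    if (id.x >= VertexCount) return;
    uint vid = id.x * Stride;

    // The bytes now hold quantized integer sums, not float bits.
    int3 quantized = asint(VertexBuffer.Load3(vid + 12));

    // Converting to float and normalizing cancels the quantize factor.
    float3 normal = normalize((float3) quantized);
    VertexBuffer.Store3(vid + 12, asuint(normal));
}
```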

Last edited: Oct 14, 2022
3. ### Qleenie

Joined: Jan 27, 2019 · Posts: 937
This should definitely work, but as I understand it, InterlockedAdd does have some performance impact. I saw another trick in the implementation of Ziva: they calculate the normals per face, not per vertex, so there are no race conditions in the compute shader. I guess this is the fastest way of doing the normals calculation.
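For reference, the per-face approach only avoids the race when vertices are not shared between triangles (i.e. the mesh is flat-shaded, with each vertex belonging to exactly one face), so each thread owns the three vertices it writes. A hedged HLSL sketch under that assumption, reusing the buffer names from the thread above:

```hlsl
// Race-free per-face pass, ASSUMING unshared (flat-shaded) vertices.
// ([numthreads] attribute omitted; it must match the dispatch)
void CalculateFaceNormals (uint3 id : SV_DispatchThreadID)
{
    if (id.x >= TriangleCount) return;
    uint t = id.x * 3;

    uint i1 = Triangles[t];
    uint i2 = Triangles[t + 1];
    uint i3 = Triangles[t + 2];

    float3 v1 = asfloat(VertexBuffer.Load3(i1 * Stride));
    float3 v2 = asfloat(VertexBuffer.Load3(i2 * Stride));
    float3 v3 = asfloat(VertexBuffer.Load3(i3 * Stride));

    float3 n = normalize(cross(v2 - v1, v3 - v1));

    // Plain stores are safe here: no other triangle touches these vertices.
    VertexBuffer.Store3(i1 * Stride + 12, asuint(n));
    VertexBuffer.Store3(i2 * Stride + 12, asuint(n));
    VertexBuffer.Store3(i3 * Stride + 12, asuint(n));
}
```

The trade-off is memory and shading quality: duplicating vertices per face increases the vertex count, and the result is faceted rather than smooth.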

4. ### edeguine

Joined: Jul 2, 2015 · Posts: 5
I am tempted to do something like this because my app/game deforms a large mesh and needs to recompute normals every frame. @ay_ahmet, did you see a performance improvement compared to the Unity-provided function? I can't tell whether Unity does the job on the CPU or the GPU.