DrawMeshInstancedIndirect with transforms / Matrices

Discussion in 'General Graphics' started by Kushulain, Jan 11, 2020.

  1. Kushulain

    Kushulain

    Joined:
    Sep 20, 2012
    Posts:
    19
    Hey,
    Is it possible to provide transforms (like unity_ObjectToWorld) through the buffer passed to DrawMeshInstancedIndirect, without writing a custom shader?
    DrawMeshInstanced works perfectly for this with Standard shaders, but point lights (forward_add pass) don't work with it.

    The idea would be to have an instancing solution that works with lighting and, like DrawMeshInstanced, does not require modifying shaders.
     
    Last edited: Jan 11, 2020
  2. richardkettlewell

    richardkettlewell

    Unity Technologies

    Joined:
    Sep 9, 2015
    Posts:
    2,281
    No it’s not possible. This indirect method is specifically for use with custom shaders.
     
  3. Kushulain

    Kushulain

    Joined:
    Sep 20, 2012
    Posts:
    19
    Thank you! I couldn't find a clear answer anywhere.
    I modified the shader in the end. I'm sharing the code here for people who want to do the same and are new to buffers like me. I saw some pieces of code here and there on the forums, but none of them was complete.

    Here is some of the C# code. It's not fully complete, but it should be enough, I think.

    Code (CSharp):
    using System.Collections.Generic;
    using UnityEngine;

    public class GPUGroup
    {
        public Material material;
        Mesh mesh;
        // Interleaved per-instance data: [obj2World, world2Obj, obj2World, world2Obj, ...]
        public List<Matrix4x4> gpuInstances = new List<Matrix4x4>();
        public bool changed;
        private ComputeBuffer buffer;
        private ComputeBuffer argsBuffer;

        public GPUGroup(Material mat, Mesh _mesh)
        {
            mesh = _mesh;

            // Five uints: index count, instance count, start index, base vertex, start instance.
            argsBuffer = new ComputeBuffer(1, 5 * sizeof(uint), ComputeBufferType.IndirectArguments);
            // Copy the material so the buffer binding does not leak into other users of it.
            material = new Material(mat);
            changed = true;
        }

        public void Add(Matrix4x4 obj2World, Matrix4x4 world2Obj)
        {
            gpuInstances.Add(obj2World);
            gpuInstances.Add(world2Obj);
            changed = true;
        }

        public void CheckForRefresh()
        {
            if (changed)
            {
                changed = false;
                RefreshBuffer();
                // If I'm right, because of the buffer size limitation (64 KB) we can't draw more than ~500 meshes
                // (if we send both the world2Obj and obj2World matrices: 500 * 128 bytes = 64,000 bytes).
                // Two matrices per instance, so the instance count is gpuInstances.Count / 2, capped at 500.
                uint instanceCount = (uint)Mathf.Min(gpuInstances.Count / 2, 500);
                uint[] args = new uint[5] { mesh.GetIndexCount(0), instanceCount, 0, 0, 0 };
                argsBuffer.SetData(args);
            }
        }

        void RefreshBuffer()
        {
            if (buffer != null)
                buffer.Release();
            buffer = null;

            // A ComputeBuffer cannot be created with zero elements.
            if (gpuInstances.Count == 0)
                return;

            // Stride of 128 bytes = two float4x4 (obj2World + world2Obj), matching MatData in the shader.
            buffer = new ComputeBuffer(gpuInstances.Count, 128);
            buffer.SetData(gpuInstances);
            material.SetBuffer("transforms", buffer);
        }

        // Call this when the group is no longer needed, otherwise the buffers leak.
        public void ReleaseBuffers()
        {
            if (buffer != null)
                buffer.Release();
            buffer = null;

            if (argsBuffer != null)
                argsBuffer.Release();
            argsBuffer = null;
        }

        public void Draw(Bounds bounds)
        {
            CheckForRefresh();
            Graphics.DrawMeshInstancedIndirect(mesh, 0, material, bounds, argsBuffer);
        }
    }

    // Usage example:

    GPUGroup gpuGroup = new GPUGroup(material, mesh);
    gpuGroup.Add(go.transform.localToWorldMatrix, go.transform.worldToLocalMatrix);
    // Call Draw every frame (e.g. from Update()); the draw is only queued for the current frame.
    gpuGroup.Draw(new Bounds(Vector3.zero, Vector3.one * 1000f));
    Now the relevant pieces of my shader code with matrices. A full instanced shader example is at https://docs.unity3d.com/ScriptReference/Graphics.DrawMeshInstancedIndirect.html
    Code (CSharp):
    #pragma target 4.5

    struct v2f_surf
    {
        float4 pos : SV_POSITION;
        float2 uv_MainTex : TEXCOORD0;
        float3 ambient : TEXCOORD1;
        float3 diffuse : TEXCOORD2;
        float3 color : TEXCOORD3;
        SHADOW_COORDS(4)
        uint instanceID : SV_InstanceID; // forwarded to the fragment shader
    };

    #if SHADER_TARGET >= 45
    // Must match the C# side: two 4x4 matrices (128 bytes) per instance.
    struct MatData { float4x4 obj2world, world2Obj; };
    StructuredBuffer<MatData> transforms;
    #endif

    v2f_surf vert_surf (appdata_full v, uint instanceID : SV_InstanceID)
    {
    #if SHADER_TARGET >= 45
        // Override Unity's built-in matrices with the per-instance ones.
        unity_ObjectToWorld = transforms[instanceID].obj2world;
        unity_WorldToObject = transforms[instanceID].world2Obj;
    #endif

        v2f_surf o;
        o.instanceID = instanceID; // uint instanceID : SV_InstanceID;

        [vertex code here]

    }


    // Same thing in the fragment shader if you use unity_ObjectToWorld & unity_WorldToObject in there
    fixed4 frag_surf (v2f_surf IN) : SV_Target
    {
    #if SHADER_TARGET >= 45
        unity_ObjectToWorld = transforms[IN.instanceID].obj2world;
        unity_WorldToObject = transforms[IN.instanceID].world2Obj;
    #endif

        [fragment code here]

    }
    It works nicely, and the additional light pass works as well.

    I know one would have to recompute all of UNITY_MATRIX_MVP / UNITY_MATRIX_MV / UNITY_MATRIX_T_MV / UNITY_MATRIX_IT_MV (see the example in UnityInstancing.cginc), but my shader doesn't use any of these matrices.

    It would be great to have a way to draw instanced objects without needing a custom SM 4.5 shader (like with Graphics.DrawMeshInstanced), and with additional lights working correctly (the forward add pass especially).
     
    Last edited: Jan 12, 2020
    mitcore likes this.
  4. richardkettlewell

    richardkettlewell

    Unity Technologies

    Joined:
    Sep 9, 2015
    Posts:
    2,281
    You can do it with constant buffers or texture reads using Graphics.DrawMeshInstancedProcedural in 2019.3. It requires the instance count to be known from script, but doesn't have the requirement for Compute support.

    If you were going to use constant buffers, weigh up whether it's really worth it vs. just using DrawMeshInstanced, which uses constant buffers under the hood anyway. Constant buffers are limited to 32kb in size, too.
     
    Kushulain likes this.
  5. Kushulain

    Kushulain

    Joined:
    Sep 20, 2012
    Posts:
    19
    Thank you for the answer!
    I didn't know about Graphics.DrawMeshInstancedProcedural. We are stuck on 2018.3.7, unfortunately.

    We would like to draw a maximum of 50,000 static objects in total in the scene, for VR. We are targeting high-end GPUs, so SM 4.5 is not really the main problem; it was rather lighting (DrawMeshInstanced) or having to modify all shaders (DrawMeshInstancedIndirect).

    We first tried LOD 3D meshes, then added "controlled" mesh combining. Performance was really good, but we had a lot of frame drops when moving.
    In the end we tried impostors. They draw on screen quite fast, but cannot be combined easily. So we now use impostors + GPU instancing, and it seems to be a lot more efficient than the previous system (maybe 10 times). We also got rid of the frame drops. We may also use an octree scene graph to handle frustum culling / LOD.

    For maybe 10,000 visible objects on screen, we would have 10,000 / 500 = 20 batches.
    Draw calls per batch: two_eyes * (lightCount + updateDepth) + 4 shadow cascades

    20 batches * (2 * (1 + 1) + 4) = 20 * (4 + 4) = 160 draw calls in total.
    I think we may end up with better performance with this recipe!
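
    The estimate above can be written out as a small calculation. This is only a sketch in plain C# (no Unity dependency); the class and method names are hypothetical, and the formula is simply the one from the post, taking the 500-instances-per-batch cap as given.

```csharp
using System;

// Hypothetical helper, not part of Unity: reproduces the draw call estimate from the post.
public static class DrawCallEstimate
{
    // Number of instanced batches needed, rounding up.
    public static int Batches(int visibleObjects, int instancesPerBatch)
    {
        return (visibleObjects + instancesPerBatch - 1) / instancesPerBatch;
    }

    // Each batch is drawn once per eye for every light pass and depth pass,
    // plus once per shadow cascade.
    public static int Total(int batches, int eyes, int lightCount, int depthPasses, int shadowCascades)
    {
        int passesPerBatch = eyes * (lightCount + depthPasses) + shadowCascades;
        return batches * passesPerBatch;
    }

    public static void Main()
    {
        int batches = Batches(10000, 500);        // 20
        int total = Total(batches, 2, 1, 1, 4);   // 20 * (2 * (1 + 1) + 4) = 160
        Console.WriteLine($"{batches} batches, {total} draw calls");
    }
}
```

    Adding one more light changes the per-batch factor from 8 to 10, so the same 20 batches would cost 200 draw calls under this formula.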
     
    MaxEden and laurentlavigne like this.
  6. rulk

    rulk

    Joined:
    Aug 1, 2015
    Posts:
    13
    Hi, could you elaborate on how to use constant buffers with DrawMeshInstancedProcedural?

    I'm unable to find any examples of how to correctly bind constant buffers.

    It always reads only the first element, as if it were missing glVertexAttribDivisor.
     
  7. richardkettlewell

    richardkettlewell

    Unity Technologies

    Joined:
    Sep 9, 2015
    Posts:
    2,281
    You ought to be able to use this: https://docs.unity3d.com/2020.1/Documentation/ScriptReference/Material.SetConstantBuffer.html (preferably with a GraphicsBuffer)

    Then declare a cbuffer in your shader.

    I've not tried this myself, but this is how it should work :)
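
    As a rough, untested sketch of that suggestion (assuming Unity 2020.1 or newer): the component name, the cbuffer name "InstanceData", the array name _ObjectToWorld and the 256-instance cap are all assumptions made for illustration, and the material is assumed to use a custom shader that declares a matching cbuffer and indexes it with SV_InstanceID.

```csharp
using UnityEngine;

// Hypothetical component: feeds per-instance matrices to a shader through a constant buffer.
public class CBufferInstancingSketch : MonoBehaviour
{
    // Must match the array size declared in the shader's cbuffer.
    const int MaxInstances = 256;
    const int MatrixStride = 16 * sizeof(float); // one float4x4 = 64 bytes

    public Mesh mesh;
    public Material material; // assumed to use a custom shader, see below

    // Assumed shader-side declaration (names are illustrative only):
    //   cbuffer InstanceData { float4x4 _ObjectToWorld[256]; };
    //   ... and in the vertex shader: _ObjectToWorld[instanceID] with instanceID : SV_InstanceID

    GraphicsBuffer instanceBuffer;
    readonly Matrix4x4[] objectToWorld = new Matrix4x4[MaxInstances];

    void OnEnable()
    {
        instanceBuffer = new GraphicsBuffer(GraphicsBuffer.Target.Constant, MaxInstances, MatrixStride);

        // Lay the instances out on a 16 x 16 grid, purely as placeholder data.
        for (int i = 0; i < MaxInstances; i++)
            objectToWorld[i] = Matrix4x4.Translate(new Vector3(i % 16, 0f, i / 16) * 2f);

        instanceBuffer.SetData(objectToWorld);

        // Bind the whole buffer (256 * 64 = 16,384 bytes) to the cbuffer by name.
        material.SetConstantBuffer("InstanceData", instanceBuffer, 0, MaxInstances * MatrixStride);
    }

    void Update()
    {
        // The instance count is supplied from script, as mentioned earlier in the thread.
        Graphics.DrawMeshInstancedProcedural(
            mesh, 0, material, new Bounds(Vector3.zero, Vector3.one * 1000f), MaxInstances);
    }

    void OnDisable()
    {
        if (instanceBuffer != null)
            instanceBuffer.Release();
        instanceBuffer = null;
    }
}
```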

    And if your cbuffer uses arrays, just remember it can only be 32k total size. Which may impact the max number of instances you could draw, if you are indexing into it with SV_InstanceID. So you may need a few draw calls to get through all your data.
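
    To put numbers on that limit, here is a small sketch in plain C# (no Unity dependency; the class and method names are hypothetical). It takes the 32 KB figure quoted in this thread as given and assumes each instance stores either one or two float4x4 matrices of 64 bytes each.

```csharp
using System;

// Hypothetical helper, not part of Unity: how many instances fit in one constant buffer.
public static class CBufferBudget
{
    public static int MaxInstances(int cbufferBytes, int bytesPerInstance)
    {
        return cbufferBytes / bytesPerInstance;
    }

    // Number of draw calls needed to cover all instances, rounding up.
    public static int DrawCalls(int totalInstances, int maxPerDraw)
    {
        return (totalInstances + maxPerDraw - 1) / maxPerDraw;
    }

    public static void Main()
    {
        const int cbufferBytes = 32 * 1024;          // limit quoted in the thread
        const int matrixBytes = 16 * sizeof(float);  // one float4x4 = 64 bytes

        int oneMatrix = MaxInstances(cbufferBytes, matrixBytes);        // 512
        int twoMatrices = MaxInstances(cbufferBytes, 2 * matrixBytes);  // 256

        Console.WriteLine($"objectToWorld only: {oneMatrix} instances per draw");
        Console.WriteLine($"objectToWorld + worldToObject: {twoMatrices} instances per draw");
        Console.WriteLine($"50,000 instances with both matrices: {DrawCalls(50000, twoMatrices)} draws");
    }
}
```

    With both matrices per instance, the 50,000 objects mentioned earlier in the thread would need 196 draws per pass under these assumptions.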
     
  8. JSmithIR

    JSmithIR

    Joined:
    Apr 13, 2023
    Posts:
    111
    Could you provide a possible solution for getting functionality similar to unity_WorldToObject when using DrawMeshInstancedIndirect()? I am using a custom shader and have figured out how to assign unique transforms for each instanced mesh, but have not found a way to replicate unity_WorldToObject for my instances. Thank you!