Graphics.DrawMeshInstanced

Discussion in 'Graphics for ECS' started by Arathorn_J, Jun 26, 2018.

  1. Arathorn_J

    Arathorn_J

    Joined:
    Jan 13, 2018
    Posts:
    47
    Is there an ETA, or at least a plan, for allowing calls to Graphics.DrawMeshInstanced from inside a job, or failing that, for passing a NativeArray of Matrix4x4 directly to the DrawMeshInstanced method?
     
    Last edited: Jun 26, 2018
  2. zulfajuniadi

    zulfajuniadi

    Joined:
    Nov 18, 2017
    Posts:
    117
    You can keep a Matrix4x4[1023] field ready on your class (1023 is the most instances a single DrawMeshInstanced call accepts) and just copy the values over on each update. The DrawMeshInstanced call takes the actual number of instances as a separate argument, so it's fine for the array to be larger than the data.
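
    A minimal sketch of that suggestion, assuming a MonoBehaviour with a mesh and an instancing-enabled material assigned in the Inspector. The class name, the fields and the way the buffer is filled (from child transforms) are illustrative assumptions, not code from this thread.

```csharp
using UnityEngine;

// Sketch only: reuse one managed buffer and tell Unity how many entries are valid.
public class InstancedDrawer : MonoBehaviour
{
    const int MaxInstancesPerCall = 1023;    // per-call limit of DrawMeshInstanced

    public Mesh mesh;                        // assumed to be assigned in the Inspector
    public Material material;                // assumed to have GPU instancing enabled

    // Allocated once, so no per-frame garbage is produced.
    readonly Matrix4x4[] buffer = new Matrix4x4[MaxInstancesPerCall];

    void Update()
    {
        // Hypothetical fill step: one instance per child transform.
        int liveCount = Mathf.Min(transform.childCount, MaxInstancesPerCall);
        for (int i = 0; i < liveCount; i++)
            buffer[i] = transform.GetChild(i).localToWorldMatrix;

        // Only the first liveCount entries are drawn; the rest of the buffer is ignored.
        if (liveCount > 0)
            Graphics.DrawMeshInstanced(mesh, 0, material, buffer, liveCount);
    }
}
```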
     
  3. Arathorn_J

    Arathorn_J

    Joined:
    Jan 13, 2018
    Posts:
    47
    Thanks for the reply. That's precisely what I'm doing in my code. The problem is that I've created a batched mesh animation system that lets me animate thousands of characters, but the bottleneck right now is the DrawMeshInstanced call, which has to be made outside the job, and copying from native arrays to my matrix array for the next frame takes a lot of time. I'm still able to get 30 FPS even with 100 thousand animated characters, but only by really minimizing draw calls, which limits the variety and realism of the characters. I saw in the ECS examples and in other posts that Unity was hinting at accepting native arrays or allowing Graphics methods inside the job system, but I haven't seen anything definite on when or if this is going to happen.
     
  4. zulfajuniadi

    zulfajuniadi

    Joined:
    Nov 18, 2017
    Posts:
    117
    You can try looking into DrawMeshInstancedIndirect. That API is a beast and can happily draw a hundred thousand agents without hurting the CPU. I drew 20,000 trees in one batch and it took less than 0.3ms every frame. Of course they were static so I only updated the cached tree transform when it is cut down.
     
  5. Arathorn_J

    Arathorn_J

    Joined:
    Jan 13, 2018
    Posts:
    47
    Oh yes, I played around with that a bit, but I'm not familiar enough with shaders, and most of the examples that used it with animations were not friendly to the job system at all. As an example, I have 30 statically baked meshes for my walk animation cycle. To create variety I have that broken up into 20 different cycles with 5 draw calls per cycle, so for my 100K example I have about 100 draw calls (DrawMeshInstanced), which makes the walk cycles look non-uniform. The drawing part takes 1-2 ms, but getting the data ready to pass to that system takes about 6 ms. If I could make the calls inside a job, or just pass a native array, I'm betting I could totally eliminate the 6 ms it takes to prep the data. You can imagine that once these characters have LOD cycles with different animations (walk, run, attack, death) plus different materials for more variety, it will slow down dramatically if I can't quickly pass data to the DrawMeshInstanced calls.
     
  6. zulfajuniadi

    zulfajuniadi

    Joined:
    Nov 18, 2017
    Posts:
    117
    Another way to go about it is to bake the animations into textures rather than meshes, and then set the animation index via material property blocks on the DrawMeshInstanced calls. In theory it should work, though I'm not too sure.
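
    As a rough illustration of the material property block idea, the sketch below passes one float per instance alongside the matrices. The property name _AnimIndex is hypothetical, and the shader is assumed to declare it as a per-instance property; neither comes from this thread.

```csharp
using UnityEngine;

// Sketch only: per-instance animation index supplied through a MaterialPropertyBlock.
public class AnimatedInstances : MonoBehaviour
{
    const int Count = 100;

    public Mesh mesh;            // assumed to be assigned in the Inspector
    public Material material;    // assumed: instancing enabled, shader reads _AnimIndex per instance

    readonly Matrix4x4[] matrices = new Matrix4x4[Count];
    readonly float[] animIndices = new float[Count];
    MaterialPropertyBlock block;

    void Start()
    {
        block = new MaterialPropertyBlock();
        for (int i = 0; i < Count; i++)
        {
            matrices[i] = Matrix4x4.Translate(new Vector3(i * 2f, 0f, 0f));
            animIndices[i] = i % 3;   // illustrative mapping: 0 = idle, 1 = walk, 2 = run
        }
    }

    void Update()
    {
        // Element i of the array belongs to instance i of this draw call.
        block.SetFloatArray("_AnimIndex", animIndices);
        Graphics.DrawMeshInstanced(mesh, 0, material, matrices, Count, block);
    }
}
```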

    So far I've only looked into the Austin Tech Demo and the Unity Animation Instancing repo to tackle the animation problem in ECS, but I have yet to fully test both methods to see which one works best.

    But I totally understand what you are going through.
     
    NathanJSmith likes this.
  7. Arathorn_J

    Arathorn_J

    Joined:
    Jan 13, 2018
    Posts:
    47
    Thanks for the great replies! Yes, the Austin Tech Demo is licensed in such a way that you can't use anything in it other than for inspiration, but it's a good reference. The Unity Animation Instancing repo demo you linked to was the one I had the most success with, though once I hit about 20,000 animations the frame rate plummeted. It does so much work on the CPU that it bogs the whole thing down: it was taking nearly 30 ms to finish the work it needed to do for the bones. It was still really impressive, but since my baked plan gets a higher frame rate with 5x the number of characters, I decided against it.

    The nice thing about the statically baked approach is that performance is pretty amazing, because you are doing no additional calculations on the GPU or CPU for skinning (the animations are pre-baked). The downside is that you really have to plan and keep a tight budget for the number of different units and animations you allow on screen at once, or you might find yourself with no memory left.

    I'm hoping we get some additional options here soon. I've been following your work on the nav agent system and it is really great. I think we'll see some outstanding simulations coming out of this software if all these tools are made available, even with the headaches of getting your mind working in a more ECS way. I'm really hopeful we see NativeArray options for the DrawMeshInstanced call (better yet, just call it from a job) so we can plan accordingly for future work. My project probably won't see beta till the end of the year, so I'm not too worried about the timeline, but it would be nice to know for architecture decisions whether it will be an option.
     
    zulfajuniadi likes this.
  8. Arathorn_J

    Arathorn_J

    Joined:
    Jan 13, 2018
    Posts:
    47
    This thread helped a lot, in case anyone else is looking, though the issue should be solvable without hacking things together once DrawMeshInstanced can be called from a job with a NativeArray.
     
    Babiole and zulfajuniadi like this.
  9. zulfajuniadi

    zulfajuniadi

    Joined:
    Nov 18, 2017
    Posts:
    117
  10. Soaryn

    Soaryn

    Joined:
    Apr 17, 2015
    Posts:
    325
    How would you animate them separately? Say one is running while the others are idle.
     
    lclemens likes this.
  11. zulfajuniadi

    zulfajuniadi

    Joined:
    Nov 18, 2017
    Posts:
    117
    @Soaryn I was planning to set the animation index via the material property block. It seems that the Graphics.MeshInstancedRenderer does accept that as one of its properties.
     
  12. Arathorn_J

    Arathorn_J

    Joined:
    Jan 13, 2018
    Posts:
    47
    Wow, that's fantastic. I was able to hit about 50 FPS before I left for summer vacation, but this might be a better example for getting the diversity I need. Packing into 2D textures also uses less memory and fewer draw calls than my texture baking does.
     
  13. zulfajuniadi

    zulfajuniadi

    Joined:
    Nov 18, 2017
    Posts:
    117
    I'll be tinkering overnight. I'll push the code and share the result afterwards. Right now it's not very user friendly. I imagine something where you reference a prefab and it does the pre-processing automatically. I might even attempt to make it render via DrawMeshInstancedIndirect to get better performance.
     
    lclemens and Arathorn_J like this.
  14. zulfajuniadi

    zulfajuniadi

    Joined:
    Nov 18, 2017
    Posts:
    117
    I've updated the repo. Now you can set each individual animation for each entity:

    upload_2018-7-6_9-57-15.png

    [Video]

    The project now has a window where you can drag in a skinned mesh prefab and it will generate the animation textures. Conversion is pretty slow, but I guess that's okay since you only have to do it once. Plus it does some nifty stuff like fixing the model's scale, rotation and pivot point.

    upload_2018-7-6_9-49-46.png

    The tool will merge all selected animations and bake the vertex and normal data into two textures:

    upload_2018-7-6_9-52-33.png

    Github: https://github.com/zulfajuniadi/Animation-Texture-Baker
     
  15. tertle

    tertle

    Joined:
    Jan 25, 2011
    Posts:
    3,609
    Ok. That's cool.
     
    zulfajuniadi likes this.
  16. Arathorn_J

    Arathorn_J

    Joined:
    Jan 13, 2018
    Posts:
    47
    That looks great! Good job! I won't be able to get back to work for a few days, but I really can't wait to dig into the details!
     
    zulfajuniadi likes this.
  17. zyc_dc

    zyc_dc

    Joined:
    May 11, 2018
    Posts:
    42
    Hey! Just tried your demo on 2018.2.0f2. It does not show anything except particle effects
     
  18. zulfajuniadi

    zulfajuniadi

    Joined:
    Nov 18, 2017
    Posts:
    117
    Sounds like a shader issue. What platform are you on? I tested it on 2018.2.0f1 on Windows 10 and it works fine.

    Yeah, it doesn't work on OSX. Like I suspected, it's a shader issue. I'm in the middle of a large upgrade to the system (adding state machines to the baked agents) and will look into making the shaders compatible with OpenGL and Metal.
     
    Last edited: Jul 13, 2018
  19. zyc_dc

    zyc_dc

    Joined:
    May 11, 2018
    Posts:
    42
    It seems like a shader issue. My platform is Windows 7. BTW, will it work on mobile?
     
  20. zulfajuniadi

    zulfajuniadi

    Joined:
    Nov 18, 2017
    Posts:
    117
    Right now it doesn't, as the shaders use compute buffers to store the state. I'm working on removing that so it can be compatible with lower shader targets.
     
    lclemens likes this.
  21. zyc_dc

    zyc_dc

    Joined:
    May 11, 2018
    Posts:
    42
    Cool! Look forward to it!
     
  22. drallcom3

    drallcom3

    Joined:
    Feb 12, 2017
    Posts:
    162
    I would love to see that.
     
    zulfajuniadi likes this.
  23. Spy-Shifty

    Spy-Shifty

    Joined:
    May 5, 2011
    Posts:
    546
    How many animations does your tool support, and how long can each animation be?
    There is no blending between animations, is there?
     
  24. zulfajuniadi

    zulfajuniadi

    Joined:
    Nov 18, 2017
    Posts:
    117
    As of now it's limited by the number of vertices in the model and the number of frames (animation duration * animation frame rate). As the data is stored inside a texture, it is also limited by the supported texture size. For a low-poly character of around 800 vertices, you can store all the basic animations such as idle, walk, get hit, attack and die in a 1024x1024 px texture.
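
    To put rough numbers on that limit, here is a back-of-the-envelope sketch. It assumes one texel per vertex per frame with tight packing, and a 30 FPS bake rate in the usage note; the baker's real layout may differ, and BakeBudget is a hypothetical helper name.

```csharp
using System;

// Sketch only: estimates how much animation fits in one baked texture.
public static class BakeBudget
{
    // Frames that fit if every vertex of every frame occupies one texel, tightly packed.
    public static int MaxFrames(int textureWidth, int textureHeight, int vertexCount)
    {
        return (textureWidth * textureHeight) / vertexCount;
    }

    // Seconds of animation those frames cover at a given bake rate.
    public static double MaxSeconds(int maxFrames, int bakeFps)
    {
        return (double)maxFrames / bakeFps;
    }
}
```

    Under those assumptions, BakeBudget.MaxFrames(1024, 1024, 800) gives 1310 frames, which at a 30 FPS bake rate is a little under 44 seconds shared across all clips.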

    No, there isn't any blending between animations yet. Though it is technically possible.
     
    lclemens, racer161 and Spy-Shifty like this.
  25. Danistmein

    Danistmein

    Joined:
    Nov 15, 2018
    Posts:
    82
    beautiful!
     
  26. Ylly

    Ylly

    Joined:
    May 15, 2015
    Posts:
    21
    I downloaded your project, but it seems to be an old version and lacks the functionality you described.
     
  27. tylo

    tylo

    Joined:
    Dec 7, 2009
    Posts:
    154
  28. gltfxe

    gltfxe

    Joined:
    Jan 10, 2020
    Posts:
    1
    I have a problem using Graphics.DrawMeshInstanced: the directional light can't be enabled. How can I use Graphics.DrawMeshInstanced with a realtime light?
     
  29. Arathorn_J

    Arathorn_J

    Joined:
    Jan 13, 2018
    Posts:
    47
    I'm not sure precisely what the issue would be. I usually use mixed lighting and bake the directional light, but I've used realtime lighting as well and haven't had an issue with DrawMeshInstanced. Could it be an issue with the particular shader you are using? My suggestion would be to create a clean scene, set up realtime lighting, and make sure a simple script that draws a couple of instances of a mesh works as expected.
     
  30. lclemens

    lclemens

    Joined:
    Feb 15, 2020
    Posts:
    703
    I'm curious... So why are there so many gpu mesh/texture animation baking repos from 2 to 3 years ago and then suddenly all the progress on them stopped? I've spent the past two days downloading and playing with every gpu mesh/texture baking git repo I could find. Pretty much all of them are 2 to 4 years old - even the ones from Unity, so I was unable to get the majority of them updated to editor 2020.1.17 with the latest packages. Is it because people are using the new DOTS animation package instead ( https://docs.unity3d.com/Packages/com.unity.animation@0.8/manual/index.html )? Does it do gpu baking or run as fast as the gpu instanced mesh/texture techniques discussed in this thread?

    Hey zulfajuniadi, I've been messing around with your Animation-Texture-Baker repo - very cool!! Even though it hasn't been updated in 2 years, so far yours is quite complete and looks the most promising. I was able to update the project to use editor version 2020.1.17 and run the examples.

    I have a couple of quick questions though...

    I'm unable to build for mobile devices right now because of bugs in Hybrid Renderer V2, but I'm hoping to use mobile if/when Unity fixes it. Did you ever get it working on mobile devices?

    Sometimes I get this error... "A Hybrid Renderer V2 batch is using a pass from the shader "Unlit/TextureAnimPlayer", which is not SRP batcher compatible. Only SRP batcher compatible passes are supported with the Hybrid Renderer." I haven't quite figured out the pattern on why I get it sometimes and other times I don't.

    One question I have is what is the intended procedure for changing animations during gameplay (idle, run, attack, etc)? I'll be using ECS... so I guess in a system I would swap out the material whenever I need to change the animation? Or is there a way to build all the animations into one material and change the animations by modifying a material property?
     
    elJoel likes this.
  31. elJoel

    elJoel

    Joined:
    Sep 7, 2016
    Posts:
    125
    That sounds awesome. Any chance you could upload the changes to a separate branch?

    What fixed that issue for me sometimes is just opening the shader and saving it. If it's still not SRP batcher compatible try adding:

    Code (CSharp):
    #pragma multi_compile _ DOTS_INSTANCING_ON

    and set this in the shader passes:
    Code (CSharp):
    #pragma target 4.5
     
    lclemens likes this.
  32. nyanpath

    nyanpath

    Joined:
    Feb 9, 2018
    Posts:
    77
    How would one set up a system that can handle different entities using this?

    So far it's been only one object x amount of times, but I want x types of objects y amount of times. I've tried to make something like this for my own game to handle different kinds of vehicle wheels, doors, etc. of moving objects that only need a visual representation. Setting up the data structure to use with the Job System has been a challenge, because Graphics needs Matrix4x4 instead of float4x4 (Matrix4x4 seems to come with extra read costs) and because I am still new to this. I ended up making a 2D array and feeding each part as a NativeArray to an IJobFor, but then I saw that .GetSubArray() exists and have been wondering if I should use that somehow. There is also SharedArray, which looks promising but is still only a temporary solution:

    https://github.com/stella3d/SharedArray
     
  33. Guedez

    Guedez

    Joined:
    Jun 1, 2012
    Posts:
    821
    I have a similar system for my crops, but I don't think it can handle more than 20k at 60fps. I will have to check this out after I actually have a game and more time for unnecessary optimizations. Can't wait
     
  34. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    3,916
    They all stopped because that was around the time the graphics pipeline fragmentation got really bad. However, the upstream of the repo you mentioned is using ShaderGraph now. https://github.com/sugi-cho/Animation-Texture-Baker/blob/master/Readme.md

    Before HR V2 and the animation package, people were doing texture animation based on the Nordeus demo. This was my favorite implementation: https://github.com/joeante/Unity.GPUAnimation
    However, that repo isn't compatible with HR V2. Migrating it shouldn't be difficult if you understand it, and I am planning to do that work after my next framework release in January assuming someone doesn't beat me to it. However, it is still suboptimal and making it truly optimal would require some major changes to HR V2 and probably the DOTS SRP Batcher backend (or at least I would have to see its implementation to get it right).

    So DOTS animation has a bone matrix array for each instance (I'm not sure if this is pre or post culling) that is calculated on the CPU and transferred to the GPU. Then the compute skinner computes the transformed mesh in local space for each instance (yes it is fairly memory-intensive and I suspect Unity is using some technique to recycle memory) using the bone matrices as well as blend shapes. Finally, the skinned meshes are drawn probably using some low-level multi-draw API.

    On the other hand, with texture-skinning, most implementations are either storing mesh animations in textures, or storing bone animations and computing bone interpolation and blending for every vertex. "Ideally" you would have a compute pass after culling to compute bone matrices for each visible instance, and then process the skinning in the vertex shader so that you don't bloat your memory demand any further.

    I put "Ideally" in quotes as there may be better options for low poly, high shadow counts, and/or primitive shaders.
     
    lclemens likes this.
  35. Arathorn_J

    Arathorn_J

    Joined:
    Jan 13, 2018
    Posts:
    47
    I've refactored my personal solution several times, and actually got it working with HDRP and shader graph using some custom scripts in shadergraph.

    It is a lot of work to create when you get into utilizing the system, but when you can render a couple hundred thousand (with attachments) independently animated characters at 30-60 FPS (depending on gfx card) it feels pretty worth it.

    I had to write a bunch of tools, which took a couple of months, to get the baking just the way I wanted. After struggling to get the Editor to play the animations properly, I ended up just writing tools that ran in play mode. Originally I was baking all the vertex data at each animation frame to a texture, but I switched to bone data because the memory use got pretty steep and you had to generate a texture for each LOD. Baking just the bones takes a lot less memory but is more work on the GPU, because you have to run each vert through the bone matrix, and if more than one bone influences that vert (up to 4) it can really cost performance. With LOD options I can tune that so 4 bones are only used on the closest characters.
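
    For readers unfamiliar with why extra bone influences cost more, here is a CPU-side sketch of the per-vertex work (linear blend skinning). It is an illustration rather than the shader described in this post, and it uses System.Numerics types instead of Unity's so that it runs outside the engine; SkinningSketch and its parameters are hypothetical names.

```csharp
using System.Numerics;

// Sketch only: the weighted sum a skinning shader evaluates for every vertex.
public static class SkinningSketch
{
    // Each influencing bone transforms the rest-pose position, and the results
    // are blended by weight. More influences means more matrix transforms per vertex.
    public static Vector3 Skin(Vector3 restPosition, Matrix4x4[] boneMatrices,
                               int[] boneIndices, float[] weights)
    {
        Vector3 result = Vector3.Zero;
        for (int i = 0; i < boneIndices.Length; i++)
        {
            if (weights[i] == 0f) continue;   // unused influence slot
            Vector3 moved = Vector3.Transform(restPosition, boneMatrices[boneIndices[i]]);
            result += weights[i] * moved;
        }
        return result;
    }
}
```

    Capping the number of influences on distant LODs, as described above, shortens that loop for most of the vertices on screen.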

    I have a custom set of jobs that take character velocity and actions and then call instancing scripts that build the data needed for the Graphics.DrawMeshInstanced calls. So the character AI just has different flags like "Attacking", "Dying", "Falling", "Climbing", etc., and then some short blending depending on the type of animation transition.

    The biggest CPU overhead in my animation system is calling Graphics.DrawMeshInstanced. Because you can't do that inside a job, you have to finish filling a native array with all the data the call needs and then transfer it to a standard array, and the overhead for that is pretty substantial. With a couple hundred thousand characters, a Burst job can get everything ready in 1-2 ms (depending on the number of cores), but it can then take another 8 ms to copy the arrays and call DrawMeshInstanced. Luckily I can kick off some other jobs while the main thread makes all its DrawMeshInstanced calls.

    My original ask in this thread was hoping that Unity would make the Graphics methods callable inside of jobs. But alas, I haven't seen an ETA on when this would happen, but it would dramatically lower my CPU utilization for animations.
     
    Krajca and lclemens like this.
  36. lclemens

    lclemens

    Joined:
    Feb 15, 2020
    Posts:
    703
    Thanks for the shader tip! I'll give it a shot. Yes, I would be happy to push a separate branch in a couple of days. I added a new scene for benchmarking entity spawning and I'm still in the middle of cleaning that up. I'm also going to try to upgrade it to 2020.2.0, since that appears to have come out of beta today.

    I'm still trying to figure out how to change animations at runtime... currently whenever I bake a model that has 4 animations, it creates 4 separate textures. zulfajuniadi wrote "you can store all the basic animations such as idle, walk, get hit, attack, die in a 1024x1024 px texture". This makes me think that there must be a way to get all the animations in one texture so I'll be trying to figure that out this evening.

    When I instanced 100,000 of those horses as Entities (HRV2 + URP) such that 1/3 were running, 1/3 walking, and 1/3 idle, I got around 30fps in a release build on a laptop with a mobile RTX 2060. It's not as good as the 50 to 60fps that Arathorn_J and zulfajuniadi were getting, but they might have been using better hardware.
     
  37. Arathorn_J

    Arathorn_J

    Joined:
    Jan 13, 2018
    Posts:
    47
    I've been running on a GTX 1080. But it really comes down to the detail, I have LOD (0-5) on my objects so the smallest ones are like 300-600 verts.

    I store a scriptable object that has the start and length of each animation, so the texture has all the animations and I just have to reference the scriptable object to know when the animation starts and how long it will run to know the position in the texture without having to do a separate texture for each animation.

    So say my first animation is idle and then the second is walk :
    FPS 60
    idle start row 0 length 1.5 seconds
    next animation would start at row 90
    walk start row 90 length .5 seconds
    next animation would start at row 120
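
    The layout above can be sketched in a few lines. This is not the author's code: ClipRange stands in for the entries of the scriptable object, and it assumes one texture row per baked frame and looping playback.

```csharp
using System;

// Hypothetical stand-in for one entry of the scriptable object described above.
public struct ClipRange
{
    public int StartRow;   // first texture row of the clip
    public int RowCount;   // one row per baked frame

    public ClipRange(int startRow, int rowCount)
    {
        StartRow = startRow;
        RowCount = rowCount;
    }
}

public static class AnimationAtlas
{
    // Lay clips out back to back: each clip starts where the previous one ended.
    public static ClipRange[] Layout(int bakeFps, params double[] clipSeconds)
    {
        var ranges = new ClipRange[clipSeconds.Length];
        int nextRow = 0;
        for (int i = 0; i < clipSeconds.Length; i++)
        {
            int rows = (int)Math.Round(clipSeconds[i] * bakeFps);
            ranges[i] = new ClipRange(nextRow, rows);
            nextRow += rows;
        }
        return ranges;
    }

    // Texture row to sample for a looping clip at a given playback time.
    public static int RowAt(ClipRange clip, double timeSeconds, int bakeFps)
    {
        int frame = (int)Math.Floor(timeSeconds * bakeFps) % clip.RowCount;
        return clip.StartRow + frame;
    }
}
```

    With AnimationAtlas.Layout(60, 1.5, 0.5), idle covers rows 0 to 89 and walk covers rows 90 to 119, matching the numbers in the post. A quarter of a second into the walk clip samples row 105.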
     
  38. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    3,916
  39. lclemens

    lclemens

    Joined:
    Feb 15, 2020
    Posts:
    703
    I updated the Animation Texture Baker project to 2020.2 and got URP + HRV2 working! I had to make 6 or 7 changes to the shader (including the fix recommended by @elJoel )... which I completely pulled out of my a** because I don't know how to write HLSL or any other shader language. I made a branch and sent a pull request, but it's hard to say when zulfajuniadi will see it. I forked it here: https://github.com/Ph0t0nX/Animation-Texture-Baker/tree/editor-update-ecs-hrv2 .

    upload_2020-12-16_3-9-31.png

    BTW, that horse has 1,518 verts with no LODs, so 22fps seems reasonable for 100k entities.

    Unfortunately, I was unable to figure out how to pack all the animations into a single texture like Arathorn_J is describing. As far as I can tell, the Animation Texture Baker project doesn't pack the animations (it builds them into one material for each animation), and it would take me a long time to figure out how to modify the animation baker to combine animations. @Arathorn_J - is there a repo with your technique available?

    @DreamingImLatios - I also played around with the joeante Unity.GPUAnimation demo, and I didn't know that there are updated branches either :). It's a cool technique, but there's one comment in there that scares me: "The conversion pipeline currently creates an entity with transform hierarchy components for each bone in the hierarchy below the character. In this case none of it is used and it is simply wasted. This makes instantiation slow and causes massive overhead at runtime to keep all these unused transform nodes up to date". That statement, plus the fact that it doesn't work with HRV2, makes me think it would be quite a bit of work (for me, anyway) to work out the kinks.

    From DreamingImLatios 's description of the DOTS animation package, it sounds like it could be pretty fast... maybe comparable to these baking techniques? Has anyone tried to benchmark it with 100,000 animations? They have a pretty big warning on the readme about it being experimental and not even close to release. I know @nicolasgramlich was using it for his game.
     
    cultureulterior, elJoel and jdtec like this.
  40. Baggers_

    Baggers_

    Joined:
    Sep 10, 2017
    Posts:
    97
    Just a quick kickback to the original question regarding DrawMeshInstanced and NativeArrays, if you are ok with some unsafe code and really need to avoid the copy from a native array to the managed one you can do this:

    Code (CSharp):
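    // Pin the managed array so the GC cannot move it, then wrap the same memory in a NativeArray view.
    var managedTransforms = new Matrix4x4[len];
    var dataPtr = UnsafeUtility.PinGCArrayAndGetDataAddress(managedTransforms, out _gcHandle);
    // The length argument is an element count, not a byte size.
    var nativeTransforms = NativeArrayUnsafeUtility.ConvertExistingDataToNativeArray<Matrix4x4>(dataPtr, len, Allocator.None);
    // With collections safety checks enabled, the view also needs an AtomicSafetyHandle before a job can use it.
    You can then pass the native view over the array to a job and populate it from there. Naturally, once you are done you'll need to unpin the managed array (UnsafeUtility.ReleaseGCObject with the handle from the pin call).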
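
    Put together, the whole lifecycle might look like the sketch below. It assumes unsafe code is allowed for the assembly; the class name, the fields and the placeholder FillJob are illustrative and not part of the original post.

```csharp
using Unity.Collections;
using Unity.Collections.LowLevel.Unsafe;
using Unity.Jobs;
using UnityEngine;

// Sketch only: a job writes straight into the managed array that DrawMeshInstanced reads.
public unsafe class PinnedMatrixDrawer : MonoBehaviour
{
    const int Count = 1023;

    public Mesh mesh;            // assumed to be assigned in the Inspector
    public Material material;    // assumed to have GPU instancing enabled

    Matrix4x4[] managed;
    NativeArray<Matrix4x4> view; // aliases the memory of 'managed', owns nothing
    ulong gcHandle;
#if ENABLE_UNITY_COLLECTIONS_CHECKS
    AtomicSafetyHandle safety;
#endif

    // Placeholder job: spaces the instances out along the X axis.
    struct FillJob : IJobParallelFor
    {
        public NativeArray<Matrix4x4> Output;

        public void Execute(int i)
        {
            Output[i] = Matrix4x4.Translate(new Vector3(i, 0f, 0f));
        }
    }

    void OnEnable()
    {
        managed = new Matrix4x4[Count];
        void* ptr = UnsafeUtility.PinGCArrayAndGetDataAddress(managed, out gcHandle);
        // Length is given in elements.
        view = NativeArrayUnsafeUtility.ConvertExistingDataToNativeArray<Matrix4x4>(ptr, Count, Allocator.None);
#if ENABLE_UNITY_COLLECTIONS_CHECKS
        safety = AtomicSafetyHandle.Create();
        NativeArrayUnsafeUtility.SetAtomicSafetyHandle(ref view, safety);
#endif
    }

    void Update()
    {
        new FillJob { Output = view }.Schedule(Count, 64).Complete();
        // No copy step: the job already wrote into 'managed' through the view.
        Graphics.DrawMeshInstanced(mesh, 0, material, managed, Count);
    }

    void OnDisable()
    {
#if ENABLE_UNITY_COLLECTIONS_CHECKS
        AtomicSafetyHandle.Release(safety);
#endif
        UnsafeUtility.ReleaseGCObject(gcHandle);   // unpin so the GC can manage the array again
    }
}
```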
     
    lclemens likes this.
  41. jdtec

    jdtec

    Joined:
    Oct 25, 2017
    Posts:
    296
    Major kudos to you and thanks for posting this. It seems to me to be the most straightforward minimalist example that gets the job done for HRV2 from which to build upon. I like that it uses a very simple text shader to show how it works.
     
  42. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    3,916
    The extra entities only affect CPU performance, so it won't matter if you are GPU-bound like running simple experiments to compare performance. Also, for runtime conversion you can get rid of the extra entities with the Convert to Entity (Stop). I forget what the technique is for subscenes, but it might just be as simple as deleting the converted entities in a GameObjectConversionSystem.
    I haven't had the chance to test this version, but did you actually try it with HR V2?
     
  43. iamarugin

    iamarugin

    Joined:
    Dec 17, 2014
    Posts:
    857

    It can be done more simply:
    Code (CSharp):
    // Matrix4x4[] matrices;
    // NativeArray<float4x4> nativeArray;
    var nativeMatrices = nativeArray.Reinterpret<Matrix4x4>();
    NativeArray<Matrix4x4>.Copy(nativeMatrices, beginIndex, matrices, 0, length);
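
    Extending that snippet to more than 1023 instances could look like the sketch below, which copies and draws one chunk at a time. The helper name is hypothetical, and reusing a single scratch array between calls assumes Unity captures the matrices at the time of each call.

```csharp
using Unity.Collections;
using Unity.Mathematics;
using UnityEngine;

// Sketch only: draw an arbitrarily long NativeArray<float4x4> in chunks of at most 1023.
public static class InstancedBatchDrawer
{
    const int MaxPerCall = 1023;   // per-call limit of DrawMeshInstanced

    // Reused across calls to avoid per-frame allocations.
    static readonly Matrix4x4[] scratch = new Matrix4x4[MaxPerCall];

    public static void Draw(Mesh mesh, Material material, NativeArray<float4x4> transforms)
    {
        // Same size and layout, so this is a view over the same memory, not a copy.
        NativeArray<Matrix4x4> matrices = transforms.Reinterpret<Matrix4x4>();

        for (int begin = 0; begin < matrices.Length; begin += MaxPerCall)
        {
            int length = math.min(MaxPerCall, matrices.Length - begin);
            NativeArray<Matrix4x4>.Copy(matrices, begin, scratch, 0, length);
            Graphics.DrawMeshInstanced(mesh, 0, material, scratch, length);
        }
    }
}
```

    This still pays for one copy per chunk on the main thread, which is the cost discussed throughout this thread.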
     
  44. Arathorn_J

    Arathorn_J

    Joined:
    Jan 13, 2018
    Posts:
    47
    I've got that working well for copying, though I have to pass through Material Property Blocks as well, all sorts of things that have to be done outside of jobs.

    Though I just discovered a bug in HDRP and Graphics.DrawMeshInstanced that I'm wondering if anyone else can replicate, I logged the issue with Unity in a bug report. Essentially Graphics.DrawMeshInstanced in the latest version of HDRP doesn't receive shadows or properly light surfaces, it only seems to cast shadows. This is with HDRP 10.2 and Unity 2020.2.0f1.

    Just create a primitive cube with a new material that has instancing turned on, make sure something is over it to partially shadow the cube and have this script on the primitive cube.

    Code (CSharp):
    using System.Collections;
    using System.Collections.Generic;
    using UnityEngine;

    public class InstanceThisObject : MonoBehaviour
    {
        MeshRenderer MyMeshRenderer;
        MeshFilter MyMeshFilter;
        Matrix4x4[] MyMatrixArray;
        public int lightLayer = 0;
        // Start is called before the first frame update
        void Start()
        {
            MyMeshRenderer = GetComponent<MeshRenderer>();
            MyMeshRenderer.enabled = false;
            MyMeshFilter = GetComponent<MeshFilter>();

            MyMatrixArray = new Matrix4x4[1];
        }

        // Update is called once per frame
        void Update()
        {
            MyMatrixArray[0] = transform.localToWorldMatrix;
            Graphics.DrawMeshInstanced(MyMeshFilter.sharedMesh, 0, MyMeshRenderer.material, MyMatrixArray, 1, null, UnityEngine.Rendering.ShadowCastingMode.On, true, 0, Camera.current);
        }
    }

    Before pressing play and running script

    upload_2020-12-16_10-39-36.png

    After pressing play

    upload_2020-12-16_10-40-36.png
     
  45. iamarugin

    iamarugin

    Joined:
    Dec 17, 2014
    Posts:
    857
  46. Arathorn_J

    Arathorn_J

    Joined:
    Jan 13, 2018
    Posts:
    47
  47. Baggers_

    Baggers_

    Joined:
    Sep 10, 2017
    Posts:
    97
    @iamarugin Unless I'm missing something that doesn't appear to avoid the copy. In the version I was suggesting the ConvertExistingDataToNativeArray allows you to get a native view onto the managed memory which can then be written to directly. This means you can pass the managed array to DrawMeshInstanced without first copying the values from the native array to the managed one.
     
    lclemens likes this.
  48. thelebaron

    thelebaron

    Joined:
    Jun 2, 2013
    Posts:
    822
    The animation package seems to choke depending on the complexity of the model. I was able to get 1400 very simple mobile characters going in one case, but after swapping in another, slightly more complex model I could only get 200-300.

    There are some stress tests under the StressTests folder for either render pipeline. I built the HDRP PerformanceNClipsAndNMixer test, which shows 256 terraformers, and it would be pretty interesting to see what performance people get on their hardware. I get 35fps with a 1070 + 8700K at 1440p:
    https://www.dropbox.com/s/a0fm7i5tn9tn70d/AnimationStressTest256_il2cpp.zip?dl=0
     
    DreamingImLatios and lclemens like this.
  49. Arathorn_J

    Arathorn_J

    Joined:
    Jan 13, 2018
    Posts:
    47

    Is this with burst enabled or off?
     
  50. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    3,916
    My old R9 390 gets 32 FPS at 1080p. I'm completely GPU-bound. No VRAM swapping.

    Something is very, very wrong with this skinning process. The whole point of texture animation is to sacrifice GPU performance to reduce CPU load and data transfer bandwidth. Neither of those things are a bottleneck though, so I would expect better performance.

    Anyways, I don't really have the time to set up all my debugging tools, but here are my suspicions:
    1) The skinning compute kernel is getting dispatched in a loop for each RenderMesh with skinning. This is prior to HDRP execution and there might be some horrific scheduling happening on the native side.
    2) I haven't investigated the details, but the striding of the buffers and indices looked odd to me. I'm suspicious there might be some cache thrashing.
    3) I didn't look too closely to how values were being read and outputted. I suspect there's some weird resource binding behavior there because something didn't look right, although I am blanking on what that might be.

    It could also be the instantiator or the vertex and fragment shader causing problems. RenderDoc is suggesting that the HR V2 compute shaders, the shadow pass, and the rendering pass all take roughly equal amount of time. I don't really trust RenderDoc though.
     
    thelebaron likes this.