
Official GPU Driven Rendering In Unity

Discussion in 'Unity 6 Beta' started by Tim-C, Oct 6, 2023.

  1. LYHyper

    LYHyper

    Joined:
    Oct 11, 2023
    Posts:
    11
    You can call Apply(makeNoLongerReadable: true) to release the Texture2DArray's CPU copy from RAM while keeping it in VRAM. That will not cause white or fuchsia textures.
     
    adamgolden likes this.
  2. lolium

    lolium

    Joined:
    Oct 14, 2014
    Posts:
    34
    3.0a15 is out. I've been updating my bug report, but I'm posting here for visibility... It's very strange that with GPU Resident Drawer turned on, the frame time goes into 300 ms territory. Inspecting the profiler, almost all of the extra time is spent in shadow maps.
    Below are the profiles, GPU vs. CPU.
    10600k, 64GB, 3070Ti
    CPU:
    upload_2023-11-29_11-48-25.png

    GPU:


    Additionally, I have a second machine with a 7700k and a 1080Ti.
    The performance with GPU is also much better there, better than on my 3070Ti! I'm pretty baffled.

    1080Ti machine + GPU
     
  3. bnmguy

    bnmguy

    Joined:
    Oct 31, 2020
    Posts:
    137
    Right, but this makes mip streaming impossible and the whole full res texture would be loaded to memory, which isn't always feasible for the memory footprint. Hence the issues Unity is currently experiencing.
     
    Last edited: Dec 7, 2023
  4. echu33

    echu33

    Joined:
    Oct 30, 2020
    Posts:
    68
    I tried it on Japanese Street Scene.

    The framerate has indeed improved with resident drawer, but not by a large margin. I'd expected more? (Or maybe my scene just wasn't CPU bound to begin with, so the improvement doesn't look that obvious.)

    Here's a screenshot comparing the profiler captures.
    - blue is the capture with resident drawer disabled.
    - orange is the capture with resident drawer enabled.

    Stats after enabling resident drawer:
    It's around a 14% fps gain on my machine (RTX 3060).
    - the Stats window shows fps improving from 220ish to 250ish
    - batch count down to 1384 from 2454
    - SetPass calls cut to 42 from 87
    - shadow casters cut to 1001 from 1445
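An FPS gain and a frame-time saving are not the same percentage, so it can help to convert between the two when reading stats like these. A minimal sketch in Python that uses only the figures quoted above; the helper names are made up for illustration:

```python
def frame_time_ms(fps: float) -> float:
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


def percent_change(before: float, after: float) -> float:
    """Signed percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


# Figures quoted in the post (resident drawer off -> on).
fps_off, fps_on = 220.0, 250.0

fps_gain = percent_change(fps_off, fps_on)
time_saved_ms = frame_time_ms(fps_off) - frame_time_ms(fps_on)
frame_time_change = percent_change(frame_time_ms(fps_off), frame_time_ms(fps_on))

print(f"FPS gain:          {fps_gain:.1f}%")
print(f"Frame time saved:  {time_saved_ms:.2f} ms")
print(f"Frame time change: {frame_time_change:.1f}%")
print(f"Batches:           {percent_change(2454, 1384):.1f}%")
print(f"SetPass calls:     {percent_change(87, 42):.1f}%")
print(f"Shadow casters:    {percent_change(1445, 1001):.1f}%")
```

With the rounded 220 and 250 fps figures this works out to roughly a 13.6% FPS gain, which is about 0.55 ms saved per frame, or a 12% shorter frame.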

    Most of the objects in the scene have a LODGroup. Will resident drawer work nicely with LODGroup as well?

    upload_2023-12-2_20-24-36.png

    Here's a screenshot of the frame debugger. The scene has no static meshes, and the Hybrid BatchGroup seems to work as intended.

    upload_2023-12-2_20-31-33.png
     
    Last edited: Dec 3, 2023
  5. Neonage

    Neonage

    Joined:
    May 22, 2020
    Posts:
    288
    Found a bug in contextual prefab mode - active prefab meshes are rendered two times, both by regular rendering and Instanced Drawer. It is visible when making visual changes, as Instanced Drawer doesn't update these meshes until prefab changes are saved.
     
  6. retired_unity_saga

    retired_unity_saga

    Joined:
    Sep 17, 2016
    Posts:
    290
    Hello, where in the documentation can I find the newly included GPU Resident Drawer feature "Small-Mesh Screen-Percentage"? I cannot find this concept anywhere on the net or in the documentation. I want to see whether it can benefit my use cases, so I can run benchmarks appropriate to the feature set.

    Also, I was able to test the config and it works, but only if you update the setting and then manually select "Edit > Rendering > Generate shader includes".
     
  7. lolium

    lolium

    Joined:
    Oct 14, 2014
    Posts:
    34
    Any update on GPU occlusion culling?
     
    Genebris and mgear like this.
  8. Onigiri

    Onigiri

    Joined:
    Aug 10, 2014
    Posts:
    497
    Already in alpha
     
  9. XCO

    XCO

    Joined:
    Nov 17, 2012
    Posts:
    382
    I would love a tutorial video :)
     
  10. jiraphatK

    jiraphatK

    Joined:
    Sep 29, 2018
    Posts:
    306
    Doesn't seem to work for me. Also, holy s, the alpha keeps crashing and barely works. It's really hard to test stuff.
     
  11. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    For a scene like this I would consider 14% of total frame time to be pretty good. It might also be worth looking into things like main thread time saved and similar :)
     
    heartingNinja and echu33 like this.
  12. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    Thanks, can you make sure you log a bug and report the number here?
     
  13. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    We have not written the docs for this one yet but I'll write up some small notes here:

    When you use BRG there is now a new option that can take effect called 'small mesh culling'. This basically inserts an automatic LOD group, so that when the object is below a certain screen percentage size it is culled. If you are manually adding LOD groups, this setting is ignored for those objects.

    Basically a way to auto-cull small meshes on the screen that don't really contribute to the frame. We had good results with this in the fantasy kingdom demo.
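The exact metric Unity uses is not given in this thread. As a rough mental model only, the sketch below assumes the "screen percentage" is the object's projected height relative to the viewport height under a perspective camera. That formula, the 60 degree field of view, and all function names are assumptions for illustration, not the engine's confirmed behaviour.

```python
import math


def screen_height_fraction(size_m: float, distance_m: float, vertical_fov_deg: float) -> float:
    """Approximate fraction of the viewport height covered by an object.

    Assumes a perspective camera and treats `size_m` as the object's
    vertical extent (hypothetical model, not Unity's code).
    """
    visible_height_m = 2.0 * distance_m * math.tan(math.radians(vertical_fov_deg) / 2.0)
    return size_m / visible_height_m


def is_small_mesh_culled(size_m: float, distance_m: float,
                         vertical_fov_deg: float, threshold_percent: float) -> bool:
    """True when the object falls below the screen-percentage threshold."""
    percent = screen_height_fraction(size_m, distance_m, vertical_fov_deg) * 100.0
    return percent < threshold_percent


def cull_distance(size_m: float, vertical_fov_deg: float, threshold_percent: float) -> float:
    """Distance beyond which an object of this size drops under the threshold."""
    tan_half_fov = math.tan(math.radians(vertical_fov_deg) / 2.0)
    return size_m / (2.0 * tan_half_fov * threshold_percent / 100.0)


# Example: 1 m and 2 m objects seen from 200 m with an assumed 60 degree vertical FOV.
for size in (1.0, 2.0):
    percent = screen_height_fraction(size, 200.0, 60.0) * 100.0
    print(f"{size:.0f} m object at 200 m covers {percent:.3f}% of the screen height")
```

Under these assumptions, a 1 m object at 200 m covers about 0.43% of the screen height and a 2 m object about 0.87%, so a threshold of 0.5 would cull the first and keep the second. Treat the numbers as a starting point for experiments rather than a prediction of what the setting does.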
     
    colin299, saskenergy, Rewaken and 2 others like this.
  14. sqallpl

    sqallpl

    Joined:
    Oct 22, 2013
    Posts:
    384
    Will it be possible to use GPU culling with RenderMesh functions like RenderMeshIndirect? I assume it won't work out of the box, but maybe it will be possible to use it somehow? Like using occluders that are rendered with GPU Resident Drawer for culling instances that are rendered with RenderMeshIndirect.

    Basically I'm looking for an option to use GPU Culling (and possibly some other actual and future benefits of the whole GPU rendering feature set) without using GameObjects for foliage/small detail objects because of the number of instances (now I'm using RenderMeshIndirect for these layers).

    Maybe moving these layers from RenderMeshIndirect to BatchRendererGroup will be a good option in this case?
     
    Last edited: Dec 11, 2023
  15. mgear

    mgear

    Joined:
    Aug 3, 2010
    Posts:
    9,503
    This is a great feature! (helps with cad models, without having to do any extra manual work).

    For GPU culling, can I trust the Unity Game view Stats window?
    (To check whether objects/vertices actually decrease due to culling. In a quick test it sometimes seems bugged or not updating.)
     
    jiraphatK likes this.
  16. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    We have some additional debug accessible via the rendering debugger:
    Screenshot 2023-12-11 at 13.18.20.png

    Looks kind of like this:
    Screenshot 2023-12-11 at 13.25.26.png


    This will add some overlays, and there is a stats option to see the specific number of occlusion-culled objects per view. Just a note: I think there might be a bug in the stats view on the latest alpha on URP (getting an exception). We are investigating this internally.
     
    Shikoq, mariandev and Reanimate_L like this.
  17. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    We plan to add an API that will allow you to inject and manage objects manually into the resident drawer which don't need gameobjects (for things like foliage and similar). We don't plan on making it work with existing calls like 'RenderMeshIndirect' currently.
     
    Shikoq, saskenergy, joshcamas and 3 others like this.
  18. sqallpl

    sqallpl

    Joined:
    Oct 22, 2013
    Posts:
    384
    Great to hear that! Sounds like a perfect option when GameObjects are not needed.

    1. As for now, will using BatchRendererGroup work with GPU culling out of the box? I mean, will the rendered instances be occluded and/or be used as occluders?

    2. Does the GPU culling work with alpha cutout materials? Probably the most common use cases are tree branches and ground foliage.

    3. Will it be possible to use LODs for objects that are injected with the API?
     
    Last edited: Dec 11, 2023
  19. retired_unity_saga

    retired_unity_saga

    Joined:
    Sep 17, 2016
    Posts:
    290
    woah

    ok nice

    I hope to see a quick note on how to set this up for most use cases in the future. I'll run benchmark tests in the meantime.

    Could you perhaps give us a rough guide, based on 1 m and 2 m meshes (cubes, for instance), to the best settings for culling objects more than 200 m away from the camera? I'm not sure where to begin when creating effective tests for this, or what value to start with for a reference mesh of a given size that should be culled at a given distance from the camera.

    I'll keep trying in the meantime.
     
  20. Crazy34

    Crazy34

    Joined:
    Jul 14, 2019
    Posts:
    75
    Yeah, I think these promised features will be enough for me to update a big project in unity to 2023 by accepting the new pricing. :D

    We expect it to stabilise as soon as possible. Good work.
     
    retired_unity_saga likes this.
  21. retired_unity_saga

    retired_unity_saga

    Joined:
    Sep 17, 2016
    Posts:
    290
    OK, tests show this working out pretty great (0.5-0.8, 1M sizes).

    Is there a way we can get some kind of API call to check, per frame, whether a mesh has been culled by this?

    Either way, great work.
     
  22. jbooth

    jbooth

    Joined:
    Jan 6, 2014
    Posts:
    5,461
    This will be super useful and I'd love to see that API added sooner than later, but wouldn't DMI approaches still be more appropriate in these cases? Particularly when dealing with extremely large instance counts? My understanding of the resident drawer stuff is that it's still based on storing the matrix list in a cbuffer, and thus still limited to 1023 instances per draw call.
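Taking the 1023-instances-per-draw limit quoted above at face value (the thread does not confirm it for the resident drawer), the batching cost is simple arithmetic. A small sketch, where the 300,000 instance count is a hypothetical vegetation figure and the function names are made up:

```python
import math

MAX_INSTANCES_PER_DRAW = 1023  # limit quoted in the post, taken as an assumption


def draw_calls_needed(instance_count: int, per_draw: int = MAX_INSTANCES_PER_DRAW) -> int:
    """Number of instanced draw calls needed to submit `instance_count` instances."""
    return math.ceil(instance_count / per_draw)


def split_into_batches(instance_count: int, per_draw: int = MAX_INSTANCES_PER_DRAW):
    """Yield (start_index, count) ranges, one per draw call."""
    for start in range(0, instance_count, per_draw):
        yield start, min(per_draw, instance_count - start)


print(draw_calls_needed(300_000))  # hypothetical vegetation count
```

For 300,000 instances this gives 294 draws, whereas an indirect approach can in principle submit them in a single call, which is the trade-off being asked about.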

    I ask because I'm currently working on an indirect instance renderer for terrain vegetation, but one of the real reasons I'm doing it is to enable MicroVerse-like workflows on meshes, where meshes can be easily populated with vegetation, etc. Spawning these as game objects would obviously be a non-starter, because there are often hundreds of thousands of them. But if I could use resident drawer it would save me from having to write (and more importantly, maintain) an indirect approach, assuming it's fast enough.
     
    heartingNinja and Saniell like this.
  23. joshcamas

    joshcamas

    Joined:
    Jun 16, 2017
    Posts:
    1,279
    Wow, the small object feature is amazing! Once again blown away. Usually we don't get these sorts of features and instead need to mess with culling layers and layering small objects.

    Having it work out of the box is exactly the direction we need to be going, thank you so much!
     
  24. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    You need to use the GPU Resident drawer, it won't work with vanilla BRG as most of this code is in the BRG callback we implemented.

    We use a depth pyramid for occluders (i.e. if your foliage renders to depth it will occlude) and bounding boxes for occludees. So occlusion is not pixel perfect, but it's fast.

    The API is not written yet, but this will be a requirement.
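The depth pyramid plus bounding box idea can be sketched in a few lines. This is a generic CPU-side illustration of hierarchical-Z testing in Python, not Unity's implementation. The depth convention (larger means farther), the square power-of-two buffer, and all function names are assumptions.

```python
def build_depth_pyramid(depth):
    """Build a max-depth mip chain from a square, power-of-two depth buffer.

    `depth` is a list of rows; larger values are farther from the camera.
    Each coarser texel stores the farthest depth of its 2x2 footprint, so a
    test against it is conservative.
    """
    pyramid = [depth]
    while len(pyramid[-1]) > 1:
        src = pyramid[-1]
        half = len(src) // 2
        pyramid.append([
            [max(src[2 * y][2 * x], src[2 * y][2 * x + 1],
                 src[2 * y + 1][2 * x], src[2 * y + 1][2 * x + 1])
             for x in range(half)]
            for y in range(half)
        ])
    return pyramid


def is_occluded(pyramid, x0, y0, x1, y1, nearest_depth):
    """Test a screen-space bounding rectangle (inclusive texel coordinates).

    Walks up to the finest mip level at which the rectangle spans at most
    2x2 texels, then reports occlusion only if the box's nearest depth lies
    behind the farthest occluder depth in every covered texel.
    """
    level = 0
    while level < len(pyramid) - 1 and ((x1 >> level) - (x0 >> level) > 1
                                        or (y1 >> level) - (y0 >> level) > 1):
        level += 1
    mip = pyramid[level]
    farthest = max(mip[y][x]
                   for y in range(y0 >> level, (y1 >> level) + 1)
                   for x in range(x0 >> level, (x1 >> level) + 1))
    return nearest_depth > farthest


# 4x4 depth buffer: a near wall (0.2) on the left half, empty far plane (1.0) on the right.
depth_buffer = [[0.2, 0.2, 1.0, 1.0] for _ in range(4)]
hiz = build_depth_pyramid(depth_buffer)

print(is_occluded(hiz, 0, 0, 1, 3, nearest_depth=0.5))  # box fully behind the wall
print(is_occluded(hiz, 1, 0, 2, 1, nearest_depth=0.5))  # box straddles the wall edge
```

Because the test only ever reads a handful of texels per object and compares against the farthest stored depth, it can report a hidden object as visible but never the reverse, which matches the "not pixel perfect but fast" description.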
     
  25. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    This is a case of generalization vs. specialization, I think. Our goal will be to provide an easy-to-use API that allows people to manage instances that 'just work' with all the Unity rendering features that exist out of the box. This will inevitably lead to some level of overhead if there are things that you don't specifically need.

    For the case you are describing, where you know all your data, it will undoubtedly be possible to write a faster version that is more compact and less general. In that case we need to make it possible for you to use draw mesh indirect but also have access to the occlusion buffers and similar, so you can do your own testing and culling. This might already be possible, but I would need to check the API we expose to the culling data.
     
    chadfranklin47 and joshcamas like this.
  26. strich

    strich

    Joined:
    Aug 14, 2012
    Posts:
    383
    Did I read correctly that the GPU OC requires widespread use of GPU instancing? Our game is heavily optimised for URP batching and not GPU instancing.
     
  27. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    No, this is something we won't be adding. We consider the GPU path to be a 'rendering optimization' path, with nothing looping back to simulation, due to frame pacing / syncing.
     
    retired_unity_saga likes this.
  28. retired_unity_saga

    retired_unity_saga

    Joined:
    Sep 17, 2016
    Posts:
    290
    I actually realized this after I had asked. Makes sense to me.

    Hope to see more features like this in the future.
     
  29. -DarkTiger-

    -DarkTiger-

    Joined:
    May 6, 2015
    Posts:
    8
    Hello,
    I have a problem: when I enable "GPU Resident Drawer", the build fails and shows these errors.
    I use 2023.3.0a17, URP 17.0.1, DX12, Forward+, SRP Batcher.
    If I disable the GPU Resident Drawer, the build works.

    The strange thing is that these are the built-in particle shaders.

    upload_2023-12-12_17-27-32.png

    upload_2023-12-12_18-8-31.png

    Thanks
     
    Last edited: Dec 12, 2023
  30. jbooth

    jbooth

    Joined:
    Jan 6, 2014
    Posts:
    5,461
    That all makes sense - a big speedup for everyone is extremely valuable and exactly the kind of stuff I'm excited to see Unity doing. And yes, right now I'm generating my own hi-z buffer in URP/BiRP, and I notice you already generate one in HDRP (with a funky layout). It would be nice if there was a way to access yours so we don't need two of them around. I took a brief look at this for HDRP, and there's a nice include for it so you can access it regardless of packing, but it can't be included from a compute shader due to its dependencies.

    In an ideal world, there would be an include file somewhere that has these functions in it, so they match Unity's perfectly and handle all the edge cases (ortho vs. perspective, etc.). I've had a really hard time matching up Unity's LOD with mine - I even found the source in this repository and still need to multiply the screen height by 1.82 to get a very similar (but not exact) result, so I'm still missing something there. Anyway, the more this stuff can be shared, the easier it will be to write matching fast paths for everyone.
     
  31. jiraphatK

    jiraphatK

    Joined:
    Sep 29, 2018
    Posts:
    306
    I tested GPU occlusion culling again today. I noticed a significant fps gain in a build, not so much in the editor.
    Also, I noticed the view ID of the light in the debug view. Does this mean lights also get the benefit of occlusion culling?
     
  32. Neonage

    Neonage

    Joined:
    May 22, 2020
    Posts:
    288
    Will GPU occlusion culling work on shadows?
    Also, why is Render Graph necessary for this feature? Very few projects would be able to upgrade all their custom render features to support it.
     
  33. -DarkTiger-

    -DarkTiger-

    Joined:
    May 6, 2015
    Posts:
    8
    I investigated further and found that these errors are caused by Forward+ (required by GPU Resident Drawer).

    I have tried in 2 different projects and the errors are the same.

    Is Forward+ broken?
     
  34. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    We have it running in our automated test suites and it's running properly. Could you log a bug, and I'll make sure the Forward+ developer takes a look.
     
  35. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    I've forwarded this feedback to the developer of the occlusion framework.
     
  36. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    We have some prototype stuff for shadow view occlusion, but it's still a bit flaky (so not ready to ship). We'll see where we land on it, but no promises for it happening in this release.

    Build vs. non-build: the editor has a lot of overhead and can eat a lot of the improvements that this work gives. That being said, we get a significant usability improvement in the editor when this feature is enabled and the scene is very large. For example, it was impossible to work with the fantasy kingdom project in the editor on my laptop with GPU driven rendering off, but it works pretty well with it turned on.
     
    jiraphatK likes this.
  37. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    Rendergraph will be 'the way' moving forward for rendering features, so we are going all in on it when it comes to new things like occlusion (and it also simplifies the code by at least an order of magnitude).
     
    AljoshaD, saskenergy and Shikoq like this.
  38. jbooth

    jbooth

    Joined:
    Jan 6, 2014
    Posts:
    5,461
    When trying to port shaders to this, I'm getting:

    "Shader error in 'BetterShaders/Unlit': undeclared identifier 'unity_DOTSInstancingF48_MetadataGetObjectToWorldMatrix' at line 3709 (on metal)"

    This seems to be triggered by the UNITY_SETUP_INSTANCE_ID(v) call in the vertex shader. Does not happen if resident drawer is not used, and I'm using the new #include_with_pragmas "Packages/com.unity.render-pipelines.universal/ShaderLibrary/DOTS.hlsl" to bring in the dots code.
     
  39. retired_unity_saga

    retired_unity_saga

    Joined:
    Sep 17, 2016
    Posts:
    290
    I see based on the Unity 2023.3.0a18 changelog it is now mandatory to utilize Rendergraph API.

    Where can we find a tutorial on the ins and outs of utilizing the Rendergraph? Or like, some kind of documents.

    I wish to create a "Vertex-Lit" only Rendergraph that passes on lights in Forward+ to vertex lit lighting model. Basically, I want Forward+ Vertex Lit. But, I am almost a total noob to this.

    I am wondering how can I get started on my journey for knowledge to do this? How can I figure out to make shaders for my special graph as well (flipbook, UV scrolling, multi-albedo-render, bit mask blend), and also support shader graph in the most recent Alpha release (Unity 2023.3.0a18).
     
  40. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    https://forum.unity.com/threads/int...in-the-universal-render-pipeline-urp.1500833/
     
    retired_unity_saga likes this.
  41. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    I'll poke one of the shader guys, there is some specific ordering here that is important.
     
  42. JussiKnuuttila

    JussiKnuuttila

    Unity Technologies

    Joined:
    Jun 7, 2019
    Posts:
    351
    This sounds like you might be missing
    Code (CSharp):
    #include "com.unity.render-pipelines.universal/ShaderLibrary/Input.hlsl"
    If you already have this include, try moving it earlier in the sequence of includes.

    The DOTS.hlsl header itself is not enough, that just enables the right variants and shader model.
     
  43. jbooth

    jbooth

    Joined:
    Jan 6, 2014
    Posts:
    5,461
    It's copied exactly from the forward/gbuffer/etc. passes that the shader graph outputs. For instance, here's the one from the forward pass:

    Code (CSharp):
    #include_with_pragmas "Packages/com.unity.render-pipelines.universal/ShaderLibrary/DOTS.hlsl"
    #include_with_pragmas "Packages/com.unity.render-pipelines.universal/ShaderLibrary/RenderingLayers.hlsl"
    #include_with_pragmas "Packages/com.unity.render-pipelines.universal/ShaderLibrary/ProbeVolumeVariants.hlsl"
    #include "Packages/com.unity.render-pipelines.core/ShaderLibrary/Color.hlsl"
    #include "Packages/com.unity.render-pipelines.core/ShaderLibrary/Texture.hlsl"
    #include "Packages/com.unity.render-pipelines.universal/ShaderLibrary/Core.hlsl"
    #include "Packages/com.unity.render-pipelines.universal/ShaderLibrary/Lighting.hlsl"
    #include "Packages/com.unity.render-pipelines.universal/ShaderLibrary/Input.hlsl"
    #include "Packages/com.unity.render-pipelines.core/ShaderLibrary/TextureStack.hlsl"
    #include_with_pragmas "Packages/com.unity.render-pipelines.core/ShaderLibrary/FoveatedRenderingKeywords.hlsl"
    #include "Packages/com.unity.render-pipelines.universal/ShaderLibrary/Shadows.hlsl"
    #include "Packages/com.unity.render-pipelines.universal/ShaderLibrary/ShaderGraphFunctions.hlsl"
    #include "Packages/com.unity.render-pipelines.universal/ShaderLibrary/DBuffer.hlsl"
    #include "Packages/com.unity.render-pipelines.universal/Editor/ShaderGraph/Includes/ShaderPass.hlsl"
    #include "Packages/com.unity.render-pipelines.universal/ShaderLibrary/LODCrossFade.hlsl"
    I played with moving input higher, but then I just get warnings about TEXCUBE not being defined when it's above Texture.
     
  44. retired_unity_saga

    retired_unity_saga

    Joined:
    Sep 17, 2016
    Posts:
    290
  45. flyer19

    flyer19

    Joined:
    Aug 26, 2016
    Posts:
    126
    How can I do per-triangle culling with GPU Resident Drawer? For example, when drawing one big mesh with 1 million triangles, it should instead draw 1 million instanced triangles that are each frustum culled, and the render cost should be lower than drawing the one big mesh, right? Which API in GPU Resident Drawer can do this? If I use DrawProceduralIndirect for this case with the old Unity render pipeline, the rendering performance is poor. Can Unity 6 do better?
     
  46. merpheus

    merpheus

    Joined:
    Mar 5, 2013
    Posts:
    202
    Is there a visibility buffer/deferred materials implementation in place or planned for this feature?
     
  47. qjlmeyes

    qjlmeyes

    Joined:
    Apr 19, 2023
    Posts:
    23
    I found that GPU Resident Drawer requires shaders to support DOTS, which is a feature that will only be available in next year's version of Unity. This will result in 99.99% of shaders not being able to use this feature. Do you have any plans to create a feature that automatically converts regular shaders to DOTS shaders without rewriting them?
     
    Claytonious likes this.
  48. Shikoq

    Shikoq

    Joined:
    Aug 5, 2023
    Posts:
    12
    Correct me if I'm wrong, but all base URP/HDRP shaders and all shadergraph shaders support DOTS Instancing, which is mandatory for GPU Resident Drawer.
     
  49. cloverme

    cloverme

    Joined:
    Apr 6, 2018
    Posts:
    199
    I've seen a few rumors that the GPU Culling is also doing hi-z as well but maintains the shadow if the shadow is in the camera frustum, is that correct?
     
  50. qjlmeyes

    qjlmeyes

    Joined:
    Apr 19, 2023
    Posts:
    23
    I found that when the camera cannot see any objects, rendering takes longer with GPU Resident Drawer turned on than with it turned off.

    Image 1: GPU Resident Drawer off
    Image 2: GPU Resident Drawer on
    Image 3: GPU Resident Drawer on, showing the additional performance cost when there are no objects in the camera's view
     
