Official GPU Driven Rendering In Unity

Discussion in 'Unity 6 Beta' started by Tim-C, Oct 6, 2023.

  1. qjlmeyes

    qjlmeyes

    Joined:
    Apr 19, 2023
    Posts:
    23
    Static occluders can lead to poorer performance with the GPU Resident Drawer.
     
  2. qjlmeyes

    qjlmeyes

    Joined:
    Apr 19, 2023
    Posts:
    23
    It seems that adding a switch that sets the renderer's enabled flag to false for occluded objects at the end of occlusion culling could solve the problem. This would have to be added on your side; I cannot do it myself.
     
  3. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    Those allocate bins jobs are taking waaaay too long. I've done some testing locally but cannot reproduce what you are seeing. Can you raise a bug with your project so we can check the specific setup? It might be some corner case we have not hit in our testing yet, so having the project would help us debug it.
     
  4. qjlmeyes

    qjlmeyes

    Joined:
    Apr 19, 2023
    Posts:
    23
    I created 10,000,000 cubes. With the cubes in the camera's view, GpuDriven performed better than SRP. After rotating the camera so that the cubes moved out of view, GpuDriven ran at 40 fps while SRP ran at 100 fps.

    Although the cost of GpuDriven drops as fewer cubes are in view, it is still high. If a scene contains hundreds of thousands or even millions of objects, an additional culling pass that sets renderer.enabled = false is needed, which greatly improves GpuDriven performance. However, this has to be done on the main thread, which is very slow.

    In addition, GpuDriven has taken over the rendering of GameObjects. Could an API be added for this? I need to change only the rendered position, without changing the position of the Transform, because SetPosition is expensive and can only be called on the main thread. It would be great if we could change a Renderer's position from multiple threads in a job.

    My computer is 7950x+4090
    Translated by AI
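
    To illustrate why objects that are out of view can still cost time, here is a minimal Python sketch of per-instance frustum culling. It is not Unity's implementation: the Plane class, the cull_spheres function and the box-shaped "frustum" are hypothetical names and data chosen for illustration. The point it shows is that every registered instance is still visited by the culling loop, even when none of them ends up visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plane:
    # Plane in the form n . p + d = 0, with the normal pointing into the frustum.
    nx: float
    ny: float
    nz: float
    d: float

    def signed_distance(self, x, y, z):
        return self.nx * x + self.ny * y + self.nz * z + self.d


def cull_spheres(planes, spheres):
    """Return (visible_indices, tests_performed) for (x, y, z, radius) spheres."""
    visible = []
    tests = 0
    for index, (x, y, z, radius) in enumerate(spheres):
        tests += 1  # every instance is visited, visible or not
        if all(p.signed_distance(x, y, z) >= -radius for p in planes):
            visible.append(index)
    return visible, tests


# Hypothetical axis-aligned "frustum": the box -10 <= x, y <= 10, 0 <= z <= 100.
BOX_PLANES = [
    Plane(1, 0, 0, 10), Plane(-1, 0, 0, 10),
    Plane(0, 1, 0, 10), Plane(0, -1, 0, 10),
    Plane(0, 0, 1, 0), Plane(0, 0, -1, 100),
]

# 1,000 small cubes (approximated by bounding spheres) placed behind the camera.
behind_camera = [(0.0, 0.0, -50.0 - i, 0.5) for i in range(1000)]
visible, tests = cull_spheres(BOX_PLANES, behind_camera)
print(len(visible), tests)  # prints "0 1000"
```

    Nothing is drawn in this example, yet the loop still runs once per instance, which is consistent with the culling cost reported above when the camera looks away from the cubes.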
     


    joshcamas likes this.
  5. Rastapastor

    Rastapastor

    Joined:
    Jan 12, 2013
    Posts:
    591
    Following the above, I also tested some scenes with the instanced drawer on and off, and I also noticed a performance regression. On top of that, enabling DX12 makes it even worse!

    upload_2024-1-6_5-33-10.png

    On the same scene I get over 100 fps on DX11; with DX12 on it's barely 60 :). All of this is on the latest 2023.3.b1.
     
    lacas8282 likes this.
  6. optimise

    optimise

    Joined:
    Jan 22, 2014
    Posts:
    2,130
    I think Unity could start by fixing the main-thread stalling in Entities Graphics first, so that it runs off the main thread on all supported platforms. Have a look at case IN-65173.
     
  7. Kichang-Kim

    Kichang-Kim

    Joined:
    Oct 19, 2010
    Posts:
    1,018
    Hi. When I enable this feature, building the player for the Web platform fails (Unity 2023.3.0b1). For Web builds, should I disable it manually? (Doesn't it automatically fall back to the traditional rendering path?)

    Error log is here:
     
    Last edited: Jan 6, 2024
  8. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    Can you raise a bug with a reproduction project so we can take a look?
     
    lacas8282 likes this.
  9. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    We'll have to check this - when we added occlusion culling we might have broken the fallback (well it will still fall back... but you can't build the player).
     
  10. MikkelSim

    MikkelSim

    Unity Technologies

    Joined:
    Nov 14, 2023
    Posts:
    2
    Thanks for reporting this, I've tested the project and there's definitely an issue here! Will look into a fix for this.
     
    Last edited: Jan 9, 2024
    lacas8282 and mariandev like this.
  11. qjlmeyes

    qjlmeyes

    Joined:
    Apr 19, 2023
    Posts:
    23
    DX12 is always worse in every situation; switching to DX12 leaves only half the FPS, so this may not be an issue with the GPU Resident Drawer.
     
    lacas8282 likes this.
  12. qjlmeyes

    qjlmeyes

    Joined:
    Apr 19, 2023
    Posts:
    23
    Results with terrain are very poor; it cannot be batched (HybridBatchGroup).
    upload_2024-1-10_15-34-37.png
     
  13. Reanimate_L

    Reanimate_L

    Joined:
    Oct 10, 2009
    Posts:
    2,788
    If I'm not mistaken, terrain elements are not supported yet.
     
  14. wellmor

    wellmor

    Joined:
    Jan 9, 2016
    Posts:
    2
    Do I understand correctly that for the GPU Resident Drawer to function properly, all mesh and texture data need to be preloaded into GPU memory?

    Additionally, if we have a limited amount of VRAM, could this lead to a significant decrease in performance compared with the old SRP Batcher?
     
  15. XCO

    XCO

    Joined:
    Nov 17, 2012
    Posts:
    382
    May I ask (and I'm sorry if someone already asked): does this method work better with combined meshes or with uncombined ones?
     
  16. adamgolden

    adamgolden

    Joined:
    Jun 17, 2019
    Posts:
    1,555
    Any update on [roughly] when the SpeedTree support is coming and whether it will also be a Shader Graph (to replace the ST8 one that comes with URP), or will the existing one just start working despite the use of MaterialPropertyBlock?

    Does LOD crossfading work with this?

    Will this work for WebGPU (now or in the future)?

    Sorry if any of the above were already asked and/or answered.
     
  17. KLGames8207

    KLGames8207

    Joined:
    Nov 13, 2023
    Posts:
    24
    In Unity 2023.3.0b3 I get this when I enable GPU culling with the Resident Drawer:

    OFF:

    upload_2024-1-17_12-6-51.png

    ON:

    upload_2024-1-17_12-7-3.png

    So the FPS is just lower when it is ON.
     
    lacas8282 likes this.
  18. PutridEx

    PutridEx

    Joined:
    Feb 3, 2021
    Posts:
    1,136
    From my tests, it seems gpu culling can be very inefficient. I noticed it's taking as long as the scene takes to render on the GPU. Didn't do any extensive testing though, just quickly when I opened the beta for a bit.
     
  19. jiraphatK

    jiraphatK

    Joined:
    Sep 29, 2018
    Posts:
    306
    Yeah, the gains in the editor are hard to measure. You need to test it in a build.
     
  20. runner78

    runner78

    Joined:
    Mar 14, 2015
    Posts:
    802
    In the editor you need to at least set C# to release mode and disable safety checks.
     
  21. KLGames8207

    KLGames8207

    Joined:
    Nov 13, 2023
    Posts:
    24
    I just disabled safety checks and am running in release mode.
    Or do you mean I have to build the game? (not possible in the editor?)
     
  22. flyer19

    flyer19

    Joined:
    Aug 26, 2016
    Posts:
    126
    DrawProcedural is still slower than the default MeshRenderer with the Resident Drawer:
    with 566.9K tris, MeshRenderer runs at 108 fps but DrawProcedural at 72. Is per-triangle culling on the GPU not possible?
     


  23. hanhaotong

    hanhaotong

    Joined:
    Jan 27, 2024
    Posts:
    2
    I have the same question. There is little improvement over the default pipeline.
     
  24. hanhaotong

    hanhaotong

    Joined:
    Jan 27, 2024
    Posts:
    2
    Most of the time was spent in render loop self time. I think this is because culling takes most of the time.
     
  25. MoruganKodi

    MoruganKodi

    Joined:
    Feb 11, 2015
    Posts:
    79
    Can this work with MaterialPropertyBlock via DrawInstanced?
     
  26. qjlmeyes

    qjlmeyes

    Joined:
    Apr 19, 2023
    Posts:
    23
    upload_2024-1-30_21-25-5.png
    Left: 3-level LOD Group, 23M tris
    Right: no LOD, 130M tris

    Because each model has 3 more LOD MeshRenderers with enabled = true, performance with LODs is worse.
     
    Prodigga likes this.
  27. lacas8282

    lacas8282

    Joined:
    Apr 25, 2015
    Posts:
    139
  28. joshuacwilde

    joshuacwilde

    Joined:
    Feb 4, 2018
    Posts:
    734
    @Tim-C how does this work on the actual graphics api side? Which (directx, let's say) apis is this using? Is it using multi draw indirect? Or is it just using regular instancing then writing out 0 instances to the indirect buffer if they are GPU culled?

    I'd love to know more of the details here, and also if there are any new graphics APIs exposed as a result. Thanks.
     
    Shikoq likes this.
  29. MoruganKodi

    MoruganKodi

    Joined:
    Feb 11, 2015
    Posts:
    79
    My lightmaps are failing when using the GPU Resident Drawer, accompanied by CopyTexture spam.
    - Mipmap Limits is off.
    - Fixed Lightmap size is on.
    - Lightmaps using low quality compression and set to 4096.

    As far as I am aware - there are not any other settings which should get in the way of using GPU Resident Drawer with lightmapping.

    This is what my scene looks like without entering play mode with Resident Drawers Enabled (HDRP).
    1.png

    However - upon entering play mode - this happens:
    2.png


    I have tried everything I could think of to try to work around this issue - but at the moment I am forced to keep Resident Drawers disabled.

    I WANT it enabled as I do get a noticeable boost with it enabled - the only issue is failure when processing lightmaps.

    I could use some assistance with this.
     
    Last edited: Feb 18, 2024
    JamesArndt likes this.
  30. bnmguy

    bnmguy

    Joined:
    Oct 31, 2020
    Posts:
    137
    Not just MipMap limits should be off. All MipMaps should be disabled. Only the full res texture is working.
     
  31. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    Depends on the configuration you use:

    Instanced rendering with no occlusion: just standard instanced rendering.
    Instanced rendering + occlusion: uses DrawMeshIndirect - we add the GPU occlusion step, which rewrites the draw ranges.

    We do not currently use multi draw indirect because some APIs are not as good with it, but we have some desire to add this in the future.

    As for APIs, this is all done via the batch render group - we added support for new draw types (indirect) so you can do similar things to us if you want via that API.
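
    As a rough illustration of what "rewrites the draw ranges" can mean, here is a minimal CPU-side sketch in Python. It is not Unity's code and does not use the BatchRendererGroup API: IndirectArgs mirrors the usual indexed-indirect argument layout, while rewrite_draw_ranges, the instance ids and the occluded set are hypothetical and only stand in for the GPU occlusion step.

```python
from dataclasses import dataclass


@dataclass
class IndirectArgs:
    # Field names follow the usual indexed-indirect argument layout.
    index_count_per_instance: int
    instance_count: int
    start_index: int
    base_vertex: int
    start_instance: int


def rewrite_draw_ranges(draws, is_visible):
    """Compact visible instance ids and rewrite each draw's instance range.

    draws: list of (IndirectArgs, [instance_id, ...]) pairs.
    is_visible: callable instance_id -> bool, standing in for the occlusion test.
    Returns the compacted instance-id buffer that the draws would read from.
    """
    compacted = []
    for args, instance_ids in draws:
        args.start_instance = len(compacted)
        survivors = [i for i in instance_ids if is_visible(i)]
        compacted.extend(survivors)
        args.instance_count = len(survivors)  # 0 means the draw renders nothing
    return compacted


cube = IndirectArgs(36, 4, 0, 0, 0)
rock = IndirectArgs(600, 3, 0, 0, 4)
draws = [(cube, [0, 1, 2, 3]), (rock, [4, 5, 6])]
occluded = {1, 2, 4, 5, 6}  # hypothetical occlusion result
buffer = rewrite_draw_ranges(draws, lambda i: i not in occluded)
print(buffer, cube.instance_count, rock.start_instance, rock.instance_count)
# prints "[0, 3] 2 2 0"
```

    In this sketch a fully occluded draw is still issued, but with an instance count of 0, so the CPU never needs to read the culling result back.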
     
    Shikoq and joshuacwilde like this.
  32. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    Hi, this should be fixed in either the next or next next beta. We have changed how we are doing lightmapping due to a number of shortcomings with the solution you are using now. We are (unfortunately) breaking batches now to switch lightmaps but it's quite fast due to how we have done the implementation and uses less GPU memory.
     
  33. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    For a number of the responses up in the thread there are some performance issues that we would like to dig into on our side. Posting a timeline capture is good - but we really need a bug report and a project to reproduce the issue and make a real fix. Please report a bug and log the issue number in this thread so we can look into it.
     
  34. Shikoq

    Shikoq

    Joined:
    Aug 5, 2023
    Posts:
    12
    Hey Tim!

    1. Can we expect MDI support in DX12/Vulkan by the release of Unity 6?
    2. Is it true that MDI is unsupported only in the GPU Resident Drawer, but is being added to the BRG API?

    3. How will resources behave if VRAM is full?
    For example, I have 4 GB RAM and 8 GB of textures, and the GPU Resident Drawer decides to render an object with a texture that is currently in RAM, not in VRAM.

    Is it true that if this were not indirect rendering, and the CPU made the decision to render this object, the resources would have been preloaded into VRAM in advance? Have you performed any such tests? Won't this have a significant impact on performance?
     
  35. Matjio

    Matjio

    Unity Technologies

    Joined:
    Dec 1, 2014
    Posts:
    108
    Hi MoruganKodi!
    In the meantime you could also try using Adaptive Probe Volumes which are compatible with Resident Drawer, and could reduce the need for lightmaps, baking time, and memory, while improving visual quality for dynamic objects, and quality of SSGI and reflections.
     
  36. Shikoq

    Shikoq

    Joined:
    Aug 5, 2023
    Posts:
    12
    bump
     
  37. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    1/2: (assuming you mean multi draw indirect and not mesh draw indirect): We do intend to have multi draw indirect for Unity 6. GPU Resident Drawer uses BRG under the hood, so any draw modes in Resident Drawer MUST be supported by BRG.

    3. Depends on the API, as different graphics APIs manage things differently. In terms of data management, this system is not much different from standard Unity in terms of uploaded texture and mesh data. The difference is that transforms and similar are always persistent.
     
    joshcamas and Shikoq like this.
  38. Shikoq

    Shikoq

    Joined:
    Aug 5, 2023
    Posts:
    12
    Oh, sorry, I wasn't talking about Resident Drawer, but about gpu occlusion culling in my thoughts.
    But thank you so much anyway
     
  39. TJHeuvel-net

    TJHeuvel-net

    Joined:
    Jul 31, 2012
    Posts:
    838
    What does this look like, is this something you'll use internally, or will there be an addition in the Graphics API for an indirect-with-count-from-buffer method? I'd really like to use this outside of the brg too!

    (Edit: That was a typo, not in Unity6 perhaps a later date!)
     
    Last edited: Mar 4, 2024
  40. Neonage

    Neonage

    Joined:
    May 22, 2020
    Posts:
    288
    Vertex snapping doesn't work when Resident Drawer is enabled.
     
  41. Tim-C

    Tim-C

    Unity Technologies

    Joined:
    Feb 6, 2010
    Posts:
    2,225
    Oh no. I made a typo. We DO NOT intend to have multi draw indirect in Unity 6. We will be investigating it at a later date. Sorry. Really unfortunate typo.
     
    TJHeuvel-net likes this.
  42. TJHeuvel-net

    TJHeuvel-net

    Joined:
    Jul 31, 2012
    Posts:
    838
    No problem, thanks for the correction!
     
  43. qjlmeyes

    qjlmeyes

    Joined:
    Apr 19, 2023
    Posts:
    23
    Will terrain be supported in the future?
     
  44. Churd

    Churd

    Joined:
    Dec 12, 2015
    Posts:
    126
    Is GPU Occlusion Culling currently supported by Entities Graphics? It seems that the only way to enable it is via the GPU Resident Drawer, which is only for GameObjects. (Unity 2023.3.0b9)

    If there's no way to enable it for Entities Graphics now, is it planned for the near-ish future?
     
    Last edited: Mar 12, 2024
  45. BragBiscuitz

    BragBiscuitz

    Joined:
    Mar 29, 2020
    Posts:
    3
    Is it possible to get more context for this? Skinned mesh batching (or total lack thereof) has been a real pain in the rear for me for a while. There's been multiple things I've been keeping an eye on and am fairly excited about lately with Unity rendering, but such issues make it harder for some projects to justify moving to URP and take full advantage of the many back/frontend upgrades. Specifically on batching, between the Batched GPU Skinning, the SRP Batcher, and now the GPU Resident Drawer, it's been multiple rounds of efforts figuring out how these things work, and multiple cold showers finding out skinned mesh batching remains elusive.

    I've also seen a few posts about MultiDrawIndirect around the forums, and it's really, really nice that Unity is considering it given its ability to all-around simplify the rendering process and significantly reduce CPU overhead, but I feel even that will have a limited effect if batches break down to the worst case scenario because some meshes happen to need extra displacement.

    I'm not quite sure what prevents skinned mesh batching, honestly. The only (bold) guess I can make based on the profiler's timeline is that potential for instancing is broken because skinning happens before dispatching the draw calls, which transforms the vertex data in a pattern breaking way. Even then, I'm not sure what prevents the SRP Batcher from combining the already-skinned meshes, or the GPU Resident Drawer from applying instancing to multiple identical already-skinned meshes (the mesh data would keep its instantiable pattern). I can see the persistent buffers being more painful to manage because of the highly dynamic nature of game rendering, but that's something I'd guess Unity has already found ways to deal with.

    I really hope these batching improvements will be extended to other things than simple mesh renderers soon. This would make Unity projects natively much more scalable and allow artists/designers to focus more on what/how to build things without worrying about blowing up the editor or making a test build unplayable after dropping in a handful of assets.
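
    The guess above about skinning breaking instancing can be illustrated with a small Python sketch. This is only a model of that hypothesis, not a description of how Unity batches: group_for_instancing, the buffer ids and the material names are all hypothetical. It assumes draws can be instanced together only when they share the same vertex buffer and material, and that each skinned mesh writes its deformed vertices to its own output buffer.

```python
from collections import defaultdict


def group_for_instancing(draws):
    """Group draws that share the same vertex buffer and material."""
    groups = defaultdict(list)
    for name, vertex_buffer_id, material in draws:
        groups[(vertex_buffer_id, material)].append(name)
    return dict(groups)


# Static meshes all reference one shared mesh buffer.
static_draws = [(f"crate_{i}", "crate_mesh", "wood") for i in range(4)]
# Assumption: each skinned instance reads from its own skinned output buffer.
skinned_draws = [(f"npc_{i}", f"npc_skinned_output_{i}", "skin") for i in range(4)]

print(len(group_for_instancing(static_draws)))   # prints "1"
print(len(group_for_instancing(skinned_draws)))  # prints "4"
```

    Under these assumptions the four crates collapse into one instanced group while the four skinned characters stay as four separate draws.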
     
    Last edited: Mar 12, 2024
    pierre92nicot likes this.
  46. qjlmeyes

    qjlmeyes

    Joined:
    Apr 19, 2023
    Posts:
    23
    upload_2024-3-15_14-25-2.png
    Hope to optimize LOD in the future
     
  47. qjlmeyes

    qjlmeyes

    Joined:
    Apr 19, 2023
    Posts:
    23
    upload_2024-3-15_14-49-48.png
    Isn't it supposed to be 0 tris when you can't see anything?
     
  48. motillaones

    motillaones

    Joined:
    Oct 11, 2018
    Posts:
    18
    I'm pretty excited about these new changes. I have been testing the results of the GPU Resident Drawer in my project, which is quite graphically demanding, but the results are not very significant at the moment, probably because I'm doing something wrong.

    Using the configuration recommended by this topic I have the following results:

    The batch reduction seems very little to me, am I forgetting some step?

    Edit:

    - HDRP
    - Unity version: 2023.3.0b10
    - BatchRenderGroup variants : Keep All
    - GPU Occlusion Culling OFF (If I activate it, it gets worse results).
    - Graphics API for windows: DX11 (with DX12 there doesn't seem to be any change).
    - Static Batching : OFF

    System: AMD Ryzen 9 7900X3D 12-Core 4.40 GHz, 64.0 GB RAM, NVIDIA GeForce RTX 4070 Ti
     


    Last edited: Mar 15, 2024
  49. lacas8282

    lacas8282

    Joined:
    Apr 25, 2015
    Posts:
    139

    Is it fixed in the new 6000 version?
    Thanks.
     
  50. qjlmeyes

    qjlmeyes

    Joined:
    Apr 19, 2023
    Posts:
    23
    LODGroup does not reduce CullingJob time, even though the objects have been culled by the LODGroup.

    3-LOD cubes, culled at 1%, all cubes: 64,000 * 3 = 192,000

    80FPS
    upload_2024-3-19_11-12-45.png upload_2024-3-19_11-12-53.png

    NoLod All Cube 64000
    60FPS
    upload_2024-3-19_11-13-51.png
    upload_2024-3-19_11-14-5.png
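
    The renderer counts in the comparison above can be worked through in a few lines of Python. This is plain arithmetic under the assumption that every LOD level contributes its own renderer to the set the culling job has to consider; renderer_counts is a hypothetical helper, not a Unity API.

```python
def renderer_counts(object_count, lod_levels):
    """Return (total_renderers, max_renderers_drawn) for a scene of LOD groups."""
    total_renderers = object_count * lod_levels  # one renderer per LOD level
    max_drawn = object_count                     # at most one LOD level per object
    return total_renderers, max_drawn


print(renderer_counts(64_000, 3))  # prints "(192000, 64000)"
print(renderer_counts(64_000, 1))  # prints "(64000, 64000)"
```

    With three LOD levels the culling input is three times larger than in the no-LOD scene, even though no more objects can be drawn.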
     
