Official GPU Driven Rendering In Unity

Discussion in 'Unity 6 Beta' started by Tim-C, Oct 6, 2023.

  1. BragBiscuitz

    Joined:
    Mar 29, 2020
    Posts:
    7
    This GPU Occlusion Culling implementation uses an additional pass, which, from what I've seen so far, mostly ends up adding draw calls, so you should probably disable it (I'm failing to see the point of it, to be honest). Also, if you have any custom shaders, make sure they support DOTS Instancing. Even then, your mileage will vary depending on the renderers in your scene (only MeshRenderers are supported) and how much instancing they allow, among other things. Batches drawn by the GPU Resident Drawer should have the "Hybrid Batch Group" label in the Frame Debugger.

    Hopefully the Burst Occlusion Culling system on the roadmap is gonna be the actual answer to Umbra's shortcomings, among other things. :copium:
     
  2. qjlmeyes

    Joined:
    Apr 19, 2023
    Posts:
    23
    Now I need something similar to TransformAccessArray, let's call it "RenderAccessArray", so that I can use the Job System to toggle renderer visibility with high performance.
     
  3. qjlmeyes

    Joined:
    Apr 19, 2023
    Posts:
    23
    This is a performance profile of 100,000 moving cubes. For moving objects, performance looks quite poor, and the bottleneck lies in three places: IJobParallelForTransform, UpdateRenderBoundingVolumes, and GpuResidentDrawer.
    upload_2024-5-15_11-30-29.png
     
  4. TakeITStudioDevelopment

    Joined:
    Apr 7, 2021
    Posts:
    1
    Does GPU-driven rendering work on PS5?
     
  5. unity_OwsoyU3laIykcQ

    Joined:
    Dec 9, 2019
    Posts:
    5
    Do you have any info on when it will be fixed?
     
  6. Life_Is_Good_

    Joined:
    Mar 4, 2013
    Posts:
    53
    Can we opt into the new GPU Occlusion Culling? Say I have a compute shader for collecting instance data. Is there a shader function available to us that we can just call with the bounds or whatnot to see if those are occluded?
     
  7. ArIaNFury008

    Joined:
    Dec 22, 2019
    Posts:
    29
    Hi. Which is better: GPU Driven Rendering or GPU Instancer (the asset)? What are the differences? For an open-world scene with foliage, which one would you suggest?
     
    Last edited: May 26, 2024
  8. LaneFox

    Joined:
    Jun 29, 2011
    Posts:
    7,598
    It seems like GPU Resident Drawer actually makes performance much worse in our use case. This is on a Revit model of a Stadium.

    Static Batching is ON and GPU Occlusion Culling is ON.

    GPURD ON
    Screenshot 2024-05-31 101609.png

    GPURD OFF
    Screenshot 2024-05-31 101625.png


    I'm not sure why the batching becomes zero when it's turned on, but it does not offer any benefits. It seems like, behind the scenes, it is actually turning off features that were helping performance? I thought we would see at least a 20% improvement based on what I've read about GPURD.

    What's weirder is that GPU Occlusion Culling is ON and I'm just looking at the wall of the stadium, so there's no reason to be rendering thousands of smaller structural members on the inside, and yet perf behaves like it is.
     
  9. jiraphatK

    Joined:
    Sep 29, 2018
    Posts:
    311
    I'm assuming this model is similar to a CAD model. I don't think the system suits a scenario with tons of unique meshes, because it relies on instanced rendering.
     
  10. LaneFox

    Joined:
    Jun 29, 2011
    Posts:
    7,598
    We run it through a pipeline that converts it into glTF. It detects mesh duplicates and reuses the mesh for those instances, so there's definitely some room for it to work, and definitely room for occlusion culling to make huge improvements.

    But, I do agree it's not as simple as their test cases using mostly copies of a few objects. I just don't quite understand why perf is actually worse.
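
    As a rough illustration of the duplicate detection described above, here is a minimal sketch in Python. It is not the actual pipeline: the scene layout, the `mesh_key` and `group_instances` helpers, and the object names are all hypothetical. It hashes mesh contents, groups objects by (mesh, material), and reports how many distinct groups remain, which gives a rough feel for how much instancing a scene can offer.

```python
import hashlib
from collections import defaultdict

def mesh_key(vertices, indices):
    """Content hash of a mesh: identical geometry yields an identical key."""
    h = hashlib.sha256()
    for x, y, z in vertices:
        # Fixed precision so tiny export noise does not defeat the match.
        h.update(f"{x:.5f},{y:.5f},{z:.5f};".encode())
    h.update(b"|")
    h.update(",".join(str(i) for i in indices).encode())
    return h.hexdigest()

def group_instances(objects):
    """Group object names by (mesh content, material).

    Each group is a candidate for a single instanced draw.
    """
    groups = defaultdict(list)
    for obj in objects:
        key = (mesh_key(obj["vertices"], obj["indices"]), obj["material"])
        groups[key].append(obj["name"])
    return groups

# Hypothetical scene: three beams sharing one triangle mesh, plus one panel.
tri = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
quad = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
scene = [
    {"name": "beam_01", "vertices": tri, "indices": [0, 1, 2], "material": "steel"},
    {"name": "beam_02", "vertices": list(tri), "indices": [0, 1, 2], "material": "steel"},
    {"name": "beam_03", "vertices": tri, "indices": [0, 1, 2], "material": "concrete"},
    {"name": "panel_01", "vertices": quad, "indices": [0, 1, 2, 0, 2, 3], "material": "steel"},
]

groups = group_instances(scene)
unique_ratio = len(groups) / len(scene)
print(f"{len(scene)} objects -> {len(groups)} instancing groups")
```

    The closer `unique_ratio` gets to 1.0, the less there is for an instancing-based renderer to collapse, which fits the CAD-style concern raised earlier in the thread.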
     
  11. BragBiscuitz

    Joined:
    Mar 29, 2020
    Posts:
    7
    As I understand it, this GPU occlusion culling implementation seems to be mostly made for heavily GPU-bottlenecked scenarios where there's a lot of expensive overdraw. That's honestly pretty hard to imagine given how CPU-bottlenecked the engine usually is (at least on desktop).
    But yeah, it's mentioned in the Unity 6 conference that it uses an additional pass dedicated to building some kind of occlusion buffer if I recall correctly, so increased draw calls are expected if you enable this GPU occlusion culling option. It's a tradeoff between draw calls (CPU load) and overdraw (GPU load).

    I'm pretty curious about your scene setup/assets for the render to still have that many draw calls in the better case, though. I'd expect a stadium to have a lot of standardized/reusable parts in general. Did you check the Frame Debugger to see if there's anything you can do to collapse the draw calls further?

    You might want to disable static batching to avoid conflicting with instancing and probably the SRP Batcher as well.
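
    For readers unfamiliar with the idea of an "occlusion buffer", here is a minimal Python sketch of one common approach, a hierarchical depth (Hi-Z) pyramid. Nothing in this thread confirms that Unity's pass works this way. The function names, the square power-of-two buffer, and the depth convention (0 = near, 1 = far) are all assumptions made for illustration.

```python
def build_depth_pyramid(depth):
    """Build mip levels from a square, power-of-two depth buffer.

    Each coarser texel keeps the FARTHEST depth of the 2x2 block below it,
    so a test against it stays conservative.
    """
    levels = [depth]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        half = len(prev) // 2
        levels.append([
            [max(prev[2 * y][2 * x], prev[2 * y][2 * x + 1],
                 prev[2 * y + 1][2 * x], prev[2 * y + 1][2 * x + 1])
             for x in range(half)]
            for y in range(half)
        ])
    return levels

def is_occluded(levels, x0, y0, x1, y1, nearest_depth):
    """Conservatively test a screen rect (inclusive level-0 texel coordinates).

    The object is reported occluded only if its nearest depth lies behind the
    farthest occluder depth over every texel the rect touches.
    """
    size = max(x1 - x0, y1 - y0) + 1
    level = 0
    # Pick the mip where the rect spans at most 2 texels per axis.
    while (1 << level) < size and level < len(levels) - 1:
        level += 1
    mip = levels[level]
    lx0, ly0, lx1, ly1 = x0 >> level, y0 >> level, x1 >> level, y1 >> level
    farthest = max(mip[y][x]
                   for y in range(ly0, ly1 + 1)
                   for x in range(lx0, lx1 + 1))
    return nearest_depth > farthest

# Hypothetical 4x4 depth buffer: a wall at depth 0.2 covers the left half,
# the right half is empty (depth 1.0).
depth = [[0.2, 0.2, 1.0, 1.0] for _ in range(4)]
levels = build_depth_pyramid(depth)

behind_wall = is_occluded(levels, 0, 0, 1, 1, nearest_depth=0.5)
straddling = is_occluded(levels, 1, 0, 2, 1, nearest_depth=0.5)
in_front = is_occluded(levels, 0, 0, 1, 1, nearest_depth=0.1)
```

    The pyramid needs scene depth before anything can be tested against it, which is one way to picture why an additional pass would be required in a scheme like this.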
     
  12. LaneFox

    Joined:
    Jun 29, 2011
    Posts:
    7,598
    We may just misunderstand what it's for, then. The expectation was to just have regular ol' Occlusion Culling, but processed on the GPU, which is the opposite: moving load from the CPU to the GPU.

    If the feature works as you describe, it seems rather misleading in name.
     
  13. BragBiscuitz

    Joined:
    Mar 29, 2020
    Posts:
    7
    In the current renderer context, I agree. This implementation might work better with MultiDrawIndirect/Work Graph rendering, where draw calls can be collapsed much further, so most of the load would be on the GPU (although better parallelized) and it might be possible to discern improvements to the general frame time. Multiplying draw calls by even 3 when you just have a few dozen shouldn't affect performance in a meaningful way. Sadly, Unity still has limitations when it comes to batching (especially for non-MeshRenderer objects), so if anything, this GPU Occlusion Culling option was added too early?

    I honestly have no idea where things are going right now. With the way skinning and textures are handled currently, MultiDrawIndirect on its own should have a limited impact. It's nice that some overhead was shaved by making some buffers persistent (feature as old as OpenGL 4), but it's only a part of the problem (driver overhead). Though I have to say that my understanding of a "modern" rendering pipeline is probably too dated. Vulkan (and probably D3D12) have different costs for various actions as opposed to OpenGL and D3D11, so what I know would work on the old rendering APIs might not be appropriate in newer ones. :confused:
     
    Last edited: Jun 3, 2024
  14. TheHartPony

    Joined:
    Apr 26, 2015
    Posts:
    14
  15. Jonathan-Westfall-8Bits

    Joined:
    Sep 17, 2013
    Posts:
    304
    So someone correct me if I am wrong on this please, but it sounds like the following.

    You can have some things use the Umbra Occlusion Culling where it still offers slightly better performance or customization.

    The next part is mentioned in the documentation, which I will link at the end.
    The new GPU Resident Drawer is based on the BatchRendererGroup API to draw GameObjects with GPU instancing.

    If the objects being drawn are not compatible with the GPU Resident Drawer, then it will fall back to Unity's regular drawing with GPU instancing, and thus they would still be usable with the CullingGroup API.

    Documentation link that covers how the GPU Resident Drawer uses BatchRendererGroup:
    https://docs.unity3d.com/Packages/c...niversal@17.0/manual/gpu-resident-drawer.html