Question huge dx12 memory overhead compared to dx11

Discussion in 'Visual Effect Graph' started by julian-moschuering, Aug 17, 2023.

  1. julian-moschuering

    julian-moschuering

    Joined:
    Apr 15, 2014
    Posts:
    529
    Tested in 2022.3.7, 2022.3.5, 2021.3.27

    We are seeing a massive per-VFX memory overhead in DX12 compared to DX11. For 10k instances of the default VFX (with batching disabled) it's around 500 MB in DX11 vs. 10 GB(!) in DX12. This is easily reproducible by dragging 10k instances of any VFX into a scene and switching between DX11 and DX12.
    Activating batching mostly prevents this, but it isn't an option for us yet because of the bugs it introduces, and it doesn't fix the overhead when there are many different VFX assets.

    A similar overhead existed in the Hybrid Renderer in Entities 0.51, so my guess is that each buffer in DX12 has a gigantic overhead, maybe always allocating at least one page. A large part of it might be the readback buffers.

    On Xbox Series, DX12 is mandatory, and we waste a lot of memory on that platform because of this.

    Can you give me any information about why this happens and whether there is anything we can do to prevent it? Is it possible to get rid of the readback somehow?

    Thanks
     
  2. HB_VFX

    HB_VFX

    Joined:
    Nov 15, 2019
    Posts:
    3
    I will be interested to hear any details you have about this as well.
     
  3. julian-moschuering

    julian-moschuering

    Joined:
    Apr 15, 2014
    Posts:
    529
    From the RenderDoc captures it looks like all DX12 resources are aligned to 64 KB, which seems to match the observed memory usage: ~1 MB per VFX means 16 buffers at 64 KB each, wasting ~99% of the space. The biggest buffer in this sample is 1.6 KB; most are readback buffers of 4 bytes.

    If that's correct I don't think that is acceptable.
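The arithmetic behind these numbers can be checked with a short sketch. The 64 KB alignment, the 16 buffers per VFX, the 1.6 KB largest buffer, and the 4-byte readback buffers come from the post above; the exact mix of buffer sizes (one large buffer plus fifteen 4-byte ones) and reading 1.6 KB as 1600 bytes are assumptions made purely for illustration.

```python
ALIGNMENT = 64 * 1024  # 64 KB resource alignment reported in the thread


def aligned_size(size_bytes, alignment=ALIGNMENT):
    """Round size_bytes up to the next multiple of alignment."""
    return ((size_bytes + alignment - 1) // alignment) * alignment


# Hypothetical per-VFX buffer list: one 1.6 KB buffer plus fifteen 4-byte
# buffers. Only the count (16) and the two sizes come from the thread;
# the exact mix is an assumption.
buffer_sizes = [1600] + [4] * 15

used = sum(buffer_sizes)                               # bytes actually needed
reserved = sum(aligned_size(s) for s in buffer_sizes)  # bytes after alignment
instances = 10_000

print(f"used per VFX:     {used} bytes")
print(f"reserved per VFX: {reserved} bytes")
print(f"wasted:           {100 * (1 - used / reserved):.2f}%")
print(f"total for {instances} instances: {reserved * instances / 2**30:.2f} GiB")
```

Under these assumptions each VFX reserves exactly 1 MiB for well under 2 KB of data, and 10k instances come to just under 10 GiB, which lines up with the memory usage reported at the start of the thread.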
     
  4. DevDunk

    DevDunk

    Joined:
    Feb 13, 2020
    Posts:
    5,215
    Is there already a bug report made for this?
     
  5. tvirolai

    tvirolai

    Unity Technologies

    Joined:
    Jan 13, 2020
    Posts:
    79
    DX12 mandates 64 KB alignment for buffers. We cannot change that, but we can look into making the VFX system batch them itself. On the API level our hands are unfortunately tied. It would be great if you could file a bug report about this.
     
    DevDunk and Saniell like this.
  6. julian-moschuering

    julian-moschuering

    Joined:
    Apr 15, 2014
    Posts:
    529
    Yes, thank you, we expected as much after reading the DX12 docs.

    What really bothers me is that this is not an unknown issue for you (the 1.0 ECS renderer and VFX batching pretty obviously target it), but you did not include this information in any documentation. Without the new batching, VFX with DX12 is pretty much unusable outside of samples. This information belongs in the ComputeBuffer documentation too. Being more transparent about this would have spared us, and likely others, a lot of time.
     
  7. tvirolai

    tvirolai

    Unity Technologies

    Joined:
    Jan 13, 2020
    Posts:
    79
    In a way it was both a known and an unknown issue. Looking further ahead, we know of a workaround for this; it just depends on the availability of enhanced barriers.

    With the new enhanced barriers we can basically create one bigger buffer and suballocate from it: a ComputeBuffer you allocate is then not a real DX12 buffer but just an offset into a bigger one. This might work with the resource state system too, depending on the exact workings of the aliasing system, as in: just create multiple buffers on top of each other and hope it works.
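As a rough illustration of the suballocation idea described above (not Unity's implementation), here is a minimal bump allocator that hands out offsets inside one large backing buffer. The class name `BumpSuballocator`, the 256-byte sub-alignment, and the buffer sizes are all hypothetical; a real allocator would also need freeing, reuse, and GPU synchronization.

```python
class BumpSuballocator:
    """Hands out offsets inside one large backing buffer (no freeing)."""

    def __init__(self, capacity, alignment=256):
        self.capacity = capacity    # size of the single backing buffer
        self.alignment = alignment  # hypothetical sub-allocation alignment
        self.next_offset = 0        # first unused byte

    def allocate(self, size):
        # Round the current position up to the next aligned offset.
        offset = -(-self.next_offset // self.alignment) * self.alignment
        if offset + size > self.capacity:
            raise MemoryError("backing buffer exhausted")
        self.next_offset = offset + size
        return offset


# One 64 KB backing buffer serving 16 small logical buffers
# (sizes are assumptions based on the numbers quoted earlier in the thread).
pool = BumpSuballocator(capacity=64 * 1024)
offsets = [pool.allocate(size) for size in [1600] + [4] * 15]

print("offsets:", offsets)
print("bytes consumed in backing buffer:", pool.next_offset)
```

In this sketch all 16 logical buffers fit inside a single 64 KB allocation instead of taking 64 KB each, which is the saving the suballocation approach is after.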

    CreateCommittedResource has a lower memory overhead, which is interesting to say the least. But creating committed resources is also far more expensive on the CPU, and the higher-level code spams buffer creation constantly, which is why we aim to use placed resources.

    The reason why the documentation has not been changed is that we will try almost everything to keep the public API unified. Users should not need to care which backend they are running on. Unfortunately sometimes things like these leak up.
     
    DevDunk likes this.
  8. strich

    strich

    Joined:
    Aug 14, 2012
    Posts:
    383
    Is this still a thing in 2022.3.24?