Feedback [Feature Request] Vertex Buffer with LockBufferForWrite when possible.

Discussion in '2022.2 Beta' started by nishikinohojo, Jun 12, 2022.

  1. nishikinohojo (Joined: Aug 31, 2014; Posts: 46)
    I don't know the internal details, but I believe we should be able to write to a vertex buffer directly from a Burst job, at least on some platforms.
    https://developer.apple.com/documen...ynchronization/synchronizing_cpu_and_gpu_work
    Currently, the vertex buffer returned by GetVertexBuffer does not have the LockBufferForWrite usage flag.

    I couldn't test this with 2022.2a because I'm running into a lot of crashes. But as far as I can tell from the release notes, nothing has changed since 2022.1, so I'm requesting it here.

    It would be really nice if this were possible!
    I have written a custom implementation of bone animation and skinning with Burst because I wanted to move work from the GPU to the CPU. (It runs much faster than the built-in Animator and CPU skinning, of course.)
    Currently, the only proper way to feed data into a vertex buffer from the CPU is to create a new GraphicsBuffer with LockBufferForWrite enabled, use it for the (Burst) calculation, and then copy it to the vertex buffer with Graphics.CopyBuffer.
    It already runs bloody fast, but I would like to avoid Graphics.CopyBuffer whenever possible.

    This is the first time I've posted a thread, so if I'm doing anything wrong, please let me know.
     
    Last edited: Jun 12, 2022
  2. TJHeuvel-net (Joined: Jul 31, 2012; Posts: 838)
    What you want to avoid is sending data from your CPU to the GPU. Doing so forces the GPU to wait for the transfer.

    As long as you do your work on the CPU, e.g. Burst, you *have* to do this. It is unavoidable.
     
  3. nishikinohojo (Joined: Aug 31, 2014; Posts: 46)
    Thank you for your reply.
    On a dGPU with dedicated VRAM, I know there's not much I can do.
    But on unified memory devices such as mobile phones specifically, I don't think we need multiple buffers in this scenario. If I'm wrong, sorry! (Or maybe my English was so bad that I couldn't get across what I meant in the first comment? I wrote "on some platforms" to mean iOS. Sorry! Seems like I've learnt something new today.)

    But now I'm having second thoughts. Even if it is possible, this kind of platform-specific optimization may end up as a notoriously complicated API. I'm sure I could handle whatever it turns out to be, but it might not be appropriate for a game engine.
    At this point my skinning implementation runs fast enough, so I think I can retract this request.
     
    Last edited: Jun 27, 2022
  4. nishikinohojo (Joined: Aug 31, 2014; Posts: 46)
    I am reviving this thread.

    After a few months, I reconsidered this and concluded that there is no reason for a vertex buffer not to support LockBufferForWrite.
    Here is what currently happens in my environment with DX12.

    [Attached image: 2022y08m25d_051633781.png]

    EndGraphicsJobs (24 ms) is the time spent waiting for the GraphicsBuffer upload to the GPU and for CopyBuffer.
    What I'm doing here is:

    1. I call LockBufferForWrite on a manually created GraphicsBuffer, like this, so that I can schedule the skinning job in Burst.
    [Attached image: 2022y08m25d_052134409.png]

    2. After the job completes, I call Graphics.CopyBuffer to copy the skinning result to the actual vertex buffer, like this.
    [Attached image: 2022y08m25d_052254421.png]
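
    Since the attached code screenshots are not reproduced here, the two steps above can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the poster's actual code: the class name StagedVertexUpload, the placeholder OffsetJob (a simple translation standing in for real skinning), and the requirement that vertex stream 0 holds only a float3 position (stride 12) are all hypothetical. The buffer target flags are likewise an assumption and may need adjusting per platform.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

// Hypothetical component illustrating "lock staging buffer -> Burst job -> CopyBuffer".
public class StagedVertexUpload : MonoBehaviour
{
    public MeshFilter target;               // assumed to be assigned in the Inspector

    GraphicsBuffer staging;                 // CPU-writable buffer the job writes into
    GraphicsBuffer vertexBuffer;            // the mesh's real vertex buffer (stream 0)
    NativeArray<Vector3> bindPositions;     // CPU-side copy; needs a readable mesh
    int vertexCount;

    // Placeholder for the skinning job: it only translates every vertex.
    [BurstCompile]
    struct OffsetJob : IJobParallelFor
    {
        [ReadOnly] public NativeArray<Vector3> source;
        [WriteOnly] public NativeArray<Vector3> result;
        public Vector3 offset;

        public void Execute(int i) => result[i] = source[i] + offset;
    }

    void Start()
    {
        Mesh mesh = target.mesh;
        vertexCount = mesh.vertexCount;
        bindPositions = new NativeArray<Vector3>(mesh.vertices, Allocator.Persistent);

        // Assumption: requesting CopyDestination before fetching the buffer lets
        // Graphics.CopyBuffer write into it.
        mesh.vertexBufferTarget |= GraphicsBuffer.Target.CopyDestination;
        vertexBuffer = mesh.GetVertexBuffer(0);

        // Assumption: stream 0 is position-only, so the job output matches its layout.
        if (vertexBuffer.stride != 12)
        {
            Debug.LogWarning("Sketch assumes a position-only vertex stream (stride 12).");
            enabled = false;
            return;
        }

        // Step 1 setup: a separate buffer created with the LockBufferForWrite flag.
        staging = new GraphicsBuffer(
            GraphicsBuffer.Target.Structured | GraphicsBuffer.Target.CopySource,
            GraphicsBuffer.UsageFlags.LockBufferForWrite,
            vertexCount, vertexBuffer.stride);
    }

    void Update()
    {
        // Step 1: lock the staging buffer and let the Burst job fill it.
        NativeArray<Vector3> dst = staging.LockBufferForWrite<Vector3>(0, vertexCount);
        var job = new OffsetJob
        {
            source = bindPositions,
            result = dst,
            offset = new Vector3(0f, Mathf.Sin(Time.time), 0f)
        };
        job.Schedule(vertexCount, 64).Complete();
        staging.UnlockBufferAfterWrite<Vector3>(vertexCount);

        // Step 2: copy the result into the actual vertex buffer.
        // This is the extra copy the feature request would like to avoid.
        Graphics.CopyBuffer(staging, vertexBuffer);
    }

    void OnDestroy()
    {
        staging?.Dispose();
        vertexBuffer?.Dispose();
        if (bindPositions.IsCreated) bindPositions.Dispose();
    }
}
```

    In this sketch the job is completed immediately for brevity; a real implementation would presumably schedule it early in the frame and complete it later, before unlocking the buffer.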

    BTW, this is my custom skinning and animation solution.

    Seriously, CopyBuffer should be avoidable in this situation because I have a CPU-side copy of the buffer. I mean, I am setting the mesh's isReadable to true. (I don't think that matters much for LockBufferForWrite, though.)
    On some platforms LockBufferForWrite seems to return a direct pointer to the GPU resource, but even so, Burst can handle that because that is the point of the API. (I am assuming this happens on unified memory architectures.)
    https://docs.unity3d.com/2022.2/Documentation/ScriptReference/GraphicsBuffer.LockBufferForWrite.html


    If I could use a vertex buffer directly in a job, this is the result I would expect. (I crudely emulated it by simply removing Graphics.CopyBuffer.)
    EndGraphicsJobs finishes much faster.

    [Attached image: 2022y08m25d_051630065.png]

    This makes a difference.
    I do not expect this for 2022.2, but in the future, I think it would help somebody if implemented properly.
     
    Last edited: Aug 24, 2022