
Bug ComputeBuffer.BeginWrite takes a long time if programs like OBS (Open Broadcaster Software) are active

Discussion in 'Data Oriented Technology Stack' started by Guedez, Aug 1, 2020.

  1. Guedez

    Guedez

    Joined:
    Jun 1, 2012
    Posts:
    448

    I would assume it's because the previous operation didn't end yet or something?
     
  2. Guedez

    Guedez

    Joined:
    Jun 1, 2012
    Posts:
    448
    In the editor, this is only a slowdown; in the player, it either completely hangs or straight up crashes the player.
    Built in 2020.1.0b16.4139
     
  3. tertle

    tertle

    Joined:
    Jan 25, 2011
    Posts:
    2,403
    What are you putting in it? Seems like more of a question for graphics forums.
     
  4. Guedez

    Guedez

    Joined:
    Jun 1, 2012
    Posts:
    448
    Well, I thought this would be the place because it outputs NativeArrays.
    I am updating many non-consecutive pieces of a large compute buffer. BufferContainer.cuthits.Length is 4~10 long each frame.
    Code (CSharp):
    // Run the job and wait for it, so job.odata is fully written before copying.
    complete = job.Schedule(BufferContainer.cuthits.Length, 1, input);
    complete.Complete();
    for (int o = 0; o < BufferContainer.cuthits.Length; o++) {
        // Map the 1024-uint slice at cuthits[o].index and copy this hit's job output into it.
        var t = BufferContainer.InstanceDataLenght.BeginWrite<uint>(BufferContainer.cuthits[o].index * 1024, 1024);
        NativeArray<uint>.Copy(job.odata, o * 1024, t, 0, 1024);
        BufferContainer.InstanceDataLenght.EndWrite<uint>(1024);
        // Write the current game time into the matching single-float slot of the Lenght buffer.
        var t2 = BufferContainer.Lenght.BeginWrite<float>(BufferContainer.cuthits[o].index * 1, 1);
        t2[0] = EverydayEngine.GameEngine.CurrentGameTime;
        BufferContainer.Lenght.EndWrite<float>(1);
    }
     
  5. sngdan

    sngdan

    Joined:
    Feb 7, 2014
    Posts:
    1,006
    I am interested in this too. A long time ago I wanted to optimize a render system we toyed with in the forum and get rid of the double buffer. I never got to it, but I think I had an issue with only updating parts of the buffer too.

    I think it's relevant here as it's strongly related to DOTS (native containers, jobs / chunk did-change): only upload changed chunks to the GPU.

    Maybe it was just a bug when I tried it (I recall this was the first Unity version that offered this API).
     
  6. Guedez

    Guedez

    Joined:
    Jun 1, 2012
    Posts:
    448
    I've actually attempted to fix this with a ComputeShader, but it turns out ComputeShaders can neither read nor write ComputeBufferMode.SubUpdates buffers.
    I am downloading 2020.1 to upgrade from the beta version, hoping this is not an issue there.

    Update: The issue exists in 2020.1.0f1 too, but it's much harder to reproduce and I am no longer sure why or when it happens.

    It seems related to this message, which does not appear at any of the console levels but does appear in the Editor.log in AppData:
    Code (CSharp):
    d3d11: multiple uploads in flight for buffer 0000026FE1B399D8 of size 46136. Falling back to slow path
    d3d11: multiple uploads in flight for buffer 0000026FE1B3AA58 of size 39370752. Falling back to slow path
    d3d11: multiple uploads in flight for buffer 0000026FE1B3AA58 of size 39370752. Falling back to slow path
    d3d11: multiple uploads in flight for buffer 0000026FE1B407D8 of size 39370752. Falling back to slow path
    d3d11: multiple uploads in flight for buffer 0000026FE1B407D8 of size 39370752. Falling back to slow path
    d3d11: multiple uploads in flight for buffer 0000026FE1B399D8 of size 46136. Falling back to slow path
    d3d11: multiple uploads in flight for buffer 0000026FE1B3AA58 of size 39370752. Falling back to slow path
    d3d11: multiple uploads in flight for buffer 0000026FE1B3AA58 of size 39370752. Falling back to slow path
    d3d11: multiple uploads in flight for buffer 0000026FE1B407D8 of size 39370752. Falling back to slow path
    d3d11: multiple uploads in flight for buffer 0000026FE1B407D8 of size 39370752. Falling back to slow path
    d3d11: multiple uploads in flight for buffer 0000026FE1B407D8 of size 39370752. Falling back to slow path
     
    Last edited: Aug 2, 2020
  7. Guedez

    Guedez

    Joined:
    Jun 1, 2012
    Posts:
    448
    It took way longer to figure out how and why it was happening, but it turns out it only happens when the game is being captured by a program like OBS (Open Broadcaster Software).
     
  8. OndrejP

    OndrejP

    Joined:
    Jul 19, 2017
    Posts:
    101
    There's great information about GPU buffer management here:
    https://www.gamedevs.org/uploads/efficient-buffer-management.pdf

    My guess is that you're trying to overwrite data that is currently in use by the GPU, so the CPU has to wait for it.

    It's typical for the GPU to get a little behind when the screen is being captured, although 500ms is way more than I would have expected. Maybe Unity is performing a full sync between the CPU and GPU for some reason.

    A good solution might be to create a bigger "circular buffer" so you can write data for multiple frames without overwriting.
    You'd use BeginWrite, starting at offset 0 on frame 0. Each frame, increase the offset by the amount of data written. After several frames (when the buffer is almost full), start over at offset 0. This way the GPU can still use the data when it's one or more frames behind. You'll need to modify the compute shader a bit to read from the correct offset, but it shouldn't be hard. This would only be useful when overwriting the whole buffer.
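
    The offset bookkeeping described above could look roughly like the sketch below. RingOffsetAllocator is a hypothetical helper, not something from Unity or from this thread. It only hands out start indices and wraps to 0 when a write would run past the end; it does not track how far the GPU has actually read.

```csharp
using System;

// Hypothetical helper (an illustration, not a Unity API): hands out write
// offsets inside a buffer sized to hold several frames of data.
public sealed class RingOffsetAllocator
{
    private readonly int capacity; // total number of elements in the buffer
    private int head;              // index of the next free element

    public RingOffsetAllocator(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        this.capacity = capacity;
    }

    // Returns the start index for a write of `count` elements.
    public int Allocate(int count)
    {
        if (count <= 0 || count > capacity) throw new ArgumentOutOfRangeException(nameof(count));
        if (head + count > capacity) head = 0; // no room left at the tail: start over at 0
        int offset = head;
        head += count;
        return offset;
    }
}
```

    The returned offset would be the start index passed to BeginWrite, and the shader would need the same offset to know where this frame's data begins. The capacity has to be large enough to cover however many frames the GPU may lag behind, which is an assumption this sketch cannot check.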

    I also see that you're doing multiple Begin/End writes; it might be better to do one big Begin/End write when the data is contiguous.
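
    One way to find where a single big write is possible is to group the modified tile indices into runs of consecutive indices. The sketch below is only an illustration: TileRuns and Coalesce are hypothetical names, and the tile indices stand in for whatever index the writes are keyed on.

```csharp
using System.Collections.Generic;

// Hypothetical helper (not from the thread): groups tile indices into
// (start, count) runs of consecutive indices.
public static class TileRuns
{
    public static List<(int start, int count)> Coalesce(IEnumerable<int> tileIndices)
    {
        var sorted = new List<int>(tileIndices);
        sorted.Sort();

        var runs = new List<(int start, int count)>();
        foreach (int index in sorted)
        {
            if (runs.Count > 0)
            {
                var last = runs[runs.Count - 1];
                if (index == last.start + last.count - 1) continue; // duplicate index, already covered
                if (index == last.start + last.count)
                {
                    runs[runs.Count - 1] = (last.start, last.count + 1); // extends the current run
                    continue;
                }
            }
            runs.Add((index, 1)); // gap: start a new run
        }
        return runs;
    }
}
```

    Each run could then become one Begin/End write covering count tiles starting at tile start, provided the source data is laid out in the same tile order. If the modified tiles are rarely adjacent, most runs will have a count of 1 and this saves little.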

    Do you always overwrite the whole buffer? Because if you're writing 40 MB every frame, I can see why this can become a problem (40 MB x 60 fps = 2.4 GB/s). I see from the profiler that it's something called grass cut.

    Maybe you can split the buffer into multiple smaller buffers and update just a few of them? Or write a compute shader that performs the grass cutting directly on the GPU? (Only upload the cutting parameters, like position, radius...)
    This is similar to how Unity terrain brushes work.
     
  9. Guedez

    Guedez

    Joined:
    Jun 1, 2012
    Posts:
    448
    Unfortunately I am hardly ever writing contiguous data; I only get that when the player is at the very center of the map, since the data being written is the grass data that spans the entire map, split into tiles. Each time a tile has any blade of grass changed, the whole tile is written to the GPU. And that is for each grass bender instance; ECS can handle a whole lot of them without much issue.

    Example image of a single grass bender affecting 4 tiles, thus causing 4096 uints to be uploaded to the GPU.
    No, I only upload the whole tile for tiles that were modified this frame, which is 1 to 4 per grass bender, at 1024 uints per tile. Grass cut uses the same logic but happens less often, so grass bending and grass cutting are equal as far as this specific issue goes.
    I wish, but unless there is a 'computebufferarray' or similar, it's not really possible. This whole field of grass is drawn in 3 draw calls: one for close range (cast and receive shadows), one for middle range (only receive shadows), and one for far away with no shadows. So the information for all of the tiles needs to be supplied to the shader to use on demand, which is 3 huge ~9k*1024*4 bytes buffers. Unfortunately, I can't occlude tiles with Umbra, because there is a lot of wasted geometry in this screenshot
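
    To put numbers on the sizes mentioned in this thread, here is a small worked calculation. UploadCost is a hypothetical helper. The 39370752-byte size is taken from the Editor.log lines earlier in the thread, on the assumption that those lines refer to these grass buffers, and 60 fps is the figure used in the earlier reply.

```csharp
// Hypothetical helper for the arithmetic only; it touches no Unity API.
public static class UploadCost
{
    public const int UintsPerTile = 1024;                        // tile size stated in the thread
    public const int BytesPerTile = UintsPerTile * sizeof(uint); // 1024 * 4 = 4096 bytes

    // How many whole tiles fit in a buffer of the given size.
    public static long TilesInBuffer(long bufferBytes) => bufferBytes / BytesPerTile;

    // Upload rate if `bytesPerFrame` bytes are written every frame.
    public static long BytesPerSecond(long bytesPerFrame, int fps) => bytesPerFrame * fps;
}
```

    With these inputs, TilesInBuffer(39370752) gives 9612 tiles, which matches the roughly 9k tiles figure. Uploading one whole buffer every frame at 60 fps would be 2,362,245,120 bytes per second, while uploading 4 modified tiles per frame (4 * 4096 bytes) is 983,040 bytes per second.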


    I guess... But that would be a whole lot of duplicated work.
    I actually tried to put all of the data in a single contiguous buffer and use a ComputeShader to place each tile in its place, but it seems that ComputeBuffers with ComputeBufferMode.SubUpdates cannot be written by ComputeShaders.
     