Why is the number of live particles (uint, 4 bytes) read back synchronously from the GPU?

Discussion in 'Visual Effect Graph' started by fabio_polimeni, Jun 9, 2022.

  1. fabio_polimeni

    Joined:
    Nov 30, 2021
    Posts:
    2
    Hello everybody,

    I am new to VFX Graph (currently on 2022.1, v13) and I am trying to optimise some spikes I have been observing on some of the particle systems. They seem to come down to the fact that every now and then there is a readback from GPU to CPU of a 4-byte buffer. It looks like it is there to query the number of live particles, as I never observe the value being higher than the Capacity field of the initializer block.

    Now, this readback, as far as I can tell, comes from inside the DLL (not part of the C# source code shared in the GitHub repository), and I can't figure out why this is happening, what it is used for, or whether there is any workaround for it.

    Can anyone at Unity shed some light on this? What is this readback used for? Is it strictly necessary? Can we avoid or replace it? Is there a combination of blocks/contexts that will not trigger this readback?

    Thanks in advance,
    Fabio
     
  2. JulienF_Unity

    Unity Technologies

    Joined:
    Dec 17, 2015
    Posts:
    326
    Hi. Yes, VFX Graph is doing some asynchronous readbacks under the hood, typically so that the CPU knows the particle count and is able to put the effect into a sleep state when no particles are alive and none are being spawned.

    However, this shouldn't be causing spikes, as it's asynchronous. Would you be able to provide a small repro and/or a profiler screenshot?
     
  3. fabio_polimeni

    Joined:
    Nov 30, 2021
    Posts:
    2
    Hi, thanks Julien for the quick reply. What I am observing are spikes at driver level, and you won't necessarily be able to see any noticeable difference with typical local rendering setups. Let's say readbacks hurt us several orders of magnitude more than they normally do when, for example, PCI x16 is locally available (not the case I am looking at).

    By the way, the test case is the Spaceship Demo, but I doubt it will be of any benefit to you: because of the above, it is unlikely you will see any measurable difference with a local setup.

    Finally, I say these are "synchronous" because when I look at the params received by D3D11DeviceContext::Map(...), MapFlags != D3D11_MAP_FLAG_DO_NOT_WAIT. I guess you can still get similar asynchronous readback behaviour if you use an array (ring buffer?) of resources you want to read from, and you don't care too much about the immediate correctness of the data.

    These are all assumptions, because I can only observe what the driver receives, so if something is off, please let me know. Otherwise, would D3D11_MAP_FLAG_DO_NOT_WAIT be something viable to consider at implementation level?
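
    To make the ring-buffer idea concrete, here is a minimal sketch in plain C++. It does not call D3D11 at all: StagingSlot, ReadbackRing, and the frame-based latency are hypothetical stand-ins I made up for illustration, and the "do not wait" behaviour is only modelled by skipping slots whose copy has not finished yet. It is not how Unity implements its readback.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for one staging buffer holding a 4-byte particle count.
struct StagingSlot {
    std::uint32_t value = 0;       // data the simulated GPU copied into the slot
    std::uint64_t readyFrame = 0;  // frame at which the simulated copy finishes
    bool pending = false;          // true while a copy is in flight
};

// Ring of N staging slots. The CPU only reads slots whose copy has finished,
// so it never stalls; the price is that the value may be a few frames old.
template <std::size_t N>
class ReadbackRing {
public:
    // Queue a copy of gpuValue that completes `latency` frames after `frame`.
    // Returns false if the next slot is still in flight (caller skips this frame).
    bool enqueue(std::uint32_t gpuValue, std::uint64_t frame, std::uint64_t latency) {
        StagingSlot& slot = slots_[write_ % N];
        if (slot.pending) return false;
        slot.value = gpuValue;
        slot.readyFrame = frame + latency;
        slot.pending = true;
        ++write_;
        return true;
    }

    // Non-blocking poll, similar in spirit to mapping with a do-not-wait flag:
    // consume finished slots in order and report the newest finished value.
    // Returns false (and leaves `out` untouched) if nothing has finished yet.
    bool poll(std::uint64_t frame, std::uint32_t& out) {
        bool gotValue = false;
        while (read_ < write_) {
            StagingSlot& slot = slots_[read_ % N];
            assert(slot.pending);                // queued slots are always in flight
            if (slot.readyFrame > frame) break;  // not ready: give up, do not wait
            out = slot.value;
            slot.pending = false;
            ++read_;
            gotValue = true;
        }
        return gotValue;
    }

private:
    std::array<StagingSlot, N> slots_{};
    std::uint64_t write_ = 0;  // total copies queued so far
    std::uint64_t read_ = 0;   // total copies consumed so far
};
```

    With three slots and a two-frame latency, a count queued on frame 0 only becomes visible to poll on frame 2, which is the "stale but never blocking" trade-off I mean.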

    Thanks,
    Fabio
     
  4. JulienF_Unity

    Unity Technologies

    Joined:
    Dec 17, 2015
    Posts:
    326
    This is strange. We don't use D3D11_MAP_FLAG_DO_NOT_WAIT but the map should never be blocking anyway as we're using a fence to ensure the copy was performed.
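
    As a rough illustration of the fence idea only (a toy model in plain C++, not the engine code and not real D3D11 calls): GpuTimeline, PendingReadback, and the function names below are hypothetical. The point it shows is that the CPU checks the fence first and only maps once the copy is known to be done, so the map itself has nothing left to wait for.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a GPU timeline. Work submitted with fence value F
// counts as finished once completed >= F.
struct GpuTimeline {
    std::uint64_t submitted = 0;  // last fence value queued by the CPU
    std::uint64_t completed = 0;  // last fence value the simulated GPU has passed

    // CPU side: queue a fence after the copy and return its value.
    std::uint64_t signal() { return ++submitted; }

    // Simulate the GPU retiring up to n queued fences.
    void retire(std::uint64_t n) {
        const std::uint64_t remaining = submitted - completed;
        completed += (n < remaining) ? n : remaining;
    }
};

// A readback request: the staged value plus the fence that guards it.
struct PendingReadback {
    std::uint32_t stagedValue = 0;
    std::uint64_t fenceValue = 0;
};

// Queue the copy of the live-particle count and a fence right after it.
PendingReadback requestReadback(GpuTimeline& gpu, std::uint32_t liveParticles) {
    PendingReadback rb;
    rb.stagedValue = liveParticles;
    rb.fenceValue = gpu.signal();
    return rb;
}

// Cheap CPU-side check, done once per frame instead of blocking.
bool fenceReached(const GpuTimeline& gpu, const PendingReadback& rb) {
    return gpu.completed >= rb.fenceValue;
}

// Stand-in for the map call. By contract it is only called after the fence
// has been reached, so there is no outstanding GPU work to stall on.
std::uint32_t mapAndRead(const GpuTimeline& gpu, const PendingReadback& rb) {
    assert(fenceReached(gpu, rb) && "mapping before the fence would stall");
    return rb.stagedValue;
}
```

    In a frame loop the caller would test fenceReached each frame and call mapAndRead only when it returns true, carrying the request over to the next frame otherwise.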
     
  5. JulienF_Unity

    Unity Technologies

    Joined:
    Dec 17, 2015
    Posts:
    326
    Maybe there's something suspicious in the DX11 async readback implementation, unsure. If you're able to file a bug with relevant data, we can take a look. By the way, this is not really related to VFX Graph: some other systems in Unity also use async readback (HDRP for sky ambient, for instance).