Question Why two DispatchRays(n) and one DispatchRays(n * 2) the work time consumption is not the same

Discussion in 'HDRP Ray Tracing' started by KU5D, Mar 9, 2021.

  1. KU5D


    Joined:
    Dec 11, 2017
    Posts:
    3
    Hi,
    I am trying to implement screen-space soft shadows using DXR, but I have run into performance issues.

    RayTracingMode is Static.
    The RayTracingAccelerationStructure is not rebuilt repeatedly.
    The settings are the same in both cases.
    The scene contains 5k cubes. [two screenshots attached]
    context.ExecuteCommandBuffer(buffer);
    buffer.Clear();
    ------------------------------------------------------ [two screenshots attached]
    context.ExecuteCommandBuffer(buffer);
    buffer.Clear();

    How can I reduce the time taken by multiple DispatchRays calls?
    Or should I use only one DispatchRays call?
     
  2. INedelcu


    Unity Technologies

    Joined:
    Jul 14, 2015
    Posts:
    173
    Hi!

    Resource binding in Unity is not persistent. It's stateless. Every explicit draw call (e.g. Graphics.DrawMesh), compute dispatch and ray tracing dispatch binds all the resources that it needs. In ray tracing dispatches, the resource and parameter setup comes from various places: materials (used by Renderers in the RayTracingAccelerationStructure), resources and values set using Shader.SetGlobalXXX, property blocks set using Renderer.SetPropertyBlock, and other settings on the MeshRenderer.

    The cost of RayTracing.Dispatch depends on how many Renderers are in the RayTracingAccelerationStructure (how complex the scene is) and how many cores your CPU has.
     
  3. m0nsky


    Joined:
    Dec 9, 2015
    Posts:
    257
    Is this potentially something HDRP Ray Tracing could benefit from? Would the total cost be reduced (like OP is pointing out) if we would schedule all ray tracing effects 'jobs' for a final "collected" dispatch (in the HDRaytracingDeferredLightLoop?) instead of doing a separate DispatchRays call for every effect?
     
  4. Camarent


    Joined:
    Feb 19, 2014
    Posts:
    168
    Did you manage to find an answer to your question? It would be interesting to know whether this is a possible way to optimize RTX effects.
     
  5. m0nsky


    Joined:
    Dec 9, 2015
    Posts:
    257
    A collected dispatch could definitely bring performance gains when it comes to ray binning.

    I did some digging into this about a month ago, and as far as I could see (and measure), the current solution seems to be inefficient and can actually result in a net loss in many cases, but please correct me if I'm wrong.

    1. Because the rays are binned separately for every effect, and this is done twice (both eyes) in XR, the ray binning overhead cost will add up. If you are running the full ray tracing stack (RTGI, RTR, RTAO, RTSS), it will execute 8 separate ray binning passes.

    2. Because the rays are dispatched separately, every set of binned rays will start BVH traversal from the beginning (8x in this case), so they will not trigger the cache hits we are trying to achieve.

    If we merged all of these together, and binned and dispatched these rays at once, we would get the maximum attainable number of BVH cache hits, and the overhead of the repeated binning passes would be eliminated.

    (For more info about ray binning, check out this Battlefield V presentation from GDC 2019, starting from page 20)
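
    To make the "bin once for all effects" idea concrete, here is a minimal CPU-side sketch in plain C#. It is not HDRP's implementation: the octant-by-sign binning scheme, the `Ray` struct, the `EffectId` tag and all method names are assumptions chosen for illustration only. Real binning runs on the GPU and typically works per screen tile.

```csharp
using System.Collections.Generic;
using System.Numerics;

public static class RayBinningSketch
{
    // Hypothetical ray record; EffectId tags which effect produced the ray.
    public readonly struct Ray
    {
        public readonly Vector3 Origin;
        public readonly Vector3 Direction;
        public readonly int EffectId;

        public Ray(Vector3 origin, Vector3 direction, int effectId)
        {
            Origin = origin;
            Direction = direction;
            EffectId = effectId;
        }
    }

    // Maps a direction to one of 8 bins using the sign of each component
    // (a deliberately simple stand-in for a real binning key).
    public static int OctantBin(Vector3 d)
    {
        int bin = 0;
        if (d.X < 0f) bin |= 1;
        if (d.Y < 0f) bin |= 2;
        if (d.Z < 0f) bin |= 4;
        return bin;
    }

    // Bins the rays of ALL effects in a single pass, so rays with similar
    // directions end up next to each other regardless of their source effect.
    public static List<Ray>[] BinTogether(IEnumerable<Ray> allRays)
    {
        var bins = new List<Ray>[8];
        for (int i = 0; i < bins.Length; i++)
            bins[i] = new List<Ray>();

        foreach (var ray in allRays)
            bins[OctantBin(ray.Direction)].Add(ray);

        return bins;
    }
}
```

    With per-effect binning, `BinTogether` would be called once per effect (and per eye in XR). With the merged approach it is called once on the concatenated ray list, and the shader would use the effect tag to decide what to do with each hit.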
     
    Last edited: Sep 13, 2021
  6. KU5D


    Joined:
    Dec 11, 2017
    Posts:
    3
    I don't know how HDRP works, so I can't say what the result would be there.
    But in a custom SRP, combining the dispatches into one does help when the number of entities is large.
     
  7. KU5D


    Joined:
    Dec 11, 2017
    Posts:
    3
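    A (C#):
    buffer.DispatchRays(shadername, "job1", x, y, 1, camera);
    buffer.DispatchRays(shadername, "job2", x, y, 1, camera);

    B (C#):
    buffer.DispatchRays(shadername, "job3", x, y, 2, camera);

    Shader (ray generation shader "job3"):
    // z is the depth component of the dispatch index, e.g. DispatchRaysIndex().z
    if (z == 0)
    {
        job1Code
        return;
    }
    job2Code
    return;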

    With a large number of entities, B is much faster than A. If the meshes are not merged, B can be considered the fastest option; with merged meshes, the speed difference is not certain.
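
    For readers who want to try the A/B comparison above, here is a minimal sketch of how the two variants could be recorded with Unity's `CommandBuffer.DispatchRays`. The class and method names (`MergedRayDispatch`, `RecordSeparate`, `RecordMerged`) are hypothetical, the ray generation shader names "job1", "job2" and "job3" come from the post, and it assumes the shader pass and the acceleration structure have already been set on the command buffer.

```csharp
using UnityEngine;
using UnityEngine.Rendering;
// RayTracingShader lives in this namespace in Unity 2019.3 to 2022.x;
// newer versions expose it from UnityEngine.Rendering instead.
using UnityEngine.Experimental.Rendering;

public static class MergedRayDispatch
{
    // Variant A: one DispatchRays call per job.
    // Each call makes Unity bind all required resources again.
    public static void RecordSeparate(CommandBuffer buffer, RayTracingShader shader,
                                      uint width, uint height, Camera camera)
    {
        buffer.DispatchRays(shader, "job1", width, height, 1, camera);
        buffer.DispatchRays(shader, "job2", width, height, 1, camera);
    }

    // Variant B: a single call with depth = 2.
    // The ray generation shader "job3" is assumed to branch on
    // DispatchRaysIndex().z: slice 0 runs job 1, slice 1 runs job 2.
    public static void RecordMerged(CommandBuffer buffer, RayTracingShader shader,
                                    uint width, uint height, Camera camera)
    {
        buffer.DispatchRays(shader, "job3", width, height, 2, camera);
    }
}
```

    Both variants launch the same number of rays (width * height * 2). The difference is that B pays the per-dispatch resource binding cost once instead of twice, which is where the saving reported above would come from.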

    I used Google Translate, so I am not sure the translation is accurate.