Efficient resolves and tiled memory

Discussion in 'Shaders' started by DavidSWu, Mar 4, 2020.

  1. DavidSWu

    Joined:
    Jun 20, 2016
    Posts:
    183
    I want to do the following:
    - Render to the frame buffer
    - Resolve to a texture/RenderTexture
    - Continue rendering to the frame buffer, using that resolved texture for things like refraction effects

    This works naturally with the RenderPass API, but that is not well supported.
    If I do a blit, I have to switch away from my active render target, use the shader pipeline, and then restore color and depth to the frame buffer.
    If I use CopyTexture from the frame buffer, odd things happen.
    ResolveAntiAliasedSurface only works with anti-aliased surfaces, and it requires a RenderTexture rather than a RenderTargetIdentifier.

    Thanks!
     
  2. aleksandrk

    Unity Technologies

    Joined:
    Jul 3, 2017
    Posts:
    1,906
    Do you mean "copy"?

    It sounds like this would be the way to go:
    - Render to a rendertexture
    - Do a blit of this render texture to the framebuffer
    - Finish rendering using the same texture to the framebuffer

    You end up having just one intermediate store operation in this case.
     
  3. DavidSWu

    Joined:
    Jun 20, 2016
    Posts:
    183
    That definitely works, assuming that you also copy Z (depth) in the blit.
    However, on mobile platforms, all rendering goes to tiled memory and there is dedicated hardware for copying/resolving from the tiled memory to RAM.
    So ideally I should be able to render and resolve at will to a RenderTexture and then continue rendering.
    Having said that, since only one tile can be active at a time, I might need to resolve each tile twice if I want to read texels from outside the tile currently being rendered; otherwise I could overwrite texels that I am still reading from and create feedback.
    Still, resolving twice would be more efficient than resolving twice and also doing a shader-driven pixel and Z blit.
     
  4. aleksandrk

    Unity Technologies

    Joined:
    Jul 3, 2017
    Posts:
    1,906
    The only way to resolve to a render texture is to do a copy in this case. When you make a copy to a render texture, you still make a copy of the data from tile memory to system RAM.

    If you need to read neighbouring pixels, RenderPass API will not help you much - it's going to be two render passes anyway, exactly for the reason you mentioned, as you need to finish rendering the whole render target before accessing it later.
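
    The feedback problem mentioned above can be illustrated with a toy model. This is only a sketch, not how any real GPU or Unity API works: the 1D "frame buffer", the tile size, the lookup offset and the function names are all hypothetical. A second pass renders one tile at a time and reads a texel two positions to the left, either from the live target or from a complete copy taken after the first pass.

```python
# Toy model of tile-by-tile rendering (all sizes and names are hypothetical).
WIDTH, TILE, OFFSET = 8, 4, -2   # 8 texels, 4 per tile, read 2 texels to the left


def first_pass():
    """Pass 1: some scene color per texel, fully stored to memory."""
    return [10 * (x + 1) for x in range(WIDTH)]


def second_pass(read_from_copy):
    """Pass 2: add a neighbouring texel to each texel, one tile at a time."""
    fb = first_pass()
    # Either a complete snapshot of pass 1, or the live target being overwritten.
    source = list(fb) if read_from_copy else fb
    for start in range(0, WIDTH, TILE):
        tile = fb[start:start + TILE]          # load this tile into "tile memory"
        for i in range(TILE):
            nx = min(max(start + i + OFFSET, 0), WIDTH - 1)
            tile[i] = tile[i] + source[nx]     # neighbour lookup
        fb[start:start + TILE] = tile          # store the finished tile
    return fb


print(second_pass(read_from_copy=True))   # reads only pass-1 values
print(second_pass(read_from_copy=False))  # second tile reads pass-2 values of the first
```

    In this model the two results differ at the texels of the second tile that look back into the first tile, because that tile has already been overwritten by the time it is read. Reading from a finished copy avoids this, which is why the neighbourhood case needs two passes.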
     
  5. DavidSWu

    Joined:
    Jun 20, 2016
    Posts:
    183
    Thank you for the information.
    Regarding
    "When you make a copy to a render texture, you still make a copy of the data from tile memory to system RAM."
    Do you mean that it copies to system-readable memory as well as GPU-readable memory? (Last I checked, memory on mobile is unified, but some of it passes through the CPU cache and some of it does not.)
    Also, would it be better to use a RenderTexture or a Texture2D or would it not make a difference?

    We are planning to provide a low-detail pathway through the renderer, which only requires the current pixel's color and depth, if it is significantly faster, but we are still evaluating the performance.
     
  6. aleksandrk

    Unity Technologies

    Joined:
    Jul 3, 2017
    Posts:
    1,906
    GPU-readable memory. But this doesn't really matter because, as you mentioned, memory on mobile is unified.
    What matters here is that it does a memcpy, and memory bandwidth is usually the biggest battery drainer (and perf-limiting factor) on mobile.
    RenderTexture vs. Texture2D: no difference. What does matter is the pixel format: the lower the bpp, the better (memory bandwidth again).
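
    To put rough numbers on the bpp point, here is a small sketch of the bytes moved by one full-target load or store. The 1920x1080 resolution and the format list are assumptions chosen for illustration, and the helper name is hypothetical; real hardware may also compress or pad the data.

```python
def transfer_bytes(width, height, bits_per_pixel):
    """Bytes moved by one uncompressed full-target load or store."""
    return width * height * bits_per_pixel // 8


# Hypothetical 1920x1080 target in three common pixel sizes.
for name, bpp in [("16 bpp", 16), ("32 bpp", 32), ("64 bpp", 64)]:
    print(name, transfer_bytes(1920, 1080, bpp), "bytes per load/store")
```

    Halving the bpp halves the traffic of every load and store on that target.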

    Here's a comparison of what would need to be done with your proposed approach vs. what I suggested.

    Your approach:
    - Render to frame buffer: set frame buffer as target, render: loadOp = clear, storeOp = store
    - Copy to render texture: set render texture as target, blit: loadOp = dontCare, storeOp = store
    - Render to frame buffer again: set frame buffer as target, render: loadOp = load, storeOp = store

    My suggestion:
    - Render to render texture: set render texture as target, render: loadOp = clear, storeOp = store
    - Render to frame buffer: set frame buffer as target, blit, then render: loadOp = dontCare, storeOp = store

    If the pixel formats don't change, and the render texture is the same size and bpp as the frame buffer, you get 4x FBSize of memory traffic in your case vs. 2x FBSize when rendering directly to the render texture first.
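
    The 4x vs. 2x figures can be reproduced by tallying the load and store operations in the two lists above. This is a sketch under one assumption: each "load" and each "store" counts as one full-target transfer, and texture reads done by the blit itself are not counted. The list and function names are hypothetical.

```python
# Each step: (description, load_op, store_op), taken from the two lists above.
COPY_AFTER = [
    ("render to frame buffer", "clear", "store"),
    ("blit to render texture", "dontCare", "store"),
    ("render to frame buffer again", "load", "store"),
]
RT_FIRST = [
    ("render to render texture", "clear", "store"),
    ("blit, then render to frame buffer", "dontCare", "store"),
]


def traffic(steps):
    """Count full-target transfers: one per 'load' and one per 'store'."""
    return sum((load == "load") + (store == "store") for _, load, store in steps)


print("copy after rendering:", traffic(COPY_AFTER), "x FBSize")
print("render texture first:", traffic(RT_FIRST), "x FBSize")
```

    Clear and dontCare load operations cost nothing in this model because they do not read the target back from memory.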
     
  7. DavidSWu

    Joined:
    Jun 20, 2016
    Posts:
    183
    Thank you for the information.
    I was hoping to avoid the step "Render to frame buffer again: set frame buffer as target, render: loadOp = load, storeOp = store" by using something like ResolveAntiAliasedSurface or CopyTexture.

    I was under the impression that you could not render to a render texture directly on mobile, i.e. that you only rendered to tiled memory and then resolved it to wherever you wanted to store it.
    This may be out of date now.
     
  8. aleksandrk

    Unity Technologies

    Joined:
    Jul 3, 2017
    Posts:
    1,906
    No, it still doesn't render to it directly. Everything goes to tile memory first, and then, depending on the store action, it is either written to system memory or discarded.

    This could benefit from using the RenderPass API, but only if you need to sample the current pixel and not the neighbourhood. The render texture in this case can be declared transient, and it would not even get memory allocated.
     
  9. DavidSWu

    Joined:
    Jun 20, 2016
    Posts:
    183
    Thank you for the information. I look forward to using the RenderPass API when it is available.
     