Question Most Efficient Mechanism to Capture Camera Render Target as Single Pixel Color (Standard Pipeline)

stonstad · Nov 13, 2022

My goal is to capture Camera output as an averaged color to drive external RGB lighting. I have this working by using a second camera rendering to a 64x64 render target texture with mip maps enabled. I then call GetPixel on the 1x1 mip level. This works, but it costs about 10 FPS on my machine and it gets slower as scene complexity increases. I'd like to see if there is a more efficient approach.

Does anyone know if there is a more performant way of doing this? I see that I can call AsyncGPUReadback and read from a specific mip level...

Do I convert my current camera to use a render target output ... call AsyncGPUReadback in OnRenderImage, and then blit the result back? I see many approaches but it isn't clear what the happy path is for ideal performance. What would you do in this scenario?

joshuacwilde · Nov 14, 2022

Use AsyncGPUReadback. Also, unless your scene is very simple, you will get better performance by calculating mips from your original camera render texture rather than using a second camera, since a second camera will roughly double your CPU rendering cost.
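For anyone following along, a minimal sketch of this suggestion might look like the following. It assumes the built-in render pipeline and a script attached to the main Camera; the class name AverageColorProbe, the 64x64 size, and the Average property are made up for illustration. Note that blitting a full-resolution image straight to 64x64 with bilinear filtering only samples a few source texels per output pixel, so the result is an approximation of the true average.

```csharp
using UnityEngine;
using UnityEngine.Rendering;

// Hypothetical component: attach to the main Camera (built-in pipeline only).
[RequireComponent(typeof(Camera))]
public class AverageColorProbe : MonoBehaviour
{
    const int Size = 64;                 // power of two, so the mip chain ends at 1x1
    RenderTexture _small;

    // Most recent averaged color delivered by a readback (a few frames old).
    public Color32 Average { get; private set; }

    void OnEnable()
    {
        _small = new RenderTexture(Size, Size, 0, RenderTextureFormat.ARGB32)
        {
            useMipMap = true,
            autoGenerateMips = false     // mips are generated explicitly below
        };
        _small.Create();
    }

    void OnRenderImage(RenderTexture source, RenderTexture destination)
    {
        Graphics.Blit(source, _small);          // downscale the camera image on the GPU
        _small.GenerateMips();                  // build the chain down to 1x1
        int lastMip = _small.mipmapCount - 1;   // index of the 1x1 level
        AsyncGPUReadback.Request(_small, lastMip, TextureFormat.RGBA32, OnReadback);
        Graphics.Blit(source, destination);     // pass the image through unchanged
    }

    void OnReadback(AsyncGPUReadbackRequest request)
    {
        if (request.hasError) return;
        Average = request.GetData<Color32>()[0]; // the single texel of the 1x1 mip
    }

    void OnDisable()
    {
        if (_small != null) _small.Release();
    }
}
```

No second camera is involved here: the only extra GPU work is one small blit plus mip generation on a 64x64 texture.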

stonstad · Nov 14, 2022

Thanks @joshuacwilde! Would I then call Graphics.Blit on the RenderTexture within OnRenderImage? I'm new to this and unsure if this is the best practice approach for performance. Thank you!

c0d3_m0nk3y · Nov 14, 2022

Yeah, as joshuacwilde said, I would calculate mips on the GPU from the original camera render texture and then use AsyncGPUReadback to send the 1x1 mip back to the CPU. The result will arrive about 3 frames later, but for most use cases that's OK (you can have 3 pending readbacks, one for each pending frame, so that you still get a result each frame).
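A rough sketch of keeping several readbacks in flight is below. It assumes a mip-mapped RenderTexture is filled somewhere else each frame; the class name, the mipTexture field, and the limit of 3 are illustrative choices, and the actual latency depends on platform and graphics API.

```csharp
using UnityEngine;
using UnityEngine.Rendering;

// Hypothetical component: issues one readback per frame and logs how late it arrives.
public class ReadbackLatencyLogger : MonoBehaviour
{
    public RenderTexture mipTexture;   // assumed: mip-mapped RT filled elsewhere
    const int MaxInFlight = 3;         // one pending request per frame of latency
    int _inFlight;

    void Update()
    {
        if (mipTexture == null || _inFlight >= MaxInFlight) return;

        int issuedFrame = Time.frameCount;
        int lastMip = mipTexture.mipmapCount - 1;   // 1x1 level
        _inFlight++;

        AsyncGPUReadback.Request(mipTexture, lastMip, TextureFormat.RGBA32, request =>
        {
            _inFlight--;
            if (request.hasError) return;
            Color32 c = request.GetData<Color32>()[0];
            int latency = Time.frameCount - issuedFrame;
            Debug.Log($"Average {c} arrived {latency} frame(s) after the request");
        });
    }
}
```

Because a new request is issued every frame, a result still arrives every frame once the pipeline has filled; each result is just a few frames old.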

You can either use one draw call for each mip (using the previous mip map as input) or leverage group-shared memory to calculate multiple mips at once in a compute shader:
https://github.com/Microsoft/Direct.../MiniEngine/Core/Shaders/GenerateMipsCS.hlsli
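The "one draw call per level" option can be sketched in plain C# without a custom shader. This version uses separate temporary render textures rather than the mip levels of a single texture, which is a swapped-in simplification of the idea, and the class and method names are hypothetical. Each bilinear blit to half size averages 2x2 blocks when the dimensions are even; odd dimensions make it approximate.

```csharp
using UnityEngine;

// Hypothetical helper: reduces any texture to a 1x1 render texture by repeated halving.
public static class DownsampleChain
{
    // The caller must release the result with RenderTexture.ReleaseTemporary.
    public static RenderTexture ReduceToOnePixel(Texture source)
    {
        int w = Mathf.Max(1, source.width / 2);
        int h = Mathf.Max(1, source.height / 2);

        RenderTexture current = RenderTexture.GetTemporary(w, h, 0, RenderTextureFormat.ARGB32);
        current.filterMode = FilterMode.Bilinear;   // used when this level is sampled next
        Graphics.Blit(source, current);             // first halving: one draw call

        while (w > 1 || h > 1)
        {
            w = Mathf.Max(1, w / 2);
            h = Mathf.Max(1, h / 2);

            RenderTexture next = RenderTexture.GetTemporary(w, h, 0, RenderTextureFormat.ARGB32);
            next.filterMode = FilterMode.Bilinear;
            Graphics.Blit(current, next);           // one draw call per level
            RenderTexture.ReleaseTemporary(current);
            current = next;
        }
        return current;                             // 1x1 texture holding the average
    }
}
```

The returned 1x1 texture can be passed to AsyncGPUReadback.Request with mip index 0.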

stonstad · Nov 16, 2022

Thank you for the help! Here is where this effort landed:

SampleColor.cs (github.com) https://gist.github.com/stonstad/8db9bfd80d189b55ec7d9edf810e18b7

- Uses ScreenCapture.CaptureScreenshotIntoRenderTexture, then Graphics.Blit to resize and build the mip map via a RenderTexture.
- Uses a queue of native array buffers to ensure proper handling of buffers across frames.
- Exposes a debug option for writing output to the file system.
- Interpolates color across samples, sampling every N frames.
- Cleans up native arrays on script destroy.
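The buffer handling in the list above could look roughly like this. It is not the code from the gist, only a sketch under assumptions: the class name, the pool size of 3, and the mipTexture field are invented, and the texture is assumed to be a mip-mapped RenderTexture filled elsewhere.

```csharp
using System.Collections.Generic;
using Unity.Collections;
using UnityEngine;
using UnityEngine.Rendering;

// Hypothetical component: reuses a small pool of persistent buffers for readbacks.
public class PooledReadback : MonoBehaviour
{
    public RenderTexture mipTexture;   // assumed: mip-mapped RT filled elsewhere
    const int PoolSize = 3;

    NativeArray<Color32>[] _buffers;                     // one texel each (1x1 mip)
    readonly Queue<int> _freeSlots = new Queue<int>();   // indices not in flight
    bool _disposed;

    public Color32 Latest { get; private set; }

    void Awake()
    {
        _buffers = new NativeArray<Color32>[PoolSize];
        for (int i = 0; i < PoolSize; i++)
        {
            _buffers[i] = new NativeArray<Color32>(1, Allocator.Persistent);
            _freeSlots.Enqueue(i);
        }
    }

    void Update()
    {
        if (mipTexture == null || _freeSlots.Count == 0) return;

        int slot = _freeSlots.Dequeue();
        int lastMip = mipTexture.mipmapCount - 1;
        AsyncGPUReadback.RequestIntoNativeArray(
            ref _buffers[slot], mipTexture, lastMip, TextureFormat.RGBA32, request =>
            {
                if (_disposed) return;                   // buffers are already gone
                if (!request.hasError) Latest = _buffers[slot][0];
                _freeSlots.Enqueue(slot);                // buffer is reusable again
            });
    }

    void OnDestroy()
    {
        AsyncGPUReadback.WaitAllRequests();              // nothing may still write to the buffers
        _disposed = true;
        foreach (NativeArray<Color32> buffer in _buffers)
            if (buffer.IsCreated) buffer.Dispose();
    }
}
```

A buffer is never handed to a second request while the first is still pending, which is the main thing the queue is for.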

It works but can it be better?

A mip map is created by the second render texture -- I'm not sure if that is a CPU or GPU operation. Would I create mip maps in the Graphics.Blit step via a material or is this already happening within the existing blit call?

Is there anything else I might consider to improve performance? Thank you!

joshuacwilde · Nov 16, 2022

Not sure why you are using ScreenCapture.CaptureScreenshotIntoRenderTexture. You can just blit directly from src to your RT in OnRenderImage(). Mipmaps are generated on the GPU. You would get better performance by blitting into an RT that is half the width and height, rather than the full width and height. That will mean generating one fewer mip, as well as copying a lot less data on the GPU. I think mip maps will regenerate every time you blit into the texture as long as useMipMap and autoGenerateMips are enabled https://docs.unity3d.com/ScriptReference/RenderTexture-autoGenerateMips.html. I am not 100% sure about that though.
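To put numbers on the half-size suggestion, here is a small sketch. The helper names are hypothetical and the 1920x1080 resolution is only an example; the mip count of a full chain is floor(log2(max(width, height))) + 1.

```csharp
using System;
using UnityEngine;

// Hypothetical helpers for sizing the intermediate render texture.
public static class MipMath
{
    // Number of levels in a full mip chain for the given size.
    public static int MipCount(int width, int height)
    {
        int largest = Math.Max(width, height);
        int count = 1;
        while (largest > 1) { largest /= 2; count++; }
        return count;
    }

    // Creates a mip-mapped RT at half the camera's pixel size.
    public static RenderTexture CreateHalfSize(Camera cam)
    {
        int w = Math.Max(1, cam.pixelWidth / 2);
        int h = Math.Max(1, cam.pixelHeight / 2);
        var rt = new RenderTexture(w, h, 0, RenderTextureFormat.ARGB32)
        {
            useMipMap = true,
            autoGenerateMips = true   // the behavior discussed in the post above
        };
        rt.Create();
        return rt;
    }
}

// MipMath.MipCount(1920, 1080) returns 11
// MipMath.MipCount(960, 540)   returns 10
```

At half width and height, the top level holds 960 x 540 = 518,400 texels instead of 2,073,600, a quarter of the data, and the chain is one level shorter.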

Everything else looks decent though. I didn't look through it super thoroughly, but got the main idea.

stonstad · Nov 16, 2022

Thanks, @joshuacwilde! Gist is updated w/ a fix and it blits at 1/4 width/height. I am not seeing a measurable difference in framerate with this on/off. Thank you so much!

joshuacwilde · Nov 17, 2022

Yeah, you can't really judge optimizations like that by framerate alone. Better to look at the actual frame data in a GPU profiler. There are lots of things that can influence frame timing. For example, maybe you are getting the same frame time, but at a lower clock rate, which means you are drawing less power. So you could fit more into the GPU in the same frame time with that optimization, even if there isn't an immediately noticeable difference in framerate. That being said, it is a small optimization, so you can't expect too much.
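As a lightweight complement to a real GPU profiler, Unity's FrameTimingManager can report separate CPU and GPU frame times from script. This is a swapped-in technique, not a profiler, and it is only a sketch: it assumes Frame Timing Stats is enabled in Player Settings and that the platform supports it, and the class name is made up.

```csharp
using UnityEngine;

// Hypothetical component: logs CPU and GPU frame time once every 60 frames.
public class GpuTimeLogger : MonoBehaviour
{
    readonly FrameTiming[] _timings = new FrameTiming[1];

    void Update()
    {
        FrameTimingManager.CaptureFrameTimings();
        if (Time.frameCount % 60 != 0) return;

        uint count = FrameTimingManager.GetLatestTimings(1, _timings);
        if (count == 0) return;   // no data yet, or frame timing stats unavailable

        Debug.Log($"CPU {_timings[0].cpuFrameTime:F2} ms, GPU {_timings[0].gpuFrameTime:F2} ms");
    }
}
```

Comparing the GPU time with the color-sampling script enabled and disabled gives a more direct signal than the overall framerate.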

stonstad · Nov 17, 2022

My mistake. What I intended to say is that with the final version of the script, including your awesome perf insights, the script's impact is negligible to frame rate. Thank you!

semaphor · Feb 19, 2023

Hello, I'm looking to achieve the same thing and your solution is working so thank you!

One issue though: when I enable WriteDebugOutput I get a crash, which I've narrowed down to the mipIndex being 0 in AsyncGPUReadback.RequestIntoNativeArray(). Do you know why this would be happening? I'm completely new to GPU stuff.
