Efficiently doing Real-time Prerendering?

Discussion in 'General Graphics' started by Sluggy, Dec 13, 2021.

  1. Sluggy

    Joined:
    Nov 27, 2012
    Posts:
    983
    Yeah, you read that title correctly. I'm trying to create a method of pre-rendering 3D models in real time so that they can be used as sprites... also... in real time. Effectively what I'm going for is that very specific pre-rendered look, as well as completely stable pixel placement regardless of position, rotation, and scale, while at the same time gaining all of the benefits of using 3D models: skinning, bones, cloth, animation retargeting and blending, IKs, etc...

    I'm using a custom SRP so I've got a little extra control over what I'm doing, but I can't help but shake the feeling that I'm walking down a long corridor when the exit I wanted was ten feet to the left of the entrance. I've tried a few different methods, and the one I've settled on so far is to schedule a list of objects to draw and then have a dedicated phase in my SRP that swaps render targets one-by-one and renders each object out onto a different one using CommandBuffer.DrawMesh(). Later, during a more 'natural' phase of rendering, all of the sprites will be drawn like normal using quads. Each sprite quad would of course need its own material with its own render texture supplied as the sprite to draw.
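
    In rough sketch form, that phase looks something like this (a minimal sketch only - PrerenderRequest and all the field/method names here are made up for illustration, not an existing API):

    Code (CSharp):
    using System.Collections.Generic;
    using UnityEngine;
    using UnityEngine.Rendering;

    // Hypothetical per-sprite request; one render texture per sprite in
    // this naive version.
    struct PrerenderRequest
    {
        public Mesh mesh;
        public Material material;
        public Matrix4x4 matrix;
        public RenderTexture target;
    }

    static class PrerenderPhase
    {
        // Runs inside the custom SRP, before the normal scene render.
        public static void Execute(ScriptableRenderContext context, List<PrerenderRequest> requests)
        {
            var cmd = new CommandBuffer { name = "Prerender Sprites" };
            foreach (var req in requests)
            {
                cmd.SetRenderTarget(req.target);                  // swap targets one-by-one
                cmd.ClearRenderTarget(true, true, Color.clear);   // transparent background
                cmd.DrawMesh(req.mesh, req.matrix, req.material); // render the 3D model
            }
            context.ExecuteCommandBuffer(cmd);
            cmd.Release();
        }
    }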

    This leaves me with one really big issue: each sprite effectively needs its own material supplied. I suppose I could generate these materials and render targets and do all of the linking at runtime, but it somehow seems wasteful, and I'm also slightly worried that, down the road, this could have severe technical or performance issues I'm currently not aware of.

    Is there some better alternative I should consider? Perhaps rendering sprites to a single very large render target, where each sprite gets a dedicated rect of its own? Perhaps a shader-only method that doesn't require all of these render targets and sprites to be rendered in the first place?
     
  2. McDev02

    Joined:
    Nov 22, 2010
    Posts:
    664
    OK, if I understand you correctly, you've got a pixel-art-style game but you use 3D assets as a base?

    This is a topic I have no concrete experience with, but one thing is for sure: rendering directly to the screen instead of to render targets is more performant. Maybe you are able to position your 3D objects and camera so that they are 2D-pixel-perfect?

    There might be a number of tutorials out there. I remember at least one game, whose name I can't recall, that dealt with this issue, and I actually believe it used a 2.5D technique.

    The only thing I don't get is what you mean by pre-rendering in real time. So you basically render things to a sprite and then to the screen? Then this is not pre-rendering in the caching sense. Anyway, one alternative is to pre-render literally everything that you need (all animations etc.) and have a complete 2D game.

    Does each sprite need its own material? Are there really no duplicates of the same object?
    Without a clear idea of what you're doing it is hard to provide help.
     
  3. Sluggy

    Joined:
    Nov 27, 2012
    Posts:
    983
    Hi @McDev02,

    So, I want to replicate the feel of having 2D sprite billboards rendered in an otherwise 3D world. However, I don't want to deal with the massive undertaking of creating all of those assets by hand (multiple characters, with swappable gear that can be seen on the character), nor the hassle of having to come up with a system that would manage all of that. Instead I want to be able to create 3D models whose bone-based animation can be retargeted - thus allowing me to create much more variation in characters and the gear they can wear. It would also have the benefit of using other real-time features such as IKs, cloth and hair physics, swappable weapons, animation blends, animation masks, and so on. But I still want to retain that very distinct 'sprite' look regardless of the scale and position of the character on screen. I intend to combine this effect with quantized animations (i.e. no tweening, just keyframes) to really sell the look.

    It might be possible to do all of this using a separate camera layer with a lower resolution and some pixel-perfect rendering, but in some of my early tests that proved to bring up a lot of other issues that are hard to work around, like using the depth buffer correctly. And even then it probably wouldn't work the way I want it to with regards to zooming the camera in and out. So my current plan is to render the 3D model of each entity in the scene to an offscreen render texture and then render a sprite in its place using that render texture. My biggest concern currently is whether there is any kind of hard cap on how many render textures can be used by common platforms/hardware. Or are render textures effectively the same as regular textures, meaning that as long as I have the space, I have no other limits?

    I have a basic proof-of-concept working now but I've been short on time so it's still *very* rough. I think this weekend I'll spend some time fixing it up, do some stress tests, and post back here with my results.
     
  4. burningmime

    Joined:
    Jan 25, 2014
    Posts:
    845
    I don't *totally* get what you're trying to do, but it sounds like you're a bit of a trailblazer (unless you can point to a game or image that has a similar style to what you're after?). Which is cool, since you get to figure it all out in the way that's best for your project, but that always gets in the way of actually making a game ;-P.

    If I understand you correctly, then the most performant way would be to pre-allocate some render targets (maybe a few 2048x2048 RTs). At the beginning of every frame, on the CPU (maybe in a burst job), do your culling/whatever and assign a UV rect to each object. Then, in your prerender pass, you render each object to its UV rect by modifying the clip space position in the vertex shader. Upload the sprite transformations and UV offsets for all sprites as a compute buffer, and the actual rendering of them is a single DrawProceduralIndirect() call per sprite layer. That shader takes care of generating the quad vertices, applying transformations, etc. It's one draw call per prerendered object and then one draw call for all the sprites.
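
    A rough sketch of the sprite half of that idea, assuming the vertex shader builds each quad from the vertex id and an instance index (SpriteData, _Sprites, and all the names/strides here are illustrative, not a fixed API):

    Code (CSharp):
    using UnityEngine;
    using UnityEngine.Rendering;

    // One entry per visible sprite; layout must match the StructuredBuffer
    // declared in the sprite shader (64-byte matrix + 16-byte rect = 80-byte stride).
    struct SpriteData
    {
        public Matrix4x4 transform; // world transform of the quad
        public Vector4 uvRect;      // xy = UV offset, zw = UV scale into the shared RT
    }

    static class SpriteBatchPass
    {
        // spriteBuffer: new ComputeBuffer(maxSprites, 80)
        // argsBuffer:   new ComputeBuffer(4, sizeof(uint), ComputeBufferType.IndirectArguments)
        public static void Draw(CommandBuffer cmd, Material material,
            ComputeBuffer spriteBuffer, ComputeBuffer argsBuffer, SpriteData[] sprites)
        {
            spriteBuffer.SetData(sprites);
            // non-indexed indirect args: vertices per instance (6 = one quad),
            // instance count, start vertex, start instance
            argsBuffer.SetData(new uint[] { 6, (uint)sprites.Length, 0, 0 });
            cmd.SetGlobalBuffer("_Sprites", spriteBuffer);
            // no mesh is bound; the vertex shader reads _Sprites[instanceID]
            // and generates the quad corners itself
            cmd.DrawProceduralIndirect(Matrix4x4.identity, material, 0,
                MeshTopology.Triangles, argsBuffer);
        }
    }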

    However, that's not necessarily the best way to do this, even if it is likely the fastest. I can think of several pitfalls:

    1. It requires modifying the shaders for the prerendered objects. If you have a lot of them or are using third-party/unity-builtin shaders, this is a ton of work in Unity (easy peasy in Unreal). And it adds an additional step for every new shader you want to integrate.
    2. It pushes onto *you* a ton of work the graphics pipeline would otherwise be doing.

    Separate materials that share the same shader aren't bad with the SRP batcher. Neither drawing a ton of individual quads instead of batched sprites nor juggling dozens to hundreds of render targets is great, but depending on how many objects you have on screen at once and what platforms/hardware you're targeting, it might not actually be a problem. Basically it comes down to that age-old question of which resource you have less of: ms of frame time or months of dev time.

    That being said, I could be talking out my *** since I don't completely understand what you're trying to achieve here. If you post a screenshot of what you want or another game to compare to, maybe there's a different way to achieve the look you want without doing the prerendering thing at all.
     
  5. Sluggy

    Joined:
    Nov 27, 2012
    Posts:
    983
    @burningmime

    Heh, I'm used to people giving me weird looks when I think up some of this stuff. So no worries there. But yes! I CAN in fact point to an example of how I want it to look, though that example achieves it using traditionally animated sprites in an otherwise 3D world.

    Basically I want to get a look that is similar to that. But taking the time to draw all of those sprites, simple as they are, still takes ages. And when you want to add things like visible gear and different weapons you suddenly hit exponential growth in the amount of work that needs to be done. The obvious solution would be to use some kind of pre-rendered 3D models that can be rendered in a way that makes them look very much like sprites. There was a very old post about a shader developed for Blender that did exactly this, and that is where I got my initial spark of an idea.

    However, because that approach just outputs static 2D images at edit time, it still requires a TON of workarounds and effort, and it has some pretty huge limitations - simple things like running while attacking become massive issues that often require limits on how the art can be made as well as a ton of extra setup. Then there's the potentially massive amount of storage space needed just to store the generated images.

    But these days computers are pretty darn fast at doing things. So... why not make the computer do all of that work in real time? Make it generate that sprite for us and then display that sprite while it's at it. That brings with it all of the benefits of using real 3D geometry I mentioned above.

    I was thinking of something similar to what you suggested, with maybe just one or a few very large render textures and a sectioned-off rect dedicated to each on-screen character. I'm not the best at shaders or matrix math, so for the proof-of-concept I decided to go with many render textures instead. The cost of rendering the 3D models to the RTs every frame will be a bit of a hit since those can't be batched, but they are skinned meshes anyway, so not a huge loss I suppose? I haven't profiled yet, but I'm really hoping there won't be any massive hits for constantly swapping render targets. As for the sprite quads, I'm thinking they won't be too much of an issue since the shader I wrote for them supports SRP batching.

    Like I said, I'll do some work and test it out this weekend and post back here with some results. Hopefully by then I'll be in a better position to see what the next step will be. In the meantime, if anyone happens to have any experience or advice, I'd still love to hear it! I really feel like I'm speeding down a dark road at night with my pants over my head.
     
  6. halley

    Joined:
    Aug 26, 2013
    Posts:
    2,431
    You might look into Amplify Impostors. They're designed to take objects and render them to billboards for performance reasons. Not sure if it covers your need or not.
     
  7. burningmime

    Joined:
    Jan 25, 2014
    Posts:
    845
    I didn't see the post you made just above mine where you explained it to McDev; sorry for making you explain it twice.

    You'll have to test, but yeah, I think juggling render targets is not *great* because (at least on older GPUs and maybe some mobile/APUs/tile renderers) it breaks some parallelization and issues a lot of extra state switching. But it's not terrible, and effects like bloom use many RTs with long dependency chains, so YMMV. I'd say if you have 20 RTs it's a non-issue, but if you have 2000 it might start to be a problem. It also matters whether or not you're generating mipmaps for them, and whether or not you're using MSAA (MSAA is going to issue an extra resolve pass on each RT).

    Drawing many quads instead of batching sprites is slow, but that's not specific to your prerender thing; that's just how sprites work. As long as it's a few dozen and not a few thousand, it shouldn't be a problem. You can batch them later if it becomes a performance problem.
     
    Last edited: Dec 15, 2021
  8. Sluggy

    Joined:
    Nov 27, 2012
    Posts:
    983
    So I did a little bit of testing with a very rough prototype and it appears to work well enough. I can get about 100 sprites on the screen this way and still keep my framerate above 60, but that doesn't leave much headroom. Also, I'm not using the greatest hardware (most of my stuff is at least 5 years old), but I still wouldn't mind improving this to ensure weaker laptops wouldn't have an issue. I have yet to do any in-depth profiling to see where the CPU is hit hardest, so that will likely be my next step.

    All that being said, I think it would probably still be a better option to use a handful of large render textures rather than tons of little ones, though I'm not at all sure how to accomplish this using Graphics.DrawMesh(). The only way I can think of to ensure each model is rendered to its own little rect of the render texture is to use some matrix math in the shader, but I have no idea how to accomplish that. Would anyone have any suggestions on what I'd need to do to an MVP matrix to accomplish this?

    Another odd thing that stuck out to me is that the SRP batcher was not working with the quads. They were all using their own copy of an otherwise identical material, each with their own private render texture supplied as _MainTex, and they were all using an identical SRP-batcher-compatible shader, so I'm not sure what gives there. It probably has something to do with my dynamically generating the materials at runtime, I guess? This would be a huge area for improvement, as it adds an additional SetPass call for each sprite quad rendered to the screen.
     
  9. burningmime

    Joined:
    Jan 25, 2014
    Posts:
    845
    You might be able to do it with a matrix (which would work for all shaders). But the easy-peasy way to do it in a vertex shader would just be:

    Code (CSharp):
    // first make a property...
    float4 _spriteTextureOffsetAndScale; // offset in XY, scale in ZW
    // both in 0-1 coordinate space (so divide by texture width/height on the CPU)

    // then in your vertex shader
    positionCS.xy = (positionCS.xy * _spriteTextureOffsetAndScale.zw) + _spriteTextureOffsetAndScale.xy;
    (you might need to use a perspective divide there if you are doing that, but I get the impression that this would be a post-process pass, so wouldn't be using perspective, right?)

    Just quick sanity checks:
    * your shader says it's SRP-batcher compatible
    * you're using the frame debugger or RenderDoc to check (SRP batches don't show up in "saved by batching" in the top-right of the game view)
    * since you're using a custom SRP, have you made sure other things besides the quads batch?
     
    Sluggy likes this.
  10. Sluggy

    Joined:
    Nov 27, 2012
    Posts:
    983
    Oof, the SRP batcher issue was just a dumb mistake on my part. I had the shader set to Unity's built-in Unlit/Transparent Cutout, which apparently is not SRP-batcher compatible. I was under the impression that most if not all of the Unity shaders were, but it appears that only the Standard shader works with SRP batching. No biggie though; I was planning to write my own shader anyway, since there are some effects I want to apply to the sprites themselves.

    You made me realize that I was really overcomplicating the offset/rect thing as well. I'm already passing an offset value into the shader for each 3D model being pre-rendered so that I can adjust where in the RT the model appears. It should be absolutely trivial to make that also take into account fixed rects on a larger RT and adjust the offset accordingly. Thanks for pointing that out!
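
    For a fixed grid of cells on one big RT, that adjustment could look something like this (a sketch; cellIndex, cellsPerRow, and cellsPerColumn are made-up names):

    Code (CSharp):
    using UnityEngine;

    // Returns offset (xy) and scale (zw) in 0-1 texture space for one cell
    // of a fixed grid, matching the _spriteTextureOffsetAndScale layout
    // suggested earlier in the thread.
    static Vector4 GetCellOffsetAndScale(int cellIndex, int cellsPerRow, int cellsPerColumn)
    {
        float scaleX = 1f / cellsPerRow;
        float scaleY = 1f / cellsPerColumn;
        float offsetX = (cellIndex % cellsPerRow) * scaleX;
        float offsetY = (cellIndex / cellsPerRow) * scaleY;
        return new Vector4(offsetX, offsetY, scaleX, scaleY);
    }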

    Once I get those two issues sorted I think I'll be a good part of the way towards my goal! After that I still have some other, hopefully minor, issues to tackle. For one, the shader I use for doing the pre-rendering has a very ugly, hard-coded vertex function that probably is not cross-API compatible; I had to do some ugly vertex axis flipping in model space to get the lighting to visually agree with the orientation of the camera. I also need to come up with a culling system. As it is right now, the 3D models will always be rendered to RTs even if they can't be seen on camera, and since they are never actually rendered through normal means I can't simply tap into Unity's built-in events such as OnWillRenderObject. However, there is a simple culling API I've used in the past that should allow me to solve this issue as well.
     
  11. burningmime

    Joined:
    Jan 25, 2014
    Posts:
    845
    If you're talking about the SRP CullResults, I think you can't actually extract much data from it.

    You might need to cull manually. Here's some code I wrote a couple of months ago for stencil shadows; you might be able to get some use from it. The setup needs to be done outside of Burst since it uses the Camera class, but the actual culling is Burst-compatible:

    Code (CSharp):
    // needs: using UnityEngine; using Unity.Mathematics;
    //        using System.Runtime.CompilerServices;

    private readonly struct Frustum
    {
        private readonly Plane left;
        private readonly Plane right;
        private readonly Plane down;
        private readonly Plane up;
        private readonly Plane near;
        private readonly Plane far;

        private static readonly Plane[] _planesTemp = new Plane[6];

        // Must run on the main thread (uses the Camera class); 'inflate'
        // pushes each plane outward so the culling has some padding.
        public Frustum(Camera camera, float inflate)
        {
            GeometryUtility.CalculateFrustumPlanes(camera, _planesTemp);
            left = _planesTemp[0];
            right = _planesTemp[1];
            down = _planesTemp[2];
            up = _planesTemp[3];
            near = _planesTemp[4];
            far = _planesTemp[5];

            left.distance += inflate;
            right.distance += inflate;
            down.distance += inflate;
            up.distance += inflate;
            near.distance += inflate;
            far.distance += inflate;
        }

        // True if the AABB is at least partially inside all six planes.
        public bool contains(Bounds bounds)
        {
            float3 bMin = bounds.min, bMax = bounds.max;
            return testPlane(bMin, bMax, left) &&
                testPlane(bMin, bMax, right) &&
                testPlane(bMin, bMax, down) &&
                testPlane(bMin, bMax, up) &&
                testPlane(bMin, bMax, near) &&
                testPlane(bMin, bMax, far);
        }

        // Tests the AABB corner furthest along the plane normal; if even
        // that corner is behind the plane, the whole box is outside.
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        private static bool testPlane(float3 bMin, float3 bMax, Plane plane)
        {
            float3 n = plane.normal;
            float d = plane.distance;
            float3 test = new(n.x < 0 ? bMin.x : bMax.x, n.y < 0 ? bMin.y : bMax.y, n.z < 0 ? bMin.z : bMax.z);
            return math.dot(n, test) + d >= 0;
        }
    }
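
    Hypothetical usage, culling each entity's bounds before scheduling its prerender (the entities list and SchedulePrerender are made-up names):

    Code (CSharp):
    // build once per frame; 0.5f of inflate is an arbitrary padding value
    var frustum = new Frustum(camera, 0.5f);
    foreach (var entity in entities)
        if (frustum.contains(entity.bounds))
            SchedulePrerender(entity);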
     
    Last edited: Dec 21, 2021