
Question How to implement MegaCity LODs and Culling system

Discussion in 'Entity Component System' started by logic-cpp, Sep 12, 2022.

  1. logic-cpp

    logic-cpp

    Joined:
    Mar 11, 2013
    Posts:
    24
    BH

    I'm trying to make thorough use of DOTS/ECS and to follow this video, I believe from @mike_acton: ECS Track: LOD and Culling Systems That Scale


    tl;dr
    1. Need more help with what he's explaining, I understood maybe 35%. Timestamped questions below.
    2. How applicable to -and- how to implement for my project
    /tl;dr

    Disclaimer: I haven't even tried opening the MegaCity project in Unity Editor on my current low-end PC... I'm waiting for a new custom rig to arrive soon.
    However - I am studying the code furiously. Over & over & over.

    I'm trying to prototype an FPS something kinda similar to FarCry 1: Big outdoor open world tropical islands, long draw distances, LOTS of nature trees/vegetation/foliage/bushes/grass etc., some structures, and linear mission through sparse pockets of enemy locations across the map.



    High quality graphics is not a priority, I am using URP performant renderer and prioritizing scene density over quality & realism. i.o.w. I don't mind if it looks like 2004 quality, not FarCry 6 quality.
    Performance is not great at the moment... Profiler is showing highest work happening in Draw Opaque and Draw Transparent.
    To be clear - my scene map is not anywhere nearly as densely filled as MegaCity. I'm aiming for maybe 200k or so total objects rather than 5 million. Having said that;

    Comparison with MegaCity:
    • ✅ Big Scale, far distance views
      • Player's current location viewing frustum can see an impressive landscape of distance & detail. Pulling up binoculars or zooming down sniper scope can show you even farther distances, although FOV is reduced significantly when doing so
    • ✅ Scenery is all static
      • (This concept is discussed in the video.) My project has static mesh terrain map containing static trees/vegetation/grass/props/etc all static.
    • ✅ Lots. Of. Detail/Objects.
      • Vegetation, props and so on, perhaps similar to MegaCity building sections, shacks, props etc.
    • ⏹ MegaCity is Subdivided into Sub-scenes
      • Should I do this?
        I can use a plugin to cut my large open-world map into a grid of any dimensions (2x2, 8x10, 16x4, 21x33, etc.). Would it help performance to divide the map into a grid of sub-scenes, each containing the objects/props/trees located in it? And how do I determine good sub-divisions? In MegaCity it makes sense to divide by mega-building. For my kind of map, how large or small should they be, how many should there be, and should it be an arbitrary "grid" or more organic, custom-cut sub-divisions?
    • ⏹ Physics Colliders
      • I mentioned an all-static scene similar to MegaCity, but to clarify: my scene of course needs (static) colliders for the terrain mesh, trees, rocks & boulders, structures, etc. It might seem obvious, but I'm not even sure the MegaCity demo from 2018 had colliders for buildings. Did it, and does it now?
    I want to follow the wise @mike_acton's advice and not over-engineer scene optimizations like scene graph / octree etc. if we can take full advantage of re-thinking the game/scene into ECSs and boil down / flatten the bottom-line data into Components that can be performantly Systemized.

    Some specific questions on the video:
    • 6:20 "16K chunks of component data - sub-scenes are roughly spatial": It is not clear to me why and how sub-scenes relate to data chunks. How are those chunks "aware" of their sub-scene? What part of the ECS system separates component data chunks by sub-scene? I thought the whole idea was to flatten everything, no matter which sub-scene it comes from or belongs to.
      As for my project: to achieve similar scene optimization in ECS, should I cut up my map into sub-scenes and flatten the data by Renderers and World Position Matrices? How do I generate the "Chunk World Render Bounds" in advance?

    • 8:50 I also have many instances of the same MeshRenderer that I want to render multiple times, such as trees & grass, which gives me "another level of hierarchy" of render batches as well. Does that mean I can apply this technique to my scene too? If so, how?

    • 16:00 Where in the MegaCity code can I find this culling loop logic?

    • 25:50 Where in the MegaCity code can I find this LOD loop logic?

    • 26:40 How do I put this together for my scene?
    Any help & guidance would be appreciated.

    P.S. I would love to pay a thoroughly experienced expert in these subjects consulting time to walk me through how to achieve this for my prototype.
     
  2. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,271
    Subscenes mostly relate to streaming and level loading. They don't have to be spatially organized, but usually that's what people do. Chunks are aware of subscenes through the use of ISharedComponentData.

    This can be done by adding the Static Optimize Entity to the root of each converted GameObject that should be static along with its children.

    This is done automatically and is even computed dynamically at runtime for dynamic objects.

    Also happens automatically, and is much more performant in modern DOTS than what MegaCity used, especially for dynamic objects. The technique MegaCity used would copy all dynamic properties to the GPU on the main thread. Now that happens in parallel jobs using change filters, and even static geometry can have other dynamic properties, such as animated UV coordinates. This all plays seamlessly with shader graph, allowing you to define custom properties, and modify them as ECS components.

    Forget MegaCity. This is all built into the Hybrid Renderer package, soon to be renamed Entities.Graphics.
     
  3. apkdev

    apkdev

    Joined:
    Dec 12, 2015
    Posts:
    284
    The session you linked was about ECS internals - you don't have to do any manual work to implement these systems in your game. Simply create your subscenes and LOD + Frustum Culling should work out of the box. If you're looking for occlusion culling, that's still in the works.

    In general, you don't need to think about this. Use subscenes for organization and for splitting up scenes for streaming. Not sure what subscene granularity level is optimal, probably depends on the game. If your whole world fits in a single scene, maybe you don't even need to bother with splitting your subscenes at the moment. Get your game working first.

    ECS should "optimally" handle rendering of duplicate meshes automatically.

    The MegaCity code is going to be quite outdated - if you want to peek at the internals, I suggest you look at the Hybrid Renderer package source.

    Yeah, collider conversion should work out of the box nicely now. I think you need a workaround for terrain colliders, though.
     
  4. logic-cpp

    logic-cpp

    Joined:
    Mar 11, 2013
    Posts:
    24
    Wait - so I'm realizing I may have had a colossal misunderstanding here...
    This Unity talk was about ECS internals??? I don't know why I got the impression from the intro that it was something to the effect of "This is how we coded the custom C# code for the MegaCity project so that it works performantly with culling and LODs using ECS". Was I totally wrong? All this magic happens internally?

    Best misunderstanding ever - best discovery ever
    So I don't need to do much, just make everything happen as ECS entities instead of game objects and it will automagically benefit from many of these optimizations - am I understanding this correctly? (also - am I dreaming lol)
     
  5. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,271
    Pretty much!

    Some optimizations require you to set up the converted GameObjects with specific components. But in general, try things out. And if you run into performance problems, share profiler captures and we'll help you out.
     
  6. logic-cpp

    logic-cpp

    Joined:
    Mar 11, 2013
    Posts:
    24
    Thanks!
    Will be trying this out now
     
  7. hippocoder

    hippocoder

    Digital Ape

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    100% not worth doing right now, wait for DOTS 1.0, as they will roll out a completely new culling and LOD system. And it is new and quite significantly different code. This will eventually head to GameObject land in 2023 as well, if time permits.
     
  8. logic-cpp

    logic-cpp

    Joined:
    Mar 11, 2013
    Posts:
    24
    That would be amazing :D super excited for that

    Since I'm just prototyping and playing around for now, I think I'll be able to make the jump to 1.0, whatever rework might be necessary to do so.

    I'm still just utterly speechless because I was expecting to have to roll up my sleeves and get into a lot of ECS coding & hacking & optimizing & data designing etc. etc. to achieve a performant MegaCity-like scene in dots/ecs. As it turns out, for right now in 0.51 and hybrid renderer - all I have to do is basically just. a. single. click. = add my scene stuff to a sub-scene. Done. All the static scenery should enjoy automagic ECS awesomeness. I'm floored :p
     
  9. hippocoder

    hippocoder

    Digital Ape

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    That is indeed part of the magic of DOTS :)

    Just expect so much more to come...
     
  10. logic-cpp

    logic-cpp

    Joined:
    Mar 11, 2013
    Posts:
    24
    Can you confirm this whole video about creating a custom LOD system - is superfluous and not necessary based on the fact that DOTS/ECS takes care of LOD-ing automagically?

    Or maybe the automagic DOTS/ECS LOD-ing happens mostly for static objects, while for dynamic / moving objects there is still a place for custom coding a LOD solution?

     
  11. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,271
    If you want entities too far away to have game logic associated to them disabled, then you need to do something like this. Otherwise, the Hybrid Renderer handles it for you.

    Honestly, I felt really bad for Ruben when watching this video. One of the fundamental principles of DoD is to analyze the data and take a direct path to the solution first. That didn't happen here, and the result was a bunch of unnecessary overhead. The problem statement in the video would have been better solved with a MonoBehaviour manager using a handful of native containers and IJobParallelForTransform.
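
    A rough sketch of the alternative described above, assuming GameObject rendering is a hard requirement. `DistanceLodManager`, its fields, and the two distance thresholds are hypothetical names chosen for illustration; only `TransformAccessArray`, `IJobParallelForTransform`, `NativeArray`, and `math.distancesq` are existing Unity APIs. The job only writes a LOD index per tracked transform; applying that index (swapping meshes, disabling logic) is left to the caller.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using Unity.Mathematics;
using UnityEngine;
using UnityEngine.Jobs;

// Hypothetical manager: one MonoBehaviour owns all tracked transforms
// and computes a distance-based LOD index for each of them in a job.
public class DistanceLodManager : MonoBehaviour
{
    public Transform[] tracked;        // objects to evaluate (assumed set in the Inspector)
    public Camera cam;                 // camera used as the distance reference
    public float lod1Distance = 50f;   // assumed threshold, LOD0 -> LOD1
    public float lod2Distance = 150f;  // assumed threshold, LOD1 -> LOD2

    TransformAccessArray transforms;
    NativeArray<int> lodLevels;

    void OnEnable()
    {
        transforms = new TransformAccessArray(tracked);
        lodLevels = new NativeArray<int>(tracked.Length, Allocator.Persistent);
    }

    void OnDisable()
    {
        transforms.Dispose();
        lodLevels.Dispose();
    }

    void Update()
    {
        var job = new SelectLodJob
        {
            CameraPosition = cam.transform.position,
            Lod1DistSq = lod1Distance * lod1Distance,
            Lod2DistSq = lod2Distance * lod2Distance,
            LodLevels = lodLevels
        };

        // Completed immediately for simplicity; a real manager would
        // likely defer Complete() until the results are needed.
        job.Schedule(transforms).Complete();

        // lodLevels[i] now holds 0, 1 or 2 for tracked[i].
    }

    [BurstCompile]
    struct SelectLodJob : IJobParallelForTransform
    {
        public float3 CameraPosition;
        public float Lod1DistSq;
        public float Lod2DistSq;
        public NativeArray<int> LodLevels;

        public void Execute(int index, TransformAccess transform)
        {
            float distSq = math.distancesq(CameraPosition, transform.position);
            LodLevels[index] = distSq < Lod1DistSq ? 0 : distSq < Lod2DistSq ? 1 : 2;
        }
    }
}
```

    The design point is that all per-object data lives in flat native containers owned by one manager, so there is no per-object `Update` call and the distance test runs across worker threads.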

    Note that your problem is a different problem in that you don't have the strict requirement of having to use GameObject rendering. That means you are allowed to use the Hybrid Renderer, which does all the work for you.
     
  12. hippocoder

    hippocoder

    Digital Ape

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    1.0 will also have solutions here, basically depends how impatient you are.
     
  13. logic-cpp

    logic-cpp

    Joined:
    Mar 11, 2013
    Posts:
    24
    I see ok
    Thanks a lot for all this info!
     
  14. hippocoder

    hippocoder

    Digital Ape

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    Unity aim for DOTS to be a big, big beefy game engine that's AAA performance (streaming should be faster than all AAA games out there at the moment, actually, because DOTS subscenes are prebaked to Entities data layout so they only need a memcpy, which is the fastest possible thing).

    So the things I know they are working on are culling, LODs, rendering (Hybrid Renderer V2), streaming, and so on, so my own work here is focused purely on ECS and AI / gameplay.

    I know @DreamingImLatios is pretty hardcore though and will essentially write faster alternatives (in some cases). Unity's will be more accessible and easier to scale to different ways of working, but I think nothing stops you from combining them when the time comes, given how easy it is to transform your data.
     
  15. logic-cpp

    logic-cpp

    Joined:
    Mar 11, 2013
    Posts:
    24
    Heh :p
    Super excited - can't wait

    Where can I follow the latest word on when the release might be? I remember coming across some rumors that release will be at the beginning of 2023, but other rumors say it could still be any day now towards the end of 2022.
     
  16. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,271
    Culling, LODs, and HR V2 are already here. Will it get better in 1.0? Probably. But aside from conversion/baking workflow, I'm not expecting a drastic redesign in these areas. I know the backend will switch to a newer more flexible API and there may be a few new features that come with that.

    I think the biggest thing that should dictate whether or not you should wait for 1.0 before playing around with the tech is how sensitive you are to changing "API skins" even if the underlying concepts remain the same. I expect a lot of that to be happening in 1.0 compared to what is out today. If you aren't that sensitive, I would suggest starting now. The reason is that it takes some time to get into the Entities mindset. I've been using Entities regularly for three years now, and each year I learn a new power tool that has a significant impact on how I tackle problems. It really does take time to master, and not putting in that effort is going to handicap you, regardless of how much more intuitive the 1.0 or 2.0 or 3.0 APIs become.
     
  17. logic-cpp

    logic-cpp

    Joined:
    Mar 11, 2013
    Posts:
    24
    How "here" d'you mean? If I install com.unity.rendering.hybrid 0.51.1-preview.21 (August 02, 2022), is it in there?

    Based on your advice sounds like I'm gonna start playing around with the tech already from now :cool:
     
  18. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,271
    Yes

    If you have a hierarchy of environments with lots of renderers, and you want to only analyze rendering performance, check the ConvertToEntity checkbox on the root and add a Static Optimize Entity component to the root. Then press play and you should have entities being rendered and you can compare performance.

    Some may suggest moving all your environment GameObjects into a subscene. Subscenes regularly crash for me so I avoid them.

    Note that everything else related to environment in your game won't work out of the box, because your game logic still needs to be modified to work with Entities. But you can at least test rendering performance without writing a single line of code.
     
  19. eizenhorn

    eizenhorn

    Joined:
    Oct 17, 2016
    Posts:
    2,685
    How many times we've refactored our production game since first DOTS releases (even before DOTS name was a thing):confused:
     
  20. Blitzkreig95

    Blitzkreig95

    Joined:
    Jul 21, 2021
    Posts:
    20
    Where exactly are LODs supported? I see Burst occlusion culling in the Entities Graphics package, but no LODs. Are you sure?
     
  21. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,271
    There's a few different systems involved, but LODRequirementsUpdateSystem.cs will get you started. They bake LODGroups.

    One thing worth noting is that skinned meshes with LODs have a negative impact on performance. I've fixed this in my own tech.
     
  22. lclemens

    lclemens

    Joined:
    Feb 15, 2020
    Posts:
    761
    So I have a few questions on this DOTS 1.0 LOD workflow and I can hardly find any information about it. I'm sure one of the experts in here will be able to clarify.

    In the EntityComponentSystemSamples example there is a scene for LODs. Basically it has 100 LOD group parent objects, each with three children representing 3 LOD levels.

    upload_2023-7-20_11-55-10.png

    When I press play, or in the baked scene... there are 400 individual entities! So that's a little annoying because it clutters up the entities hierarchy and makes it difficult to find the entity I'm looking for. They all have transforms. I think that 100 of the entities represent the groups, and the other 300 represent the meshes.

    upload_2023-7-20_12-15-8.png

    The real thing that is bugging me is that there are now 3 renderable entities per actual entity. Say I want to add a few authoring components to the same entity that the MaterialMeshInfo component is on so that I can query the two components together. I now have to add that authoring component to all 3 child objects? If that authoring component has a bunch of parameters on it, then every time I need to tweak the parameters, I have to change them in 3 places? This seems like a messy workflow. I would much rather have ONE object and ONE entity that points to 3 different meshes.

    Another minor question I have is - what's up with a box collider being on each LOD child object? Is that necessary and used for a render bounds box or is it just there because no one removed it?
     
  23. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,271
    There are 400 entities because there are 400 GameObjects. It is a 1-1 bake. Inside each LODGroup are 3 renderer GameObjects.

    Propagating material properties is trickier than most would like, especially since baking is really bad at relationship-dependent components. The problem also exists for renderers with multiple materials. There's not really a good solution to this either as not all properties should be propagated automatically.

    If you want to go deeper into the details of the specific material property problems you are trying to solve, I can provide some tips on how to make the system as low-overhead as possible.
     
  24. lclemens

    lclemens

    Joined:
    Feb 15, 2020
    Posts:
    761
    Thanks!!

    Because of the triple mesh renderers, it's rather weird integrating it into my texture baked vertex animation library. The animation System is setting various custom Material Properties to advance the animation frame, determine which clip is playing, set playback speed, etc. But now with LODs I suspect that system is running on 3x the entities. I guess maybe that's just part of the tradeoff for having LODs? I probably shouldn't complain since it did bring 100k entities with 1890 vertices each from 8fps to 50fps on my laptop.

    This is the first time I've used LODs, so I'm just stumbling around here. The main issue is that it makes the texture animation baking workflow cumbersome.
     
  25. lclemens

    lclemens

    Joined:
    Feb 15, 2020
    Posts:
    761
    I have another question... Unlike the demo, which has all the boxes static and fixed, I need my entities to be movable (i.e. they need a LocalTransform). If I didn't have a LinkedEntityGroupAuthoring script on the parent, the entities didn't appear after pressing play. When I looked at the profiler, the penalty was huge for ComputeChildLocalToWorldJob: around 4 or 5ms, compared to the animation system at 0.01ms. Is there a way to split them out into separate entities without a parent, or to get the child transform work to run faster?
     
  26. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,271
    lclemens likes this.
  27. lclemens

    lclemens

    Joined:
    Feb 15, 2020
    Posts:
    761
    Oh wow, that's crazy - I had no idea you reworked the transform system! It seems like it would be difficult to swap my whole game over to use those transforms instead, especially since I'm using unity physics every now and then?

    Because the transform system is so heavy those 3 child objects were just hammering the CPU even when zoomed in and only a few entities were in the same field of view. I attempted to use TransformUsageFlags.WorldSpace to break the children out, but the LODs would end up not being rendered - I think it's because unlike the Unity example where all entities are fixed and never move, my LODs need to move when the parent moves.

    So I did a little experiment... I tried manually flipping through the LODs based on distance-to-camera using MaterialMeshInfo.

    upload_2023-7-21_15-9-8.png

    I simply registered the meshes and materials and then wrote a system to possibly swap them every 0.25 seconds.

    Code (CSharp):
    using Unity.Burst;
    using Unity.Entities;
    using Unity.Mathematics;
    using Unity.Rendering;
    using Unity.Transforms;
    using UnityEngine.Rendering;

    // Picks a mesh/material pair per entity from its squared distance to the camera.
    [BurstCompile]
    public partial struct SimpleLodJob : IJobEntity
    {
        public float3 CamPos;

        [BurstCompile]
        public void Execute(ref MaterialMeshInfo mmi, in SimpleLodData lod, in LocalToWorld ltw, in SimpleLodDistData lodDist)
        {
            // Compute the squared distance once instead of once per branch.
            float distSq = math.distancesq(CamPos, ltw.Position);

            if (distSq < lodDist.Dist0Sq) {
                mmi.MaterialID = lod.MatId0;
                mmi.MeshID = lod.MeshId0;
            } else if (distSq < lodDist.Dist1Sq) {
                mmi.MaterialID = lod.MatId1;
                mmi.MeshID = lod.MeshId1;
            } else if (distSq < lodDist.Dist2Sq) {
                mmi.MaterialID = lod.MatId2;
                mmi.MeshID = lod.MeshId2;
            } else {
                mmi.MaterialID = new BatchMaterialID { value = 888 }; // invalid number hides the material
                mmi.MeshID = new BatchMeshID { value = 888 }; // invalid number hides the mesh
            }
        }
    }
    Result: The 4ms taken by the child-transforms (using 100k animated entities) is gone! When zoomed out the results are similar, but when zoomed in I gained 80 FPS. For some reason, the Unity LOD system was causing big spikes that made camera movement jerky by dropping FPS really low every so often - those are now all gone and everything is smooth as butter. Also, the Unity LOD jobs were taking between 2ms and 3ms (not including the child transform stuff), so that performance hit is now down to the 0.45ms taken by SimpleLodJob.

    So there are some obvious differences between my blue-collar LOD system and Unity's fancy one.
    1. The Unity LOD system is based on screen-percentages instead of distance, so I suspect their LODs are probably able to adjust for different screen sizes better. I tried my LOD system with a few different screen sizes and honestly it looked pretty good.
    2. The Unity LOD system works in the Scene tab, but mine only works at runtime. Most of my LOD items are spawned at runtime and don't exist in a scene, so for my purposes I don't see that as much of a disadvantage.
    3. The Unity LOD authoring tool is a lot fancier, but that doesn't seem like a big deal.
    4. It's possible that the Unity LOD is able to avoid calculation for entities outside the frustum, while I must run the LOD for every entity regardless of whether it is in the frustum. But mine is way faster anyway so that factor is negligible.
    5. Unity's LOD stuff is able to do fancy settings tweaks like disable shadows and light-probes per LOD. I think I could make a few tweaks to the material since there is one for each LOD, but I have less control over those settings since I just use one renderer. Even with leaving shadows and everything on, mine is still outperforming the Unity one by a longshot.
    6. Unity supports > 3 LODs - I could easily do the same with a list. My current implementation is just a proof of concept.
    7. Maybe games with multiple cameras or cameras with weird field-of-views would have a problem with dist-to-cam instead of field of view?
    8. Unity LODs support smooth cross-fade transitions, (at least with game objects). I don't know if it's supported with entities. That would be a nice feature.
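
    To make the distance-versus-screen-percentage difference in points 1 and 7 concrete, here is a minimal sketch in plain C#. `ScreenRelativeHeight` is a hypothetical helper, not a Unity API, and the formula is the usual perspective-projection approximation rather than Unity's exact LOD code. It assumes a perspective camera looking straight at an object of known world-space height.

```csharp
using System;

public static class LodMetrics
{
    // Approximate fraction of the viewport height covered by an object of
    // the given world-space height at the given distance from the camera.
    // Hypothetical helper for illustration; not part of any Unity package.
    public static double ScreenRelativeHeight(double objectHeight, double distance, double verticalFovDegrees)
    {
        // Half of the vertical field of view, converted to radians.
        double halfFov = verticalFovDegrees * Math.PI / 360.0;

        // World-space height that fits in the viewport at that distance.
        double visibleHeight = 2.0 * distance * Math.Tan(halfFov);

        return objectHeight / visibleHeight;
    }
}
```

    For example, a 10 m tree at 200 m covers about 0.04 of the screen height with a 60 degree field of view, but about 0.29 with a 10 degree field of view (a scope or binoculars). A pure distance test returns the same LOD in both cases, which is the situation point 7 worries about.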

    At this point, I'm leaning heavily towards using my own LOD. However, I'm still new at this LOD stuff. Can you think of any reasons why it would be better to use Unity LODs instead?
     
    Last edited: Jul 25, 2023
  28. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,271
    The technique I describe can also be applied to Unity Transforms. You only need to swap out the systems, and all the components and rules stay intact. That's what I did for Transforms V1. And I will probably do it for Unity's latest Transforms if I have enough of an incentive to.
    The big difference is that you are evaluating the LODs for just the main camera at a sparse rate (so you have to be careful about camera cuts). Unity is evaluating LODs for each culling pass, whether they be cameras, shadows, probes, etc.

    With that said, their actual LOD algorithm is super old, and probably has some nasty performance edge cases since I think its caching was designed for static scenes and a single culling pass. I have the entire Entities Graphics stack refactored so that I can easily swap out the LOD algorithm with something else if I ever end up with a project where LODs are costly. That hasn't been the case yet, but if you wanted to see how your algorithm fares directly in the culling loop, it wouldn't take me long to wire it up into a custom Latios Framework version.

    As for cross fades, those aren't supported in Entities Graphics. I haven't taken the time to figure out how the technique works and can be adapted to ECS. But if I do figure it out I'll definitely implement it.
     
  29. lclemens

    lclemens

    Joined:
    Feb 15, 2020
    Posts:
    761
    That's a good point about camera cuts, so if I ever need to deal with those I'll bump it back up to every frame because I easily can adjust the interval. Either way, per frame it's 5.5x faster than Unity's LOD calculation (15x if you count child-transforms) so it doesn't really matter too much.

    That code I pasted above is the core algorithm of my LOD system... it's embarrassingly trivial! There is a little bit of setup code by way of authoring components and such. I'll probably convert it to an array instead of mesh1, mesh2, mesh3, but that's about the only change I have planned and it'll only take about 15 minutes to implement.
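
    A minimal sketch of the array-based variant mentioned above, written in plain C# so it runs outside Unity. `LodTable` and `SelectLod` are hypothetical names, not the actual SimpleLodData layout; inside an ECS job the thresholds would more likely live in a fixed-size or blob-backed container. It returns the first LOD whose squared-distance threshold has not been reached, or -1 for "hide", mirroring the final else branch of the job.

```csharp
using System;

public static class LodTable
{
    // maxDistSq must be sorted ascending: maxDistSq[i] is the squared
    // distance at which LOD i stops being used.
    // Returns the LOD index, or -1 when the object is beyond the last threshold.
    public static int SelectLod(float distSq, float[] maxDistSq)
    {
        for (int i = 0; i < maxDistSq.Length; i++)
        {
            if (distSq < maxDistSq[i])
                return i;
        }

        return -1; // beyond every threshold: the caller hides the object
    }
}
```

    With thresholds of 10, 50 and 100 units (squared: 100, 2500, 10000), a squared distance of 50 selects LOD 0 and anything at or past 10000 returns -1. Supporting more than three LODs then only means passing a longer array.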

    From what I read, they do crossfades by showing both meshes during the transition window and increase the transparency of the new one while decreasing the transparency of the old one. I'm guessing shaders would be used, but tackling that is above my paygrade.
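
    The blend described above can be sketched as a pair of complementary weights. This is plain C# with hypothetical names (`CrossFade.Weights`, `fadeStart`, `fadeEnd`), and it only computes the two opacities; feeding them into a shader, and whether to use transparency or dithering there, is left out. It is not taken from Entities Graphics, which per the reply above does not support cross-fades.

```csharp
using System;

public static class CrossFade
{
    // Returns (outgoing, incoming) opacities for a LOD transition that
    // happens between fadeStart and fadeEnd, both distances from the
    // camera with fadeStart < fadeEnd. The two weights always sum to 1.
    public static (float outgoing, float incoming) Weights(float distance, float fadeStart, float fadeEnd)
    {
        // Normalized progress through the transition window, clamped to [0, 1].
        float t = (distance - fadeStart) / (fadeEnd - fadeStart);
        t = Math.Clamp(t, 0f, 1f);

        return (1f - t, t);
    }
}
```

    Before the window the outgoing (closer) LOD is fully opaque, after it the incoming LOD is, and both meshes are drawn only while the distance is inside the window.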

    Anyway, thanks again for all your help!! You always have the best info!
     
  30. Jack_Martison

    Jack_Martison

    Joined:
    Jun 24, 2018
    Posts:
    143
    Sorry for the necro, but did you perhaps have success after the 1.0 release? I'm still not satisfied with the instancedindirect way, as it's still halving my FPS. I'm also making a tropical island with kinda the same goals as you.