Question Best practices on dealing with large amounts of culled, passive entities?

Discussion in 'Entity Component System' started by Multiainen, Jul 24, 2023.

  1. Multiainen

    Multiainen

    Joined:
    Oct 9, 2019
    Posts:
    16
    Is there a "correct" way to minimize the performance hit of a lot of off-camera entities that are not queried by any systems? I'm working on a city builder / TBS where there may be up to hundreds of thousands of such entities on larger maps (mostly buildings and foliage patches). I'm wondering whether it would make more sense to use the Disabled component, to leave them as is, or whether there are any other sensible options (I searched for these, but couldn't find anything viable). The entities in this case only contain an empty tag component and an ID component with a single int value, so I'm unsure whether disabling or pooling them would even bring any benefit.
     
  2. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    3,993
    First off, is there a performance hit? Check the profiler to make sure you aren't trying to optimize for a problem that doesn't exist.

    Second, do the entities have any transform or rendering components, or are they truly just the entity and value which would give you 24 bytes per entity? If the latter, consider using a NativeList instead, because you want entities to be close to 128 bytes in size for efficient chunk utilization.
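    The 128-byte figure can be checked with some quick arithmetic. The sketch below is plain Python rather than Unity code, and it assumes a 16 KiB chunk with a cap of 128 entities per chunk (both are assumptions of this sketch, and chunk header overhead is ignored). The function names are made up for illustration.

```python
import math

CHUNK_BYTES = 16 * 1024        # assumed chunk size (header overhead ignored)
MAX_ENTITIES_PER_CHUNK = 128   # assumed per-chunk entity cap


def chunk_capacity(entity_bytes):
    """Entities that fit in one chunk, limited by both bytes and the cap."""
    return min(MAX_ENTITIES_PER_CHUNK, CHUNK_BYTES // entity_bytes)


def chunk_utilization(entity_bytes):
    """Fraction of the chunk's bytes actually holding entity data."""
    return chunk_capacity(entity_bytes) * entity_bytes / CHUNK_BYTES


def chunks_needed(entity_count, entity_bytes):
    """Chunks required if every chunk is filled to capacity."""
    return math.ceil(entity_count / chunk_capacity(entity_bytes))


for size in (24, 128, 608):
    print(size, chunk_capacity(size), round(chunk_utilization(size), 3))
```

    Under these assumptions a 24-byte entity hits the entity cap long before the chunk is full, so most of each chunk sits empty, while a 128-byte entity fills the chunk exactly.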
     
  3. Multiainen

    Multiainen

    Joined:
    Oct 9, 2019
    Posts:
    16
    Apologies, I should've been more specific.

    The reason I posted this was that I was noticing a hit of 2-3 ms in the stats with a test of 300 000 culled entities, as well as a memory overhead of about 1.5 GB. I have now tried it with the profiler, and there the hit is closer to 4 ms. Not unbearable on either front, but not ideal either.

    I formulated the last sentence of my original post poorly; I meant that the entities contain only those simple components in addition to the ones required for rendering, since they're static, rendered objects in the terrain. So the full list of components would be Local to world, Chunk world render bounds, Entities graphics chunk info, Material mesh info, Render bounds, Render filter settings, Render mesh array, World render bounds, the tag and the ID. So the total size is probably quite large, but I understood that the CPU load of the rendering components should be practically nil while the entity is culled.

    EDIT: The size of each entity is 608 bytes, to be specific; at least according to the archetype inspector.
     
    Last edited: Jul 24, 2023
  4. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    3,993
    That makes more sense. No worries.

    I think I know what you are seeing, but just to be certain, I suggest posting a screenshot of the timeline view of the profiler with both the main thread and worker threads expanded. In addition, make sure you are profiling with safety checks disabled (or profiling in a build).

    If safety checks aren't the main issue, one thing you can do is divide your world into a coarse grid, and assign each static entity a shared component with an index into the grid. It doesn't need to be super precise. Effectively what you'll be doing is making static entities nearby each other lie in the same ECS chunks, which will help Entities Graphics use a fast path for culling most of those chunks.
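    A minimal sketch of the grid indexing, in plain Python rather than Unity code. The function names, the cell size and the grid dimensions are all hypothetical; in an actual project the returned index would be the value stored in the shared component.

```python
import math
from collections import defaultdict


def grid_cell_index(x, z, origin_x, origin_z, cell_size, cells_x, cells_z):
    """Map a world-space (x, z) position to a flat index in a coarse grid.

    Positions outside the grid are clamped to the nearest edge cell,
    since the grouping does not need to be precise.
    """
    col = math.floor((x - origin_x) / cell_size)
    row = math.floor((z - origin_z) / cell_size)
    col = min(max(col, 0), cells_x - 1)
    row = min(max(row, 0), cells_z - 1)
    return row * cells_x + col


def group_by_cell(positions, origin_x, origin_z, cell_size, cells_x, cells_z):
    """Bucket positions by cell index, mimicking how entities that share
    a value would end up stored together."""
    groups = defaultdict(list)
    for (x, z) in positions:
        idx = grid_cell_index(x, z, origin_x, origin_z,
                              cell_size, cells_x, cells_z)
        groups[idx].append((x, z))
    return dict(groups)


# Hypothetical 16 x 16 grid of 64-unit cells starting at the world origin.
buildings = [(10.0, 12.0), (30.0, 5.0), (70.0, 10.0), (10.0, 70.0)]
print(group_by_cell(buildings, 0.0, 0.0, 64.0, 16, 16))
```

    A larger cell size gives fewer distinct values and fuller chunks but looser bounds; a smaller one gives tighter bounds but more partially filled chunks, so the cell size is a tuning knob rather than a fixed number.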

    As for memory, if you want that to be lower, you may want to consider subscene streaming.

    As for the CPU load of the rendering components being practically nil while an entity is culled: this isn't actually true. Culling costs CPU time. But it massively improves GPU performance, which is why people do it. The same applies to LODs.
     
  5. Multiainen

    Multiainen

    Joined:
    Oct 9, 2019
    Posts:
    16
    It looks like you were absolutely correct: with safety checks turned off for both runs (with and without the entities), the hit dropped to only around 1.5 ms. About half of that came from the editor loop; the other half looks to be from the graphics system, mostly batch updates (from nothing to 0.53 ms). This is already an incredible improvement, but I'll attach the screenshot regardless. (I'm not great with the profiler UI, so I couldn't figure out how to get the worker thread details up simultaneously, sorry; they were largely idle, though.)

    [Attached screenshot: upload_2023-7-24_21-32-58.png]

    Regarding the grid idea: interesting, so the chunk division of entities is based entirely on their archetype? This sounds exactly like what I was looking for, thank you. With GameObjects, mesh combining and chunk rendering worked great aside from the horrific memory cost of the generated meshes, so I was indeed wondering if there would be an elegant equivalent with ECS.

    I'll have to look into subscene streaming, thanks! The current memory hit in an extreme case isn't the end of the world (it's very much a relief compared to the hundredfold equivalent I had with GameObjects), but more optimization couldn't hurt, since the game will ultimately include switching dynamically between several maps.

    It's very good to know that culling costs CPU time rather than saving it; I still get confused about the inner workings of nearly anything computational. With these figures and in this case, it definitely seems worth it.
     
  6. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    3,993
    To get the timeline view in the profiler: on the left side, there's a spot where it says "Hierarchy". Click on that, and there should be an option for "Timeline".

    As for chunk division: entities are grouped into chunks by archetype, and entities with different shared component values are kept in separate chunks, which is why the grid index groups nearby entities together. As an optimization, Unity computes bounds for each chunk as a preliminary culling check. The tighter those bounds (by having entities closer together in the same chunk), the more likely the whole chunk can be skipped and per-entity culling doesn't need to happen.
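    To illustrate why tighter chunk bounds help, here is a toy 2D sketch in plain Python. It is not how Entities Graphics is implemented: the view volume is simplified to an axis-aligned rectangle, a "chunk" is just a list of points, and all names are hypothetical. It only counts how many per-entity tests remain after chunks whose bounds lie entirely outside the view have been skipped.

```python
def bounds_of(points):
    """Axis-aligned bounds (min_x, min_z, max_x, max_z) of a chunk's points."""
    xs = [p[0] for p in points]
    zs = [p[1] for p in points]
    return (min(xs), min(zs), max(xs), max(zs))


def chunk_outside_view(box, view):
    """True if the chunk bounds do not overlap the view rectangle at all."""
    bx0, bz0, bx1, bz1 = box
    vx0, vz0, vx1, vz1 = view
    return bx1 < vx0 or bx0 > vx1 or bz1 < vz0 or bz0 > vz1


def per_entity_tests(chunks, view):
    """Entities that still need an individual test after the chunk check."""
    return sum(len(c) for c in chunks
               if not chunk_outside_view(bounds_of(c), view))


# 16 static objects on a 4 x 4 layout, 100 units apart.
points = [(x, z) for x in (0, 100, 200, 300) for z in (0, 100, 200, 300)]
view = (0, 0, 120, 120)  # simplified camera view covering one corner

# Clustered: one chunk per 200-unit grid cell, so neighbours share a chunk.
cells = {}
for (x, z) in points:
    cells.setdefault((x // 200, z // 200), []).append((x, z))
clustered = list(cells.values())

# Scattered: same chunk count, but each chunk stretches across the map.
scattered = [points[i::4] for i in range(4)]

print(per_entity_tests(clustered, view), per_entity_tests(scattered, view))
```

    With the same number of chunks and the same objects, the clustered layout leaves fewer entities to test individually, because more chunk bounds fall completely outside the view.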