Search Unity

Best practices for big data sets. Arrays on entities?

Discussion in 'Data Oriented Technology Stack' started by illinar, Mar 26, 2018.

  1. illinar

    illinar

    Joined:
    Apr 6, 2011
    Posts:
    563
    If I have big level made out of voxels, should I keep voxel data in arrays on entities? Can entities contain arrays?
     
    Last edited: Mar 26, 2018
  2. rastlin

    rastlin

    Joined:
    Jun 5, 2017
    Posts:
    100
    I never worked with voxel worlds, but what stops you from actually making each voxel an Entity?

    Ultimately this is how ECS is ultimately meant to be used as, the voxel is set of data/components (position, material? etc.), makes perfect sense to have it as an Entity to me.
     
    superpig and illinar like this.
  3. SubPixelPerfect

    SubPixelPerfect

    Joined:
    Oct 14, 2015
    Posts:
    184
    I think best way to store voxels is Texture3d and best way to iterate over it is a fragment shader
     
    illinar likes this.
  4. illinar

    illinar

    Joined:
    Apr 6, 2011
    Posts:
    563
    Thank you for answers things are starting to get clearer. To me, this really requires a new way of thinking.

    After some experimentation, I think I established that I can have fixed-size arrays in component structs. (EDIT: nope, only SharedComponentData, but be careful, I'm not sure how they are intended to be used, instead of adding fixed arrays to entities might be better, see details in the next posts.) Since I have 16x16x16 voxels per meter and some decent-sized levels, I will probably make blocks for each meter as entities, with 4K voxel arrays inside a component, to save memory. And a voxel is probably an int8 or int16 pointing to static immutable data from some kind of voxel library with colors and materials.

    1) Now I reckon I can rely on the entity order, so when a system gives me all the loaded blocks I can pull the subsets I need just by index. Then I can put them into an array and give it to jobs. Or mark them with components and let other systems deal with them.

    2) Alternatively, if I will be making a sparse block grid where blocks would have position values instead of fixed index, I can have systems emitting requests for blocks and one system then iterating over all blocks, checking positions, and marking requested blocks, then the next systems would work on the marked blocks. In that case, marking means adding components and that requires a sync point? How bad are those? In some cases, I could mark blocks one frame and then work on them the next frame.

    3) Or I can have Entities for pages of blocks with 512 block arrays (block entity "references"). So I can access a page by its index or by iterating over them and then access block "directly" by entity index, but now I don't know if that is efficient because I will have to use EntityManager.GetComponent for each block. I assume it's very fast..

    I kinda like how approach number 2 sounds.
     
    Last edited: Apr 8, 2018
  5. illinar

    illinar

    Joined:
    Apr 6, 2011
    Posts:
    563
    How would 3D texture approach integrate with ECS?
     
  6. SubPixelPerfect

    SubPixelPerfect

    Joined:
    Oct 14, 2015
    Posts:
    184
    nature of ecs speed is very similar to gpu,
    same approach is used - paralleled procedures on vectorised data
    the difference is that modern gpu can have 1000+ cores

    so i suggest to do it this way: any big operations (like world generation) should run on gpu,
    and some subset of data (like visible now) may be on ecs
     
    illinar likes this.
  7. illinar

    illinar

    Joined:
    Apr 6, 2011
    Posts:
    563
    Ok, I see. I have no world generation though so I'll stick with ECS.
     
  8. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    4,671
    I would either store one voxel in one entity. The overhead per entity compared to just the actual data of all the entities, is 16 bytes. So its really quite cheap in memory and the guranteed linear memory layout give great perf for iterating over all voxels.

    If you really need something tighter, store the data ready for the GPU in a ComputeBuffer / 3DTexture or something.
     
    SubPixelPerfect and illinar like this.
  9. tertle

    tertle

    Joined:
    Jan 25, 2011
    Posts:
    1,667
    Adrian-Anta likes this.
  10. illinar

    illinar

    Joined:
    Apr 6, 2011
    Posts:
    563
    @Joachim_Ante 16 bytes, Interesting. Thanks. The voxel is 1-2 bytes. With 16x16x16 voxels per meter with data that's about 70Kb overhead per cubic meter and that's too much.

    I'll have to use blocks of voxels. Or I could use entities for voxel objects but those aren't fixed-size.

    @tertle that's definitely a case where you should use entities. looks pretty straightforward, basically a voxel particle system, perfect for ECS, you just need to iterate. in my case though, I have thousands of times more voxels and I need to also navigate in 3d voxel space, check neighbors all the time.
     
  11. illinar

    illinar

    Joined:
    Apr 6, 2011
    Posts:
    563
    Is it possible to use multiple instances of the component per entity? If 16 bytes overhead per entity is too much and my component is just 1 byte, can I add 1000 of them (same component) to the entity?
     
    Last edited: Mar 30, 2018
  12. recursive

    recursive

    Joined:
    Jul 12, 2012
    Posts:
    591
    On this topic, I'm writing a custom collision system.

    If I were to store the raycast hit results for afiltered set of a character touched to use in the next frame, what's the best way to do it? Have a NativeArray or NativeList on a component? The results are not guaranteed to stay the same frame to frame.

    I'm doing this so that my Actor entities can track multiple moving platforms that can affect movement, since some of them may be semi-segmented with sub-collision detectors that can rest on more than 1 platform at a time.

    EDIT: Also to mention, a collider could be part of the entity system, or it could just be geo in the world. Since we don't have a full collision API in ECS yet, it could go either way (moving platforms would have entities, pure static geo would just be there to collide with and would have no other components or data other than maybe probuilder components).
     
  13. illinar

    illinar

    Joined:
    Apr 6, 2011
    Posts:
    563
    More news. The order of entities is not reliable.
     
  14. illinar

    illinar

    Joined:
    Apr 6, 2011
    Posts:
    563
    [QUOTE="recursive, post: 3444700, member: 117849"If I were to store the raycast hit results for afiltered set of a character touched to use in the next frame, what's the best way to do it? [/QUOTE]

    Not very clear. Store a bunch of raycasts for each character? Each raycast result should be an entity. And it can have a value or a component that points to a character. System would iterate through all of them next frame and will know what to do.

    If there is only one character then it's trivial. Just a bunch of GroundRaycastHit (or whatever they are) components.

    And I don't think you can store any arrays on entities.
     
    Last edited: Mar 31, 2018
    recursive likes this.
  15. recursive

    recursive

    Joined:
    Jul 12, 2012
    Posts:
    591
    Not very clear. Store a bunch of raycasts for each character? Each raycast result should be an entity. And it can have a value or a component that points to a character. System would iterate through all of them next frame and will know what to do.

    If there is only one character then it's trivial. Just a bunch of GroundRaycastHit (or whatever they are) components.

    And I don't think you can store any arrays on entities.[/QUOTE]

    I messed up with my previous reply and it wasn't clear (don't write forum posts tired and on a phone) - Is it safe to put a native container on n IComponentData or ISharedComponentData? I know entities aren't containers themselves.
     
  16. illinar

    illinar

    Joined:
    Apr 6, 2011
    Posts:
    563
    It's not unsafe, it impossible, it won't compile in Unity, because IComponentData have to be blittable and arrays are not. In some cases, it might be a good idea to put data on an array when it's simply too much data but then it has to be not an IComponentData but something else. Monobehaviour? I don't want to store it in systems. In most cases though, it seems like you can do without arrays.
     
    recursive likes this.
  17. recursive

    recursive

    Joined:
    Jul 12, 2012
    Posts:
    591
    Ah, I mistakenly thought Native containers could be blittable for some reason. "Results" entities it is, thanks!
     
  18. illinar

    illinar

    Joined:
    Apr 6, 2011
    Posts:
    563
    I thought the same. They mostly just have some security checks and custom allocation so they can be multithreaded and not leave garbage.
     
  19. illinar

    illinar

    Joined:
    Apr 6, 2011
    Posts:
    563
    Last edited: Apr 1, 2018
    andywatts and recursive like this.
  20. PlayingPolicy

    PlayingPolicy

    Joined:
    Dec 20, 2013
    Posts:
    14
    Be aware of general best practices, regardless of ECS. I mess around with voxel representations all day at work. The secret sauce of any advanced voxel stuff is compression, typically lossless. This is equally true for real-time and offline processes.

    RLE is commonplace. A slight variant where you distinguish runs of the same value and runs of values where compression doesn't help (I call them "gradient runs") is even more effective. In any case, pay close attention to your space-filling curve. The naive
    (a * bDim + b) * cDim + c
    is fine for most cases, but pay close attention to which axes are assigned to the high, middle and low orders. You want the low-order (a in the code there) to map to the most coherent axis, which for terrain in a Minecraft-like game is the y or up/down axis. Alternatively, use Morton order, which gives you full 3D locality for compression.

    There are fancier sparse encodings that better support random access, like the VOLA format (also used by Remedy in Quantum Break). It's basically a 64-ary binary trie. In fact, even just a single "flat" store of 64-bit masks (for a 4^3 mini-chunk of voxels) with corresponding e.g. 16-bit prefix sums only consumes just over a bit per voxel. 5 KB overhead for a 32^3 chunk, vs 32 KB of dense storage even at a tiny 1 byte per voxel. Beyond that, you only pay for the voxels that are really present or interesting. This is a tool in my toolbox I often use.

    Note that you can make some pretty bold assumptions about the sparsity of real voxel data. A chunk doesn't even need to allocate conservatively. In practice, no vaguely coherent voxel boundary, no matter how it's generated/modified, will ever contain more than 1/8 of the full dense storage. Take that figure with a grain of salt as it's sensitive to the chunk size (finer chunks -> more density), and do your own testing, but the point is you don't even need to allocate the worst-case number of voxels per chunk. All chunks that scrape close to needing full dense storage just look like noise.
     
  21. illinar

    illinar

    Joined:
    Apr 6, 2011
    Posts:
    563
    @PlayingPolicy Thank you very much, this is very interesting. I liked the Quantum Break approach. I like the branching factor of 64. I want to implement this.

    Even before, I started thinking that I might not get away with dense voxel grid because of pathfinding. Now I think that with sparse voxels I can also do an efficient custom raycasting and collisions.

    And I was worried about poly count in a destructible environment above all else, and I guess I'm just gonna take the same approach and do lower res voxels under the top layers. (Of course, I was planning to use lower res for distant objects anyway) I might need branching factor of 8 though.

    The max array size that an Entity can handle is 16KB, so what chunk size we are looking at? Keeping in mind the destruction element although lower layers are lower rez. Seems like 32^3 chunks or 2m^3. (and what if branching factor is 8?).

    Now how to organize chunks in the level (a tree again or a fixed grid?). There might not be a need for compression beyond 2m chunks. I think I could fit my levels into ~100k chunks. Chunks will be entities in a grid. The grid can potentially be fixed and used as a 3D array (gotta test it) or might be better to use chunks sparsely.

    Could be nice to have one voxel overlap with the neighbor chunks, so that I could generate mesh without needing to visit the neighbors. Chunk would store neighbors but I wonder if loading them up while processing current chunk will slow down the process a bit.



    Tested entity order. It seems to remain intact as long as long as the archetype remains unchanged.

    A little worried that accessing data in 8-factor trees takes up to 6 steps already.

    I can have 4 bits of voxel data with a 16-byte dictionary per chunk, instead of 8bits per voxel.
     
    Last edited: Apr 2, 2018
  22. PlayingPolicy

    PlayingPolicy

    Joined:
    Dec 20, 2013
    Posts:
    14
    Yeah 32^3 seems the only way to go for a 16KB limit. For chunks that small, there's not much point in storing a hierarchy. Consider the "flat" masks-with-prefix-sums sparse representation mentioned before. Not sure how to tie in LODs with that though. In any case, with that chunk size and a 16KB limit, you should be able to store pretty rich voxels.


    Since you're storing an environment, it's likely essential to store them sparsely. Remember the point about smaller chunks requiring denser storage? The same principle holds for any structured data. Largeness implies sparseness, smallness implies denseness. Most people use hashing to index chunks, which works out just fine, and seems to be the preferred method for spatial indices in Unity's new ECS anyway.


    I've done this and I can't recommend it. For one thing, the legwork of meshing the voxels on the CPU is minor compared to cost of shuttling that generated mesh data around, especially through the Unity engine. So you don't lose much by inspecting neighboring chunks' voxels. Worse, the overlap thing introduces complexity everywhere else in the system, even for things that ought to be simple. You want to index a voxel? Consider the overlap. You want to modify a voxel? Make sure you mirror those changes to all chunks' voxels.

    Finally, it's hard to say a-priori how much overlap you even need. In almost all cases you can probably get away with a 1 voxel overlap per side, to make the 26-ary face/edge/vertex connected voxels available to every voxel. But that works out to be a tremendous waste of memory and bandwidth. Long story short: don't store the overlap.

    Of interest here is the idea of voxel rasterization. See Gavan Woolery and Alex Evans on the topic. Meshing and ray tracing are both fundamentally flawed; rasterization maps better to the hardware. Voxel rasterization works so well that I'm confident it's the future of real-time rendering in general. One nice benefit if you go this route is that you don't even need to access voxel neighbors when rendering, until you're working in screen-space.
     
    andywatts likes this.
  23. illinar

    illinar

    Joined:
    Apr 6, 2011
    Posts:
    563
    @PlayingPolicy I agree on all points. And awesome links, thanks, I remember Woolery, I wanted to do voxel rasterization for a long time but I'm not prepared to do graphics programming for now.

    After a day of considering all the options, I think I figured out the best solution for me.

    My requirements:
    • LoDs for pathfinding and physics raycasting.
    • Mixed voxel sizes per chunk, 2x scale factor, for aesthetic and polycount reasons.
    • Destructible environments.
    • Quick access to voxel data.
    • Sparse chunks. On the scale of 1-2m chunks being sparse saves way more memory than voxels being sparse.
    • Sometimes pretty dense geometry.
    • 1-2 million cubic meter levels.
    Solution:
    • A fixed grid of entities with ChunkIndex components. Accessed directly by wold position. (~50MB)
    • Entity chunks. Used sparsely. Chunk array indexes are stored in ChunkIndex grid cells.
    • EDIT: since then I've learned about arrays on ISharedComponentData, so I would probably use an array instead of entities for indexing chunks.
    • Chunks size is 1m and ~2-4KB. They contain a dictionary, a flat 4-8bit voxel array, and LoDs.
    Exactly how LoDs are created I have to still figure out, but it looks good so far.

    So here to access voxels a system has the grid indexes array injected and chunks injected separately. It gets desired chunk index from the grid by flattening its 3D index. And takes the chunk of that index from chunk array. Then can directly access any voxel of any LoD.

    I can even scale chunks to my needs. I can add and remove them almost at no cost.

    LoDs can be stored as separate components which is great because some systems only need LoDs and not the actual voxel data.
     
    Last edited: Apr 9, 2018
  24. snacktime

    snacktime

    Joined:
    Apr 15, 2013
    Posts:
    2,373
    Not sure how much is applicable it would be for this specific problem, but there is actually quite a bit of work being done on integer compression outside of games. This is an interesting read.

    https://lemire.me/blog/2017/09/27/stream-vbyte-breaking-new-speed-records-for-integer-compression/

    So off the top of my head the first thing I'd want to do is get rid of floats. They are expensive in both time and space, mostly space. I'd probably create my own number struct with two fields, one for the whole number and the other for the fraction. Both would be integers. The extra math involved to reason about precision is going to be cheap, cpu operations are cheap anyways, the cheapest part of the overall equation.

    Smart partitioning to lower the precision and overall number lengths I need would be part of that.

    For serialization then just varint encode. I'd probably just use protobuf with the packed format.

    I've used variants of the above for general multiplayer space optimization quite a bit. But never tackled a voxel system so specific approaches for voxel structures I have no idea.
     
  25. snacktime

    snacktime

    Joined:
    Apr 15, 2013
    Posts:
    2,373
    Another approach that might be interesting to test. If voxels are all uniform size then create a structure were location is implied. So you trade less space for more time. Given how cheap cpu operations are and how relatively expensive moving memory around is, I wonder if that would be a good trade off.

    So for example knowing the chunk index, chunk dimension, and voxel index, you can calculate the voxel position.
     
  26. illinar

    illinar

    Joined:
    Apr 6, 2011
    Posts:
    563
    That would work in some cases, but too much complexity, lower performance, and almost no benefits in my case.