
Question Looking for NativeCollection advice: "Buffer of Buffers"

Discussion in 'Entity Component System' started by PublicEnumE, May 9, 2021.

  1. PublicEnumE

    PublicEnumE

    Joined:
    Feb 3, 2019
    Posts:
    729
    Hello. I've been struggling a bit to find a solution for something. I'm looking for advice about which Native or Unsafe collection types might be used to solve this problem:

    For the sake of example:
    • My game has a restaurant which can serve 100 different menu items.
    • The game has customers, who can order items from the menu. Each customer can order one item per frame (it's a business-tycoon style sim game).
    • Each character that wishes to order that frame has a "PendingOrder" component, with data about which item they want (represented by an int).
    I have a job in which I want to collect all of these orders into a central structure. This structure would allow customers to submit their orders in parallel. The structure would be ordered in such a way that, when this job is complete, it would be easy to iterate over every order for a specific menu item.
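For concreteness, the setup might be sketched like this (the component and struct names here are my own illustration, not from the original post):

```csharp
using Unity.Entities;

// Hypothetical component from the example: each customer that wants to
// order this frame carries one of these.
public struct PendingOrder : IComponentData
{
    public int MenuItemIndex; // which of the ~100 menu items was ordered
}

// Hypothetical payload we'd like to collect per menu item, so that a later
// job can iterate all orders for one item in one pass.
public struct Order
{
    public Entity Customer;
    public int MenuItemIndex;
}
```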

    One big requirement for this collection is that I don't want to resize it each frame, based on the number of customers. Unfortunately, I believe that rules out NativeMultiHashMap.

    Here's a general idea of how I was hoping to lay out the data:
    • I was hoping for something like an UnsafeList<UnsafeStream>.
    • The UnsafeList would contain a separate UnsafeStream for each menu item type.
    • Each of the UnsafeStreams would be allocated with a ForEach count equal to JobsUtility.MaxThreadCount.
    The idea is that each thread could safely append orders to any of the UnsafeStreams in parallel. In my ordering job (an IJobChunk), I would read the order type for a given customer, access the NativeStream for that menu item type, and then write my order to the ForEach index of that stream which represents my current thread.

    That's what I wanted, anyway. As you might have figured out, that doesn't work. The main reason seems to be that UnsafeStream only stores the values you added during the most recent call to BeginForEachIndex() for a given index. There's no way to append items later to a ForEach index that you used before, so there's no good way to have one ForEachIndex represent each thread.

    By this point, I bet I've said many stupid things that are ignorant of the way UnsafeStream is meant to work.

    If you understand this better than me, please help. I'm interested in any way to get the results I'm after: A NativeCollection setup which:
    • Stores orders by type
    • Can be safely written to in parallel
    • Doesn't need to be resized based on the producer count
    • Can be easily read later, to iterate over all of the orders for one type.
    Thank you very much for reading all of this.
     
    Last edited: May 9, 2021
  2. varnon

    varnon

    Joined:
    Jan 14, 2017
    Posts:
    52
    My first thought based on the thread title was a dynamic buffer of entities (each with their own dynamic buffer). The individual entities could be menu item entities with buffers for customers (it sounds like this is what you want). From what I understand, that seems like it should work for your purposes.

    If you didn't need to have the orders grouped by type, I would have suggested a dynamic buffer with a custom struct holding the order type and the customer.
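    A rough sketch of that entity-based layout, assuming one entity per menu item, each holding a dynamic buffer of orders (names are illustrative, not from the post above):

```csharp
using Unity.Entities;

// One entity per menu item; orders for that item accumulate in its buffer.
[InternalBufferCapacity(8)] // small inline capacity; spills to the heap beyond this
public struct CustomerOrderElement : IBufferElementData
{
    public Entity Customer;
}
```

    Reading all orders for one menu item is then just iterating that item entity's DynamicBuffer&lt;CustomerOrderElement&gt;. The catch, as noted, is the writing side: appending to a DynamicBuffer from multiple threads at once isn't safe without extra machinery (e.g. an EntityCommandBuffer.ParallelWriter or a single-threaded collection pass).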
     
  3. PublicEnumE

    PublicEnumE

    Joined:
    Feb 3, 2019
    Posts:
    729
    I would expect this design to perform much slower than a NativeCollection approach. Am I wrong?
     
  4. varnon

    varnon

    Joined:
    Jan 14, 2017
    Posts:
    52
    Yeah, it seems possibly slower, but I'm not sure how the performance would really compare. From your initial statement it seemed important to group things by order type, and that seemed the most direct way to get there.
     
  5. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,264
  6. PublicEnumE

    PublicEnumE

    Joined:
    Feb 3, 2019
    Posts:
    729
    This is exactly the type of structure I was looking to write. It's expertly made - Thank you for sharing it!

    After digging into NativeStream, I was starting to work on something like this, but it would have taken me a long time, as my NativeCollection skills are still shaky. Since a 'growing block list per thread' design didn't already exist, I was starting to worry that there was some mathematical reason why. But your design does exactly that, with seemingly no real compromises.

    There didn't seem to be a license or permissions given at those links. Is this code free to use in all contexts? Thank you again.
     
    Last edited: May 10, 2021
  7. tertle

    tertle

    Joined:
    Jan 25, 2011
    Posts:
    3,761
    https://github.com/Dreaming381/Latios-Framework/blob/v0.3.1/LICENSE.md
     
  8. PublicEnumE

    PublicEnumE

    Joined:
    Feb 3, 2019
    Posts:
    729
    Last edited: May 10, 2021
  9. PublicEnumE

    PublicEnumE

    Joined:
    Feb 3, 2019
    Posts:
    729
    @DreamingImLatios - a question about your code. In your constructor for UnsafeParallelBlockList, you create a null pointer, and then increment it (lines 12 and 13, below):

    Code (CSharp):
    1. public UnsafeParallelBlockList(int elementSize, int elementsPerBlock, Allocator allocator)
    2. {
    3.     m_elementSize = elementSize;
    4.     m_elementsPerBlock = elementsPerBlock;
    5.     m_blockSize = elementSize * elementsPerBlock;
    6.     m_allocator = allocator;
    7.  
    8.     m_perThreadBlockLists = (PerThreadBlockList*)UnsafeUtility.Malloc(64 * JobsUtility.MaxJobThreadCount, 64, allocator);
    9.     for (int i = 0; i < JobsUtility.MaxJobThreadCount; i++)
    10.     {
    11.         m_perThreadBlockLists[i].lastByteAddressInBlock = null;
    12.         m_perThreadBlockLists[i].nextWriteAddress = null;
    13.         m_perThreadBlockLists[i].nextWriteAddress++;
    14.         m_perThreadBlockLists[i].elementCount = 0;
    15.     }
    16. }
    I was under the impression incrementing a null pointer was undefined behavior. Maybe I'll learn something here - was there a reason for this? Why not just leave it as null?
     
  10. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,264
  11. PublicEnumE

    PublicEnumE

    Joined:
    Feb 3, 2019
    Posts:
    729
    No worries - Thanks for the response.

    Where are you expecting nextWriteAddress to point after line 13? What I don't understand is what incrementing a null pointer is supposed to do. I believe that's undefined behavior in C-derived languages, so I would expect the address produced by null++ to be unreliable.

    Update: Ok, I finally grokked what you're doing. You don't care what it's pointing to, only that it's > lastByteAddressInBlock for that first add. That's how you get the just-in-time approach to adding, rather than allocating one block ahead. Smart. :)

    I'm not sure how reliable this sort of thing is in general. I've found evidence that some compilers may optimize null++ out, to avoid hardware faults on some machines. On other compilers, it'll just equal 4 or 8 (which is likely what's happening in Unity). Since it's UB, this doesn't seem wise as a general C# approach. But if it's working in Unity, then I guess there's no harm!

    Once again, thank you for sharing this.
     
    Last edited: May 10, 2021
  12. PublicEnumE

    PublicEnumE

    Joined:
    Feb 3, 2019
    Posts:
    729
    @DreamingImLatios followup questions for you:
    1. What is an optimum block size? Is that even a thing? Or is it largely preferential?
    2. How do you personally go about choosing an optimal element count per block?
    3. Is there a mathematical way to calculate an optimal count, based on the element type’s size?
    4. Would an optimum block size be different for writing vs. reading?
     
    Last edited: May 10, 2021
  13. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,264
    https://docs.microsoft.com/en-us/do...guage-specification/unsafe-code#pointer-types
    So unless C# manually checks for null on increment operators, this seems pretty well-defined.
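    To illustrate what that section of the spec implies: `p + 1` is computed purely as the address in `p` plus `sizeof(T)`, with no null check. A minimal console sketch (my own, compiled with unsafe code enabled) would behave like this:

```csharp
using System;

class PointerIncrementDemo
{
    static unsafe void Main()
    {
        int* p = null;
        p++; // per the C# spec, this just adds sizeof(int) to the address; no null check
        Console.WriteLine((long)p); // 4, since sizeof(int) == 4 in C#
    }
}
```

    So in C# (unlike C and C++), the incremented null pointer has a well-defined address, which is all the constructor above relies on.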

    Block size is largely a tradeoff between wasteful allocations on one side and the frequency of allocations and jumps in memory on the other.
    For large volumes of data, you want a block size greater than or equal to 4 kB, since that's the page size and the maximum span the prefetcher can operate on. But that can be wasteful if most blocks only ever get 1 or 2 elements. I haven't done much experimentation with block sizes personally. So far it behaves similarly to innerLoopBatchCount, where non-extreme values make a negligible difference compared to the rest of the job.
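    As a quick back-of-the-envelope for that 4 kB guideline (my own arithmetic, not from the framework):

```csharp
using System;

class BlockSizing
{
    static void Main()
    {
        const int pageSizeBytes = 4096; // typical page size / prefetcher span
        const int elementSize = 16;     // e.g. an Entity (8 bytes) + an int + padding

        // Elements per block needed to fill exactly one page:
        int elementsPerBlock = pageSizeBytes / elementSize;
        Console.WriteLine(elementsPerBlock); // 256

        // Worst-case waste if a thread only ever writes one element to its block:
        int wastedBytes = pageSizeBytes - elementSize;
        Console.WriteLine(wastedBytes); // 4080
    }
}
```

    That worst case is why a page-sized block only pays off when most threads actually produce a meaningful number of elements per frame.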
     
  14. Micz84

    Micz84

    Joined:
    Jul 21, 2012
    Posts:
    451
    I was wondering: wouldn't it be easier to just create an ISharedComponentData with an item index? Then, when an order is placed, create an entity with a CustomerComponent that points to the customer entity, plus the item's shared component data. You could then easily find all items that have orders using GetAllUniqueSharedComponentData. The main drawback I see with this solution is that the creation of these order entities may not be possible in parallel.