
Question Looking for NativeCollection advice: "Buffer of Buffers"

Discussion in 'Entity Component System' started by PublicEnumE, May 9, 2021.

  1. PublicEnumE

    PublicEnumE

    Joined:
    Feb 3, 2019
    Posts:
    729
    Hello. I've been struggling a bit to find a solution for something. I'm looking for advice about which Native or Unsafe collection types might be used to solve this problem:

    For the sake of example:
    • My game has a restaurant which can serve 100 different menu items.
    • The game has customers, who can order items from the menu. Each customer can order one item per frame (it's a business-tycoon style sim game).
    • Each character that wishes to order that frame has a "PendingOrder" component, with data about which item they want (represented by an int).
    I have a job in which I want to collect all of these orders into a central structure. This structure would allow customers to submit their orders in parallel. The structure would be ordered in such a way that, when this job is complete, it would be easy to iterate over every order for a specific menu item.
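For concreteness, the setup might be sketched like this (the component and struct names here are my own illustration, not from the original post):

```csharp
using Unity.Entities;

// Hypothetical component from the example: each customer that wants to
// order this frame carries one of these.
public struct PendingOrder : IComponentData
{
    public int MenuItemIndex; // which of the ~100 menu items was ordered
}

// Hypothetical payload we'd like to collect per menu item, so that a later
// job can iterate all orders for one item in one pass.
public struct Order
{
    public Entity Customer;
    public int MenuItemIndex;
}
```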

    One big requirement for this collection is that I don't want to resize it each frame, based on the number of customers. Unfortunately, I believe that rules out NativeMultiHashMap.

    Here's a general idea of how I was hoping to lay out the data:
    • I was hoping for something like an UnsafeList<UnsafeStream>.
    • The UnsafeList would contain a separate UnsafeStream for each menu item type.
    • Each of the UnsafeStreams would be allocated with a ForEach count equal to JobsUtility.MaxThreadCount.
    The idea is that each thread could safely append orders to any of the UnsafeStreams in parallel. In my ordering job (an IJobChunk), I would read the order type for a given customer, access the NativeStream for that menu item type, and then write my order to the ForEach index of that stream which represents my current thread.

    That's what I wanted, anyway. As you might have figured out, that doesn't work. The main reason seems to be that UnsafeStream only stores the values you added during the most recent call to BeginForEachIndex() for a given index. There's no way to append items later to a ForEach index that you used before, so there's no good way to have one ForEachIndex represent each thread.

    By this point, I bet I've said many stupid things that are ignorant of the way UnsafeStream is meant to work.

    If you understand this better than me, please help. I'm interested in any way to get the results I'm after: A NativeCollection setup which:
    • Stores orders by type
    • Can be safely written to in parallel
    • Doesn't need to be resized based on the producer count
    • Can be easily read later, to iterate over all of the orders for one type.
    Thank you very much for reading all of this.
     
    Last edited: May 9, 2021
  2. varnon

    varnon

    Joined:
    Jan 14, 2017
    Posts:
    52
    My first thought based on the thread title was a dynamic buffer of entities (each with their own dynamic buffer). The individual entities could be menu item entities with buffers for customers (it sounds like this is what you want). From what I understand, that seems like it should work for your purposes.

    If you didn't need to have the orders grouped by type, I would have suggested a dynamic buffer with a custom struct holding the order type and the customer.
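    A rough sketch of that entity-based layout, assuming one entity per menu item, each holding a dynamic buffer of orders (names are illustrative, not from the post above):

```csharp
using Unity.Entities;

// One entity per menu item; orders for that item accumulate in its buffer.
[InternalBufferCapacity(8)] // small inline capacity; spills to the heap beyond this
public struct CustomerOrderElement : IBufferElementData
{
    public Entity Customer;
}
```

    Reading all orders for one menu item is then just iterating that item entity's DynamicBuffer&lt;CustomerOrderElement&gt;. The catch, as noted, is the writing side: appending to a DynamicBuffer from multiple threads at once isn't safe without extra machinery (e.g. an EntityCommandBuffer.ParallelWriter or a single-threaded collection pass).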
     
  3. PublicEnumE

    PublicEnumE

    Joined:
    Feb 3, 2019
    Posts:
    729
    I would expect this design to perform much slower than a NativeCollection approach. Am I wrong?
     
  4. varnon

    varnon

    Joined:
    Jan 14, 2017
    Posts:
    52
    Yeah, it seems possibly slower, but I'm not sure how the performance would really compare. From your initial statement it seemed important to group things by order type, and that seemed the most direct way to get there.
     
  5. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,264
  6. PublicEnumE

    PublicEnumE

    Joined:
    Feb 3, 2019
    Posts:
    729
    This is exactly the type of structure I was looking to write. It's expertly made - Thank you for sharing it!

    After digging into NativeStream, I was starting to work on something like this, but it would have taken me a long time, as my NativeCollection skills are still shaky. Since a 'growing block list per thread' design didn't already exist, I was starting to worry that there was some mathematical reason why. But your design does exactly that, with seemingly no real compromises.

    There didn't seem to be a license or permissions given at those links. Is this code free to use in all contexts? Thank you again.
     
    Last edited: May 10, 2021
  7. tertle

    tertle

    Joined:
    Jan 25, 2011
    Posts:
    3,761
    https://github.com/Dreaming381/Latios-Framework/blob/v0.3.1/LICENSE.md
     
  8. PublicEnumE

    PublicEnumE

    Joined:
    Feb 3, 2019
    Posts:
    729
    Last edited: May 10, 2021
  9. PublicEnumE

    PublicEnumE

    Joined:
    Feb 3, 2019
    Posts:
    729
    @DreamingImLatios - a question about your code. In your constructor for UnsafeParallelBlockList, you create a null pointer, and then increment it (lines 12 and 13, below):

    Code (CSharp):
    1. public UnsafeParallelBlockList(int elementSize, int elementsPerBlock, Allocator allocator)
    2. {
    3.     m_elementSize = elementSize;
    4.     m_elementsPerBlock = elementsPerBlock;
    5.     m_blockSize = elementSize * elementsPerBlock;
    6.     m_allocator = allocator;
    7.  
    8.     m_perThreadBlockLists = (PerThreadBlockList*)UnsafeUtility.Malloc(64 * JobsUtility.MaxJobThreadCount, 64, allocator);
    9.     for (int i = 0; i < JobsUtility.MaxJobThreadCount; i++)
    10.     {
    11.         m_perThreadBlockLists[i].lastByteAddressInBlock = null;
    12.         m_perThreadBlockLists[i].nextWriteAddress = null;
    13.         m_perThreadBlockLists[i].nextWriteAddress++;
    14.         m_perThreadBlockLists[i].elementCount = 0;
    15.     }
    16. }
    I was under the impression incrementing a null pointer was undefined behavior. Maybe I'll learn something here - was there a reason for this? Why not just leave it as null?
     
  10. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,264
  11. PublicEnumE

    PublicEnumE

    Joined:
    Feb 3, 2019
    Posts:
    729
    No worries - Thanks for the response.

    Where are you expecting nextWriteAddress to point after line 13? What I don't understand is what incrementing a null pointer is supposed to do. I believe that's undefined behavior in C-derived languages, so I would expect the address produced by null++ to be unreliable.

    Update: Ok, I finally grokked what you're doing. You don't care what it's pointing to, only that it's > lastByteAddressInBlock for that first add. That's how you get the just-in-time approach to adding, rather than allocating one block ahead. Smart. :)

    I'm not sure how reliable this sort of thing is in general. I've found evidence that some compilers may optimize null++ out, to avoid hardware faults on some machines. On other compilers, it'll just equal 4 or 8 (which is likely what's happening in Unity). Since it's UB, this doesn't seem wise as a general C# approach. But if it's working in Unity, then I guess there's no harm!

    Once again, thank you for sharing this.
     
    Last edited: May 10, 2021
  12. PublicEnumE

    PublicEnumE

    Joined:
    Feb 3, 2019
    Posts:
    729
    @DreamingImLatios followup questions for you:
    1. What is an optimum block size? Is that even a thing? Or is it largely preferential?
    2. How do you personally go about choosing an optimal element count per block?
    3. Is there a mathematical way to calculate an optimal count, based on the element type’s size?
    4. Would an optimum block size be different for writing vs. reading?
     
    Last edited: May 10, 2021
  13. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,264
    https://docs.microsoft.com/en-us/do...guage-specification/unsafe-code#pointer-types
    So unless C# manually checks for null on increment operators, this seems pretty well-defined.
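    To illustrate what that section of the spec implies: `p + 1` is computed purely as the address in `p` plus `sizeof(T)`, with no null check. A minimal console sketch (my own, compiled with unsafe code enabled) would behave like this:

```csharp
using System;

class PointerIncrementDemo
{
    static unsafe void Main()
    {
        int* p = null;
        p++; // per the C# spec, this just adds sizeof(int) to the address; no null check
        Console.WriteLine((long)p); // 4, since sizeof(int) == 4 in C#
    }
}
```

    So in C# (unlike C and C++), the incremented null pointer has a well-defined address, which is all the constructor above relies on.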

    Block size is largely a tradeoff between wasteful allocations on one side and the frequency of allocations and jumps in memory on the other.
    For large volumes of data, you want a block size greater than or equal to 4 kB, since that's the page size and the maximum span the prefetcher can operate on. But that can be wasteful if most blocks only ever get 1 or 2 elements. I haven't done much experimentation with block sizes personally. So far it behaves similarly to innerLoopBatchCount, where non-extreme values make a negligible difference compared to the rest of the job.
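    As a quick back-of-the-envelope for that 4 kB guideline (my own arithmetic, not from the framework):

```csharp
using System;

class BlockSizing
{
    static void Main()
    {
        const int pageSizeBytes = 4096; // typical page size / prefetcher span
        const int elementSize = 16;     // e.g. an Entity (8 bytes) + an int + padding

        // Elements per block needed to fill exactly one page:
        int elementsPerBlock = pageSizeBytes / elementSize;
        Console.WriteLine(elementsPerBlock); // 256

        // Worst-case waste if a thread only ever writes one element to its block:
        int wastedBytes = pageSizeBytes - elementSize;
        Console.WriteLine(wastedBytes); // 4080
    }
}
```

    That worst case is why a page-sized block only pays off when most threads actually produce a meaningful number of elements per frame.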
     
  14. Micz84

    Micz84

    Joined:
    Jul 21, 2012
    Posts:
    451
    I was wondering: wouldn't it be easier to just create an ISharedComponentData with an item index? Then, when an order is placed, create an entity with a CustomerComponent that points to the customer entity, plus the item's shared component data. You could then easily find all items that have orders using GetAllUniqueSharedComponentData. The main drawback I see with this solution is that the creation of these order entities may not be possible in parallel.