NativeArray vs DynamicBuffer: which accesses data faster?

Antypodish · Nov 17, 2018

Well, I know similar topics have been discussed before.
But I would like a little confirmation about NativeArray and DynamicBuffer.

In one example, let's say I have a NativeArray with some values, and per entity I access the data with an index offset.
In the other example I have entities with a DynamicBuffer instead.
In either example I can access the same data.

While we know a buffer is more flexible, is accessing data via NativeArray actually faster than via a buffer?
Does anyone know?

fholm · Nov 17, 2018

If you are talking purely about read/write performance when get/set:ing indices, then NativeArray<T> is faster because it needs to do less.

Edit: Most likely not by much tho.

Antypodish · Nov 17, 2018

I suspect the difference will be minimal.

But I think there is some difference, because in a job I first need to fetch the DynamicBuffer from the entity before I can get at its data:

Code (CSharp):

DynamicBuffer<SomeBufferElement> someDynamicBuffer = someBufferElement[entity];

Then I use the index.
That adds an extra step in comparison to NativeArray.

Unless accessing the entity's buffer array is somehow optimized, e.g. with Burst.

I am trying to establish the most performant route, as I expect to have many iterations in a job, accessing different elements in the array.

fholm · Nov 17, 2018

Generally speaking, in most languages a regular flat array with no abstraction on top is going to be the fastest.

Antypodish · Nov 17, 2018

Yep, you are right.
I will probably stay with my arrays.
I was thinking of maybe converting to buffers for better flexibility. But it is no big deal, since I don't need the flexibility often.

fholm · Nov 17, 2018

If you want absolute performance, the fastest way of reading a NativeArray<T> is this:

Code (csharp):

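using System;
using Unity.Collections;
using Unity.Collections.LowLevel.Unsafe;

// Must run inside an unsafe context (unsafe method or block).
var targetsArray = new NativeArray<Target>(....);
var targetsArrayPtr = (Byte*)NativeArrayUnsafeUtility.GetUnsafeReadOnlyPtr(targetsArray);
for (Int32 i = 0; i < targetsArray.Length; ++i) {
    // Address of element i: base pointer plus i elements' worth of bytes.
    var targetPtr = (Target*)(targetsArrayPtr + (sizeof(Target) * i));
}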

This is what the regular array indexer property does when it calls UnsafeUtility.ReadArrayElement, but this way you don't have to go through two methods (the indexer getter + ReadArrayElement), and you don't need to copy the value either, since you can just use the pointer as is.

Edit: You can also use NativeArrayUnsafeUtility.GetUnsafePtr if you want a read/write pointer and not just a read pointer.

Antypodish · Nov 17, 2018

That is a valid point. Thx
I forgot about pointers, even though I see topics about them quite often.

Seems like a good deal to me.

Is there a reason for using Byte rather than byte as a pointer?

fholm · Nov 17, 2018

Antypodish said: ↑

Is there a reason for using Byte rather than byte as a pointer?

Nope, same thing, just personal preference on my end when writing the code

Antypodish · Nov 17, 2018

Thx

JooleanLogic · Nov 17, 2018

If you have entities that change archetypes, could you end up with fragmented data access using a NativeArray?

I haven't used dynamic buffers but my understanding is that they're laid out in a stream within the chunk just like other components?
If so, then you will benefit from contiguous linear access of the buffers even if entities change archetypes.
With an index offset into a NativeArray, the data is static so if the order of your entities changes, the memory access into the NativeArray will become fragmented. Unless I've misunderstood what you're doing.

The code overhead of accessing a dynamic buffer is possibly less than the penalty of cache misses.

Also can you not just cast it directly to a Target*?

Code (CSharp):

var targetsArray = new NativeArray<Target>(....);
Target* pTarget = (Target*)NativeArrayUnsafeUtility.GetUnsafePtr(targetsArray);
for (Int32 i = 0; i < targetsArray.Length; ++i, pTarget++) {
    // pTarget already points at element i, so no offset arithmetic is needed.
    pTarget->whatever += ...
}

Antypodish · Nov 17, 2018

jooleanlogic said: ↑

With an index offset into a NativeArray, the data is static so if the order of your entities changes, the memory access into the NativeArray will become fragmented. Unless I've misunderstood what you're doing.

Well no, since I am not moving NativeArray data with entities. Entities hold only the offset index needed to access the array. So I can shuffle entities as much as I like.

However, I am not sure how dynamic buffer arrays are structured. I would rather expect them to become fragmented when shifting, copying, or deleting entities. But I may be wrong here.

jooleanlogic said: ↑

Also can you not just cast it directly to a Target*?

Yep, looking into pointer solutions. But I need to be careful, as I haven't used them before. So I need to validate that they don't conflict in my design. A conflict could only come from my own coding error, since I don't write to the same index, or even the same entity, from multiple jobs at the same time.

JooleanLogic · Nov 17, 2018

Antypodish said: ↑

Well no, since I am not moving NativeArray data with entities. Entities hold only the offset index needed to access the array. So I can shuffle entities as much as I like.

My explanation probably wasn't very good. What I'm referring to is just the possible performance effect of the memory access pattern using your NativeArray approach, not that it won't work at all.

E.g. If you have entities A to E with a component that stores an index offset into a NativeArray like so

Code (CSharp):

A B C D E // Entities
0 1 2 3 4 // Index offset
...
NativeArray<int> NA = new NativeArray<int>(5, Allocator.Persistent);

// If the order of your entities changes like this
E B D C A
// Then the memory access pattern into NA becomes
4 1 3 2 0 // NA[4], NA[1] etc

It'll still work just fine, but your memory access pattern won't be linear. The pre-fetch penalty for that non-linear access might outweigh any code overhead of dynamic buffers.

With dynamic buffers, I'm assuming it's just like ComponentData (but not sure)

Code (CSharp):

A B C D E // Entities
0 1 2 3 4 // Dynamic buffer data

// If the order of entities changes to this
E B D C A
4 1 3 2 0 // Dynamic buffer data

it doesn't matter because all your dynamic buffer data is still accessed linearly.
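
To make the two traversal orders concrete, here is a minimal sketch in plain C#. It uses ordinary managed arrays as a stand-in for the NativeArray and for the per-entity offsets, and the class and method names are made up for illustration. Both loops produce the same result; the only difference is the order in which memory is touched, which is what the prefetcher cares about.

```csharp
using System;

public static class AccessPatternSketch
{
    // Follows per-entity offsets, so the order of reads into `data`
    // depends on the order of the entities (e.g. 4, 1, 3, 2, 0).
    public static long SumViaOffsets(int[] data, int[] offsets)
    {
        long sum = 0;
        foreach (int offset in offsets)
        {
            sum += data[offset];
        }
        return sum;
    }

    // Reads `data` front to back, the pattern you get when the values
    // travel with the entities and are iterated in storage order.
    public static long SumLinear(int[] data)
    {
        long sum = 0;
        for (int i = 0; i < data.Length; i++)
        {
            sum += data[i];
        }
        return sum;
    }

    public static void Main()
    {
        int[] data = { 10, 20, 30, 40, 50 };
        int[] shuffledOffsets = { 4, 1, 3, 2, 0 }; // entity order E B D C A

        Console.WriteLine(SumViaOffsets(data, shuffledOffsets)); // 150
        Console.WriteLine(SumLinear(data));                      // 150
    }
}
```

Whether the jumpy order costs anything measurable depends on how much data there is relative to the cache, so it is worth profiling rather than assuming.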

Antypodish said: ↑

I would rather expect them to become fragmented when shifting, copying, or deleting entities. But I may be wrong here.

They won't be fragmented but yes perhaps all the copying overhead when entities move might be a factor if you have large buffer arrays.

fholm · Nov 17, 2018

jooleanlogic said: ↑

Also can you not just cast it directly to a Target*?

Yes, you are 100% correct. I converted my example from a piece of code which uses a generic type <T>, and you can't do generic pointers, so I missed that.

JooleanLogic · Nov 17, 2018

fholm said: ↑

Yes you are 100% correct

Swish.

Antypodish · Nov 17, 2018

jooleanlogic said: ↑

It'll still work just fine, but your memory access pattern won't be linear. The pre-fetch penalty for that non-linear access might outweigh any code overhead of dynamic buffers.

Nice example.
But maybe I misunderstood the memory access principles for NA.
From what I understand, if I have a linearly allocated NA and I know where it starts and what my index is, it shouldn't matter where I access it from, or in which order. NA technically never moves once it is allocated. Wouldn't i-5, i+1, i+32 all be the exact same speed? Or does that introduce cache misses as well?

I really assume both NA and buffer allocate similarly, provided buffer data is not shifted along with entities. Which I think is the case for a buffer when chunks move or entities are reordered.

But unless someone confirms how the buffer actually behaves, we can only hypothesize.

JooleanLogic · Nov 17, 2018

Antypodish said: ↑

Wouldn't i-5, i+1, i+32 all be the exact same speed? Or does that introduce cache misses as well?

Maybe and maybe. The speed advantage comes from the pre-fetcher so accessing data linearly does matter and you might also get cache misses. How much of an impact it has though totally depends on the number of entities and buffer size vs cache size.
I was just raising it within the context of the question as something to think about.

Antypodish said: ↑

I really assume both NA and buffer allocate similarly, provided buffer data is not shifted along with entities

I imagined they were stored in chunks (unless they're too big) as they form part of the archetype, however from the docs here, it doesn't specifically say that. I'll have to look at the source.
In the event of heap allocation, then maybe the NativeArray approach might be better.

Antypodish · Nov 17, 2018

jooleanlogic said: ↑

I was just raising it within the context of the question as something to think about.

Well, you definitely gave me something to think about.

jooleanlogic said: ↑

I imagined they were stored in chunks (unless they're too big) as they form part of the archetype, however from the docs here, it doesn't specifically say that. I'll have to look at the source.
In the event of heap allocation, then maybe the NativeArray approach might be better.

What do you think in the case of a static NativeArray? That seems to be a candidate for fast linear access, no matter what.

Joachim_Ante · Nov 18, 2018

NativeArray is faster than DynamicBuffer because it is just a pointer + length. Essentially, writing an int to a NativeArray results in a single instruction to write the value.

DynamicBuffer is resizable and supports both chunk storage, for good linear memory access, and out-of-chunk storage, for large arrays.

Fortunately, DynamicBuffer can be cast to NativeArray using DynamicBuffer.ToNativeArray(), which lets you read / write all the data of a DynamicBuffer in the most efficient way. Essentially, the usage is to use DynamicBuffer to attach the data to an entity, resize it to the desired size, and then use ToNativeArray() and inner loops over the NativeArray to perform efficient modification of the data in the DynamicBuffer.
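
The pattern described above might look roughly like the following. This is only a sketch: it assumes the preview Entities API of the time, where DynamicBuffer<T>.ToNativeArray() takes no arguments and returns a view over the buffer's memory rather than a copy (later package versions expose that aliasing view as AsNativeArray() instead). MyBufferElement, BufferFillSketch and FillWithSquares are hypothetical names used for illustration.

```csharp
using Unity.Collections;
using Unity.Entities;

// Hypothetical element type, used only for this illustration.
public struct MyBufferElement : IBufferElementData
{
    public int Value;
}

public static class BufferFillSketch
{
    // Resizes the buffer once, then does all element writes through a
    // NativeArray view so the inner loop is plain pointer + index access.
    public static void FillWithSquares(DynamicBuffer<MyBufferElement> buffer, int count)
    {
        // Resize first: growing the buffer can move its storage,
        // which would invalidate a view taken earlier.
        buffer.ResizeUninitialized(count);

        // ASSUMPTION: preview-era API where this aliases the buffer memory.
        NativeArray<MyBufferElement> view = buffer.ToNativeArray();

        for (int i = 0; i < view.Length; i++)
        {
            view[i] = new MyBufferElement { Value = i * i };
        }
    }
}
```

The view is only good for as long as the buffer is neither resized nor moved, so it is safest to take it right before the inner loop and not to hold on to it afterwards.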

Antypodish · Nov 18, 2018

Thank you @Joachim_Ante for the response.

Joachim_Ante said: ↑

Fortunately, DynamicBuffer can be cast to NativeArray using DynamicBuffer.ToNativeArray(), which lets you read / write all the data of a DynamicBuffer in the most efficient way. Essentially, the usage is to use DynamicBuffer to attach the data to an entity, resize it to the desired size, and then use ToNativeArray() and inner loops over the NativeArray to perform efficient modification of the data in the DynamicBuffer.

I like this idea, since I can combine the flexibility of a dynamic buffer with the performance of a NativeArray.
And if I want to, I can go further from there with pointers.

I think I will go with that. Sounds good.
Much appreciated.

meanmonkey · Dec 4, 2018

@Antypodish since you are well into dynamic buffers, I want to ask you something.

I'm converting my OOP code to ECS step by step. My world is divided into sectors, where each sector can hold a variable number of entities.

In my former OOP approach I stored the references to the (former) GameObjects in lists (wrapped in sector objects), which was very convenient, so I could easily object-pool / destroy GOs on a per-sector basis.

Now I'm at the point where I have to decide whether to use NativeArrays (still wrapped in sector objects) or dynamic buffers (converted to NativeArray for iteration) to reference the entities.

I'm not quite sure about the maximum capacity of dynamic buffers, but in my case I would need up to a few thousand referenced entities per dynamic buffer.

Mostly I would use this collection to batch-destroy entities.

Would you recommend using dynamic buffers in this case?

tertle · Dec 4, 2018

meanmonkey said: ↑

I'm not quite sure about the maximum capacity of dynamic buffers, but in my case I would need up to a few thousand referenced entities per dynamic buffer.

max length = int.MaxValue / sizeof(T)

i.e. 2GB per buffer.

I doubt you'll hit the cap under any reasonable circumstance.
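
As a quick worked example of that formula, here is a small plain C# sketch. The element sizes are assumptions for illustration: 4 bytes for an element wrapping a single int, and 8 bytes for an element wrapping an Entity (an index plus a version, both ints). BufferCapSketch and MaxLength are made-up names.

```csharp
using System;

public static class BufferCapSketch
{
    // Largest element count whose total byte size still fits in an int,
    // following the formula above: int.MaxValue / element size.
    public static int MaxLength(int elementSizeInBytes)
    {
        return int.MaxValue / elementSizeInBytes;
    }

    public static void Main()
    {
        Console.WriteLine(MaxLength(4)); // 536870911 elements of 4 bytes
        Console.WriteLine(MaxLength(8)); // 268435455 elements of 8 bytes
    }
}
```

Either way the limit is hundreds of millions of elements, far beyond a few thousand entity references per sector.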

meanmonkey · Dec 4, 2018

All right, thx.

Spy-Shifty · Dec 4, 2018

tertle said: ↑

max length = int.MaxValue / sizeof(T), i.e. 2GB per buffer. I doubt you'll hit the cap under any reasonable circumstance.

Sure?

I thought it's limited by the size of the chunk.

https://github.com/Unity-Technologi...er/Documentation/reference/dynamic_buffers.md

Declaring Buffer Element Types
To declare a Buffer, you declare it with the type of element that you will be putting into the Buffer:

// This describes the number of buffer elements that should be reserved
// in chunk data for each instance of a buffer. In this case, 8 integers
// will be reserved (32 bytes) along with the size of the buffer header
// (currently 16 bytes on 64-bit targets)
[InternalBufferCapacity(8)]
public struct MyBufferElement : IBufferElementData
{
    // These implicit conversions are optional, but can help reduce typing.
    public static implicit operator int(MyBufferElement e) { return e.Value; }
    public static implicit operator MyBufferElement(int e) { return new MyBufferElement { Value = e }; }

    // Actual value each buffer element will store.
    public int Value;
}

JooleanLogic · Dec 4, 2018

meanmonkey said: ↑

I'm not quite sure about the maximum capacity of dynamic buffers, but in my case I would need up to a few thousand referenced entities per dynamic buffer. Mostly I would use this collection to batch-destroy entities.

Something to consider is that if it's your intention to delete all entities within a sector at once, you might be better off with either a sector based SharedComponent or tag Component so that all entities within a sector are contiguous. That way you can delete entire chunks in one hit and avoid the need for an array altogether.

If you maintain an array of several thousand entities per sector, then depending on how those entities might have moved between chunks/sectors, you could end up with a very fragmented array of entities for deletion.
Unity does batch optimise deletions under the hood but it could still result in a lot of work and random memory access vs the chunk approach.

Spy-Shifty said: ↑

Sure? I thought it's limited by the size of the chunk.

If the buffer outgrows the space reserved for it in the chunk (its [InternalBufferCapacity]), it is heap allocated outside the chunk. Just the buffer header remains in the chunk.
This is something to consider if you're using large buffers. If you use a large [InternalBufferCapacity] that takes up a lot of chunk space, then later grow the buffer enough to cause a heap allocation, I'm pretty sure that initial chunk space becomes void and thus wasted. It might be better to use the default minimum buffer size, then re-allocate to the heap straight away.
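
To put rough numbers on the chunk space being discussed, here is a small plain C# sketch. The 16-byte header is the figure from the docs quoted earlier in the thread; the 16 KB chunk size and the idea that the buffer is the only thing in the chunk are simplifying assumptions, and all names are made up for illustration.

```csharp
using System;

public static class BufferFootprintSketch
{
    // Header size as quoted from the docs ("currently 16 bytes on 64-bit targets").
    public const int BufferHeaderBytes = 16;

    // ASSUMPTION: a 16 KB chunk holding nothing but this buffer. Real chunks
    // also store a chunk header, the Entity ids and any other components.
    public const int AssumedChunkBytes = 16 * 1024;

    // In-chunk bytes reserved per entity: header plus the internal capacity.
    public static int ReservedBytesPerEntity(int internalCapacity, int elementBytes)
    {
        return BufferHeaderBytes + internalCapacity * elementBytes;
    }

    // Optimistic upper bound on how many entities fit in one chunk.
    public static int EntitiesPerChunkUpperBound(int internalCapacity, int elementBytes)
    {
        return AssumedChunkBytes / ReservedBytesPerEntity(internalCapacity, elementBytes);
    }

    public static void Main()
    {
        // 8 ints, as in the docs example: 16 + 8 * 4 bytes.
        Console.WriteLine(ReservedBytesPerEntity(8, 4));        // 48
        Console.WriteLine(EntitiesPerChunkUpperBound(8, 4));    // 341

        // 1024 ints reserved in the chunk: 16 + 1024 * 4 bytes.
        Console.WriteLine(ReservedBytesPerEntity(1024, 4));     // 4112
        Console.WriteLine(EntitiesPerChunkUpperBound(1024, 4)); // 3
    }
}
```

Under these assumptions a large internal capacity cuts the entities per chunk from hundreds to a handful, and that reserved space is paid for whether or not the data later ends up on the heap.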

meanmonkey · Dec 4, 2018

jooleanlogic said: ↑

Something to consider is that if it's your intention to delete all entities within a sector at once, you might be better off with either a sector based SharedComponent or tag Component so that all entities within a sector are contiguous. That way you can delete entire chunks in one hit and avoid the need for an array altogether.

I think dynamic buffers are fine for me, as I have a very large number of sectors streamed in at runtime, so I can't really create a component per sector.

jooleanlogic said: ↑

If you maintain an array of several thousand entities per sector, then depending on how those entities might have moved between chunks/sectors, you could end up with a very fragmented array of entities for deletion.
Unity does batch optimise deletions under the hood but it could still result in a lot of work and random memory access vs the chunk approach.

I only do this for static entities anyway. Dynamic / moving entities are treated separately. And that's the nice thing with dynamic buffers: I can convert the dynamic buffer to a native array and destroy the entities using Unity's batch destroy method on the main thread.

jooleanlogic said: ↑

If the buffer outgrows the space reserved for it in the chunk (its [InternalBufferCapacity]), it is heap allocated outside the chunk. Just the buffer header remains in the chunk.
This is something to consider if you're using large buffers. If you use a large [InternalBufferCapacity] that takes up a lot of chunk space, then later grow the buffer enough to cause a heap allocation, I'm pretty sure that initial chunk space becomes void and thus wasted. It might be better to use the default minimum buffer size, then re-allocate to the heap straight away.

By asking about "thousands" of entities I wanted to make sure size won't be a problem at all. In the end, it will be a few hundred per sector.

Antypodish · Dec 4, 2018

@meanmonkey, it seems you have already received an answer.

But no need to worry about buffer size. As @jooleanlogic said, if the buffer outgrows its space in the chunk, the heap is used.
Then, as you pointed out, once you know the size of the buffer and it stays constant most of the time, you can push the data into a NativeArray.
