[See proposed memclear solution] Clear() on large NativeMultiHashMaps is causing performance issues.

Mr-Mechanical · Feb 12, 2019

Hi,

I'm developing an algorithm which heavily relies on NativeMultiHashMaps (at least somewhat similar to BoidsSystem use case).

The main concern is that the HashMap has to be reset every frame which causes a major performance obstacle when using Clear(). This issue only applies to large hashmaps. Perhaps this doesn't have to be the case.

Any suggestions for a workaround on clearing hashmaps? I've tried disposing and reallocating with Allocator.TempJob and it is understandably less optimal for performance than Clear(). Is Clear() doing something significant with regards to performance? How shall I address this issue? Is there another path to "resetting" the HashMap without affecting performance?

Thanks, the advice is appreciated!

tertle · Feb 12, 2019

Mr-Mechanical said: ↑

Any suggestions for a workaround on clearing hashmaps? I've tried disposing and reallocating with Allocator.TempJob and it is understandably less optimal for performance than Clear(). Is Clear() doing something significant with regards to performance? How shall I address this issue? Is there another path to "resetting" the HashMap without affecting performance?

Clear requires iterating the entire hashmap so there are definitely performance issues for large maps.
Allocating a hashmap also calls Clear() so it's going to have the same issues and just adds an extra allocation on top of Clear.

I'm unaware of any workarounds for large hashmaps.

sngdan · Feb 12, 2019

@tertle - I never used your event system, but if I understand it right, I could create an entity for each "Key" with a buffer array for the values?

Q1: would that be any faster?
Q2: would it introduce 1-frame delay?

tertle · Feb 12, 2019

No idea if that would be faster, it's an interesting use I haven't considered.
They are usually 1 frame delayed, but you could keep it same frame by executing after the EndBarrierSystem

tertle · Feb 12, 2019

If you've ever looked at source, it's pretty obvious what the problem is

Code (CSharp):

public static unsafe void Clear(NativeHashMapData* data)
{
    int* buckets = (int*) data->buckets;
    for (int i = 0; i <= data->bucketCapacityMask; ++i)
        buckets[i] = -1;

    int* nextPtrs = (int*) data->next;
    for (int i = 0; i < data->keyCapacity; ++i)
        nextPtrs[i] = -1;

    for (int tls = 0; tls < JobsUtility.MaxJobThreadCount; ++tls)
        data->firstFreeTLS[tls * NativeHashMapData.IntsPerCacheLine] = -1;

    data->allocatedIndexLength = 0;
}

It needs to set everything to -1 to mark it as not used.
If you could make 0 the unused state you could just memclear everything and it'd be super fast. I'm not sure what the consequences are but for example you could start the arrays at 1 or offset when indexing by -1.
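
A minimal sketch of the offset idea above, written in plain C# rather than against Unity's unsafe containers. `OffsetIndexTable` and its members are hypothetical names, not part of NativeHashMap. The table stores index + 1, so a zeroed slot means "unused" and a reset becomes one bulk zero-fill instead of a loop that writes -1.

Code (CSharp):

```csharp
using System;

// Hypothetical illustration: a table that stores (index + 1),
// so a zeroed slot means "empty" and a reset is a single bulk clear.
public sealed class OffsetIndexTable
{
    private readonly int[] slots;

    public OffsetIndexTable(int capacity)
    {
        // New arrays are zero-filled, so every slot starts out empty.
        slots = new int[capacity];
    }

    // Returns the stored index, or -1 if the slot is empty.
    public int Get(int slot) => slots[slot] - 1;

    // Stores an index (>= 0), or -1 to mark the slot empty again.
    public void Set(int slot, int index) => slots[slot] = index + 1;

    // One zero-fill replaces the per-element "write -1" loop.
    public void Clear() => Array.Clear(slots, 0, slots.Length);
}
```

Callers still see -1 for an empty slot; only the stored representation changes, which is why every read and write has to go through `Get` and `Set`.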

sngdan · Feb 12, 2019

I had a quick glance at the source but I don't have the skill & patience to fiddle through it, since I only do this as a hobby. But it seemed like with some effort one could memcpy the values out of the buckets...

I.e. I store collision candidates (values) in a spatial grid (grid cell = key) and I need to get this into native arrays per grid cell to iterate pair by pair. Currently I just unroll by iterating through all keys/values in a job.

Mr-Mechanical · Feb 12, 2019

tertle said: ↑

If you've ever looked at source, it's pretty obvious what the problem is

It needs to allocate everything to -1 to mark it not used.
If you could make 0 the unused state you could just memclear everything and it'd be super fast. I'm not sure what the consequences are but for example you could start the arrays at 1 or offset when indexing by -1.

@Joachim_Ante This seems like a cool idea, do you think you guys can optimize hashmap Clear()?

sngdan · Feb 13, 2019

I tried my idea from above, but without using the event system from @tertle.

I basically replaced my NativeMultiHashMap with the "sngdan entity->buffer dictionary". It seems to be faster in my use case, although I cannot write to the buffer concurrently, which I could with the hashmap.

I will test this more and post here if there is something interesting to report.

sngdan · Feb 13, 2019

forgot to mention earlier - besides the speed increase, it is also nice that you can watch the "Dictionary" contents in the entity debugger.

JooleanLogic · Feb 13, 2019

Mr-Mechanical said: ↑

This issue only applies to large hashmaps.

Do you know just how big it is? Are we talking megabytes? I'm surprised this is so slow when it's just iterating a couple of arrays.

tertle said: ↑

It needs to allocate everything to -1 to mark it not used.
If you could make 0 the unused state you could just memclear everything and it'd be super fast.

Would MemSet make any difference or is zero a special case that gets optimised?

Code (CSharp):

//int* buckets = (int*) data->buckets;
//for (int i = 0; i <= data->bucketCapacityMask; ++i)
//    buckets[i] = -1;

// The loop above covers bucketCapacityMask + 1 entries, so the byte count must as well.
UnsafeUtilityEx.MemSet(data->buckets, 0xFF, (data->bucketCapacityMask + 1) * sizeof(int));

Curious if replacing those two loops with MemSet would make any difference.
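
On why a byte value of 0xFF is the right fill here: in two's complement an `int` of -1 has all 32 bits set (0xFFFFFFFF), so writing 0xFF to every byte leaves each `int` equal to -1. The sketch below shows this in plain C#, assuming a runtime where `Span<T>` and `MemoryMarshal` are available; `ByteFillDemo` is a hypothetical name and this is not the Unity `MemSet` call itself.

Code (CSharp):

```csharp
using System;
using System.Runtime.InteropServices;

public static class ByteFillDemo
{
    // Fills an int array with -1 by writing 0xFF to every byte.
    // This works because -1 is 0xFFFFFFFF in two's complement.
    public static void FillMinusOne(int[] values)
    {
        // Reinterpret the int storage as raw bytes (no copy is made).
        Span<byte> bytes = MemoryMarshal.AsBytes(values.AsSpan());
        bytes.Fill(0xFF);
    }
}
```

The same trick does not generalise to other sentinels: a byte fill can only produce values whose four bytes are identical, such as 0 or -1.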

sngdan · Feb 13, 2019

@Mr-Mechanical

Try the approach that I suggested, there is a good chance that you benefit from this as well.

I have not checked at what point it allocates on the heap (so I guess, there will be a speed penalty at a certain amount of values stored per "key") - I have to read up on this at one point

Clearing the buffer is fast ( m_Buffer->Length = 0; )

It's fast to access. I.e. you can just get all values per key as a native array .AsNativeArray() -- no copying of data

Considerations: You cannot write from different threads / in parallel and you might have to complete the job that fills the "Dictionary" (I had to, because I parallel process the values per key later)

edit: I now have 50,000 random AABBs moving, colliding, changing color on collision, and rendering (with a bare-bones custom render system that uses MPB for the colors) on my iMac 3.3GHz (4 cores) at 60fps

Mr-Mechanical · Feb 13, 2019

sngdan said: ↑

@Mr-Mechanical

Try the approach that I suggested, there is a good chance that you benefit from this as well.

I have not checked at what point it allocates on the heap (so I guess, there will be a speed penalty at a certain amount of values stored per "key") - I have to read up on this at one point

Clearing the buffer is fast ( m_Buffer->Length = 0; )

It's fast to access. I.e. you can just get all values per key as a native array .AsNativeArray()

Considerations: You cannot write from different threads / in parallel and you might have to complete the job that fills the "Dictionary" (I had to, because I parallel process the values per key later)


I am fascinated with your approach. However I am curious, how would you control the values of each "key"?

sngdan · Feb 13, 2019

I am not sure I understand your question. What do you mean by control?

edit: I clear the buffers each frame, before I write to them again - similar to what you would also do with the hashmap

Code (CSharp):

// MultiHashMap
public struct CollisionInfo : IComponentData
{
    public Entity entity;
    public Box box;
}

// in the job
[WriteOnly] public NativeMultiHashMap<int, CollisionInfo>.Concurrent entitiesPerGridMultiHashMap;

entitiesPerGridMultiHashMap.Add(key, new CollisionInfo {entity = e, box = box}); // key is an int referring to a grid position

// "Entity Dictionary" or whatever you want to call it
public struct CollisionInfoBuffer : IBufferElementData
{
    public Entity entity;
    public Box box;
}

// in the job
[WriteOnly] public BufferArray<CollisionInfoBuffer> keyArray;

keyArray[key].Add(new CollisionInfoBuffer{entity = e, box = box}); // key = as above

sngdan · Feb 13, 2019

Why don’t you describe a little better, what exactly you are trying to do and if I have an idea how to approach it, I will let you know.

Mr-Mechanical · Feb 13, 2019

sngdan said: ↑

@tertle - I never used your event system, but if I understand it right, I could create an entity for each "Key" with a buffer array for the values?

Ok, that makes sense. Initially, I thought you were creating many entities with dynamic buffers to represent key/values. I have made sense of your implementation based on the source you provided. My only concern is potential hash collisions, but that isn't a problem for some algorithms.

jooleanlogic said: ↑

Would MemSet make any difference or is zero a special case that gets optimised?

Curious if replacing those two loops with MemSet would make any difference.

I've tested this replacement for Clear() using MemSet; unfortunately, according to my tests it has not improved performance:

Code (CSharp):

// The bucket array holds bucketCapacityMask + 1 entries (the original loop used <=).
UnsafeUtilityEx.MemSet(data->buckets, 0xFF, (data->bucketCapacityMask + 1) * sizeof(int));
UnsafeUtilityEx.MemSet(data->next, 0xFF, data->keyCapacity * sizeof(int));

for (int tls = 0; tls < JobsUtility.MaxJobThreadCount; ++tls)
    data->firstFreeTLS[tls * NativeHashMapData.IntsPerCacheLine] = -1;

data->allocatedIndexLength = 0;

However, I'm very curious whether MemClear improves performance. I attempted to write a MemClear version (see below), but it crashes in builds and in editor testing. @tertle Any tips on how to use MemClear properly? I have never used MemClear before.

Code (CSharp):

// The bucket array holds bucketCapacityMask + 1 entries (the original loop used <=).
UnsafeUtility.MemClear(data->buckets, (data->bucketCapacityMask + 1) * sizeof(int));
UnsafeUtility.MemClear(data->next, data->keyCapacity * sizeof(int));

for (int tls = 0; tls < JobsUtility.MaxJobThreadCount; ++tls)
    data->firstFreeTLS[tls * NativeHashMapData.IntsPerCacheLine] = -1;

data->allocatedIndexLength = 0;

This may be a general problem and not specific to this implementation:
https://bugs.openjdk.java.net/browse/JDK-4343436

I may consider creating something similar to a HashMap but very focused on efficiently clearing and reuse.

Thank you all for the suggestions! It means a lot.

sngdan · Feb 14, 2019

I create many entities with dynamic buffers.
KeyArray[key] -> the keyarray holds the different entities and I pick the one with index key to store a value

tertle · Feb 14, 2019

Mr-Mechanical said: ↑

However, I'm very curious whether MemClear improves performance. I attempted to write a MemClear version, but it crashes in builds and in editor testing. @tertle Any tips on how to use MemClear properly? I have never used MemClear before.


You can't just use memclear, the nativehashmap expects the values to be -1 and won't work unless they are initialized to -1. You need to redesign the entire hashmap to expect 0 to be default.

Mr-Mechanical · Feb 14, 2019

tertle said: ↑

You can't just use memclear, the nativehashmap expects the values to be -1 and won't work unless they are initialized to -1. You need to redesign the entire hashmap to expect 0 to be default.

I see. Do you think it would be somehow possible to test a broken hashmap using memclear for performance gains? I want to see if memclear would make the difference before considering an attempt to redesign on my own.

Currently, I have a test comparing Clear() of a Persistent Hashmap and Dispose of a TempJob NativeArray. Here is the size of each when the performance is the same:

NativeArray: 40000
NativeMultiHashMap: 10000

Could this 4x difference be natural as hashmap Clear() has to do more work? Or could there be some potential for optimization? This discussion so far has been great. Thanks. I will continue to look further into this as well.

tertle · Feb 14, 2019

Just create a simple test of iterating 3 large sets of data and setting them to -1 and then compare it to 3 memclears of the same length.
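
A rough sketch of such a test, in plain managed C# rather than Burst-compiled Unity code, so absolute numbers will not match what NativeHashMap sees in a build. `ClearBenchmark` and its method names are hypothetical. `Array.Clear` stands in for memclear here, and the loop mirrors what Clear() does.

Code (CSharp):

```csharp
using System;
using System.Diagnostics;

public static class ClearBenchmark
{
    // Sets every element to -1 with a plain loop, as Clear() does.
    public static void LoopFill(int[] a)
    {
        for (int i = 0; i < a.Length; ++i)
            a[i] = -1;
    }

    // Zero-fills the array in one bulk call (the memclear analogue).
    public static void BulkClear(int[] a) => Array.Clear(a, 0, a.Length);

    // Applies `action` to every array, `iterations` times,
    // and returns the average time per iteration in milliseconds.
    public static double TimeMs(Action<int[]> action, int[][] arrays, int iterations)
    {
        // Warm-up pass so JIT compilation is not included in the timing.
        foreach (var a in arrays) action(a);

        var sw = Stopwatch.StartNew();
        for (int n = 0; n < iterations; ++n)
            foreach (var a in arrays)
                action(a);
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds / iterations;
    }
}
```

To compare, allocate three large arrays (for example three `new int[1 << 20]`) and call `ClearBenchmark.TimeMs(ClearBenchmark.LoopFill, arrays, 100)` and `ClearBenchmark.TimeMs(ClearBenchmark.BulkClear, arrays, 100)` on the same arrays. Treat the result as an indication only; the ratio inside a Burst job may differ.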

RecursiveEclipse · Feb 14, 2019

I'm working on a NativeHashSet based on NativeHashMap and got close enough done today to be able to test this. For now I'm using memclear to 0, then point all array set/get indexers to a function that either adds/subtracts for set/get respectively. Still haven't touched Reallocate() yet, could use memclear there also.

Clearing then adding 10000 ints per frame, before memclear: 0.135-0.165 ms

After: 0.012-0.018 ms

NativeHashMap has 4 arrays while a NativeList only needs 1 so a ~4x hit compared to NativeList makes sense.

sngdan · Feb 14, 2019

Most of the time, I post from mobile like now. I could post an example later.

@Mr-Mechanical
Do you have pre-determined keys / hashes or do you need those flexible in (a) number and (b) key value? Is the key value of importance to you later or is it just for grouping?

sngdan · Feb 14, 2019

Code (CSharp):

using Unity.Burst;
using Unity.Collections;
using Unity.Entities;
using Unity.Jobs;
using Unity.Mathematics;

namespace EntityHashMapExample
{
    public struct DummyData : IComponentData
    {
        public int Value;
    }

    [InternalBufferCapacity(16)]
    public struct EntityHashMapValues : IBufferElementData
    {
        public Entity entity;
    }

    [UpdateBefore(typeof(EntityHashMapJobSystem))]
    public class SetDummyDataSystem : ComponentSystem
    {
        protected override void OnCreateManager()
        {
            for (int i = 0; i < EntityHashMapJobSystem.numberOfValues; i++)
            {
                EntityManager.CreateEntity(ComponentType.Create<DummyData>());
            }
        }

        protected override void OnUpdate()
        {
            ForEach((ref DummyData data) =>
            {
                data.Value = UnityEngine.Random.Range(0, EntityHashMapJobSystem.numberOfKeys);
            }, GetComponentGroup(typeof(DummyData)));
        }
    }

    public class EntityHashMapJobSystem : JobComponentSystem
    {
        public ComponentGroup key_Group;
        public ComponentGroup value_Group;

        public const int numberOfKeys = 1000;
        public const int numberOfValues = 50000;

        [BurstCompile]
        private struct ClearEntityHashMapJob : IJobParallelFor
        {
            [NativeDisableParallelForRestriction][WriteOnly] public BufferArray<EntityHashMapValues> keyEntities;

            public void Execute(int i)
            {
                keyEntities[i].Clear();
            }
        }

        [BurstCompile]
        private struct FillEntityHashMapJob : IJobProcessComponentDataWithEntity<DummyData>
        {
            [WriteOnly] public BufferArray<EntityHashMapValues> keyEntities;

            public void Execute(Entity e, int i, [ReadOnly] ref DummyData data)
            {
                var key = data.Value;
                keyEntities[key].Add(new EntityHashMapValues{entity = e});
            }
        }

        protected override void OnCreateManager()
        {
            for (int i = 0; i < numberOfKeys; i++)
            {
                EntityManager.CreateEntity(ComponentType.Create<EntityHashMapValues>());
            }
            key_Group = GetComponentGroup(ComponentType.Create<EntityHashMapValues>());
        }

        protected override JobHandle OnUpdate(JobHandle inputDependencies)
        {
            var valuesArray = key_Group.GetBufferArray<EntityHashMapValues>();

            inputDependencies = new ClearEntityHashMapJob
            {
                keyEntities = valuesArray
            }.Schedule(valuesArray.Length, 8, inputDependencies);

            inputDependencies = new FillEntityHashMapJob
            {
                keyEntities = valuesArray
            }.ScheduleSingle(this, inputDependencies); // one thread only! (because I cannot guarantee not to write from multiple threads to the same key)

            inputDependencies.Complete();

            // you can open the entity inspector and watch them

            // get all the values stored in "key 0" as a native array! (note: key is the position in the array - my system simplifies the key part, can be done differently)
            var valuesOfKeyAsNativeArray = valuesArray[0].AsNativeArray(); // this is a pointer to the buffer, nothing is copied

            return inputDependencies;
        }
    }
}

Mr-Mechanical · Feb 14, 2019

RecursiveEclipse said: ↑

I'm working on a NativeHashSet based on NativeHashMap and got close enough done today to be able to test this. For now I'm using memclear to 0, then point all array set/get indexers to a function that either adds/subtracts for set/get respectively. Still haven't touched Reallocate() yet, could use memclear there also.

Clearing then adding 10000 ints per frame, before memclear(.135-.165ms):

After(.012-.018ms):

NativeHashMap has 4 arrays while a NativeList only needs 1 so a ~4x hit compared to NativeList makes sense.

This is a very impressive (roughly tenfold) performance improvement; it's clear memclear makes the difference in this case. I hope a similar optimization is made in the official NativeMultiHashMap. Thank you for sharing.

sngdan · Feb 14, 2019

The buffer solution removes the clearing bottleneck altogether; you won't be able to fill it fast enough.

Mr-Mechanical · Feb 14, 2019

sngdan said: ↑

the buffer solution removes the clearing bottleneck all together, you wont be able to fill it fast enough

Considerations: You cannot write from different threads / in parallel and you might have to complete the job that fills the "Dictionary" (I had to, because I parallel process the values per key later)


This is interesting and thank you for the suggestion. Though unfortunately, being able to add to the hashmap from multiple threads is an absolute must for me.

sngdan · Feb 15, 2019

Parallel only works, if you can ensure not to write to the same hash from multiple threads (like, the clear job in my example)

In my collision system, I have tried 4 different approaches and, funnily enough, my first, fully parallel choice was not the fastest (which I thought it would be). The problem is that I only test on my computer (4 cores) and I don't know how it would behave on different hardware.

The memclear proposal seems great, optimizing one bottleneck, hopefully there is also one for getting the values out into an array (instead of unrolling sequentially)

sngdan · Feb 19, 2019

@Mr-Mechanical

I updated to the new ECS version and as if by magic, it seems parallel write now works with my entity buffer dictionary (must be thread safe now) - i did not see this in the release notes.

Have to test this thoroughly in the evening - maybe I am just hallucinating...

sngdan · Feb 19, 2019

@tertle - did you not maintain a diff or so? can you see anything there?

False alert - it seems like it is still not working.
- I believe that Unity used to crash or did not compile before
- I can now schedule a job in "parallel" that was previously scheduled as "single", and it mostly works, although I get index-out-of-range console errors from time to time (it almost looks like the atomic write to the buffer works but the buffer length/capacity is not adjusted across threads)
- I guess at one point I have to dig into this or make a simple test case to figure out what is happening

GilCat · Feb 23, 2019

I'm having the same problem, and what I did was clear the NativeMultiHashMap inside a job with Burst:

Code (CSharp):

using System;
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;

// Runs Clear() inside a Burst-compiled job.
[BurstCompile]
public struct ClearNativeHashMap<TKey, TValue> : IJob
    where TKey : struct, IEquatable<TKey>
    where TValue : struct {
    [WriteOnly]
    public NativeMultiHashMap<TKey, TValue> MultiHashMap;

    public void Execute() {
        MultiHashMap.Clear();
    }
}

Mr-Mechanical · Feb 23, 2019

GilCat said: ↑

I'm having the same problem and what i did was to clear the NativeMultiHashMap inside a job with burst:


I figured out how to make my hashmap smaller, so Clear() is less of a dramatic problem now. I am interested in your approach. Are you seeing notable gains just from having a Burst-compiled job, or are the gains from not running on the main thread?

sngdan · Feb 23, 2019

With the help of @RecursiveEclipse, it is possible to write to the buffers from a parallel job (with interlocked waiting in case of collision). In my test case this gives similar or slightly better performance than single.

I have a number of versions now with queues, multihashmap, buffers and varying degrees of parallelism.

It has been fun, but I reached a level of micro-optimization that I believe is now hardware specific (i.e. dependent on cores, etc.)

My preference, with good performance, is the buffer solution, mainly because of the ease of access to the stored values (AsNativeArray) and the fast clear.

Razmot · May 28, 2019

Could you use two hashmaps, switch the active one, clear the inactive one later, like a double buffering pattern ?
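
A minimal sketch of that double-buffering pattern, using a managed `Dictionary` and a background `Task` in place of NativeMultiHashMap and a Unity job. `DoubleBufferedMap` is a hypothetical name. One map is filled and read during the current frame while the previous frame's map is cleared off the critical path.

Code (CSharp):

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical double-buffered map: the active map is used this frame
// while last frame's map is cleared in the background.
public sealed class DoubleBufferedMap<TKey, TValue>
{
    private Dictionary<TKey, TValue> active = new Dictionary<TKey, TValue>();
    private Dictionary<TKey, TValue> inactive = new Dictionary<TKey, TValue>();
    private Task pendingClear = Task.CompletedTask;

    // The map to fill and read during the current frame.
    public Dictionary<TKey, TValue> Active => active;

    // Call once per frame: swap the maps and clear the old one off-thread.
    public void Swap()
    {
        // The inactive map must be fully cleared before it becomes active.
        pendingClear.Wait();

        var old = active;
        active = inactive;
        inactive = old;

        pendingClear = Task.Run(() => old.Clear());
    }
}
```

This does not make the clear cheaper; it doubles the memory and moves the cost to a point where nothing is waiting on it. With native containers the equivalent would presumably be scheduling a clear job on the inactive map and completing its handle before the next swap.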

o1o1o1o1o2 · Oct 14, 2019

Strangely, I only had problems with MultiHashMap.Clear on an Android device, not on PC, even though other jobs perform about the same on both. So I added this to NativeHashMap.cs:

Code (CSharp):

[BurstCompile]
public unsafe struct ClearJob : IJobParallelFor
{
    [NativeDisableUnsafePtrRestriction]
    public int* arr;

    public void Execute(int index)
    {
        arr[index] = -1;
    }
}

and replaced Clear() with this:

Code (CSharp):

public static unsafe void Clear(NativeHashMapData* data)
{
    int* buckets = (int*)data->buckets;

    JobHandle clearBucketsJob = new JobHandle();
    if (data->bucketCapacityMask > 0)
    {
        clearBucketsJob = new ClearJob()
        {
            arr = buckets,
        }.Schedule(data->bucketCapacityMask + 1, 1024);
    }

    JobHandle clearNextPtrsJob = new JobHandle();
    if (data->keyCapacity > 0)
    {
        int* nextPtrs = (int*)data->next;
        clearNextPtrsJob = new ClearJob()
        {
            arr = nextPtrs,
        }.Schedule(data->keyCapacity, 1024);
    }

    for (int tls = 0; tls < JobsUtility.MaxJobThreadCount; ++tls)
        data->firstFreeTLS[tls * NativeHashMapData.IntsPerCacheLine] = -1;

    // Wait for both parallel clears before marking the map as empty.
    JobHandle.CombineDependencies(clearBucketsJob, clearNextPtrsJob).Complete();

    data->allocatedIndexLength = 0;
}

I don't know if it is right or wrong, but on my Android device hashmap.Clear went from 100 ms (I have a big hashmap of structs) to 2.5 ms, and I'm happy with that.

o1o1o1o1o2 · Oct 16, 2019

sngdan said: ↑

With the help of @RecursiveEclipse, it is possible to write to the buffers from a parallel job (with interlock waiting in case of collision) - in my test case with similar of slightly better performance than single.

I have a number of versions now with queues, multihashmap, buffers and varying degrees of parallelism.

It has been fun but I reached a level of micro optimization, that i believe is now hardware specific (ie dependent on cores, etc)

My preference and good performance is the buffer solution, mainly because of the ease of access to the stored values (asnative array) and the fast clear

It was a long time ago, but can you show how you used Interlocked in your buffer approach?

sngdan · Oct 16, 2019

here you go, api not up to date... + I don't recommend this, this was for testing only

Code (CSharp):

// Needs System.Threading (Interlocked) and Unity.Collections.LowLevel.Unsafe (GetUnsafePtr).
var bufferLocks = new NativeArray<int>(collisionBufferCandiateArray.Length, Allocator.TempJob);

jobHandle = new SpriteToGridBufferInterlockedJob
{
    grid = myGrid,
    keyArray = collisionBufferCandiateArray,
    bufferLocksArray = bufferLocks
}.ScheduleGroup(sprite_Group, jobHandle);

public struct SpriteToGridBufferInterlockedJob : IJobProcessComponentDataWithEntity<Box>
{
    [ReadOnly] public ColGrid grid;
    [NativeDisableParallelForRestriction, WriteOnly] public BufferFromEntity<CollisionInfoBuffer> keyArray;
    [ReadOnly] public NativeArray<Entity> keyIndexArray;
    [NativeDisableParallelForRestriction, DeallocateOnJobCompletion] public NativeArray<int> bufferLocksArray;

    public void Execute(Entity e, int i, [ReadOnly] ref Box box)
    {
        var boxMinGrid = (int2) ((box.Center - box.Extends - grid.Min) * grid.OneOverCellSize);
        var boxMaxGrid = (int2) ((box.Center + box.Extends - grid.Min) * grid.OneOverCellSize);

        for (int x = boxMinGrid.x; x <= boxMaxGrid.x; x++)
        {
            if (x >= 0 && x < grid.Dim.x)
            {
                for (int y = boxMinGrid.y; y <= boxMaxGrid.y; y++)
                {
                    if (y >= 0 && y < grid.Dim.y)
                    {
                        var pos = x + y * grid.Dim.x;
                        var key = keyIndexArray[pos];

                        // Spin until this thread takes the lock for this cell (0 = free, -1 = held).
                        unsafe
                        {
                            while(Interlocked.CompareExchange(ref ((int*)bufferLocksArray.GetUnsafePtr())[pos], -1, 0) != 0) {}
                        }

                        keyArray[key].Add(new CollisionInfoBuffer{entity = e, box = box});

                        // Release the lock.
                        bufferLocksArray[pos] = 0;
                    }
                }
            }
        }
    }
}
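
The locking pattern in the job above can be sketched without any Unity types. This is a simplified stand-in, assuming a managed `List<int>` per bucket and `Parallel.For` instead of an ECS job; `BucketSpinLocks` and `FillParallel` are hypothetical names. Each bucket has one int lock that a thread acquires with `Interlocked.CompareExchange` before adding to the otherwise non-thread-safe list.

Code (CSharp):

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: one int lock per bucket guards a non-thread-safe list.
public static class BucketSpinLocks
{
    public static List<int>[] FillParallel(int bucketCount, int itemCount)
    {
        var buckets = new List<int>[bucketCount];
        for (int b = 0; b < bucketCount; ++b)
            buckets[b] = new List<int>();

        var locks = new int[bucketCount]; // 0 = free, 1 = held

        Parallel.For(0, itemCount, item =>
        {
            int b = item % bucketCount;

            // Spin until this thread swaps the lock from 0 (free) to 1 (held).
            while (Interlocked.CompareExchange(ref locks[b], 1, 0) != 0) { }

            try
            {
                buckets[b].Add(item);
            }
            finally
            {
                // Release the lock so other threads can write to this bucket.
                Volatile.Write(ref locks[b], 0);
            }
        });

        return buckets;
    }
}
```

A busy-wait like this only pays off when contention per bucket is low and the guarded work is tiny, which fits the "for testing only" caveat above.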

o1o1o1o1o2 · Oct 16, 2019

sngdan said: ↑

here you go, api not up to date... + I don't recommend this, this was for testing only


thanx!!!!

MintTree117 · Apr 10, 2020

Razmot said: ↑

Could you use two hashmaps, switch the active one, clear the inactive one later, like a double buffering pattern ?

Excellent idea!
