Search Unity

  1. Megacity Metro Demo now available. Download now.
    Dismiss Notice
  2. Unity support for visionOS is now available. Learn more in our blog post.
    Dismiss Notice

Question Efficient NativeMultiHashMap element removal?

Discussion in 'Entity Component System' started by JH253Games, Jun 3, 2022.

  1. JH253Games

    JH253Games

    Joined:
    Apr 12, 2022
    Posts:
    5
    I'm working on a project that requires many (Vector2, int, int) tuples to be searched based on an input coordinate.
    I wanted to use jobs for the performance boost, so I'm using a NativeMultiHashMap to store these key-values (as multiple tuples are possible per coordinate).

    Problem is: I need these values to be removed after x updates and this takes a humongous chunk of time to do. We're talking about 5-6ms for removing a single entry, scaling linearly.
    On average the project adds and removes 100's of entries each update, causing massive performance spikes as seen in the profiler results.

    What can I do to make this task more performant?
    I can calculate which elements need to be removed beforehand if that would help.

    The profile results show the time needed to remove 1, 10, 100 and 400 elements respectively (for benchmarking purposes).
    ProfilerResults.png

    As seen in the code sample, I remove the elements with the Remove* function of NativeMultiHashMap.
    pheromones2 is a NativeMultiHashMap and li is a list of the relevant key value pairs.
    I'd do this within a job, but iterating a NativeMultiHashMap efficiently with that is a task by itself :)
    CodeSample.png

    *I'm using Collections 1.2.3, but the documentation shows no version of Remove that takes a key and value, however unity still gives me the option(?).
     
  2. Enzi

    Enzi

    Joined:
    Jan 28, 2013
    Posts:
    954
    The most efficient way is to utilize an enumerator and using the iterator to remove the key value pair.

    Code (CSharp):
    1.  
    2. /// <summary>
    3. /// Removes a single key-value pair.
    4. /// </summary>
    5. /// <param name="it">An iterator representing the key-value pair to remove.</param>
    6. /// <exception cref="InvalidOperationException">Thrown if the iterator is invalid.</exception>
    7. public void Remove(NativeMultiHashMapIterator<TKey> it)
    8. {
    9.     CheckWrite();
    10.     m_MultiHashMapData.Remove(it);
    11. }
    Also, this should happen inside a burst compiled job. NativeContainers outside of jobs are abysmal. You are better off just using a managed Dictionary.

    edit:
    Probably wrong info, someone else would need to clarify. Checks are only half the picture. Maybe it's just bad JIT code. /shrug Previous assumption: The problem with NativeContainers inside managed code is the marshalling overhead between managed <-> unmanaged.
     
    Last edited: Jun 4, 2022
    JH253Games and apkdev like this.
  3. Saniell

    Saniell

    Joined:
    Oct 24, 2015
    Posts:
    191
    I'm not sure about this. Most containers in Unity Collections package are implemented fully in C# except memory allocation (obviously). But after that you get a pointer it doesn't matter whether it's managed or unmanaged.

    I believe real reason they are slow in editor is because of massive amount of safety checks and those are not present in release builds actually
     
    Harry-Wells likes this.
  4. Enzi

    Enzi

    Joined:
    Jan 28, 2013
    Posts:
    954
    Nope, safety checks can be turned off. There's little difference between build and editor when Leak detection and Safety checks are turned off. (Clarification: I mean Burst only, for managed code safety checks can't be turned off in editor)
    About the marshalling, feel free to test this out for yourselves. Even if you have the pointers, you just can't get any memory address and read it in C#. For a C# value to be valid, C# has to allocate it in heap memory, therefore managed memory. (https://stackoverflow.com/questions/6873306/c-to-c-sharp-marshaling/6873489#6873489)

    If you do any NativeContainer method simply on the main thread it will be managed, unbursted and really slow. Every data that has been allocated in unmanaged memory has to be brought into managed space. (edit: I still think this is true but someone else would need to clarify. If it were not the case, why does marshalling exist and why is it so difficult to get values from C++? managed <-> unmanaged is a thing - if this is true here, I can't say for certain)

    If you call the same method from a job, bursted it will exclusively operate on unmanaged memory and won't be slowed down by any C# managed operations to track stuff.
    Were it not for Burst, a managed Dictionary outperforms a NativeHashMap.

    Build or not, your choice in how many tests you want to do. Do the same thing with a bursted and unbursted job and then from main thread in something like OnUpdate and see the difference for yourself.
     
    Last edited: Jun 4, 2022
  5. Saniell

    Saniell

    Joined:
    Oct 24, 2015
    Posts:
    191
    Adding to what was said, disabling safety checks does not actually "disable" them. There is compiler directive ENABLE_UNITY_COLLECTIONS_CHECKS which is enabled in editor no matter if you enable checks or not using top menu in editor.
    I have feeling like that option is for bursted code, maybe?

    Usually inside this directive there are calls to AtomicSafetyHandle and coming from name, I assume it does atomic operations every time you access container to check for thread safe use. You can easily see how this would make any Native container much slower
     
    Harry-Wells likes this.
  6. JH253Games

    JH253Games

    Joined:
    Apr 12, 2022
    Posts:
    5
    Alright, I've tried it this way but it let me to a new problem.
    I've passed an enumerator to the job to find the correct pairs to remove.
    However, once found I need to get an iterator pointing to said pair, for this I need to call pheromones2.TryGetFirstValue and pheromones2.TryGetNextValue.
    Now that I got the iterator I try and remove the pair with the Remove(iterator) method you suggested, but it leads to an InvalidOperationException as both the enumerator and pheromones2 refer to the same container.

    *EDIT*
    I've fixed the above issue by splitting the task into two jobs:
    1. use an enumerator to find the correct pairs and store them in a NativeList of key-value tuples
    2. use the NativeMultiHashMap to find the corresponding iterator for each of them and remove them

    However, using pheromones2.TryGetFirstValue and pheromones2.TryGetNextValue or pheromones2.Remove causes unity to give a burst error - could it have to do with my keys and values being tuples?
     

    Attached Files:

    Last edited: Jun 4, 2022
  7. Enzi

    Enzi

    Joined:
    Jan 28, 2013
    Posts:
    954
    You can tell me I'm wrong but then you have to come up with a better answer because (safety) checks are only half the picture. There's some unexplainable overhead going on or just plain bad JIT code. Which I sincerely doubt to be that massive, the code and internal code for NativeStream reading and NativeList writing is not complicated. Also doesn't explain why managed Lists and Dictionary are performing better.

    Here are some performance test results:
    This test takes a NativeStream, reads all values and writes it to a NativeList.
    upload_2022-6-4_18-46-36.png

    Feel free to explain 3.5x speed up for code that is only and reading and writing a value to "only" unmanaged memory without any vectorization.

    Bonus: Bursted test with safety checks enabled
    upload_2022-6-4_18-55-33.png

    some more pretty pictures:
    editor:
    NSTest_Editor.JPG

    build mono:
    NSTest_Build_Mono.JPG
    build il2cpp:
    NSTest_Build_IL2CPP.JPG

    Burst stays the same in all cases. As I said, turning safety checks/leak detection off nets you build performance. Now in mono build the main thread got halved but is still 2 times slower than burst. Only in il2cpp which pretty much leaves C# land and has its own optimisations the main thread task is comparable to burst. Interestingly the scheduled unbursted job has nearly double the time of the bursted one.

    @JH253Games
    Try it this way. Never had any issues with it.
    Code (CSharp):
    1. bool found = mapEntityToCombatEffects.TryGetFirstValue(element.spellOwner, out var item, out var iterator);
    2.  
    3. while (found)
    4. {
    5.     if (item.mainEffectEntity == mainEffectEntity)
    6.         mapEntityToCombatEffects.Remove(iterator);
    7.  
    8.     found = mapEntityToCombatEffects.TryGetNextValue(out item, ref iterator);
    9. }
     
    Last edited: Jun 4, 2022
  8. JH253Games

    JH253Games

    Joined:
    Apr 12, 2022
    Posts:
    5
    I'm assuming this code fragment is part of a burst compiled job?
    As mentioned before, for some reason calling TryGetFirstValue on my NativeMultiHashMap of (int, int, int) tuple keys and (Vector2, int, int) tuple values results in burst compilation errors, so I can't do that for some reason :')
     
  9. Saniell

    Saniell

    Joined:
    Oct 24, 2015
    Posts:
    191
    We were talking about difference between Native Collections and Managed Collections, burst is out of topic.
    Besides, I wasn't tying to prove that managed C# is as fast as burst, I was pointing out that performance difference between default C# containers and Unity Collections is not because of "marshaling" cost you talk about. That's all I'm saying. Please stop disproving statements noone made
     
    Harry-Wells likes this.
  10. Enzi

    Enzi

    Joined:
    Jan 28, 2013
    Posts:
    954
    Yeah, sorry. This thread got derailed a little.
    The first error message is because you use the hashmap twice. There should be no reason to have it twice in a job.

    The second is because tuple doesn't implement IEquatable<TKey> which is needed for any hash map key. I'm surprised a tuple even works in compiling.

    Maybe it's best to just scrap the tuples and use int or a struct with the keys and a custom hashing method. You can find some good code for prime hashing on stackoverflow that can take multiple values and return you an integer.
    The code I posted works for everything. Burst or not.

    Why is burst "out of topic"? I was the one who brought it up. I don't care too much if it is marshalling. If it's not, then so be it. I happily admit being wrong. There's still a large unexplained gap between it.

    My initial quote:
    Code (CSharp):
    1. Also, this should happen inside a burst compiled job. NativeContainers outside of jobs are abysmal. You are better off just using a managed Dictionary.
    Which still stands. 3-9 lost milliseconds even in IL2CPP is a lot for reading/writing and NativeContainers are not good outside of Burst. It gets worse the more math heavy it gets. Native(Multi)HashMap performance especially tanks outside of Burst. I think that's good info worth sharing.

    edit:
    I also made quite the mistake because I just rewrote a parallel test job to a single schedule and the NativeList.ParallelWriter adds quite the overhead. I rewrote it to a straight NativeList

    upload_2022-6-6_6-10-41.png
    These are the correct numbers now. 14ms overhead in a IL2CPP build. I was pretty sure that numbers are not even close and this is more like it now.
     
    Last edited: Jun 6, 2022
  11. Saniell

    Saniell

    Joined:
    Oct 24, 2015
    Posts:
    191
    You are comparing Native Hash Map to Bursted Native Has Map
    If you want to prove that "we are better off using managed dictionaries" then how about comparing to, well, dictionary. In release build.

    Because right now you are testing 2 compilers, not 2 collection types
     
  12. Enzi

    Enzi

    Joined:
    Jan 28, 2013
    Posts:
    954
    Of course I'm comparing it against Burst on the DOTS forum ... I am perplexed that doing it not in Burst is even a question, but you seem reluctant so:

    500k adds on main thread for a managed Dictionary and unmanaged NativeHashMap in il2cpp build:
    upload_2022-6-6_8-58-7.png
     
  13. vectorized-runner

    vectorized-runner

    Joined:
    Jan 22, 2018
    Posts:
    396
    It looks like you need to store your tuples as dictionary as well if you want fast removal.
    If you can index your input, you could use UnsafeList of NativeHashMap, then O(1) remove the items