Hi all,

I'm a bit new to DOTS. For a project I needed a fast KNN library, and I ended up writing one with DOTS! https://github.com/ArthurBrussee/KNN

It was fun to experiment with. Some feedback I came across while developing this:

- Burst is still super crash prone :/
- Not being able to allocate containers in Burst jobs is really a bummer.
- When scheduling non-overlapping slices of the same array, Unity still errors. It would be great if the safety system could account for that.
- I really want an easier way to do ref returns. There is a way to do it now with UnsafeUtilityEx, but it's a hassle.
- Burst-compiled delegates can't come soon enough!
- I wish there was a job type like IJobParallelForBatch where the preamble can be any kind of code and only the tight loop after it is Burst-ified. Right now, if you need some more complex setup per batch, you essentially have to resort to manually spawning a number of jobs.

But other than that, it was good fun to see performance jump up by a factor of 10 when using DOTS.
You can allocate native containers in a Burst-compiled job without problems using Allocator.Temp (NativeQueue excluded).
A couple of thoughts... Something like KNN seems quite low level. Creating a container on the same level as NativeList / NativeArray would make sense, so I would say it should probably be built as a container written with unsafe code. Essentially you assume all risk for safety and leaks, but you use AtomicSafetyHandle etc. to enforce that all usage of your API is completely safe. This also means you can allocate and deallocate as much data as you like, with any allocator label, in a Burst-compiled job. With that setup it's obviously super important to write unit tests and stress tests to ensure that your implementation is correct and prevents incorrect usage. If you look at the unit tests for BlockStream, you will see that half of them test that incorrect usage patterns throw the expected exceptions. Also, IJobParallelForDefer is likely what you want. Check out the Physics BlockStream struct for an example of a specialized container.
https://jacksondunstan.com/articles/4734 https://docs.unity3d.com/Manual/JobSystemNativeContainer.html
Yeah we have some really bad stuff with exception handling in there now. It should all be resolved in 19.3.
Also, if the speedup you are getting is only 10x, there is likely something going very wrong. For this kind of code, the combination of data layout, Burst, and jobs should give speedups somewhere on the order of > 100x.
Thanks Joachim! Great point on equating this to a NativeXXX<> container, that makes a lot of sense. I've integrated some of Unity's safety system now. No idea if I'm missing part of it though. The 10x speedup is just no Burst -> Burst (I just checked again, and it's actually ~18x). Versus the original pure non-jobified C# code it's definitely even more, but I haven't measured enough along the way. The main demo is mostly waiting on the tree to rebuild, which jumps all over the place in memory and is only single-threaded... I'll see if I can multithread that someday. @eizenhorn: Whoops, I see why I remembered this now. Allocating is fine, but calling Dispose() is what breaks Burst. I see now that you just shouldn't Dispose Temp memory, which makes sense! Removing that allows for a much cleaner API, and I now run multiple queries as an IJobParallelForBatch, perfect! Thanks for the help!
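For readers following along, here is a minimal sketch of the pattern described above: an IJobParallelForBatch that allocates a per-batch scratch buffer with Allocator.Temp and never calls Dispose() on it. This is not the KNN library's code. The job name, its fields, and the brute-force distance search are assumptions made up for illustration, and it assumes the Burst, Collections, Jobs, and Mathematics packages are installed.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using Unity.Mathematics;

// Hypothetical job: for each query, finds the squared distance to its K-th
// nearest point by brute force. Not taken from the KNN library.
[BurstCompile]
public struct KthNearestDistanceJob : IJobParallelForBatch
{
    [ReadOnly] public NativeArray<float3> Points;
    [ReadOnly] public NativeArray<float3> Queries;
    public int K; // assumed to be >= 1 and <= Points.Length

    // One result per query; each batch only writes inside its own range.
    public NativeArray<float> KthDistanceSq;

    public void Execute(int startIndex, int count)
    {
        // Per-batch scratch buffer, reused for every query in this batch.
        var best = new NativeArray<float>(K, Allocator.Temp);

        for (int q = startIndex; q < startIndex + count; q++)
        {
            for (int j = 0; j < K; j++) best[j] = float.MaxValue;

            for (int p = 0; p < Points.Length; p++)
            {
                float d = math.distancesq(Points[p], Queries[q]);
                if (d >= best[K - 1]) continue;

                // Insert d into the ascending list of the K best distances.
                int slot = K - 1;
                while (slot > 0 && best[slot - 1] > d)
                {
                    best[slot] = best[slot - 1];
                    slot--;
                }
                best[slot] = d;
            }

            KthDistanceSq[q] = best[K - 1];
        }
        // Deliberately no best.Dispose(): per the thread, Temp memory is
        // left for Unity to reclaim.
    }
}
```

Scheduling would then look roughly like `job.ScheduleBatch(queries.Length, 64).Complete();`, where 64 is an arbitrary batch size picked for the example.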
XavierM integrated this and after some debugging saw a >200x speedup https://twitter.com/N3zix/status/1138773601858064384 DOTS is pretty fast! This is for 100.000 points running 100.000 queries
Very cool! Thanks for sharing! You have 3 big changes in this commit: https://github.com/ArthurBrussee/KN...814e89d#diff-18ac8c221cbfcbd1aa2af137198e5dfb They are using an index, doing fewer lookups, and using a pointer. While the index and the fewer lookups seem like the more drastic performance improvements, I'm interested in how much difference the pointer makes. Is it worth it to write it like this?
This was at 00:30 at night, so I didn't exactly do due diligence on measurements here, but from what I remember the gain from changing to a ref variable was really minimal, if noticeable at all. I think LLVM probably already does a pretty good job of not doing that copy, but then again, never trust a compiler. (So to answer your question: no, it's not worth it 99% of the time, but this is meant as a low-level lib and the KdNode struct is somewhat larger than ideal.)
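To make the copy-versus-ref distinction concrete, here is a small plain C# sketch. It uses a managed array rather than a NativeArray so that it runs outside Unity, and the KdNode fields shown are hypothetical, not the library's actual layout. It only demonstrates the semantics (a ref local aliases the element instead of copying the struct) and makes no claim about how much time this saves.

```csharp
using System;

// Hypothetical node layout for illustration only; not the library's KdNode.
public struct KdNode
{
    public float SplitValue;
    public int Axis;
    public int Left;
    public int Right;
    public int Count;
}

public static class RefLocalDemo
{
    // Copies the whole struct into a local, so the increment is lost.
    public static void IncrementViaCopy(KdNode[] nodes, int i)
    {
        KdNode node = nodes[i];
        node.Count++;
    }

    // Aliases the array element: no struct copy, and the write lands in the array.
    public static void IncrementViaRef(KdNode[] nodes, int i)
    {
        ref KdNode node = ref nodes[i];
        node.Count++;
    }

    public static void Main()
    {
        var nodes = new KdNode[1];

        IncrementViaCopy(nodes, 0);
        Console.WriteLine(nodes[0].Count); // prints 0

        IncrementViaRef(nodes, 0);
        Console.WriteLine(nodes[0].Count); // prints 1
    }
}
```

Whether the ref version is actually faster depends on what the compiler does with the copy, which matches the "minimal, if noticeable at all" observation above.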
I'd really love to see more of this officially from Unity. What I'm realizing more and more is that game engines should provide building blocks. One problem we ran into recently on our Unreal game was wanting to swap out our voxel-based navigation system for something sparser, à la octree or KD-tree. I really wish all of these were available with a similar interface so we could have swapped between them easily.