BenchmarkNet (Stress test for ENet, UNet, LiteNetLib, Lidgren, MiniUDP, Hazel, Photon and others)

Discussion in 'UNet' started by nxrighthere, Jan 13, 2018.

Thread Status:
Not open for further replies.
  1. JesseLord

    JesseLord

    Joined:
    Jan 5, 2015
    Posts:
    3
    Any more updates on UNet?
     
  2. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    I think they are busy due to GDC, but I'll ask them how things are going.

    I offered Alex my help several times, but he said that he'll fix it himself. :oops:

    On my side it's hard to find where the problem is, because the library runs outside of the .NET environment and I can't reverse-engineer it. I dug into it with WinDbg, but the UNet assembly is like a black box in which you are blindly trying to find something.
     
    Last edited: Mar 27, 2018
  3. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Work on UNet is still in progress, and there's some good news. Alex changed a lot of code in the library, and on his machine he now successfully runs the test with 8000 simultaneous connections using only one server thread. There is still room for improvement, but he is very busy with new top-priority work. The library will be updated as soon as he has time for it (most likely next week if everything goes smoothly).
     
    Last edited: Mar 24, 2018
  4. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    @Zuntatos Valve is about to open their low-level transport layer. When they upload the source code, I'm going to write C# bindings and try to integrate it with the application. So yea, no need to bother with the Steamworks API.
     
    Last edited: Apr 3, 2018
  5. hjupter

    hjupter

    Joined:
    Dec 23, 2011
    Posts:
    628
    Munchy2007 likes this.
  6. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Well, I'm adding support only for low-level libraries and standalone servers, because of the idea behind this test. When you are dealing with solutions that run locally on your end, you can debug everything and gather any data you want. You can fix bugs or try to improve something, then just recompile and test again. Cloud services are a very different thing. This tool is not an appropriate solution for testing them.
     
    Deleted User likes this.
  7. Zuntatos

    Zuntatos

    Joined:
    Nov 18, 2012
    Posts:
    612
  8. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    They are still working on it. I can't compile it for now due to this issue.
     
    DMeville and Deleted User like this.
  9. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    I think it might be interesting for some people here, so take a look at HumbleNet. It's a lightweight, reliable P2P networking library that connects peers across browsers and standalone platforms using a signaling server. HumbleNet supports Unity, is pretty easy to integrate, has an example project, and works well. The source code is available on GitHub, as is the Quake 3 demo.
     
    Last edited: Apr 1, 2018
  10. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    That's what I call a rapid issue resolution. I built it successfully. Road to C# bindings.
     
    mischa2k and Deleted User like this.
  11. mischa2k

    mischa2k

    Joined:
    Sep 4, 2015
    Posts:
    4,347
    Interesting. Is a Valve networking benchmark planned?
     
  12. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Yea, I'm working on it right now.
     
    Last edited: Apr 7, 2018
    akuno, Kirsche, Deleted User and 2 others like this.
  13. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Everything is almost done, but there is another issue.
     
    Deleted User and Kirsche like this.
  14. Deleted User

    Deleted User

    Guest

    Keep it up! Thank you for your effort!
     
  15. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Unfortunately, fellas, the library doesn't work for me. It's initializing fine, but everything else is not working. I spent almost the whole day trying to debug this, but no success. I'm tired.
     
    Last edited: Apr 11, 2018
    DMeville and Deleted User like this.
  16. PrimeDerektive

    PrimeDerektive

    Joined:
    Dec 13, 2009
    Posts:
    3,090
    With regard to the UNet memory leak, how long does BenchmarkNet run, i.e. how fast does it accumulate? Does this make UNet effectively useless?
     
  17. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    ~12.5 megabytes for the initialization of each client and ~700 kilobytes per second for 64 clients during the process, until they stop sending messages. Also, CPU usage is ~84%, whereas the .NET library does the same job at ~13% with the same number of logical threads per client.

    The source code is closed, and all we can do is wait months for an updated version. The verdict is up to you.
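
    To put those figures in perspective, here is a rough projection. It assumes the growth stays linear, that the ~700 KB/s is the total for all 64 clients, and decimal units (1 MB = 1000 KB); the one-minute and one-hour durations are hypothetical and were not measured in the thread.

```python
# Figures quoted in the post above; everything else is an assumption.
INIT_MB_PER_CLIENT = 12.5     # ~12.5 MB allocated per client at initialization
LEAK_KB_PER_SECOND = 700.0    # ~700 KB/s, assumed to be the total for 64 clients
CLIENTS = 64


def projected_memory_mb(seconds: float) -> float:
    """Initial allocation plus linearly accumulated growth, in megabytes."""
    initial = INIT_MB_PER_CLIENT * CLIENTS
    leaked = LEAK_KB_PER_SECOND * seconds / 1000.0  # KB -> MB (decimal)
    return initial + leaked


initial_mb = projected_memory_mb(0)
after_one_minute_mb = projected_memory_mb(60)
after_one_hour_mb = projected_memory_mb(3600)
print(initial_mb, after_one_minute_mb, after_one_hour_mb)
```

    Under these assumptions the 64 clients start at about 800 MB and would add roughly 42 MB per minute for as long as they keep sending messages.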
     
    Last edited: Apr 7, 2018
  18. PrimeDerektive

    PrimeDerektive

    Joined:
    Dec 13, 2009
    Posts:
    3,090
    Thanks for the response... is it only on the relay server or am I missing something? When I host a server and connect a client to myself and leave it running for an hour my memory allocation is the same as when I start.
     
  19. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    The answer is here.
     
  20. PrimeDerektive

    PrimeDerektive

    Joined:
    Dec 13, 2009
    Posts:
    3,090
    I see (I think, I've never even heard of that product). So I assume if I'm just hosting with the basic HLAPI and headless Unity instances on EC2, the leak doesn't really apply to me?
     
  21. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Nah, LLAPI/HLAPI is unaffected and as far as I know, some improvements already shipped with the latest updates for Unity.
     
  22. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Well, I finally found where the problem is...

    It looks like Valve doesn't test different build toolchains well, but they are fixing everything pretty fast.
     
    Last edited: Apr 6, 2018
    DMeville likes this.
  23. goldbug

    goldbug

    Joined:
    Oct 12, 2011
    Posts:
    767
  24. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    No, I didn't. Other people tested it.
     
  25. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    I'm stuck with another problem... Debugging such things is a long process, so I don't know when (and if) it will be resolved. If any C# interop guru reads this thread, I'd be glad for any help.
     
    Last edited: Apr 7, 2018
  26. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Last edited: Apr 7, 2018
  27. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    BenchmarkNet 1.08 has been released.
    • Added new debugging options
    • Updated DarkRift to the latest version
    • Improved performance of ENet wrapper
    • Improved measurement of length for transmitted data
    • Improved overall functionality
    • Fixed clients drop for Neutrino
    • Fixed string allocations during the process

    As always the results were updated for all libraries, and here you can find information about new debugging options.

    Man, updating all this stuff manually makes me tired every time, you know? :D

    Last week I read Writing High-Performance .NET Code by Ben Watson and found an interesting thing about P/Invoke:

    [Image: excerpt from Writing High-Performance .NET Code by Ben Watson]

    I tried using it with the ENet wrapper, and it works. Not a huge impact in this case, but still a bit better. This attribute can be applied to a whole class that declares the native interop functions, and all of them will be affected.

    Neutrino now passes the test with 500 and 1000 simulated clients. After debugging, I found that the problem was in the peer connection timeout. The time interval was too short, and that was causing clients to drop. Now it's fixed.

    Also, I eliminated almost all in-process memory allocations in the application's functions, except those caused by the TPL.

    I have an idea how to solve this but I need to read a couple more things before doing it.

    In general, the application is now more optimized but yea, there is still some work to do.

    By the way, we recently talked with Alex about how things are going with UNet, and they still have a few problems that must be solved. I'm also impressed that UNet has a Replay Protector, and it now processes packets much faster than before thanks to some improvements. Can't wait for an updated version to see how it works after all the changes they made.
     
    Last edited: Jul 16, 2018
    moco2k and Deleted User like this.
  28. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    UNet Server 1.0.0.9 is out. Now I just need to change some code in the application before running the test.
     
    Last edited: Apr 14, 2018
  29. SimpuKR

    SimpuKR

    Joined:
    Apr 13, 2018
    Posts:
    1
    Hope to see big improvements! ☺
     
    nxrighthere likes this.
  30. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    BenchmarkNet 1.09 has been released.
    • Added parameters calculation for UNet
    • Added timeout calculation for Neutrino
    • Updated UNet to the latest version
    • Updated LiteNetLib to the latest version
    • Updated DarkRift to the latest version
    • Rebuilt Lidgren with optimized functionality
    • Improved detection of initial failure
    • Improved detection of server thread failure

    The results of the UNet were updated.

    Yes, it finally happened! I've updated the UNet library, tuned the new parameters a lot, and now we have a completely different picture: moderate CPU usage, no more memory leaks, faster processing time in high-load scenarios, and lower bandwidth usage. However, a few problems remain. First, memory consumption is high compared to other networking libraries. Memory no longer grows insanely like before, but memory is still allocated for each client during initialization. Second, latency is too high when more than 800 simulated clients are connected to the server. @aabramychev knows about these problems, and he will try to solve them as soon as possible. We can expect even better performance, most likely this month.
     
    Last edited: Aug 21, 2018
  31. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    By the way, I implemented a pre-allocation mechanism for tasks, but the funny thing is that it didn't affect the results. The cost of memory allocation is very cheap, so yea, I didn't add this feature to the application. It's just a waste of time, and lines of source code.
     
    Last edited: Jul 17, 2018
    -chris likes this.
  32. snacktime

    snacktime

    Joined:
    Apr 15, 2013
    Posts:
    3,356
    Sorry it took so long to get this out. It's a bit rough; I'll probably polish it up some over the next week or so.

    FYI this is basically the collection of techniques for optimizing data for realtime games that I've worked out over the years.

    https://github.com/gamemachine/MultiplayerSpaceEfficiency
     
    Deleted User and nxrighthere like this.
  33. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Ah, that's nice, Chris. Personally, I've never used Protobuf, for reasons. I'm just in love with MessagePack. :D

    By the way, I would add an integer encoding to these techniques. In some cases, this thing really helps.
     
    Last edited: Apr 19, 2018
    unlikelysurvival likes this.
  34. TwoTen

    TwoTen

    Joined:
    May 25, 2016
    Posts:
    1,168
    nxrighthere likes this.
  35. snacktime

    snacktime

    Joined:
    Apr 15, 2013
    Posts:
    3,356
    That's what varint encoding does. Hence why protobuf. You get it all in one package.
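
    For readers unfamiliar with the technique, here is a minimal sketch of base-128 varint encoding, the scheme Protocol Buffers uses for integers on the wire: seven payload bits per byte, with the high bit set on every byte except the last. The function names are illustrative and not taken from any library mentioned in this thread.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (little-endian groups)."""
    if value < 0:
        raise ValueError("plain varints only cover non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low 7 bits of what is left
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)         # last byte has the high bit clear
            return bytes(out)


def decode_varint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one varint starting at offset; return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7


small = encode_varint(1)      # fits in a single byte
medium = encode_varint(300)   # needs two bytes
print(small.hex(), medium.hex(), decode_varint(medium))
```

    Small values such as entity IDs or tick deltas take one byte instead of a fixed four, which is where the space saving comes from.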
     
  36. TwoTen

    TwoTen

    Joined:
    May 25, 2016
    Posts:
    1,168
    Well, protobuf isn't that nice on your performance, especially not for realtime games, mainly due to heap allocation. The BitWriter we have essentially has a list pool where you can stack objects. Thus it doesn't expand, and you don't have to allocate when writing. You can write to a pre-allocated buffer. So internally, the MLAPI uses this, and it results in almost no allocations when writing the headers.

    Flatbuffers is also very interesting though, it allows random read access etc.
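
    The idea of writing into a pre-allocated buffer can be sketched as follows. This is not the MLAPI BitWriter; the class and method names are hypothetical, and the byte layout is an arbitrary choice for illustration. Python still allocates small temporary objects internally, so the sketch only shows the reuse pattern, not a true zero-allocation path.

```python
import struct


class BufferWriter:
    """Writes fixed-size fields into one reusable, pre-allocated buffer."""

    def __init__(self, capacity: int) -> None:
        self.buffer = bytearray(capacity)  # allocated once, reused per message
        self.position = 0

    def reset(self) -> None:
        """Start a new message without allocating a new buffer."""
        self.position = 0

    def write_u16(self, value: int) -> None:
        struct.pack_into("<H", self.buffer, self.position, value)
        self.position += 2

    def write_f32(self, value: float) -> None:
        struct.pack_into("<f", self.buffer, self.position, value)
        self.position += 4

    def written(self) -> memoryview:
        """View of the bytes written so far; no copy is made."""
        return memoryview(self.buffer)[: self.position]


writer = BufferWriter(64)
backing = writer.buffer
writer.write_u16(7)       # e.g. a message-type header (hypothetical)
writer.write_f32(1.5)
first = bytes(writer.written())
writer.reset()
writer.write_u16(8)
second = bytes(writer.written())
print(first.hex(), second.hex())
```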
     
  37. snacktime

    snacktime

    Joined:
    Apr 15, 2013
    Posts:
    3,356
    varint encoding variants have a lot of research being done on them, like this here:

    https://lemire.me/blog/2017/09/27/stream-vbyte-breaking-new-speed-records-for-integer-compression/

    id compression is a big deal in stuff like search engines.

    FYI heap allocation is not really a protocol buffer issue per se. I have no per-message allocation in my setup using protobuf-net combined with DotNetty, through a combination of ArrayPool and ByteBuffers.

    Flatbuffers is good on memory, bad on space. Not really suited for realtime games.
     
    TwoTen likes this.
  38. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    @snacktime Oh, now I see.

    Great stuff guys!

    The thing that I don't like in those serialization libraries is a schema pre-compilation. This is one of the reasons why I prefer MessagePack where a class/struct itself is the schema.

    Yea, the buffer pooling is used everywhere typically. I've backported the System.Buffers from .NET Core to Unity with some changes to keep it thread-safe with .NET 3.5.
     
    Last edited: Apr 20, 2018
  39. Deleted User

    Deleted User

    Guest

    Hi there. Do you know how to specify ZigZag encoding in a .proto file, instead of at runtime serialization?
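
    For context, ZigZag encoding maps signed integers to unsigned ones so that values close to zero, negative or positive, stay small and therefore stay short as varints; in the Protocol Buffers wire format it is what the sint32 and sint64 field types use. A minimal sketch of the mapping itself, with illustrative function names:

```python
def zigzag_encode(n: int, bits: int = 32) -> int:
    """Map a signed integer to unsigned: 0, -1, 1, -2, 2 ... -> 0, 1, 2, 3, 4 ..."""
    mask = (1 << bits) - 1
    # The arithmetic right shift yields all ones for negatives, zero otherwise.
    return ((n << 1) ^ (n >> (bits - 1))) & mask


def zigzag_decode(z: int) -> int:
    """Inverse of zigzag_encode."""
    return (z >> 1) ^ -(z & 1)


encoded = [zigzag_encode(n) for n in (0, -1, 1, -2, 2)]
decoded = [zigzag_decode(z) for z in encoded]
print(encoded, decoded)
```

    Without this step, a negative number stored as a plain varint takes the maximum number of bytes because its high bits are all set.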

     
  40. snacktime

    snacktime

    Joined:
    Apr 15, 2013
    Posts:
    3,356
    .proto files aren't generally used with protobuf-net, in preference of the more idiomatic approach with attributes.
     
  41. snacktime

    snacktime

    Joined:
    Apr 15, 2013
    Posts:
    3,356
    Added a section on the basic approach to zero GC serialization/deserialization. Uses protobuf-net as the example but should work for any library that provides a Merge functionality for deserialization. Also uses System.Buffers, although creating your own byte[] pool isn't hard.
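
    As a rough illustration of how small a hand-rolled byte-array pool can be, here is a sketch. It is not the code from the linked repository and it is much simpler than System.Buffers' ArrayPool: a single fixed buffer size, no thread-safety guarantees, and hypothetical names throughout.

```python
class BytePool:
    """Hands out fixed-size bytearrays and takes them back for reuse."""

    def __init__(self, buffer_size: int, count: int) -> None:
        self._buffer_size = buffer_size
        # Pre-allocate everything up front so steady-state use allocates nothing.
        self._free = [bytearray(buffer_size) for _ in range(count)]

    def rent(self) -> bytearray:
        """Return a pooled buffer, or allocate a fresh one if the pool is empty."""
        if self._free:
            return self._free.pop()
        return bytearray(self._buffer_size)

    def give_back(self, buffer: bytearray) -> None:
        """Return a buffer to the pool; wrongly sized buffers are dropped."""
        if len(buffer) == self._buffer_size:
            self._free.append(buffer)


pool = BytePool(buffer_size=1024, count=2)
first = pool.rent()
pool.give_back(first)
second = pool.rent()          # the same object comes back, no new allocation
print(first is second, len(second))
```

    The caller is responsible for returning each buffer exactly once and for not touching it afterwards, the same discipline any rent/return pool requires.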
     
    nxrighthere likes this.
  42. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Well, since you shared this, people will need the buffers library itself. So, [link removed] backported version for Unity.

    By the way, I would like to know how you handle scope/area of interest. :rolleyes:
     
    Last edited: Oct 10, 2018
    snacktime likes this.
  43. snacktime

    snacktime

    Joined:
    Apr 15, 2013
    Posts:
    3,356
    As in tracking stuff in range of a point?
     
  44. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Yep. I know that you are using a concurrent fixed array for this, so I think it would be nice if you added more information about it.
     
  45. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Great news guys. Valve added a flat interface, so one more attempt to integrate it with the application.
     
    Deleted User likes this.
  46. snacktime

    snacktime

    Joined:
    Apr 15, 2013
    Posts:
    3,356
    So I actually use a couple of different approaches depending on the context.

    Originally I started out using spatial hashing.

    Spatial hashing is popular but it has a few downsides.

    - It doesn't give you exact precision on distance.
    - You need to create a separate hash with different cell sizes for each distance range you want to query with.

    The good thing about it is that it scales well. It doesn't matter whether you have 200 or 200,000 entities, the performance is the same. The base cost for updating and querying is higher, though. Query results have to be read from cells and written to an array. A non-alloc API for it is easy enough, but it is considerably slower than just iterating over a single array. How it scales is where it shines.
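
    A minimal sketch of a 2D spatial hash, to make the trade-offs above concrete. It is an illustration under assumed names (SpatialHash, update, query) and not the implementation discussed in this thread. Note how the query is coarse: it returns everything in the 3x3 block of cells around the point, so exact distances still have to be checked afterwards.

```python
from collections import defaultdict


class SpatialHash:
    """Buckets entity ids by grid cell so lookups only touch nearby cells."""

    def __init__(self, cell_size: float) -> None:
        self.cell_size = cell_size
        self.cells = defaultdict(set)   # (cx, cy) -> ids in that cell
        self.positions = {}             # id -> last known (x, y)

    def _cell(self, x: float, y: float):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def update(self, entity_id: int, x: float, y: float) -> None:
        """Insert or move an entity; only the affected cells are modified."""
        new_cell = self._cell(x, y)
        old = self.positions.get(entity_id)
        if old is not None and self._cell(*old) != new_cell:
            self.cells[self._cell(*old)].discard(entity_id)
        self.cells[new_cell].add(entity_id)
        self.positions[entity_id] = (x, y)

    def query(self, x: float, y: float) -> list:
        """Ids in the 3x3 block of cells around (x, y); a coarse candidate set."""
        cx, cy = self._cell(x, y)
        found = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                found.extend(self.cells.get((cx + dx, cy + dy), ()))
        return found


grid = SpatialHash(cell_size=10.0)
grid.update(1, 5.0, 5.0)
grid.update(2, 19.0, 5.0)    # 14 units away, but in a neighbouring cell
grid.update(3, 35.0, 5.0)    # three cells away
near_before = sorted(grid.query(5.0, 5.0))
grid.update(3, 12.0, 5.0)    # entity 3 moves into a neighbouring cell
near_after = sorted(grid.query(5.0, 5.0))
print(near_before, near_after)
```

    Supporting a second query range efficiently would need another instance with a different cell_size, which is the second downside listed above.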

    Just a note on spatial hashing vs quad trees. Quad trees give more precision. But most require regenerating the entire tree when you update anything. Generally these work best for static data, where you are not adding/removing entities and the entities don't move.

    The thing is I think the norm is that you care about the precision, and are working with a relatively small number of entities, few hundred at most. And linear iteration plus Vector2 distance checking is really quite cheap in that case.
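
    The linear approach amounts to something like the sketch below. Comparing squared distances to avoid a square root is a common micro-optimization added here for illustration; it is not something the post states. The function name is hypothetical.

```python
def within_range(positions, center, radius):
    """Indexes of all 2D positions whose distance to center is <= radius."""
    cx, cy = center
    radius_squared = radius * radius
    return [
        index
        for index, (x, y) in enumerate(positions)
        # Squared distance keeps the comparison exact without calling sqrt.
        if (x - cx) ** 2 + (y - cy) ** 2 <= radius_squared
    ]


entities = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
in_range = within_range(entities, center=(0.0, 0.0), radius=5.0)
print(in_range)
```

    With a few hundred entities this is one pass over a contiguous array with exact results and no extra bookkeeping.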

    The concurrent array thing was to find something that worked well for the linear search pattern. The .NET concurrent structures that were appropriate, like ConcurrentDictionary, allocate when iterating the values because everything is in buckets. There is no single backing array, so it has to allocate a new list on every call to Values.

    So the concurrent array has a single backing array. It has a concurrent queue and a concurrent dictionary to manage entity IDs and map them to backing array indexes. It uses an optimistic lock when writing to the backing array. You can access entities by ID as well as iterate over the backing array directly.

    So it's guaranteed to write a complete entity safely or not at all to the backing array. But it's not guaranteed that the write itself won't fail. This is done via Interlocked.CompareExchange and we just ignore the result. But we don't really care about that because the only cases where you might have two threads writing the same entity are for stuff like when you remove the entity.
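
    The structure described above can be approximated in a few lines. This is only a sketch under heavy assumptions: Python has no equivalent of Interlocked.CompareExchange, so a small lock stands in for the hardware compare-and-swap, plain dict and deque replace the concurrent collections, and all names are hypothetical. Entities are treated as immutable values, so a write swaps in a whole new entity or does nothing.

```python
import threading
from collections import deque


class SlotArray:
    """Single backing array plus an id -> index map and a queue of free slots."""

    def __init__(self, capacity: int) -> None:
        self.slots = [None] * capacity           # the single backing array
        self.free_indexes = deque(range(capacity))
        self.index_by_id = {}
        self._cas_lock = threading.Lock()        # emulates an atomic CAS only

    def _compare_exchange(self, index, new_value, expected):
        """Store new_value only if the slot still holds expected; return the old value."""
        with self._cas_lock:
            current = self.slots[index]
            if current is expected:
                self.slots[index] = new_value
            return current

    def add(self, entity_id, entity) -> int:
        index = self.free_indexes.popleft()      # raises IndexError when full
        self.slots[index] = entity
        self.index_by_id[entity_id] = index
        return index

    def try_update(self, entity_id, new_entity) -> bool:
        """Optimistic write: succeeds only if nobody replaced the slot meanwhile."""
        index = self.index_by_id.get(entity_id)
        if index is None:
            return False
        snapshot = self.slots[index]
        return self._compare_exchange(index, new_entity, snapshot) is snapshot

    def remove(self, entity_id) -> None:
        index = self.index_by_id.pop(entity_id, None)
        if index is not None:
            self.slots[index] = None
            self.free_indexes.append(index)


array = SlotArray(capacity=4)
array.add(10, ("player", 1))
updated = array.try_update(10, ("player", 2))    # callers may ignore the result
live = [entity for entity in array.slots if entity is not None]
array.remove(10)
updated_after_remove = array.try_update(10, ("player", 3))
print(updated, live, updated_after_remove)
```

    Iterating over array.slots directly is the linear scan the design is built for; lookups by ID go through index_by_id.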
     
    Last edited: Apr 20, 2018
    nxrighthere likes this.
  47. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Yea, hashing is what I'm currently using, and I'm still looking for a better approach. Thank you.
     
    Last edited: Apr 20, 2018
  48. buFFalo94

    buFFalo94

    Joined:
    Sep 14, 2015
    Posts:
    273
    @nxrighthere Sorry, I don't want to annoy you, but @arcturgray suggests here
    So I'm a bit lost: which changes are being referred to?
     
  49. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    This and this one. By the way, the original wrapper is not ideal and requires a lot of changes. I would like to share my private repository, but unfortunately it's no longer compatible with the original ENet.
     
  50. buFFalo94

    buFFalo94

    Joined:
    Sep 14, 2015
    Posts:
    273
    Thanks. I'll try to make it work:)
     