
BenchmarkNet (Stress test for ENet, UNet, LiteNetLib, Lidgren, MiniUDP, Hazel, Photon and others)

Discussion in 'UNet' started by nxrighthere, Jan 13, 2018.

Thread Status:
Not open for further replies.
  1. RevenantX

    RevenantX

    Joined:
    Jul 17, 2012
    Posts:
    148
    @nxrighthere Hi! Can you add the latest LiteNetLib from master to the benchmark results? I made some critical optimizations and want to see your results. :)
     
  2. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Sure, Ruslan. Don't worry, I saw those changes. ;)
     
  3. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    @RevenantX Well, it's faster than the previous version but slower than interop with native sockets by about 6 seconds, and resource usage is a bit higher. Can we get the unsafe preprocessor directives back? :rolleyes:
     
  4. RevenantX

    RevenantX

    Joined:
    Jul 17, 2012
    Posts:
    148
    @nxrighthere I think there's no sense in an unsafe version, because anyone who wants unsafe can use the ENet library.
     
  5. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    I'm using the unsafe version in the application currently. :) All right, I'll update the results soon.
     
    Last edited: Apr 24, 2018
  6. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Done. At least in the simulation with 1000 clients, memory usage is a bit lower than before.
     
    TheBrizleOne and Deleted User like this.
  7. RevenantX

    RevenantX

    Joined:
    Jul 17, 2012
    Posts:
    148
    @nxrighthere These are strange results, because on my PC 500 clients take 14-20% CPU and 1000 take 50-60%... But anyway, this is fine. :)
     
    TheBrizleOne and Deleted User like this.
  8. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Different hardware produces different results; there's nothing new here. :) My home PC is pretty old, and I don't know when I'll upgrade it. Hardware prices in my country are unreasonable.
     
    Last edited: May 11, 2018
    maewionn and TheBrizleOne like this.
  9. adiif1

    adiif1

    Joined:
    Jun 8, 2015
    Posts:
    5
  10. TwoTen

    TwoTen

    Joined:
    May 25, 2016
    Posts:
    1,168
  11. siliwangi

    siliwangi

    Joined:
    Sep 25, 2009
    Posts:
    303
    Any chance of NetDrone Engine being added? It seems limited to only 30 CCU; I just wanted to know NetDrone's system resource usage with 30 CCU.
     
  12. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    I don't add support for CCU-limited solutions.
     
  13. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Quick update: LiteNetLib and DarkRift have been updated to the latest versions; both libraries are now a bit more optimized and work faster. Also, Lidgren was rebuilt with optimized functionality to avoid expensive WinAPI calls for resolving a network interface. You can see the difference in resource usage here (thanks to @aienabled for the contribution).
     
    Jamster likes this.
  14. snacktime

    snacktime

    Joined:
    Apr 15, 2013
    Posts:
    3,356
    I wanted to run something by this group and see if anyone has better ideas.

    Normally I use protobuf-net for encoding, but it has some quirks, and I hit a couple of cases where I couldn't work around them. Mainly, it boxes value types, which is why I use reference types for messaging with protobuf; that was starting to cause pain in other areas. So over the weekend I decided to fix it, and at the same time it made sense to look at better ways to pack the data.

    The basic idea is that I still use varints; I just ripped some existing code for that out of protobuf-net. But instead of using a tag per field, which takes up one varint apiece, I use a bit-packed header to say which fields have non-default values. I use code generation so that the field ordering is fixed at compile time, which is the only clean way I could think of to make it work without reflection. Delegates might have been another option, but my gut said code generation would allow for the most optimization.

    The bit packing is the main thing I'm curious about, in case there is some better approach I'm not thinking of. I abstracted the header out into a BitVector64, which wraps two BitVector32s and uses the second one only if needed. When serializing messages, I write the header as varint(s), so it doesn't always take up a full 4 bytes per BitVector32.

    I'll put it all up on GitHub once I know it's working as intended. I have some basic unit testing, but I want to run it in our game against real data before putting it out into the wild.
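    A language-agnostic sketch of the scheme described above (presence bits packed into a varint-encoded header, followed by varints for only the non-default fields) might look like this. The function names and the int-only fields are my own simplification for illustration, not snacktime's actual generated code:

    ```python
    import io

    def write_varint(buf: io.BytesIO, value: int) -> None:
        # Standard LEB128-style varint: 7 bits per byte, high bit = continuation.
        # Assumes non-negative integers.
        while True:
            byte = value & 0x7F
            value >>= 7
            if value:
                buf.write(bytes([byte | 0x80]))
            else:
                buf.write(bytes([byte]))
                return

    def read_varint(buf: io.BytesIO) -> int:
        result = shift = 0
        while True:
            byte = buf.read(1)[0]
            result |= (byte & 0x7F) << shift
            if not byte & 0x80:
                return result
            shift += 7

    def serialize(values, defaults):
        # One bit per field marks "differs from default"; the mask itself is
        # varint-encoded, so trailing default fields cost nothing extra.
        buf = io.BytesIO()
        mask = 0
        for i, (v, d) in enumerate(zip(values, defaults)):
            if v != d:
                mask |= 1 << i
        write_varint(buf, mask)
        for i, v in enumerate(values):
            if mask & (1 << i):
                write_varint(buf, v)
        return buf.getvalue()

    def deserialize(data, defaults):
        buf = io.BytesIO(data)
        mask = read_varint(buf)
        return [read_varint(buf) if mask & (1 << i) else d
                for i, d in enumerate(defaults)]
    ```

    For example, `serialize([0, 5, 0, 7], [0, 0, 0, 0])` writes a one-byte mask (0b1010) followed by the two non-default values, for three bytes total, and an all-default message collapses to a single zero byte.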
     
  15. TwoTen

    TwoTen

    Joined:
    May 25, 2016
    Posts:
    1,168
    Is this an official patch PR'd into the Lidgren repo, or is it a fork? Could you link the patch?
     
  16. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Last edited: Jul 17, 2018
  17. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
  18. TwoTen

    TwoTen

    Joined:
    May 25, 2016
    Posts:
    1,168
    The MLAPI has open-source implementations that offer similar compression, if that's what you're after: both ranged floats and varint-encoded floats. (We convert the float to a uint using a struct with an explicit memory layout, then reverse the byte order to make the field more likely to be small, since the flag bits are at the start. Then we do a normal varint.)
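    As a rough illustration of the trick TwoTen describes (reinterpret the float's bits as an unsigned integer, reverse the byte order so the often-zero mantissa tail lands in the high bytes, then varint-encode), here is a sketch; `struct` stands in for C#'s explicit-layout struct, and the helper names are mine:

    ```python
    import struct

    def float_to_wire(f: float) -> int:
        # Reinterpret the float's bits as a uint32 (the C# version uses a
        # struct with [FieldOffset(0)] float/uint members), then reverse the
        # byte order: the mantissa's low bytes, which are zero for "round"
        # values, end up as the integer's high bytes, so the varint shrinks.
        bits = struct.unpack('<I', struct.pack('<f', f))[0]
        return int.from_bytes(bits.to_bytes(4, 'little'), 'big')

    def wire_to_float(u: int) -> float:
        # Inverse transform: un-reverse the bytes, reinterpret as a float.
        bits = int.from_bytes(u.to_bytes(4, 'big'), 'little')
        return struct.unpack('<f', struct.pack('<I', bits))[0]

    def varint_size(value: int) -> int:
        # Bytes a standard 7-bits-per-byte varint needs for this value.
        size = 1
        while value >= 0x80:
            value >>= 7
            size += 1
        return size
    ```

    For example, 1.0f is 0x3F800000 as raw bits (a 5-byte varint), but 0x0000803F after byte reversal, which fits in 3 varint bytes.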
     
  19. snacktime

    snacktime

    Joined:
    Apr 15, 2013
    Posts:
    3,356
    Actual value encoding I'm good on. The challenge was more about the most efficient way to signify which specific fields should be included, like when you have logic for serializing an entire class or custom value type.

    Using a bit per field, where the sum of those bits gets encoded as varint(s), is what I came up with. But it seems that, given the context of code generation, there might be a more efficient method available.
     
  20. Driiade

    Driiade

    Joined:
    Nov 21, 2017
    Posts:
    80
    So... time to shift to DarkRift for creating games?
    I don't want to stay with UNet since it's deprecated, and there's no real information on the new system.

    What do you think about choosing DarkRift for now?
     
  21. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Well, I don't like the fact that it utilizes TCP for reliable transmission; that's not a good way to provide reliability in real-time games, for many reasons. Also, I don't see support for sequencing, so packets may arrive out of order, and that opens the door to replay attacks. In general, it's a good library, but technically it loses to the others in some aspects.
     
    Deleted User likes this.
  22. Driiade

    Driiade

    Joined:
    Nov 21, 2017
    Posts:
    80
    Oh yes, very bad. :s
    So ENet was your best choice, right?

    I'm looking for a replacement for the UNet LLAPI with exactly the same features. I don't care about the HLAPI; I have my own.
     
  23. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    For me, technically, ENet is almost a paragon. We've improved it in our fork over the past few months (I mentioned it here). ENet is a time-proven solution used in many successful games, League of Legends for example, and it's also integrated into game engines such as Godot, LÖVE, etc.
     
    Last edited: Aug 3, 2018
  24. hippocoder

    hippocoder

    Digital Ape

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    And strangely enough, ECS is used in LoL for optimisation purposes! Would love to read your thoughts about the new networking in Unity.
     
  25. Driiade

    Driiade

    Joined:
    Nov 21, 2017
    Posts:
    80
    ENet and the other solutions seem to lack a key feature for me:

    timestamps?

    Or did I miss something?
     
  26. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    @hippocoder Well, I have many questions about the new transport layer, and I don't understand the deprecation of years of work on the LLAPI. What technical advantages does it provide (or plan to provide) over the LLAPI and the many other available solutions? What's the idea behind it, besides the fact that they want to make it an open-source project? From my point of view, this is yet another reinvention of the wheel, and I'm very skeptical about the decision.

    Most people here are not interested in Multiplay/Google services; they just want a solid transport layer paired with a modern high-level API that supports the MonoBehaviour/ECS workflow. People want to build and deploy their own client-server/P2P games; they only need the right tools. Will Unity provide them? Time will tell... As for me, I've already made such tools for myself (thanks to @inlife360 and @snacktime).

    @Driiade This is not low-level stuff, which is why libraries don't provide it. However, ENet has enet_time_get(), or ENet.Library.Time in the C# wrapper (it's monotonic time in our fork), so you can timestamp packets using this function if you want. That said, timestamps are not a good or efficient way to keep things synchronized; it's always better to track ticks.
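    To illustrate the "track ticks, not timestamps" point: peers that agree on a tick rate can exchange small integer tick numbers instead of comparing raw, clock-skewed timestamps. A minimal fixed-timestep tick counter (my own sketch, not code from ENet or any wrapper) could look like:

    ```python
    class TickClock:
        # Fixed-timestep accumulator: wall time is folded into whole
        # simulation ticks, so both peers reason about the same integer
        # tick numbers regardless of frame timing.
        def __init__(self, tick_rate: int = 30):
            self.dt = 1.0 / tick_rate      # seconds per simulation tick
            self.accumulator = 0.0
            self.tick = 0

        def advance(self, frame_time: float) -> int:
            """Feed elapsed wall time; returns how many ticks to simulate."""
            self.accumulator += frame_time
            steps = int(self.accumulator / self.dt)
            self.accumulator -= steps * self.dt
            self.tick += steps
            return steps
    ```

    The leftover fraction stays in the accumulator, so no time is lost between frames; a packet can then carry `clock.tick` as its sequencing reference.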
     
    Last edited: Jan 25, 2019
    moco2k, hippocoder and Driiade like this.
  27. Vincenzo

    Vincenzo

    Joined:
    Feb 29, 2012
    Posts:
    146
    nxrighthere, you said you forked ENet. Can you update your benchmark with your fork next to the current ENet so we can compare the results?
     
  28. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Sure, why not. We also performed multiple 24-hour tests, and it works very stably.
     
    Last edited: Aug 12, 2018
  29. Driiade

    Driiade

    Joined:
    Nov 21, 2017
    Posts:
    80
    :eek: Ooooh.

    Hmm, should I put your work under my HLAPI, or wait for Unity?
    Really, I don't know. ^^'
     
  30. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    I re-uploaded the file, forgot to change a few lines of code.
     
  31. hippocoder

    hippocoder

    Digital Ape

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    Well if it's not got a release date, this question doesn't matter. If there is a release date, do not wait for Unity.
     
  32. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    You know, even if they release it tomorrow, the new transport layer still won't be a bug-free, ready-to-go solution. It takes a lot of time to iron out a fresh product, and we all know how long that takes Unity. Remember 2014, when the LLAPI was introduced? Now they are replacing the tech.

    ENet has been around for about 14 years; it's designed to work for ages, and we (and many other open-source developers) will support it as long as possible.
     
    Last edited: Aug 4, 2018
  33. Vincenzo

    Vincenzo

    Joined:
    Feb 29, 2012
    Posts:
    146
    Yes, ENet is old, reliable and so on, but your fork is not, I assume, and I'm sure a 14-year-old project has a lot of legacy stuff inside and uses old, outdated methods for many things.
    Your performance comparison looks very promising, however.
    Is your Unity-ready C# wrapper on some GitHub project, and is it supported?
     
    Driiade likes this.
  34. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    We removed the legacy stuff and improved functionality using modern APIs (Windows 7 and higher/Linux/macOS/mobile platforms).

    Nope, it's available only here and maintained in my local repository at the moment.

    I'm always ready to help when I can.
     
    Last edited: Aug 5, 2018
  35. Vincenzo

    Vincenzo

    Joined:
    Feb 29, 2012
    Posts:
    146
    Awesome stuff. If you're willing to throw it on GitHub, that would be great.

    I'm currently using the LLAPI, but it's full of problems, so I'm considering switching to another library. Personally, I'm a bit wary of putting unmanaged code into my project, but wrapping around ENet is of course an option; another is one of the other libraries, like LiteNetLib.

    The big question: what is production-ready?

    I don't trust Unity to come up with something decent, that is for sure.
     
    Driiade likes this.
  36. Maxim

    Maxim

    Joined:
    Aug 17, 2009
    Posts:
    38
    It would be nice to have Linux (.NET Core / Mono builds) test results.
     
  37. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    It will be available on GitHub when I finish converting everything to safe code. Unsafe code will remain optional, since it's the most efficient way to access and modify data: the networking layer is performance-critical, and games are latency-sensitive applications.

    I understand your fears, and that's why we performed multiple 24-hour tests, to debug unexpected behaviors and to make sure that everything is stable.

    We should always be prepared for difficulties; the real question is how quickly we can solve them. @RevenantX fixes bugs in his library very quickly, and I believe that LiteNetLib is a reliable choice.

    Unfortunately, I can't provide them at the moment. The source code is open, and you can perform any tests on any target platform yourself.
     
    Last edited: Aug 12, 2018
  38. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Just craft your code carefully and make your high-level stuff as transport-agnostic as possible, and you will be able to painlessly replace things when (and if) the time comes.
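    As a sketch of what "transport-agnostic" can mean in practice, the high-level layer depends only on a thin interface, and each networking library gets one adapter class behind it. The interface and class names here are hypothetical, not from any of the libraries discussed:

    ```python
    from abc import ABC, abstractmethod

    class Transport(ABC):
        """Minimal transport seam: game code talks only to this, so
        swapping ENet/LiteNetLib/anything else touches one adapter."""

        @abstractmethod
        def send(self, peer_id, payload, reliable=True):
            """Queue a payload (bytes) for the given peer."""

        @abstractmethod
        def poll(self):
            """Drain received (peer_id, payload) pairs since the last call."""

    class LoopbackTransport(Transport):
        # Trivial in-process stand-in, handy for tests and for proving the
        # high-level layer never imports a concrete networking library.
        def __init__(self):
            self._queue = []

        def send(self, peer_id, payload, reliable=True):
            self._queue.append((peer_id, payload))

        def poll(self):
            drained, self._queue = self._queue, []
            return drained
    ```

    With this shape, replacing the transport later means writing one new subclass, not touching game logic.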
     
  39. goldbug

    goldbug

    Joined:
    Oct 12, 2011
    Posts:
    768
    @nxrighthere Have you considered measuring latency? It would be interesting to see how TCP-based libraries compare to UDP-based ones when it comes to latency.

    I would also love to see some WebSocket-based protocols, since that is the only option available for WebGL games.
     
  40. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    A separate mode would have to be implemented to measure latency; the application in its current state is not suited for this task. I have many great ideas on my to-do list, but unfortunately I don't have time to implement them. Maybe someday, but I can't promise.
     
  41. goldbug

    goldbug

    Joined:
    Oct 12, 2011
    Posts:
    768
    @nxrighthere From reading your code, you are connecting to 127.0.0.1 in your tests.
    That's the loopback device: no actual network card is involved, just the kernel passing the message from one thread to another. This means no packet loss at all.

    It would be awesome if the clients and the server were on physically different machines.
     
  42. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Regardless of any device, physical or not, packets will drop under various conditions, since IP doesn't guarantee delivery at all. The test demonstrates this perfectly; use any protocol analyzer for an in-depth investigation.

    If you need such a test, feel free to do it. But funny issues await your NIC in high-load simulations.
     
    Last edited: Aug 14, 2018
  43. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    @Vincenzo I've finished converting the C# wrapper to safe code. To avoid trading performance for safety, I extended the ENet functionality so the wrapper now reads all the data directly from the unmanaged side via managed pointers, meaning marshalling and struct mirrors are almost never involved. The funny thing is that CPU usage dropped by ~10% in high-load simulations compared to the unsafe version, at equivalent speed. It's on track for release after a couple more tests.
     
    Last edited: Aug 12, 2018
    Vincenzo likes this.
  44. goldbug

    goldbug

    Joined:
    Oct 12, 2011
    Posts:
    768
    @nxrighthere Could you display the number of messages sent per second?

    I made a change to Telepathy that speeds up Send and reduces bandwidth per message.
    Because Send is faster, it ends up sending more messages per second in your tests. Less bandwidth per message but more messages per second means the overall bandwidth remains roughly the same.
     
    Last edited: Aug 15, 2018
  45. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Messages will not be sent earlier than the 67-millisecond delay per iteration with the default parameters. In some cases a subtask might be slightly delayed beyond this limit, but it will never perform its job earlier. It doesn't matter how fast the send function is: the total elapsed time will always be the same, as long as the networking library is not slowing itself down.

    It's not possible to calculate this precisely due to the non-deterministic nature of the TPL; a simple increment per subtask is the only option.
     
  46. goldbug

    goldbug

    Joined:
    Oct 12, 2011
    Posts:
    768
    Suppose the Send method takes 20 milliseconds. Then you sleep 67 milliseconds, so you end up sending a message every 20 + 67 = 87 milliseconds.

    Suppose the Send method takes 5 milliseconds. Then you sleep 67 milliseconds, so you end up sending a message every 5 + 67 = 72 milliseconds.

    This is what is happening in my tests: the benchmark is very favorable to slow Send methods.

    Before my change, Telepathy was sending 134,000 messages per second.
    After my change, Telepathy is sending around 156,000 messages per second.
    Both results used the same sleep and the same parameters; the only difference is that Send is faster.

    To reproduce it, take any library you want and add a sleep(200) inside that library's send method: the end result will be very low bandwidth and very low CPU.

    Every time you send a message, increment a counter. After 10 seconds (measured with a stopwatch or Time.time), divide the counter by 10 and report how many messages per second were sent. Then reset the counter and the stopwatch for the next window.
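    The counter-and-stopwatch scheme goldbug describes could be sketched like this (a hypothetical helper, not code from the benchmark). Note the arithmetic above: with a 67 ms sleep plus a 20 ms Send, each loop iteration takes ~87 ms, so one client sends ~11.5 messages/s, versus ~13.9 at 72 ms — a difference this meter would expose:

    ```python
    import time

    class SendRateMeter:
        # Count sends; every `window_seconds`, report messages/second by
        # dividing the counter by the elapsed time, then reset both the
        # counter and the stopwatch for the next window.
        def __init__(self, window_seconds=10.0):
            self.window = window_seconds
            self.count = 0
            self.started = time.perf_counter()

        def on_send(self):
            self.count += 1

        def maybe_report(self):
            """Returns messages/second once the window elapses, else None."""
            elapsed = time.perf_counter() - self.started
            if elapsed < self.window:
                return None
            rate = self.count / elapsed
            self.count = 0
            self.started = time.perf_counter()
            return rate
    ```

    In a benchmark loop, `on_send()` goes next to each library Send call, and `maybe_report()` is polled once per iteration.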
     
    Last edited: Aug 15, 2018
  47. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    Of course, since the send function itself is not asynchronous.

    Excuse me? How would slower operations have a positive impact on the total elapsed time?

    What exactly does this information give you, given that the calculation is not time-based and is not precise?
     
  48. goldbug

    goldbug

    Joined:
    Oct 12, 2011
    Posts:
    768
    If Send takes longer, it means fewer messages per second, which means less bandwidth. The slower the Send, the better the bandwidth will look.

    In my case, I implemented a massive speedup in Telepathy. Even though it is sending less data per message and sending more messages, you would not know it by looking at this benchmark.

    The purpose of this information is to catch a "cheat" where slow Send methods make the bandwidth look good.

    A better approach would be to set up a timer and send a fixed number of messages per second no matter how long the Send method takes (no sleep, just a timer). But that would be asking for a lot more work.
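    The timer-based alternative goldbug proposes might be sketched as a deadline scheduler rather than send-then-sleep; names and structure here are my own, not the benchmark's:

    ```python
    import time

    def run_fixed_rate(send_fn, rate_hz, duration):
        # Deadline-based pacing: each send is scheduled at a fixed interval
        # from the start, so a slow send_fn eats into the idle time instead
        # of pushing every subsequent send later (as sleep-after-send does).
        interval = 1.0 / rate_hz
        start = time.perf_counter()
        next_deadline = start
        sent = 0
        while time.perf_counter() - start < duration:
            now = time.perf_counter()
            if now >= next_deadline:
                send_fn()
                sent += 1
                next_deadline += interval  # fixed grid; late sends catch up
            else:
                time.sleep(min(next_deadline - now, 0.001))
        return sent
    ```

    With this pacing, a library whose Send takes 20 ms and one whose Send takes 5 ms both attempt the same messages/second, so bandwidth and CPU numbers become directly comparable.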
     
    Last edited: Aug 15, 2018
  49. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    567
    The cost of lower throughput is time: the slower this function, the later the whole process completes. That's natural, and it's acceptable.

    Each call to the send function is counted using an atomic operation that increments non-cached values. You will always know how many messages were sent, since one message is sent per function call. What you won't know is how many reliable packets were sent or lost during the process; for that, you should use transport-level measurements or a protocol analyzer.
     
    Last edited: Aug 15, 2018
  50. goldbug

    goldbug

    Joined:
    Oct 12, 2011
    Posts:
    768
    OK, so could you include the elapsed time in the results?

    ENet in your test takes 1:34 to complete (from your cmd screenshot) vs. 2:04 for the others.

    If we used a timer instead of send + sleep, which is what a game would do, ENet would end up with much lower bandwidth and CPU usage.
     
    Last edited: Aug 15, 2018