Search Unity

[Blog Link] Fixed point arithmetic in C#/Unity for RTS games

Discussion in 'General Discussion' started by PhobicGunner, Jan 15, 2017.

  1. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    So, recently I started coding up an RTS prototype in Unity and one of the design goals of this is to make it support multiplayer.
    The thing about RTS games is they often have at least hundreds if not thousands of units at once, and that's way too much to just directly serialize over the network. So, instead, they usually have to use lockstep simulation (serialize player commands instead, and just ensure that everyone runs their simulations with the same inputs so that they all have the same results).

    However, floats have no determinism and are completely useless for the task, therefore I found myself having to implement my own fixed-point math library (since fixed-point math IS deterministic). I have decided, then, that I will start releasing a series of blog posts that others can read on how to implement fixed-point math like I have done, in case anyone else wants to work on RTS-style games and needs lockstep simulation.

    The first post is at the following link. It covers how fixed point works, why it's still useful, and the beginnings of a Fixed struct type in C#.

    http://jetfistgames.com/blog/post/fixed-point-math-c-part-1

    I'd appreciate any feedback and questions on this, and I hope people will find this series helpful.
     
  2. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
  3. Dave-Carlile

    Dave-Carlile

    Joined:
    Sep 16, 2012
    Posts:
    967
    Thanks for this. A need I've run into is a fixed point physics engine. One of my game ideas involves the ability to pick up (uh, no real way to say this delicately) body parts after an explosion and use them for weapons. The explosions need to be physics based to get the right look, and they all need to happen as part of the lockstep mechanism across clients so all of the pieces are in the same place. Anyone play Bungie's old game Myth? That gives the general idea.
     
  4. imaginaryhuman

    imaginaryhuman

    Joined:
    Mar 21, 2010
    Posts:
    5,834
    Thanks. I was thinking about using some fixed point myself.... I understand it uses mainly bitshifting and and'ing, which are fast operations. And then integer math functions. I would presume possibly that this is at least roughly the same performance as using real floats? Or not too far off? Did you do a performance comparison for number crunching to compare them?
     
  5. Dave-Carlile

    Dave-Carlile

    Joined:
    Sep 16, 2012
    Posts:
    967
    I asked this on your blog, but might as well do it here too...

    Code (csharp):
    1. public Int16 RawDecimalValue
    2. {
    3.     get
    4.     {
    5.         return (Int16)( this.rawValue & 0xff00 );
    6.     }
    7. }
    Shouldn't that be & 0xffff? Aren't you trying to mask off the lower 16 bits for the decimal part? Or am I missing something?
     
  6. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,572
    Please consider investigating existing fixedpoint implementations.

    Multiplication, division, subtractions and basic arithmetics are extremely simple. What I would find interesting is:
    1. Square root.
    2. Trigonometry functions (sin/cos/tan/cotan/asin/acos/atan)
    3. pow (with support with fractional power)
    4. logarithms (log2, log10, ln)
    As long as those are made without falling back to floats or doubles (which defeats the point of it being deterministic).

    int64 or int128-based fixedpoint would also be interesting (simply because C# does not have support for 128bit and 256bit integers).
     
  7. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    You know, you're probably right and I probably made a big dumb error. Gimme a few minutes to test and I'll update the blog post.

    Yeah, I'm going to be getting into rounding out the math lib in later posts. Some trig functions at the very least, plus vector math implementations. I am aware that that stuff is useful, as I also need it, so it will be done :)
     
  8. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    FPUs these days are actually very fast. You aren't likely to get a performance boost from using fixed point, at least not on today's PCs or consoles (probably not even mobile). The primary benefit of using them is, like I mentioned, having 100% deterministic math (which is required for lockstep simulation for games such as RTS, which can involve hundreds or thousands of units).
     
    PrimalCoder and BlueprintZ like this.
  9. BlueprintZ

    BlueprintZ

    Joined:
    Dec 28, 2016
    Posts:
    32
    sometimes robustness is more important than performance.
     
  10. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,572
    As far as I know, the only benefit of of fixedpoint math on modern hardware is its predictability.

    For example, see: https://randomascii.wordpress.com/2013/07/16/floating-point-determinism/

    I would expect multiplication and division to be slower than floating point counterpart, and any higher level operations order of magnitude slower.

    Basically, this is something you'd use (on modern hardware) when you want results to be identical on any device.

    in MS-DOS era things were different, and fixedpoint was used for textured renderers, for example (because loading data into FPU took time).
     
    Dave-Carlile likes this.
  11. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    Just thought I'd swing by with some actual perf metrics since that did make me curious.
    Fixed point is definitely far slower than floating point, even just doing basic arithmetic. My test was, basically, this:

    Code (csharp):
    1.  
    2. number += 2.0;
    3. number -= 2.0;
    4. number *= 2.0;
    5. number /= 2.0;
    6.  
    Implemented for both types (Fixed and float). Each of these was performed 100,000 times, measured with a Stopwatch.
    For that test, my Fixed type consistently took 13-14ms to execute, whereas float took only 2ms to execute the same operations.

    So it's apparently around 7x slower in this test. And that's fine, because for my use case (a very common one), speed is an acceptable trade for determinism (the speed of float means nothing if my entire game comes crashing down because two client sims diverged at some point).

    EDIT:
    OH, and by the way guys, I'm also going to be covering at some point how to make a custom propertydrawer for this Fixed type so that you can make a nice inspector field that just lets you type in a decimal and internally serializes the fixed point integer value for you :)
     
    Last edited: Jan 26, 2017
  12. imaginaryhuman

    imaginaryhuman

    Joined:
    Mar 21, 2010
    Posts:
    5,834
    Interesting. So basically your code is something like:

    ((number << 16) + (2 << 16)) >>16

    to do an addition? at the lowest level I'd think this would be no more than 4 cpu instructions (3 shifts, 1 add)?
     
  13. Dave-Carlile

    Dave-Carlile

    Joined:
    Sep 16, 2012
    Posts:
    967
    For adding you can just add the two values. The shifting comes into play for multiplication and division.
     
    neginfinity likes this.
  14. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    OK, actually turns out that test was a bit skewed - primarily because I ran in Debug, which disables a crapton of optimizations both in the IL and in the JIT (including inlining). Here's some even better perf metrics for you guys:

    Still running 100,000 iterations, each iteration does an add, a subtract, a multiply, and a divide. I added two new tests - a wrapped float, that is a struct containing nothing but a single float and implements the basic arithmetic operators (delegating operations to the internal float value), and an "inline fixed point" test which just performs fixed point calculations directly on a long value right inline. The results, now in Release mode:

    Which is really interesting. It turns out, at least in Release mode, fixed point isn't even remotely as slow as I had thought previously. Actually it's pretty damn fast. Float still wins, however (even at 1 million iterations float still measures as 0ms).

    But the more interesting note is how somehow a fixed point struct outperforms a simple struct which just wraps a float (and still generates the correct value - I verified just to ensure I wasn't doing something wrong).
    I can't quite figure out what's going on there, even if I look through the generated IL using ILSpy (the fixed point's multiplication and division functions are both longer by two instructions, a load instruction and a shift instruction). And maybe also interesting is that some overhead of fixed point just comes from the fact that it's wrapped in a struct - raw inline fixed point arithmetic is twice as fast. Additionally, this odd performance quirk changes if I run in .NET 4.5 instead of 3.5 - fixed point execution time leaps up to 7ms, while wrapped float execution time drops to 3ms.

    Just thought these would be interesting performance metrics to note.
     
    PrimalCoder and imaginaryhuman like this.
  15. Dave-Carlile

    Dave-Carlile

    Joined:
    Sep 16, 2012
    Posts:
    967
    For .net 4.5 you might try playing around with the method implementation attribute if you haven't already.

    Code (CSharp):
    1. [MethodImplAttribute(MethodImplOptions.AggressiveInlining)]
    Not entirely sure how wise it is, but would be interesting to see what difference it makes.
     
    Last edited: Feb 3, 2017
  16. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,572
    Yes, also for multiplication technically the best idea would be 2x sized temporary storage. Meaning 64bit word for multiplying two 32bit words, etc.

    Basically... fixed point is really just number*scale representation. With that in mind.

    Code (csharp):
    1.  
    2. a*scale + b*scale = (a+b)*scale - correct by default
    3. a*scale * b *scale = a*b*scale*scale - you need to right shift the result to keep it in correct position
    4. (a*scale) / (b*scale) = (a/b) * (scale/scale) = a/b - you need to left shift the result to keep it in correct position.
    5.  
    Fixed point was originally faster than floating point (think about 166 Mhz cpus), however, the benefits go out of the window when you start dealing with something that is not addition/substraction/multiplication. Complex operations will resolve into multiple instructions, and will be much slower. I believe that in 2000s there were new assembly instructiosn introduced, which accelerated floating point operations further.

    Also, when you switch from 32bit fixedpoints to 64bit or 128bit, you'll encounter slowdown, because even simplest operations will be resolving into multiple instructions.
     
  17. bluescrn

    bluescrn

    Joined:
    Feb 25, 2013
    Posts:
    642
    Depends on the CPU... Certainly on older ARM CPUs (thinking way back to the GBA days), you often got shifts for free, using the barrel shifter as part of a mov/add instruction. Fixed point seemed very fast back then... especially when you were actually writing assembly for performance-critical code, and you had no floating point hardware!

    Not sure if the free shifts still exist on modern ARM-based hardware, and I've not yet investigated what the il2cpp output looks like when doing fixed point in C# with overloaded operators... guessing it could be quite nasty, so performance is certainly a worry - but very much secondary to the other issues of working with lockstep simulations and without floating point...

    As well as switching to fixed point (which prevents you from doing many things 'the Unity way' - e.g. physics/collision, pathfinding, animation-driven-anything), you've also got to use a fixed timestep. Which means doing your own interpolation when rendering things (with a smaller/variable timestep) - another one of those things that's easier said than done...
     
  18. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    I don't know how much of this I'm going to cover or touch on in the tutorial series, but on a side note these are all things which the RTS kit I'm working on is going to tackle (with the exception of animation driven movement):
    • Collision and physics will probably be super simple, nothing more than a simple push force between units and AABB collision with buildings/structures. And otherwise the game will operate entirely in 2D.
    • Pathfinding will be a completely custom-written grid based system (being written by a friend of mine, actually). A custom implementation of Jump Point A*, with a few tweaks to improve performance in a few edge cases. This provides a large performance benefit over regular A*. Fun fact: when doing lockstep simulation, you actually cannot do async multithreading (for example, delegating pathfinding to another thread). Because you can't guarantee on what frame that thread's results will be available, you can't reliably use those results on the same frame on everyone's machine. Therefore you have to use single threading and timeslice everything instead (coroutine style).
    • Custom interpolation of renderers (where simulation is stepped in FixedUpdate, and game entities interpolate results in Update).
    EDIT: OK the multithreading thing isn't strictly true. If you've got multiple tasks you can run in parallel, you can still get a speed boost out of running each of those on separate threads and having the main thread explicitly wait until they're finished. Which means you still need to ensure each task doesn't run too long, but you can now run multiple in parallel (which you could combine with a timeslicing mechanism).
     
    Last edited: Feb 7, 2017
  19. Ryiah

    Ryiah

    Joined:
    Oct 11, 2012
    Posts:
    21,204
    Streaming SIMD Extensions. SSE added single-precision instructions and SSE2 brought double-precision.

    https://en.wikipedia.org/wiki/Streaming_SIMD_Extensions
     
  20. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,572
    Yes, those. I think Quake 4 was the first game that required those extensions and could not run without them.
     
  21. ashish-gogna

    ashish-gogna

    Joined:
    Oct 7, 2014
    Posts:
    1
    Sounds great!

    I am working on a Pool/Snooker style multiplayer game.
    How can i use Fixed point math with Unity's Physics Engine to make physics completely deterministic?

    Or maybe with another open source physics library?

    Thanks in advance!
     
  22. zoran404

    zoran404

    Joined:
    Jan 11, 2015
    Posts:
    520
    @ashish-gogna No, you can't.
    Please don't reply to old threads. Thanks.
     
    Ryiah likes this.