Don't use double.ToString() if you don't want to lose precision

Discussion in 'Scripting' started by thefallengamesstudio, Jul 17, 2019.

  1. thefallengamesstudio

    thefallengamesstudio

    Joined:
    Mar 7, 2016
    Posts:
    672
    Hi devs,

    So I've just spent 2 hours figuring this out and thought I could help other devs avoid this.
    This is a general C# oddity I found (Java doesn't do this).

    Don't use double.ToString() without a specifier if you plan to parse the result back. Use the "G17" specifier, whose sole purpose is to ensure a round-trip conversion without loss of precision.

    More info (I think the "R" specifier is kept for historical reasons, and it shouldn't be used either): Link

    Test it:
    Code (CSharp):
    double d = new System.Random(DateTime.Now.Millisecond).NextDouble();
    string asStr = d.ToString(); // no format specifier
    Console.WriteLine(d == double.Parse(asStr)); // usually False on .NET Framework/Mono, where ToString() keeps 15 significant digits
    It would be more intuitive not to lose any information when no specifier is used, both when converting to a string and when parsing it back. Again, Java doesn't do this.
    Well, gotta live with that.
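
A minimal sketch of the "G17" round trip described above. The class and method names (RoundTripDemo, ToRoundTripString, FromRoundTripString) are made up for illustration, and using CultureInfo.InvariantCulture is an assumption added here so the decimal separator does not depend on the machine's locale.

```csharp
using System;
using System.Globalization;

public static class RoundTripDemo
{
    // Format with 17 significant digits, enough to identify any double uniquely.
    public static string ToRoundTripString(double value)
    {
        return value.ToString("G17", CultureInfo.InvariantCulture);
    }

    // Parse with the same culture that was used for formatting.
    public static double FromRoundTripString(string text)
    {
        return double.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);
    }
}
```

Usage would look like RoundTripDemo.FromRoundTripString(RoundTripDemo.ToRoundTripString(d)) == d for finite values of d. NaN never compares equal to itself, so it needs separate handling.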
     
  2. Kurt-Dekker

    Kurt-Dekker

    Joined:
    Mar 16, 2013
    Posts:
    31,368
    You can use System.BitConverter to convert the double/float/whatever to and from a lossless string representation:

    https://docs.microsoft.com/en-us/dotnet/api/system.bitconverter?view=netframework-4.8

    It basically turns the value into a hexadecimal number string and back again, with some other code. Here's my usual use:

    https://bitbucket.org/kurtdekker/da...ks/Assets/Datasack/Core/DatasackFormatting.cs

    The above code also prefixes it with "0x" and handles the "f" and "d" suffixes gracefully if you supply a decimal representation, but you can strip that if you please.
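
For readers who cannot open the link, here is a rough sketch of the general idea using System.BitConverter. This is not the Datasack code linked above. The DoubleBits class and its method names are hypothetical, and only the "0x" prefix is handled (no "f"/"d" suffixes).

```csharp
using System;
using System.Globalization;

public static class DoubleBits
{
    // Write the raw 64-bit pattern as 16 hex digits with a "0x" prefix.
    public static string ToHex(double value)
    {
        long bits = BitConverter.DoubleToInt64Bits(value);
        return "0x" + bits.ToString("X16", CultureInfo.InvariantCulture);
    }

    // Read the hex digits back and reinterpret them as a double.
    public static double FromHex(string hex)
    {
        string digits = hex.StartsWith("0x", StringComparison.OrdinalIgnoreCase)
            ? hex.Substring(2)
            : hex;
        long bits = long.Parse(digits, NumberStyles.AllowHexSpecifier, CultureInfo.InvariantCulture);
        return BitConverter.Int64BitsToDouble(bits);
    }
}
```

Because the string carries the bit pattern itself, no decimal rounding is involved in either direction. The price is that the text is not readable as a number.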
     
    Ryiah, csofranz and Joe-Censored like this.
  3. Antistone

    Antistone

    Joined:
    Feb 22, 2014
    Posts:
    2,827
    Rounding off floating-point numbers when converting them to strings is absolutely standard and is usually what you want, because converting them to strings normally means you are displaying them to a human being, and when you say print(0.1) you don't normally want to see 0.1000000000000000055511151231257827021181583404541015625, which is the closest number to 0.1 that can be represented as an IEEE 754 double.

    Floating-point format isn't designed for precision, it's designed for flexible approximation.

    Why 0.1 + 0.2 returns 0.30000000000000004
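
A small sketch of the effect named in that title, for anyone who wants to reproduce it locally. The class and method names are made up for illustration. "G17" is used only to make the stored digits visible.

```csharp
using System;
using System.Globalization;

public static class FloatingPointSum
{
    // The nearest double to 0.1 plus the nearest double to 0.2.
    public static double Sum()
    {
        return 0.1 + 0.2;
    }

    // Show a value with 17 significant digits instead of the rounded default.
    public static string Show(double value)
    {
        return value.ToString("G17", CultureInfo.InvariantCulture);
    }
}
```

Show(Sum()) gives "0.30000000000000004" and Show(0.1) gives "0.10000000000000001", while Sum() == 0.3 is false because 0.3 is stored as a slightly different double.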

    If you want to serialize a double so that you can get back the exact binary representation later, you don't typically convert it to a human-readable string in decimal; you save the bit pattern instead.
     
    Ryiah, SparrowGS, lordofduct and 2 others like this.
  4. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,139
    Dis, dis all da way.
     
    Ryiah, xVergilx and SparrowGS like this.
  5. csofranz

    csofranz

    Joined:
    Apr 29, 2017
    Posts:
    1,556
    Huh. Looks like someone didn't read their IEEE 754-2008 before coming here to post an opinionated work-around for a non-issue.

    Saying 'It's more intuitive to not lose any information when you don't use any specifier' is like saying 'it's more intuitive to expect a car to follow the road without steering input'. It's only intuitive if you fail to understand how either works. As an aside: being intuitive is often a far cry from being correct - but I digress.
    As Antistone so aptly put it: most people neither want nor need a full-precision representation of a float each time they convert it to a string. Most of us are OK with '0.33' when our float contains the result of 1f/3f. If you need precision and use a conversion into a different format (one that was not designed for precision), the onus is on you to make sure that the conversion fits your requirements. The solution to your issue isn't choosing a better string representation, but using a better representation of floats altogether. If you are using strings for convenience, that's OK - but please don't describe something as 'odd' simply because it's not what you expect.
     
    xVergilx likes this.
  6. thefallengamesstudio

    thefallengamesstudio

    Joined:
    Mar 7, 2016
    Posts:
    672
    Sorry for not being more explicit. I was referring to converting it to a readable string, which the first method can't do, but sure, it fits serialization-only purposes.
    And I would definitely prefer a built-in way over third-party libs for a task that's so basic.

    I'm familiar with how fp numbers behave and how they should be serialized, when no one needs to look at the actual serialized value. It's the C# design decision of not supporting a roundtrip conversion by default that bothers me.
    You won't get 0.1000000000000000055511151231257827021181583404541015625 anyway if you do 0.1000000000000000055511151231257827021181583404541015625.ToString("G17"), but rather "0.10000000000000001", which is readable enough if you also want to parse that string back and want the serialized value to look like a proper fractional number.


    With today's massive usage of JSON (and not only JSON), converting a double to and from a string should, by default, work without "doing your homework". I'm for safety by default, not prettiness by default.

    The general consensus here is that usually you want to just display the value and want it to look pretty. OK, yes, this is the case, most of the time. But even so, what's more tragic, losing data or displaying it as 0.10000000000000001, instead of 0.1? How much work is there to be done in both cases, i.e., just changing a specifier for the ToString() to make it look prettier or re-creating all of the data in the medium you're storing it because precision was lost, in case you use non-standard storage formats, like JSON?

    In my case, I just wanted to present an editable double in an InputField. When the user changes it, I convert the resulting string to a double (which usually changes the value because of the precision loss, and I want that), compare it to the old value, modify the data source only if it's different, and then update the input field with the resolved double. Turns out, it's not as simple as it should be.
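
A rough sketch of that edit-and-commit flow in plain C#, without the Unity InputField. The EditableDouble class and its members are hypothetical. "G17" and the invariant culture are choices made for this sketch so that re-committing an unchanged text never registers as a change.

```csharp
using System;
using System.Globalization;

public sealed class EditableDouble
{
    public double Value { get; private set; }   // value held by the data source
    public string Text { get; private set; }    // text shown in the field

    public EditableDouble(double initial)
    {
        Value = initial;
        Text = Format(initial);
    }

    // Returns true only when the parsed value differs from the stored one.
    public bool Commit(string userText)
    {
        double parsed;
        if (!double.TryParse(userText, NumberStyles.Float, CultureInfo.InvariantCulture, out parsed))
        {
            return false; // invalid input: keep the old value and text
        }
        bool changed = !parsed.Equals(Value);
        Value = parsed;
        Text = Format(parsed); // show the resolved double back to the user
        return changed;
    }

    private static string Format(double value)
    {
        return value.ToString("G17", CultureInfo.InvariantCulture);
    }
}
```

With the default ToString() on .NET Framework or Mono, Commit(Text) could report a change even though the user typed nothing, which is the problem described above.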


    Again, Java doesn't do this. Here's how it goes for a very small number:
    Double d = 0.0000000000000000055511151231257827021181583404541015625;
    Double.toString(d) -> "5.551115123125783E-18"
    Double.parseDouble(Double.toString(d)) == d -> True

    On the other hand, C#:
    double d = 0.0000000000000000055511151231257827021181583404541015625;
    d.ToString() -> "5.55111512312578E-17" [Edit: seems like I either copied it wrong in my evaluator or different CPUs give different results, because now I get a different value, 5.55111512312578E-18, but this is still irrelevant, because the line below still evaluates to false]
    double.Parse(d.ToString()) == d -> False
    double.Parse("5.551115123125783E-18") == d -> False // and no, it doesn't even get back the value from the more precise string yielded by java's Double.toString()
     
    Last edited: Jul 19, 2019
  7. csofranz

    csofranz

    Joined:
    Apr 29, 2017
    Posts:
    1,556
    That's an interesting point, but IMHO one that usually doesn't apply. Loss of information is only tragic if and when the information that was lost is relevant. In your case it may be; in most cases it isn't. Look at the massive amount of information that is lost by lossy image formats (e.g. JPEG). There, information loss is the feature, not a bug. Furthermore, you need to acknowledge that any form of floating-point representation has a minimal resolution, and you therefore always have an inherent loss of information.
    Also remember that FP numbers have a sliding absolute precision/resolution that is determined by the exponent. If you add two FP numbers, one small and one large (let's say 1000 times bigger), the result will have the resolution of the larger number, having lost all additional precision of the smaller one. You may even get to a point where, when you subtract the bigger number from the result, you get zero, effectively having lost *all* information of the smaller number. That's not a tragedy, it's inherent in the definition of FP numbers as they work. You should know the limits of the tools you work with, or you'll open yourself up to expensive mistakes.
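
A small sketch of that absorption effect. The Absorption class and Recovered method are made up for illustration. Note that for doubles the smaller operand only disappears completely when the magnitudes differ by something on the order of 2^53 (about 9e15). At a ratio of 1000, only the lowest few bits of the smaller number can be affected.

```csharp
using System;

public static class Absorption
{
    // Add a small number to a large one, then subtract the large one again.
    // The cast forces the intermediate sum to be rounded to double precision.
    public static double Recovered(double large, double small)
    {
        double sum = (double)(large + small);
        return sum - large;
    }
}
```

Recovered(1e16, 1.0) returns 0 because adjacent doubles near 1e16 are 2 apart, while Recovered(1000.0, 1.0) returns exactly 1.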

    In your case, the precision you are looking for simply doesn't exist - you are trying to preserve something that isn't there across the board, and are ascribing qualities to FP numbers that don't exist. The best way to preserve an FP number when information integrity (note: integrity, not numerical precision) is at a premium is how Kurt described: by encoding the binary representation instead of converting it.

    But that method will not guarantee a specific precision, just the relative precision of the exponent - and that becomes meaningless if there is another number with a different exponent. Unless you understand how FP work, please don't assert they have qualities that simply aren't there.

    Note further that even though JSON is prevalent, it is up to you how you actually transport and preserve the integrity of data. Encoding floats (as described by Kurt) is a valid and more efficient way to do that (provided you can reliably solve lingering compatibility problems like high/low-order byte encoding).
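
One way to deal with the byte-order concern is to pick a fixed order for storage and swap only on platforms that differ. A sketch, assuming little-endian is the chosen storage order; the PortableDouble class and its method names are hypothetical.

```csharp
using System;

public static class PortableDouble
{
    // Always produce the 8 bytes in little-endian order.
    public static byte[] ToLittleEndianBytes(double value)
    {
        byte[] bytes = BitConverter.GetBytes(value);
        if (!BitConverter.IsLittleEndian)
        {
            Array.Reverse(bytes);
        }
        return bytes;
    }

    // Read 8 little-endian bytes back into a double on any platform.
    public static double FromLittleEndianBytes(byte[] bytes)
    {
        byte[] copy = (byte[])bytes.Clone(); // do not mutate the caller's array
        if (!BitConverter.IsLittleEndian)
        {
            Array.Reverse(copy);
        }
        return BitConverter.ToDouble(copy, 0);
    }
}
```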

    So if you need to have a user manipulate and store highly precise numbers, make sure that the way you convert them is adequate for your requirements. 99 times out of 100, that precision isn't needed, and People can get by with simple string conversions and the Information loss is irrelevant. In computer games (we are after all talking in a Unity Forum), it's often better to use lower precision, and Information loss is not only advisable, it's an advantage.

    In this context - while I understand the example you show, I'm not sure I grasp the underlying use case. Where do you need numbers that are significant past the 50th Digit? If they are - are you sure you are choosing the right format?

    [to put that into perspective: the distance from Sun to Pluto is 5,906,376,272,000 meters. Your precision is sufficient for sub-atomic measurement and STILL have more than 20 digits to spare]
     
    Last edited: Jul 18, 2019
  8. Kurt-Dekker

    Kurt-Dekker

    Joined:
    Mar 16, 2013
    Posts:
    31,368
    Apart and aside from any discussions of formatting for visual vs storage, fundamentally you should never compare floating or double precision values for equality.

    Instead, use something like Mathf.Approximately or do your own delta comparisons. Otherwise you are going to chase this bug forever, with mysterious failures appearing for unrelated reasons when equality tests that had previously been succeeding start to fail.
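
A sketch of a hand-rolled delta comparison for doubles (Mathf.Approximately works on floats). The ApproxCompare class, the NearlyEqual name, and both default tolerances are assumptions chosen for illustration. Suitable tolerances depend on the magnitudes involved in your own data.

```csharp
using System;

public static class ApproxCompare
{
    // True when a and b differ by no more than an absolute tolerance
    // or a tolerance relative to the larger magnitude.
    public static bool NearlyEqual(double a, double b,
        double relativeTolerance = 1e-12, double absoluteTolerance = 1e-15)
    {
        if (a == b)
        {
            return true; // exact match, including equal infinities
        }
        if (double.IsInfinity(a) || double.IsInfinity(b))
        {
            return false; // an infinity only matches itself
        }
        double diff = Math.Abs(a - b);
        double scale = Math.Max(Math.Abs(a), Math.Abs(b));
        // Any comparison involving NaN is false, so NaN never matches.
        return diff <= Math.Max(absoluteTolerance, relativeTolerance * scale);
    }
}
```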
     
    thefallengamesstudio likes this.
  9. Antistone

    Antistone

    Joined:
    Feb 22, 2014
    Posts:
    2,827
    5.551115123125783E-18 and 5.55111512312578E-17 differ by an order of magnitude, so you've got something different between these cases other than the behavior of toString. Probably your literals have different numbers of 0s in them.
     
    Ryiah and Kurt-Dekker like this.
  10. Kurt-Dekker

    Kurt-Dekker

    Joined:
    Mar 16, 2013
    Posts:
    31,368
    Oh come on, what's an order of magnitude between friends!?

    Good spot there @Antistone ... Not sure what's going on in the snippet above, because they actually do have the same number of preceding zeros in the literal. Go figure. Somebody hand-typed something somewhere...
     
  11. thefallengamesstudio

    thefallengamesstudio

    Joined:
    Mar 7, 2016
    Posts:
    672
    Again, I didn't criticize the imprecision of FP numbers as a concept; anybody and their dog understands that you can't store potentially infinite precision on finite hardware. I'm just against C#'s way of converting floats/doubles to string and back when no parameters/specifiers are used.
    I don't care if the displayed string is exactly the same as the double it represents, I just expect that while converting it back using the same specifiers/parameters, which in this case are none, I'll get back the original double.

    I'm designing a generic Table View that's supposed to work with databases mostly, and thus should be agnostic of how precise the double/floats are. So if it's a double, it should behave as a double. No shortcuts.

    I agree with everything, except the never word. When you know all of your doubles weren't obtained as a result of arithmetic operations, you're guaranteed to obtain the expected results when comparing them, because of course their binary representation (which I guess is what's actually being compared by "==") doesn't change.
    Even if it's obtained from arithmetic operations, a particular case is when you compare a double to 0 when you know there's a possibility it could be the result of multiplying another double by 0 - and you don't want to see if it's close to zero, but if it's exactly zero, which is definitely possible in this situation.
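
On the guess about what "==" compares: for doubles it follows IEEE 754 value comparison and does not look at raw bits. A short sketch of the two places where that differs; the EqualityDemo class and its method names are hypothetical.

```csharp
using System;

public static class EqualityDemo
{
    // IEEE 754 value comparison, which is what "==" performs on doubles.
    public static bool ValueEquals(double a, double b)
    {
        return a == b;
    }

    // Comparison of the raw 64-bit patterns.
    public static bool BitEquals(double a, double b)
    {
        return BitConverter.DoubleToInt64Bits(a) == BitConverter.DoubleToInt64Bits(b);
    }
}
```

Positive and negative zero are equal under ValueEquals but not under BitEquals, and NaN is never equal to itself under ValueEquals. For ordinary nonzero finite values the two agree, so the exact-zero check described above behaves as expected.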

    Seems like I either copied it wrong in my evaluator or different CPUs give different results, because now I get a different value, 5.55111512312578E-18, but this is still irrelevant, because double.Parse(d.ToString()) == d still evaluates to False, which is the focus here.

    Made my day.

    And yes, probably I typed something wrong in my evaluator (I used both VS and CSharpPad), but this doesn't change the outcome.
     
  12. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,139
    I would argue, though, that in the context of your original post, it's converting a float/double to a string and parsing it back.

    And I'd personally put parsing a string in the same camp as arithmetic operations and the sort. Since "9.33356" isn't really the binary representation and has to be interpreted into the binary representation (the entire act of parsing). That is unless you specifically serialized it while maintaining its binary representation.

    And @Kurt-Dekker was even the first person to mention that aspect.
     
  13. subramaniyanvgatmoback

    subramaniyanvgatmoback

    Joined:
    Oct 13, 2016
    Posts:
    15
    Thank you so much, your solution helped me a lot.
     
    thefallengamesstudio likes this.