
Fast loading of float or vectors

Discussion in 'Scripting' started by MadAboutPandas3, Aug 15, 2019.

  1. MadAboutPandas3

    Hi,

    What is the fastest way to load 1,000,000 positions (x, y, z)?
    What is the best way to compress the data?

    The positions are all between 0 and 4,000 and only need a precision of 1 cm. This needs to run on all platforms, including mobile.

    Currently we have a ScriptableObject with a byte[] array. On load it is converted with System.BitConverter to a list of Vector4 for use. Saving works the other way around.

    Kind Regards,
    Chris
     
  2. palex-nx

    Create a union struct with a byte array and a bunch of vectors. Fill that array from a file stream, save the vectors, and repeat until the end of the file. For compression you could use SharpZipLib, for instance.
     
  3. Yoreki

    A Vector3 has 3 floats of 32 bits = 4 bytes each, so 12 bytes of data. A million of those means 12 MB of data. To the best of my knowledge, there is not a whole lot you can do to influence how fast any given volume of data is loaded, other than changing the hardware or, of course, decreasing the volume of data.

    The way floats store values is that you basically have a 7-digit number (the 23-bit mantissa) and use the rest of the bits to represent / move the decimal point. However, you use up a lot of this precision since your values go up to 4000. The highest precision you can get out of a float at all would be something along the lines of 3999.001. That said, you only need a precision like 3999.01, which means 6 digits instead of 7, and thus only ~20 instead of 23 bits of information for the digits. You could therefore potentially save some data by writing your own way of saving only the relevant bits, but doing so for a 3/32 ~= 10% saving seems like a lot of effort.
    (It's actually possible to save ~40%, since you don't need the other bits making up the float; see the comment below by jvo3dc.)
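
    To put a rough number on the float precision discussed above, here is a minimal sketch in Python (picked only for brevity; it is not Unity code). `float32_spacing` is a hypothetical helper that steps the IEEE 754 bit pattern by one to find the gap between a value and the next representable 32-bit float.

```python
import struct

def float32_spacing(x: float) -> float:
    """Gap between x (as a float32) and the next larger float32.

    Assumes x is positive and finite, so adding 1 to the raw bit
    pattern yields the next representable value.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]      # raw 32-bit pattern
    here = struct.unpack("<f", struct.pack("<I", bits))[0]
    above = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return above - here

print(float32_spacing(1.0))     # 2**-23
print(float32_spacing(3999.0))  # 2**-12, about 0.00024
```

    Near 4000 the gap is 2^-12, roughly 0.00024, so a plain float is already a good deal finer than the 1 cm that is required.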

    Other than that there is compression, which would presumably decrease the amount of data you have to write / read by quite a lot, but it also uses a decent amount of CPU power. Since you said you want it to run on PC (where compression wouldn't be a huge problem, but loading 12 MB shouldn't be either) but also on mobile (where decompression + load may or may not take longer than simply loading), you'd have to see what is better for the hardware you target.

    Also, I'm not entirely sure if there is any overhead when storing data. Storing only the actual bits you need for your floats should be possible. If there is overhead by default, store the bits directly. You can check this by looking at how large the stored file is.
    So in summary: only save the bits you actually need, without overhead (= 12 MB); potentially decrease the number of bits based on the required precision (~10.8 MB); and if you have CPU power to spare, compress the data (= ? MB depending on the compression method; find a good balance between the size of the stored data and the CPU time required for compression).
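
    The size versus CPU trade-off can be measured rather than guessed. The following is a minimal sketch in Python using the standard `zlib` module as a stand-in for whatever compressor is actually used (SharpZipLib was suggested earlier); the sample data and the helper name `compress_positions` are made up for illustration, and real position data may compress very differently.

```python
import struct
import time
import zlib

def compress_positions(values, level=6):
    """Pack floats as little-endian float32 and deflate them."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return raw, zlib.compress(raw, level)

# Made-up sample: 1000 positions (3000 coordinates) on a 1 cm grid.
values = [i * 0.01 for i in range(3000)]
raw, packed = compress_positions(values)

start = time.perf_counter()
restored_raw = zlib.decompress(packed)
elapsed = time.perf_counter() - start

restored = struct.unpack(f"<{len(values)}f", restored_raw)
print(f"raw {len(raw)} bytes, compressed {len(packed)} bytes, "
      f"decompressed in {elapsed * 1000:.3f} ms")
```

    Running the same kind of measurement on the target devices, with the real data, is what would settle whether decompression costs more time than the smaller read saves.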

    That said, why are you storing a million positions anyway? If loading them is a problem you have to consider, then I have to ask: do you have to store them in the first place? With a million values I assume these are not hand-crafted, so you generated them or got them from somewhere at some point. Can't you do that when you need them? Generating a million values can be faster than loading them, since it happens in RAM rather than reading from an HDD or SSD, which is much slower.
     
    Last edited: Aug 16, 2019
  4. jvo3dc

    You can save a little more than 10%, I'd say. With a range of 0 to 4,000 in steps of 0.01, you have 400,001 distinct values. That comes down to 19 bits as an integer. 16 bits would be easier to handle, but 19 bits is still doable. That would lead to about 7.1 MB of data without compression (3 × 19 = 57 bits per position).
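
    A minimal sketch of the 19-bit idea, in Python for brevity (not Unity code). The function names and the little-endian, least-significant-bit-first layout are assumptions made for this example; the thread does not prescribe a layout.

```python
BITS = 19          # smallest width that holds 400,001 distinct values
SCALE = 100        # store centimeters: 1 unit = 0.01
MAX_Q = 4000 * SCALE

def pack_coords(coords):
    """Quantize values in [0, 4000] to 1 cm and pack 19 bits per value."""
    out = bytearray()
    acc = 0        # bit accumulator
    nbits = 0      # number of valid bits currently in acc
    for c in coords:
        q = round(c * SCALE)
        if not 0 <= q <= MAX_Q:
            raise ValueError(f"coordinate out of range: {c}")
        acc |= q << nbits
        nbits += BITS
        while nbits >= 8:          # flush whole bytes
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:                      # flush the final partial byte
        out.append(acc & 0xFF)
    return bytes(out)

def unpack_coords(data, count):
    """Inverse of pack_coords: read `count` 19-bit values back as floats."""
    coords = []
    acc = 0
    nbits = 0
    pos = 0
    mask = (1 << BITS) - 1
    for _ in range(count):
        while nbits < BITS:        # refill until one full value is available
            acc |= data[pos] << nbits
            pos += 1
            nbits += 8
        coords.append((acc & mask) / SCALE)
        acc >>= BITS
        nbits -= BITS
    return coords

packed = pack_coords([0.0, 12.34, 4000.0])
print(len(packed), unpack_coords(packed, 3))

# Size for one million positions (three coordinates each), in bytes.
bytes_for_million = (3_000_000 * BITS + 7) // 8
print(bytes_for_million)
```

    One position takes 57 bits, so three coordinates fit in 8 bytes, and a million positions come to 7,125,000 bytes before any compression.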
     
  5. Suddoha

    Do you convert the values one by one?

    Have you tested / profiled whether it is the loading or the conversion that takes the time?

    For the conversion, you should be able to use unsafe code, which saves those millions of calls that convert value by value. The only thing you'll need to take care of is the bitness of the various platforms. Or similarly, use the struct layout approach that was suggested earlier, which could even save the time for building the vectors.
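
    The difference between converting value by value and reinterpreting the whole buffer at once can be sketched as follows, in Python for brevity (not Unity code). Both function names are hypothetical. In C# the rough counterparts would be things like `Buffer.BlockCopy` into a `float[]` or an unsafe pointer cast, but that mapping is an assumption and not something taken from the thread. The byte-order check stands in for the kind of platform difference a bulk reinterpretation has to account for.

```python
import struct
import sys
from array import array

def convert_one_by_one(raw: bytes):
    """One conversion call per float, like calling a converter per value."""
    return [struct.unpack_from("<f", raw, off)[0]
            for off in range(0, len(raw), 4)]

def convert_bulk(raw: bytes):
    """Reinterpret the whole little-endian buffer as float32 in one step."""
    floats = array("f")            # C float, 4 bytes on common platforms
    floats.frombytes(raw)          # single bulk copy, no per-value calls
    if sys.byteorder == "big":     # data is little-endian; fix up if needed
        floats.byteswap()
    return floats

raw = struct.pack("<4f", 1.0, 2.5, -3.0, 4000.0)
print(convert_one_by_one(raw))
print(list(convert_bulk(raw)))
```

    Both paths yield the same values; the bulk version simply avoids a million-plus individual conversion calls.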

    Though I'm wondering, why are you saving it as a byte array in your ScriptableObject when you need it to be Vector4 anyway? Why don't you just use a Vector4 array?
     
    Last edited: Aug 15, 2019
  6. Joe-Censored

    You might consider just moving this work to another thread and kicking it off as soon as the game launches. The data would then most likely be ready by the time it is needed, without any delay on the main thread.
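
    The pattern looks roughly like this, sketched in Python for brevity (not Unity code). `load_positions` and the in-memory stand-in for the file contents are hypothetical. In Unity, engine objects generally have to be touched from the main thread, so under that assumption only the raw decoding would move to the worker.

```python
from array import array
from concurrent.futures import ThreadPoolExecutor

def load_positions(raw: bytes):
    """Decode a buffer of native-order float32 values (the slow part)."""
    floats = array("f")
    floats.frombytes(raw)
    return floats

# Stand-in for the bytes that would normally come from disk.
raw = array("f", [1.0, 2.0, 3.0]).tobytes()

executor = ThreadPoolExecutor(max_workers=1)
future = executor.submit(load_positions, raw)   # kicked off at startup

# ... other startup work runs here on the main thread ...

positions = future.result()   # blocks only if decoding has not finished yet
executor.shutdown()
print(list(positions))
```

    If the data is requested before the worker has finished, `future.result()` waits for it, so the worst case is no slower than loading on demand.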
     
  7. Yoreki

    You are right. I was fixated on storing the information as a float, but considering the fixed decimal precision, OP should not need the float's sign and exponent bits (the first 9 bits) at all. Simply saving the number "as an integer" and reconstructing the actual value by casting it to a float and dividing by 100 to get the coordinate should be the most compact way to store the data. Nice idea!
     
  8. Joe-Censored

    Hmmm, I have a feeling all the work of chopping up the binary data into 19-bit "integers", converting them into actual 32-bit integers, converting to float, and then dividing by 100 is not going to produce any actual performance gains over just reading 32-bit floats. I'd be interested in finding out if that is actually the case though.

    Though it will certainly compress the data. The OP wants both compressed data and the fastest performance, which are often opposing goals. Might need to choose which is more important.
     
  9. Yoreki

    True, but we have been given very little context to work with. Loading ~12 MB should not be that huge of a deal either way, so there is probably some underlying performance issue or constraint here that we are not aware of / can't work on without seeing the code or being given more information. However, technically the question was how to load a million values in the fastest way possible, and what the best way to compress the data is.
    If nothing else, we got that covered now, hehe. But yeah, for any practical use you'd have to test if it actually helps.
     