LZF compression and decompression for Unity

Discussion in 'Scripting' started by Agent_007, Sep 26, 2012.

  1. Eric5h5

    Eric5h5

    Volunteer Moderator Moderator

    Joined:
    Jul 19, 2006
    Posts:
    32,300
    The code in post #11 works fine. I use it with a custom editor window and it doesn't freeze the editor. (Unless you mean it blocks until it completes, but I imagine you'd need a very large amount of data to compress before it's noticeable.)

    --Eric
     
  2. ParadoxSolutions

    ParadoxSolutions

    Joined:
    Jul 27, 2015
    Posts:
    291
    At most I'm working with half a megabyte, and the machine I'm running on has 16GB of RAM. Maybe it locks up until it's done, but after waiting a while I've just killed Unity from Task Manager before it got that far. The editor script is just a button that sends a string to be compressed. In the end, writing a custom compressor for Color32[] was enough to bring my strings within the data budget. I probably won't need additional compression methods now, but here's another option for anyone looking: https://github.com/icsharpcode/SharpZipLib
     
  3. Quatum1000

    Quatum1000

    Joined:
    Oct 5, 2014
    Posts:
    856
    All versions of CLZF / CLZF2 contain a small bug because none of them checks the array bounds. If the byte array contains 0 or 1 elements for any reason, you get:

    IndexOutOfRangeException: Index was outside the bounds of the array.
    CLZF2.lzf_compress (System.Byte[] input, System.Byte[]& output) (at Assets/_SystemSettings/CaboTools/CLZF2.cs:109)
    CLZF2.Compress (System.Byte[] inputBytes) (at Assets/_SystemSettings/CaboTools/CLZF2.cs:59)
    test_CLZF2.OnValidate () (at Assets/_Tools/test_CLZF2.cs:22)


    You should add a simple if (inputBytes.Length <= 1) return inputBytes;
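
    A minimal sketch of that guard, written as a wrapper so it does not depend on any particular CLZF2 version. The GuardedCodec class and the delegate parameters are hypothetical; the delegates stand in for CLZF2.Compress and CLZF2.Decompress. It assumes that the decompressing side needs the same check, since a 0- or 1-byte array that was returned unchanged is not valid LZF data.

    ```csharp
    using System;

    public static class GuardedCodec
    {
        // 'compress' is a stand-in for CLZF2.Compress (assumption).
        public static byte[] Compress(byte[] input, Func<byte[], byte[]> compress)
        {
            if (input == null) throw new ArgumentNullException("input");
            // Too short to compress: hand the data back untouched.
            if (input.Length <= 1) return input;
            return compress(input);
        }

        // 'decompress' is a stand-in for CLZF2.Decompress (assumption).
        public static byte[] Decompress(byte[] input, Func<byte[], byte[]> decompress)
        {
            if (input == null) throw new ArgumentNullException("input");
            // Mirror of the guard above: such data was stored uncompressed.
            if (input.Length <= 1) return input;
            return decompress(input);
        }
    }
    ```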
     
    Last edited: Jun 28, 2019
  4. Bl4ckh34d

    Bl4ckh34d

    Joined:
    Apr 19, 2018
    Posts:
    44
    Hey there,

    When decompressing, I get an OverflowException in Line 131

    "tempBuffer = new byte[outputByteCountGuess];"

    Error:
    OverflowException
    CLZF2.Decompress (System.Byte[] inputBytes) (at Assets/Scripts/CLZF2.cs:131)

    Any ideas how to solve it? Compressing works fine but decompressing never works.
    I am compressing the byte[] of two List<GameObject> with tons of objects in each list (map tiles with procedurally generated meshes, etc.). Uncompressed, the save file is usually between 4 and 40MB.
    Is there a limit to what this thing can handle?
     
  5. Bunny83

    Bunny83

    Joined:
    Oct 18, 2010
    Posts:
    832
    I don't quite understand that sentence. This method just compresses a byte array into another byte array which is usually smaller. What has this to do with lists, gameobjects, objects in general or tiles?

    The line number doesn't match the originally posted code in this thread. So which version do you use? Are you sure that you get an OverflowException? This should only be possible when you run the code in a "checked" block.

    As far as I can tell the size shouldn't really matter. Well you would have the usual 2GB object size limit for the arrays but apart from that I don't see any issues (besides the way it handles the growing of the array which is kinda inefficient).

    The two methods in the code are just the pure compression / decompression implementation. The output data does not have any kind of header. The actual LZF format was organised in chunks of up to 64k and each chunk had a separate header. If you want to use this for a save file, I would highly recommend adding an actual header that includes the uncompressed size. This would get rid of the issue that the decompressor doesn't know the uncompressed size, which it currently "guesses". It's also highly recommended that you include some kind of version in your header in case you want to switch to a different format in the future.
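
    A minimal sketch of such a header, assuming a layout of marker, version, uncompressed size, and payload length in front of the compressed bytes. The SaveContainer class, the marker value, and the field order are all made up for illustration and are not part of CLZF2 or of the original LZF format.

    ```csharp
    using System;
    using System.IO;

    public static class SaveContainer
    {
        const uint Magic = 0x46564153;   // arbitrary marker value (assumption)
        const ushort Version = 1;        // bump when the layout changes

        // Prepends a small header to already-compressed data.
        public static byte[] Wrap(byte[] compressed, int uncompressedSize)
        {
            using (var ms = new MemoryStream())
            using (var w = new BinaryWriter(ms))
            {
                w.Write(Magic);
                w.Write(Version);
                w.Write(uncompressedSize);
                w.Write(compressed.Length);
                w.Write(compressed);
                w.Flush();
                return ms.ToArray();
            }
        }

        // Validates the header and returns the compressed payload.
        public static byte[] Unwrap(byte[] file, out int uncompressedSize, out ushort version)
        {
            using (var ms = new MemoryStream(file))
            using (var r = new BinaryReader(ms))
            {
                if (r.ReadUInt32() != Magic)
                    throw new InvalidDataException("Not a save file.");
                version = r.ReadUInt16();
                uncompressedSize = r.ReadInt32();
                int length = r.ReadInt32();
                byte[] payload = r.ReadBytes(length);
                if (payload.Length != length)
                    throw new EndOfStreamException("Save file is truncated.");
                return payload;
            }
        }
    }
    ```

    With the uncompressed size available up front, a decompressor could allocate its output buffer once instead of guessing and growing it, though that would require changing the Decompress method to accept the size.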

    edit:
    Ahh you used the second version he posted -.-
     
    Last edited: Oct 6, 2020
    Bl4ckh34d likes this.
  6. Bunny83

    Bunny83

    Joined:
    Oct 18, 2010
    Posts:
    832
    Ohh, I forgot to mention that when you're using this code you have to ship the original license with your game as well. The BSD license is not as "free" as, for example, the MIT license. You can do whatever you want with the code, but even when it is only distributed in binary form you have to reproduce the license, either in the documentation of your application or as a separate license file. For example, the Unity editor itself uses tons of such libraries and provides all the licenses in the legal.txt file inside Data\Resources of the editor installation folder. For example:
    C:\Program Files\Unity\Hub\Editor\2020.1.6f1\Editor\Data\Resources\legal.txt


    I just wanted to mention it just to avoid potential issues.
     
    Bl4ckh34d likes this.
  7. Bl4ckh34d

    Bl4ckh34d

    Joined:
    Apr 19, 2018
    Posts:
    44
    Sorry, I should have posted more info regarding this issue. I will add my code here, so you get a better idea of what I tried to do:

    Code (CSharp):
    void MeshDump(List<GameObject> obj1, List<GameObject> obj2)
    {
        System.Runtime.Serialization.Formatters.Binary.BinaryFormatter bf = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
        if(!System.IO.Directory.Exists(Application.dataPath + "/Saves/"))
        {
            System.IO.Directory.CreateDirectory(Application.dataPath + "/Saves/");
        }
        System.IO.FileStream fs = new System.IO.FileStream(Application.dataPath + "/Saves/" + mapName + ".save", System.IO.FileMode.Create);
        SerializableMeshInfo smi = new SerializableMeshInfo(obj1, obj2);
        var objectified = (System.Object)smi;
        var compressed = CLZF2.Compress(ObjectToByteArray(objectified));
        bf.Serialize(fs, compressed);
        fs.Close();
    }

    void MeshUndump()
    {
        if(!System.IO.File.Exists(Application.dataPath + "/Saves/" + mapName + ".save"))
        {
            Debug.LogError("meshFile.dat file does not exist.");
            return;
        }
        System.Runtime.Serialization.Formatters.Binary.BinaryFormatter bf = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
        System.IO.FileStream fs = new System.IO.FileStream(Application.dataPath + "/Saves/" + mapName + ".save", System.IO.FileMode.Open);
        var objectified = bf.Deserialize(fs);
        var decompressed = ByteArrayToObject(CLZF2.Decompress(ObjectToByteArray(objectified)));
        SerializableMeshInfo smi = (SerializableMeshInfo)decompressed;

        List<GameObject>[] gObjs = smi.GetObject();
        tiles.AddRange(gObjs[0]);
        blocks.AddRange(gObjs[1]);
        tiles.ForEach(tile => tile.transform.parent = transform);
        blocks.ForEach(tile => tile.transform.parent = transform);
        gObjs[0].Clear();
        gObjs[1].Clear();
        fs.Close();
    }

    private byte[] ObjectToByteArray(System.Object obj)
    {
        if (obj == null)
            return null;
        System.Runtime.Serialization.Formatters.Binary.BinaryFormatter bf = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
        System.IO.MemoryStream ms = new System.IO.MemoryStream();
        bf.Serialize(ms, obj);
        return ms.ToArray();
    }

    private System.Object ByteArrayToObject(byte[] arrBytes)
    {
        System.IO.MemoryStream memStream = new System.IO.MemoryStream();
        System.Runtime.Serialization.Formatters.Binary.BinaryFormatter binForm = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
        memStream.Write(arrBytes, 0, arrBytes.Length);
        memStream.Seek(0, System.IO.SeekOrigin.Begin);
        System.Object obj = (System.Object) binForm.Deserialize(memStream);
        return obj;
    }
    I was trying to compress my serialized gameObjects (blocks of a voxel map made up by randomly generated mesh colliders, meshes and tons of Vector3 arrays saving the vertices of faces that are currently not visible but could be visible when a neighboring block is mined).
    All this data is stored in serialized lists which I dump into a file. I was trying to strap the compression/decompression in between the saving/loading process, since my save files are about 4-40MB per small to medium map, which I felt is a lot and might be compressible.

    Concerning the license: Thanks for the reminder!
    I currently only develop as a hobby and not in any professional capacity. I use the steep learning curve I come across as an opportunity to learn new things, be it modelling, skinning, rigging, coding, creating sound tracks, and all that comes with it. Currently I am looking into saving procedural maps. I stumbled across this wonderful piece of code but was struggling to implement it. Maybe the code works perfectly fine and I messed up (very likely).

    Please let me know if you need more info or if you have another idea how I could compress save files containing runtime-generated mesh info. I am a bloody beginner in this area and try to learn new things in my spare time.

    About the header: I have honestly no idea how to write a header or about the inner workings behind it. My level of coding is quite inferior but if you can point me into the right direction I might learn a new thing or two :)
    Thank you for your help so far!
     
    Last edited: Oct 7, 2020
  8. Bunny83

    Bunny83

    Joined:
    Oct 18, 2010
    Posts:
    832
    You have messed up a lot in your code ^^. First of all, you use the BinaryFormatter to convert your "SerializableMeshInfo" class into a binary stream and then into a byte array. That's what your "ObjectToByteArray" method does. You then compress the byte array. However, at the end you use the BinaryFormatter again to write your compressed byte array to the file. This is your issue: the data you write to the file is actually a single serialized byte array.

    In your MeshUndump method you do "var objectified = bf.Deserialize(fs);". This actually returns your compressed byte array. You then put that byte array through your "ObjectToByteArray" method again, which makes no sense: it wraps the compressed data in another layer of BinaryFormatter output. Of course trying to decompress that newly formatted binary stream fails.

    You should get rid of the BinaryFormatter for writing the result to your file. You only need it inside your "ObjectToByteArray" and "ByteArrayToObject" methods.

    You should do something like this:

    Code (CSharp):
    void MeshDump(List<GameObject> obj1, List<GameObject> obj2)
    {
        string path = Application.dataPath + "/Saves/";
        string fileName = path + mapName + ".save";
        if(!System.IO.Directory.Exists(path))
        {
            System.IO.Directory.CreateDirectory(path);
        }

        SerializableMeshInfo smi = new SerializableMeshInfo(obj1, obj2);
        byte[] data = ObjectToByteArray(smi);
        byte[] compressed = CLZF2.Compress(data);
        System.IO.File.WriteAllBytes(fileName, compressed);
    }

    void MeshUndump()
    {
        string fileName = Application.dataPath + "/Saves/" + mapName + ".save";
        if(!System.IO.File.Exists(fileName))
        {
            Debug.LogError(mapName + ".save file does not exist.");
            return;
        }

        byte[] compressed = System.IO.File.ReadAllBytes(fileName);
        byte[] data = CLZF2.Decompress(compressed);
        SerializableMeshInfo smi = (SerializableMeshInfo)ByteArrayToObject(data);

        List<GameObject>[] gObjs = smi.GetObject();
        tiles.AddRange(gObjs[0]);
        blocks.AddRange(gObjs[1]);
        tiles.ForEach(tile => tile.transform.parent = transform);
        blocks.ForEach(tile => tile.transform.parent = transform);
        gObjs[0].Clear();
        gObjs[1].Clear();
    }

    private byte[] ObjectToByteArray(System.Object obj)
    {
        if (obj == null)
            return null;
        var bf = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
        using(var ms = new System.IO.MemoryStream())
        {
            bf.Serialize(ms, obj);
            return ms.ToArray();
        }
    }

    private object ByteArrayToObject(byte[] arrBytes)
    {
        var bf = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
        using(var memStream = new System.IO.MemoryStream(arrBytes))
        {
            object obj = bf.Deserialize(memStream);
            return obj;
        }
    }

    It seems a bit strange that your "MeshDump" method takes two lists as parameters, but your "MeshUndump" magically uses member variables of your class. This is inconsistent design. Your two lines

    Code (CSharp):
    gObjs[0].Clear();
    gObjs[1].Clear();
    seem to be useless, since the whole smi object as well as the lists are only referenced by local variables and will be up for garbage collection once the method returns.
     
    Quatum1000 and Bl4ckh34d like this.
  9. Bl4ckh34d

    Bl4ckh34d

    Joined:
    Apr 19, 2018
    Posts:
    44
    Thank you for all the clarifications and for rewriting the code! I tried implementing it and it works like a charm!
    If I understood it right, Unity only allows me to save Serialized data to drive for my save games. So player characters, buildings etc. should be created as Prefab and stored in the project and only player states, stats, positions etc. should get serialized to keep performance up, right?

    Concerning the license: If I leave the code as you provided it, with the huge text comment on the top, will that do or do I still have to create an additional legal.txt file in my project?

    Again thank you for taking your time and helping me!
     
  10. Bunny83

    Bunny83

    Joined:
    Oct 18, 2010
    Posts:
    832
    That's exactly the point. The MIT license just requires that the copyright notice remains in the source files. However, when you ship the binaries (your ready compiled game) you don't have to add anything. The BSD license, on the other hand, requires that the license text is reproduced in some way even in binary form. The compiled code does not contain the license text anymore since it's just a comment in a source file. So you either need to ship a separate file, or somehow add some sort of credits inside the game itself where people who receive a copy of your game can see it. It's common to add some notes about which part of your application those licenses apply to. If you scroll through Unity's legal.txt you will see what I mean. They always introduce a section explaining which part / library the following license applies to.

    About your first question: Well, Unity doesn't restrict you in any way. However, Unity doesn't have any built-in mechanism to serialize gameobject hierarchies. We don't know what your "SerializableMeshInfo" class does internally. From the name it seems that you took this code and modified it? You really should rename your class to reflect what it actually does or represents. Since your class seems to handle whole Lists of gameobjects I have the strong feeling that this goes a bit beyond "mesh data".

    Note that the BinaryFormatter is not really a great way to serialize objects. Yes it's a quick and dirty solution and it just works out of the box. However the format is verbose which makes it unnecessarily large and has several security issues. Even Microsoft does not recommend its usage. It's usually better to roll your own save format with just a BinaryWriter / BinaryReader. This also simplifies the creation of a "header" as mentioned earlier, gives more control over how each bit of your data is serialized and is in almost all cases smaller than the binary formatter.
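
    As a rough illustration of rolling your own format, here is a sketch that writes an array of vertices with a BinaryWriter and reads it back with a BinaryReader. Vec3 is a stand-in for UnityEngine.Vector3 so the example compiles outside Unity, and the VertexIO class and its layout (a 4-byte count followed by three floats per vertex) are assumptions made for this example.

    ```csharp
    using System.IO;

    // Stand-in for UnityEngine.Vector3 (assumption, so the sketch runs without Unity).
    public struct Vec3
    {
        public float x, y, z;
        public Vec3(float x, float y, float z) { this.x = x; this.y = y; this.z = z; }
    }

    public static class VertexIO
    {
        // Layout: int32 count, then x, y, z as 32-bit floats per vertex.
        public static byte[] Write(Vec3[] vertices)
        {
            using (var ms = new MemoryStream(4 + vertices.Length * 12))
            using (var w = new BinaryWriter(ms))
            {
                w.Write(vertices.Length);
                foreach (var v in vertices)
                {
                    w.Write(v.x);
                    w.Write(v.y);
                    w.Write(v.z);
                }
                w.Flush();
                return ms.ToArray();
            }
        }

        public static Vec3[] Read(byte[] data)
        {
            using (var ms = new MemoryStream(data))
            using (var r = new BinaryReader(ms))
            {
                int count = r.ReadInt32();
                var result = new Vec3[count];
                for (int i = 0; i < count; i++)
                {
                    // Read in the same order the fields were written.
                    float x = r.ReadSingle();
                    float y = r.ReadSingle();
                    float z = r.ReadSingle();
                    result[i] = new Vec3(x, y, z);
                }
                return result;
            }
        }
    }
    ```

    In this layout each vertex costs exactly 12 bytes plus one 4-byte count per array, with no type names or other metadata in the output.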

    Though without knowing what information you store inside your SerializableMeshInfo class I can't really help you any further. Serializing a whole scene is a really tricky thing to do and in most cases not necessary. Usually you just gather the information that is really required to reconstruct your scene / objects. Though that might need some additional bookkeeping of assets inside your project to be able to refer to those built-in assets from your serialized data.
     
    Bl4ckh34d likes this.
  11. sngdan

    sngdan

    Joined:
    Feb 7, 2014
    Posts:
    1,041
    Bl4ckh34d and Bunny83 like this.
  12. Bunny83

    Bunny83

    Joined:
    Oct 18, 2010
    Posts:
    832
    :) That's a neat abuse of the fact that png uses deflate to compress its image data. Since the png loader / encoder is already part of Unity it doesn't add code overhead.
     
    Bl4ckh34d likes this.
  13. Bl4ckh34d

    Bl4ckh34d

    Joined:
    Apr 19, 2018
    Posts:
    44
    So basically my map is made up of tile blocks for the floor (procedurally generated cubes with 8 vertices, 16 tris and 24 UVs) and minable blocks on top of them. Depending on the detail level of the subdivisions, each block face (north, east, south, west and top) has up to 30 polys. Maps can range from 25x25 blocks and 25x25 tiles up to 150x150 blocks and tiles.

    Since the blocks resemble rock with a more natural randomized face surface, I need at least 16 polys per face to make them look decent. To save resources, I only render the visible faces but since the base shape is not perfectly cubic but slightly distorted, I save the information of invisible faces inside the gameObjects and render additional faces, when needed (when a neighboring block is mined).

    So having each face's UVs, tris and vertices saved in an array, plus a backup of all those arrays, for hundreds of blocks with quite a lot of vertices creates a decent amount of data. I couldn't come up with a better way of saving all the information.
    Possibly it would be easier to write an algorithm that kind of recreates the lost mesh info for the currently not visible Block sides, but I couldn't come up with one so far.
     
  14. Bunny83

    Bunny83

    Joined:
    Oct 18, 2010
    Posts:
    832
    But why would you save the actual mesh in a save file? You usually just store the block type and any necessary parameters per block (like a rotation or other modifications) that are required to reconstruct the mesh when you load the save back.

    Most voxel games, like Minecraft, separate the actual terrain / voxel data from the mesh representation. Minecraft only stores the block type as well as 4 bits of metadata per block. MC has additional information for so-called block entities, which allow arbitrary additional information. This is only used for a few special blocks like chests, dispensers, hoppers, ... Almost all block types are just stored by their block id (in the past it was a hardcoded id, now it's a dynamic id based on a palette of block id names), the 4-bit metadata value (used for various things like the orientation of a block, growth state of a plant, signal strength of redstone, ...) and some other information like the two light levels. The actual geometry of each block is defined by each block type. So when a chunk is loaded MC just looks up the block type and recreates the actual mesh based on this information.

    That doesn't mean it has to be done that way. However it massively reduces the information you have to store.
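
    To make the idea concrete, here is a sketch that packs a block type id and a 4-bit metadata value into a single 16-bit value. The BlockPacking class and the 12-bit / 4-bit split are assumptions chosen for this example and are not Minecraft's actual storage layout.

    ```csharp
    using System;

    public static class BlockPacking
    {
        // Assumed layout: upper 12 bits = block type id, lower 4 bits = metadata.
        public static ushort Pack(int blockId, int meta)
        {
            if (blockId < 0 || blockId > 0x0FFF)
                throw new ArgumentOutOfRangeException("blockId");
            if (meta < 0 || meta > 0xF)
                throw new ArgumentOutOfRangeException("meta");
            return (ushort)((blockId << 4) | meta);
        }

        // Recovers the block type id from the upper 12 bits.
        public static int BlockId(ushort packed)
        {
            return packed >> 4;
        }

        // Recovers the metadata from the lower 4 bits.
        public static int Meta(ushort packed)
        {
            return packed & 0xF;
        }
    }
    ```

    At 2 bytes per block, one 150x150 layer would take 22,500 x 2 = 45,000 bytes before any compression.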
     
  15. Bl4ckh34d

    Bl4ckh34d

    Joined:
    Apr 19, 2018
    Posts:
    44
    [Attached image: upload_2020-10-9_15-23-4.png]

    That is the problem: the actual mesh is random enough that its vertices need to be stored and restored for reconstruction. As you can see to the right, there are quite a lot of vertices. I could just save the block types and rebuild the map, but building the map sometimes takes up to a minute, while loading it only takes seconds.

    MC only features uniform blocks, all with the same count of vertices, faces and uvs.
    My blocks are on the bottom side perfectly rectangular to fit the Tile size but on the top side randomized and then fitted to each other to become seamless. Then their surfaces get subdivided a couple of times until the desired level of detail is reached (of course only visible faces). The 8 base vertices before subdivision are still being stored in backup variables to make sure that when a neighbor is destroyed, new faces can be drawn matching the distorted shape of the block, then the newly revealed faces get subdivided by an algorithm.
    This allows for a more natural look than minecraft but still preserves playability for a tile based game.

    [Attached image: upload_2020-10-9_15-30-24.png]

    As you can see, the data is plentiful and at least the Backup Arrays are needed to be saved - for each single block.
    My serialization class does exactly what you figured out and stems from exactly that tutorial as well. I guess you are right and I should definitely rename it, but I am not a native speaker and actually have no clue what is really meant by serialization. I understood it as the process of breaking Vector3 etc. down into single ints that can be stored byte-wise.

    [Attached image: upload_2020-10-9_15-35-50.png]

    I'm not a programmer by profession but just self-taught and do it for a hobby, never earned a single dime with coding.
    I'm just saying this so you know why my code looks like spaghetti bolognese :D
     
  16. sngdan

    sngdan

    Joined:
    Feb 7, 2014
    Posts:
    1,041
    Did you try the PNG compression on your 40MB file? How much is it compressed?
     
    Bl4ckh34d likes this.
  17. Bl4ckh34d

    Bl4ckh34d

    Joined:
    Apr 19, 2018
    Posts:
    44
    I haven't had the time yet, since it's only a spare-time project and I am quite busy with my actual job these days. I will have a look into it asap.

    How do I uncompress the PNG later on? I think the code snippet you posted back then only shows how to compress it.
     
  18. sngdan

    sngdan

    Joined:
    Feb 7, 2014
    Posts:
    1,041
    I have not used Unity in a long time; I'm just following the DOTS forum on mobile (and this thread came up in alerts). I don't have the decode code at hand, but just Google it. It is straightforward: you read the bytes back from the PNG.

    You should save the length of the serialized array with the save data, as the decompressed size will be equal or longer and you need to trim the end. Just test whether the result (the compressed size) is acceptable; otherwise you have to reconsider the data layout as suggested here.
     
    Bl4ckh34d likes this.
  19. sngdan

    sngdan

    Joined:
    Feb 7, 2014
    Posts:
    1,041
  20. Bunny83

    Bunny83

    Joined:
    Oct 18, 2010
    Posts:
    832
    I had a bit of time and tried your png approach ^^. However the used texture format didn't work properly. When using RGBA32 at some point Unity is either interpreting it as ARGB32 or is converting it to ARGB32. When switching to ARGB32 it worked properly. Here are two methods to compress / decompress:

    Code (CSharp):
    1.  
    2.     public static byte[] Compress(byte[] aSource)
    3.     {
    4.         using (var memoryStream = new System.IO.MemoryStream(aSource.Length + 10))
    5.         using (var writer = new System.IO.BinaryWriter(memoryStream))
    6.         {
    7.             writer.Write(aSource.Length);
    8.             writer.Write(aSource);
    9.             aSource = memoryStream.ToArray();
    10.         }
    11.         int dim = Mathf.CeilToInt(Mathf.Sqrt(Mathf.CeilToInt(aSource.Length / 4)));
    12.         // Has to be ARGB32 because RGBA32 does somehow turn into ARGB32
    13.         Texture2D t = new Texture2D(dim, dim, TextureFormat.ARGB32, false, false);
    14.         byte[] temp = new byte[dim * dim * 4];
    15.         System.Array.Copy(aSource, temp, aSource.Length);
    16.         t.LoadRawTextureData(temp);
    17.         var data = t.EncodeToPNG();
    18.         Destroy(t);
    19.         return data;
    20.     }
    21.  
    22.     public static byte[] Decompress(byte[] aSource)
    23.     {
    24.         Texture2D t = new Texture2D(1, 1, TextureFormat.ARGB32, false, false);
    25.         t.LoadImage(aSource);
    26.         aSource = t.GetRawTextureData();
    27.         Destroy(t);
    28.         using (var memoryStream = new System.IO.MemoryStream(aSource))
    29.         using (var reader = new System.IO.BinaryReader(memoryStream))
    30.         {
    31.             int size = reader.ReadInt32();
    32.             return reader.ReadBytes(size);
    33.         }
    34.     }
    35.  
    This code properly round trips. However, in my test I initialized the array with random data, which didn't compress very well -.- It ended up larger than the source, though it might work properly with actual data. Keep in mind that this approach has some size limitations. What exactly they are is hard to tell. 40MB of data is a 3163x3163 pixel image. I'm sure that Unity has a limit on how large textures can be. I think the limit is somewhere around 16k, which would be equal to about 1GB. It should also be noted that the conversion is quite slow. Speed was one of the main points of LZF: it's not the best in compression size but it's quite fast.
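
    The size arithmetic can be sketched in plain C#, assuming 4 bytes per pixel, a 4-byte length prefix and a square texture. The PngPackingMath class and its method names are made up for this example, and the 16k texture limit is only the guess from the paragraph above.

    ```csharp
    using System;

    public static class PngPackingMath
    {
        // Smallest square texture side that holds the payload plus its 4-byte length prefix.
        public static int TextureDim(long payloadBytes)
        {
            long total = payloadBytes + 4;
            long pixels = (total + 3) / 4;   // round up to whole 4-byte pixels
            int dim = (int)Math.Ceiling(Math.Sqrt(pixels));
            // Guard against floating-point rounding for very large inputs.
            while ((long)dim * dim < pixels) dim++;
            return dim;
        }

        // Largest payload that fits into a maxDim x maxDim texture.
        public static long MaxPayload(int maxDim)
        {
            return (long)maxDim * maxDim * 4 - 4;
        }
    }
    ```

    For a payload of 40,000,000 bytes TextureDim gives 3163, and MaxPayload(16384) gives 1,073,741,820 bytes, which is the "about 1GB" figure.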
     
    sngdan and Bl4ckh34d like this.
  21. Rachan

    Rachan

    Joined:
    Dec 3, 2012
    Posts:
    105
    Thanks, but it has a lot of errors.
     
  22. Quatum1000

    Quatum1000

    Joined:
    Oct 5, 2014
    Posts:
    856
    What file or data are you trying to compress?
    Can you explain what you are trying to do and which errors occur?
     