
Resolved Best Image Format?

Discussion in 'Scripting' started by AnimalMan, Oct 6, 2022.

  1. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    Good Morning,

    So I have a bunch of animation images for a character in my game. I have imported images into my games before from outside of the Unity directory, and I was wondering:

    since a Texture2D must be created first before tex can LoadImage(filebytedata), is there any way to obtain the dimensions of the image before creating the texture?

    I intend to create my own format to read these images conveniently, combining all of the frames into a single file which I can read to obtain this kind of data more easily, in addition to modifying said data, for example key colours.

    If I open a png file in a text editor, for example, everything is encoded such that I cannot quickly obtain the dimension data in a single parse from a specific line.
    Are there other common readable formats where such data can already be obtained?

    Or am I better off writing my own format?
    The purpose of this being that, since a single character can have 60 or more frames, I'd rather interact with a single file to obtain a single character's entire frame content instead of giving each character its own directory full of these files, or otherwise having a mixed bag of jumbled frames and sorting through it. I feel compacting the data of multiple frames into a single file is the way to go.

    Any thoughts?
     
    Last edited: Oct 6, 2022
  2. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    Here's a quick example of the problem:

    Code (CSharp):
    public void IMPORT_BULK_IMAGES()
    {
        IMAGEDIRECTORIES = Directory.GetFiles("C:/Users/SCHar/Desktop/COMPILER");
        List<Texture2D> COMPACT_LIST = new List<Texture2D>();
        List<string> FRAME_ID = new List<string>();
        for (int i = 0; i < IMAGEDIRECTORIES.Length; i++)
        {
            if (File.Exists(IMAGEDIRECTORIES[i]))
            {
                var fileData = File.ReadAllBytes(IMAGEDIRECTORIES[i]);
                // So I must read bytes ^^ to obtain the texture dimensions?
                var tex = new Texture2D(UNKNOWN_DIMENSION_X, UNKNOWN_DIMENSION_Y);
                // And now I load the image >>
                tex.LoadImage(fileData);
                COMPACT_LIST.Add(tex); // was missing: the list was passed on empty
                // Extract the file name (minus the extension) from the path
                var KA = IMAGEDIRECTORIES[i].ToCharArray();
                string FILENAME = "";
                for (int c = KA.Length - 5; c > 0; c--)
                {
                    if (KA[c] == '\\')
                    {
                        break;
                    }
                    else
                        FILENAME = KA[c] + FILENAME;
                }
                FRAME_ID.Add(FILENAME);
            }
        }
        CREATE_COMPACT_FILE(COMPACT_LIST, "INFANTRY_A", FRAME_ID);
    }
    Onwards it will go ->

    Code (CSharp):
    public void CREATE_COMPACT_FILE(List<Texture2D> COMPACT_LIST, string COMPACT_FILE_NAME, List<string> FRAME_ID)
    {
        // Here interpret FRAME_ID (which may be Contains("W_") for walk, e.g.)
    }
    Ending with

    Code (CSharp):
    public void IMPORT_COMPACT_IMAGES() // HOW I WILL READ THE FINAL COMPACT FILE
    {
        IMAGEDIRECTORIES = Directory.GetFiles("C:/Users/SCHar/Desktop/OBJECTIVE/DATA/ART");
        for (int i = 0; i < IMAGEDIRECTORIES.Length; i++)
        {
            if (File.Exists(IMAGEDIRECTORIES[i]))
            {
                var MOVE = IMAGEDIRECTORIES[i];
                var MOVELINE = File.ReadAllLines(MOVE);
                for (int l = 0; l < MOVELINE.Length; l++)
                {
                    // I'll rebuild the images from the colour data, tags, and texture dimensions
                }
            }
        }
    }
    [Attached image: Example.png]
     
    Last edited: Oct 6, 2022
  3. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    [Attached image: FORMATS.png]

    Other format examples. I just need Vector4 colour data, dimension data, and the potential for tag locations. None of this binary encryption; I'm effectively going for a non-binary format.
     
  4. angrypenguin

    angrypenguin

    Joined:
    Dec 29, 2011
    Posts:
    15,635
    That's because it's not text, it's raw binary. However, it's a published specification, so you can look up the spec and figure out how to read it.

    According to the spec at FileFormat.info, the width and height of a PNG image are the first two DWORDs of the IHDR chunk, which immediately follows the PNG signature. Counting the 8-byte signature plus the chunk's 4-byte length and 4-byte type fields, the width sits at byte offsets 16 through 19 and the height at offsets 20 through 23, both as big-endian 32-bit integers.
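A minimal sketch of reading those header bytes in plain C# (the class and method names here are mine; the offsets follow the PNG spec as described above):

```csharp
using System;

// Sketch: read a PNG's width and height straight from the file bytes,
// before creating any texture. Per the PNG spec the file starts with an
// 8-byte signature, then the IHDR chunk (4-byte length, 4-byte type),
// so width and height sit at byte offsets 16 and 20 as big-endian uint32s.
public static class PngHeader
{
    public static (int width, int height) ReadSize(byte[] data)
    {
        if (data.Length < 24)
            throw new ArgumentException("Too short to be a PNG");
        int width  = (data[16] << 24) | (data[17] << 16) | (data[18] << 8) | data[19];
        int height = (data[20] << 24) | (data[21] << 16) | (data[22] << 8) | data[23];
        return (width, height);
    }
}
```

Usage would be something like `var (w, h) = PngHeader.ReadSize(File.ReadAllBytes(path));` — no Texture2D needed just to learn the dimensions.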

    But...
    I assume that your art tool doesn't already support some kind of packaged format, otherwise you wouldn't be looking into this.

    With that in mind, my real question is: how does doing this help you?

    I'd just output folders of images from my art tool, and write an Editor extension in Unity to quickly make animation assets (or whatever you're using) from a folder of images. Yes, you still have a folder of images, but aside from when you draw them you shouldn't have to mess with the files.

    I fully understand that having associated objects packaged in one file is neater, but one thing to note is that it's also worse for most version control systems. Every time you change a frame you'd need to re-upload the whole package.

    I'm not usually a 2D developer, so please say so if I'm overlooking something.

    - - -

    If you really want to do this just for fun, here's what I'd do. (Edit: see next post.)

    In an art tool I could write a plugin for, I'd write one which does the following:
    1. Check / enforce that all images are same format.
    2. Export all files as .PNG to some temp location.
    3. Create a little JSON / XML / whatever file with metadata in it - name of animation, number of frames, timing, whatever else is relevant.
    4. Package all of the files, including the metadata, using some compression format which is also available in Unity.
    5. Write an importer for the Unity Editor which does the above backwards, but puts the extracted pixels into Texture2D assets.


    This does strike me as a fair bit of work for a programmer to create (and likely maintain) to save other people from doing a relatively small amount of work. Again, though, I could be overlooking something, as I'm not a 2D dev.
     
    Last edited: Oct 6, 2022
    Kiwasi and AnimalMan like this.
  5. angrypenguin

    angrypenguin

    Joined:
    Dec 29, 2011
    Posts:
    15,635
    On slightly further thought, I wouldn't even bother with the compression, because PNG is already compressed.

    1. Check / enforce that all images are same format.
    2. Export all files as .PNG to some temp location.
    3. Create a little JSON / XML / whatever string with metadata in it - name of animation, number of frames, timing, whatever else is relevant, and a list of the sizes of each PNG file in bytes.
    4. Open a file stream, write the metadata to it, then the contents of each PNG file.
    5. Write an importer for the Unity Editor which reads the metadata, then uses it to read each PNG in sequence.
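The steps above could be sketched with plain .NET streams. This is only an illustration: the container layout, class names, and length-prefixed framing are my own assumptions, not an established format.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

// Sketch of a trivial "metadata + concatenated PNGs" container:
// a length-prefixed metadata string, a frame count, then each PNG
// stored as (length, bytes). Reading just walks the same layout back.
public static class FramePack
{
    public static byte[] Pack(string metadata, List<byte[]> pngFiles)
    {
        using (var ms = new MemoryStream())
        using (var w = new BinaryWriter(ms, Encoding.UTF8))
        {
            w.Write(metadata);           // length-prefixed UTF8 string
            w.Write(pngFiles.Count);     // number of frames
            foreach (var png in pngFiles)
            {
                w.Write(png.Length);     // size of this PNG in bytes
                w.Write(png);            // the raw PNG bytes, untouched
            }
            w.Flush();
            return ms.ToArray();
        }
    }

    public static (string metadata, List<byte[]> pngFiles) Unpack(byte[] packed)
    {
        using (var ms = new MemoryStream(packed))
        using (var r = new BinaryReader(ms, Encoding.UTF8))
        {
            string metadata = r.ReadString();
            int count = r.ReadInt32();
            var files = new List<byte[]>();
            for (int i = 0; i < count; i++)
                files.Add(r.ReadBytes(r.ReadInt32()));
            return (metadata, files);
        }
    }
}
```

Because the PNG bytes are stored verbatim, each extracted frame can still be handed to `Texture2D.LoadImage` in the Editor importer.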

    Note that before an image can be used on the GPU it has to be loaded into a texture format or fully decompressed. When you put an image file in the Editor it does this at build time, so your game doesn't have to do it at runtime. If you're loading images from a format such as PNG at runtime then you're making more work (they need to be decompressed at least, and then possibly put into another format) and this will slow down load times.

    Does that matter? That'll depend entirely on your game.

    Can you improve on it? Definitely, but the complexity of that is more than I can write in a couple of forum posts while waiting for a game to update. ;)
     
    Last edited: Oct 6, 2022
    AnimalMan likes this.
  6. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    It's one of those cases where you need a chocolate bar to make a chocolate bar.

    For simplicity, it can act as a bulk image modifier: I could set all reds to yellow and have them prebaked in memory, instead of running a real-time shader to convert these colours, as another alternative.

    I know in the end I can use Unity's png importer to make the file I want, but with those specific byte values you mentioned, I am about to pursue this route right now.
     
  7. angrypenguin

    angrypenguin

    Joined:
    Dec 29, 2011
    Posts:
    15,635
    But a shader is both simpler and more flexible...
     
    Kiwasi likes this.
  8. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    You are probably correct. But I need to know for sure. I could also arrange such frames into a custom atlas of sorts, convert back to png, and edit the entire sheet.
     
  9. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
  10. Bunny83

    Bunny83

    Joined:
    Oct 18, 2010
    Posts:
    4,112
    I think I don't quite understand what the problem is :) The image data will be replaced on load anyway, no matter what the dimensions of the Texture2D were before.

    What exact problem do you want to solve? I once created my PNGTools, which provides a way to access all chunks stored in a png file. Though I quickly created it in order to change the PPI settings of a png file, as was asked in this question, so currently I just parse the chunks into separate parts; the chunks themselves are not parsed further. The "IHDR" chunk, which is the header chunk and always comes first, contains the width and height in its first 8 bytes. It's quite trivial to read those out. Though the question remains: what exact problem should this solve? Why not just load the image and then read the width and height?

    It is common to create spritesheets, yes. But it's usually not done to simplify loading, but to get better runtime performance: using a spritesheet means you only have a single texture in video memory and you only choose a subsection of that single texture. That means several objects can use the exact same material and can be batched.

    I just want to make this clear: when you use LoadImage on a Texture2D, the image data is completely replaced. So it's common to create the Texture2D with dimensions (1, 1); after the LoadImage call it's replaced anyway.
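In code, the idiom described above is simply this (Unity API, shown for illustration; `path` is a hypothetical file location, and this only runs inside the engine):

```csharp
// LoadImage replaces the texture's contents AND its dimensions,
// so the initial size passed to the constructor is a throwaway.
var fileData = File.ReadAllBytes(path);  // 'path': any PNG/JPG on disk
var tex = new Texture2D(1, 1);           // placeholder size; replaced below
tex.LoadImage(fileData);                 // tex now has the image's real size
Debug.Log(tex.width + " x " + tex.height);
```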
     
    AnimalMan likes this.
  11. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    Oh okay, now I feel stupid. Thank you for the everyday genius that you provide, Bunny. Honestly. How you know everything I can't imagine.


    But I did get as far as this, and I may continue the coding adventure just to see what is at the end of this nook and cranny. [Attached image: Oo.png]
     
  12. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    Code (CSharp):
    for (int i = 0; i < 1000; i++)
    {
        if (!File.Exists("C:/Users/SCHar/Desktop/" + "INFANTRY_" + i))
        {
            var COMPACTFILE = File.Create("C:/Users/SCHar/Desktop/" + "INFANTRY_" + i);
            var SW = new StreamWriter(COMPACTFILE);
            for (int c = 0; c < COMPACT_LIST.Count; c++)
            {
                SW.WriteLine(FRAME_ID[c]);
                SW.WriteLine("d " + COMPACT_LIST[c].width + " " + COMPACT_LIST[c].height);
                var TA = COMPACT_LIST[c].GetPixels();
                for (int t = 0; t < TA.Length; t++)
                {
                    string LINE = "c " + TA[t].r + " " + TA[t].g + " " + TA[t].b + " " + TA[t].a;
                    SW.WriteLine(LINE);
                }
            }
            SW.Close(); // Close flushes, so per-line Flush calls aren't needed
            Debug.Log("WE DID");
            break;
        }
    }
    Code (CSharp):
    A01 // My Frame ID
    d 28 32 // My Width and Height :)
    c 0 0 0 0 // My Colors etc etc etc
    c 0 0 0 0
    c 0 0 0 0
    c 0 0 0 0
    c 0 0 0 0
    So if a line does not begin with "d " or "c ", I know it is referencing a frame ID.
    So I could now just read the entire thing for the entire data sheet and modify numbers in a single run,
    as perhaps I would create 12 different versions of the same image, where a specific colour is changed at a later time.
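A reader for that layout could be sketched like this (plain C#; the class and field names are my own, and it assumes the floats were written with a dot as the decimal point):

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Sketch: walk "FRAME_ID / d w h / c r g b a" lines back into frames.
public class TextFrame
{
    public string Id;
    public int Width, Height;
    public List<float[]> Colors = new List<float[]>();
}

public static class TextFrameParser
{
    public static List<TextFrame> Parse(string[] lines)
    {
        var frames = new List<TextFrame>();
        TextFrame current = null;
        foreach (var line in lines)
        {
            if (line.StartsWith("d "))           // dimension line
            {
                var p = line.Split(' ');
                current.Width = int.Parse(p[1]);
                current.Height = int.Parse(p[2]);
            }
            else if (line.StartsWith("c "))      // colour line
            {
                var p = line.Split(' ');
                current.Colors.Add(new[]
                {
                    float.Parse(p[1], CultureInfo.InvariantCulture),
                    float.Parse(p[2], CultureInfo.InvariantCulture),
                    float.Parse(p[3], CultureInfo.InvariantCulture),
                    float.Parse(p[4], CultureInfo.InvariantCulture)
                });
            }
            else if (line.Length > 0)            // anything else starts a frame
            {
                current = new TextFrame { Id = line };
                frames.Add(current);
            }
        }
        return frames;
    }
}
```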


    Again, thanks a lot. I'll mark this as resolved.

    To note: the text view of this file opens much faster than a .TIFF.

    I don't know if this is even going to be worthwhile, but what I thought was that we are standing on the shoulders of people who built image formats with the intention that the image be compressed for storage methods such as floppy discs, from a time when space saving of that nature was more valuable. I am also fairly sure that certain encodings and encryptions existed initially primarily to protect a software developer's product from exploitation or replication in the early days of software development. Things that are perhaps unnecessary today.
     
    Last edited: Oct 6, 2022
  13. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    In a twist of fate, as it turns out, a single PNG file here is worth 15kb, and there are 55 of them. So that is 825kb.
    My final file size in the new compiled format is 576kb.
    That is 249kb of extra data present just to encode it.

    That is a saving in Bytes of approximately 184,307

    And with that;
    I will speak no more of it.



    Edit
    My mistake

    My file is in fact nearly 42x the size of the original encoded files. But like I said, the preservation of this space, next to my organisation here, is not an issue for me.
     
    Last edited: Oct 6, 2022
  14. Bunny83

    Bunny83

    Joined:
    Oct 18, 2010
    Posts:
    4,112
    This is a complete red herring. Interpreting binary data as text makes no sense. A text viewer has to interpret and organise that data in memory for display. Most text viewers work on a per-line basis, which is completely violated by binary data. Also, as text viewers try to interpret the data as unicode (usually utf8), they hit all sorts of error cases and invalid unicode characters. They may accidentally recognise some rare unicode characters from other languages, which means they have to use a unicode font and load all those character representations in order to display them. Opening binary data in a text viewer just makes no sense. Install a hex editor / viewer if you want to view binary files (personally I use this old hex editor, which is a German product but has an English language pack).

    That doesn't sound right. The "format" you just presented is one of the most inefficient formats possible. It's far worse than BMP files, and those are uncompressed files with minimal header data and just the colour information in binary.

    The single png you showed above had a size of 251 bytes (not kilobytes)! Furthermore, you currently use the default ToString conversion of the float values of each pixel. A component may be encoded in two bytes when the colour information is 0, but any actual value would be something like 9 bytes per component, so more like 40 bytes per pixel where you would only need 4 bytes in an uncompressed binary format. And since you use the default ToString conversion, the numbers are formatted according to your local culture settings: if you run this code on different machines, you get different output. Germany, France, Spain and many other countries (actually about half of the world) use a comma as the decimal point and points as digit-group separators, whereas the English-speaking world usually uses a dot as the decimal point and a comma as the group separator. If you really want to use a text encoding, you should use the invariant culture.
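To illustrate the culture problem concretely (a small sketch using only System.Globalization):

```csharp
using System.Globalization;

// The same float renders differently depending on the machine's culture:
float value = 0.5f;
string german    = value.ToString(CultureInfo.GetCultureInfo("de-DE")); // "0,5"
string invariant = value.ToString(CultureInfo.InvariantCulture);        // "0.5"

// Writing with InvariantCulture (and parsing the same way) makes the
// file identical regardless of which machine produced it:
float back = float.Parse(invariant, CultureInfo.InvariantCulture);      // 0.5f
```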

    That's also wrong ^^. Yes, in the past disk space was more valuable than it is today, and you could argue that it doesn't matter as much now. However, there is much more to it than disk space: sending data over the internet is also a factor. Apart from that, any text format requires parsing of the text, which is always slower than an uncompressed binary format, and the binary format is generally smaller.

    No, this was never the intention, and there is no encryption going on, just compression. Any data needs an encoding. Your format uses a text encoding (ASCII or UTF8), and on top of that the actual numbers are encoded as base-10 decimal numbers which need to be converted in order to be usable.

    Most image formats only support 8 bits per colour channel; there are only very few exceptions with more than 8 bits per channel. That's why, for the majority of image formats, it makes much more sense to represent a colour as a Color32 (that is, 4 bytes). A BMP image uses 3 or 4 bytes per pixel plus a small header (54 bytes).

    What benefit does saving the image in a human-readable format actually give you? I don't quite understand the reasoning behind it.
     
    AnimalMan likes this.
  15. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    Forgive me, but does anybody have any insight into why the PNG format's characters and the TIFF format's characters are cheaper in byte data than numbers written out as plain text characters?
    I have reduced the file size via compression, so it is merely 14x as large as the original PNG, and it contains less literal text. Instead we just store unique colours, and then their index numbers according to dimension.

    [Attached image: 14x.png]

    So presumably if I WriteByte instead of WriteLine, then we should be even smaller and closer to the PNG.
     
    Last edited: Oct 6, 2022
  16. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    Code (CSharp):
    for (int m = 0; m < COMPACT_LIST.Count; m++)
    {
        for (int F = 0; F < FRAME_ID[m].Length; F++)
        {
            BYTE.Add(Convert.ToByte(FRAME_ID[m][F]));
        }
        BYTE.Add(Convert.ToByte('d'));
        // Note: Convert.ToByte throws for dimensions above 255
        BYTE.Add(Convert.ToByte(COMPACT_LIST[m].width));
        BYTE.Add(Convert.ToByte(COMPACT_LIST[m].height));
        for (int a = 0; a < COL[m].Count; a++)
        {
            BYTE.Add(Convert.ToByte('R'));
            // Colour components are floats in 0..1, so scale to 0..255;
            // Convert.ToByte on the raw float would round to 0 or 1
            BYTE.Add((byte)(COL[m][a].r * 255f));
            BYTE.Add((byte)(COL[m][a].g * 255f));
            BYTE.Add((byte)(COL[m][a].b * 255f));
            BYTE.Add((byte)(COL[m][a].a * 255f));
            var C = COL_REFERENCE[m][a];
            BYTE.Add(Convert.ToByte('i'));
            for (int v = 0; v < COL_COORDINATES[m][C].Count; v++)
            {
                var B = COL_COORDINATES[m][C][v].ToString().ToCharArray();
                for (int BA = 0; BA < B.Length; BA++)
                {
                    BYTE.Add(Convert.ToByte(B[BA]));
                }
            }
        }
    }
    for (int b = 0; b < BYTE.Count; b++)
        COMPACTFILE.WriteByte(BYTE[b]); // was BYTE[i]: wrong loop index
    [Attached image: A Reduction.png]

    I can't make much more gains. I'm going to call it off. Interesting learning experience :)
    Thanks Bunny

    I suppose I could iterate through my bytes looking for duplicates, make a new byte list, and reference the index of each duplicate byte, but this is turning into fractal bytes. The conclusion is that unless I am prepared to fractal-byte and god knows what else, I am better off keeping my files loose in the directory ^_^
    Edit: Fractal bytes failed; of course it would result in the same byte count :|
     
    Last edited: Oct 6, 2022
  17. Bunny83

    Bunny83

    Joined:
    Oct 18, 2010
    Posts:
    4,112
    You still have the wrong idea about data stored in files ^^. Data is just data. PNG files do not contain text; they contain just binary data. As you may know, text may be represented in a computer file either as ASCII characters, where each character requires 1 byte (8 bits), or in UTF8, which may use a variable number of bytes per character but is backwards compatible with ASCII for the first 128 characters.
    Have a look at this ASCII table to better understand what we're talking about. A text file does not contain "text". It contains binary data that is interpreted as text. So the word "Hello" would be this:


    Code (CSharp):
    //  01001000 01100101 01101100 01101100 01101111
    //  |  H   | |  e   | |  l   | |  l   | |  o   |
    So human-readable text is just one way to interpret the data in that file. The file itself just contains those 40 bits (5 bytes), which, when interpreted with the ASCII table, come out as the string "Hello". A binary format does not use ASCII characters but stores data directly in those bytes.

    A single byte value (a group of 8 bits) can represent values between 0 and 255. For example, the BMP format stores each colour channel in a single byte. When you store a number as decimal ASCII text, a single digit can only hold the values 0 to 9, so just storing values in the range of a byte as decimal text requires up to 3 characters. Furthermore, you have to insert space characters as well, and a space is just another character (the number 32, or 0x20 in hexadecimal). When storing numbers in decimal form you need a way to separate the values; that's why you inserted those spaces. With a variable length per number you need some sort of separator, otherwise you would not know how to interpret the values. Imagine the values 240, 42, 7. If you write them without spaces as decimal values you get "240427". How would you know to read that string as those 3 numbers? It could be 2 / 404 / 27, or 24 / 042 / 7, or 2404 / 2 / 7, or many other interpretations. So storing numbers as text requires either a separator or a fixed number of digits. If you know the largest values have 3 digits, you could store them as "240042007"; knowing that each value always has 3 digits makes it easy to read those back as 240, 042, 007. Though storing 3 values this way requires 9 bytes, while values in the range 0 to 255 would only need 3 bytes in binary. So it's already 3 times as much data.
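The fixed-width idea can be demonstrated with a tiny sketch (the helper name here is mine):

```csharp
using System;
using System.Collections.Generic;

// "240042007" read as fixed three-digit groups gives 240, 042, 007.
// Without a fixed width (or a separator) the grouping is ambiguous.
static List<int> ReadFixedWidth(string s, int digits)
{
    var values = new List<int>();
    for (int i = 0; i + digits <= s.Length; i += digits)
        values.Add(int.Parse(s.Substring(i, digits)));
    return values;
}
```

`ReadFixedWidth("240042007", 3)` recovers 240, 42 and 7; the same string with a different assumed width would decode to entirely different numbers.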

    On Windows machines a new line (yes, a new line also requires characters) is actually represented by two characters: a "carriage return" (\r or 0x0D) followed by a "line feed" (\n or 0x0A). So each line you produce, even an empty one, requires two extra bytes.

    You actually store floating point values as strings, which can run to around 13 characters as a decimal number, plus the space we just talked about to separate the values. A float inside the computer's memory is represented with just 4 bytes. Have a look at this website: it lets you play around with float values and see all 32 bits (4 bytes) that make up the floating point number. So here, too, you're at over 3 times the required space.

    Though the most important difference is that your data is uncompressed. Compression can reduce the required memory significantly, especially if there's a lot of repetition in the data. That's what compression algorithms do: they search for repeating patterns and essentially replace them with a single copy plus some additional information about how often the pattern occurs and how to put it back together. Note that this is just a very rough simplification; different algorithms use quite different techniques to reduce the size of the data.

    In the end it's all just about information entropy, but I guess this goes too far now ^^. Note that other file formats, gif for example, use a colour palette, and each pixel is just an index into that palette. For black-and-white (or simply two-colour) images that means we can store a single pixel in a single bit, so one byte of data actually contains 8 pixels, and an 8x8 image would only require 8 bytes of raw data. But of course you have to know how to read it. Gif is very limited, as the maximum number of colours in the palette is 256, so it may work well for small pixel-art images, but not for any kind of photo-like image. Usually we work with 24-bit colour images: that's 8 bits per colour channel, so an RGB value is made up of 3 bytes (3*8 bits), which gives us a total of about 16.7 million distinct colours.

    Your format could be improved by using Color32 values, where each component is a byte rather than a float. However, storing it as text would still be extremely inefficient, in both memory and parsing / processing.
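As a sketch of what a Color32-style binary layout could look like (plain C#; the class name, the ushort header, and the little-endian layout are my own assumptions):

```csharp
using System.IO;

// Sketch: one frame as a binary blob: ushort width, ushort height,
// then 4 bytes (r, g, b, a) per pixel. A 28x32 frame would be
// 4 + 28*32*4 = 3588 bytes, versus ~40 bytes per pixel as text.
public static class BinaryFrame
{
    public static byte[] Write(ushort width, ushort height, byte[] rgba)
    {
        using (var ms = new MemoryStream())
        using (var w = new BinaryWriter(ms))
        {
            w.Write(width);   // 2 bytes, little-endian
            w.Write(height);  // 2 bytes, little-endian
            w.Write(rgba);    // already 4 bytes per pixel
            w.Flush();
            return ms.ToArray();
        }
    }
}
```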

    So I'm still wondering what actual problem you are trying to solve with your own format? Again, what benefit does it actually give you?
     
    angrypenguin and AnimalMan like this.
  18. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164

    I'm struggling with setting security on the file to allow for this >>
    Code (CSharp):
    public void COMPRESSFILE(string PATH)
    {
        // Dispose the DeflateStream before the file streams (or use
        // 'using' blocks, as here): closing the DeflateStream is what
        // flushes the remaining compressed bytes into the output file.
        using (FileStream originalFileStream = File.Open(PATH, FileMode.Open))
        using (FileStream compressedFileStream = File.Create(PATH + "_C"))
        using (var Compressor = new DeflateStream(compressedFileStream, CompressionMode.Compress))
        {
            originalFileStream.CopyTo(Compressor);
        }
    }
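For reading the data back, the matching decompression with System.IO.Compression could look roughly like this (a sketch; the class and method names are my own). Note that a DeflateStream opened for compression must be disposed before the compressed bytes are complete:

```csharp
using System.IO;
using System.IO.Compression;

// Sketch: in-memory deflate round trip. Decompress is the inverse of
// the compress step: DeflateStream in Decompress mode wraps the
// compressed input and CopyTo inflates it into the output stream.
public static class Deflate
{
    public static byte[] Compress(byte[] raw)
    {
        using (var output = new MemoryStream())
        {
            // leaveOpen: true so disposing the deflater (which flushes
            // the final compressed block) doesn't close 'output'
            using (var deflater = new DeflateStream(output, CompressionMode.Compress, leaveOpen: true))
                deflater.Write(raw, 0, raw.Length);
            return output.ToArray();
        }
    }

    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var inflater = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            inflater.CopyTo(output);
            return output.ToArray();
        }
    }
}
```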
    To be honest, Bunny, I am only playing around; I am not doing anything meaningfully productive here, I am just interested in how these things work. What you explained up there just now is really well written.

    I will probably bail on this venture pretty soon. I just keep coming back to it this whole day.
    I might try the Color32 in a little bit. This whole type of thing is pretty new to me. I might be able to do all this eventually, but then I still need to decompress, obtain the bytes, and rebuild :) I honestly can't give you a good reason why it's so important that I am attempting this. It's not important; it's just one of those things I guess I just gotta know.

    I guess what you are hinting at is to convert to binary and then write the bytes :O

    :O

    I'll probably sleep on it tonight and decide if I need to be that much of a genius. Do I?
    I'll sleep on it.
     
    Last edited: Oct 6, 2022
  19. orionsyndrome

    orionsyndrome

    Joined:
    May 4, 2014
    Posts:
    3,153
    I'm jumping in the midst of this quality conversation, but just wanted to add (after Bunny explained how text files are actually "text", because that's really required) that PNG format has a long history of being VERY complicated, internally, unlike TIFF which is effectively VERY simple, by contrast.

    Comparing the two directly is very unhealthy. That's my first point, and I'll explain why shortly.
    My second point is that it's important to understand the difference between raw, lossy and lossless.

    Raw image data is what the hardware is hungry for. You stream that bitch, you get its contents on the screen. Typically what we call bitmaps (not the BMP format per se; that's Windows Bitmap) are very near to what you actually want, give or take some rearranging, reordering, flipping axes, and whatnot. You do almost nothing special to prepare such a stream, and voila, there's a show on your screen or your printer. But there is a problem: raw data is HUGE! Even by today's standards.

    Lossless compression works like a zip, it packs HUGE data to much less data, via magic and interdimensional portals, and you can get your information back in a pristine condition, but you have to trade some CPU time (and memory) to push stuff through these portals.

    Lossy compression is even better. It uses smoke and tricks to fool the brain into seeing one thing, by taking advantage of our limited perception, thus it can crunch raw data even further, but the price you pay is that some information has been permanently lost.

    Ideally, why use raw other than for speed, when you can use lossless if you can cope with the coder/decoder performance (aka codec). Lossy is only really good for dumb memes, who cares about that right, let's conserve some bandwidth instead.

    Both PNG and TIFF are lossless formats, by their design and intention, and given their purpose, you'll see why PNG is much more complicated than TIFF.

    PNG was intended as a replacement for GIF, historically, and because it was intended for the internet, it employs some heavy-duty compression that verges on astrophysics (not literally), established from doctoral papers on computer graphics. Sadly, it turns out it doesn't really do its job, not in terms of compression (that part is brilliant) but in terms of decoding performance. GIF was legally in limbo at the time, being patented (and the patent changed hands), and it was also becoming obsolete as hardware grew more powerful, so the authors of PNG were pushed to make something universally acceptable on the internet, and for this they pursued a masterpiece in lossless compression.

    In the end PNG (portable network graphics) format lost to JPEG (joint photographic experts group), which is a lossy compression, but it became so ubiquitous that it seriously destroyed every image it ever encoded. It took us some time to recover from the early internet, and to get better processing power until PNG became as regular as it is nowadays.

    TIFF was historically designed to be very portable and not so much compressed. It's an archival format intended for professional usage in desktop publishing and printing industry by Aldus that was later gobbled up by Adobe. The idea behind it was not the internet, but the actual raw storage, so it supported various color spaces from the get go, and is designed to streamline the color separations as independent bitmaps. But because space was a problem back then (and printing quality demands HUGE data), they've decided to go with a plug-in based model for compressing images in the codec directly. So the format was designed in such a way to encourage 3rd party meddling with all kinds of encoders and compression algorithms, of which only two had actually won. One is LZW, or Lempel-Ziv-Welch image compression algorithm (though there are numerous algorithms in this family, these guys were very productive together and on their own), the other is "plain" ZIP. It turns out LZW works slightly faster with CMYK separations (specifically at or beyond 300 dpi), but the differences in compression itself are negligible. Anyway this explains why TIFF (tagged image file format) has much better internal specs of the header and the way it's laid out, and it even contains some human-readable data.

    To wrap up, I just wanted to share this almost-documentary by Reducible (an excellent YT channel if you're into computer engineering) on how exactly PNG works internally. Near the end it also shows another file format, called QOI (Quite OK Image), created only recently by one independent developer, who managed to pull off only slightly worse compression than PNG's (on a stochastic 32-bit image database in a direct one-to-one comparison), yet the codec itself is trivially simple and turns out to perform incredibly well. By contrast, none of the formats mentioned so far were made by some random guy, but by big consortiums and academia, so there is obviously a lot of room for improvement.

    There you go. If you really want to learn more about this stuff: I was doing what you're doing back in the early 90's, so all of this might be useful to you.
     
    angrypenguin, Bunny83 and AnimalMan like this.
  20. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    That is some quality info my friend. Bravo! I'll take a watch of that doc. I had watched a little documentary on JPEG maybe 6 months ago. Maybe it plays a part in what I'm trying to do here. I will likely chase that PNG byte data a little further. But I expect I will settle for the large readable file, for reasons my brain is too melted to get into right now. But chasing that PNG will be like chasing the dragon. Eventually you would assume that, logically, there could only be one superior way to do it. But

    does size matter?

    Of course, as Bunny had said, it matters for internet-related matters. But in the context in which I would use my own custom file in Unity, does the size of that file matter? Personally, it somewhat matters. That is to say: assuming I complete the project and sold it on my Steam account, if somebody pried it open to extract my graphics, I guess I would prefer that they have a hard time and, at the end, I guess, respect me for it.

    But I am totally 100% not assuming I will get close to making the best file type our people have ever seen. It's highly unlikely, in fact. Or at least, the odds of that are, at the current moment, pretty slim.

    but I will chase that png.
    I will attempt the binary version. And measure size difference.
    I will then hopefully be at some grade to apply and un-apply compression. And then I can really gauge how close I am getting, and how much a binary format byte encode + compression does.

    So far it went from 42x the size, to 14x, then 10x, and I'm assuming with compression maybe it's one third smaller. Or at least I would be very happy if a significant amount came from the compression, leaving me in a situation where the remaining data reduction is a matter of algorithm.
     
  21. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    So, what I have learned:

    Break up the RGBA into individual arrays.
    Look for matching values.
    Tag coordinates with a cheap one-byte marker to represent a run of that value, and that applies uniformly to all arrays across all colour channels on rebuild. So you reduce colour repetition, and repeated location references, per R, G, B and A. I don't need to write coordinate 1 2 3 4 = colour white; I just have to say coordinate 1 repeats 3 times. Oh, and the value of that white on the Red channel was reused here in this Blue channel, etc. etc.
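
    The run idea above can be sketched as plain run-length encoding over one channel; `ChannelRle` is a hypothetical helper name, just to illustrate the principle, not anything from the thread's actual code:

    ```csharp
    using System.Collections.Generic;

    static class ChannelRle
    {
        // Encodes one colour channel as (value, runLength) pairs.
        public static List<(byte value, int run)> Encode(byte[] channel)
        {
            var runs = new List<(byte, int)>();
            int i = 0;
            while (i < channel.Length)
            {
                byte v = channel[i];
                int run = 1;
                while (i + run < channel.Length && channel[i + run] == v) run++;
                runs.Add((v, run));
                i += run;
            }
            return runs;
        }

        // Expands the pairs back into the original channel bytes.
        public static byte[] Decode(List<(byte value, int run)> runs)
        {
            var bytes = new List<byte>();
            foreach (var (v, run) in runs)
                for (int k = 0; k < run; k++) bytes.Add(v);
            return bytes.ToArray();
        }
    }
    ```

    The decode side is the part worth writing first: if a round trip through Encode/Decode doesn't reproduce the input exactly, the format has lost data.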

    But I have 10 mins left on the vid. Just wanted to write this one down while it's hot in my head ^



    Yeah, I see. I see. So I see indeed. I may be able to achieve something pretty good, but I am not trying to break a world record here. Maybe I can just get the code to a state where I can progress. I guess I could drop the MS compressor and persist with the:

    binary conversion; check for loops; assign a negative binary value for the loopers; convert to bytes and see the results. If at this point the file is comfortably small, I will go ahead and try to rebuild the correct binary values before converting back into usable values to rebuild the image.
     
    Last edited: Oct 6, 2022
  22. Bunny83

    Bunny83

    Joined:
    Oct 18, 2010
    Posts:
    4,112
    How exactly did you arrive at that conclusion? ^^ It's actually more the other way round. TIFF is an extremely flexible format which can even work as a container format to include JPEG data. Also, TIFF supports tons of "tags" and specifications and different compression algorithms. You mentioned that TIFF also supports ZIP compression. You do realise that the ZIP format uses the deflate algorithm, the same one used in PNG? Deflate wasn't invented for PNG; it was invented by Phil Katz long before.

    To quote wikipedia:
    TIFF supports a hell of a lot of different pixel bit formats and color spaces. PNG, on the other hand, is restricted to RGB images.

    PNG actually is quite similar to TIFF, as it also consists of named chunks, though the standard is much simpler and doesn't have that many variations.
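
    For reference, since the original question was how to get the dimensions before creating the Texture2D: the IHDR chunk is always first in a PNG, so width and height sit at fixed byte offsets (signature is 8 bytes, then 4-byte chunk length, 4-byte "IHDR" type, then width and height as 4-byte big-endian integers). A minimal sketch, assuming a well-formed PNG; `PngHeader` is a hypothetical helper, not a Unity or .NET API:

    ```csharp
    using System;

    static class PngHeader
    {
        // Reads width/height straight from the IHDR chunk of PNG file bytes.
        // Width is at offsets 16..19, height at 20..23, both big-endian.
        public static (int width, int height) ReadSize(byte[] png)
        {
            if (png.Length < 24)
                throw new ArgumentException("Not a complete PNG header.");
            int width  = (png[16] << 24) | (png[17] << 16) | (png[18] << 8) | png[19];
            int height = (png[20] << 24) | (png[21] << 16) | (png[22] << 8) | png[23];
            return (width, height);
        }
    }
    ```

    So you can read the file bytes once, pull the size out, and only then create the Texture2D and call LoadImage on the same byte array.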

    Note that PNG wasn't meant to replace GIF. Since GIF was one of the most popular formats back then, they wanted a format that didn't have patent issues like GIF did. So the intention wasn't to replace GIF but to provide an alternative. GIF can do things PNG can't (animations, for example) and PNG can do things GIF can't (proper color images with more than 256 colors).

    Note that PNG is actually simple enough that almost all the details are explained on its Wikipedia page. TIFF is so complicated that its wiki page only contains rough descriptions of some of the various features and a list of supported compression algorithms.

    Deflate is one of the go-to standards when we talk about compression nowadays. It's used literally everywhere. That's why PNG support can be found almost everywhere. Gzip and deflate are the most widely supported transfer compression algorithms for HTTP (though Brotli will probably take over at some point; it's also just a variant of LZ77).
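
    Since .NET exposes deflate directly via `System.IO.Compression.DeflateStream`, you can apply the same algorithm PNG builds on (PNG wraps it in a zlib container) to any byte payload; a minimal round-trip sketch:

    ```csharp
    using System.IO;
    using System.IO.Compression;

    static class Deflate
    {
        // Compresses a byte array with raw deflate.
        public static byte[] Compress(byte[] data)
        {
            using var output = new MemoryStream();
            using (var deflate = new DeflateStream(output, CompressionMode.Compress))
                deflate.Write(data, 0, data.Length);
            return output.ToArray();
        }

        // Inflates it back to the original bytes.
        public static byte[] Decompress(byte[] data)
        {
            using var input = new DeflateStream(new MemoryStream(data), CompressionMode.Decompress);
            using var output = new MemoryStream();
            input.CopyTo(output);
            return output.ToArray();
        }
    }
    ```

    Handy for measuring how much of a custom format's size reduction actually comes from the entropy coder versus from the format's own layout.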

    Don't get me wrong, I really like older formats. I've written this GIF loader, just for fun. I like to dive into file formats and the neat solutions they came up with in the past (I remember loading Dune 2 PAKs some 25 years ago). Though I stay away from TIFF if possible :). Yes, TIFF allows higher quality images; that's why it was, and is, widely used in the professional industry. Also, a lot of custom, industry-specific extensions have been developed for TIFF, so they continue using it. Even formats like GIF or PNG allow extensions in pretty much the same way, but almost nobody does it because they usually want maximum compatibility. When I was writing my GIF loader I was thinking about using GIF as a savegame format. We could store a screenshot at the save point as the actual image and use extension blocks to store the actual save data :) That way the image could still be opened in any image program but still work as a savegame file ^^. Pretty pointless feature, but hey.
     
  23. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    Code (CSharp):
    1. for (int m = 0; m < COMPACT_LIST.Count; m++)
    2.                 {
    3.                     for (int F = 0; F < FRAME_ID[m].Length; F++)
    4.                     {
    5.                         BYTE.Add(Convert.ToByte(FRAME_ID[m][F]));
    6.                     }
    7.                     BYTE.Add(Convert.ToByte(COMPACT_LIST[m].width));
    8.                     BYTE.Add(Convert.ToByte(COMPACT_LIST[m].height));
    9.                     for (int a = 0; a < COL[m].Count; a++)
    10.                     {
    11.                         BYTE.Add(Convert.ToByte(COL[m][a].r));
    12.                         BYTE.Add(Convert.ToByte(COL[m][a].g));
    13.                         BYTE.Add(Convert.ToByte(COL[m][a].b));
    14.                         BYTE.Add(Convert.ToByte(COL[m][a].a));
    15.                         var C = COL_REFERENCE[m][a];
    16.                         BYTE.Add(Convert.ToByte('i'));
    17.                         bool SkipValue = false;
    18.                         float NEXTVALUE = -1;
    19.                         for (int v = 0; v < COL_COORDINATES[m][C].Count; v++)
    20.                         {
    21.                             var B = COL_COORDINATES[m][C][v].ToString().ToCharArray();
    22.  
    23.                             if (v != 0)
    24.                             {
    25.                                 if (v + 1 < COL_COORDINATES.Count)
    26.                                 {
    27.                                     if (COL_COORDINATES[m][C][v + 1] == COL_COORDINATES[m][C][v])
    28.                                         BYTE.Add(Convert.ToByte('-'));
    29.                                     else
    30.                                     {
    31.                                         if (COL_COORDINATES[m][C][v] > 99)
    32.                                         {
    33.                                             for (int BA = 0; BA < B.Length; BA++)
    34.                                             {
    35.                                                 float VALUE = -1;
    36.                                                 float.TryParse(B[BA].ToString(), out VALUE);
    37.                                                 BYTE.Add(Convert.ToByte(B[BA]));
    38.                                                 BYTE.Add(Convert.ToByte('+'));
    39.                                             }
    40.                                         }
    41.                                         else
    42.                                             BYTE.Add(Convert.ToByte(COL_COORDINATES[m][C][v]));
    43.                                     }
    44.                                 }
    45.                             }
    46.                             else
    47.                             {
    48.                                 for (int BA = 0; BA < B.Length; BA++)
    49.                                 {
    50.                                     float VALUE = -1;
    51.                                     float.TryParse(B[BA].ToString(), out VALUE);
    52.                                     BYTE.Add(Convert.ToByte(B[BA]));
    53.                                 }
    54.                             }
    55.                         }
    56.                     }
    57.                 }
    Byter.png

    Not had time to read your reply just yet.

    I misunderstood everything I previously wrote; please check the algorithm for me and scrutinize.

    Important bit
    Code (CSharp):
    1.  
    2.                         for (int v = 0; v < COL_COORDINATES[m][C].Count; v++)
    3.                         {
    4.                             var B = COL_COORDINATES[m][C][v].ToString().ToCharArray();
    5.                             if (v != 0)
    6.                             {
    7.                                 if (v + 1 < COL_COORDINATES.Count)
    8.                                 {
    9.                                     if (COL_COORDINATES[m][C][v + 1] == COL_COORDINATES[m][C][v])
    10.                                         BYTE.Add(Convert.ToByte('-'));
    11.                                     else
    12.                                     {
    13.                                         if (COL_COORDINATES[m][C][v] > 99)
    14.                                         {
    15.                                             for (int BA = 0; BA < B.Length; BA++)
    16.                                             {
    17.                                                 float VALUE = -1;
    18.                                                 float.TryParse(B[BA].ToString(), out VALUE);
    19.                                                 BYTE.Add(Convert.ToByte(B[BA]));
    20.                                                 BYTE.Add(Convert.ToByte('+'));
    21.                                             }
    22.                                         }
    23.                                         else
    24.                                             BYTE.Add(Convert.ToByte(COL_COORDINATES[m][C][v]));
    25.                                     }
    26.                                 }
    27.                             }
    28.                             else
    29.                             {
    30.                                 for (int BA = 0; BA < B.Length; BA++)
    31.                                 {
    32.                                     float VALUE = -1;
    33.                                     float.TryParse(B[BA].ToString(), out VALUE);
    34.                                     BYTE.Add(Convert.ToByte(B[BA]));
    35.                                 }
    36.                             }
    I note a greater-than-99 value, and if our next byte equals our current byte, then we basically put a single '-' byte. It maybe checks out with image content, amount of background space, number of colours, etc. It's midnight; maybe I am crazy right now.

    Maybe some junk code in there as well. Sorry.
     
    Last edited: Oct 6, 2022
  24. angrypenguin

    angrypenguin

    Joined:
    Dec 29, 2011
    Posts:
    15,635
    What does that have to do with image size?

    Anyway, for development it's not just about how many bytes the finished file takes. If you're using version control (and any software developer worth their salt should be) then you're not just storing the final image, you're storing every version of it.

    So if you've got 55 frames and you tweak one, now you're storing 56 frames. But if you bundle all of those together in one file and then tweak one, instead of 56 you've got 110. And for your stated use case of changing colours, every time you change a colour in the files, it's another 55 images' worth of space eaten by your version history.

    For a 2D game that'll get huge, fast.

    Compare that to a shader-based approach, as you mentioned earlier. Only one colour value changes. Basically no load-time impact. If you want to use the same images with different colours, you can with just a Material; no new copy of the files needed. And the shader itself is probably trivially cheap.

    Playing with reading / writing data is fun, and it's great to see people willing to dig into it. Most people treat data as black magic and assume that if there isn't a library it can't be done. So carry on with that for fun if you'd like. But when you get back to your keyed colour thing... give the shader a go. ;)
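
    To illustrate the colour-key idea on the CPU side (a shader does the same comparison per fragment on the GPU), here is a minimal sketch over a raw RGBA buffer. `PaletteSwap` and the packed-uint keys are purely illustrative, not a Unity API:

    ```csharp
    using System.Collections.Generic;

    static class PaletteSwap
    {
        // Replaces exact key colours in an RGBA byte buffer (4 bytes per pixel).
        // Keys and replacements are colours packed as 0xRRGGBBAA uints.
        public static void Apply(byte[] rgba, Dictionary<uint, uint> map)
        {
            for (int i = 0; i + 3 < rgba.Length; i += 4)
            {
                uint c = (uint)(rgba[i] << 24 | rgba[i + 1] << 16 | rgba[i + 2] << 8 | rgba[i + 3]);
                if (map.TryGetValue(c, out uint r))
                {
                    rgba[i]     = (byte)(r >> 24); // R
                    rgba[i + 1] = (byte)(r >> 16); // G
                    rgba[i + 2] = (byte)(r >> 8);  // B
                    rgba[i + 3] = (byte)r;         // A
                }
            }
        }
    }
    ```

    The shader version avoids even this one-off pass: the texture stays untouched and only a material property changes.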
     
    AnimalMan likes this.
  25. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    Sorry, here is a cleaner one, junk removed (I think).

    Edit: even cleaner;
    Code (CSharp):
    1.  
    2.  
    3.                 for (int m = 0; m < COMPACT_LIST.Count; m++)
    4.                 {
    5.                     for (int F = 0; F < FRAME_ID[m].Length; F++)
    6.                     {
    7.                         BYTE.Add(Convert.ToByte(FRAME_ID[m][F]));
    8.                     }
    9.                     BYTE.Add(Convert.ToByte(COMPACT_LIST[m].width));
    10.                     BYTE.Add(Convert.ToByte(COMPACT_LIST[m].height));
    11.                     for (int a = 0; a < COL[m].Count; a++)
    12.                     {
    13.                         BYTE.Add(Convert.ToByte(COL[m][a].r));
    14.                         BYTE.Add(Convert.ToByte(COL[m][a].g));
    15.                         BYTE.Add(Convert.ToByte(COL[m][a].b));
    16.                         BYTE.Add(Convert.ToByte(COL[m][a].a));
    17.                         var C = COL_REFERENCE[m][a];
    18.                         BYTE.Add(Convert.ToByte('i'));
    19.                         for (int v = 0; v < COL_COORDINATES[m][C].Count; v++)
    20.                         {
    21.                             var B = COL_COORDINATES[m][C][v].ToString().ToCharArray();
    22.                             if (v != 0)
    23.                             {
    24.                                 if (v + 1 < COL_COORDINATES.Count)
    25.                                 {
    26.                                     if (COL_COORDINATES[m][C][v + 1] == COL_COORDINATES[m][C][v])
    27.                                         BYTE.Add(Convert.ToByte('-'));
    28.                                     else
    29.                                     {
    30.                                         if (COL_COORDINATES[m][C][v] > 99)
    31.                                         {
    32.                                             for (int BA = 0; BA < B.Length; BA++)
    33.                                             {
    34.                                                 BYTE.Add(Convert.ToByte(B[BA]));
    35.                                                 BYTE.Add(Convert.ToByte('+'));
    36.                                             }
    37.                                         }
    38.                                         else
    39.                                             BYTE.Add(Convert.ToByte(COL_COORDINATES[m][C][v]));
    40.                                     }
    41.                                 }
    42.                             }
    43.                             else
    44.                             {
    45.                                 if (COL_COORDINATES[m][C][v] > 99)
    46.                                 {
    47.                                     for (int BA = 0; BA < B.Length; BA++)
    48.                                     {
    49.                                         BYTE.Add(Convert.ToByte(B[BA]));
    50.                                         BYTE.Add(Convert.ToByte('+'));
    51.                                     }
    52.                                 }
    53.                                 else
    54.                                     BYTE.Add(Convert.ToByte(COL_COORDINATES[m][C][v]));
    55.                             }
    56.                         }
    57.                     }
    58.                 }
    59.    
    60.  
    61.  
    Looks like v + 1 won't be able to turn back into the value it actually should be if it does not equal Length.

    The clean algorithm checks out; the byte increase is cool. If it can be read and produce the image, I'll write it tomorrow. Maybe I have made an ultra-noob mistake that I am unable to see, and I have lost a lot of byte data that is irreversibly unobtainable. Maybe there are too many + in the hundreds beat, and not enough + (or any at all) for a tens beat.
    Nice.png

    I hope I am progressing. Anyway, I am self-centered and only focussing on myself. I'll read y'all's stuff but I probably sleep first :)

    If you want to test with my image, here it is:
    A01.png
    But I think it looks all spick and span, bar a few minor individual byte handlings: for the tens on values > 9, and then for values > 99, a different byte symbol, '-' & '+', so we know the next three bytes are 100s, or the next two are 10s.

    this is what i did with compact list

    Code (CSharp):
    1.  
    2.                 for (int m = 0; m < COMPACT_LIST.Count; m++)
    3.                 {
    4.                     COL_COORDINATES.Add(new List<List<int>>());
    5.                     COL.Add(new List<Color>());
    6.                     COL_REFERENCE.Add(new List<int>());
    7.  
    8.                     var TA = COMPACT_LIST[m].GetPixels();
    9.                     for (int t = 0; t < TA.Length; t++)
    10.                     {
    11.                         if (!COL[COL.Count - 1].Contains(TA[t]))
    12.                         {
    13.                             COL_COORDINATES[COL_COORDINATES.Count - 1].Add(new List<int>());
    14.                             if (!COL_COORDINATES[COL_COORDINATES.Count - 1][COL_COORDINATES[COL_COORDINATES.Count - 1].Count - 1].Contains(t))
    15.                             {
    16.                                 COL_COORDINATES[COL_COORDINATES.Count - 1][COL_COORDINATES[COL_COORDINATES.Count - 1].Count - 1].Add(t);
    17.                                 COL_REFERENCE[COL_REFERENCE.Count - 1].Add(COL_COORDINATES[COL_COORDINATES.Count - 1].Count - 1);
    18.                             }
    19.                                 COL[COL.Count - 1].Add(TA[t]);
    20.                         }
    21.                         else
    22.                         {
    23.                             if (!COL_COORDINATES[COL_COORDINATES.Count - 1][COL_COORDINATES[COL_COORDINATES.Count - 1].Count - 1].Contains(t))
    24.                             {
    25.                                 COL_COORDINATES[COL_COORDINATES.Count - 1][COL[COL.Count - 1].IndexOf(TA[t])].Add(t);
    26.                             }
    27.                         }
    28.                     }
    29.                 }
    But this is all BS if i cant reassmble the thing properly
     
    Last edited: Oct 7, 2022
  26. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    I see what’s missing
    All the data will be missing

    Line 24 needs to reference COL_COORDINATES[m][C].Count; it only checks COL_COORDINATES.Count at v + 1.

    noob mistake
     
  27. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    I just divide values > 99 by 10 and put a '-' symbol so I know it was divided by 10:
    Code (CSharp):
    1.  
    2.                         for (int v = 0; v < COL_COORDINATES[m][C].Count; v++)
    3.                         {
    4.                             var B = COL_COORDINATES[m][C][v].ToString().ToCharArray();
    5.  
    6.                             if (v != COL_COORDINATES[m][C].Count - 1)
    7.                             {
    8.                                 if (v + 1 < COL_COORDINATES[m][C].Count)
    9.                                 {
    10.                                     if (COL_COORDINATES[m][C][v + 1] == COL_COORDINATES[m][C][v])
    11.                                         BYTE.Add(Convert.ToByte('+'));
    12.                                     else
    13.                                     {
    14.                                         if (COL_COORDINATES[m][C][v] > 99)
    15.                                         {
    16.                                             BYTE.Add(Convert.ToByte('-'));
    17.                                             BYTE.Add(Convert.ToByte(COL_COORDINATES[m][C][v] / 10));
    18.                                         }
    19.                                         else
    20.                                             BYTE.Add(Convert.ToByte(COL_COORDINATES[m][C][v]));
    21.                                     }
    22.                                 }
    23.                             }
    24.                             else
    25.                             {
    26.                                 if (COL_COORDINATES[m][C][v] > 99)
    27.                                 {
    28.                                     BYTE.Add(Convert.ToByte('-'));
    29.                                     BYTE.Add(Convert.ToByte(COL_COORDINATES[m][C][v] / 10));
    30.                          
    31.                                 }
    32.                                 else
    33.                                     BYTE.Add(Convert.ToByte(COL_COORDINATES[m][C][v]));
    34.                             }
    35.                         }
    36.                     }
    After the repair, this new byte divider is the only compression occurring. Result is 1737 bytes. So we are back up again; I was too excited. We are back up again, but I think I can almost live with it.
    I am sorry for being so spammy. I just get a little over-excited; I should better control my urges.
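
    One caution on the divide-by-10 marker: integer division drops the remainder (123 / 10 is 12, and 12 * 10 is 120), so the exact index can't be rebuilt from the stored byte. Writing the index as two raw bytes is both smaller than ASCII digits and lossless; a sketch with a hypothetical helper name, not the thread's actual code:

    ```csharp
    using System.Collections.Generic;

    static class IndexCodec
    {
        // Any pixel index 0..65535 fits losslessly in exactly two bytes.
        public static void WriteIndex(List<byte> bytes, int index)
        {
            bytes.Add((byte)(index & 0xFF));        // low byte
            bytes.Add((byte)((index >> 8) & 0xFF)); // high byte
        }

        // Reassembles the index from the two bytes at the given offset.
        public static int ReadIndex(List<byte> bytes, int offset)
        {
            return bytes[offset] | (bytes[offset + 1] << 8);
        }
    }
    ```

    This also removes the 999-pixel ceiling mentioned below: two bytes cover any index up to 65535 with no extra marker symbols.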
    Willing_To_Accept.png

    Yes, I believe I will live with it, and I will speak no more of it. 6x larger than a PNG, but I am confident it will re-assemble. No more words will I speak of the matter. I will append that I cannot, with this algorithm, deal with a file larger than 999 pixels, unless I add another marker to indicate division by 1000, 10000, 100000, etc. We are entering the realms of powers of ten (or negative powers, as they say).

    Code (CSharp):
    1.  
    2. if (COL_COORDINATES[m][C][v] > 99)
    3. {
    4.    if (COL_COORDINATES[m][C][v] > 999) // Remember this is an Index belonging to a COL
    5.    {
    6.          BYTE.Add(Convert.ToByte('@'));
    7.          BYTE.Add(Convert.ToByte(COL_COORDINATES[m][C][v] / 100));
    8.    }
    9.    else
    10.    {
    11.          BYTE.Add(Convert.ToByte('-'));
    12.          BYTE.Add(Convert.ToByte(COL_COORDINATES[m][C][v] / 10));
    13.    }
    14. }
    15.  
     
    Last edited: Oct 7, 2022
  28. orionsyndrome

    orionsyndrome

    Joined:
    May 4, 2014
    Posts:
    3,153
    Nah, of course I took some liberty with the actual comparison. Technically you are right, and in fact there is no difference in complexity. I was just trying to handwave the "apparent" magnitude of difficulty when approaching the two:

    1) PNG is more obfuscated than TIFF, especially when trying to naively make any sense of it,
    2) TIFF is better documented, by all accounts,
    3) TIFF is much more straightforward as a container, PNG is more low-level and tightly-packed, with almost no observable structure,
    4) Raw TIFF is plain and simple; things only get complicated when you start looking at compressed binaries, but even then you can easily discern what tool was used, simply because the tool leaves marks, i.e. a header. You can't say the same for vanilla PNG, except maybe with palettized images, which should be RLE with deflate.

    This is all just how a person from a naive "hacking" position would evaluate the differences between the two. And frankly, I would always rather try to decode TIFF than PNG, simply by the nature of how these things are designed, from the looks of it. And I used to reverse engineer MP3 in the late 90's. It was actually harder to reverse engineer PNG than any other image format back in the early 2000's, and there were plenty of reasons behind that, but most of it was just incredible opacity and lack of internet access.

    I can even semi-prove this position, because even Photoshop had a few issues with how it treated PNG transparency in some edge cases (through the Save for Web module, which is highly specialized in JPEG/PNG codecs). 15 years ago or so, I remember having to either use an external plug-in (I think it was called FIXPNG or something like that) to save PNG with a properly encoded alpha channel, or avoid the format altogether. For a format that was so ubiquitous, it's kind of weird that even Adobe's engineers had a hard time doing it properly. This is one of numerous instances that made me describe it the way I did.

    So yes, PNG is more complex than TIFF; it's just that you and I are using the word "complex" to describe different things.

    You're thinking in engineering terms, WHEN you know exactly what to look for, and PNG is pretty much an open book these days; I'm thinking in terms of practical reverse engineering, and PNG was never easy to work with, nor was it an open book. Of course, when you really go in headlong, deflate = zip, yes.

    Though I never said it was invented for either of the two; a zip is a zip. You probably misread something.
    I used ARJ back in my time; DEFLATE and PKZIP came out in the mid 90's. I kind of know these things by heart.


    Exactly; this is both pro and con. You can at least tell how many containers a TIFF has, and you can even tell what to expect from them. Yes, this makes one hell of a combinatorial space, but that was the point of it: it's TAGGED for a reason. PNG, on the other hand, has no obligations toward you as a "hacker". No tagging. You can only observe a stressfully tight stream of complete garbage, and unless you have ANY inside information about its intermediate contents (how on Earth can you tell which filters were applied if you don't know how the encoder operates?), you're doomed.


    Yes it was (edit: I mean, listen to your own words; they had to supersize what GIF had already achieved, they stumbled into new business-model territory and tried to seize the opportunity). I mentioned the apparent obsolescence already, but the real issue was that CompuServe was sued over LZW and development of GIF was stuck in a legal hell.

    Here's a snippet from WP. I've read a much more thorough overview of that era, but sadly can't remember where to look for it.
    I've bolded that word to highlight the apparent urgency. They had a deal with Netscape, you see, and this was in the midst of the great browser war (or the start of it). So all of this hassle is mostly man-made, and had almost nothing to do with just improving the technology. This also partly explains a) the deliberate obfuscation of the PNG format, b) the passive-aggressive fight for dominance with JPEG, c) the subsequent and very silent non-adoption by Internet Explorer (which I remember well), and d) the first such successful monopoly of MP3 not much further down the history line. I also personally know a guy who fully reverse engineered WMA and then got a job working for Nero (https://en.wikipedia.org/wiki/Nero_AAC_Codec , *name redacted* fully credited under History). Oh, the details are juicy. I am still friends with his neighbor (he obviously moved to Germany and never looked back).

    It was all part of the now famous "format war".

    I think you're all used to having access to documentation, schools, mentors and a proper culture for this sort of stuff. We're still banned from PayPal in Serbia, obviously because of politics but also because we have some great hackers, and hacking is what I'm thinking of when I describe something as "easy" or "hard". I learned all kinds of S*** from having access to weird assemblies and demo scene files and whatnot. I have no schooling or computer-related specialization, and I have never read a book on computers, except when I was 7: I remember my father brought me some introduction to BASIC, and that was it. Everything I learned was from hands-on experience, my own mistakes, plenty of source code that was circulating in the underground; also, I'm a sponge when it comes to reading online.

    In the late 90's I was already reverse-engineering Adobe files, specifically by extracting the JPEG thumbnails: I would hijack Adobe Bridge to generate the thumbnails for me. That S*** was easier than having to produce them in the first place, so I could automate entire image databases without doing much, and this was before I had broadband internet. In the early 2000's I earned a living for a while by restoring missing HDD partitions, equipped only with a disk editor (it was literally one bad byte in the FAT32 MBR /or was it FAT16, I don't remember/ which I was able to pinpoint armed with some patience and a series of good hunches; it was a wave of some malware, I still don't know what caused it, but those photographers and designers paid well for my "wizardry"). This was my side job while I was producing 3 fully-featured, luxuriously designed monthly magazines in parallel. Crazy times.

    Finally, I'm really intimate with TIFFs because I used to be a print professional myself, so maybe I'm a little biased lol. (And now you can perhaps understand my deeply rooted hatred of anything JPEG.)
     
    Last edited: Oct 7, 2022
  29. orionsyndrome

    orionsyndrome

    Joined:
    May 4, 2014
    Posts:
    3,153
    As I said, you didn't quite grasp what I tried to say, but nvm.
    TIFF/ZIP was the latest addition, btw. Someone figured out it made sense in the mid 2000's; now it's almost a standard. LZW, being a text compression algorithm, isn't that good for true-color images, and in fact does very little compared to general compression algorithms. But it was widely supported before, so it stuck. Who knows how many TIFF encoders exist in the wild; that thing is a jungle. By this metric, again, you're right: PNG is much more straightforward. One format, one set of rules, sure.

    Though I wish QOI would gain more traction. Such a fun format. Idk, we ought to move on to something that's faster and more transparent (and open). FLAC did wonders for the audio industry.

    Oh, btw, deflate is only the small part of the general PNG format. It comes basically after everything to squeeze just a little extra and obfuscate things further.
     
  30. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    :D good advice, luckily I don’t use version control

    Dropping the single combined file to about 6x the size of a PNG, I am over the moon with it.

    I opened a TIFF and looked at the data, and it is rammed full of data in comparison to PNG.

    The theory goes: now that the files are all combined and linearised, byte savings can be made in compression across all of them, by index of file and index of colour, since the grid indices that reference a colour could treat all the empty space across all frames as a single byte per reference. But I must admit, proud as I am of such progress, I will likely fall back on the open PNG-filled directory in the end. I just know it.
     
  31. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    And btw, Dune 2000 was one of my favourite games; the entire Westwood franchise was. I didn't buy the new one, though. I modded the hell out of the originals, added a snow palette to Dune, all kinds of nerd behaviour. I wasn't a smart person back then, but I practised a lot with graphics and copy-and-paste techniques. And I recognise no developer has attempted such things since, not even in China. So I'm happy to be able to understand a fraction of the work they originally put into it, which EA was unable to replicate.
     
    Bunny83 likes this.
  32. Bunny83

    Bunny83

    Joined:
    Oct 18, 2010
    Posts:
    4,112
    I don't really want to argue here. However, are you sure you have actually looked into the PNG format yourself?

    Pretty much all of those points are upside down ^^.

    Even though TIFF is much more complex and has many more features, the TIFF specification has only 23 pages and explains only some details. The PNG RFC has about 100 pages and explains pretty much every detail, with clear examples and pseudo code for a lot of things. TIFF uses integer tag IDs which are not human readable and don't help in identifying the structure. PNG, on the other hand, uses a 4-byte tag for each chunk, each representing 4 characters of text which can be easily identified:

    IHDR - header
    PLTE - palette
    IDAT - image data chunk
    IEND - end marker

    other common chunks
    cHRM - chromaticity information
    gAMA - gamma correction information
    pHYs - physical dimensions (PPI)

    You seem to focus only on trivial, non-compressed TIFF files. Sure, they are simpler, but they are only a tiny subset of what makes up the format. Writing a TIFF loader that can only load uncompressed files which are always little-endian, always use a certain bit depth, etc. is relatively easy to implement, but it would break down when you hit any other variation. The overall file structure of TIFF and PNG is exactly the same: both consist of several linked chunks. As a parser you don't have to understand all of them. PNG has the 4 critical chunks that every PNG file must contain, which I listed above. Any other chunks are optional.
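To make this concrete: since IHDR is always the first chunk, you can pull an image's dimensions out of the raw bytes before ever creating a Texture2D, which was the original question in this thread. A minimal sketch (the PngInfo/ReadDimensions names are my own, and it assumes a well-formed PNG; per the spec, all multi-byte PNG integers are big-endian):

```csharp
using System;
using System.IO;

public static class PngInfo
{
    public static (int width, int height) ReadDimensions(string path)
    {
        byte[] data = File.ReadAllBytes(path);
        // Layout: 8-byte signature, 4-byte chunk length, "IHDR" at bytes 12-15,
        // so width occupies bytes 16-19 and height bytes 20-23 (big-endian).
        if (data.Length < 24 ||
            data[12] != 'I' || data[13] != 'H' || data[14] != 'D' || data[15] != 'R')
            throw new InvalidDataException("Not a PNG / missing IHDR");
        int width  = (data[16] << 24) | (data[17] << 16) | (data[18] << 8) | data[19];
        int height = (data[20] << 24) | (data[21] << 16) | (data[22] << 8) | data[23];
        return (width, height);
    }
}
```

The manual byte shifting instead of BitConverter is deliberate: PNG is big-endian, while BitConverter follows the machine's (usually little-endian) byte order.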

    I think you somehow got scared by the video you linked above (I've seen this video in the past as well ^^). What was explained in the video is already everything there is to the PNG format. Here's a blog article explaining how to load the image data, including the 4 filter variants, all with sample code snippets in TypeScript. So pulling everything together would probably give you a complete PNG loader ;) Sure, they used zlib for the deflate stream, as it's the most complex part, but since it's standardized that's not really an issue.

    Note that we're talking about loading the images, not creating them. Creating PNG images is quite a bit more complex if you want to do everything from scratch yourself: finding the proper / best filter and implementing the deflate algorithm is a bit harder than decoding.

    Anyways, it's already too late and I need some sleep ;) As I said, I don't wanna start a fight here. Everybody has their own reason(s) to like / dislike certain formats, companies, products, ...
     
  33. angrypenguin

    angrypenguin

    Joined:
    Dec 29, 2011
    Posts:
    15,635
    Here's more good advice: start.
     
    Bunny83 and spiney199 like this.
  34. orionsyndrome

    orionsyndrome

    Joined:
    May 4, 2014
    Posts:
    3,153
    Cmon no fights.

    I get your point, but you're again quoting the RFC. Have you actually looked inside the file?

    I am really talking about: F3 the file, try to make sense of what you see; if you do, cool, and if you don't, switch to hex view, match some patterns, find clues in headers, find locations of interest and pointers to them, demystify the header, be on the lookout for 4-letter codes... Some algorithms leave specific residues (as per some specification) which can hint at what was run on the data; many compression algorithms have their own header, it's rarely just pure data. There are also tags, frames, chunks, all kinds of separators. The structure itself reveals a lot about the process that went into it.

    I'm talking about Sherlocking.

    TIFF is intended for industrial interchange (and tool access); PNG is embedded and not made for any tools to analyze it, probe it, or dissect it in any way. It's pretty much monolithic, and that's a huge difference in how I see file formats. I've been doing this since I was a kid, literally since IFF and PCX times. I made my own variants of PCX (because I was a kid), and it was a brutally honest format, employing only RLE. But this got me thinking about cryptography, and so I naturally explored the idea further. I take all that as a learning process, and that was the core message behind my post intended for AnimalMan.

    I don't know what to say, maybe I wasn't particularly lucky with my particular sample files at the time, but I remember that PNG wasn't at all inviting. I wasn't particularly persistent either, I have to admit, but with TIFFs I could at least parse what I saw, and I probably only ever worked with a couple of specific codecs, likely all by Adobe.

    Anyway it was a long time ago the last time I did any of this, because I certainly did nothing of the sort in the last 18 years. I simply didn't have to and my interests have moved elsewhere.

    I don't find that video too scary; none of it I find scary, not even the zip part (it's just tedious), and I do much scarier stuff occasionally. Even JPEG is relatively simple to grasp once you get to the bottom of the actual format on paper. But things are horribly different when you know next to nothing about the format: PNG is so unique that there is nothing else to compare it with, and it demands months of painstaking work to reverse engineer properly. TIFF, on the other hand, is a festival of trial and error where you're pretty much guided the whole way, because it doesn't need to obfuscate or compress anything but the actual container data. On the web, even minified JS looks daunting. I'm not sure if you ever tried to go into this blindly, but when I say "hacking", I really mean hacking, not programming, and even less so engineering. And it looked to me like AnimalMan was simply hacking for fun and credits.

    I don't know, maybe I should give you the benefit of the doubt, let's leave it at that. I have no reason not to believe you.

    I just don't like to sound like I'm talking nonsense, because what would be the point? I have definitely formed this opinion after certain experiences; it's not something written willy-nilly. As I said, the difference in our viewpoints is only in how we perceive what is 'complex'; everything else surrounding the case I concur with. If you asked me to choose a decoder to build professionally, I would always choose PNG over TIFF, exactly because of the things you mentioned; you simply have to consider reliability. But I still think it's actually harder to do, because when you're hacking TIFF (in hobby conditions, not in a corporate environment) you don't really have to care about all the possible combinations out there in the wild. Not even Photoshop does that. All it cares about is raw, LZW and ZIP as the post-process compression, and several channel-ordering schemes like serial and interleaved, because deflate is also used intermediately and ordering helps to align redundancy. You just stream the channels to buffers, process them, enclose the structure and save metadata. It is incredibly easy to decode once and figure out the exact process Photoshop uses, because the image will likely come out scrambled the first time; then you apply the changes and do it again.

    Now think of PNG and tell me this is not the more banal case.
     
  35. Ryiah

    Ryiah

    Joined:
    Oct 11, 2012
    Posts:
    21,471
    Only the very last stages. Graphics hardware is designed to keep texture data compressed and only decompress it when it is needed. The purpose of this is to maximize the data that can be stored and minimize the bandwidth that is consumed when transporting it.

    That said, the formats being discussed here are not the ones used by the hardware. Graphics cards use much more specialized formats, and when you set up a texture at runtime you can specify the format that it will use.
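For instance, in Unity you can pick an uncompressed format up front and then let Unity compress the texture into a GPU block-compressed format. A hedged sketch (the exact resulting format depends on the platform; the class name is my own):

```csharp
using UnityEngine;

public class TextureFormatExample : MonoBehaviour
{
    void Start()
    {
        // Uncompressed, 4 bytes per pixel, no mip chain:
        var raw = new Texture2D(256, 256, TextureFormat.RGBA32, false);

        // Compress in place to a GPU block-compressed format
        // (the texture must be readable):
        raw.Compress(false);
        Debug.Log(raw.format); // typically a DXT format on desktop platforms
    }
}
```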
     
    Last edited: Oct 7, 2022
  36. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    Good morning,

    Did nobody sleep?

    I see no objection to the way I handled compression. No fault?
    I am hoping this is not false confidence.
    I'll progress on that image rebuilder and see what the result is.

    Thanks for the useful responses, everybody.
     
  37. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    Code (CSharp):
    if (File.Exists("C:/Users/SCHar/Desktop/OBJECTIVE/DATA/ART/" + "INFANTRY_46"))
    {
        var BYTE_ARRAY = File.ReadAllBytes("C:/Users/SCHar/Desktop/OBJECTIVE/DATA/ART/" + "INFANTRY_46");
        List<byte> BYTE = new List<byte>();
        int count = 0;
        for (int i = 0; i < BYTE_ARRAY.Length; i++)
        {
            BYTE.Add(BYTE_ARRAY[i]);
            count = count + 1;
            if (count > 3)
            {
                // flush every 4 bytes as one hex string
                OUTPUTSTRINGS.Add(BitConverter.ToString(BYTE.ToArray()));
                BYTE = new List<byte>();
                count = 0;
            }
        }
        OUTPUT = BitConverter.ToString(BYTE_ARRAY);
    }
    BYTE_MADNESS.png

    I am un-educated.
    The use case may be ridiculous overkill.
    The space gain is relatively insignificant compared to returning readable data.
    All values of the output strings on return are '24' and dashes,
    so either BitConverter screwed up the input,
    or I am unable to string out the bytes on read;

    using -
    Code (CSharp):
    OUTPUTSTRINGS.Add(Convert.ToString(BYTE_ARRAY[i]));
    it returns a list of 434 with all values being 36

    Code (CSharp):
    OUTPUTSTRINGS.Add(BitConverter.ToString(BYTE.ToArray()));
    it returns a list of 434 where all values are '24' alternating with '-'

    It is too scrambled for me before breakfast for the file-size gain to be useful. Considering the use case: this only loads the game's variable data at initialisation. As the game itself is driven by variables, all variables are introduced from folders, including graphics, models and the more finite data relating to their in-game functionality. The size of this file is insignificant; it is not streamed over the net and it is not read continuously at runtime. I may as well bail out and fall back on something that is clearly legible.
    Do I need to be that much of a genius?
    The answer is no.
    Is anybody else that much of a genius?
    There are wealthier people with much worse published game designs, yes.
    Considering this is all just initialise data,
    this coding adventure is a waste of time until I acquire a meaningful context on what is going on and what I am doing wrong.

    :)

    I'm going to progress in other areas of the code, and at a later date, if I want to simplify my open Texture2D directory, I will return to the problem using a future theoretical universal directory wrapper.


    StreamWriter nets me just 1k more for the same data written that way. So it's what, 12x the size of a PNG? Who cares.
    DATA_Example.png
     
    Last edited: Oct 7, 2022
  38. angrypenguin

    angrypenguin

    Joined:
    Dec 29, 2011
    Posts:
    15,635
    I didn't examine it, so was in no position to comment either way.
     
    AnimalMan likes this.
  39. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    It's definitely a job for the end of development, when the front end is complete, ready for population, and has been populated with a story, some factions and the like. In one day I learned enough; I could learn more, but it is not a suitable time for it. I progressed with infantry import data and animation-sequence data instead: production of infantry objects, which only remain to animate correctly as they move over the terrain. And now I stop for lunch.
     
  40. Ryiah

    Ryiah

    Joined:
    Oct 11, 2012
    Posts:
    21,471
    I'll confess I didn't read line for line. I just don't have the time to read those massive walls of text. Are you planning on loading the files at runtime? Or are you planning on importing while you create the game and let Unity create a build?

    When you import an asset into the editor, Unity creates a copy of that asset in the Library cache folder in a completely different format, which it picks based on the platform you have selected in the build menu. When Unity makes a build, that is the format it uses, not the original format you imported into the editor.

    You appear to be trying to make the files as small as possible, but that's only going to work if you load the files at runtime. If you're importing them into the editor for the purposes of making a build, you're going to lose all of those savings.
     
    Last edited: Oct 7, 2022
  41. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    Thanks Ryiah, it got a little confusing and ran off course. I was just compiling some images into a suitable format to bulk-modify those images. But all is resolved, aside from when I sidetracked onto trying to compete with PNG file sizes.
     
  42. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    Hi guys,

    I know I said I would put an end to it and keep it at the back of my mind.

    But I have read over this thread a number of times, mainly what you guys say, and I am thankful for the great knowledge you have all bestowed upon me. During the act of processing this knowledge, if I am to truly understand and take on these teachings logically, so that they will meaningfully impact my decisions,

    I have to know


    Code (CSharp):
    //  01001000 01100101 01101100 01101100 01101111
    //  |  H   | |  e   | |  l   | |  l   | |  o   |
    this is binary for "Hello". Now, I have been thinking things over on and off for most of today, and I think it has clicked.

    There is a physical bank, a memory unit (yes, we all know), inside the CPU. That physical bank, as I will call it, has 5 (or more) compartments per module, and thousands of nearly microscopic modules inside its chip. When electricity is passed through these circuits, it can be encouraged to enter certain chambers of these modules, and so, to store a binary number with a unique signature, the sequence of filled chambers is modified. The order of that sequence is interpreted as a reference to abstract values, both numerical and non-numerical, determined by which chambers are filled. This allows a sequence of yes and no to determine a unique value, via a combination of banks (bytes) or a single one, deciding what the abstract value will be interpreted as.

    And I guess you might say: what am I explaining this for? I am writing this now so that I can rationalise the bizarre nature of the binary system, as it feels to somebody who has never played around here before. And I will do this for the purpose of intelligently compressing and simplifying files (in the near future).

    Now, I will be building this game in the Unity directory when it is ready for build testing, but I will be developing most of the variable content on a different drive, and so, when it comes to putting this content into the Unity directory before a build, some level of compression will be required, for no other reason than that is how I want my files to be. I am aware they will be double-wrapped after Unity builds, but that's no concern of mine.

    I don't know if you guys can confirm, but if anyone can confirm that what I wrote here about binary is on track as a rough summary of what may be physically occurring inside the memory of the CPU, that would be fantastic. Knowing these details helps me comprehend what the designer of the binary system may have been thinking, so I can really attempt to master that thing. I either go all out and master it, or I learn nothing at all.
    Just thought I would post it here; if anyone has any links, info, vids or anything, I've got a lot of time to learn about it.

    Method.png
    Is this the complete system of configurations for a single 4-bit component?
    Presumably any one of these combinations may combine with any other to produce a more complex variety of abstract values.

    Code (CSharp):
    //  01001000 01100101 01101100 01101100 01101111
    //  |  H   | |  e   | |  l   | |  l   | |  o   |
    Looking at "Hello" here, each letter uses 2 of the 16 possible 4-bit values.
    I am presuming the calculator and logic gates of the CPU output these configurations.
    If I know these charts, is there a function for me to write a binary value by entering its raw value via table reference?
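For what it's worth, .NET's Convert class can already translate between values and bit strings, which makes it easy to check a chart like the one above. A small illustrative sketch (the class name is made up):

```csharp
using System;

class BinaryTable
{
    static void Main()
    {
        // Each ASCII character is one byte = two 4-bit halves (nibbles).
        foreach (char c in "Hello")
        {
            string bits = Convert.ToString(c, 2).PadLeft(8, '0');
            Console.WriteLine($"{c} = {bits}  (high nibble {bits.Substring(0, 4)}, low nibble {bits.Substring(4)})");
        }

        // And the reverse: parse a bit string back to a value.
        int h = Convert.ToInt32("01001000", 2);
        Console.WriteLine(h); // 72, the ASCII code of 'H'
    }
}
```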
     
    Last edited: Oct 8, 2022
  43. angrypenguin

    angrypenguin

    Joined:
    Dec 29, 2011
    Posts:
    15,635
    You're getting there, but there are gaps or misconceptions. This stuff is all really well documented in many places, such as this one for what "binary" is and how it's used. Many programming texts explain how this relates to your code.
     
    AnimalMan likes this.
  44. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    Good morning!
    thank you.

    i think I am schizophrenic because this morning I feel different.

    but last night

    I conclude —-
    In the first 16 defined base values of binary;
    The dude had put

    0 1 2 3 4 5 6 7 8 9 + - / * ^ =

    I think that's 16 values. I wouldn't know if that's the correct assignment or the first 16 values in order; it likely isn't. But my bafflement once upon a time came from the lack of an exponential-division operator, and this lack is corrected, I imagine at this point at a binary level, by using ^ with a negative value, that is, running a negative value into the power. Meaning you wouldn't receive a pole-shifting value of + - + -, but you would receive 50% of the function of the missing negative-power operator, at the sacrifice of 50% of the function of the ^ operator.
    But maybe I am connecting dots here I shouldn't.


    You will notice the negative ^ operator, exponential division as we know it, counts from 0; that is to say, the result given is as if you requested +1 on top of your input value. That is to say, 2^(-2) divides 3 times; the result given is 0.25.
    But 2 / 2 = 1, and 1 / 2 = 0.5
    while 2 * 2 = 4 and 4 * 2 = 8
     
    Last edited: Oct 8, 2022
  45. angrypenguin

    angrypenguin

    Joined:
    Dec 29, 2011
    Posts:
    15,635
    I've got no idea what you're talking about. If someone wrote that here, I can't see it. At any rate, those aren't binary values, or the first 16 ASCII (text) mappings of binary numbers.

    You're jumping around all over the place and getting yourself confused. This is all Computer Science 101 stuff, so I'd suggest looking up introductory comp-sci material and reading / watching / whatever. If you want to mess around with file formats of your own, then understanding how bits and bytes work and how they're used is an absolute necessity.
     
    Bunny83 and AnimalMan like this.
  46. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    Maybe the final value is not ^, maybe it's the decimal point :)

    I stop here.

    Oh no, I found one which shows me the binary bit cost of these operator characters.
     
    Last edited: Oct 8, 2022
  47. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    Code (CSharp):
    public List<byte> BINARYKEYS = new List<byte>();
    public List<string> ALPHABET_AND_NUMERALS;
    public BitArray BIT = new BitArray(0);
    public int SELECTEDBIT;

    void MAKEBINARYKEYS()
    {
        var B = new byte();
        for (int i = 0; i < ALPHABET_AND_NUMERALS.Count; i++)
        {
            // NB: TryParse only succeeds for numeric strings like "42";
            // for letters it fails and leaves B at 0.
            byte.TryParse(ALPHABET_AND_NUMERALS[i], out B);
            BINARYKEYS.Add(B);
        }
    }
    void SELECTBIT()
    {
        // NB: this calls the BitArray(int length) constructor, so the byte's
        // value is used as a *length* of zeroed bits. To get the byte's actual
        // bits, use: new BitArray(new byte[] { BINARYKEYS[SELECTEDBIT] })
        BIT = new BitArray(BINARYKEYS[SELECTEDBIT]);
        string Display = "";
        for (int i = 0; i < BIT.Length; i++)
        {
            Display = Display + " " + BIT[i].ToString();
        }
        Debug.Log(Display); // Blank String :) :) :) :)
    }
     
    Last edited: Oct 8, 2022
  48. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    Mission Accomplished o_O
    Code (CSharp):
    var C = File.Create("C:/Users/SCHar/Desktop/" + "BINARYFILE");
    var WRITER = new BinaryWriter(C, Encoding.UTF8, false);
    var STRING = "String Me Up Choppy";
    var STRING2 = "GOSH DARN IT!";
    var STRING3 = "I Think I've Got It!";
    var COMBINEDSTRING = STRING + STRING2 + STRING3;
    WRITER.Write(COMBINEDSTRING);
    WRITER.Write(STRING2);
    WRITER.Write(STRING3);
    WRITER.Close();
    var stream = File.Open("C:/Users/SCHar/Desktop/" + "BINARYFILE", FileMode.Open);
    var reader = new BinaryReader(stream, Encoding.UTF8, false);
    var LINEA = reader.ReadString();
    var LINEB = reader.ReadString();
    var LINEC = reader.ReadString();
    Debug.Log(LINEA);
    Debug.Log(LINEB);
    Debug.Log(LINEC);
    reader.Close();
    stream.Close();
    Order in, order out.
    Why these simple things must be surrounded by such complexity, I do not know.
    What goes in then comes out, in the order it went in.
    But the conclusion here is far from what I expected. I mean, it's too easy.
    The encoding stuff is not binary as I imagined it; the file can be viewed and its contents are clearly legible in most encodings I tested. Okay, sometimes they replace a space with a hidden character. I just wonder at this point: is this even what I want? It basically just places a flag down on every new Write command, so I guess the strings are indexed according to the encoder. But they are all pretty much identical.
    It's also costing nearly 100 bytes for 3 or 4 lines of text.

    No wonder you guys think I am speaking an alien language.

    I simply fail to understand how BinaryWriter is writing binary. Or is it writing binary, and it's just opening as text according to the encoder? Is it decoded on opening as text by popular encoder choices?

    I have come this far, for this? I am in disbelief.
     
    Last edited: Oct 8, 2022
  49. Bunny83

    Bunny83

    Joined:
    Oct 18, 2010
    Posts:
    4,112
    I'm sorry, but I have no idea what you're on about ^^. As angrypenguin said, you seem to have countless misunderstandings and a really skewed view of how a computer works. While on the surface it doesn't really matter how the machine works under the hood for most programming tasks, when you go more low-level it does matter. You still seem to clutch onto the idea that "text" is somehow like a natural language to a computer. It's not. Also, the way you described the binary number format sounds like you're describing mystical, mysterious magic. Binary is really just another number system, like our decimal number system. It's in fact the simplest positional number system, which is why it's used in a computer. A computer uses fixed-size units we call bytes, which always consist of 8 bits. You cannot insert spaces or separators in between such numbers, as the only two possible symbols a binary digit can represent are 0 and 1.

    When you want to develop a binary format, just forget that text even exists. Most binary formats store data on a per-byte basis, as a byte is the smallest unit you can directly work with, so it's the fastest. However, when space matters, one can use bit operations to read, write or modify individual bits in a byte (or a larger type that consists of several bytes). Any kind of string conversion is slow, expensive and allocates garbage memory. While it's always good to experiment and try things out to get a better understanding, you're currently on the wrong track here. It's not even clear what your goals are. We could read between the lines that it seems like you want to protect your data from being stolen or manipulated. We can guarantee you, this is a pointless endeavour. Reverse engineering file formats is relatively easy, especially when the code that can read the format is right next to it.
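The bit operations mentioned above look roughly like this; a minimal sketch of setting, testing and clearing individual bits, and packing two 4-bit values into one byte:

```csharp
using System;

class BitOps
{
    static void Main()
    {
        byte b = 0;

        b |= 1 << 3;                      // set bit 3
        bool isSet = (b & (1 << 3)) != 0; // test bit 3 -> true
        b = (byte)(b & ~(1 << 3));        // clear bit 3 again

        // Packing two 4-bit values (0..15) into one byte:
        byte packed = (byte)((12 << 4) | 5);
        int high = packed >> 4;   // upper nibble: 12
        int low  = packed & 0x0F; // lower nibble: 5
        Console.WriteLine($"{isSet} {high} {low}"); // True 12 5
    }
}
```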

    As I said earlier, what's the point of posting all those fragments of things you do here? Nobody can actually follow what you're doing, since it's not even clear what the goal is. The scripting forum is for discussing concrete scripting problems, not for dev blogs :) Unless you have a concrete question about something, this does not help anyone.

    About your last post asking about the BinaryReader: the BinaryReader is a stream reader that provides functions that simplify reading certain more complex binary types, specifically int (Int32), short (Int16), float (Single), double (Double) and some others. ReadString reads a very specific format: it requires a length-prefixed string, so there has to be a variable-length integer stored, followed by that number of bytes of actual string data. ReadString only works on this particular format; it essentially uses Read7BitEncodedInt internally to read the length of the string.
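The length prefix is easy to see for yourself by writing a string into a MemoryStream; a small sketch (for short strings the 7-bit-encoded length fits in a single byte):

```csharp
using System;
using System.IO;
using System.Text;

class LengthPrefixDemo
{
    static void Main()
    {
        var ms = new MemoryStream();
        var bw = new BinaryWriter(ms, Encoding.UTF8);
        bw.Write("Hello");
        bw.Flush();

        byte[] raw = ms.ToArray();
        // First byte is the 7-bit-encoded length (5), then the UTF-8 bytes.
        Console.WriteLine(raw[0]);                                  // 5
        Console.WriteLine(Encoding.UTF8.GetString(raw, 1, raw[0])); // Hello
    }
}
```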

    The BinaryReader still works with bytes in the end. It's a great tool if you want to roll your own binary format, though for writing you would use the BinaryWriter of course. How you design your format is up to you. You have to make sure you're writing a format that can actually be read back in without errors; if the format has variable elements, you also have to store some information about how to read it the right way. Just as an example, here's my MeshSerializer I wrote years ago. Unity has added some more features to the Mesh class since then, so it may not be able to serialize all possible meshes, though it should work for the most part. Maybe it helps you to understand a little better how this works. I've roughly documented the binary format in the comment section inside the MeshSerializer class.

    Another thing a lot of people do not think about is that multi-byte types can generally be stored in two different ways: little-endian and big-endian. Most machines use little-endian, but some older architectures (and most network protocols) use big-endian. A binary number that consists of 32 bits requires 4 bytes (4 * 8) stored in a row. Little-endian stores the lowest-valued byte first, with the higher-valued bytes following, while big-endian stores the most significant byte first.

    As an example of a 16-bit number (C# type short), the number 42 represented as 16 bits would be
    0000000000101010
    In big-endian encoding the number would be stored as these two bytes:
    Code (CSharp):
    // big-endian
    //       First byte     second byte
    //bin     00000000       00101010
    //dec        0             42
    //hex      0x00           0x2A
    In little-endian the two bytes are ordered this way in memory or on disk:
    Code (CSharp):
    // little-endian
    //       First byte     second byte
    //bin     00101010       00000000
    //dec        42             0
    //hex       0x2A           0x00
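You can observe both orderings from C# with BitConverter; a small sketch (assumes it runs on a little-endian machine, which BitConverter.IsLittleEndian reports):

```csharp
using System;

class EndianDemo
{
    static void Main()
    {
        short n = 42;
        byte[] bytes = BitConverter.GetBytes(n);

        // BitConverter follows the machine's byte order:
        Console.WriteLine(BitConverter.IsLittleEndian);    // True on x86/x64
        Console.WriteLine($"{bytes[0]:X2} {bytes[1]:X2}"); // 2A 00 (little-endian)

        // Reverse to get big-endian (network order):
        Array.Reverse(bytes);
        Console.WriteLine($"{bytes[0]:X2} {bytes[1]:X2}"); // 00 2A (big-endian)
    }
}
```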

    The BinaryReader generally works in little-endian. I've also written a GIFLoader (it works, but may have issues with some exotic edge cases) which uses the BinaryReader for most of the values I need to read. This may be a bit too advanced, but my intention is that you get a better understanding of how the machine you're working on actually works. It may also help if you look up some videos which explain fundamental things like the binary number format and how to convert from one format to another. Closely related is the hexadecimal format, which makes it much easier to display binary numbers in a compact form (4 bits make up exactly 1 hexadecimal digit). Some information on binary logic and logic gates should also help.
     
  50. AnimalMan

    AnimalMan

    Joined:
    Apr 1, 2018
    Posts:
    1,164
    read correctly

    screw big-endian

    screw the encoding

    E907422C-C9D7-4AA9-88EC-A6BFE5106EB2.png
    those are the possible configs


    There is no difference between creating a text file with WriteLine and BinaryWriter.Write, except that I use somebody else's encoding. They are both writing, """" smart joke """", binary.


    very smart

    thanks for trying to make a joke out of me



    no more, I'm done