
Resolved Best Way To Create Binary Asset

Discussion in 'Asset Importing & Exporting' started by StefDevs, Nov 8, 2022.

  1. StefDevs

    StefDevs

    Joined:
    Jan 20, 2014
    Posts:
    23
    Edit: Changed title to be hopefully more helpful for fellow googlers. Originally: How Does TextAsset(string text) Get TextAsset.bytes from 'text'?

    I'm trying to create an asset for a custom binary format. 'TextAsset' is the only way to add custom binary assets to the asset database, and of course the only way to construct a TextAsset is to supply a string.

    Fine, I can pack my bytes into a string and supply that, right? Nope. TextAsset.bytes does not give an exact representation of the input string; its length matches neither the input string, nor TextAsset.text, nor any multiple of the string length that would make sense to me.

    So what does it do? Does it add a header? Does it re-encode the string? What encoding does it use? I took a look at the reference source but found nothing there.
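
    One possible explanation for the odd length, sketched under the assumption that the string is stored as UTF-8 (nothing in this thread confirms that): if each byte is packed into one char, every byte value of 0x80 or above becomes a two-byte UTF-8 sequence, so the stored length grows by an amount that depends on the data. The class name Utf8LengthDemo and the sample bytes are hypothetical; this runs on plain .NET with no Unity dependency.

```csharp
using System;
using System.Text;

public static class Utf8LengthDemo
{
    public static void Main()
    {
        // Sample data: two bytes below 0x80 and two bytes at or above it.
        byte[] original = { 0x00, 0x7F, 0x80, 0xFF };

        // Pack each byte into one char (code points 0..255).
        var chars = new char[original.Length];
        for (int i = 0; i < original.Length; i++)
        {
            chars[i] = (char)original[i];
        }
        string packed = new string(chars);

        // Assumption: the asset stores the string as UTF-8.
        byte[] reencoded = Encoding.UTF8.GetBytes(packed);

        Console.WriteLine(original.Length);                 // prints 4
        Console.WriteLine(packed.Length);                   // prints 4
        Console.WriteLine(reencoded.Length);                // prints 6
        Console.WriteLine(BitConverter.ToString(reencoded)); // prints 00-7F-C2-80-C3-BF
    }
}
```

    If that assumption holds, the byte count is neither the string length nor a fixed multiple of it, which would match the symptom described above.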

    If I can learn what it does to encode the bytes then maybe I can hack it to produce the bytes I want. I mean, ideally we'd just have some non-scuffed way to add a binary asset, or even a constructor that takes a byte buffer, but I'll take a hack if I can get it...

    And before someone says it: no, I don't want to make a file with a .bytes extension as this has bad implications for our asset workflows. Or at least I'm really hoping to not resort to that.
     
    Last edited: Nov 8, 2022
  2. halley

    halley

    Joined:
    Aug 26, 2013
    Posts:
    2,421
    I think it'll be safer all around to not depend on some Unity flavor-of-the-week encoding quirk to work with byte buffers. There's always a chance some part of a text-oriented setup can change in Unity 2085.5.12g, suddenly finding a valid but unintended UTF-8 code sequence and converting your level data into a phrase from the Kama Sutra in Sanskrit. Industry standards are safer; you'll find Base64 encoding routines in just about every language.

    Encode with:

    Code (csharp):
    using System;
    using System.Text;

    byte[] bytesToEncode = Encoding.UTF8.GetBytes(inputText); // or whatever other byte[] you have
    string encodedText = Convert.ToBase64String(bytesToEncode);
    And decode with:

    Code (csharp):
    // Requires the System and System.Text namespaces.
    byte[] decodedBytes = Convert.FromBase64String(encodedText);
    string decodedText = Encoding.UTF8.GetString(decodedBytes); // or do whatever with the byte[]
     
  3. halley

    halley

    Joined:
    Aug 26, 2013
    Posts:
    2,421
    Just to add, this can likely be included in Editor pre-import or post-import scripts, so it's pretty seamless to your process.
     
  4. StefDevs

    StefDevs

    Joined:
    Jan 20, 2014
    Posts:
    23
    Ugh, converting (and/or copying at all) on load is really gross and it might end up being impactfully slow for larger assets, but maybe it's worth trying for now. FWIW I really don't care about it being reliable into the future so if I could do the hack and avoid conversion on load that would be preferred.

    All my options suck in one way or another right now... It seems like such a bad omission from the API but I guess not many people hit this issue? Maybe I should buy a source license and fix it? lol
     
  5. bastien_humeau

    bastien_humeau

    Unity Technologies

    Joined:
    Jun 14, 2017
    Posts:
    191
    Is there any specific reason to use a TextAsset to store your byte array?
    Would creating a ScriptableObject that only contains a public byte[] member be enough to store it?
    Something like
    Code (CSharp):
    using UnityEngine;

    // Holds raw bytes in a ScriptableObject so they can be saved as an asset.
    public class MyByteContainer : ScriptableObject
    {
        public byte[] bytes;

        public static MyByteContainer CreateContainer(byte[] bytes)
        {
            var container = ScriptableObject.CreateInstance<MyByteContainer>();
            container.bytes = bytes;
            return container;
        }
    }
     
  6. StefDevs

    StefDevs

    Joined:
    Jan 20, 2014
    Posts:
    23
    It just seemed like the most simple/direct option available since TextAsset is the canonical way to import a binary file. But I didn't think of your suggestion! It definitely seems like the way to go. Thanks.

    I'm now curious whether the YAML encoding of the byte buffer is efficient, but I'll worry about that if/when it becomes an issue, which it hopefully won't.
     
    Last edited: Nov 8, 2022
  7. bastien_humeau

    bastien_humeau

    Unity Technologies

    Joined:
    Jun 14, 2017
    Posts:
    191
    You can force an asset to be serialized as binary in your project instead of yaml using the PreferBinarySerializationAttribute, which should completely avoid the binary->yaml size and performance issue.
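
    A minimal sketch of how the two suggestions might fit together, assuming the MyByteContainer class from the earlier post and an editor-only helper. The helper name MyByteContainerCreator, the method CreateAsset, and the example path are hypothetical; the attribute is written here as [PreferBinarySerialization] from the UnityEngine namespace.

```csharp
using UnityEngine;
#if UNITY_EDITOR
using UnityEditor;
#endif

// Ask Unity to serialize assets of this type as binary rather than YAML,
// regardless of the project's asset serialization mode.
// Note: this class should live in a file named MyByteContainer.cs.
[PreferBinarySerialization]
public class MyByteContainer : ScriptableObject
{
    public byte[] bytes;
}

#if UNITY_EDITOR
// Hypothetical editor-only helper that writes the container into the asset database.
public static class MyByteContainerCreator
{
    // assetPath is project-relative, for example "Assets/Data/Level01.asset".
    public static MyByteContainer CreateAsset(byte[] data, string assetPath)
    {
        var container = ScriptableObject.CreateInstance<MyByteContainer>();
        container.bytes = data;

        AssetDatabase.CreateAsset(container, assetPath);
        AssetDatabase.SaveAssets();
        return container;
    }
}
#endif
```

    At runtime the bytes are then read straight from the referenced asset's bytes field, with no decoding step on load.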
     
  8. StefDevs

    StefDevs

    Joined:
    Jan 20, 2014
    Posts:
    23
    Excellent, thanks again!
     
  9. steffenhb

    steffenhb

    Joined:
    Feb 7, 2021
    Posts:
    7
    Stuck on the same issue with Spine binary skeleton files. They rely on TextAssets, so it's either modifying their SDK or finding a way to encode a byte array into a string that gives that exact same byte array back... q.q, not sure if that's possible.

    But I'm positive that the byte array in TextAsset.bytes equals the bytes of the binary file.