Performance Improvements To Text Serialization In 2019.3

Discussion in '2019.3 Beta' started by harryr, Apr 15, 2019.

  1. harryr

    harryr

    Unity Technologies

    Joined:
    Nov 14, 2017
    Posts:
    38
    Hey all!

    Alongside many other exciting changes Unity 2019.3 also includes some large improvements to our text serialization backend.

    What's Changed?
    Unity uses YAML as its text serialization format. Up until 2019.3 we used an open source library called libYAML.
    libYAML has served us well, but for speed and maintainability reasons we've been working on a custom internal library called UnityYAML.

    2019.3a1 brings about the full switch to this custom library.

    What does this mean for you?
    In theory the switch should be completely transparent from a usability point of view.

    From our internal testing the switch over to UnityYAML has resulted in a 30-40% speed up in scene loading times and a ~13% speed up when importing prefabs using text serialization.

    You should see a speed up when reading and writing any text serialized documents; these are just some common examples.

    We'd be very interested in hearing the improvements you see in your project from this change.

    What should you look out for?
    We've worked hard to test and QA this new improvement, but as always things may slip through the cracks.

    Please file a bug report according to these instructions if you experience any of the following:
    • Version control noise:
    For example, a text serialized asset changing in a way where the data is the same, but simply moved around a bit due to the new writer.
    We have done thorough testing for this, but it's worth keeping an eye out.
    • Data loss during serialization
    We'll be tracking new incoming bugs that we think could be related.
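
    For telling this kind of noise apart from a real change in external tooling, a minimal Python sketch follows. It assumes the usual layout of a text serialized scene or prefab, where each object is its own YAML document starting with a `--- !u!<classID> &<fileID>` line. It only detects reordering of whole documents, not reordering of keys inside one. The function names and the sample data are hypothetical.

```python
def split_documents(text):
    """Map each document header line (e.g. '--- !u!1 &100') to its body text."""
    docs = {}
    header, body = None, []
    for line in text.splitlines():
        if line.startswith("--- "):
            # A new document starts: store the previous one, if any.
            if header is not None:
                docs[header] = "\n".join(body)
            header, body = line, []
        elif header is not None:
            body.append(line)
        # Lines before the first header (%YAML, %TAG) are skipped.
    if header is not None:
        docs[header] = "\n".join(body)
    return docs


def same_modulo_order(old_text, new_text):
    """True when both files hold the same documents, in any order."""
    return split_documents(old_text) == split_documents(new_text)


# Hypothetical sample: the same two objects written in a different order.
before = "--- !u!1 &100\nGameObject:\n  m_Name: A\n--- !u!4 &200\nTransform:\n  m_Father: {fileID: 0}\n"
after = "--- !u!4 &200\nTransform:\n  m_Father: {fileID: 0}\n--- !u!1 &100\nGameObject:\n  m_Name: A\n"
reordered_only = same_modulo_order(before, after)
print(reordered_only)
```

    If this returns True while the files differ byte for byte, the diff is pure reordering, which is the kind of noise worth reporting.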

    If you are creating your own YAML files
    This has never been officially supported, but if you're currently creating YAML files externally then you might need to update your tooling to make sure the format is correct.

    We also want to give you this forum thread as a place to voice any questions or ask about any related issues you've found.
     
  2. steego

    steego

    Joined:
    Jul 15, 2010
    Posts:
    967
    Is it available at runtime, or are there any plans to make it so? I would love to have YAML as an alternative to JSON for serialization.
     
    sand_lantern and cxode like this.
  3. harryr

    harryr

    Unity Technologies

    Joined:
    Nov 14, 2017
    Posts:
    38
    Currently this is a C++ library integrated into the editor. We don't have any plans to expose it via a C# API, I'm afraid.
     
  4. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    6,326
    The scenes in my current project open so much faster in the editor now, it was immediately noticeable from the moment I opened the first scene.

    Kudos to the team who worked on the serialization improvements!
     
  5. 479813005

    479813005

    Joined:
    Mar 18, 2015
    Posts:
    57
    Can it support serializing dictionaries?
     
  6. Bastienre4

    Bastienre4

    Joined:
    Jul 8, 2014
    Posts:
    187
    This is very good news! Thanks for the hard work!
     
  7. goran_okomotive

    goran_okomotive

    Joined:
    Apr 26, 2017
    Posts:
    60
    I also noticed a speed-up on my powerful machine when opening a set of scenes. In fact, opening all 43 scenes of FAR: Lone Sails at once in the editor was 42% faster in Unity 2019.3 than in 2018.4. Thumbs up!
     
  8. vertxxyz

    vertxxyz

    Joined:
    Oct 29, 2014
    Posts:
    89
    Would this speed up animation importing times too?
    I know in the past when I was doing mocap work I made all projects binary, because selecting any assets that contained the animations would hang the editor for ~30 seconds as they loaded.
     
  9. DouglasPotesta

    DouglasPotesta

    Joined:
    Nov 6, 2014
    Posts:
    96
    When I saw the title I thought this was about improvements to the serialization of string fields. Lol
    Awesome work!
    Was this done purely to improve the loading of scenes and prefabs?
    Will this have any effect on Unity Collaborate?
    Will there be an option for legacy text serialization if we run into issues?
     
  10. tonycoculuzzi

    tonycoculuzzi

    Joined:
    Jun 2, 2011
    Posts:
    300
    Exciting! Though I can't help but think this would've been a great time to address the gripes most people have with the current serialization system, like the inability to serialize dictionaries. Still, looking forward to the changes!
     
  11. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    6,326
    If you have "Asset Serialization Mode = Force Text" turned on in the Project Editor Settings, many asset types are saved in text format, so this affects more than just scenes and prefabs.

    There is an option in the Project Editor Settings called "Asset Pipeline (experimental)".
     
    phobos2077 likes this.
  12. harryr

    harryr

    Unity Technologies

    Joined:
    Nov 14, 2017
    Posts:
    38
    This change was focused on optimizing the current system and as such doesn't allow C# dictionaries to be serialized.

    If the animation files are stored as YAML .assets then you should definitely see a speed up.

    This change should improve the performance of all systems that load and save text serialized data; scenes and prefabs are just very common use cases.

    The only changes for Collaborate would be improvements to import times when pulling down new assets.

    Currently there's no option to revert to the old system. As of 2019.3, UnityYAML is the default when using text serialization.

    Yeah, I can see that. As mentioned above, this project was mainly focused on optimizations.
    I'd personally love to see serializable dictionaries but it's out of the scope of this project.
     
    phobos2077 likes this.
  13. vertxxyz

    vertxxyz

    Joined:
    Oct 29, 2014
    Posts:
    89
    Looks like they are, that's awesome news then - text serialisation is so much nicer for source control :)
    I haven't got a project to this version yet, but you've clearly done fantastic work here!
    (Though I imagine loading text-serialized mocap data will still be slow, as the anim files are so damn verbose.)
     
  14. Jes28

    Jes28

    Joined:
    Sep 3, 2012
    Posts:
    989
    A few bugs I noticed when working with version control on the old system:

    - Sometimes serialized references were written on one line and sometimes on two, which creates differences in source control:

    m_SkyboxMaterial: {fileID: 10304, guid: 0000000000000000f000000000000000, type: 0}

    m_SkyboxMaterial: {fileID: 10304, guid: 0000000000000000f000000000000000,
    type: 0}
    I suggest always serializing them on exactly one line.

    - The order of serialized objects sometimes changed.
    I suggest always sorting them by uid so every chunk of data stays in the same place no matter what happens.


    I hope these issues will be addressed by the new serializer :)
     
    Michael-Ryan and dadude123 like this.
  15. harryr

    harryr

    Unity Technologies

    Joined:
    Nov 14, 2017
    Posts:
    38
    This happens when the line exceeds roughly 80 characters; it is behavior we kept to match the old writer.
    If we were to update the logic to only write one line, we'd actually end up generating a bunch of version control noise fixing up existing two-line references.
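
    For external diffing or comparison scripts, the wrap can be undone before comparing. The following is a minimal Python sketch under the assumption that a wrapped reference is a `{...}` flow mapping whose closing brace lands on a following line, and that braces never appear inside string values. The function name is hypothetical, and this is meant for comparison only, not for writing files back into a project.

```python
def join_wrapped_references(text):
    """Join flow mappings ({...}) that were wrapped onto following lines."""
    out = []
    for line in text.splitlines():
        # The previous line still has an unclosed '{', so this line continues it.
        if out and out[-1].count("{") > out[-1].count("}"):
            out[-1] = out[-1].rstrip() + " " + line.strip()
        else:
            out.append(line)
    return "\n".join(out)


one_line = "m_SkyboxMaterial: {fileID: 10304, guid: 0000000000000000f000000000000000, type: 0}"
two_lines = (
    "m_SkyboxMaterial: {fileID: 10304, guid: 0000000000000000f000000000000000,\n"
    "  type: 0}"
)
print(join_wrapped_references(two_lines) == one_line)
```

    Running both versions of a file through the same normalization before diffing hides differences that come only from the line crossing the wrap threshold.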
     
  16. Jes28

    Jes28

    Joined:
    Sep 3, 2012
    Posts:
    989
    Could you please add a project option to serialize these on one line, and maybe make it the default for new projects?

    Line length often changes from commit to commit, which creates noise in source control all the time :)
     
  17. harryr

    harryr

    Unity Technologies

    Joined:
    Nov 14, 2017
    Posts:
    38
    We can definitely look into this, great idea! :)
     
    Michael-Ryan, DrummerB, Ludiq and 8 others like this.
  18. DavidNLN

    DavidNLN

    Joined:
    Sep 27, 2018
    Posts:
    77
    Very interesting. We have moved to FlatBuffers for most of our data as Unity wasn't able to load scenes fast enough; maybe we can now move some of the smaller stuff back for easier usage ^^
     
    MadeFromPolygons likes this.
  19. EvOne

    EvOne

    Joined:
    Jan 29, 2016
    Posts:
    172
    Wow! 0_o Unity 2019.3.a7 is so fast!!! :):):)
    There is just no comparison with the 2019.2 beta!
    This is just amazing! :) I installed the 2019.2 beta to be able to work with 2D animation and other preview packages, and after working with it a little I realized that my notebook would most likely not be able to work normally with any new versions of Unity... :(

    But with 2019.3.a7 everything "flies" again! :))) It was completely unexpected... :)

    Big, big, BIG THANKS to the developers! :D

    P.S. Sorry for the Google Translate... :rolleyes:
     
    MadeFromPolygons likes this.
  20. jashan

    jashan

    Joined:
    Mar 9, 2007
    Posts:
    3,304
    Hi @harryr , this sounds like a really awesome improvement! I'm not sure this is the right place / scope, but it might be:

    One thing that is bothering me a lot is that during Unity updates, on the actual update, only a few changes are applied in the actual project files (i.e. everything under Assets). However, there are a lot of changes that occur in the YAML files (I'm assuming it's the YAML files, the two types of assets I see this most with are scenes and prefabs), and those changes seem to be then written into the files whenever the file is opened and a change is made, causing a lot of noise unrelated to the actual changes that we as developers do in those files.

    While I can totally see how this will usually improve the update experience by not having to iterate over all the files on every update, it really does mess up version control: after each Unity update (and that seems to include minor releases), the changes end up in version control distributed over several days, whenever any file that has such changes is touched.

    I believe this is a case where both approaches are needed, because it's likely that some people will prefer the current approach, with quicker project updates. But other people, like me, would prefer to have every change caused by a Unity update in source control in my "Updated Unity to 2019.x.y" commit, separate from my actual development changes.

    Such an option would most likely be best put into Preferences / General, next to "Compress Assets on Import" (because it's a very similar decision to be made), and could be called something like "Immediately Update Asset Layout on Unity Update" or maybe "Update all YAML-Files on Unity Update".

    You might be doing this during the 2019.3 update already ... or convert to Asset Database v2 "on-the-fly" which would also make finding serialization bugs much harder during the alpha. But from my experience, finding actual serialization bugs would be pretty difficult due to the many "Unity version update" changes we have when going from one version to the next (e.g. 2019.2 to 2019.3).
     
  21. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    6,326
    When I update Unity, I call
    AssetDatabase.ForceReserializeAssets
    after the update to make sure Unity writes all assets in the new format to disk, so I don't run into this problem. Then submit/push all changed files to version control.

    https://docs.unity3d.com/ScriptReference/AssetDatabase.ForceReserializeAssets.html
     
    phobos2077, WolveX, fherbst and 9 others like this.
  22. jashan

    jashan

    Joined:
    Mar 9, 2007
    Posts:
    3,304
    Ah, very cool, that'll solve this for me. Thank you!
     
    liortal, Peter77 and harryr like this.
  23. harryr

    harryr

    Unity Technologies

    Joined:
    Nov 14, 2017
    Posts:
    38
    Hi @jashan, thanks for the feedback!
    @Peter77 beat me to it! The ForceReserializeAssets API was added for that exact reason :)
     
    Peter77 and jashan like this.
  24. AlkisFortuneFish

    AlkisFortuneFish

    Joined:
    Apr 26, 2013
    Posts:
    903
    Oooo, how did I miss that! We have long had a "Set Dirty" context menu: when updating I select all assets, wait ages, right click, wait ages, Set Dirty, then save, just to achieve that.
     
  25. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    6,326
  26. Awarisu

    Awarisu

    Joined:
    May 6, 2019
    Posts:
    215
    Is there a button somewhere that you can press to achieve this or are you forced to script the editor?
     
  27. steego

    steego

    Joined:
    Jul 15, 2010
    Posts:
    967
    You have to script the editor, but if you put a script like this in a folder named Editor, you get it as a menu item:

    Code (csharp):
    using UnityEditor;

    namespace MyNamespace.Editor.Assets
    {
        public static class ReSerializeAssets
        {
            // Adds a "Re-Serialize Assets" entry to the File menu.
            [MenuItem("File/Re-Serialize Assets", false, 198)]
            private static void Reserialize()
            {
                // Called without arguments, this reserializes all assets in the project.
                AssetDatabase.ForceReserializeAssets();
            }
        }
    }
     
    Deozaan, SugoiDev and harryr like this.
  28. fherbst

    fherbst

    Joined:
    Jun 24, 2012
    Posts:
    800
    @harryr have you actually implemented that "serialize single line" behaviour by now?

    I've just run ForceReserializeAssets on a large project that we've been seeing a ton of serialization woes with. I can already see that there are a lot of issues with the now-reserialized files: properties change, file IDs change, references change, serialization reorders all over the place, and there are hundreds of console errors and warnings during ForceReserializeAssets:
    • "Component could not be loaded when loading game object. Cleaning up!"
    • "Component is no longer available in Unity. References to it will be removed!"
    • "You are trying to replace or create a Prefab from the instance 'Camera' that contains the script 'OrbitCameraController', which does not derive from MonoBehaviour. This is not allowed."
    • "Identifier uniqueness violation"
    • "You are trying to replace or create a Prefab from the instance 'Radar' that references a missing script. This is not allowed."
    • "This MeshCollider requires the mesh to be marked as readable in order to be usable with the given transform."
    • "Lighting data asset ‘LightingData’ is incompatible with the current Unity version. "
    • "Failed to extract System.Net.WebSockets.WebSocketException class of base type System.ComponentModel.Win32Exception"
    The operation resulted in nearly 5000 touched files, with about 3000 of them being actual binary changes (remaining changed files after staging in git).



    Here are some examples I saw in git:

    [image: upload_2019-11-10_17-53-23.png]
    fileIDs getting switched after serialization reordering

    [image: upload_2019-11-10_17-53-56.png]
    properties changing just because of the reserialization (this one changed in nearly all scene files)

    [image: upload_2019-11-10_17-55-9.png]
    Array elements doing crazy things

    [image: upload_2019-11-10_17-55-33.png]
    Another example of changing fileID just because of reserialization

    [image: upload_2019-11-10_19-34-13.png]
    transform hierarchy changed on re-serialization

    Additionally, it seems there's no info given (besides the logged errors) about which objects these changes are actually applied to. Would be great if the log would properly ping the object that caused it.
     
    Last edited: Nov 10, 2019
    Havokki and mh114 like this.
  29. AlkisFortuneFish

    AlkisFortuneFish

    Joined:
    Apr 26, 2013
    Posts:
    903
    Have you tried running ForceReserializeAssets on an older version of Unity to see what's going on? One thing I have found is that it will bring up any longstanding issues on assets that have not been touched in a long while, especially missing scripts and the like.
     
  30. fherbst

    fherbst

    Joined:
    Jun 24, 2012
    Posts:
    800
    Yes, that's exactly why I made this post, as we were seeing issues with serialization and I think running the re-serialization made them more visible.
    I guess the interesting question @harryr is "Which kinds of breaks and changes are considered bugs?"
     
    Last edited: Nov 11, 2019
  31. fherbst

    fherbst

    Joined:
    Jun 24, 2012
    Posts:
    800
    After trying this out on a couple of projects I can see two clear bugs (besides the unclear behaviour above which might be ok):

    - m_useShadowMask is toggled on every run of AssetDatabase.ForceReserializeAssets on every .unity scene file (from 1 to 0, from 0 to 1, etc etc)
    - serialization reordering happens only with scene files and prefab files on subsequent runs, and pretty consistently in nearly all of them (it seems to not happen on very small scenes and prefabs)

    I would expect that running the method after the first run would not change the project again.
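
    One way to check that expectation from outside the editor is to hash the scene and prefab files after one run and again after the next. A minimal Python sketch follows; the function names, the file suffix filter and the demo file are hypothetical, and the edit in the demo merely stands in for a second reserialization run. If the project is committed after the first run, `git status` gives the same information.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(root, suffixes=(".unity", ".prefab")):
    """Return {relative path: SHA-256 digest} for matching files under root."""
    digests = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            key = path.relative_to(root).as_posix()
            digests[key] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_files(before, after):
    """Paths that were added, removed, or modified between two snapshots."""
    return sorted(p for p in before.keys() | after.keys() if before.get(p) != after.get(p))


# Demo on a throwaway directory with a made-up file.
with tempfile.TemporaryDirectory() as tmp:
    scene = Path(tmp) / "Scenes" / "Main.unity"
    scene.parent.mkdir()
    scene.write_text("flag: 1\n")
    first = snapshot(tmp)           # state after the first run
    scene.write_text("flag: 0\n")   # stand-in for what a second run rewrote
    second = snapshot(tmp)          # state after the second run
    demo_changes = changed_files(first, second)
print(demo_changes)
```

    An empty list after the second run would mean the operation is stable; any path listed is a file that changed again without an edit.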

    @LeonhardP this also applies to the current beta with AssetDatabase v2
     
    NibbleByteSSG likes this.
  32. Ramobo

    Ramobo

    Joined:
    Dec 26, 2018
    Posts:
    212
    I have to ask: Why are scene objects serialized inline with references bouncing around everywhere instead of a hierarchical format? I understand IDs for references between objects, but not for parenting, of all things. Also, what's the point of saving each object in its own YAML document?

    Let's see a text representation of the current format (simplified):
    Code (CSharp):
    -DOC 1
    GameObject:
        Components:
            2
            3
    -DOC 2
    Transform:
        GameObject: 1
        Children:
            4
    -DOC 3
    GameObject:
        Components:
            4
            5
    -DOC 4
    Transform:
        GameObject: 3
        Parent: 2
        Children: // Yes, empty.
    -DOC 5
    MonoBehaviour:
        GUID: {XYZ}
        GameObject: 3
        TransformReference: 2
    As you can see, the game objects reference their components and the components reference their respective game objects.

    And now a representation of what would make sense to me:
    Code (CSharp):
    GameObject:
        ID: 1
        Components:
            Transform:
                ID: 2
                Children:
                    GameObject:
                        ID: 3
                        Components:
                            Transform:
                                ID: 4
                            MonoBehaviour:
                                GUID: {XYZ}
                                ID: 5
                                TransformReference: 2
    Quite a lot of spaces, but nothing a few tabs can't fix. A considerable amount of space saved in a big scene. Also, what's with that u in MonoBehaviour? The US English spelling doesn't have it.
     
  33. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    6,326
    https://www.grammar.com/behavior_vs._behaviour

    Unity seems to use a mix of US and British English. Unity uses Color rather than Colour, and MonoBehaviour instead of MonoBehavior, for example.
     
    phobos2077 likes this.
  34. harryr

    harryr

    Unity Technologies

    Joined:
    Nov 14, 2017
    Posts:
    38
    Hey!
    I can confirm this feature will be in Unity 2020.1, we ran into some unforeseen issues that caused a slight delay.

    Whilst I can't say for sure, it doesn't seem like these issues would be caused by the upgrade to the new YAML library. As @AlkisFortuneFish said, I fear that calling ForceReserializeAssets perhaps brought up some longstanding issues that have been slightly hidden.

    Obviously that doesn't mean it's OK that you've encountered this, and I'm sorry it's caused issues in your project. Would you be able to log a bug about the two consistent issues you're seeing across projects? (m_useShadowMask and object reordering)
     
  35. AcidArrow

    AcidArrow

    Joined:
    May 20, 2010
    Posts:
    9,855
    The reordering happens for me too but on 2018.4, so it’s not an issue with the latest serialization changes.
     
  36. AlkisFortuneFish

    AlkisFortuneFish

    Joined:
    Apr 26, 2013
    Posts:
    903
    Actually, nothing a few tabs *can* fix. With the hierarchies the average Unity project has, the overhead would be tremendous, and what would we gain exactly? The way the objects are serialized right now makes sense.

    Serialize a big graph structure with Newtonsoft, which does exactly what you are proposing. The overhead is literally over 90%.
     
  37. Ramobo

    Ramobo

    Joined:
    Dec 26, 2018
    Posts:
    212
    Last edited: Feb 11, 2020
    sand_lantern and phobos2077 like this.
  38. Ramobo

    Ramobo

    Joined:
    Dec 26, 2018
    Posts:
    212
    We would theoretically have smaller YAML file sizes. Throwing my examples into a text file, the inline structure is 296 bytes and the hierarchical one is 253 bytes (with spaces converted to tabs, of course), a 14.53% decrease. Now think about that in a long file. Like I mentioned, it's not just the inline structure but also the two-way references that appear to me to only waste space. So even if the hierarchical structure is a dumb idea, there's still the potential to get rid of the seemingly unnecessary references back to the parent, which is already referencing the child.
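
    The percentage follows directly from the two byte counts quoted above. As a quick check in Python (the variable names are just for illustration, and this says nothing about how the ratio behaves for larger or deeper files):

```python
inline_bytes = 296        # size reported for the inline (current) layout
hierarchical_bytes = 253  # size reported for the nested layout, tabs for indentation

decrease = (inline_bytes - hierarchical_bytes) / inline_bytes
print(f"{decrease:.2%}")  # 14.53%
```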

    Overhead in what? File size? I can't see how. Speed? Maybe.
     
  39. AlkisFortuneFish

    AlkisFortuneFish

    Joined:
    Apr 26, 2013
    Posts:
    903
    Size, if using formatting. Obviously, with JSON you just disable formatting, but it is required for YAML. In Unity terms, if you have a hierarchy that is 100 deep, which is not at all uncommon with really complex animation rigs, a quick calculation shows you will have an overhead of 100 bytes for every field at the deepest end.
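
    That quick calculation can be sketched in Python. The model is a deliberate simplification: a single parent-child chain, one tab (one byte) of indentation per hierarchy level, and a hypothetical 10 fields per object. A nested layout like the one proposed earlier would indent by more than one level per hierarchy step, so these numbers are on the low side.

```python
def indentation_overhead(depth, fields_per_object, indent_bytes=1):
    """Total bytes of leading whitespace for one parent-child chain.

    Level 1 is indented by one unit, level 2 by two units, and so on.
    """
    return sum(level * indent_bytes * fields_per_object for level in range(1, depth + 1))


# Whitespace in front of a single field at the deepest of 100 levels.
deepest_field_overhead = indentation_overhead(100, 1) - indentation_overhead(99, 1)

# Whitespace across the whole chain, assuming 10 fields on every object.
chain_overhead = indentation_overhead(depth=100, fields_per_object=10)

print(deepest_field_overhead, chain_overhead)  # 100 50500
```

    The per-field cost grows linearly with depth, so the total for a chain grows quadratically, which is where the concern about deep rigs comes from.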

    Besides, Unity serializes by pretty much doing a memory dump through a transfer function, not by following unmanaged references around.

    That would potentially result in serialization that drastically changes just by adding a new reference to an existing object. Unless you are suggesting Unity should actually start serializing in a non-generic way that treats object scene hierarchy as a first class citizen. That would be a different story altogether and quite a drastic change.
     
  40. Ramobo

    Ramobo

    Joined:
    Dec 26, 2018
    Posts:
    212
    Well, that makes more sense.
     
  41. AlkisFortuneFish

    AlkisFortuneFish

    Joined:
    Apr 26, 2013
    Posts:
    903
    I may have found a crash bug related to serialization of Unicode characters in strings, namely emojis. I'll investigate some more and see if it's caused by this. Those characters used to survive Unity serialization just fine (we render emojis using our own modifications to TMPro).
     
  42. Awarisu

    Awarisu

    Joined:
    May 6, 2019
    Posts:
    215
    sand_lantern and phobos2077 like this.
  43. Ramobo

    Ramobo

    Joined:
    Dec 26, 2018
    Posts:
    212
    PascalCase for public fields? The only standard I heard about that is camelCase, but I do agree that it's a mess.
    And yeah, I was referring to the fact that their new naming conventions are another piece of inconsistency to add to the table. Now they have old Unity (UnityEngine, mostly), new Unity (Unity, also mostly) and Mathematics (Unity.Mathematics, also mostly), <sarcasm>because 99.99% of Unity programmers are shader programmers.</sarcasm>

    EDIT: Huh. Then now I'm breaking PascalCase for public fields (I use camelCase) and no underscores (I use _camelCase for private fields).
    But if you look at it for a bit, not even Roslyn follows those conventions entirely. For example, here you can see them using _camelCase for private fields. Hell, CoreFX too.
     
    Last edited: Nov 24, 2019
  44. AlkisFortuneFish

    AlkisFortuneFish

    Joined:
    Apr 26, 2013
    Posts:
    903
    Unity’s convention is generally camelCase for public properties. Most m_Stuff you see in serialization is private serialised fields. I cannot think of any *public* m_ fields in the engine.

    But yes, it is not consistent across systems, especially considering the standards are changing with the new package driven development.
     
    phobos2077 and Peter77 like this.
  45. Baste

    Baste

    Joined:
    Jan 24, 2013
    Posts:
    6,076
    There is one large instance of this, in Cinemachine. That team has their own little special convention, just for themselves.
     
  46. Awarisu

    Awarisu

    Joined:
    May 6, 2019
    Posts:
    215
    Yes, I was thinking of Cinemachine as the most obvious example of public m_camelCase. Speaking of packages, it's nice to see that Unity is going with the Java naming convention for packages, god forbid they'd use something NuGet-based like everyone else for .NET.
     
  47. Ramobo

    Ramobo

    Joined:
    Dec 26, 2018
    Posts:
    212
    And what's wrong with the simpler Author.Package.Modules?
     
  48. superpig

    superpig

    Drink more water! Unity Technologies

    Joined:
    Jan 16, 2011
    Posts:
    4,576
    One important use case for the scene/prefab format is merging changes - we want to minimise the risk of merge conflicts, data corruption due to bad merges, etc. (We do provide the semantic merge tool to try and help resolve such things, but it's better if we can design the format to avoid conflicts arising in the first place).

    It's one reason why fileIDs are not just sequential, for example - if they were then it would make two people adding an object to the scene file at the same time a guaranteed conflict that may not be easy to resolve. It's also a benefit that a format like YAML has over JSON - no issues with merges producing mismatched braces or trailing commas.
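
    The point about sequential IDs can be illustrated with a small Python sketch. It is not how Unity allocates fileIDs; the two allocator functions and the starting IDs are hypothetical and only show why "max + 1" guarantees a collision when two people branch from the same file, while a random draw from a large range makes one very unlikely.

```python
import random

existing_ids = {1, 2, 3}  # IDs already present in the shared scene file


def next_sequential(ids):
    """Allocate the next ID as max + 1."""
    return max(ids) + 1


def next_random(ids, rng):
    """Allocate a random positive 63-bit ID that is not already in use."""
    while True:
        candidate = rng.getrandbits(63) + 1
        if candidate not in ids:
            return candidate


# Two people start from the same file and each add one object on their own branch.
alice_seq = next_sequential(existing_ids)
bob_seq = next_sequential(existing_ids)
sequential_collides = alice_seq == bob_seq

# Different seeds stand in for two independent machines.
alice_rnd = next_random(existing_ids, random.Random(1))
bob_rnd = next_random(existing_ids, random.Random(2))
random_collides = alice_rnd == bob_rnd

print(sequential_collides, random_collides)
```

    With the sequential scheme both branches claim ID 4, so the merge has two different objects with the same identifier. With random IDs the two new documents can simply both be kept.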

    Encoding the hierarchy in the way you're describing could be quite fragile - you'd get a conflict if e.g. you add an object or component, while I change the parenting somewhere upstream of you.
     
  49. superpig

    superpig

    Drink more water! Unity Technologies

    Joined:
    Jan 16, 2011
    Posts:
    4,576
    Did the version of the LightmapSettings object also change at the same time? When upgrading from version 9, m_UseShadowmask is enabled if the older m_ShadowMaskMode property was nonzero, and when upgrading from version 10 there was another upgrade step where we folded in a version of the property with the wrong casing in the name ("m_UseShadowMask").
     
  50. Ramobo

    Ramobo

    Joined:
    Dec 26, 2018
    Posts:
    212
    Ah, if only Smart Merge/YAML Merge/whatever actually worked... Since it doesn't, we have to resort to manual merging, which, if you have more than a few conflicts, has lots of conflicts between completely unrelated things. Even if it worked, you have to set it up for every Unity installation. Is it that hard to detect if you're in a repository like Visual Studio does? Hell, as far as I can tell, you already use the Git configuration for package manager authentication, so we shouldn't even need to configure the manual conflict editor, just get it from the Git configuration.