Saving/loading "used" ScriptableObject instances in savegame file

Discussion in 'Scripting' started by cvbattum, Mar 9, 2020.

  1. cvbattum (Joined: Apr 20, 2018; Posts: 12)

    While creating a saving/loading system for my game I ran into a problem. In our game the player has a finite number of objects (Posts) to consume. These are assigned and cached at the start of the game, for example in the GameManager class. This class takes a list of references to Post ScriptableObjects and creates a pool in which they will be stored. The problem arises when serializing this pool. See the following code to get a basic idea of the class structure.

    Code (CSharp):
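    using System.Collections.Generic;

    public class PostPool {
        // Posts the player can still choose from.
        private HashSet<Post> availablePosts = new HashSet<Post>();
        // Posts the player has already consumed.
        private HashSet<Post> usedPosts = new HashSet<Post>();

        // other data and methods...
    }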
    Code (CSharp):
    using UnityEngine;

    public class Post : ScriptableObject {
        [SerializeField] private string title;
        [SerializeField] private string content;

        // more primitive data and properties...
    }
    Take the following situation: We have released v1.0 of our game, and players are starting to build up their own savegames. A bit later we release v1.1, which has more Posts for the player to choose from. When players update to v1.1, the posts they used in v1.0 should not simply be returned to the pool. I wouldn't consider that proper saving, since the set of used posts is an integral part of the game state. The new posts, however, should be added to the list of available posts.

    This in itself is not a big challenge, but in my case the Posts are instances of ScriptableObject. After each build of the game, they may have a slightly different signature/hash. If I were to make them Serializable (which, judging by a quick search on Google, most people do not recommend), I run the risk of not being able to deserialize the objects into the exact same instance again (in the sense that an equality check between them or their hashes would return true). An easier solution would be to give every post its own ID, but that would require some universal manager of post IDs. There isn't one, because the posts are assigned as references through inspectors in the editor, after which Unity creates instances of them. I could have the pooler take care of this as the game is loading up, but then I run into the exact same problem: how do I make sure that each Post instance gets the exact same ID it had in the previous build?

    The solution with IDs would definitely be better from a general point of view as well: Since I'm not going to use that Post ever again (it has already been used after all), I don't need to store anything more than an indication that this Post has already been used. Storing all of the data of the Post is unnecessary in this case and a simple ID would suffice. Yet again, the problem remains the same.

    How do I get a universal, reliable way to compare these ScriptableObjects across different versions or even builds of my project? I might have made some incorrect assumptions about the way ScriptableObjects work and how they are stored because I'm not very well versed in Unity's serialization engine (nor C# serialization in general). I'm actually not a big fan of using ScriptableObjects for this situation and I'd much rather use an actual database, but that requires time I don't have for the project. I'm aware it's not ideal, but there has to be a solution.
     
  2. Serinx (Joined: Mar 31, 2014; Posts: 785)

    Edit: Sorry I think I missed the part about the serialization of ScriptableObjects. I would recommend having your own serializable class which reads the values from the scriptable object if you're having trouble with that. I don't know too much about that though.



    I just did a post about backward compatible save files!

    Correct me if I'm wrong. This is your scenario:

    Version 1 has some data, let's say "A", "B" and "C".

    Player uses "B" and saves, so "B" is removed.

    Version 2 has posts "A", "B", "C" and "D".

    You want the player to load their save and get "A" and "C", but you also want them to have the new data: "D"

    Is that correct?

    I think the only thing to do here is to store a version number in your save files. If the game version has increased since the last save, you need to perform certain upgrade operations on load.
    In this case, you want to add the "new data" to the save data.
    So you want to add post "D" to the data when they load it. You know they can't have used it, because it didn't exist in the previous version.
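
A minimal sketch of that idea in plain C#, outside Unity. The names `SaveData`, `SaveMigration`, `PostsAddedIn` and `CurrentVersion` are made up for illustration, and posts are represented by the letters from the scenario above. It only shows the "add what is new since the saved version" step, not how the file is read or written.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical save-file shape: a version number plus the posts still available.
public class SaveData
{
    public int Version;
    public HashSet<string> AvailablePosts = new HashSet<string>();
}

public static class SaveMigration
{
    // Hypothetical table: which posts were introduced in which game version.
    private static readonly Dictionary<int, string[]> PostsAddedIn =
        new Dictionary<int, string[]>
        {
            { 1, new[] { "A", "B", "C" } },
            { 2, new[] { "D" } },
        };

    public const int CurrentVersion = 2;

    // Adds every post introduced after the save was written,
    // then stamps the save with the current version.
    public static void Upgrade(SaveData save)
    {
        for (int v = save.Version + 1; v <= CurrentVersion; v++)
        {
            if (PostsAddedIn.TryGetValue(v, out string[] added))
            {
                save.AvailablePosts.UnionWith(added);
            }
        }
        save.Version = CurrentVersion;
    }
}
```

A v1 save holding "A" and "C" would come out of `Upgrade` holding "A", "C" and "D", while "B" stays used because it was never re-added.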

    Here's my post for reference:
    https://forum.unity.com/threads/backward-compatible-save-files.842854/
     
    Last edited: Mar 9, 2020
  3. AngryGamemaster (Joined: Jun 24, 2016; Posts: 4)

    I just posted this thread: https://forum.unity.com/threads/system-for-reading-writing-class-data-from-to-disk-c.842968/

    It doesn't implement ScriptableObject writing, which I should add, as my previous version of this system did. But if you want to learn this system for file saving and creation, give it a whirl.

    [edit]
    The link I provided is basically a database system. You may find it useful to poke around the test examples and see how you can implement the interface. I suggest the simplePackage object.

    Feel free to send me more information on your code and I can send it back to you with the system implemented.

    But I would imagine that the problem you're having has to do with using HashSets. I've never used them, but they create a large array of data and attempt to organize it in a way that makes searching easier. I'm not sure how many posts you're using, but I think a List<Post> with a large starting capacity might perform better.
     
    Last edited: Mar 9, 2020
  4. orionsyndrome (Joined: May 4, 2014; Posts: 3,043)

    Three questions.
    1. What exactly is a post? Sorry, English is not my native language, and I cannot understand the word post in this context.
    2. What data does it contain?
    3. Have you heard of a hash function?
     
  5. cvbattum (Joined: Apr 20, 2018; Posts: 12)

    @Serinx This is an interesting idea. Thanks for the link! It doesn't cover the full extent of my problem, but it is part of it. The other part is how I store the Post in my save file. Since a Post is already a ScriptableObject, it should not be stored in full, only as a reference. After sleeping on it, I do believe the only solution is to introduce an ID system that identifies each post. Then for saving I would only write the IDs of the used objects to the save file, and for loading I would reconstruct the HashSets in the way you're proposing, but based only on the IDs that are stored in the save file.
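
The save/load plan described above could look roughly like this. It is a sketch in plain C#, assuming every Post already carries a stable string ID that survives builds (for example one stored in a serialized field on the asset); `PostIdPool`, `Use`, `IsAvailable` and `ToSaveData` are hypothetical names, and the pool tracks IDs rather than the `Post` objects themselves.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Plain C# stand-in for the pool; each Post is represented only by its stable ID.
public class PostIdPool
{
    private readonly HashSet<string> available = new HashSet<string>();
    private readonly HashSet<string> used = new HashSet<string>();

    // catalogIds:   IDs of all posts shipped in the current build.
    // savedUsedIds: IDs read from the save file (empty for a new game).
    public PostIdPool(IEnumerable<string> catalogIds, IEnumerable<string> savedUsedIds)
    {
        var catalog = new HashSet<string>(catalogIds);

        // Ignore saved IDs for posts that no longer exist in this build.
        used.UnionWith(savedUsedIds.Where(id => catalog.Contains(id)));

        // Everything in the catalog that has not been used is available,
        // which automatically includes posts added in a newer version.
        available.UnionWith(catalog);
        available.ExceptWith(used);
    }

    public bool IsAvailable(string id) => available.Contains(id);

    // Moves a post from available to used; returns false if it was not available.
    public bool Use(string id)
    {
        if (!available.Remove(id)) return false;
        used.Add(id);
        return true;
    }

    // The only data that needs to go into the save file.
    public List<string> ToSaveData() =>
        used.OrderBy(id => id, StringComparer.Ordinal).ToList();
}
```

With this shape no version number is needed for the pool itself: a save written by v1.0 with "B" used, loaded against a v1.1 catalog of "A", "B", "C", "D", leaves "A", "C" and "D" available.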

    @orionsyndrome
    1. A Post in this case is an object specific to my game. We're making a social media simulator game with predefined things the player can post to their timeline. It is a bit of an arbitrary name, but the most important part of it is in the code snippet that I added.
    2. It is a ScriptableObject and holds a couple of primitive data fields as seen in the code snippet. None of the fields are necessarily identifying, hence my idea to introduce an ID system.
    3. I have, but I don't think it's very applicable. I don't know exactly how the default GetHashCode in C# is implemented but I assume it looks at the signature of the entire object, including whatever I inherited from ScriptableObject. Meaning that I can't rely on everything staying the same throughout different builds. I could implement my own hash function but I am wary of doing that. And again, as none of the data in my class is necessarily identifying for the object, I don't really want to make a hash based on my own data either because that too is subject to change.
     
  6. orionsyndrome (Joined: May 4, 2014; Posts: 3,043)

    1. Ok, got it.
    2. This leads naturally into
    3. No, not GetHashCode. That's something else. You need a reliable hash function. The default GetHashCode implementation is not truly reliable; you're supposed to override the method to get the reliability you need. And in your case you don't really need that method at all, unless your Posts live in hash tables.
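
To illustrate what a "reliable" hash function means here: one whose output depends only on the input bytes, so it is the same on every run, machine and build. The sketch below uses 32-bit FNV-1a, which is my own pick for brevity and is not cryptographic; `StableHash` and `Fnv1a32` are hypothetical names. What string you feed it (a seed assigned once per Post, say) is a separate decision.

```csharp
using System.Text;

public static class StableHash
{
    // 32-bit FNV-1a over the UTF-8 bytes of the input.
    // Deterministic: the same text always yields the same value.
    public static uint Fnv1a32(string text)
    {
        const uint offsetBasis = 2166136261;
        const uint prime = 16777619;

        uint hash = offsetBasis;
        foreach (byte b in Encoding.UTF8.GetBytes(text))
        {
            unchecked
            {
                hash ^= b;      // mix in the next byte
                hash *= prime;  // multiply, wrapping around at 32 bits
            }
        }
        return hash;
    }
}
```

A 32-bit hash can still collide between two different inputs, so it would need a uniqueness check when IDs are assigned.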

    Look, you can't have unreliable data, because then nothing ever qualifies your Posts as unique. Whatever you do, you first need to make sure that you, as a human, can distinguish Post 1 from Post 234897. Also, you absolutely have to make sure that you have zero collisions (or to put it in layman's terms, overlaps) in the future. Because this is what you need, right? How would you compare two Posts anyway? And determine whether it was the same one or two different ones?

    To fix this, I recommend a combination of having unique tokens and block reservation.

    Tokens are special data signatures that are considered unique for your current block.
    Blocks are special, reserved values in your keys that you increment only when you release the next iteration of your product. Thus each iteration of your product has a unique block.

    Take a ulong (64-bit unsigned integer) and partition it into two parts like this. Let's call it a KEY (b = block bits, x = token bits):
    bbbbbbbb xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx

    This KEY has to be part of the Post's data.
    Together with the block bits, each KEY will be different from any other KEY ever generated.

    In this case you have a space of 256 potential blocks of tokens, each of which can contain 2^56 (about 72 quadrillion, or 7.2 × 10^16) different tokens in total.
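
The 8-bit block / 56-bit token layout above can be sketched with a few bit operations. `PostKey`, `Make`, `BlockOf` and `TokenOf` are hypothetical names, and how the token itself is generated is left out on purpose.

```csharp
public static class PostKey
{
    private const int TokenBits = 56;                        // low 56 bits hold the token
    private const ulong TokenMask = (1UL << TokenBits) - 1;  // 0x00FFFFFFFFFFFFFF

    // Packs the release block into the top 8 bits and the token into the rest.
    // Token bits above bit 55 are discarded.
    public static ulong Make(byte block, ulong token)
    {
        return ((ulong)block << TokenBits) | (token & TokenMask);
    }

    // Recovers the release block a key was generated in.
    public static byte BlockOf(ulong key) => (byte)(key >> TokenBits);

    // Recovers the 56-bit token.
    public static ulong TokenOf(ulong key) => key & TokenMask;
}
```

Because the block sits in the highest bits, two keys from different release blocks can never be equal, whatever their tokens are.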

    You generate tokens with a 32-bit hash function, and then you push (left-shift) the result to the left every day. To know how many days have passed, you subtract a hardcoded block release date from the server time. That leaves about 46 thousand years (2^24 days) of headroom between two different blocks. If you expect heavy traffic of Posts, you can tighten this rate to hours. That is still almost 2 thousand years (2^24 hours) of headroom.

    To generate a 32-bit hash, first you pick a hash function with a known high distribution entropy, you feed it with a seed that was generated once, and without tinkering with the result, you left shift it by the amount provided by the day counter.

    Chances that you will generate a collision by using this scheme are (literally) astronomically low.

    Additionally, you can always tell in which release block a key was generated. And you can also determine the rough time of its creation (because of the zero-padding on the right).


    Here's an implementation of a high entropy cryptographic hash function, just as an example.

    Because you don't need security per se, you don't even have to make sure whether it works as claimed, you really only need it to be extremely well distributed, and this is one of the best algorithms for that kind of thing.

    [edit]
    I've made an oversight, you cannot just shift left, you need to add 1 and shift left, and then rotate when it overflows. You're still adding additional entropy to the high 24 bits, which is more than enough to account for the passage of time.

    In any case, try fiddling around with such a possibility. I'm sure you'll find the best solution eventually.
     
    Last edited: Mar 9, 2020