Search Unity

  1. Welcome to the Unity Forums! Please take the time to read our Code of Conduct to familiarize yourself with the forum rules and how to post constructively.
  2. Voting for the Unity Awards are OPEN! We’re looking to celebrate creators across games, industry, film, and many more categories. Cast your vote now for all categories
    Dismiss Notice
  3. Dismiss Notice

Efficiency of name generation algorithm

Discussion in 'Scripting' started by Sendatsu_Yoshimitsu, May 25, 2018.

  1. Sendatsu_Yoshimitsu

    Sendatsu_Yoshimitsu

    Joined:
    May 19, 2014
    Posts:
    691
    I wrote a function to randomly generate code names like X-Com's mission titles, by combining random nouns and adjectives in the form of "Operation: Discovered Warthog" or what have you. I have a bunch of lists of random nouns and adjectives in different categories, and generating a single name is as easy as choosing a random adjective list, picking an element from it, then repeating for the noun.

    Where I'm concerned is preventing repetition. I already keep a list of every name already generated, so technically preventing repetition is as simple as doing this:

    Code (csharp):
    1.  
    2. List<string> allocatedNames;
    3.  
    4. string newName = GetRandomName();
    5. while (allocatedNames.Contains(newName) && allocatedNames.Count < maxPossibleCombinations){
    6.     newName = GetRandomName();
    7. }
    8.  
    9. allocatedNames.Add(newName);
    10.  
    The problem is that the efficiency of that check is going to drastically decrease as the list of allocated names gets larger and the possible valid combinations decrease, and when even 60-70% of the possible combinations have been allocated, we'd be very likely running GetRandomName tons of times for every one valid name it would output.

    I recognize that this is vanishingly unlikely to happen in a typical use case; with even 50 nouns and 50 adjectives you'd have 2,500 possible combinations, which is ten times the number of missions a typical x-com campaign lasts. But it still seems like a serious flaw- is there anything obvious I can do to prevent performance from tanking as the list fills?
     
  2. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,378
    In steps the 'HashSet':
    https://msdn.microsoft.com/en-us/library/bb359438(v=vs.110).aspx

    The collection stores entries uniquely based on the hashcode for the entry.

    An entry can only exist once (unique).

    But the other benefit is that it has an O(1) retrieval (test contains). Since all it has to do is calculate the hashcode, look in the corresponding spot, and confirm the entry exists or not.
     
  3. LeftyRighty

    LeftyRighty

    Joined:
    Nov 2, 2012
    Posts:
    5,148
    How about creating a Queue collection of each; populating them randomly at the beginning of the game (session or save), if you have diffferent numbers of nouns and adjectives they'll stay out of sync for some time and the name creation is just "get the next noun, get the next adjective" and move the used ones back to the end of the Queue
     
  4. Suddoha

    Suddoha

    Joined:
    Nov 9, 2013
    Posts:
    2,824
    First of all, instead of a list, you could use a dictionary. For calls such as Contains, a normal list has a time complexity of O(N), whereas lookups in a dictionary have a time complexity of O(1).

    Another solution would be to pre-build a list of all names and then randomly choose one. The strategy of managing the list in a memory-efficient manner depends on the total number of elements in the final list.

    For a few hundreds or even thousands, you could easily keep them in memory. When you reach numbers that would actually consume too much memory, you could keep them in a file and either load them on demand or pre-select N items for a game session (and optionally repeat whenever new ones are needed).
     
    lordofduct likes this.
  5. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,378
    You should definitely checkout HashSet.

    It's basically a Dictionary of only keys, no values. Meaning it has the same O(1) efficiency for retrieval.
     
    Suddoha likes this.
  6. Sendatsu_Yoshimitsu

    Sendatsu_Yoshimitsu

    Joined:
    May 19, 2014
    Posts:
    691
    Ooh that's a new class for me and should work fine; thank you!!
     
  7. Suddoha

    Suddoha

    Joined:
    Nov 9, 2013
    Posts:
    2,824
    That's actually what I had in mind. :oops: Don't know why I wrote dictionary.:confused:
     
  8. kru

    kru

    Joined:
    Jan 19, 2013
    Posts:
    452
    I want to start by reiterating that a HashSet is what the OP is looking for to solve his initial question. However, I suggest that there is a better way to approach the problem of generating mission names that is both more memory and time-wise efficient.

    I suggest something like this. Popping items off a shuffled list is more robust than get-while-not-duplicate methods.

    One clever way to accomplish this is to use a non-repeating PRNG to build your adjective-noun pairs. Each call of Next() to a non-repeating PRNG is guaranteed to give a not-before seen number (until all integers have been exhausted, whereupon they begin to repeat). We can save ourselves the storage cost of even the hashed names.

    Here's a primer on how one might accomplish it: http://preshing.com/20121224/how-to-generate-a-sequence-of-unique-random-integers/