Join together strings based on length

Discussion in 'Scripting' started by ashley, Jan 3, 2017.

  1. ashley

    Joined:
    Nov 5, 2011
    Posts:
    84
    I have a list of strings that I want to join together until the combined length reaches a threshold set a given percentage below a defined maximum (say 10% below 100, i.e. 90 chars or more), or until the next string is too big to join.

    Thus if I started with this list:
    • Little sentence paragraph.
    • Has lots of small.
    • Sentences.
    • Although it also has some longer ones that would not be joined and as such they would remain as is.
    • This is kind of small.
    • So it would happily join with this sentence to make a paragraph together.
    • This would be by itself though.
    It would ultimately end up with the following items:
    • Little sentence paragraph. Has lots of small. Sentences. (56 chars)
    • Although it also has some longer ones that would not be joined and as such they would remain as is. (99 chars)
    • This is kind of small. So it would happily join with this sentence to make a paragraph together. (96 chars)
    • This would be by itself though. (31 chars)
    The first three are joined together because they are small on their own, but the "Although it also..." one would take the total over 100 characters, so it isn't joined to them. Strings 5 and 6 work in a similar way, and the final one has nothing to join to as it's at the end.

    I can use a do while loop to force it together, but it's very much programmed for the above bit of text, whereas I want it to work more flexibly to handle other text.

    What I've currently got is below, however essentially what I want to do is:
    1. Run through each string in stringList
    2. If string.Length >= maxLength - (maxLength/percDifference) add it to changedStringList
    3. Else, concatenate all following strings until the combined total is >= maxLength - (maxLength/percDifference) and <= maxLength
    Code (CSharp):
    using UnityEngine;
    using System.Collections;
    using System.Collections.Generic;

    public class JoinText: MonoBehaviour {

        public List<string> stringList = new List<string> ();
        public List<string> changedStringList = new List<string> ();

        int end = 0;
        public int maxLength = 100;
        public int percDifference = 10;

        void Start()
        {
            stringList.Add ("Little sentence paragraph.");
            stringList.Add ("Has lots of small.");
            stringList.Add ("Sentences.");
            stringList.Add ("Although it also has some longer ones that would not be joined and as such they would remain as is.");
            stringList.Add ("This is kind of small.");
            stringList.Add ("So it would happily join with this sentence to make a paragraph together.");
            stringList.Add ("This would be by itself though.");

            do{
                if (end == stringList.Count) {
                    break;
                }

                if(stringList[end].Length >= (maxLength - (maxLength/percDifference))) {
                    changedStringList.Add(stringList[end]);
                    end++;
                    continue;
                }

                if(stringList.Count > end+2) {
                    if (stringList[end].Length + stringList[end+1].Length + stringList[end+2].Length <= maxLength) {
                        changedStringList.Add(stringList[end] + " " + stringList[end+1] + " " + stringList[end+2]);
                        end += 3;
                        continue;
                    }
                }

                if(stringList.Count > end+1) {
                    if (stringList[end].Length + stringList[end+1].Length <= maxLength) {
                        changedStringList.Add(stringList[end] + " " + stringList[end+1]);
                        end += 2;
                        continue;
                    } else if (stringList[end].Length + stringList[end+1].Length >= maxLength) {
                        changedStringList.Add(stringList[end]);
                        changedStringList.Add(stringList[end+1]);
                        end += 2;
                        continue;
                    }
                }
                else {
                    changedStringList.Add(stringList[end]);
                }
                end++;
            } while (end <= stringList.Count);

        }

    }
    If you run that, it gets the results I want in changedStringList, but as I mentioned I'd rather do this more systematically (i.e. without having to work my way down from "if the next 3 strings are fine then join, else if the next 2 are fine then join, else just leave as is"). This has just been designed to hopefully make my intended outcome clearer. I know it's not ideal; I just don't really know how to approach this in any other way.

    However, if there's no solution that is going to make things much better, I'll work with the above and expand it out a bit more (to account for instances where you need to join 4+ strings together etc) but I don't want to have to manually account for instances where the allowed length might be 2000 and there's a bunch of 2 word strings.

    Please do let me know if this doesn't make sense because I'm worried it doesn't!

    Thanks in advance.
     
  2. Brathnann

    Joined:
    Aug 12, 2014
    Posts:
    7,187
    I would say that since you can already get the length of the strings, just check whether string1 + string2 is going to be within the range you want. If so, combine them; if not, store the current string and start a new one.

    It should be simple to start with a string that is equal to the first value. Then check if adding the second string will be out of range. If not, append the second string to the first. Then you loop and see if the combined string plus the next string is out of range. If not, combine again. Repeat. Once you are out of range, you save out the combined string and then you start again, using the next string as your base string.

    You may also want to look into the StringBuilder class, depending on how many strings you plan to allow.
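
    The loop described above can be sketched roughly as follows. This is only a minimal illustration, not code from the thread: the class name StringJoiner and the method JoinByLength are hypothetical, it assumes strings are joined with a single space, and it only enforces the upper limit (no lower percentage bound). It uses StringBuilder for the running string and resets it with Length = 0, which should also work on older .NET profiles.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (names are assumptions, not from the thread).
public static class StringJoiner
{
    // Greedily packs strings into chunks of at most maxLength characters,
    // separating joined strings with a single space.
    public static List<string> JoinByLength(IEnumerable<string> input, int maxLength)
    {
        var output = new List<string>();
        var current = new StringBuilder();

        foreach (string s in input)
        {
            // Length the chunk would have if s were appended (including the space).
            int joinedLength = current.Length == 0
                ? s.Length
                : current.Length + 1 + s.Length;

            if (joinedLength > maxLength && current.Length > 0)
            {
                // Out of range: save the combined string and start again from s.
                output.Add(current.ToString());
                current.Length = 0;
            }

            if (current.Length > 0) current.Append(' ');
            current.Append(s);
        }

        // Save whatever is left in the buffer.
        if (current.Length > 0) output.Add(current.ToString());
        return output;
    }
}
```

    Called with the seven sample strings from the first post and a maxLength of 100, this should give the four paragraphs listed there (56, 99, 96 and 31 chars). A string that is longer than maxLength by itself is kept as its own item rather than split.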
     
  3. lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,531
    You seem to have a lot going on in your script.

    This should be a simple loop over a collection of inputs.

    Something like this:
    Code (csharp):
    public static void ReduceStrings(IEnumerable<string> input, int maxLength, float bufferPercent, ICollection<string> output)
    {
        // bufferPercent is a ratio from 0 to 1 (0.1f means 10%).
        int lowerLength = maxLength - (int)(maxLength * bufferPercent);
        string str = string.Empty;

        foreach(var s in input)
        {
            // Length after joining, counting the separating space.
            int joined = str.Length == 0 ? s.Length : str.Length + 1 + s.Length;

            if(joined > maxLength)
            {
                // s doesn't fit: output what we have and start over from s.
                if(str.Length > 0) output.Add(str);
                str = s;
            }
            else
            {
                str = str.Length == 0 ? s : str + " " + s;
                if(str.Length >= lowerLength)
                {
                    output.Add(str);
                    str = string.Empty;
                }
            }
        }

        // Output whatever is left over.
        if(str.Length > 0) output.Add(str);
    }
    In your code you'd call it like so:

    Code (csharp):
    using UnityEngine;
    using System.Collections;
    using System.Collections.Generic;

    public class JoinText: MonoBehaviour {

        public List<string> stringList = new List<string> ();
        public List<string> changedStringList = new List<string> ();

        public int maxLength = 100;
        [Range(0f,1f)]
        public float percDifference = 0.1f;

        void Start()
        {
            stringList.Add ("Little sentence paragraph.");
            stringList.Add ("Has lots of small.");
            stringList.Add ("Sentences.");
            stringList.Add ("Although it also has some longer ones that would not be joined and as such they would remain as is.");
            stringList.Add ("This is kind of small.");
            stringList.Add ("So it would happily join with this sentence to make a paragraph together.");
            stringList.Add ("This would be by itself though.");

            ReduceStrings(stringList, maxLength, percDifference, changedStringList);
        }

        public static void ReduceStrings(IEnumerable<string> input, int maxLength, float bufferPercent, ICollection<string> output)
        {
            // bufferPercent is a ratio from 0 to 1 (0.1f means 10%).
            int lowerLength = maxLength - (int)(maxLength * bufferPercent);
            string str = string.Empty;

            foreach(var s in input)
            {
                // Length after joining, counting the separating space.
                int joined = str.Length == 0 ? s.Length : str.Length + 1 + s.Length;

                if(joined > maxLength)
                {
                    // s doesn't fit: output what we have and start over from s.
                    if(str.Length > 0) output.Add(str);
                    str = s;
                }
                else
                {
                    str = str.Length == 0 ? s : str + " " + s;
                    if(str.Length >= lowerLength)
                    {
                        output.Add(str);
                        str = string.Empty;
                    }
                }
            }

            // Output whatever is left over.
            if(str.Length > 0) output.Add(str);
        }
    }
     
  4. ashley

    Joined:
    Nov 5, 2011
    Posts:
    84
    Thanks so much. I knew there must be a better way to do it but I was having a complete mental block about how. I hadn't thought of introducing a dummy string as a temporary measure, even though I did something similar with something else (can I still blame the Christmas break?)

    One minor thing, and only if someone else is looking at this, but you put maxLength * bufferPercent whereas it should be divide. Did take me a few moments to figure out why it was adding anything, but it was because lowerLength ended up as -900!
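
    For anyone hitting the same thing: passing 10 where a 0 to 1 ratio is expected gives 100 - (int)(100 * 10) = -900. One way to make that mistake fail loudly is a range check around the same formula. This is just a sketch; the class BufferBounds and method LowerLength are hypothetical names, not part of the code above.

```csharp
using System;

// Hypothetical helper (names are assumptions, not from the thread).
public static class BufferBounds
{
    // Computes the lower bound with the same formula as ReduceStrings,
    // but rejects ratios outside 0..1 instead of silently going negative.
    public static int LowerLength(int maxLength, float bufferRatio)
    {
        if (bufferRatio < 0f || bufferRatio > 1f)
        {
            throw new ArgumentOutOfRangeException(
                "bufferRatio", "Expected a ratio between 0 and 1, e.g. 0.1f for 10%.");
        }

        return maxLength - (int)(maxLength * bufferRatio);
    }
}
```

    With this, LowerLength(100, 0.1f) gives 90, while LowerLength(100, 10f) throws instead of producing -900.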

    Anyway, thanks again. I really appreciate the clear guidance. :)

    Thanks. I can see this is pretty much the same advice as lordofduct and it did indeed work. Much appreciated!
     
  5. Brathnann

    Joined:
    Aug 12, 2014
    Posts:
    7,187
    Glad you got it working! I usually don't have the chance to flesh out an example, but lordofduct got one up beforehand, so I didn't feel the need. But it was exactly what I was thinking.
     
  6. lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,531
    Note I made the bufferPercent a float with a value from 0->1, so that you'd pass in 0.1 for 10%. In which case, yes, multiplication is what you would use.

    Code (csharp):
    [Range(0f,1f)]
    public float percDifference = 0.1f;
    Division won't work if you're using whole numbers either.

    100 - (100 / 50) = 98

    But you are saying 50, that's 50% (as far as I can tell from your definition). You should be getting 50 as an output not 98.

    If you wanted whole numbers you'd do something like this:

    Code (csharp):
    int lowerLength = maxLength - (int)(maxLength * bufferPercent / 100f);
    Personally though, I don't like whole number representations of percentages.

    I prefer the decimal ratio.

    They're arithmetically faster to use.
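
    To make the difference concrete, here is a small sketch that puts the whole-number formula next to the division from the original question. The class PercentBounds and its method names are hypothetical, chosen only for this illustration.

```csharp
// Hypothetical helper (names are assumptions, not from the thread).
public static class PercentBounds
{
    // Whole-number percentage: 10 means 10%, 50 means 50%.
    public static int FromWholePercent(int maxLength, int percent)
    {
        return maxLength - (int)(maxLength * percent / 100f);
    }

    // The division used in the original question. It only happens to
    // agree with the percentage form when the value is 10 (100 / 10 == 10).
    public static int FromDivision(int maxLength, int divisor)
    {
        return maxLength - (maxLength / divisor);
    }
}
```

    For a maxLength of 100, both give 90 when passed 10, which is why the division looked correct. Passed 50, FromWholePercent gives 50 while FromDivision gives 98.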
     
  7. ashley

    Joined:
    Nov 5, 2011
    Posts:
    84
    Ah yeah I didn't even notice the values you gave (I just looked at the method). Apologies for that.

    And you're very right about the maths too. I was fooled by it coincidentally working in my way and a lack of proper testing (at this stage). I really appreciate all your help though.