Bug with StringReader speed? Or am I missing something?

Discussion in 'Scripting' started by johot, Sep 27, 2011.

  1. johot

    johot

    Joined:
    Apr 11, 2011
    Posts:
    201
    So I have a big text file of words that I need to parse. I have them all in a TextAsset; it's about 500,000 lines of words, about 8 MB.

    Now I want to parse this file, line by line.

    I've tried three different ways.

    1. String.Split: this is really slow and I can't use it. Here I use the text property of the TextAsset (so one big string).

    2. StreamReader over a MemoryStream, using the bytes property of the TextAsset (so a byte[]). This is pretty fast, about 10x as fast as method 1.

    3. StringReader: this never even finishes. Do we have a bug here?
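
    For reference, method 2 can be sketched roughly as below. This is a minimal sketch, not the original poster's code: the class name ByteLineReader and its ReadLines method are hypothetical, UTF-8 encoding is an assumption, and in Unity the byte[] would come from the TextAsset's bytes property.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper, for illustration only.
public static class ByteLineReader
{
    // Reads every line from a byte buffer through a StreamReader.
    // Assumes the buffer holds UTF-8 text.
    public static List<string> ReadLines(byte[] bytes)
    {
        var lines = new List<string>();
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }
}
```

    Collecting the lines into a list is only for the sake of the example; parsing each line inside the loop avoids holding a second copy of the data in memory.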

    Does anyone have any experience with the StringReader? Is there a bug in it or something? Because it's really slow! I would have expected it to be about as fast as method 2.

    Why do I think it is a bug? Because if I loop through the characters of the file (using the TextAsset's text property) with a StringBuilder, basically simulating what the StringReader should do (or so I think, anyway), I get about the same performance as method 2.

    You can test this easily by creating a big TextAsset file and then trying to read it line by line using a StringReader.


    Code (csharp):
    using System.IO;

    // aVeryLongString is the text property of the TextAsset.
    StringReader reader = new StringReader(aVeryLongString);
    string line;

    while ((line = reader.ReadLine()) != null) {
        // Process each line here.
    }
     
  2. EddieCam

    EddieCam

    Joined:
    Oct 28, 2009
    Posts:
    26
    Old thread, but for anyone else googling, there is a bug with Mono's current StringReader.ReadLine method, as detailed here

    Basically, if a string uses Unix line endings, each call to ReadLine searches the entire remaining string for Windows line endings as well, so reading the whole string line by line is O(n^2)!

    As johot says, you may want to write your own ReadLine-style method using StringBuilder for any text file with more than 1000 lines.
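
    A minimal sketch of such a hand-rolled reader, assuming it is fine to collect all lines into a list. The names ManualLineReader and ReadLines are hypothetical and this is not johot's actual code; it walks the string once, so the work grows linearly with the length of the text.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, for illustration only.
public static class ManualLineReader
{
    // Splits text into lines in a single pass, treating "\n", "\r\n"
    // and a lone "\r" as line terminators.
    public static List<string> ReadLines(string text)
    {
        var lines = new List<string>();
        var current = new StringBuilder();

        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c == '\r' || c == '\n')
            {
                // Consume the '\n' of a "\r\n" pair as part of the same terminator.
                if (c == '\r' && i + 1 < text.Length && text[i + 1] == '\n')
                {
                    i++;
                }
                lines.Add(current.ToString());
                current.Length = 0; // reset the builder for the next line
            }
            else
            {
                current.Append(c);
            }
        }

        // Emit the final line if the text does not end with a terminator.
        if (current.Length > 0)
        {
            lines.Add(current.ToString());
        }
        return lines;
    }
}
```

    Setting current.Length = 0 rather than calling Clear() is a deliberate choice here, since StringBuilder.Clear() only exists from .NET 4 onwards and may be missing on older Mono profiles.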
     
  3. made-on-jupiter

    made-on-jupiter

    Joined:
    May 19, 2013
    Posts:
    25
    This still seems to be an issue with Mono 4.0.1.

    If all line endings in the string are of the same type, this quickly replaces Unix line endings with Windows ones (and leaves Windows ones alone):


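    using System.Text.RegularExpressions;

    // The first line ending found has length 1 only if it is a bare "\n" (Unix).
    if (Regex.Match(text, "\r?\n").Length == 1)
        text = text.Replace("\n", "\r\n");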