
Looking for a free/open-source C# library for reading/writing CSV files

Discussion in 'General Discussion' started by elmar1028, Jun 12, 2016.

  1. elmar1028

    elmar1028

    Joined:
    Nov 21, 2013
    Posts:
    2,214
    Hi guys,

    I am looking for a free and open source C# library that can read/write data in spreadsheet format.

    License-wise, it should be free and open source, and it should allow me to use it in my plugin, which I am going to sell on the Asset Store.

    Any suggestions?

    Thanks!
     
  2. QFSW

    QFSW

    Joined:
    Mar 24, 2015
    Posts:
    2,530
    CSV files are a human-readable format; they are literally just a new line for each record and commas to separate attributes.
    Are you sure it wouldn't be easier to code your own solution for your usage? What do you want it for?
     
    ZJP and elmar1028 like this.
  3. darkhog

    darkhog

    Joined:
    Dec 4, 2012
    Posts:
    2,219
    CSV? I could knock something like that together in a few hours, and you probably could as well. All CSV really is is a comma-separated list of values, where each row of data is separated by the ENTER symbol (cr+lf on Windows, lf on mac and cr on linux).

    So all you really need is to first read all the lines in the file into an array of strings, then use string.Split with the delimiter set to "," on each of those to get the values for the individual cells.
     
    elmar1028 likes this.
  4. elmar1028

    elmar1028

    Joined:
    Nov 21, 2013
    Posts:
    2,214
    Oh wow! Didn't know it was possible. Just did a small experiment with notepad. Works fine!

    Now that I know it's just data split by commas, yes - it's something I can make myself :)
     
  5. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    6,638
    It wouldn't be easier to code your own library instead of using something someone already wrote.

    A few hours on CSV is too much time wasted, IMO.
    ------


    Either way.

    Quick 1 second search located this:
    https://joshclose.github.io/CsvHelper/

    Keep in mind that a native CSV library will generate garbage, and that will bite you on mobile platforms.
    Everything in C# generates garbage, garbage is bad, and there's no delete keyword in C#. Because reasons. :-\
     
  6. elmar1028

    elmar1028

    Joined:
    Nov 21, 2013
    Posts:
    2,214
    Does C# generate garbage even if I just read data? Not write it?
     
  7. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    6,638
    Yes, it is a list of comma-separated values, human-readable. However, you'll probably need to handle stray extra whitespace and possibly quotes. The naive "string.Split" approach will faceplant on quotes, definitely.
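
    To make the quote problem concrete, here is a minimal sketch of a quote-aware splitter for a single line. It assumes the common convention where fields may be wrapped in double quotes and a doubled quote inside a quoted field stands for a literal quote. The names CsvLine and Split are made up for illustration, it does not trim whitespace, and it does not handle fields that span several lines.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of any library mentioned in this thread.
public static class CsvLine
{
    // Splits one line into fields, treating delimiters inside double quotes as data.
    public static List<string> Split(string line, char delimiter = ',')
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"')
                {
                    // A doubled quote inside a quoted field is a literal quote.
                    if (i + 1 < line.Length && line[i + 1] == '"')
                    {
                        current.Append('"');
                        i++;
                    }
                    else
                    {
                        inQuotes = false;
                    }
                }
                else
                {
                    current.Append(c);
                }
            }
            else if (c == '"')
            {
                inQuotes = true;
            }
            else if (c == delimiter)
            {
                // End of field: store it and start collecting the next one.
                fields.Add(current.ToString());
                current.Length = 0;
            }
            else
            {
                current.Append(c);
            }
        }

        // The last field has no trailing delimiter.
        fields.Add(current.ToString());
        return fields;
    }
}
```

    With this, a line such as a,"b,c" comes back as two fields instead of the three that a plain string.Split(',') would produce.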
     
  8. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    6,638
    Lemme recall it.

    Calls to string.Format generate garbage.
    Starting coroutines generates garbage.
    Calling delegates in some cases may generate garbage.
    Creating any reference type with new generates garbage.
    Making a string generates garbage, because strings are immutable.
    Any call that returns an array to you generates garbage.

    So, you pretty much will always generate garbage.

    Just run the game through the profiler and deal with this when it becomes a problem.
     
    elmar1028 likes this.
  9. knr_

    knr_

    Joined:
    Nov 17, 2012
    Posts:
    257
    It's actually fairly easy to code yourself. We've done it, but there is one exception -

    The format of a CSV file isn't the same in every country.

    For instance, in the United States, the C in CSV actually does mean using a comma; in France they use a semi-colon instead of a comma (for those of us who live and work internationally it irritates the living daylights out of you :p)
     
  10. elmar1028

    elmar1028

    Joined:
    Nov 21, 2013
    Posts:
    2,214
    I guess you can't avoid Garbage Day whether you use a third-party C# Library or code your own solution :p

    So I might as well code my own solution. For my own experience :)

    You kidding me? :eek:
     
  11. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    6,638
    Well: https://en.wikipedia.org/wiki/Comma-separated_values
    So, you can expect to have a tab-separated .csv file.
     
  12. knr_

    knr_

    Joined:
    Nov 17, 2012
    Posts:
    257
    Heh, nope. So it creates issues when you are on a computer with Excel as US / English and try to open a CSV in the Canada / French or French / French format, or vice-versa. It's a headache.

    From a coding perspective, expose a delimiter so it's easily changeable, and use that delimiter to separate the values being written out or read in. That way people can easily change the delimiter to suit their country's format for the CSV file.
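
    A rough sketch of what an exposed delimiter could look like on the writing side. CsvRowWriter and FormatRow are hypothetical names, and the quoting rule (wrap the field in double quotes when it contains the delimiter, a quote or a line break) is an assumption about what the consuming program expects.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: the delimiter is a parameter instead of a hard-coded comma.
public static class CsvRowWriter
{
    public static string FormatRow(IEnumerable<string> fields, char delimiter = ',')
    {
        var sb = new StringBuilder();
        bool first = true;
        foreach (string field in fields)
        {
            if (!first)
            {
                sb.Append(delimiter);
            }
            first = false;
            sb.Append(Escape(field ?? string.Empty, delimiter));
        }
        return sb.ToString();
    }

    // Quotes a field only when it would otherwise be ambiguous.
    private static string Escape(string field, char delimiter)
    {
        bool needsQuotes = field.IndexOf(delimiter) >= 0
            || field.IndexOf('"') >= 0
            || field.IndexOf('\n') >= 0
            || field.IndexOf('\r') >= 0;

        if (!needsQuotes)
        {
            return field;
        }

        // Double any embedded quotes, then wrap the whole field in quotes.
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }
}
```

    Calling FormatRow(new[] { "1,5", "x" }, ';') leaves the first field alone, while the same call with ',' has to quote it.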
     
  13. goat

    goat

    Joined:
    Aug 24, 2009
    Posts:
    5,177
    It doesn't matter what they use elsewhere. You don't need to use the CSV format either if you don't want to, if a very long record list separated by a comma, a colon, or whatever you choose is what you need.
     
  14. knr_

    knr_

    Joined:
    Nov 17, 2012
    Posts:
    257
    This is also an issue with JSON. No one parser will treat the period in exactly the same way as all the others. So it creates issues with the format.

    We love JSON, but in creating serialized JSON data or middleware that utilizes the JSON format we have to explicitly explain how the JSON parser treats periods.
     
  15. elmar1028

    elmar1028

    Joined:
    Nov 21, 2013
    Posts:
    2,214
    I guess I should use something more flexible like XLS or XLSX format, which seems to be an XML format document.
     
  16. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    6,638
    Now, that's news to me. Doesn't JSON explicitly define the format for every value?
    http://json.org/

    If middleware can't properly handle periods, it means they're cutting corners and using locale-dependent functions for number parsing.
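
    For illustration, a small sketch of locale-independent number handling in C#. InvariantNumbers and its two methods are made-up names; the .NET calls themselves (double.TryParse with NumberStyles and CultureInfo.InvariantCulture, and ToString with a culture) are standard.

```csharp
using System.Globalization;

// Hypothetical helper showing number parsing that ignores the machine's locale.
public static class InvariantNumbers
{
    // Always treats '.' as the decimal separator, whatever the current culture is.
    public static bool TryParse(string text, out double value)
    {
        return double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out value);
    }

    // Always writes '.' as the decimal separator.
    public static string Format(double value)
    {
        return value.ToString("R", CultureInfo.InvariantCulture);
    }
}
```

    The overloads without a culture argument fall back to the current culture, which is where a parser can start expecting a comma instead of a period.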

     
  17. knr_

    knr_

    Joined:
    Nov 17, 2012
    Posts:
    257
    Give different JSON parsers a try :)

    They all may have moved toward treating periods in the same way since we last gave them a run-through (which was admittedly a while back), but it has been an issue in the past.
     
  18. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    6,638
    How about I ask you instead, since you already dealt with that?
     
  19. knr_

    knr_

    Joined:
    Nov 17, 2012
    Posts:
    257
    I already answered that question.
     
  20. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    6,638
    Sigh.

    Rephrasing.
    Which JSON parsers cannot treat the dot symbol properly, despite the JSON spec explicitly specifying the usage of the dot symbol?
     
  21. Kiwasi

    Kiwasi

    Joined:
    Dec 5, 2013
    Posts:
    16,608
    System.IO works quite well.
     
  22. Ryiah

    Ryiah

    Joined:
    Oct 11, 2012
    Posts:
    15,017
    Which specification? There are at least four - RFC 4627, RFC 7158, RFC 7159, and ECMA 404.
     
  23. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    6,638
    Here you go.
    RFC 7159:
    ECMA 404:
    RFC 7158:
    RFC 4627:
    Now what?
    ---

    Why don't you list discrepancies between all four documents that would involve decimal point sign and could result in inconsistent parser behavior?
     
  24. Ryiah

    Ryiah

    Joined:
    Oct 11, 2012
    Posts:
    15,017
    Pfft. All I did was open Wikipedia, check how many specs existed, and linked them hoping I wasn't too far off. :p
     
  25. Dustin-Horne

    Dustin-Horne

    Joined:
    Apr 4, 2013
    Posts:
    4,562
    There are a lot of complications with data, such as quotes within values, delimiters within values, the mix of quoted and unquoted fields, and fields that have multiline data, which means you can't simply split or read per row. A good CSV library will properly parse those scenarios. There is an excellent one we use at the office and I can look it up tomorrow. It has a NuGet package and I believe it is also OSS on GitHub.

    Yep, the JSON schema is defined, and if it's not handled correctly then you're not using a validating (or correct) JSON parser.
    Not that easy. Xls is proprietary. Xlsx is openxml but there's an extra compression step. It would be worse in terms of heap allocations.

    So use a parser that follows the spec.
    Excel lets you choose the delimiter when importing regardless of language. Excel is crappy at CSV for other reasons, such as trimming leading zeroes off of text fields (like US postal codes) that it thinks are numeric.
     
    Trexug, Ryiah and neginfinity like this.
  26. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    6,638
    ^_^ Murphy's law. ^_^
     
    Ryiah and Dustin-Horne like this.
  27. Eric5h5

    Eric5h5

    Volunteer Moderator Moderator

    Joined:
    Jul 19, 2006
    Posts:
    32,203
    Well, being scared of GC under any circumstances whatsoever seems counter-productive. Everything I've done on mobile generates garbage in certain cases, and it's completely irrelevant. You'd generally only have issues if you're generating lots of garbage every frame, which is certainly not the case if occasionally reading a CSV file.

    All Unix-like operating systems use LF, which includes OS X and Linux. In any case you'd use System.Environment.NewLine if you were going to deal with it directly.

    --Eric
     
    Kiwasi, Ryiah and Dustin-Horne like this.
  28. Kiwasi

    Kiwasi

    Joined:
    Dec 5, 2013
    Posts:
    16,608
    This.

    Garbage is an issue if it bites you every frame. But CSVs are for loading and saving data, not updating on a frame-by-frame basis. You can typically get away with a collection during a loading sequence.
     
  29. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    4,788
    I tried parsing conceptnet5 CSV data, which was TAB and comma separated. Tried in Unity C#, noped, did it in blitzbasic to my relief. Thank god I didn't need those parts with non-separating commas.

    Any simple data format is only as simple as the exception it was created for. After that, exceptions kill its simplicity and lead to a growing number of competing standards aimed at making parsing simple again, except now you also have to work out which standard the data is stored in ...
     
  30. Dustin-Horne

    Dustin-Horne

    Joined:
    Apr 4, 2013
    Posts:
    4,562
    There's a difference between a standard and a willy-nilly implementation. If data is being separated by both commas and tabs in the same file, then that's not following a standard, it's just plain wrong and sloppy or lazy programming.
     
    Kiwasi likes this.
  31. Kiwasi

    Kiwasi

    Joined:
    Dec 5, 2013
    Posts:
    16,608
    True. But often with CSV data you only get the data, you are not in control of the data source.

    I often get SCADA data that is a mess. Mostly comma separated, but just as often tab or space separated. Random columns missing in rows, which throws off most csv parsers.

    In an ideal world all of the programmers outputting data would ensure their software spit out decent, standardised formats. But in reality I spend a lot of time sanitising data by hand.
     
  32. MD_Reptile

    MD_Reptile

    Joined:
    Jan 19, 2012
    Posts:
    2,602
    Excel has a pile of features that can reformat messy csv files, and it is fairly easy working on the 'string with comma delimiter' file from code in unity. I was messing with writing a csv grapher that predicts the future of data based on averages, and creates a little 2d graph to show, for instance, admob revenue potential future estimates. Anyway I didn't finish it... yet :p

    Edit:
    I think this ended up working well to load up the csv files:
    http://wiki.unity3d.com/index.php?title=ImprovedFileBrowser
     
    Last edited: Jun 13, 2016
    Kiwasi likes this.
  33. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    4,788
    It's because they have a nested representation: big chunks of data with TAB separation, then sub-data with commas. Given that they had 5 GB of data, no software could open it, though some knew how to parse nested separators (and asked to).
     
  34. Dustin-Horne

    Dustin-Horne

    Joined:
    Apr 4, 2013
    Posts:
    4,562
    Sure... this is true with any data, not just CSV. I've gotten a lot of support requests for my asset to handle some really wonky situations where APIs (including some from Google) were not returning valid JSON.

    I wouldn't say "no software"... you just need a parser that uses a stream and parses on access per row / column instead of trying to parse the entire file into a monolithic data structure.
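
    A minimal sketch of that streaming idea, assuming plain rows with no quoted or multiline fields (real data would need a quote-aware splitter in place of string.Split). CsvRowStream and ReadRows are illustrative names only.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: yields one row at a time instead of loading the whole file.
public static class CsvRowStream
{
    public static IEnumerable<string[]> ReadRows(TextReader reader, char delimiter = ',')
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // Assumption: no quoted fields, so a plain split is enough here.
            yield return line.Split(delimiter);
        }
    }
}
```

    Wrapping a StreamReader for the file in a using block and looping over ReadRows(reader, '\t') with foreach keeps only the current row in memory, so the file size stops being the limiting factor.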
     
  35. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    6,638
    Software can handle files that are bigger than available memory, if the data can be streamed or if the programmer knows about memory-mapped files.
     
    Ryiah likes this.
  36. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    4,788
    That's why I used my custom prog in blitzbasic, because Excel and company didn't get the memo at all, not even Notepad and Notepad++ ;)
     
  37. Forgon

    Forgon

    Joined:
    Mar 19, 2013
    Posts:
    8
    MD_Reptile likes this.