Free lightweight XML Reader (needs road testing)

Discussion in 'Assets and Asset Store' started by Cameron_SM, Feb 10, 2011.

  1. Cameron_SM

    Cameron_SM

    Joined:
    Jun 1, 2009
    Posts:
    915
    I wrote this today after looking for something to read XML data without pulling in C#'s 1 MB XML library just to load some preferences. I was also a bit shocked that no one seems to have written a lightweight C# parser. There is one on the wiki, but it has no attribute support, which IMO is needed if you want to store more complex types of data.

    So, I dragged out my Perl archives (floppies FTW!) and ported my old XML token parser to C#.

    The result is a tiny 20K script (10K without comments, so it ought to be a speck when compiled) that reads an XML-formatted string into a simple object list hierarchy.

    Of course, this needs to be battle tested, but so far so good. It parsed the entire script of The Tragedy of Richard the Third (link), which is an 8,600+ line (270K) XML document, in around 0.1 seconds within Unity, so speed shouldn't be an issue. :D

    Download the Script here: XMLParser.cs (right click and save as)

    Usage Example:
    Loads XML text into the object hierarchy, then loops through the objects and re-writes the XML as a string to the console.
    Code (csharp):
    using UnityEngine;
    using System.Collections;

    public class XMLTest : MonoBehaviour {

        // Use this for initialization
        void Start () {

            // Load a text asset XML file from the Assets/Resources folder
            TextAsset xmlAsset = (TextAsset)Resources.Load("test", typeof(TextAsset));

            // Resources.Load returns null if the asset is missing, so bail out early
            if (xmlAsset == null) {
                Debug.LogError("Could not find a 'test' text asset in a Resources folder");
                return;
            }

            // get the XML-formatted string from the text asset
            string xmlString = xmlAsset.text;

            // create an XMLParser instance
            XMLParser xmlParser = new XMLParser(xmlString);

            // call the parser to build the IXMLNode objects
            XMLElement xmlElement = xmlParser.Parse();

            // test string to re-build the XML from the IXMLNode objects
            string xmlOutputString = "";

            // recursively re-construct the XML string
            WriteXMl(xmlElement, ref xmlOutputString, 0);

            // log the re-constructed XML string to the console
            Debug.Log(xmlOutputString);
        }

        // rebuilds the XML string in output, ugly little method but it works
        public void WriteXMl(IXMLNode element, ref string output, int depth) {

            // one tab per level of depth for nicer formatting
            string tabs = new string('\t', depth);

            // if this is a text node, add its content to the output and return early
            if (element.type == XMLNodeType.Text) {
                output += tabs + element.value + "\n";
                return;
            }

            // add the opening tag to the output
            output += tabs + "<" + element.value;

            // add attributes to the opening tag
            for (int i = 0; i < element.Attributes.Count; i++) {
                output += " " + element.Attributes[i].name + " = \"" + element.Attributes[i].value + "\" ";
            }

            // close the opening tag
            output += ">\n";

            // recurse through all child nodes
            for (int i = 0; i < element.Children.Count; i++) {
                WriteXMl(element.Children[i], ref output, depth + 1);
            }

            // add the closing tag to the output string
            output += tabs + "</" + element.value + ">\n";
        }
    }
    Bit of a pointless example but it demonstrates both reading XML strings and how to use the XML class objects to get at the data.

    The XML Object hierarchy is composed of 2 classes that share an interface and 1 simple Struct:

    IXMLNode is the main XML hierarchy interface. XMLText and XMLElement implement it and nothing more.

    Accessible properties:
    string value - either the tag name or the text content
    enum type - an enum to tell you if it's a Text node or an Element node that could have child nodes.
    IXMLNode Parent - the parent node in the hierarchy (read only)
    List<IXMLNode> Children - a list of child nodes. Text nodes will always return an empty list here.
    List<XMLAttribute> Attributes - a list of attributes. Text nodes will always return an empty list here.

    XMLAttribute is a simple struct with two public fields:
    string name - name of the attribute.
    string value - value of the attribute.
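
    To make the shape of that hierarchy a bit more concrete, here is a rough sketch. The types below are simplified stand-ins written from the description above, not the actual declarations in XMLParser.cs. Member names follow the usage example; SketchNode and the NodeSketch helpers are hypothetical and only there for illustration.

```csharp
using System;
using System.Collections.Generic;

// Stand-in types modelled on the description above.
// The real declarations live in XMLParser.cs and may differ.
public enum XMLNodeType { Text, Element }

public struct XMLAttribute
{
    public string name;
    public string value;
}

public interface IXMLNode
{
    string value { get; }
    XMLNodeType type { get; }
    IXMLNode Parent { get; }
    List<IXMLNode> Children { get; }
    List<XMLAttribute> Attributes { get; }
}

// Hypothetical concrete node, only so the sketch can be exercised on its own.
public class SketchNode : IXMLNode
{
    public string value { get; set; }
    public XMLNodeType type { get; set; }
    public IXMLNode Parent { get; set; }
    public List<IXMLNode> Children { get; private set; }
    public List<XMLAttribute> Attributes { get; private set; }

    public SketchNode(XMLNodeType type, string value)
    {
        this.type = type;
        this.value = value;
        Children = new List<IXMLNode>();
        Attributes = new List<XMLAttribute>();
    }
}

// Hypothetical helpers for pulling data out of a parsed tree.
public static class NodeSketch
{
    // Returns the attribute value, or null if the node has no such attribute.
    public static string GetAttribute(IXMLNode node, string name)
    {
        foreach (XMLAttribute attribute in node.Attributes)
        {
            if (attribute.name == name) return attribute.value;
        }
        return null;
    }

    // Returns the content of the first text child, or null if there is none.
    public static string GetText(IXMLNode node)
    {
        foreach (IXMLNode child in node.Children)
        {
            if (child.type == XMLNodeType.Text) return child.value;
        }
        return null;
    }
}
```

    If you drop something like this into a project that already contains XMLParser.cs, delete the stand-in types and keep only the two helper methods, otherwise the type names will clash.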

    I also implemented the XMLParser class to be used as an object instead of a static class so that it's easier to extend and modify. There are heaps of comments all the way through it, so it shouldn't be too difficult to follow and modify if you need extra features.

    XMLParser will break (and cry) if you feed it malformed XML documents!

    This shouldn't be an issue for games though. It also won't strip extraneous white-space, which again shouldn't be an issue for games, as you have tighter control over your resources than in web/feed-based XML tasks.

    Reserved characters (< > ' ") must be written as entity references if you want to use them in the content (non-markup) or attribute values of your documents. XMLParser has some static functions for converting these back and forth. The parser automatically translates named entity references for the characters listed above (for example, it converts &quot; to "). It does not support numeric entity references (such as &#34;). Any other entities will be stripped and replaced with a null char unless you modify the parser to handle them.
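
    For anyone unsure what that escaping looks like in practice, here is a minimal sketch. EntitySketch is a hypothetical helper written for this post, not the static functions that ship in XMLParser.cs (their names and signatures may differ). It also handles the ampersand, which is an assumption on top of the list above, added so that escaping and unescaping round-trip cleanly.

```csharp
using System;

// Hypothetical helper, not part of XMLParser.cs.
public static class EntitySketch
{
    // Replaces reserved characters with named entity references.
    // The ampersand goes first so the entities produced by the later
    // replacements are not escaped a second time.
    public static string Escape(string raw)
    {
        return raw.Replace("&", "&amp;")
                  .Replace("<", "&lt;")
                  .Replace(">", "&gt;")
                  .Replace("'", "&apos;")
                  .Replace("\"", "&quot;");
    }

    // Reverses Escape. The ampersand goes last so that text such as
    // "&amp;lt;" comes back as "&lt;" rather than "<".
    public static string Unescape(string escaped)
    {
        return escaped.Replace("&lt;", "<")
                      .Replace("&gt;", ">")
                      .Replace("&apos;", "'")
                      .Replace("&quot;", "\"")
                      .Replace("&amp;", "&");
    }
}
```

    You would run something like Escape over any text or attribute value before writing it into the XML file by hand or from a tool, so the parser never sees a raw < or " inside content.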

    Please report any bugs here and I'll try my best to fix them up as soon as possible. :D
     
    Last edited: Feb 10, 2011
  2. Corscaria

    Corscaria

    Joined:
    Jul 25, 2009
    Posts:
    11
    Bah... why didn't I check the forums before writing my localization routines... I used the attributeless TinyXMLReader from the wiki... <starts rewriting code>
     
  3. Lypheus

    Lypheus

    Joined:
    Apr 16, 2010
    Posts:
    664
    Very nice, I was just looking for something to parse my mocked database data and properties - will check it out!
     
  4. spotlightor

    spotlightor

    Joined:
    Jun 25, 2010
    Posts:
    15
    Thanks for your great work! It really helps a lot and I think it's much easier to use than TinyXML on the wiki.
     
  5. defjr

    defjr

    Joined:
    Apr 27, 2009
    Posts:
    436
    Looks useful! Thanks for sharing.
     
  6. zine92

    zine92

    Joined:
    Nov 13, 2010
    Posts:
    1,347
    Sorry to ask, but how do I go about using it? I have added the scripts to my Unity project folder, but how do I start parsing XML files? Really sorry, I am new to XML parsing and C#.
     
  7. psyclone

    psyclone

    Joined:
    Nov 17, 2009
    Posts:
    245
    Thanks for this, I was intending to expand the Tiny Reader from the wiki... Now I don't need to...
     
  8. MaDDoX

    MaDDoX

    Joined:
    Nov 10, 2009
    Posts:
    764
    I've found a bug when parsing an element with attributes and both opening and closing signs, like this:

    <rect x="1" y="1" width="1" height="1"/>

    it doesn't "read" the closing sign, it seems. I've sent a PM to Cameron detailing the problem, hope he gets it soon. Great and very useful tool btw :)
     
  9. spotlightor

    spotlightor

    Joined:
    Jun 25, 2010
    Posts:
    15
    I encountered this bug too and have made a fix for it. Now the XML reader can parse elements in these formats:
    Code (csharp):
    <node/>
    <node height="2" width="2"/>
     

    Attached Files:

  10. sleglik

    sleglik

    Joined:
    Jun 29, 2011
    Posts:
    682
    Only one question: why not xml serialization?
     
  11. francksitbon

    francksitbon

    Joined:
    Jan 22, 2010
    Posts:
    268
    Hi, I'm trying this but I get null.

    It prints null instead of "sun".

    Where am I going wrong?
     
  12. Cameron_SM

    Cameron_SM

    Joined:
    Jun 1, 2009
    Posts:
    915

    GetValue() is not a method of XMLElement, perhaps it's some extension method you have defined?

    What you want is node.Children[0].value

    I have also updated this to include spotlightor's fix. It has since been used in a published game, so consider it fairly well tested.
     
  13. tredpro

    tredpro

    Joined:
    Nov 18, 2013
    Posts:
    515
    Can this import a feed as

    image
    title
    description
    image
    title
    description
    etc.