Text Reading And Formatting

Discussion in 'Scripting' started by john-essy, Jan 15, 2019.

  1. john-essy

    john-essy

    Joined:
    Apr 17, 2011
    Posts:
    464
    I am building an app for a client, an animal park. They have over 365 animals, birds, routes, etc., and each one is in its own DOCX file.

    To start with, I created a JSON object for each entry so that I can just fill in the info and save it inside Unity, which was straightforward. I am hitting two snags right now.

    Code (CSharp):
    {
      "FullName": "Kudu",
      "LatinName": "Acinonyx jubatus jubatus",
      "MaleWeight": "40-60kg",
      "FemaleWeight": "35-50kg",
      "Height": "86cm",
      "Age": "14yrs",
      "KrugerPop": "<150",
      "Description": "",
      "Identification": "",
      "Diet": "",
      "Habitat": "",
      "Behaviours": "",
      "WhereInKruger": "",
      "DidYouKnow": "",
      "KidsCorner": ""
    }
    1. How do I read the text from a specific part of the document, say from "Diet" (the title of one section) down to the start of the next section, which might be something like "Habitat"? I thought about adding my own marker, such as /*/, at the very end of each section so that I can search for it, stop when I hit it, and feed all the lines I just read into the Diet field of my JSON object. Another issue is how to handle new lines.

    2. How do I keep the text's formatting from the source file without adding <b> etc. manually? There are over 225,000 words in total across all the docs, so doing it by hand is just not an option.

    Thank you for any suggestions, this has my brain fried!
     
  2. fire7side

    fire7side

    Joined:
    Oct 15, 2012
    Posts:
    1,819
  3. Kurt-Dekker

    Kurt-Dekker

    Joined:
    Mar 16, 2013
    Posts:
    38,745
    Part 1: You shouldn't be hand-parsing JSON. You're welcome to, but if you do you're on your own. That's not the point. Instead, paste the above JSON into a website such as json2csharp.com to get a class out of it, then use something like JSON.NET (available free on the Unity Asset Store) to deserialize your JSON into that object.

    Here's a screencap from when I ran it just now:

    [Attached image: Screen Shot 2019-01-15 at 3.29.37 PM.png]

    Then you can manipulate it as first-class C# data structures. You can add to that C# code and extend it any way you want.
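
    As a rough sketch of that round trip, assuming the Newtonsoft.Json (JSON.NET) package is available in the project: AnimalEntry and AnimalEntryJson are hypothetical names, the fields simply mirror the template in the first post, and treating Behaviours and WhereInKruger as plain strings is an assumption, since the template leaves their values empty.

```csharp
using System;
using Newtonsoft.Json;

// Hypothetical class mirroring the JSON template from the first post.
// Every field is assumed to be a string, including Behaviours and WhereInKruger.
[Serializable]
public class AnimalEntry
{
    public string FullName;
    public string LatinName;
    public string MaleWeight;
    public string FemaleWeight;
    public string Height;
    public string Age;
    public string KrugerPop;
    public string Description;
    public string Identification;
    public string Diet;
    public string Habitat;
    public string Behaviours;
    public string WhereInKruger;
    public string DidYouKnow;
    public string KidsCorner;
}

public static class AnimalEntryJson
{
    // Unpack a JSON string into a typed object; keys missing from the JSON stay null.
    public static AnimalEntry FromJson(string json)
    {
        return JsonConvert.DeserializeObject<AnimalEntry>(json);
    }

    // Write the object back out as indented JSON.
    public static string ToJson(AnimalEntry entry)
    {
        return JsonConvert.SerializeObject(entry, Formatting.Indented);
    }
}
```

    Usage would be along the lines of `var kudu = AnimalEntryJson.FromJson(jsonText); kudu.Diet = "...";` followed by `AnimalEntryJson.ToJson(kudu)` to save it again.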

    Part 2: I don't know what format DOCX files store their bold, etc. formatting in, but that has to be documented on the web somewhere. If it isn't already in an HTML-like markup that Unity understands, you will need a tool to transmogrify it into that format.
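
    For what it's worth, a .docx file is a zip archive whose main text lives in word/document.xml, where each run of text (w:r) can carry bold (w:b) or italic (w:i) flags in its run properties (w:rPr). A minimal sketch of such a conversion tool, under those assumptions, might look like the following. DocxRichText is a hypothetical helper, not an existing library; it only handles bold and italic, ignores formatting inherited from styles, and skips runs nested inside hyperlinks. Depending on the runtime, ZipFile may need a reference to System.IO.Compression.FileSystem.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;
using System.Xml.Linq;

public static class DocxRichText
{
    // WordprocessingML namespace used by the elements in word/document.xml.
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Pulls the raw XML of the main document part out of the .docx archive.
    public static string ReadDocumentXml(string docxPath)
    {
        using (ZipArchive zip = ZipFile.OpenRead(docxPath))
        using (var reader = new StreamReader(zip.GetEntry("word/document.xml").Open()))
        {
            return reader.ReadToEnd();
        }
    }

    // Returns one string per paragraph, wrapping bold/italic runs in <b>/<i> tags.
    public static List<string> ToRichTextParagraphs(string documentXml)
    {
        var result = new List<string>();
        foreach (XElement p in XDocument.Parse(documentXml).Descendants(W + "p"))
        {
            var sb = new StringBuilder();
            foreach (XElement run in p.Elements(W + "r"))
            {
                string text = string.Concat(run.Elements(W + "t").Select(t => t.Value));
                if (text.Length == 0) continue;

                XElement props = run.Element(W + "rPr");
                if (IsOn(props, "i")) text = "<i>" + text + "</i>";
                if (IsOn(props, "b")) text = "<b>" + text + "</b>";
                sb.Append(text);
            }
            result.Add(sb.ToString());
        }
        return result;
    }

    // A flag counts as set when the element exists and is not explicitly switched off.
    static bool IsOn(XElement props, string name)
    {
        XElement flag = props?.Element(W + name);
        if (flag == null) return false;
        string val = (string)flag.Attribute(W + "val");
        return val != "false" && val != "0";
    }
}
```

    Calling `DocxRichText.ToRichTextParagraphs(DocxRichText.ReadDocumentXml(path))` would then give paragraph strings that a rich-text-enabled Unity text component can display as-is.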
     
  4. john-essy

    john-essy

    Joined:
    Apr 17, 2011
    Posts:
    464
    Hey, thanks for the reply, @kurt. That is exactly what I did, sorry. I already have the class and created a template, and only showed you the JSON. Also thanks for point #2!

    The main problem here is reading the right paragraphs and putting them into the right section of the JSON. Think of it like this: I have a doc with all these titles and info about each, and I want to split it up and put each part into the correct field in my JSON. If that makes any sense!
     
  5. Kurt-Dekker

    Kurt-Dekker

    Joined:
    Mar 16, 2013
    Posts:
    38,745
    Things that will help in this regard are regular expression parsing, general string parsing, etc. How well the document lends itself to automated processing depends on how regularly it is formatted. If the document was made and edited by human hands, there are likely a lot of irregularities in formatting, and you will need to either identify and handle them properly or just chop it up by hand. It's going to depend entirely on how clean and well-formatted your doc is.
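
    As one possible starting point, here is a sketch of the plain string-parsing route (no regular expressions). It assumes each section title sits alone on its own line, optionally followed by a colon, and that the titles match the field names from the JSON template once spaces are removed (so "Did You Know" maps to DidYouKnow). SectionSplitter is a hypothetical name, and a body line consisting only of a title word would be mistaken for a heading.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class SectionSplitter
{
    // Section titles taken from the JSON template; assumed to appear alone on a line.
    static readonly string[] Headings =
    {
        "Description", "Identification", "Diet", "Habitat",
        "Behaviours", "WhereInKruger", "DidYouKnow", "KidsCorner"
    };

    // Maps each heading found in the text to the lines that follow it,
    // up to the next heading. Text before the first heading is ignored.
    public static Dictionary<string, string> Split(string text)
    {
        var sections = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        string current = null;
        var body = new StringBuilder();

        foreach (string rawLine in text.Replace("\r\n", "\n").Split('\n'))
        {
            string heading = MatchHeading(rawLine.Trim());
            if (heading != null)
            {
                // A new section starts: store what was collected for the previous one.
                if (current != null) sections[current] = body.ToString().Trim();
                current = heading;
                body.Clear();
            }
            else if (current != null)
            {
                // Keep line breaks so paragraphs survive into the JSON field.
                body.Append(rawLine.TrimEnd()).Append('\n');
            }
        }
        if (current != null) sections[current] = body.ToString().Trim();
        return sections;
    }

    // Returns the canonical heading name if the line is a heading, otherwise null.
    static string MatchHeading(string line)
    {
        string key = line.TrimEnd(':').Replace(" ", "");
        foreach (string h in Headings)
        {
            if (string.Equals(key, h, StringComparison.OrdinalIgnoreCase)) return h;
        }
        return null;
    }
}
```

    Each value in the returned dictionary can then be copied into the matching field of the entry object before saving it as JSON. Irregular headings (typos, extra punctuation) would still need handling by hand or with looser matching.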