String/Unicode encoding problems!

128bit · Sep 21, 2015

Hi,
I'm trying to make the Unicode escape "\u2665" show up as "♥" in a string.
I can't just use String.Replace, since every possible escape should be converted, not just this one.
Using this, it works flawlessly:

Code (CSharp):

// Requires: using System.Text; using UnityEngine;
string unicodeString = "\u2665";

// Create two different encodings (UTF-8 and UTF-16).
Encoding utf8 = Encoding.UTF8;
Encoding unicode = Encoding.Unicode;

// Convert the string into a byte array.
byte[] unicodeBytes = unicode.GetBytes(unicodeString);

// Perform the conversion from one encoding to the other.
byte[] utf8Bytes = Encoding.Convert(unicode, utf8, unicodeBytes);

// Convert the new byte[] into a char[] and then into a string.
char[] utf8Chars = new char[utf8.GetCharCount(utf8Bytes, 0, utf8Bytes.Length)];
utf8.GetChars(utf8Bytes, 0, utf8Bytes.Length, utf8Chars, 0);
string utf8String = new string(utf8Chars);

Debug.Log(utf8String);

However, if I declare "unicodeString" as a public string and set the value in the editor instead of inside the script, it just outputs "\u2665" instead of "♥". Does anyone know how to solve this encoding problem?

JamesLeeNZ · Sep 21, 2015

You might be able to put the @ symbol at the start. Not sure it will fix your problem, but it might.

i.e. string unicodestr = @"\u2665";

eisenpony · Sep 21, 2015

My guess is that the built-in editor for strings automatically escapes anything you type there, so \ is probably serialized as \\ or something similar, so that it always comes back as a literal \. To get around that, you might be able to make a custom editor that watches for special characters, or else you could figure out how to type ♥ into the editor (hint: Alt + 3).

128bit · Sep 22, 2015

JamesLeeNZ said: ↑

...

Thanks, though that doesn't fix it when using a public variable and setting the value in the editor.

eisenpony said: ↑

My guess is that the built-in editor for strings automatically escapes anything you type there, so \ is probably serialized as \\ or something similar, so that it always comes back as a literal \. To get around that, you might be able to make a custom editor that watches for special characters, or else you could figure out how to type ♥ into the editor (hint: Alt + 3).

Thanks. The point is, in the end I need to download a text via DownloadString (containing codes like \u2665) and convert it so that it displays the Unicode characters. So just knowing how to type a heart or other Unicode characters won't be enough. Only strings that are declared like:
string unicodeString ="\u2665";
in a script work. All others, whether set in the editor or just downloaded, don't. That's kind of confusing me.

JamesLeeNZ · Sep 22, 2015

Yeah, it was a long shot.

eisenpony · Sep 22, 2015

128bit said: ↑

Only strings that got declared like:
string unicodeString ="\u2665";
in script are working.

Okay, let's clear this up. The \u escape sequence is defined by the C# specification as a Unicode character escape sequence. So it makes sense that string literals written in code are converted into Unicode characters: the compiler does this while it reads the source, before the program ever runs.
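To see that difference in isolation, here is a minimal sketch. It uses Console.WriteLine so it runs outside Unity (in a Unity script you would use Debug.Log instead), and the class name EscapeDemo is made up for illustration. The escaped backslash in the second literal only stands in for text that arrives at runtime, such as a value typed into the Inspector or a downloaded string. A verbatim literal, @"\u2665", produces the same six characters.

```csharp
using System;

public static class EscapeDemo
{
    public static void Main()
    {
        // The compiler turns this escape into a single char, U+2665.
        string compiled = "\u2665";

        // Stand-in for runtime text: six separate characters, \ u 2 6 6 5.
        // Nothing decodes these automatically once the program is running.
        string runtime = "\\u2665";

        Console.WriteLine(compiled.Length);      // 1
        Console.WriteLine(runtime.Length);       // 6
        Console.WriteLine(compiled == runtime);  // False
    }
}
```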

Once you are in the runtime, e.g. downloading a string, that conversion will not happen automatically. If you want to convert Unicode escape patterns into Unicode characters, you will need a parser that can "unescape" these \u patterns. It looks like there is already one that can do this and much more (maybe too much): the Unescape method of the System.Text.RegularExpressions.Regex class: https://msdn.microsoft.com/en-us/li...ularexpressions.regex.unescape(v=vs.110).aspx
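As a rough sketch of both routes: Regex.Unescape is the real library method mentioned above, while the UnicodeEscapes class and its Decode helper are hypothetical names introduced here for a narrower alternative that only touches \uXXXX sequences and leaves every other backslash alone. The sample string is an assumption standing in for downloaded text.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class UnicodeEscapes
{
    // Matches a backslash, 'u', and exactly four hex digits, e.g. \u2665.
    private static readonly Regex EscapePattern =
        new Regex(@"\\u([0-9A-Fa-f]{4})");

    // Replaces each \uXXXX sequence with the UTF-16 code unit it names.
    // Other backslash sequences (\n, \t, ...) are left untouched.
    public static string Decode(string input)
    {
        return EscapePattern.Replace(input, match =>
            ((char)int.Parse(match.Groups[1].Value, NumberStyles.HexNumber)).ToString());
    }

    public static void Main()
    {
        // Stand-in for text fetched at runtime: contains a literal backslash.
        string downloaded = "I \\u2665 Unity";

        string narrow = Decode(downloaded);        // only \uXXXX is decoded
        string broad = Regex.Unescape(downloaded); // decodes all regex-style escapes

        Console.WriteLine(narrow == "I \u2665 Unity"); // True
        Console.WriteLine(broad == "I \u2665 Unity");  // True
    }
}
```

The narrow version is deliberately naive: it does not know about an escaped backslash in front of the sequence (as in JSON's \\u2665), so data in a format with its own escaping rules is better handled by a parser for that format.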

vncnt_klm · Sep 20, 2017

This worked for me:

// ADD EURO SIGN
FieldData.text += System.Convert.ToChar(0x20AC);
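One caveat, sketched under the assumption that you might also need characters above U+FFFF (many emoji, for example): System.Convert.ToChar(int) only accepts values that fit in a single char and throws an OverflowException beyond that, whereas char.ConvertFromUtf32 returns a string and produces a surrogate pair when needed. The CodePoints class and FromCodePoint helper are made-up names for illustration.

```csharp
using System;

public static class CodePoints
{
    // Returns the string for any Unicode code point, using a surrogate
    // pair when the value is above U+FFFF.
    public static string FromCodePoint(int codePoint)
    {
        return char.ConvertFromUtf32(codePoint);
    }

    public static void Main()
    {
        string euro = FromCodePoint(0x20AC);   // fits in one UTF-16 code unit
        string emoji = FromCodePoint(0x1F600); // needs two code units

        Console.WriteLine(euro.Length);  // 1
        Console.WriteLine(emoji.Length); // 2
    }
}
```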

naviln · Jun 16, 2018

vncnt_klm said: ↑

This worked for me:

// ADD EURO SIGN
FieldData.text += System.Convert.ToChar(0x20AC);

This is perfect, exactly what I needed. Thank you!
