TextMesh Pro Needs Unicode support

Discussion in 'Unity UI (uGUI) & TextMesh Pro' started by JerellFox, Jul 17, 2017.

  1. JerellFox

    JerellFox

    Joined:
    Jul 17, 2017
    Posts:
    2
    Hello!
    I'm currently using TextMesh Pro, and so far so good!
    I'm from Asia, and we were wondering:
    is it possible to add Unicode / Asian CJK font support to TextMesh Pro?

    Anyway, thanks for all your hard work on this! :)
     
  2. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,650
    TextMesh Pro does support full UTF32 / Unicode. When you create your font assets, you have to include the characters that you wish to use. For Asian languages whose character sets are larger, you might also need to create a few fallback font assets.

    See the Font Asset Creation video which covers how to include characters for any languages as well as important options like "Characters from File".

    You should also watch the video about the Font Fallback as well as this one to allow combining symbols in the same object such as FontAwesome.

    Lastly make sure that you are familiar with Material Presets as this is also an important part of using TMP and avoiding resource duplication.

    Here is some additional information that I wrote previously about handling CJK and localization in general.

    ***************************​

    In terms of how to handle the mapping of these characters, here are my suggestions.

    I recommend creating a Primary SDF Font Asset which will contain all the known / used Chinese characters in the project. By known I mean those used in your menu and text components, but not those that might potentially come from user input. This will result in an SDF Font Asset which most likely contains fewer than 1000 characters. (P.S. I would actually love to know what you end up with in terms of character count.) To input the list of characters, I always use "Characters from File", since the text used in your project should already exist in some text file (encoded as Unicode).

    Next, I would create 3 additional Fallback font assets which together would contain those of the 8105 characters in the Table of General Standard Chinese Characters that are not already present in your Primary Font Asset. The first Fallback would contain the first 3500 characters from the table, minus those already in your Primary. The second would contain the next 3000, again minus those in the Primary, and the third would contain the remaining 1605, minus those in the Primary.

    This will give you (1) Primary SDF Font Asset to which you will assign these (3) Fallback SDF Font Assets.
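
    As a rough illustration of the split described above, here is a minimal Python sketch. It is not part of TMP: the function name `split_character_sets`, its parameters, and the tiny stand-in data are assumptions made for the example. It collects the distinct characters used in the project text as the Primary set, then cuts an ordered reference list into tiers and removes anything the Primary already covers.

```python
def split_character_sets(project_text, standard_chars, tier_sizes):
    """Return (primary, fallbacks) as strings of unique characters.

    project_text   -- all known text used in the project (menus, labels, ...)
    standard_chars -- ordered reference list, e.g. a standard character table
    tier_sizes     -- how many reference characters go into each fallback
    """
    # Primary: every distinct non-whitespace character used in the project,
    # kept in first-seen order (dict preserves insertion order).
    primary = "".join(
        dict.fromkeys(ch for ch in project_text if not ch.isspace())
    )
    in_primary = set(primary)

    fallbacks = []
    start = 0
    for size in tier_sizes:
        tier = standard_chars[start:start + size]
        # Drop anything the Primary already covers.
        fallbacks.append("".join(ch for ch in tier if ch not in in_primary))
        start += size
    return primary, fallbacks


# Tiny stand-in data; a real run would use the full character table and its tiers.
primary, fallbacks = split_character_sets(
    project_text="开始游戏 设置",
    standard_chars="一二三开四五始六七",
    tier_sizes=[3, 3, 3],
)
print(primary)    # characters for the Primary font asset
print(fallbacks)  # one string per Fallback font asset
```

    Each resulting string could be saved to a Unicode text file and fed to "Characters from File" when creating the corresponding font asset.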

    When creating these Fallback Font Assets, the sampling point size and padding and texture resolution do not need to be the same.

    In order for the visual appearance of things like Outline, Shadow, etc. to be consistent, you have to maintain the same ratio of Sampling Point Size to Padding. So if the Primary is using a Sampling Point Size of 120 with a padding of 10, then your Fallbacks could be using a Sampling Point Size of 60 with a padding of 5. You can control the sampling point size by setting a value manually instead of using Auto Sizing on Point Size.
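
    The ratio rule above is simple arithmetic. A minimal Python sketch, assuming a hypothetical helper named `matching_padding` (nothing here is a TMP API), works out the padding a Fallback needs for a given sampling point size:

```python
from fractions import Fraction


def matching_padding(primary_size, primary_padding, fallback_size):
    """Padding that gives the fallback the same size-to-padding ratio as the primary."""
    ratio = Fraction(primary_size, primary_padding)  # e.g. 120 / 10 = 12
    return fallback_size / ratio                     # exact result as a Fraction


print(matching_padding(120, 10, 60))  # padding for a fallback sampled at 60
print(matching_padding(120, 10, 90))  # a non-integer result needs rounding
```

    When the result is not a whole number, the padding has to be rounded, so the ratio only matches approximately; picking a fallback sampling size that divides evenly avoids that.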

    So for all the known text, I usually maximize the sampling point size since I know these characters are contained in my project and I want them to always look great. However, for the Fallbacks where only a few might be used in the context of user input which might not be visible on screen much, I use lower quality settings which allows me to save on texture size / resources.
     
  3. JerellFox

    JerellFox

    Joined:
    Jul 17, 2017
    Posts:
    2
    WOW!
    Appreciate your kind assistance!
    I'll give it a try.
     
  4. ratanki

    ratanki

    Joined:
    Jul 17, 2017
    Posts:
    2
    Hey @Stephan_B ,

    I hope you don't mind if I add a round of questions about Unicode support here, since I'm struggling to find accurate documentation:

    • Does this apply to all platforms, including Android/iOS?
    • I read in one of your recent posts that GSUB tables are not supported yet. Does this not impact Asian fonts?
    • How does this apply to input fields, rather than the text component / rendering?
      • Specifically, does user input work for code points beyond the Basic Multilingual Plane? I believe Unity currently only supports the BMP, but I'm not sure about the TMP input field.

    Thank you!
    Rsam.
     
  5. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,650
    This applies to all platforms.

    FreeType, which Unity and TMP use to rasterize glyphs, does not provide access to the GPOS and GSUB tables. These tables contain the "Font Features", which include, among other things, ligatures, diacritical marks, glyph substitutions, and kerning.

    Although most languages use some font features like kerning, languages like Arabic, Thai, Bengali, etc. rely heavily on these features. East Asian (CJK) languages, like most Latin-script languages, don't rely on them as much. Regardless, support for Font Features is planned for the Integrated version of TMP.

    TMP currently supports the full range of Unicode. UTF16 characters can be accessed with \u03A9 (2 hex pairs, i.e. 4 hex digits) while UTF32 is \U0001F600 (4 hex pairs, i.e. 8 hex digits).

    Strings in C# are made of 16-bit code units, so to access UTF32 characters you have to use the \U escape or a surrogate pair (two \u escapes).
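
    To make the surrogate pair relationship concrete, here is a minimal Python sketch of the standard UTF-16 arithmetic. The function names `to_surrogate_pair` and `from_surrogate_pair` are made up for this example, and Python is used only for illustration; the arithmetic is the same in any language.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into its UTF-16 (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF need a surrogate pair")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Combine a (high, low) surrogate pair back into a single code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogate_pair(0x1F600)
print("high=%04X low=%04X" % (high, low))       # high=D83D low=DE00
print("%X" % from_surrogate_pair(high, low))    # 1F600
```

    So \U0001F600 and the two-escape form \uD83D\uDE00 describe the same character.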

    Although we can access the full Unicode range in strings or in the editor's input fields using the information above, the Text Input Field relies on Unity's Event class, which is required to process keyboard input. Currently, and depending on the platform, UTF32 input doesn't always work. Some additional work will be required to update classes like the Event class to make sure we get the correct Unicode input on all platforms.
     
  6. Feaver1968

    Feaver1968

    Joined:
    Nov 16, 2014
    Posts:
    70
    Is it possible to add this integration to the roadmap? https://unity3d.com/unity/roadmap
     
  7. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,650
    I am sure it will get added at some point since, besides supporting current TMP users, this is my primary focus.
     
  8. intellime

    intellime

    Joined:
    Mar 18, 2018
    Posts:
    11
    Hi,
    Thank you for the specific explanation of the problem of bringing Asian languages into Unity.

    Some work has already been done on bringing RTL languages like Arabic/Persian to Unity, e.g. UPersian (by ElectroGryphon), which is based on ArabicSupport for Unity (by Konash).
    They have brought RTL support to Unity nicely, and the asset's handling of typographic ligatures is almost perfect.

    Could you please let me know if there is any way to combine e.g. UPersian with TMP?

    Regards,
     
  9. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,650
    This is something that the author(s) of UPersian could certainly explore.
     
  10. intellime

    intellime

    Joined:
    Mar 18, 2018
    Posts:
    11
    I hope they will, although I also hope TMP develops RTL features as well.

    Thanks,
     
  11. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,650
    TMP has basic RTL support but does not currently support glyph re-ordering which UPersian does.

    It looks like UPersian could be used like the old Arabic asset as described on the TMP user forum. See the following thread / post.

    Native support for glyph re-ordering as well as OpenType font features is planned for the new text system that will eventually replace TextMesh Pro.
     
  12. intellime

    intellime

    Joined:
    Mar 18, 2018
    Posts:
    11
    Wow... excellent!
    Thank you, dear Stephan, for the reference to:
    http://digitalnativestudios.com/forum/index.php?topic=462.msg8705#msg8705

    The ReverseText function in the above thread was the missing piece.
    The reversed flow of the Persian text (bottom-up) was fixed by reversing the characters of the string in each paragraph.

    Thanks for your great support
     
  13. ndever

    ndever

    Joined:
    Apr 22, 2017
    Posts:
    11
    Hi,

    I seem to have a problem with some UTF32 characters. Many hours of trial and error, forum reading and trying again didn't solve it, so here it goes.

    The problematic characters are the following: (U+27607), (U+20089), (U+201A2), (U+20086), (U+20087). The font file does contain these and yes, they have the right Unicode values. First, I tried adding Characters from File. This is the glyph info output:

    Characters packed: 0/9
    Missing Characters
    ----------------------------------------
    ID: 55389 Hex: D85D Char [í¡]
    ID: 56839 Hex: DE07 Char [í¸‡]
    ID: 55360 Hex: D840 Char [í¡€]
    ID: 56457 Hex: DC89 Char [í²‰]
    ID: 56738 Hex: DDA2 Char [í¶¢]
    ID: 56454 Hex: DC86 Char [í²†]
    ID: 56455 Hex: DC87 Char [í²‡]
    ID: 13 Hex: D Char []
    ID: 10 Hex: A Char []

    It says 0/9, and the Hex values listed are from the surrogate areas: D85D and D840 are high surrogates, and the rest are low surrogates (https://www.unicode.org/charts/PDF/UDC00.pdf). If I'm getting this right, this method reads only UTF16 code units and splits the 32 bit characters into their surrogate pair codes.

    But when I try to add a character by Unicode Range (Hex) as in \U000201A2 the glyph info tells me "Characters packed: 0/0". Sooo what am I doing wrong? Please help me!

    Thanks in advance!
     
  14. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,650
    Can you provide me with a link to this font file?
     
  15. ndever

    ndever

    Joined:
    Apr 22, 2017
    Posts:
    11
  16. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,650
    Thank you for providing the source font file.

    Here are the settings that I used to create a font asset that contains these glyphs.

    upload_2018-5-19_2-24-46.png

    This is the hex character sequence that I entered

    A4,2E85,2E89,2E8D,2E96,2E98,2EA1,2EA3,2EA8,2EAD,2EB9,2EBE,2EC2,2ECF,4491,4EBC,5315,20086,20087,20089,201A2,27607

    This font only contains 25 glyphs as you can see in the image below.

    upload_2018-5-19_2-25-41.png
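
    For reference, a hex sequence like the one above can be generated from the text itself. The following is a minimal Python sketch, not a TMP tool; `hex_sequence` is a hypothetical helper name. It walks the string code point by code point, so characters above U+FFFF come out as single values such as 20086 rather than as two surrogate halves, and it skips whitespace such as CR and LF.

```python
def hex_sequence(text):
    """Return the unique code points of `text` as comma-separated hex values."""
    # Python strings iterate by code point, so a character like U+20086
    # is one item here, never a pair of surrogate code units.
    code_points = {ord(ch) for ch in text if not ch.isspace()}
    return ",".join("%X" % cp for cp in sorted(code_points))


sample = "\u00A4\u4EBC\U00020086\U00020087\U00020089\U000201A2\U00027607\r\n"
print(hex_sequence(sample))
```

    For a real project the input would come from the characters file, for example `open(path, encoding="utf-8").read()`, where `path` is whatever file holds your text.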
     
  17. ndever

    ndever

    Joined:
    Apr 22, 2017
    Posts:
    11
    Wow, thanks! It seems that I added the hex range in the wrong format. Somewhere I read that it should start with \U... Or is that for the Unity editor?

    Also, just out of curiosity, why didn't the characters from file work?

    Anyway, thanks for the instant response. Your help is greatly appreciated!
     
  18. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,650
    When referencing a UTF16 or UTF32 character in the text, you need to use the form \uFFFF (4 hex digits) for UTF16 and \UFF00FF00 (8 hex digits) for UTF32, which is the standard convention for referencing those characters in C# strings.

    BTW: This information is covered in the Font Asset Creation video.
     
  19. ndever

    ndever

    Joined:
    Apr 22, 2017
    Posts:
    11
    Oh, I see now what I got wrong. Thank you again!
     
  20. rocksvick

    rocksvick

    Joined:
    Sep 9, 2012
    Posts:
    3
    Screen Shot 2018-09-22 at 4.17.06 PM.png
    I am trying to create a font asset to support the Russian language. I have tried every combination, but all the characters are listed as missing.
    When I use the same font on a normal text component, I am able to see the Russian characters. How can I fix this?
     
  21. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,650
    I just downloaded this Lobster font from Google and it does contain those Cyrillic characters as seen in the image below.

    upload_2018-9-22_4-27-52.png

    Maybe the font you downloaded has some issue. See if you get the same results from the version available from Google (just in case you got it from somewhere else).

    In my example, I used an SDF mode, but I also tested with Hinted Smooth and got the correct results.
     
  22. rocksvick

    rocksvick

    Joined:
    Sep 9, 2012
    Posts:
    3

    Yes, it's working after downloading from that link. Actually, I tried the same hex range on the fonts that already came with TMP, and even then no Russian characters showed up. Thanks, Stephan.
     
  23. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,650
    Liberation Sans is the only font included with TMP that includes Cyrillic characters.

    When it comes to determining which glyphs are present in a font file, the Font Asset Creator is accurate and can be relied upon. The current version doesn't report whether a glyph was left out because it is missing from the font or because it could not be packed into the atlas. In the updated version shown in the image I posted, you will be able to tell which glyphs are missing and which were not packed because they simply didn't fit in the atlas.

    BTW: Google fonts is a very reliable place to get fonts. They also clearly list the licensing terms which is so important.
     
  24. dhami-nitin

    dhami-nitin

    Joined:
    Jun 5, 2016
    Posts:
    8
    I am trying to make a font asset for Khmer, but some of the characters don't display correctly in TextMesh Pro. Can anyone help?
     
  25. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,650
    The Khmer language makes use of diacritical marks, which aren't fully supported (yet) in TextMesh Pro. So although it is possible to include all the characters and diacritical marks in the font asset by using the Font Asset Creator, the marks will not get positioned correctly.

    It appears the Khmer language uses mostly single diacritical marks, and although the Font Asset Creator won't extract the positional data needed to position these marks, it should be possible to manually create Glyph Adjustment Pairs in the Font Asset inspector to position them properly.

    Take a look at / play around with this feature, which should allow you to set up these marks for the various character / mark combinations.
     
  26. how2winsquad

    how2winsquad

    Joined:
    Oct 16, 2018
    Posts:
    2
    I'm trying to make a font asset for Vietnamese in TMP, but it is not working even though I followed the tutorial in this topic. Please help me, thank you.
     
  27. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,650
    The new preview / release of the TextMesh Pro package version 1.4.0-preview.1b for Unity 2018.3 is now available.

    This new release includes the new Dynamic SDF system which will improve the workflow for handling localization. Please see the top sticky post and video about this new release / topic.
     
  28. how2winsquad

    how2winsquad

    Joined:
    Oct 16, 2018
    Posts:
    2
    thank you very much
     
  29. duonglkh

    duonglkh

    Joined:
    Mar 27, 2018
    Posts:
    1
    I am trying to make a font asset for Khmer, but some of the characters don't display correctly in TextMesh Pro. Can anyone help?
     
  30. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,650
    Most likely the issue is related to OpenType font features like Diacritical Marks, Ligatures, etc. not being supported (yet) by TMP.

    For me to confirm this, please provide an example of the text you are using and images showing the current results vs. expected results.
     