TextMesh Pro Creating asset from font with a lot of characters

Discussion in 'Unity UI (uGUI) & TextMesh Pro' started by Necronomicron, Jul 16, 2018.

  1. Necronomicron

    Necronomicron

    Joined:
    Mar 4, 2015
    Posts:
    90
    I want to create a font asset from NotoSansCJKjp-Regular (42188 characters). I will use many of them, so rather than picking specific ones I would like to include them all. I also need them to be huge on screen (about half the screen), so the characters need to be good quality as well. What is the best approach? What settings should I use?

    I use Unity 2018.1.4f1 and TextMesh Pro 1.2.4.
     
    Last edited: Jul 16, 2018
  2. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,182
    Adding every single character from a font file is very inefficient as most of those characters will never be needed.

    The easiest way to handle localization (currently) is to create a Primary font asset that contains all the known characters used in the project for each given language or sets of languages.

    For Latin-based languages, since their character set is limited, you can create a Primary font asset that contains all of extended ASCII. Then create a few additional font assets that contain Cyrillic, Greek and other potential subsets, and assign those as Fallbacks to the Primary for Latin languages.

    For Latin languages (and depending on the font you select) a sampling point size of 72 with padding of 8 typically results in a nice-looking font.

    For CJK, since the character sets are much larger, you will create a Primary font asset for each language and then have several fallbacks for each of them as well.

    For Chinese for example, your primary will contain all the Chinese characters known / contained in your project. Then, for the unknown characters likely to come from user input, you will create 3 additional fallback font assets which will contain the remaining characters out of the 8105 defined in the Table of General Standard Chinese Characters. As a result, the first fallback will contain the first 3500 characters from the list minus those already in your primary. The 2nd fallback will contain the next 3000, again minus those in the primary, and the third the remaining 1605 minus those in the primary.
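
    A rough sketch of that split, not taken from the original post: the function name `buildFallbackSets`, the tiny sample data and the tier sizes used here are made up for illustration. A real run would pass the 8105-character list with tiers of 3500, 3000 and 1605.

```javascript
// Split a frequency-ordered character list into fallback sets,
// skipping anything the primary font asset already covers.
// `tiers` holds the size of each slice of the standard list.
function buildFallbackSets(standardChars, primaryChars, tiers) {
  const primary = new Set(primaryChars);
  const fallbacks = [];
  let start = 0;
  for (const size of tiers) {
    const slice = standardChars.slice(start, start + size);
    fallbacks.push(slice.filter((ch) => !primary.has(ch)));
    start += size;
  }
  return fallbacks;
}

// Stand-in data only: ten characters split into tiers of 4, 3 and 3.
const standard = Array.from("一二三四五六七八九十");
const primaryChars = Array.from("一三十");
const sets = buildFallbackSets(standard, primaryChars, [4, 3, 3]);
console.log(sets.map((s) => s.join(""))); // [ '二四', '五六七', '八九' ]
```

    Each resulting set can then be pasted into the font asset creator as the character list for one fallback.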

    For East Asian characters, a sampling point size of 36 to 48 is actually pretty good with padding value of 4 to 5. Try to keep the padding at about 10% of sampling point size.

    When creating the Primary and Fallback font assets, keep in mind that they do not need to be using the same sampling point size and padding. So the Primary can be sampled at higher quality (since you know these characters are contained in the project / UI and menus) and then use a lower quality for the fallback since they will likely come from user input which is typically plain white text and smaller on screen where the higher quality won't be noticeable.

    The only important part is maintaining the same ratio of Sampling Point Size to Padding for the Primary and Fallbacks. For instance, if the primary is using a sampling point size of 80 with padding of 8, then the fallbacks could be using a sampling point size of 50 with padding of 5. Maintaining the same ratio will ensure the same visual appearance with regard to styling (outline, shadow, etc.) between the primary and fallbacks when using Material Presets.
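
    The ratio rule is simple arithmetic. A minimal sketch, with `matchingPadding` and `sameRatio` as hypothetical helper names that are not part of TextMesh Pro:

```javascript
// Padding that keeps the same padding-to-point-size ratio as a reference asset.
function matchingPadding(refPointSize, refPadding, pointSize) {
  return Math.round((refPadding * pointSize) / refPointSize);
}

// Compare two ratios by cross-multiplying, which avoids floating point division.
function sameRatio(a, b) {
  return a.padding * b.pointSize === b.padding * a.pointSize;
}

const primaryAsset = { pointSize: 80, padding: 8 };
const fallbackAsset = { pointSize: 50, padding: matchingPadding(80, 8, 50) };

console.log(fallbackAsset.padding);                  // 5
console.log(sameRatio(primaryAsset, fallbackAsset)); // true
```

    Because padding is a whole number, some point sizes cannot match the ratio exactly after rounding, so it is worth picking fallback point sizes that divide evenly.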

    I certainly understand this is more involved than you wanted. But as with many aspects of game development, where we have to create efficient geometry / topology and UV mapping for models, or bake NavMeshes or Lightmaps, to achieve the visual results and performance we seek, the same is true for text, which, granted, isn't as cool as these other things but still remains important.

    Having said all of that, a hybrid dynamic SDF system is in the works and will make this process much simpler. The recommended workflow will still include creating primary font assets that contain all the known / used characters in the project for each language or set of languages, but you will be able to use fallback font assets (set to dynamic mode) where characters not covered in your primary or other fallbacks can be added to those font assets at runtime.

    The idea is to have the vast majority of characters already baked in your primary and existing fallbacks, thus providing the best quality and performance, while relying on the dynamic system for those few characters that were unknown and come from user input, where the performance impact is not noticeable by users since humans type slowly.
     
  3. Johannski

    Johannski

    Joined:
    Jan 25, 2014
    Posts:
    571
    Just a quick addition: If you're using an excel sheet or google sheets for your translations I made a small handy tool to get all unique characters of a language: https://github.com/JohannesDeml/CsvCharacterExtractor
    That way it is really easy to just include the characters you really need.
     
  4. Necronomicron

    Necronomicron

    Joined:
    Mar 4, 2015
    Posts:
    90
    Well, then maybe I could pick only those I will use. It's 1000+ characters for now and later I will probably extend this number to around 3000. I will use them all and at a huge size (1 character = 1 level, like this). And it's only hieroglyphs (no Latin, Cyrillic or anything else). Can I fit them all in one asset or should I split them into parts somehow? Or maybe it will split them automatically? I'm asking because testing it myself would take an enormous amount of time. Yesterday I tried to create an asset of 100 hieroglyphs and it took something like 5 minutes...
     
  5. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,182
    For handling about 3000 glyphs, depending on the sampling point size and padding that you need, you could likely fit all of those in one 2048 x 2048 font asset, but I think it would be easier to simply split those between 2 or even 3 font assets if needed. Assuming you end up with 2 or 3 font assets, the primary should contain the known text (menu, UI, common text) and the other font assets the remaining characters. These other font assets (fallbacks) should be assigned to the primary in the Fallback list.
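
    For a back-of-the-envelope feel for how many glyphs fit, here is a crude estimate. It assumes every glyph occupies a square cell of point size plus padding on each side, which is only an approximation: the real atlas packer works from actual glyph bounds, so treat the numbers as a ballpark. `estimateGlyphCapacity` is a made-up helper, not a TextMesh Pro function.

```javascript
// Crude capacity estimate: square cells of (pointSize + 2 * padding) pixels
// laid out on a grid inside a square atlas. This is an assumption, not how
// the actual packer works.
function estimateGlyphCapacity(atlasSize, pointSize, padding) {
  const cell = pointSize + 2 * padding;
  const perRow = Math.floor(atlasSize / cell);
  return perRow * perRow;
}

console.log(estimateGlyphCapacity(2048, 30, 3)); // 3136
console.log(estimateGlyphCapacity(2048, 48, 5)); // 1225
```

    Under these assumptions, roughly 3000 glyphs fit in a 2048 x 2048 atlas only at a fairly small sampling point size, which is one reason splitting across several font assets is the easier route when higher quality is needed.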

    In terms of the time it takes to create the font asset, it can take several minutes depending on the sampling point size, padding and number of characters. This only has to be done once in theory so figure out how you will split these characters and start baking the font assets. BTW: This process will be much faster in the next release of TMP :)
     
  6. Necronomicron

    Necronomicron

    Joined:
    Mar 4, 2015
    Posts:
    90
    What are the disadvantages of fallback assets? Can primary and fallback assets be of the same quality?
     
  7. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,182
    Since a font atlas texture's size / space is limited, you can only add so many characters at a certain point size per atlas. The Fallback system allows you to split the characters between multiple font assets / atlases, thus removing this limitation.

    When a certain character is requested, TMP will look in the primary (the font asset assigned to the text object), but if this character is not available, TMP will then look through the list of fallback font assets assigned to the primary as well as their own fallbacks. If the character is still not found, TMP will look through the list of general fallbacks assigned in the TMP Settings file. If the character is still not found, then TMP will look in the Default Font Asset assigned in the TMP Settings, and if the character is still not found, it will display the missing glyph character specified in the TMP Settings.
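
    The search order can be modeled roughly as below. This is only an illustration of the order described above, not TMP's actual code: the object shapes (`name`, `chars`, `fallbacks`), the `settings` fields and both function names are assumptions made up for the sketch.

```javascript
// Depth-first search of one asset and its fallbacks (and their fallbacks).
// `visited` guards against assets that reference each other in a loop.
function findInAsset(asset, ch, visited) {
  if (!asset || visited.has(asset)) return null;
  visited.add(asset);
  if (asset.chars.has(ch)) return asset;
  for (const fallback of asset.fallbacks) {
    const hit = findInAsset(fallback, ch, visited);
    if (hit) return hit;
  }
  return null;
}

// Order: primary and its fallbacks, then the general fallbacks from the
// settings, then the default font asset, then the missing glyph character.
function resolveGlyph(primary, settings, ch) {
  const visited = new Set();
  const candidates = [primary, ...settings.fallbacks, settings.defaultFontAsset];
  for (const asset of candidates) {
    const hit = findInAsset(asset, ch, visited);
    if (hit) return { asset: hit.name, char: ch };
  }
  return { asset: null, char: settings.missingGlyph };
}

const makeAsset = (name, chars, fallbacks = []) => ({ name, chars: new Set(chars), fallbacks });
const kanjiB = makeAsset("KanjiB", "水火");
const kanjiA = makeAsset("KanjiA", "日月", [kanjiB]);
const latin = makeAsset("Latin", "abc");
const settings = { fallbacks: [latin], defaultFontAsset: latin, missingGlyph: "\u25A1" };

console.log(resolveGlyph(kanjiA, settings, "火").asset); // KanjiB
console.log(resolveGlyph(kanjiA, settings, "a").asset);  // Latin
console.log(resolveGlyph(kanjiA, settings, "木").char);  // missing glyph character
```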

    The system is even more flexible, as sprite assets assigned to the text object and in the TMP Settings are also scanned, since Sprites can now have Unicode values assigned to them.

    The only downside to using fallbacks is the extra draw call that you get for using characters from the other atlas textures. So if you have a primary with 2 fallbacks and use characters from all of them, you get 3 draw calls instead of 1. Since there is no real measurable performance difference between 1 and 20 draw calls, this is fine even on old mobile devices.

    As for quality, they can be the same but do not have to be. The primary can use a higher sampling quality and a larger texture than the fallbacks. The only thing to be mindful of is maintaining the same ratio of sampling point size to padding between the two. If the Primary is using a sampling point size of 100 with padding of 10, then the fallbacks could use a sampling point size of 60 with padding of 6, or anything else matching this 10% ratio.

    Currently font assets and their fallbacks are loaded when the primary is loaded. In the future, I want to make the fallbacks load on demand. However, even without on-demand loading, the alternative would be loading one much larger texture instead of several smaller ones, so there is no memory usage difference here. More importantly, it is faster to read from a smaller texture than a larger one on many mobile devices, so in that regard using several smaller textures is better than one big one.
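
    To put numbers on the memory point, here is the arithmetic under the assumption of a single-channel atlas at 1 byte per texel with no mipmaps. The atlas sizes are chosen for illustration and `atlasBytes` is a made-up helper.

```javascript
// Memory for a square atlas, assuming 1 byte per texel and no mipmaps.
function atlasBytes(size, bytesPerTexel = 1) {
  return size * size * bytesPerTexel;
}

const MIB = 2 ** 20;
const oneLarge = atlasBytes(4096);      // one 4096 x 4096 atlas
const fourSmall = 4 * atlasBytes(2048); // four 2048 x 2048 atlases

console.log(oneLarge / MIB, fourSmall / MIB); // 16 16
```

    The same number of texels costs the same memory either way, so the split itself is free in memory terms.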
     
  8. Necronomicron

    Necronomicron

    Joined:
    Mar 4, 2015
    Posts:
    90
  9. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,182
    I checked the font file and this glyph is present in it so unless it could not fit in the texture it should be included.

    I'll check into it later this afternoon since I am working on Font Asset Creator stuff anyway.
     
  10. Necronomicron

    Necronomicron

    Joined:
    Mar 4, 2015
    Posts:
    90
    I tried to create an asset containing only this symbol and it was reported as missing. In fact, there were 2 missing characters, and neither was the one I asked for (D8 42 and DF 9F instead of D8 42 DF 9F). The problem may be that this symbol is outside the Unicode BMP and is being interpreted as 2 separate characters.
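
    For reference, this is how the two 16-bit values combine into one code point under standard UTF-16 surrogate decoding. `fromSurrogates` is just an illustrative helper name.

```javascript
// Combine a UTF-16 surrogate pair into a single Unicode code point.
function fromSurrogates(high, low) {
  return (high - 0xd800) * 0x400 + (low - 0xdc00) + 0x10000;
}

const codePoint = fromSurrogates(0xd842, 0xdf9f);
console.log(codePoint.toString(16).toUpperCase()); // 20B9F

const ch = String.fromCodePoint(codePoint);
console.log(ch.length);             // 2 (UTF-16 code units)
console.log(Array.from(ch).length); // 1 (actual character)
```

    Any tool that walks the string one 16-bit unit at a time sees D842 and DF9F as two separate, invalid characters, which matches the two missing entries described above.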
     
    Last edited: Jul 21, 2018
  11. Kumo-Kairo

    Kumo-Kairo

    Joined:
    Sep 2, 2013
    Posts:
    332
    Does this new dynamic system involve GPU SDF calculations of some sort?
     
  12. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,182
    It does not. I think most users would prefer we keep the GPU free to do other cooler stuff.

    Don't get me wrong I obviously think text is cool but I seem to be in the minority here ;)
     
  13. Kumo-Kairo

    Kumo-Kairo

    Joined:
    Sep 2, 2013
    Posts:
    332
    My point is: if we're loading the CPU for a pretty long time, causing a one-long-frame stutter, and we can't really do anything on the GPU either (as the pipeline is stalling), why not make a distance calculation shader that does the same thing but quite a few times faster? I think OpenGL ES 2.0 capabilities would be enough for it, with no need for complex compute shaders. It would be used only during the one frame in which we won't be able to do anything else anyway.
    Or am I missing something?

    As for the editor-based generation that takes some 5 minutes for Chinese characters, it would possibly benefit from this approach as well.

    I'll dig into this problem myself and see whether it's possible with GLES2 and how much time it takes compared to a CPU-based approach (on a low-end mobile device).
     
  14. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,182
    The idea is still to create font assets that include coverage for all the known characters used in a project, then to rely on the dynamic system for unknown characters coming mostly from user input. In that case, we are talking about a few characters being rasterized and added to the font atlas at runtime, and given that this will happen mostly while a user is typing, the overhead should not be perceivable by any user.

    In other words, I think we'll be able to achieve the performance we need without having to tap the GPU resources. Should that not be the case, then I will most certainly explore alternative options including offloading the task to the GPU.
     
    Last edited: Jul 23, 2018
  15. ALL-CAPS

    ALL-CAPS

    Joined:
    Jun 23, 2014
    Posts:
    8
    Is there any ETA on the release of the new hybrid font system? We're really looking forward to that feature.
     
  16. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,182
  17. Egil-Sandfeld

    Egil-Sandfeld

    Joined:
    Oct 8, 2012
    Posts:
    41
    I was looking for something like this! Nice.
    Quick lesson learned on my side: I had to arrange my CSV like yours and save it as Unicode (UTF-8).
     
  18. Johannski

    Johannski

    Joined:
    Jan 25, 2014
    Posts:
    571
  19. seltar_

    seltar_

    Joined:
    Apr 16, 2015
    Posts:
    14
    You could also extract the characters from a text with JavaScript.

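    Code (JavaScript):
    // Collect the unique characters in `text` as decimal code points.
    // for...of walks whole code points, so characters outside the BMP
    // (surrogate pairs) are not split in two as they are with charCodeAt.
    const getUniqueChars = (text) => {
      const chars = new Set();
      for (const ch of text) {
        chars.add(ch.codePointAt(0));
      }
      return [...chars];
    };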
    Usage:
    Code (JavaScript):
    // Plain string concatenation keeps stray newlines and indentation out of the set.
    const text = "abcdefghijklmnopqrstuvwxyz0123456789" +
      "ABCDEFGHIJKLMNOPQRSTUVWXYZ-.,*=+";

    const chars = getUniqueChars(text).join(",");

    console.log(chars);
    Then paste the result into the TextMesh Pro font asset creator with the character set set to custom range.
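
    For large character sets the comma-separated list gets long, so one option is to collapse consecutive values into ranges. A sketch, assuming the custom range field accepts decimal values and hyphenated ranges separated by commas (worth verifying in your TMP version); `toRanges` is a made-up helper name.

```javascript
// Collapse the unique code points of `text` into a compact decimal range
// string such as "48-49,57,97-99".
function toRanges(text) {
  const points = [...new Set(Array.from(text, (ch) => ch.codePointAt(0)))]
    .sort((a, b) => a - b);
  const ranges = [];
  for (let i = 0; i < points.length; i++) {
    const start = points[i];
    // Advance while the next code point is consecutive.
    while (i + 1 < points.length && points[i + 1] === points[i] + 1) i++;
    ranges.push(start === points[i] ? `${start}` : `${start}-${points[i]}`);
  }
  return ranges.join(",");
}

console.log(toRanges("abcxyz019")); // 48-49,57,97-99,120-122
```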
     
  20. Catlard

    Catlard

    Joined:
    Sep 23, 2011
    Posts:
    3
    Hi there @Stephan_B ! I was very happy to find this thread. How's that dynamic font rendering system coming along? I'm sorely in need of it!
     
  21. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,182
    It is coming along nicely.

    Mostly tweaking editors / clean up at this stage while trying to address as many reported issues as possible which helps me further test the new system and changes.

    Planning on creating a new video in the next few days to cover key changes to Font Assets, Sprite Assets and their Editors / Inspectors. This will give me another good opportunity to further test everything as issues tend to surface when I am about 10 minutes into recording. Murphy's Law always lurking around when recording videos or during demos / big presentations looking to trip you up ;)
     
  22. Catlard

    Catlard

    Joined:
    Sep 23, 2011
    Posts:
    3
    What great news! It sounds like it will be out soonish (at least, before I try to release MY app in January).

    Quick question: after I download my translations.json file and compile my list of unique characters in it, I'd like to be able to put all those custom characters I want to render in one file, and then manually trigger the SDF to re-render with the same settings, and those characters. Is that going to be possible with this fancy dancy new dynamic system? I hope it will!
     
  23. Cromfeli

    Cromfeli

    Joined:
    Oct 30, 2014
    Posts:
    191
    For example:
    https://www.google.com/get/noto/#sans-hans

    Use these ranges:
    https://forum.unity.com/threads/table-of-general-standard-chinese-characters.559882/
     
    Last edited: Mar 21, 2019
  24. YoungXi

    YoungXi

    Joined:
    Jun 5, 2013
    Posts:
    47
    Did you find any reason that might cause this? I'm using 1.2.2 but having the same problem: characters exist in my font, but are missing in the result.
     
  25. Stephan_B

    Stephan_B

    Unity Technologies

    Joined:
    Feb 26, 2017
    Posts:
    3,182
    What specific characters and what font file?

    Are you running Unity 2017 or 2018 or possibly 2019?