Bug Font is missing characters, which absolutely exist in the font...

Discussion in 'UGUI & TextMesh Pro' started by aaversa, Sep 13, 2022.

  1. aaversa (joined Sep 5, 2016; 41 posts)

    I'm using a font called 03SmartFontUI for the Japanese localization of our current game. I have a character file generated with 1425 items, which I use along with the "Characters from File" option. When using Font Asset Creator, no matter what size, padding, or resolution I use, I keep getting the same issue:

    Characters Included: 1181/1423
    Missing Characters: 242
    Excluded Characters: 0

    The issue is that this font absolutely has those characters. Not only did I use this exact font for the Japanese localization of my previous game (literally, the same exact base TTF), but I installed that very font to Windows and tested the missing characters in Word. They exist, but TMPro isn't seeing them.

    I've tried bypassing "Characters From File" altogether by simply copying the entire unique list of characters and pasting it as a Custom Character List, but it's the same issue with 242 missing characters.

    I have attached the font and the character file being used.

    I'm using Unity 2020.3.27f1 with TMPro updated to the very latest version via package manager.
     

  2. Stephan_B (joined Feb 26, 2017; 6,595 posts)

    I just had a chance to look at the font file and character list you provided.

    As per the report from the Font Asset Creator, those characters are indeed missing from the source font file. For instance, the first missing character reported by the Font Asset Creator is 0x5B59, and inspecting the font file in High-Logic FontCreator confirms that it is indeed missing.

    upload_2022-9-12_19-33-2.png

    You can see that 0x5B58 is present as well as 0x5B5A but no 0x5B59.

    Why does this character show up in Microsoft Word or other apps? Most of these applications use font substitution / fallback, which leads users to believe the characters are present in a given font file when in reality they are not. In most cases this magic hidden fallback works, but as the developer you have no control over which other font gets used, or on which platform.

    Here is a screenshot from Microsoft Word of 0x5B58, 0x5B59 and 0x5B5A:

    upload_2022-9-12_19-41-38.png

    The first and last characters, which are slightly heavier / thicker, come from your font file. The middle character, 0x5B59, which is thinner, comes from the DengXian font file. In this case the weight difference between your font and DengXian is fairly subtle, but all too often users can easily tell that different fonts are being used. That is the problem with this magic fallback: it leaves the developer unaware that their font is missing those characters, which they usually discover late in the development cycle or when users report it after release.

    P.S. The Font Asset Creator missing glyph report can be relied on with respect to missing glyphs from font files. Font Editing tools like FontCreator, FontForge, FontLab can also be relied on to verify all of this.
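
    For anyone who wants to run the same kind of check outside the editor, here is a minimal sketch in Python. It is not how the Font Asset Creator works internally. It assumes the third-party fontTools package is installed (only needed for the loading step), and the helper names `load_supported_codepoints` and `missing_codepoints` are made up for illustration.

```python
from typing import List, Set


def load_supported_codepoints(font_path: str) -> Set[int]:
    """Return the code points mapped in the font's Unicode cmap.

    Assumption: the third-party fontTools package is installed.
    """
    from fontTools.ttLib import TTFont  # lazy import: third-party dependency

    font = TTFont(font_path)
    try:
        # getBestCmap() returns {code point: glyph name}, or None if the
        # font has no Unicode cmap subtable.
        cmap = font.getBestCmap() or {}
        return set(cmap.keys())
    finally:
        font.close()


def missing_codepoints(text: str, supported: Set[int]) -> List[str]:
    """List the characters in `text` that the font does not map, as 0xXXXX."""
    wanted = {ord(ch) for ch in text if not ch.isspace()}
    return [f"0x{cp:04X}" for cp in sorted(wanted - supported)]
```

    Because this reads the font's own character map, no system fallback font is involved, so a character that Word renders through substitution still shows up as missing here.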
     
  3. aaversa

    Something doesn't make sense here though. As mentioned I use this *exact* reference font in my other game which has all of these characters. The other game has been out for years and extensively tested. It has no missing characters.

    I took the TMP font asset from *that* game and imported it into my current project. From what I can tell it includes the missing characters and others... I've attached that generated asset here. The source font is exactly the same (03SmartFontUI) as I literally copied+pasted it via Windows Explorer from one project to the other.

    Edit: Also worth noting, at least a few of the missing characters being reported are extremely common kanji. It is inexplicable that a Japanese font would be missing them.
     

    Last edited: Sep 13, 2022
  4. Stephan_B

    I checked the first 10 missing characters in High-Logic FontCreator and they are most definitely missing from the font file. I also verified this in FontLab 8 as seen below and those characters are not in the font file.

    upload_2022-9-12_21-20-24.png

    It is not unheard of for font publishers or companies like Microsoft to update font files from time to time. Most likely this is a different source font file.
     
  5. aaversa

    Here is a link to download the source font file from, well, the source.

    https://www.freejapanesefont.com/smart-font-ui-download/

    I'm very confused because when I manually search for the character "存", I can see that in FontCreator as cid2840, index 2840, code $5B58. It IS there, right...? Wouldn't it be missing from that square if it were not there?
     
  6. aaversa

    Ok. I *believe* I've identified the problems.

    Issue 1: When building my character file to use, I was not using String.Normalize().

    Issue 2: There were actual Chinese characters being mixed in that were from the zh_cn dialog files.

    Issue 3: The Font Asset Creator modal seems to cache any file you put in Characters From File, and it's not clear how or when that cache is cleared.

    For example, every time I regenerated the character file (japaneseCharacters.txt) I was overwriting the same file via editor script. In Font Asset Creator, I would delete the reference, put in a new reference to a different text file, begin building from *that* file, cancel the build, and then switch back to japaneseCharacters. I did this because I wanted to make sure the cached version of japaneseCharacters.txt was not being used.

    However it seems like that cache is very stubborn and persistent. Once I started rebuilding character files with incrementing file names (japaneseCharacters2, japaneseCharacters3), only then did I see Font Asset Creator start reporting the correct number of found/missing glyphs.
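
    To illustrate issues 1 and 3, here is a minimal sketch in Python rather than the actual C# editor script. It assumes NFC normalization (the form the parameterless String.Normalize() uses in .NET), and the function names and the incrementing-name scheme are illustrative assumptions, not anything TMPro provides.

```python
import unicodedata
from pathlib import Path


def unique_characters(texts):
    """Collect the distinct non-whitespace characters after NFC normalization."""
    seen = set()
    for text in texts:
        # NFC composes sequences such as U+304B + U+3099 into the single
        # precomposed character U+304C, so both spellings count once.
        for ch in unicodedata.normalize("NFC", text):
            if not ch.isspace():
                seen.add(ch)
    return "".join(sorted(seen))


def next_character_file(directory, stem="japaneseCharacters"):
    """Return the first unused path: stem.txt, stem2.txt, stem3.txt, ..."""
    directory = Path(directory)
    candidate = directory / f"{stem}.txt"
    n = 2
    while candidate.exists():
        candidate = directory / f"{stem}{n}.txt"
        n += 1
    return candidate
```

    Writing each regenerated list to the path returned by `next_character_file` mirrors the workaround above: the Font Asset Creator is always handed a file name it has not seen before.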
     
  7. Stephan_B

    0x5B59 "孙" is the first missing glyph on the Font Asset Creator report list. 0x5B58 and 0x5B5A as seen above are both present in that font file.
     
  8. aaversa

    Right, understood now, there were some Chinese characters mixed into the file. That being said, did you see my note about Font Asset Creator caching the character file even if it's totally regenerated?
     
  9. Stephan_B

    Let me test that as soon as I have a chance.

    What happens if you close the Font Asset Creator? Does that reset it, so that it then seems to use the correct file?