Question Wrapping with Chinese

Discussion in 'UGUI & TextMesh Pro' started by PigDa, Jul 7, 2021.

  1. PigDa

    Hi all,

    I am using TMP with Chinese text, and the wrapping is not what I expect:

    upload_2021-7-7_11-33-3.png

    Right now I work around it by inserting \n line breaks via rich text.
    Is there a smarter way?
     
  2. PigDa

    plus upload_2021-7-7_11-40-59.png
     
  3. Stephan_B

    Line breaking for Chinese, Japanese and Korean is based on the Leading and Following Character rules which define where it is legal to break a line of text. The list of leading and following characters is defined in the TMP Settings which can be edited.

    Based on the above rules, the text would break after '文' since this character is allowed to be at the end of a line. Since the series of "rrrrrrrr" are considered a word (Latin text), the word wraps to the next line. Since the whole word ("rrrrrr") doesn't fit on the 2nd line, we then have to break the word which is why the last "r" is on the third line.

    As you have shown in your image, the text breaks as follows:

    upload_2021-7-7_1-6-15.png

    Here is the same text in Microsoft Word.

    upload_2021-7-7_1-7-4.png
    As you can see, TMP breaks the text exactly as it does in Microsoft Word and other text processors / editing tools.

    Having said that, you can prevent the text from breaking between the word "Chinese" and the series of "r" by using the <nobr> or no break tag as seen below:

    "<nobr>中文rrrrrrrrrrr</nobr>"

    upload_2021-7-7_1-10-55.png
     
  4. PigDa

    Wow, amazing!
    I didn't know about the line breaking rules before.
    <nobr> works for me, thanks a lot!
     
  5. dujimache

    How can I avoid line breaking entirely? For example: <nobr>rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr rrrrrrrrrrrrrrr rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr</nobr>

    It still breaks at the spaces, giving this result:
    upload_2021-7-9_14-25-27.png

    How do I stop all line breaking?
     
  6. Stephan_B

    Given the following text: "<nobr>AB CD</nobr><zwsp><nobr>EF GH</nobr>"

    We get the following on a single line.
    upload_2021-7-8_23-40-20.png

    Then once we can no longer fit everything on a single line, we break on the <zwsp> as seen below.
    upload_2021-7-8_23-41-44.png

    Then as we can no longer fit any of the no break groups on a single line, we actually break where it is legal which is at the space between AB and CD.
    upload_2021-7-8_23-43-11.png

    If we were to further reduce the width of the text container, we would then break between EF and GH. Past that point, we begin breaking on individual characters.

    So in your example with all those "r", since we can no longer respect the no break, we break on the spaces between the sequences of "r"s.
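The fallback order described above can be sketched as follows. This is an illustrative Python model, not TMP's code: each list entry stands for one <nobr> run, a <zwsp> is assumed to sit between consecutive runs, and width is counted in characters instead of measured glyph widths. The function name `wrap` is hypothetical.

```python
def wrap(groups, width):
    """Greedy wrap: prefer <zwsp> breaks, then spaces inside a run, then characters."""
    lines = []
    current = ""

    def flush():
        nonlocal current
        if current:
            lines.append(current)
            current = ""

    for group in groups:
        if len(current) + len(group) <= width:
            current += group              # still fits; the <zwsp> has zero width
            continue
        flush()                           # preferred break: at the <zwsp>
        if len(group) <= width:
            current = group
            continue
        # The run no longer fits on a line: fall back to the spaces inside it.
        for word in group.split(" "):
            candidate = word if not current else current + " " + word
            if len(candidate) <= width:
                current = candidate
                continue
            flush()
            # Even a single word is too long: break per character.
            while len(word) > width:
                lines.append(word[:width])
                word = word[width:]
            current = word
    flush()
    return lines
```

For the "AB CD" / "EF GH" example, a width of 10 keeps everything on one line, a width of 7 breaks only at the <zwsp>, and a width of 4 breaks at the spaces inside both runs. Because every character counts as one unit here, both runs give way at the same width, whereas with real glyph widths they can give way one after the other.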

    Now if we had the following text:

    "AAAA<nbsp>BBBB<nbsp>CCCC"

    Would we expect the text to break on the last C or to break at those <nbsp> if again we could no longer respect the no break?

    What is the use case for breaking those individual "r" or breaking on the last C above?

    I can certainly rationalize breaking on the last C given we added those <nbsp> in the example above. But in the case of those "r" and given the text does contain logical breaks (spaces) it would seem logical to respect those when we can no longer respect the no break. Thoughts...
     
  7. dujimache

    Thank you. In some cases, I don't want the whole word to wrap,
    only part of it. For example:
    upload_2021-7-12_11-8-23.png
    I got:
    upload_2021-7-12_11-9-25.png
    What I expected is:
    upload_2021-7-12_11-9-44.png
     


  8. Stephan_B

    In your 2nd example, is that text enclosed in <nobr> tags or do you expect it to not break on the white space between "g h"?
     
  9. dujimache

    Last edited: Jul 14, 2021
  10. Stephan_B

    Correct. As per one of my previous posts, when the text enclosed in the <nobr> tag can no longer fit on a single line, we then break the text using logical break points which are typically white spaces. When this is no longer possible, we then break individual words.

    My previous post to you was a question about whether or not your example was using the <nobr> tag which it appears it was.

    I am not opposed to adding additional line breaking modes, I am just trying to get a clear understanding of the use case to make sure these new modes achieve the desired result. So back to one of my previous questions:

    In the current releases of the TMP package, if the text is "<nobr>abcdefg hijkl</nobr>" and the text container is too narrow to fit the entire text on a single line, we break on the space between "g" and "h".

    However, if the text were "<nobr>abcdefg<nbsp>hijkl</nobr>", we would then break on the "k" as per your example. Would that be acceptable?

    The key difference here is having to use the non-breaking space <nbsp> to indicate that we do not want to break between these two words.

    P.S. I made the change where using a <nbsp> will force breaking per character as described above.

    In this first example, the text is "AB <nobr>CD EF</nobr> GW", which results in "CD EF" wrapping as a single word and then breaking between "CD" and "EF" once we are forced to break it. This is the current behavior.

    2021-07-14_0-07-36.gif

    In this 2nd example, we revise the text as follows: "AB <nobr>CD<nbsp>EF</nobr> GW" to use <nbsp> to force breaking "CD EF" per character when it no longer fits on a single line.

    2021-07-14_0-08-32.gif
     
    Last edited: Jul 14, 2021
  11. dujimache

    The text content isn't known in advance, so it's hard to add <nbsp> by hand. If there were a checkbox to control per-character wrapping, that would be very convenient.
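When the content is only known at runtime, one possible workaround is to insert the tags programmatically before assigning the string to the text component. The sketch below is in Python for illustration only; in a Unity project the same string manipulation would be done in a script. The helper name `protect_spaces` is hypothetical, and whether <nbsp> then forces per-character breaking depends on the TMP release containing the change described in the previous post, which this thread does not identify.

```python
import re

# Matches a single rich text tag such as <b> or <font="My Font">.
TAG = re.compile(r"(<[^<>]*>)")


def protect_spaces(text):
    """Wrap text in <nobr> and replace plain spaces with <nbsp>.

    Spaces inside existing tags are left alone so the tags stay valid.
    """
    parts = TAG.split(text)               # alternates between plain text and tags
    converted = [
        part if TAG.fullmatch(part) else part.replace(" ", "<nbsp>")
        for part in parts
    ]
    return "<nobr>" + "".join(converted) + "</nobr>"
```

For example, passing "abcdefg hijkl" yields "<nobr>abcdefg<nbsp>hijkl</nobr>", the same string used in the example above.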
     
  12. Alien1997

    Yeah, I ran into the same problem. A wrapping mode option would be better.