TextMesh Pro appears to send an additional unused quad to the GPU for each text component.

Noisecrime · Jun 5, 2021

I was looking into UI rendering of TextMeshPro via RenderDoc and noticed that in many (all?) cases the text component appears to send an unused 'phantom' quad to the GPU, in the form of an additional 6 vertices appended to the end of the vertex buffer.

This is with Unity 2019.4.21f1 using TMPro package 2.1.6 in a fresh new project created yesterday.

The RenderDoc screenshot above shows the default 'new text' string you get from adding a new 'TextMeshPro - Text' gameobject to an empty canvas. The yellow quads are the individual characters, and the blue dot shows the first of the 6 unused vertices. The dark blue line in the table also marks the first of the 6 unused vertices. It should be noted that whilst the positions are zero in this screenshot, they can actually be any value, but all six vertices always share exactly the same value.

It's interesting that all of the unused vertices have the same position, meaning the two triangles they define are degenerate and as such should be rejected by the GPU when it renders them. I assume therefore that GPU-wise this has very little cost, but there is a cost in wasted bandwidth to send/upload the vertex buffer, and on the CPU side with both TMPro and Unity UGUI processing this quad.
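To spell out the 'degenerate' part: a triangle whose three vertices share one position has zero area, so there is nothing to rasterize. A minimal C# sketch of that check follows. The `DegenerateCheck` class and `TriangleArea` helper are made up for illustration and are not part of TMP or Unity.

```csharp
using System;

// Hypothetical helper for illustration only; not a TMP or Unity API.
public static class DegenerateCheck
{
    // Area of a 2D triangle, taken from the cross product of two edge vectors.
    public static float TriangleArea(
        (float X, float Y) a, (float X, float Y) b, (float X, float Y) c)
    {
        float cross = (b.X - a.X) * (c.Y - a.Y) - (c.X - a.X) * (b.Y - a.Y);
        return Math.Abs(cross) * 0.5f;
    }
}
```

Whatever shared position the phantom vertices happen to hold, both edge vectors are zero, so the area comes out as zero.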

Further to these single phantom quads for each TMPro Text component, I noticed a greater number of phantom quads when changing the text content to something shorter. For example, replacing 'new text' with just 'a' would result in the same vertex buffer length, but only the first 6 vertices had 'real' data.

Although I didn't carry out extensive tests here, it should be noted that quitting and re-opening Unity 'resolved' the vertex buffer length in the case above, so the single 'a' text went from 48 vertices back to 12 (not 6 as one might expect, due to the constant overhead of the single unused quad). I imagine that reloading the scene would have the same effect, and maybe there are other cases where TMPro/UGUI would resize the vertex buffer.

Interestingly, if I create a gameObject and then add the TextMeshPro Text component to it, I do not observe the phantom quad. But upon re-opening the project, the same text component has the phantom quad again.

So yeah TLDR
TMPro/UGUI appears to add a phantom quad of 6 vertices to the vertex buffer and has some issues with not resizing vertex buffers, but that might be more by design.

Impact consideration
In the grand scheme of things, an additional quad when you might be rendering tens, hundreds or even thousands of characters in a single Text component isn't anything to be concerned about. However, I'd been exploring using TMP to render UI icons, so each component might only have a single character, and there it would double the number of quads. Even here it probably isn't terrible, as we are still dealing with low numbers of phantom quads, but at the same time I don't think it's completely impossible to end up in a situation where this behavior could impact performance, especially on lower-end devices.

With respect to not resizing the vertex buffers, I suspect this might be by design, where either TMPro or Unity UGUI elects not to resize the component's vertex buffers for performance reasons, so that they can simply be reused if the length of the text increases again.

However, it is a bit of a concern, especially if you re-use text components whose content length can vary wildly, as you are likely to have a higher than expected overhead. I'm also wondering about placeholder text and whether it's better to just put in the minimum required, such as a single character, before building the project. In many cases placeholder text is short, but on occasion I have used very long Lorem Ipsum strings for testing, and I think I'll change that in the future.

Questions
Is this single phantom quad a bug or is it by design for some reason?

Is failing to resize a text component's vertex buffer a bug or by design?
If it's by design, how does it work exactly?
If it never resizes during runtime, could we perhaps get a method to force resizing, for when developers know they want to gain long-term performance over the initial cost of resizing it?

Stephan_B · Jun 8, 2021

Noisecrime said: ↑

Is this single phantom quad a bug or is it by design for some reason?

This is by design and for performance reasons.

When a text object is first created, the arrays allocated for the geometry will match the text object exactly. As the text grows, these arrays will be allocated in blocks. This is done to minimize the # of times these arrays are resized.

Since all characters are contained within a quad, the Triangles, Normals and Tangents are uploaded only once, when these buffers are allocated / resized. This is also a performance optimization, as uploading a new triangle structure is very costly.

As the text changes and given we have already uploaded our triangles, normals and tangents, we only need to update the vertices, UVs and vertex colors. All unused vertices are degenerated which is something GPUs are designed to optimize.
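A rough sketch of the allocation pattern described above, in plain C#. This is not TMP's actual code: the `QuadBufferSketch` class, its one-quad-per-character simplification, and the block size passed to the constructor are all assumptions made for illustration.

```csharp
using System;

// Illustrative sketch only; names and block size are assumptions, not TMP internals.
public sealed class QuadBufferSketch
{
    private const int VertsPerQuad = 4;
    private readonly int blockSizeInQuads; // assumed growth step
    private (float X, float Y)[] positions = Array.Empty<(float X, float Y)>();

    public QuadBufferSketch(int blockSizeInQuads)
    {
        if (blockSizeInQuads <= 0) throw new ArgumentOutOfRangeException(nameof(blockSizeInQuads));
        this.blockSizeInQuads = blockSizeInQuads;
    }

    public int CapacityInQuads => positions.Length / VertsPerQuad;

    // Counts the expensive events: each one stands for re-uploading triangles, normals and tangents.
    public int Reallocations { get; private set; }

    public void SetText(string text)
    {
        int needed = text.Length; // simplification: one quad per character
        if (needed > CapacityInQuads)
        {
            // The first allocation fits exactly; later growth rounds up to whole blocks.
            int newCapacity = CapacityInQuads == 0
                ? needed
                : (needed + blockSizeInQuads - 1) / blockSizeInQuads * blockSizeInQuads;
            positions = new (float X, float Y)[newCapacity * VertsPerQuad];
            Reallocations++;
        }

        // The cheap per-change work: rewrite positions only, never shrink the array.
        for (int quad = 0; quad < CapacityInQuads; quad++)
        {
            for (int corner = 0; corner < VertsPerQuad; corner++)
            {
                int i = quad * VertsPerQuad + corner;
                // Used quads get four distinct corners; unused quads collapse to a single point.
                positions[i] = quad < needed
                    ? ((float)(quad + (corner & 1)), (float)(corner >> 1))
                    : (0f, 0f);
            }
        }
    }

    public bool IsDegenerate(int quad)
    {
        var first = positions[quad * VertsPerQuad];
        for (int corner = 1; corner < VertsPerQuad; corner++)
        {
            if (positions[quad * VertsPerQuad + corner] != first) return false;
        }
        return true;
    }
}
```

With an assumed block size of 8, setting "new text" and then "a" costs a single allocation of 8 quads, and afterwards only quad 0 holds real corners while the other seven are collapsed. Setting "a longer string" (15 characters) after that triggers a second allocation, rounded up to 16 quads.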

Noisecrime said: ↑

Is failing to resize a text components vertex buffer a bug or by design?

Having to resize these geometry arrays every time a character is added or removed, plus re-uploading the triangles, normals and tangents, would be very costly. Again, it is much more efficient to upload these once and then simply change vertex positions, UVs and colors, and to degenerate the rest.

Noisecrime said: ↑

If it never resizes during runtime could we perhaps get a method to force resizing for when developers know they want to gain performance long term over the initial cost of resizing it.

GPUs are already optimized to deal with degenerate vertices. Resizing arrays is very expensive, and it is more efficient to upload more degenerates than to resize those arrays. Of course this is within reason. For instance, if a text object was to go from 8 characters to 10,000 and then back to 8 characters, it would then make sense to resize those arrays on some devices. TMP used to do some of that automatically: if these arrays ended up with an excess of 256 characters over what was needed, it would reduce their size. But even that turned out to be an issue for some users, who would set the text to some large block of text and then to null a few frames later, causing this resizing. So I ended up disabling this because it was causing more issues than anything else.

/// <summary>
/// Determines if the data structures allocated to contain the geometry of the text object will be reduced in size if the number of characters required to display the text is reduced by more than 256 characters.
/// This reduction has the benefit of reducing the amount of vertex data being submitted to the graphic device but results in GC when it occurs.
/// </summary>
public bool vertexBufferAutoSizeReduction

By default this is false, so TMP no longer tries to reduce these allocations. It can be manually enabled, in which case if the allocated arrays exceed the required size by more than 256 characters, these data structures will be resized and re-uploaded to the device. There is no control over this 256 character threshold, as I think it would be very cost prohibitive to do that in smaller increments.
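For anyone wanting to opt back in, here is a minimal sketch of enabling the flag from a script. `vertexBufferAutoSizeReduction` is the property shown above; the `EnableBufferReduction` component name and the `label` field are just placeholders for this example.

```csharp
using TMPro;
using UnityEngine;

// Placeholder component for illustration; attach it next to a TextMeshPro text component.
public class EnableBufferReduction : MonoBehaviour
{
    [SerializeField] private TMP_Text label; // assumed to be assigned in the Inspector, or found below

    private void Awake()
    {
        if (label == null) label = GetComponent<TMP_Text>();
        if (label == null) return; // nothing to configure on this object

        // Opt in: allow TMP to shrink the geometry arrays once they exceed
        // the required size by more than 256 characters.
        label.vertexBufferAutoSizeReduction = true;
    }
}
```

This only makes sense on the few components whose text length swings widely, since each reduction reallocates the arrays and produces garbage for the GC.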

GPUs are very efficient at handling large amounts of geometry and are even more efficient with degenerates. Even on old devices, there is no real measurable performance difference between no degenerates and thousands of them. They get ignored. So the only cost is the upload cost where, like I said above, if your text goes from 8 to 10,000 characters and then back to 8, and then gets updated every frame, then yes, you would likely want to reduce the size of those arrays.

Noisecrime · Jun 8, 2021

Stephan_B said: ↑

This is by design and for performance reasons.

Thank you for the detailed explanation and reference to AutoSizeReduction, that's pretty much what I was hoping for.

As for performance, I was more concerned about the CPU side and the overhead it might cause, but it sounds like for most cases you have already dealt with this issue in the past, and while it might have a performance impact, the current approach is still more efficient in general. I'm surprised that even with quite a high autoSizeReduction value of 256 characters it could still be more performant to just leave the vertex buffers alone.

Regardless, having the option to resize the vertex buffers should be enough for developers who want to get into it. The only thing I would have preferred is a method that a developer could call to simply resize the vertex buffers to fit the current text, instead of relying on exceeding a specific value. Mostly because I feel it's a rather specific requirement that would only be used occasionally and only on one or two specific components.

I'm still not sure your explanation covers the single additional quad. Maybe I failed to explain adequately, but in my testing I'm always seeing an additional quad (2 degenerate triangles) at the end of every text component, be it the default 'new text' from adding the gameobject, or adding a new component (which defaults to empty) to an existing gameobject and typing in a single character. Though for the latter, the additional quad only appears when reloading the scene (it may appear earlier, but in my test a single character had a single quad, and on reloading the scene it suddenly had two). It feels like it's done on purpose, but I'm at a loss as to why.
