Can the ECS/DOTS be the solution to my framerate problem? If so, how to approach it? Where to start?

konstantin_lozev · Apr 17, 2019

Hi all,
I am new to game development, but I like it very much as a hobby. I like learning by doing and solving hurdles on the way.
I am prototyping a VR game for the Oculus Go, which means a hard requirement of at least 60 fps in stereo on very modest hardware. Here is a very early version of my prototype from August last year
I stopped developing for 4-5 months, but I have been back on the project for the past 2-3 weeks. My main goal now is to optimise the game to run on very large grids.
As you can see, the game consists of a large number of "nodes" and "links" between them that are procedurally generated each time you start a new game. You will find a summary of the game logic at the end of this post.
I have brought the draw calls down to a very low number, because I use Graphics.DrawMeshInstanced for the "links" and Renderer.SetPropertyBlock for changing the color of the "nodes". The Text Mesh Pro elements also batch well, apparently.
For a grid of 1093 "nodes" and 3-4 times more "links", the Stats window shows 22 batches and 4782 saved by batching: https://i.imgur.com/NRROtd2.jpg

The Stats window shows 229 fps, but that is on my PC.
On the Oculus Go, things look OK at 60 fps initially, while the Text Mesh Pro components are not enabled:
https://i.imgur.com/FFfRWNg.jpg

However, once the Text Mesh Pro elements show up too, the framerate drops to around 50 fps (on a monitor that's not bad, but in VR it creates stuttering):
https://i.imgur.com/8dtMrhT.jpg

It is even worse when I rotate the camera around the grid. At 40 fps it is simply unacceptable as a VR experience:
https://i.imgur.com/zmjSWmM.jpg

I guess that the framerate drop is due to many GameObjects being drawn at the same time (that is, 1093 icospheres and 1093 GameObjects with a Text Mesh Pro component). I think it gets even worse when I rotate the camera around the grid, because I have a method that goes through the array of transforms of all GameObjects that have a Text Mesh Pro component and orients each one towards the camera. It looks like this:

Code (CSharp):

void OrientNodes(){
    Vector3 camPos = camContainer.position;
    Vector3 camUp = camContainer.up;
    // Turn every label "center" towards the camera.
    for (int i = 0; i < allLvlArrayNum; i++) {
        AllLvlArrayCenter[i].LookAt(camPos, camUp);
    }
}

So I think I am out of ideas as to how to increase the performance with the classic Unity framework. Am I right in my assumption?
I read a little about ECS/DOTS and watched this introduction by Mike Geig, but did not understand all of it. So, having looked at this scenario and the summary of the game logic below, do you think ECS/DOTS could be a solution for my performance problems? If so, where can I start? How do I even plan which part will be in ECS/DOTS and how to combine the two parts? Does ECS/DOTS even support sphere colliders, and does it interact with Unity's physics raycasting (I use raycasts for selecting and operating on the "nodes")? I guess there is no "one-click" conversion to DOTS (I wish there was), but I would prefer to get the full (potential?) benefits without too much change.

Thanks in advance to all. I realise it is a long post, but I needed to explain how things work in order to give you a better idea what issues I am facing.

Summary of the game logic:
I wrote my own very simple instanced shaders that support an instanced color, a simple diffuse effect, vertex collapse (for disabling the rendering when I change the alpha of the material to 0) and, recently (as an option in some of the shaders), dimming based on the distance from the camera.
The "links" between the nodes are not GameObjects; I render them from the script in batches of 1024 using Graphics.DrawMeshInstanced.
The nodes are simple icospheres that I created in Blender. The nodes also use an instanced material with instanced color, simple diffuse effect, etc. The "nodes", however, are GameObjects that I instantiate at the start of each game, because they have a sphere collider that enables me to interact with them (select, open, mark as mine, etc.). Each "node" also has an empty child object ("center") at the center of each node and a Text Mesh Pro element as a child object of the "center" that is offset towards the front of the icosphere. That enables me to rotate the Text Mesh Pro element to always face the camera when you are moving around the grid, while the parent icosphere does not rotate.
I am running all the game logic on a main GameManager.cs script and apart from the procedural generation of the grid, which happens only at the start of the game, the code is not that heavy, I think. The only heavy part might be the constant reorientation of all "nodes" towards the camera when you rotate the camera around the grid.
While creating the grid, I keep a number of 1- or 2-dimensional arrays. Most of the arrays consist of integers that describe the structure of the grid, i.e. how many "links" each "node" has, which other "nodes" each "node" links to, how many adjacent "mine-nodes" each "node" has and then, from the perspective of each "link", which 2 "nodes" that "link" connects. I keep an array with the Vector3 position of each "node". For the instanced rendering of the "links", I keep 2 arrays of Vector3 (for the position and local scale) and 1 array of Quaternions (for the rotation). I also have some bool arrays that contain the state of each node (is it selected or not, is it a mine or not, is it already revealed or not, does it show text or not, etc.). I also have arrays of the transforms of the "nodes", of the "centers" and an array of the Text Mesh Pro elements.
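
To make the batching of the "links" concrete, here is a minimal sketch of splitting per-instance data into chunks for instanced draw calls. It is plain C# with no Unity dependency, and `InstanceBatching` and `Split` are hypothetical names, not taken from the project. One caveat: the post mentions batches of 1024, while Unity's documentation gives 1023 as the maximum number of instances per Graphics.DrawMeshInstanced call, so the sketch assumes 1023 as its default.

```csharp
using System;
using System.Collections.Generic;

public static class InstanceBatching
{
    // Assumption: 1023 is the documented per-call cap of Graphics.DrawMeshInstanced.
    public const int MaxInstancesPerCall = 1023;

    // Splits `items` into consecutive chunks of at most `maxPerBatch` elements.
    public static List<T[]> Split<T>(IReadOnlyList<T> items, int maxPerBatch = MaxInstancesPerCall)
    {
        if (maxPerBatch <= 0) throw new ArgumentOutOfRangeException(nameof(maxPerBatch));

        var batches = new List<T[]>();
        for (int start = 0; start < items.Count; start += maxPerBatch)
        {
            // The last chunk may be shorter than maxPerBatch.
            int count = Math.Min(maxPerBatch, items.Count - start);
            var batch = new T[count];
            for (int i = 0; i < count; i++)
            {
                batch[i] = items[start + i];
            }
            batches.Add(batch);
        }
        return batches;
    }
}
```

In Unity, each returned chunk of Matrix4x4 values would then be passed to one Graphics.DrawMeshInstanced call.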

FrankvHoof · Apr 17, 2019

Depending on how your game is set up, you might want to just stick to MonoBehaviours, but run a job for specific parts (like your billboarding), to prevent you from having to re-write a lot of your code to support ECS.
Take a look at e.g. https://github.com/stella3d/job-system-cookbook for some examples of how to use jobs

konstantin_lozev · Apr 17, 2019

FrankvHoof said: ↑

Depending on how your game is set up, you might want to just stick to MonoBehaviours, but run a job for specific parts (like your billboarding), to prevent you from having to re-write a lot of your code to support ECS.
Take a look at e.g. https://github.com/stella3d/job-system-cookbook for some examples of how to use jobs

Thanks. I have read that the jobs system scales with core counts and on the Snapdragon 821 (found in Oculus Go) there are 4 cores on the CPU, so that might already be enough.
What difference in performance can I generally expect between going only with the jobs system vs pure ECS/DOTS (that might sound like a stupid question...)?
If I go only with the Jobs system, would it be easier or harder to move to ECS/DOTS afterwards if I want to?
I am afraid that if I only implement the billboarding with Unity's job system, I will still be stuck at around 50 fps (the framerate when the OrientNodes() method is not called).
Also, if I want to go the ECS/DOTS route, would I have to rewrite a lot? I have no idea how much that would involve, but I am also tentatively looking at it as a learning experience and a challenge to learn something new (it's all a hobby for me).
I am mostly worried about whether ECS has straightforward support for colliders and Unity's raycasting, because the gameplay depends on selecting and marking things with the Go controller, which uses a Physics.Raycast to determine which node you are operating on.

NoDumbQuestion · Apr 17, 2019

It's not about how much you have to rewrite, it's about the time it takes to shift your code/solution thinking from OOP to DOTS (you are looking at a 2-8 week learning curve here).

And the new ECS physics supports raycasts and colliders. Also, a problem you might bump into here is a shared material instance not using GPU indirect instancing, plus runtime meshes, while you could just use a billboard instead of a pipe mesh (it is just a stick connection).

Antypodish · Apr 17, 2019

Also, instead of spheres, you can use billboards. LOD may be your friend here.

You can apply mesh instancing with classic OOP as well. See how that affects performance.

As already mentioned, moving to ECS may cost precious time, a lot of it, that you could otherwise spend on game dev. But yes, you could potentially gain quite a bit of performance, just on rendering itself.

konstantin_lozev · Apr 17, 2019

NoDumbQuestion said: ↑

Also, a problem you might bump into here is a shared material instance not using GPU indirect instancing, plus runtime meshes, while you could just use a billboard instead of a pipe mesh (it is just a stick connection).

Pardon me, but I did not get that part at all...
Thanks a lot for answering that ECS supports colliders and physics raycasting. Where can I see an example of colliders and raycasts implemented in ECS? I would like to start with a simple example, like a sphere with a sphere collider where, upon a mouse click, I check if the raycast intersects the sphere collider and change that sphere's color with Renderer.SetPropertyBlock. If I know what that looks like, I think I would have an idea of how much it would take me to move to ECS.
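
This is not an ECS example, but the geometric test behind "does this ray hit this sphere" is the same whichever physics backend runs it. Below is a minimal sketch in plain C# using System.Numerics rather than UnityEngine types. `RayPicking` and `PickNearestSphere` are hypothetical names, and the single shared radius is an assumption made to keep it short.

```csharp
using System;
using System.Numerics;

public static class RayPicking
{
    // Returns the index of the closest sphere hit by the ray, or -1 if nothing is hit.
    // `direction` must be normalized; all spheres share one radius (an assumption).
    public static int PickNearestSphere(Vector3 origin, Vector3 direction, Vector3[] centers, float radius)
    {
        int best = -1;
        float bestT = float.MaxValue;
        float radiusSq = radius * radius;

        for (int i = 0; i < centers.Length; i++)
        {
            Vector3 toCenter = centers[i] - origin;
            // Length of the projection of the center onto the ray.
            float along = Vector3.Dot(toCenter, direction);
            // Squared distance from the sphere center to the infinite ray line.
            float perpSq = toCenter.LengthSquared() - along * along;
            if (perpSq > radiusSq) continue; // the line misses the sphere

            float half = (float)Math.Sqrt(radiusSq - perpSq);
            float t = along - half;           // distance to the entry point
            if (t < 0f) t = along + half;     // origin is inside the sphere: use the exit point
            if (t < 0f) continue;             // the sphere is entirely behind the ray origin

            if (t < bestT)
            {
                bestT = t;
                best = i;
            }
        }
        return best;
    }
}
```

The returned index could then be used to look up the node's state arrays and change its color.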

NoDumbQuestion · Apr 17, 2019

https://github.com/Unity-Technologies/EntityComponentSystemSamples/tree/master/UnityPhysicsExamples

konstantin_lozev · Apr 17, 2019

Antypodish said: ↑

Also, instead of spheres, you can use billboards. LOD may be your friend here.

You can apply mesh instancing with classic OOP as well. See how that affects performance.

As already mentioned, moving to ECS may cost precious time, a lot of it, that you could otherwise spend on game dev. But yes, you could potentially gain quite a bit of performance, just on rendering itself.

I do not think the polycount is an issue here. The game works at 60 fps when the icospheres are on the scene only. It slows down when the Text Mesh Pro elements are enabled. As I wrote, I do apply mesh instancing with Graphics.DrawMeshInstanced for all the "links". They are not GameObjects. I cannot apply the same approach to the spheres since I have sphere colliders on them.

I actually did come up with a solution that does not go under 60 fps: using a texture atlas with instancing for the different combinations of colors and numbers 0-9
https://i.imgur.com/eDHInWt.jpg

However, while performing at 60 fps, the whole sphere has to face you all the time, which does not let the light travel along its surface as you are rotating around it and ultimately makes the sphere look flat due to that, especially in VR.
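
The atlas lookup described above boils down to picking one tile out of a grid. Here is a minimal sketch, assuming a hypothetical layout with one column per digit (0-9) and one row per color, with the UV origin at the bottom-left. `NumberAtlas`, `TileFor` and the default of 4 color rows are made up for illustration. The returned values are in scale-then-offset order, the same order Unity uses for a texture's tiling/offset vector.

```csharp
using System;

public static class NumberAtlas
{
    // Hypothetical layout: `columns` digits per row, one row per color, row 0 at the bottom.
    // Returns the UV scale and offset that select the tile for (digit, colorRow).
    public static (float scaleX, float scaleY, float offsetX, float offsetY) TileFor(
        int digit, int colorRow, int columns = 10, int rows = 4)
    {
        if (digit < 0 || digit >= columns) throw new ArgumentOutOfRangeException(nameof(digit));
        if (colorRow < 0 || colorRow >= rows) throw new ArgumentOutOfRangeException(nameof(colorRow));

        float scaleX = 1f / columns; // width of one tile in UV space
        float scaleY = 1f / rows;    // height of one tile in UV space
        return (scaleX, scaleY, digit * scaleX, colorRow * scaleY);
    }
}
```

With instancing, one such vector per node could be uploaded as a per-instance shader property, so that all nodes still share a single material.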

konstantin_lozev · Apr 17, 2019

NoDumbQuestion said: ↑

https://github.com/Unity-Technologies/EntityComponentSystemSamples/tree/master/UnityPhysicsExamples

Cool, thanks a lot, I will dig into those. One more question on the abilities of ECS: would I be able to change the text of a Text Mesh Pro element? Or is Text Mesh Pro not convertible to an entity at the moment?

Antypodish · Apr 17, 2019

Can you test whether you gain FPS during camera rotation when you hide half of the text meshes?

Antypodish · Apr 17, 2019

konstantin_lozev said: ↑

One more question on the abilities of ECS: would I be able to change the text of a Text Mesh Pro element? Or is Text Mesh Pro not convertible to an entity at the moment?

You would need to keep it as ECS Hybrid or just on the OOP side atm.

NoDumbQuestion · Apr 17, 2019

A GameObject is convertible to an ECS entity. And for the number on the cube, I would bake it to a sprite and then show it as a billboard that always faces the camera. If you understand how shaders work, you could also make the number always visible through the object, like in the AmplifyShaderEditor sample.

konstantin_lozev · Apr 17, 2019

Antypodish said: ↑

Can you test whether you gain FPS during camera rotation when you hide half of the text meshes?

You know what, your comment gave me an idea to dynamically disable each "center" GameObject (the parent of each Text Mesh Pro element) if it is further than 6 units from the camera.
Here is the code:

Code (CSharp):

void OrientNodes(){
    Vector3 camPos = camContainer.position;
    Vector3 camUp = camContainer.up;
    for (int i = 0; i < allLvlArrayNum; i++) {
        if (isShowingText[i]) {
            if (Vector3.Distance(NodePositions[i], camPos) > fogLength) {
                // Node just moved beyond the fog distance: hide its label once.
                if (!isBehindFog[i]) {
                    isBehindFog[i] = true;
                    if (textFieldsOn) {
                        AllLvlArrayCenterGO[i].SetActive(false);
                    }
                }
            } else {
                if (isBehindFog[i]) {
                    // Node just came back within the fog distance: show its label again.
                    isBehindFog[i] = false;
                    if (textFieldsOn) {
                        AllLvlArrayCenterGO[i].SetActive(true);
                    }
                } else {
                    // Node is in range: turn the label (or the whole node) towards the camera.
                    if (textFieldsOn) {
                        AllLvlArrayCenter[i].LookAt(camPos, camUp);
                    } else {
                        AllLvlArray[i].transform.LookAt(camPos, camUp);
                    }
                }
            }
        }
    }
}

Now I get 60 fps when I am not rotating around the grid:
https://i.imgur.com/aqWKJKN.jpg

I still drop to the mid-50s fps when rotating, but it's a big improvement:
https://i.imgur.com/Si84N3g.jpg

Thanks again
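
A possible refinement of the distance check above, sketched in plain C# with System.Numerics rather than UnityEngine types: compare squared distances so that no square root is needed per node, and report only the nodes whose state flipped, so SetActive is called on transitions only (which the posted code already does through isBehindFog). `FogCulling` and `UpdateFlags` are hypothetical names. In Unity, the counterpart of Vector3.DistanceSquared(a, b) is (a - b).sqrMagnitude.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class FogCulling
{
    // Updates `isBehindFog` in place and returns the indices whose state changed
    // during this call, so the caller only has to toggle those objects.
    public static List<int> UpdateFlags(Vector3[] nodePositions, Vector3 camPos, float fogLength, bool[] isBehindFog)
    {
        float fogLengthSq = fogLength * fogLength; // compare squared values: no sqrt per node
        var changed = new List<int>();

        for (int i = 0; i < nodePositions.Length; i++)
        {
            bool behind = Vector3.DistanceSquared(nodePositions[i], camPos) > fogLengthSq;
            if (behind != isBehindFog[i])
            {
                isBehindFog[i] = behind;
                changed.Add(i);
            }
        }
        return changed;
    }
}
```

Whether this is measurably faster on the Go would need profiling. With about a thousand nodes the saving from skipping the square root is likely small compared with the cost of the text components themselves.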

Antypodish · Apr 17, 2019

Good.
But now, out of interest, I would do a similar test with half of the spheres and links disabled (or those beyond some distance n).
If you get more than 60 FPS when rotating, then at least you will know that you need to cut down on something.
If that is the case, LOD could help. Otherwise, you will need to look at other forms of improvement.

konstantin_lozev · Apr 17, 2019

Antypodish said: ↑

Good.
But now, out of interest, I would do a similar test with half of the spheres and links disabled (or those beyond some distance n).
If you get more than 60 FPS when rotating, then at least you will know that you need to cut down on something.
If that is the case, LOD could help. Otherwise, you will need to look at other forms of improvement.

I simply chose to disable the nodes at the same distance of 6 units and I did get almost to 60 fps when rotating around the grid:

Code (CSharp):

void OrientNodes(){
    Vector3 camPos = camContainer.position;
    Vector3 camUp = camContainer.up;
    for (int i = 0; i < allLvlArrayNum; i++) {
        if (isShowingText[i]) {
            if (Vector3.Distance(NodePositions[i], camPos) > fogLength) {
                // Node just moved beyond the fog distance: hide the node and its label once.
                if (!isBehindFog[i]) {
                    isBehindFog[i] = true;
                    if (textFieldsOn) {
                        AllLvlArray[i].SetActive(false);
                        AllLvlArrayCenterGO[i].SetActive(false);
                    }
                }
            } else {
                if (isBehindFog[i]) {
                    // Node just came back within the fog distance: show the label and the node again.
                    isBehindFog[i] = false;
                    if (textFieldsOn) {
                        AllLvlArrayCenterGO[i].SetActive(true);
                        AllLvlArray[i].SetActive(true);
                    }
                } else {
                    // Node is in range: turn the label (or the whole node) towards the camera.
                    if (textFieldsOn) {
                        AllLvlArrayCenter[i].LookAt(camPos, camUp);
                    } else {
                        AllLvlArray[i].transform.LookAt(camPos, camUp);
                    }
                }
            }
        }
    }
}

and I got this
https://i.imgur.com/V551TQw.jpg

This is only for testing, however; it breaks the gameplay quite a bit.
How would LOD work? Do I simply have 2 renderers on the same object and switch them on/off alternately depending on the distance from the camera? Or is this something built in?
As I said, I am new to game development, so all this is quite new to me.
Also, I just read about frustum culling. Is it on by default? When you are in the middle of the grid, it might be useful not to render nodes that are behind your back.
Sorry, I know we are moving away from the topic of this section of the forum...
For the time being I will not switch to ECS, but I will have a look at it in a few months; hopefully by then there will be some progress in allowing Text Mesh Pro to be integrated with ECS.

Antypodish · Apr 17, 2019

LOD functionality is available in Unity as a built-in feature, but you can also code your own.
LOD simply swaps between objects of higher poly count, to lower poly count, based on distance.
By default, camera culling does not render anything outside the frustum. That includes everything behind the camera as well.
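
For a hand-coded LOD, the core decision is just mapping a distance to a detail level. Here is a minimal sketch in plain C#, where `LodSelector`, `LevelFor` and the threshold values in the usage note are hypothetical. As far as I know, Unity's built-in LODGroup component picks the level from the object's relative height on screen rather than from raw distance. The sketch follows the distance-based description given here.

```csharp
using System;

public static class LodSelector
{
    // `switchDistances` must be sorted in ascending order.
    // Returns 0 for the most detailed mesh and switchDistances.Length
    // for the coarsest one (used beyond the last threshold).
    public static int LevelFor(float distance, float[] switchDistances)
    {
        int level = 0;
        while (level < switchDistances.Length && distance >= switchDistances[level])
        {
            level++;
        }
        return level;
    }
}
```

For example, with thresholds of 3 and 6 units, a node 4 units away gets level 1. With instanced rendering, nodes could be bucketed per level each frame and drawn with one set of instanced calls per level, each using its own mesh.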

konstantin_lozev · Apr 17, 2019

Antypodish said: ↑

LOD simply swaps between objects of higher poly count, to lower poly count, based on distance.

Would it not be more efficient to have 2 renderers on the same GameObject and to swap them?

Antypodish · Apr 17, 2019

Well, more precisely, you swap meshes.

konstantin_lozev · Apr 17, 2019

Antypodish said: ↑

Well, more precisely, you swap meshes.

Ah, I think I overestimated the Oculus Go hardware. Under Oculus's own guidelines you have to stay under 100,000 vertices https://developer.oculus.com/documentation/unity/latest/concepts/unity-perf/#unity-perf-targets and with the 1093-node grid I am at 332,600 vertices
https://i.imgur.com/NRROtd2.jpg

I will definitely be looking into LOD then... ECS won't help with that.

Antypodish · Apr 17, 2019

konstantin_lozev said: ↑

I will definitely be looking into LOD then... ECS won't help with that.

Potentially it could. But generally it is good to keep the poly count as low as possible, irrespective of the tech you are using.

Edit: if you can, don't use a text mesh object on your nodes. Where possible, use a texture with a number on the node itself. That cuts one object per node.

konstantin_lozev · Apr 17, 2019

Antypodish said: ↑

Edit: if you can, don't use a text mesh object on your nodes. Where possible, use a texture with a number on the node itself. That cuts one object per node.

As I showed above, I also implemented the most efficient way, which is with a texture atlas, but it did not look good, so I went back to Text Mesh Pro. I really like the SDF shader of TMP. It gives a great, crisp image at any distance, which is quite important for smooth gameplay (things are already quite fuzzy in VR, so any additional aliasing/fuzziness only compounds that). I will try a high-res texture just to be sure.
Thanks so much for the help!

konstantin_lozev · Apr 21, 2019

Antypodish said: ↑

Potentially it could. But generally it is good to keep the poly count as low as possible, irrespective of the tech you are using.

Edit: if you can, don't use a text mesh object on your nodes. Where possible, use a texture with a number on the node itself. That cuts one object per node.

I did some further work by moving the rendering of the nodes to batched instanced rendering with Graphics.DrawMeshInstanced, leaving only the sphere collider as a GameObject with the Text Mesh Pro element as a child. After this change, when the Text Mesh Pro element is disabled, my performance jumped to more than 1000 FPS (which means plenty of room for the Oculus Go):
https://i.imgur.com/2mrhDNP.jpg

This is also reflected in the Profiler:
https://i.imgur.com/U64SjYT.jpg

I have not used the Profiler up until now, so I don't know if those gaps in the Main CPU thread are normal (I think they are normal, since it says Editor overhead, so in the compiled apk there will not be that overhead), but it's obvious that my main script takes less than 0.2 ms.
However, when the Text Mesh Pro elements are all on, the CPU time jumps to 2 ms, more than half of which is taken by the 1092 instances of the Text Mesh Pro script
https://i.imgur.com/zblrCEg.jpg

So, to conclude, with my latest optimisations the only bottleneck is the Text Mesh Pro element, not the number of vertices.
From what I read further about the ECS, there is no way to use even hybrid ECS for the Text Mesh Pro elements.
Please correct me if I am wrong.
So I am left with either disabling the Text Mesh Pro element at a certain distance, or using the texture atlas, neither of which is ideal...

Antypodish · Apr 22, 2019

That sounds about right.
A hybrid approach for Text Mesh Pro won't give you any of the benefits of ECS.
Hence you need to consider the alternative already mentioned.
Since you only have numbers on the nodes, you can use your own quad mesh (or whatever shape fits) and apply a number texture to it to see how it performs. You can use transparency or cutoff on the texture.

konstantin_lozev · Apr 22, 2019

Antypodish said: ↑

That sounds about right.
A hybrid approach for Text Mesh Pro won't give you any of the benefits of ECS.
Hence you need to consider the alternative already mentioned.
Since you only have numbers on the nodes, you can use your own quad mesh (or whatever shape fits) and apply a number texture to it to see how it performs. You can use transparency or cutoff on the texture.

FYI, my issues with the Text Mesh Pro script are apparently fixed in later versions https://forum.unity.com/threads/man...t-is-a-possible-solution.665614/#post-4455241
I will try the new Text Mesh Pro package.

GilCat · Apr 22, 2019

I'm also facing a similar challenge.
My bottleneck is in UGUI.Rendering.EmitWorldScreenspaceCameraGeometry (only if the GameObject changes transform values).
In my case I'm rendering big graphs (+10 nodes sometimes reaching 50k) where I have text labels on each node. Everything would be fine if my labels didn't move and my text wasn't dynamic, but they move too often.
My take on this is to have a culling system with a threshold for how many of them I allow to be active at the same time, while having them all synced into DOTS (transforms and rects). This way I can sort them very fast using Burst-compiled jobs and cull the objects furthest from the camera.
It would be great if one day we had UI Text in DOTS.
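
The "keep at most N labels active" idea can be sketched as picking the N labels nearest to the camera. The version below is plain single-threaded C# with LINQ and System.Numerics, not a Burst-compiled job as described in the post, and `LabelBudget` and `NearestIndices` are hypothetical names. It only illustrates the selection logic.

```csharp
using System;
using System.Linq;
using System.Numerics;

public static class LabelBudget
{
    // Returns the indices of the `maxActive` labels closest to the camera, nearest first.
    // Every index that is not returned would have its label disabled.
    public static int[] NearestIndices(Vector3[] labelPositions, Vector3 camPos, int maxActive)
    {
        return Enumerable.Range(0, labelPositions.Length)
            .OrderBy(i => Vector3.DistanceSquared(labelPositions[i], camPos)) // squared distance is enough for ordering
            .Take(Math.Max(0, maxActive))
            .ToArray();
    }
}
```

A full sort is more work than strictly needed when only the N nearest labels matter, so for tens of thousands of labels a partial selection or a parallel job would be the natural next step.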

konstantin_lozev · Apr 23, 2019

GilCat said: ↑

I'm also facing a similar challenge.
My bottleneck is in UGUI.Rendering.EmitWorldScreenspaceCameraGeometry (only if the GameObject changes transform values).
In my case I'm rendering big graphs (+10 nodes sometimes reaching 50k) where I have text labels on each node. Everything would be fine if my labels didn't move and my text wasn't dynamic, but they move too often.
My take on this is to have a culling system with a threshold for how many of them I allow to be active at the same time, while having them all synced into DOTS (transforms and rects). This way I can sort them very fast using Burst-compiled jobs and cull the objects furthest from the camera.
It would be great if one day we had UI Text in DOTS.

There is one line that I needed to disable in the TMPro_Private.cs and it solved my issue. Now I get faster framerate with culling off than with culling on (due to the fact that I have to check the distance to the camera each frame)!
Have a look at this thread.
https://forum.unity.com/threads/man...n-a-scene-what-is-a-possible-solution.665614/

GilCat · Apr 24, 2019

konstantin_lozev said: ↑

There is one line that I needed to disable in the TMPro_Private.cs and it solved my issue. Now I get faster framerate with culling off than with culling on (due to the fact that I have to check the distance to the camera each frame)!
Have a look at this thread.
https://forum.unity.com/threads/man...n-a-scene-what-is-a-possible-solution.665614/

I've tried that and it made no difference to the framerate. I've also tried the suggestions from that thread and, as I've stated before, at some point (>3k) UGUI.Rendering.EmitWorldScreenspaceCameraGeometry takes over everything.
Anyway, I think I've hit the limit of what hybrid text can achieve. I can draw 2k simultaneous texts just fine, but my graphs can have 40k, and culling is something I will need anyway because of overlapping text.
What now takes up the most CPU is the systems that sync the data between MonoBehaviours and DOTS (CopyTransformToGameObject kind of systems).
If your graphs are ~1k nodes it shouldn't be a problem.

konstantin_lozev · Apr 24, 2019

GilCat said: ↑

I've tried that and it made no difference to the framerate. I've also tried the suggestions from that thread and, as I've stated before, at some point (>3k) UGUI.Rendering.EmitWorldScreenspaceCameraGeometry takes over everything.
Anyway, I think I've hit the limit of what hybrid text can achieve. I can draw 2k simultaneous texts just fine, but my graphs can have 40k, and culling is something I will need anyway because of overlapping text.
What now takes up the most CPU is the systems that sync the data between MonoBehaviours and DOTS (CopyTransformToGameObject kind of systems).
If your graphs are ~1k nodes it shouldn't be a problem.

Maybe a stupid question, but are you using a canvas or a mesh? I use the mesh option of TMP.

GilCat · Apr 24, 2019

I use a single World Canvas for all the Text.

GilCat · Apr 24, 2019

Oh, I do get even better performance if I use TextMeshPro (Mesh option) instead of TextMeshProUGUI.
Now I can draw 4k simultaneous text elements.

EDIT: And commenting out the line in TMPro_Private.cs makes just a little bit of difference.

konstantin_lozev · Apr 24, 2019

GilCat said: ↑

Oh, I do get even better performance if I use TextMeshPro (Mesh option) instead of TextMeshProUGUI.
Now I can draw 4k simultaneous text elements.

EDIT: And commenting out the line in TMPro_Private.cs makes just a little bit of difference.

Cool

Arowx · Apr 25, 2019

Have you tried using quads/sprites for the text data? You should be able to have thousands of those even on low-end hardware. And you could have a set of them on a circular background for distant nodes and a transparent background for close nodes when you draw the icospheres.

Have you considered setting the nodes out in a 3D grid pattern? This would make them much easier to understand and navigate, and would keep with the grid-square style of minesweeper.

You could also add distance fog and frustum culling (limit the viewing distance of the camera) and give the player some UI elements that let them know the overall state of the game.

konstantin_lozev · Apr 25, 2019

Arowx said: ↑

Have you tried using quads/sprites for the text data? You should be able to have thousands of those even on low-end hardware. And you could have a set of them on a circular background for distant nodes and a transparent background for close nodes when you draw the icospheres.

Yeah, LOD approaches were suggested above. The most performant approach is an instanced texture atlas that incorporates the numbers and colors in a grid, but the resulting image is not as clear as with TMP. I will have to think about how to implement LOD for both the icospheres and the TMP elements. The problem with implementing LOD on the icospheres is that I now use batched instanced rendering for them. I can still have two separate sets of batches depending on how far the node is from the camera, but I am not sure that would be more performant, since the "hiding" that I implement in the node's shader works by collapsing the vertices, which means they are still calculated even for the "hidden" nodes. For the TMP elements, I don't think there would be a big difference between that and quads with transparency (at least that's what the TMP dev is saying).

Arowx said: ↑

Have you considered setting the nodes out in a 3D grid pattern? This would make them much easier to understand and navigate, and would keep with the grid-square style of minesweeper.

I am actually specifically after the look of this random, procedurally generated grid. It is a bit ridiculous for >1000 nodes, but for a lower number of nodes it is quite entertaining to have the shape rotate in front of you in VR. If you have an Oculus Go, I can send you an apk to test.

Arowx said: ↑

You could also add distance fog and frustum culling (limit the viewing distance of the camera) and give the player some UI elements that let them know the overall state of the game.

I do have a setting that uses distance fog (I modified the basic shader slightly https://forum.unity.com/threads/tex...to-dim-based-on-distance.661549/#post-4430314). I also added culling, but unless you force very close fog and set the culling at the distance where the fog is >90%, the result is quite jarring.
