Mesh creation performance?

Discussion in 'General Graphics' started by z000z, Sep 14, 2017.

  1. z000z

    Joined: Dec 31, 2014 | Posts: 96
    I'm working on a system that dynamically loads data and has to create meshes for that data as it loads. Currently a background thread builds the vertices, normals, UVs, and triangle indices, structured like this:

    List<Vector3> verts;
    List<Vector3> normals;
    List<Vector2> uvs;
    List<int> indices;

    Once the background thread has finished building them, I create the Mesh on the main thread (unfortunately you can't create Unity objects on background threads). Currently I do that like this:

    var mesh = new Mesh();
    mesh.SetVertices(verts);
    mesh.SetNormals(normals);
    mesh.SetUVs(0,uvs);
    mesh.SetTriangles(indices, 0, false);

    I'm seeing some odd performance results from this in the profiler, though. Here are my worst-case profiler results for each of those lines:

    SetVertices - 2.04ms
    SetNormals - 2.65ms
    SetUVs - 5.03ms
    SetTriangles - 0.83ms

    You can assume each buffer is at the 65k maximum (65k vertices/normals/UVs). It strikes me as a little weird that setting the UVs takes about twice as long as the vertices or normals, especially since it should be the smallest amount of data (Vector2s instead of Vector3s). Each call also takes pretty long in general for just setting data.

    My objects are actually built out of multiple meshes, since they have more than 65k vertices. The worst case currently is an object that has 12 meshes and takes about 60ms to create all 12 (not all 12 are full 65k buffers; I have to create extra meshes to order the transparent parts). 60ms is obviously way longer than I have in a single frame. I can try to split the mesh creation over several frames to avoid the frame drops, but when a single mesh can take 10-11ms to create, even doing one per frame is going to hit the frame rate hard.

    So I'm hoping there's some faster way of building a mesh. I don't modify the mesh after initial creation, so these don't need to be dynamic, although from what I've found so far the only way to get a read-only mesh is at build time rather than at runtime.
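
The frame-splitting idea mentioned above can be sketched roughly as follows. This is a minimal, untested sketch and not the poster's actual code: `MeshData`, `MeshUploader`, `Enqueue`, and `MeshReady` are hypothetical names, and it assumes the background thread hands finished lists to the main thread through a locked queue. It builds at most one mesh per frame, so it spreads the cost out rather than reducing it.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using UnityEngine;

// Hypothetical container for one mesh's data, filled on the background thread.
public class MeshData
{
    public List<Vector3> verts;
    public List<Vector3> normals;
    public List<Vector2> uvs;
    public List<int> indices;
}

// Hypothetical component: attach it to a GameObject in the scene.
public class MeshUploader : MonoBehaviour
{
    private readonly Queue<MeshData> _pending = new Queue<MeshData>();
    private readonly object _lock = new object();

    // Raised on the main thread once a Mesh has been created.
    public event Action<Mesh> MeshReady;

    // Safe to call from the background thread; touches no Unity objects.
    public void Enqueue(MeshData data)
    {
        lock (_lock) { _pending.Enqueue(data); }
    }

    // Unity runs Start as a coroutine when it returns IEnumerator.
    private IEnumerator Start()
    {
        while (true)
        {
            MeshData data = null;
            lock (_lock)
            {
                if (_pending.Count > 0) data = _pending.Dequeue();
            }

            if (data != null)
            {
                // Same calls as in the post; bounds are not calculated here.
                var mesh = new Mesh();
                mesh.SetVertices(data.verts);
                mesh.SetNormals(data.normals);
                mesh.SetUVs(0, data.uvs);
                mesh.SetTriangles(data.indices, 0, false);

                // Optional assumption: since the mesh is never modified again,
                // release the CPU-side copy. The mesh can no longer be read or
                // changed from script afterwards. Whether this affects creation
                // time was not measured in this thread.
                mesh.UploadMeshData(true);

                if (MeshReady != null) MeshReady(mesh);
            }

            yield return null; // wait one frame: at most one mesh per frame
        }
    }
}
```

With per-mesh costs of several milliseconds, a budget based on vertex count per frame may work better than a fixed one-mesh-per-frame rule; that variation is left out to keep the sketch short.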
     
  2. z000z

    I decided to go ahead and convert the lists into arrays on the background thread to see how much of a difference that made. It did lower the main thread time, as you can see in the profiler picture, but it also increased the background thread time a bit (so there's a bit more of a delay before the mesh shows up). Probably worth keeping it this way, though.

    The code in the mesh creation changed to:

    var mesh = new Mesh()
    {
        vertices = _v,
        normals = _n,
        uv = _u,
        triangles = _i
    };

    I guessed this would be the optimal way to pass in the arrays, since accessing .vertices, .normals, etc. reportedly creates a copy. But perhaps there's an even better way?

    The time still seems pretty high for just passing an array around, and the UVs are still mysteriously much more expensive than the others. Keep in mind this is performance on my laptop; the target is mobile platforms, so there's significantly less processing power there. I would profile on the devices, except you can't do deep profiling there.

    Now the worst case for each type of array looks like:

    Vertices - 0.56ms
    Normals - 1.27ms
    UVS - 1.82ms
    Triangles - 0.45ms

    So setting arrays is roughly 2-4x faster depending on the array (about 2.6x overall for these worst cases), but that's still too slow: notice my total time is 32.97ms for the 12 meshes (really 1 mesh, but split because of the 65k limit and for transparency ordering; approximately 600k vertices).

    [Attached image: Screen Shot 2017-09-15 at 9.52.10 AM.png]
     
  3. Martin_H

    Joined: Jul 11, 2015 | Posts: 4,429
    I don't think I'm quite at the level where I can directly help you, but I've played with some mesh generation and profiling and want to share a few ideas:

    If you do deep profiling in the editor, the deep profiling itself skews the results heavily and can cause more delay than the actual task takes. Also, the time certain operations take doesn't necessarily scale linearly from desktop to mobile. I would strongly suggest finding a way to test performance in a realistic scenario, meaning a build on the target device with no profiler attached.

    You can experiment with scaling up the amount of stuff you calculate, timing it with the Stopwatch class (iirc System.Diagnostics.Stopwatch), and logging the results. When I tried that it didn't seem capable of better precision than 1ms increments, so you might want to scale up the workload to lower the margin of error when comparing different tasks.

    I assume you're already avoiding all GC allocations? If not, that'd be the thing I'd try to fix first. Are you using this already?
    https://docs.unity3d.com/ScriptReference/Mesh.MarkDynamic.html
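
The Stopwatch approach described above might look something like this. It is a sketch under assumptions: `TimingHelper` and `MeasureAverageMs` are hypothetical names, not part of Unity or .NET. One possible reason for the 1ms granularity is reading `Stopwatch.ElapsedMilliseconds`, which is a whole-number `long`; `Stopwatch.Elapsed.TotalMilliseconds` is a `double` and keeps the fractional part. Averaging over many runs, as suggested in the post, reduces the measurement error further.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper; it has no Unity dependency.
public static class TimingHelper
{
    // Runs `work` `iterations` times and returns the average time per run in ms.
    public static double MeasureAverageMs(Action work, int iterations)
    {
        if (work == null) throw new ArgumentNullException("work");
        if (iterations <= 0) throw new ArgumentOutOfRangeException("iterations");

        work(); // warm-up run, so one-time costs such as JIT are not timed

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            work();
        }
        sw.Stop();

        // TotalMilliseconds is a double, so sub-millisecond averages survive.
        return sw.Elapsed.TotalMilliseconds / iterations;
    }
}
```

In a Unity script the result could be logged with `UnityEngine.Debug.Log`. Note that importing both `System.Diagnostics` and `UnityEngine` makes the bare name `Debug` ambiguous, so one of them needs to be qualified.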
     
  4. Peter77

    QA Jesus
    Joined: Jun 12, 2013 | Posts: 6,389
    Any reason why you don't use Profiler.BeginSample/EndSample?
     
  5. Martin_H

    Main reason being I don't have any experience with it. My goal was getting some kind of performance metric without attaching the profiler, because the profiler overhead in my case was significant. The docs read like BeginSample/EndSample doesn't really achieve that, or am I wrong?

    "This will show up in the Profiler hierarchy. Profiler.BeginSample is conditionally compiled away using ConditionalAttribute. Thus it will have zero overhead, when it is deployed in non-Development Build."

    When it's compiled away in a non-dev build, it won't help you track anything, right? Or did you mean it as a way to profile specific sub tasks without using deep profile, which has much higher overhead than the regular profiler?
     
  6. Peter77

    Right. Deep Profile is useless if you want to know how fast something really is. Profiler.BeginSample is useless as well, if it's called a gazillion times, like with Deep Profile.

    However, if you put a Profiler.BeginSample/EndSample block around your high-level "ComputeMesh" method that is called only a few times per frame, I believe it's pretty accurate then.
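
As a rough illustration of the suggestion above, a single sample around a high-level method could look like this. It is a sketch, not code from the thread: `MeshBuilder` and the `ComputeMesh` signature are assumptions (the post only names a "ComputeMesh" method), and the array assignments are borrowed from the earlier post. The sample only shows up with the profiler attached to the editor or a Development Build.

```csharp
using UnityEngine;
using UnityEngine.Profiling;

// Hypothetical wrapper class; the name and signature are illustrative only.
public static class MeshBuilder
{
    public static Mesh ComputeMesh(Vector3[] v, Vector3[] n, Vector2[] u, int[] i)
    {
        // One sample per mesh, not one per inner call, keeps overhead low.
        Profiler.BeginSample("ComputeMesh");
        try
        {
            var mesh = new Mesh();
            mesh.vertices = v;   // vertices must be assigned before triangles
            mesh.normals = n;
            mesh.uv = u;
            mesh.triangles = i;
            return mesh;
        }
        finally
        {
            // Always close the sample, even if an assignment throws.
            Profiler.EndSample();
        }
    }
}
```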
     
  7. z000z

    Those are good points. I was thinking of making a synthetic test anyhow, so that I can see the timings of what's happening on the background thread, since as far as I know Unity's profiler doesn't show things that aren't on the main thread. At least, every time I move something to a background thread it disappears from the profiler.

    And yes, I'm avoiding GC allocations as much as possible; I'm not seeing any hiccups from the GC at this time. I don't use MarkDynamic because I'm not reusing the meshes: I create them once and then render with them. Ideally there would be a MarkStatic/ReadOnly, but it appears the only way to have read-only meshes is to load them as part of the scene.
     
  8. z000z

    I went ahead and created the synthetic test and added Profiler.BeginSample/EndSample around the mesh creation, and on Android I'm seeing roughly the same thing as I was reporting. Since I'm creating the mesh through object initialization, I can't see the individual arrays on Android (no deep profiling), but I can see the overall mesh creation time:


    Android (Samsung Tab S3)
    Creating Mesh with arrays - 12-13ms
    Creating Mesh with lists - 8-9ms

    So it appears that, at least on Android, using lists is actually faster. Ideally I'd like to get that significantly lower, but I haven't found any other methods.
     
    Last edited: Sep 26, 2017