Need suggestions for optimizing changes to mesh data.

Epic-Username · Jan 15, 2018

HI,
I'm working on a voxel mesh generation system and currently the world is made of chunks (each chunk containing 25600 blocks). Each chunk has a script which holds it's vertices, triangles and UV's in lists.

The current system loops through all blocks in the chunk, 25600 * 6 faces = 153600 calculations to generate the mesh from the block data. I don't mind this if i'm loading in the chunk but regarding quick changes to mesh for example if i were to change even 1 block type i would have to update the entire chunk and perform 153600 calculations again, and you can guess what the problem is here.

Since i'm using a list i don't know how i should go about implementing a new system where i would like to only update the adjacent blocks mesh data when changing a block type not the entire mesh.
Can any one give me suggestions regarding this issue?
Thanks

P.S. by block data i mean an array holding the block type (0 = land, 1 = air for example) and sorry if i'm not clear enough about anything just ask and ill try clarify it.

Baste · Jan 15, 2018

Is this a setup where each chunk has one huge mesh, and that's what you're trying to regenerate? Or is there several meshes per chunk? It's not quite clear.

Epic-Username · Jan 15, 2018

Baste said: ↑

Is this a setup where each chunk has one huge mesh, and that's what you're trying to regenerate? Or is there several meshes per chunk? It's not quite clear.
Click to expand...

Each chunk is its own mesh.
I have an array of integers which determine what block type is where for example 0 = land and 1 = air.
The entire chunk mesh is then generated based of this array of block data by looping through every index of that array and checking adjacent positions to check if it is air or not, and if the adjacent block is air it crates a face.
The mesh generation is working properly but it has to loop through the entire array to recalculate the mesh when i change a single block which crates lag. So i'm looking for ways to optimize this.

Epic-Username · Jan 19, 2018

I still haven't found any suggestions, so it would be appreciated if anyone could reply,
Thanks.

DonLoquacious · Jan 19, 2018

Is there any particular reason this is being combined into one mesh to begin with? You can generate each block as a separate mesh and I don't see how that would change anything- you can even render it as 6 planes per cube instead, pool the planes, and handle the whole thing as a 3D-array-based logical construct, instead of a physical one. Destroying one block would simply call "dirty" on all neighboring blocks and tell them to re-check which of their sides should exist based on the new void, and they pull additional planes from the pool to those spots.

But no. Based on your current setup, I really have no idea how you can optimize it- since you're constructing a single mesh to serve as the whole landscape, the only real way to speed things up would be to have a duplicate cache of each individual cube's mesh that went into the final result, so you can start sort of from the halfway point in the process. It would speed things up, but it's twice the memory usage.

Epic-Username · Jan 19, 2018

Lysander said: ↑

Is there any particular reason this is being combined into one mesh to begin with? You can generate each block as a separate mesh and I don't see how that would change anything- you can even render it as 6 planes per cube instead, pool the planes, and handle the whole thing as a 3D-array-based logical construct, instead of a physical one. Destroying one block would simply call "dirty" on all neighboring blocks and tell them to re-check which of their sides should exist based on the new void, and they pull additional planes from the pool to those spots.

But no. Based on your current setup, I really have no idea how you can optimize it- since you're constructing a single mesh to serve as the whole landscape, the only real way to speed things up would be to have a duplicate cache of each individual cube's mesh that went into the final result, so you can start sort of from the halfway point in the process. It would speed things up, but it's twice the memory usage.
Click to expand...

So basically The reason i'm using 1 mesh for each chunk is so that the game can actually run instead of having millions of instances in a single scene which unity wouldn't handle well if at all.
Also each chunk has it's own mesh and the world is made up of multiple chunks so the whole world isn't just 1 giant mesh, just a bit of clarification.

Baste · Jan 19, 2018

Epic-Username said: ↑

I still haven't found any suggestions, so it would be appreciated if anyone could reply,
Thanks.
Click to expand...

Oh right, I was going to reply and totally forgot.

You can speed this up a lot by having a link from each block to their indices in the vertice and triangle array. So when you remove a block, you can remove those indices from the existing arrays instead of re-generating everything. When you create a new mesh, you'll want to append their data to the end of the mesh.

The big problem here is that your mesh is an array of vertices. If you remove a block, that probably means deleting vertices from in the middle of that array, shifting all later vertices back, and re-assigning all triangles.

This can be done fast, but I'm not sure about how to do that. Calling @OswaldHurlem, which is the resident expert on Swedish Cubes. Maybe he's got some input.

EDIT: Or you can just read what he wrote in his blog about this.

Epic-Username · Jan 19, 2018

Baste said: ↑

Oh right, I was going to reply and totally forgot.

You can speed this up a lot by having a link from each block to their indices in the vertice and triangle array. So when you remove a block, you can remove those indices from the existing arrays instead of re-generating everything. When you create a new mesh, you'll want to append their data to the end of the mesh.

The big problem here is that your mesh is an array of vertices. If you remove a block, that probably means deleting vertices from in the middle of that array, shifting all later vertices back, and re-assigning all triangles.

This can be done fast, but I'm not sure about how to do that. Calling @OswaldHurlem, which is the resident expert on Swedish Cubes. Maybe he's got some input.

EDIT: Or you can just read what he wrote in his blog about this.
Click to expand...

That's the exact problem i have at the moment and what i need suggestions for, since it's a array deleting and adding elements in the middle of the array would make me have to update the whole thing to shift the elements for extra space for new elements.

exiguous · Jan 19, 2018

Baste said:

If you remove a block, that probably means deleting vertices from in the middle of that array, shifting all later vertices back, and re-assigning all triangles.
Click to expand...

Why shift them? Just copy the last element (face, cube, triangle) at the now empty position and delete the last one (now doubled). You only need to update the indices for the shifted one. I use this method for quickly deleting an element from a list without shifting all stuff around (when order does not matter!). Also when you add something do it at the end. The order of vertices is not very important (its not entirely unimportant though).

Also keep the data in a separate list and don't retrive it from the mesh. Make your changes in that list and push it to the mesh when ready. This way the calculations can also be done in a separate thread.

Also I don't understand if you create all surfaces (6 sides for each cube) or only the ones where "air" meets "land"?

Epic-Username · Jan 24, 2018

exiguous said: ↑

Why shift them? Just copy the last element (face, cube, triangle) at the now empty position and delete the last one (now doubled). You only need to update the indices for the shifted one. I use this method for quickly deleting an element from a list without shifting all stuff around (when order does not matter!). Also when you add something do it at the end. The order of vertices is not very important (its not entirely unimportant though).

Also keep the data in a separate list and don't retrive it from the mesh. Make your changes in that list and push it to the mesh when ready. This way the calculations can also be done in a separate thread.

Also I don't understand if you create all surfaces (6 sides for each cube) or only the ones where "air" meets "land"?
Click to expand...

I don't think i understand exactly what you mean can you please provide an example to clarify,
Thanks.

I create surfaces where it meets air not for all sides as there's no point in calculating surfaces that will never be seen without any changes.

Also sorry for the long wait to post a reply.

exiguous · Jan 24, 2018

Well, you said that you want to improve performance by not recreating the mesh the entire time when only one block changes. My suggestion is simply that you have methods that remove and add blocks (and what elseyou need). removing a block should be done by copying the last element at the position of the element to remove then delete it. How this works in detail depends entirely on your method to create the mesh and how you store the data of the blocks. its just a methodical suggestion not an implemention one.

Epic-Username · Jan 25, 2018

exiguous said: ↑

Well, you said that you want to improve performance by not recreating the mesh the entire time when only one block changes. My suggestion is simply that you have methods that remove and add blocks (and what elseyou need). removing a block should be done by copying the last element at the position of the element to remove then delete it. How this works in detail depends entirely on your method to create the mesh and how you store the data of the blocks. its just a methodical suggestion not an implemention one.
Click to expand...

Since different amounts of surfaces are on different blocks depending on visibility it would be difficult trying to get which triangles and vertices correlate with which block in the list. There's no way that i know of where i can simply calculate where the face is i'm looking for depending on where the block is in the list without knowing the amount faces are visible on each block (which varies).
Do you have any suggestions for this problem?

exiguous · Jan 25, 2018

Thats why I asked if you build all blocks or only the border ones. If you can't simply deduce which vertices + triangles to delete my suggestion won't help you. So it depends on your setup and methodology.
When the user deletes a block can you get its position or a reference to the block data? when you have a regular grid you should be able to calculate the position of a block (in the grid). This could be used to get a reference to a structure to the block, this structure could hold some metadata of the vertex and index positions. But when the amount per block varies this will also make it difficult to copy data around. So from this side a fixed aproach where all blocks are built completely into the mesh even if invisible would be advantageous. This also avoids to check if there is "air" on the other side.

Epic-Username · Jan 25, 2018

exiguous said: ↑

Thats why I asked if you build all blocks or only the border ones. If you can't simply deduce which vertices + triangles to delete my suggestion won't help you. So it depends on your setup and methodology.
When the user deletes a block can you get its position or a reference to the block data? when you have a regular grid you should be able to calculate the position of a block (in the grid). This could be used to get a reference to a structure to the block, this structure could hold some metadata of the vertex and index positions. But when the amount per block varies this will also make it difficult to copy data around. So from this side a fixed aproach where all blocks are built completely into the mesh even if invisible would be advantageous. This also avoids to check if there is "air" on the other side.
Click to expand...

I suppose i could create a struct for each block containing vertex and triangle references to the list and refer back to them, but i would like to minimize the amount of memory used.

exiguous · Jan 25, 2018

Usually its a tradeoff between memory usage and cpu cycles used. And you can't store references to vertices and triangles as these are value types. just store the index. And when you say you "create a struct" where do you store the block type in the first place or when it changes? So I think there should be already a struct or something for a block.
Just thinking about a "mixed approach". When you create the mesh include all sides of a block which "borders" air (in any of the 26 directions). This way you always have the same amount of vertices and triangles for a block and my suggested method can be applied. When a block is deleted look in your structure where its vertices and triangles begin (index) and copy the last element over it. Then find all new blocks surrounding it which need to be created. Seems to me as a pretty easy, straightforward and fast approach. You "waste" some vertices for quads never seen but makes handling the "rebuild" much easier from what I take from your problem description.

Edit: When thinking about it you probably only need to create blocks adjacent to a face in the 6 directions.

Epic-Username · Jan 25, 2018

exiguous said: ↑

Usually its a tradeoff between memory usage and cpu cycles used. And you can't store references to vertices and triangles as these are value types. just store the index. And when you say you "create a struct" where do you store the block type in the first place or when it changes? So I think there should be already a struct or something for a block.
Just thinking about a "mixed approach". When you create the mesh include all sides of a block which "borders" air (in any of the 26 directions). This way you always have the same amount of vertices and triangles for a block and my suggested method can be applied. When a block is deleted look in your structure where its vertices and triangles begin (index) and copy the last element over it. Then find all new blocks surrounding it which need to be created. Seems to me as a pretty easy, straightforward and fast approach. You "waste" some vertices for quads never seen but makes handling the "rebuild" much easier from what I take from your problem description.

Edit: When thinking about it you probably only need to create blocks adjacent to a face in the 6 directions.
Click to expand...

I mean i currently store blocks as integers which represents their type for example 1 = land or 0 = air, but what i meant is that i could instead of storing just integers in the blocks array i could store structs containing both the block type and mesh references or index reference as you suggested.
And also i would like to minimize the rendering calculations to improve fps as much as possible as i'm running this off a really slow machine so i would like to only have sides of the block that are visible for optimization purposes.

Epic-Username · Feb 5, 2018

I've completely run out of ideas, i have been trying to think of ways i could solve this for weeks and everything i have tried has failed and hasn't even come close to solving it. Perhaps someone more experienced with voxel terrain could help me solve this issue?

OswaldHurlem · Feb 9, 2018

Hi! Missed the notification for this. What hardware are you on and how many microseconds is it taking to update a mesh? If you want to share your mesh update procedure, I'd be happy to take a look.

The MeshUpdate procedure in my code (and in Minecraft's) works the same way that yours does -- it reconstructs the whole mesh each frame a chunk is updated. What you describe -- make it so that only the part of the chunk that has changed gets updated -- is what I would call an "Big-O Solution." I've tried things like it, they often aren't worth it, and they should generally be avoided. Big-O Solutions often increase the complexity of the procedure, not necessarily or soley in lines of code, but in terms of how you'd summarize what the code does. A Big-O Solution may require you to store the mesh in a more complex format, provide more info as input to the procedure, or do an operation which is more likely to have corner cases. Boo.

What I usually do if some procedure is too slow is, I look for some ways to get easy performance gains. Usually battles like these are won not through clever algorithmic techniques, but by using a standard arsenal of optimization practices. Profile profile profile. Reduce the number of library-provided abstractions you use, especially ones that may be performing allocations, especially LINQ. You may be indexing your block array in a sub-optimal way, and if you are using C#'s built in multi-dimensional array, it may be indexing in a sub-optimal way. Your code will look inelegant -- that's OK.

If that's not enough, you'll want to also change the way data is stored and preserved. This involves some extra mental load. Keep an array which has only the information your mesher needs, in the smallest possible representation. Shouldn't be more than 32 bits per block. Avoid heavy math operations by keeping around copies of slow-to-compute values. Finally, you can use the trick described in my blog article to reduce the size of the outputted mesh, especially if the bottleneck of transferring the mesh to the GPU turns out to be the problem (not uncommon).

This should be good enough (ping me if it isn't). But to give you an even better idea of how much I dislike Big-O solutions, here's some other things I'd try. I would try moving the code to C++. I would try using ISPC or a vectorizing compiler. I would try a compute shader. These are last resorts which I would still do long before changing the algorithmic complexity of my code. But I'm sort of an extremist in that regard.

That's my optimization advice. I hope it's helpful and not condescending. I encourage you to keep this thread updated!

Maybe some time soon I'll post code for a very small and simple but very fast voxel mesher.

Major · Feb 9, 2018

I'd recommend an implementation of marching cubes. Reduce your chunk sizes as well. You can store the voxel data in a global array/list/dictionary that each chunk can reference. This would allow you to make a chunk of any arbitrary size.

OswaldHurlem · Feb 9, 2018

Baloney. His performance problems are not because he isn't using marching cubes.

Screenshot of profile of my code doing exactly what Epic-Username describes, plus some extra stuff, for a 96x96x96 volume. It's not an especially optimized function. My CPU runs it in 5ms. This is with lowest-hanging-fruit optimizations.

Epic-Username · Feb 9, 2018

OswaldHurlem said: ↑

Hi! Missed the notification for this. What hardware are you on and how many microseconds is it taking to update a mesh? If you want to share your mesh update procedure, I'd be happy to take a look.

The MeshUpdate procedure in my code (and in Minecraft's) works the same way that yours does -- it reconstructs the whole mesh each frame a chunk is updated. What you describe -- make it so that only the part of the chunk that has changed gets updated -- is what I would call an "Big-O Solution." I've tried things like it, they often aren't worth it, and they should generally be avoided. Big-O Solutions often increase the complexity of the procedure, not necessarily or soley in lines of code, but in terms of how you'd summarize what the code does. A Big-O Solution may require you to store the mesh in a more complex format, provide more info as input to the procedure, or do an operation which is more likely to have corner cases. Boo.

What I usually do if some procedure is too slow is, I look for some ways to get easy performance gains. Usually battles like these are won not through clever algorithmic techniques, but by using a standard arsenal of optimization practices. Profile profile profile. Reduce the number of library-provided abstractions you use, especially ones that may be performing allocations, especially LINQ. You may be indexing your block array in a sub-optimal way, and if you are using C#'s built in multi-dimensional array, it may be indexing in a sub-optimal way. Your code will look inelegant -- that's OK.

If that's not enough, you'll want to also change the way data is stored and preserved. This involves some extra mental load. Keep an array which has only the information your mesher needs, in the smallest possible representation. Shouldn't be more than 32 bits per block. Avoid heavy math operations by keeping around copies of slow-to-compute values. Finally, you can use the trick described in my blog article to reduce the size of the outputted mesh, especially if the bottleneck of transferring the mesh to the GPU turns out to be the problem (not uncommon).

This should be good enough (ping me if it isn't). But to give you an even better idea of how much I dislike Big-O solutions, here's some other things I'd try. I would try moving the code to C++. I would try using ISPC or a vectorizing compiler. I would try a compute shader. These are last resorts which I would still do long before changing the algorithmic complexity of my code. But I'm sort of an extremist in that regard.

That's my optimization advice. I hope it's helpful and not condescending. I encourage you to keep this thread updated!

Maybe some time soon I'll post code for a very small and simple but very fast voxel mesher.
Click to expand...

If Minecraft does it a similar way to how i do it why does Minecraft run much smoother than my game even if my chunks are smaller. Do they run the chunk mesh updates on another thread or use special algorithms optimized for speed and performance? there must be something i'm missing.

OswaldHurlem · Feb 9, 2018

Again it's probably tiny little things you are doing sub-optimally. Very rarely is there a magic technique that spares you the drudgery of hand-optimization. Profile your code. Post your code.

Epic-Username · Feb 14, 2018

OswaldHurlem said: ↑

Again it's probably tiny little things you are doing sub-optimally. Very rarely is there a magic technique that spares you the drudgery of hand-optimization. Profile your code. Post your code.
Click to expand...

Through some profiling i'm found that GetBlock() is the main cause using up 50% and String.memcpy() is also a big issue using 21% in GetBlock() and 8% in UpdateChunk(), what is memcpy() and how can i optimize it?

Chunk:

Code (CSharp):

//gets a block at specified local position

public BlockData GetBlock (Vector3 localPosition)

{

if (!InChunkBounds (localPosition))

return worldManager.GetBlock (LocalToWorldPosition (localPosition));

return blocks [LocalPositionToArrayPosition (localPosition)];

}

worldManager:

Code (CSharp):

//gets the block at the target world position

public Chunk.BlockData GetBlock (Vector3 worldPosition)

{

Chunk chunk = GetChunk (worldPosition);

if (chunk == null) {

Chunk.BlockData blockData = new Chunk.BlockData ();

blockData.blockID = CalculateBlockID (worldPosition);

return blockData;

}

Vector3 finalPosition = worldPosition - chunk.GetChunkPosition ();

if (finalPosition.y >= 0 && finalPosition.y < chunkHeight) {

return chunk.GetBlock (finalPosition);

} else {

return new Chunk.BlockData ();

}

}

public Chunk GetChunk (Vector3 worldPosition)

{

Vector2 chunkPosition = new Vector2 ((int)Mathf.Floor (worldPosition.x / chunkSize), (int)Mathf.Floor (worldPosition.z / chunkSize));

if (loadedChunks.ContainsKey (chunkPosition * chunkSize)) {

return loadedChunks [chunkPosition * chunkSize];

}

return null;

}

OswaldHurlem · Feb 14, 2018

Your problem is that you're using worldManager.GetBlock at the innermost part of your meshing loop.
This means that it's performing bounds checks, getting a chunk, getting a position within the chunk, performing more bounds checks, performing even more bounds checks, converting a voxel coordinate into an array index, and then finally checking a position in an array.
You do this six times for every solid block in your game. It's concise and easy-to-read code, but it is slow code. That's just how it is... no free lunch!!

Rewrite your meshing function so that it operates only on a single chunk of data (I know there's some problems with that... just do it for now). Make it so that bounds checks, GetChunk, GetChunkPosition, GetBlock, and LocalPositionToArrayPosition are only compute once for each row or column of 32 voxels.
It will speed things up by A LOT.

Epic-Username · Feb 17, 2018

OswaldHurlem said: ↑

Your problem is that you're using worldManager.GetBlock at the innermost part of your meshing loop.
This means that it's performing bounds checks, getting a chunk, getting a position within the chunk, performing more bounds checks, performing even more bounds checks, converting a voxel coordinate into an array index, and then finally checking a position in an array.
You do this six times for every solid block in your game. It's concise and easy-to-read code, but it is slow code. That's just how it is... no free lunch!!

Rewrite your meshing function so that it operates only on a single chunk of data (I know there's some problems with that... just do it for now). Make it so that bounds checks, GetChunk, GetChunkPosition, GetBlock, and LocalPositionToArrayPosition are only compute once for each row or column of 32 voxels.
It will speed things up by A LOT.
Click to expand...

I don't understand how i can only make those checks once in a row when every voxel/block needs to know this data for itself in order to decide where to place vertices/triangles/UV's or not. Can you please explain,
Thanks.

OswaldHurlem · Feb 17, 2018

You know that an entire row is in-bounds, so you don't need to check bounds for each of them.
The chunk doesn't change within a row.
You know the array index which corresponds to the first block in a row, so you don't need to compute indices for each block in the row.

Epic-Username · Mar 3, 2018

When i was researching more ways i could optimize rebuilding chunks i found this about minecraft chunks: https://minecraft.gamepedia.com/Chunk

It says "Minecraft's render engine uses OpenGL's display list feature to divide a world chunk into sixteen 16x16x16 blocks large display lists to speed up rendering significantly. They need to be rebuilt each time when a block within them is changed and can be rendered multiple times to achieve e. g. transparency." under the "Generation" section does this mean they split chunks into smaller sections so that rebuilding chunks is faster or do they only split them for rendering?

OswaldHurlem · Mar 3, 2018

Sure, try changing it so that your meshes represent a smaller volume of blocks. This will make it so that updating a mesh takes one eighth as long. It might increase the amount of time it takes to render a frame, or it might not. I think it won't.

But do the other optimizations I mentioned too

Search Unity

Unity ID

Useful Searches

Need suggestions for optimizing changes to mesh data.