Poor performance related to frequent use of non-uniform scale in 2D game

Discussion in 'iOS and tvOS' started by shinymark, Jun 13, 2012.

  1. shinymark

    shinymark

    Joined:
    Aug 7, 2011
    Posts:
    66
    Hi there,

    I am making a 2D game where our characters are 3D models rigged and animated in Maya. We use non-uniform scale A LOT to bring life to the characters in their various animations. It looks great.

    We are near the end of the project and are optimizing our game to run well on older iOS devices. Unfortunately, I didn't read the following passage in the Unity docs until today.

    Non-uniform scaling is when the Scale in a Transform has different values for x, y, and z; for example (2, 4, 2). In contrast, uniform scaling has the same value for x, y, and z; for example (3, 3, 3). Non-uniform scaling can be useful in a few select cases but should be avoided whenever possible.

    Non-uniform scaling has a negative impact on rendering performance. In order to transform vertex normals correctly, we transform the mesh on the CPU and create an extra copy of the data. Normally we can keep the mesh shared between instances in graphics memory, but in this case you pay both a CPU and memory cost per instance.​

    Quoted from the Unity docs here: http://unity3d.com/support/documentation/Components/class-Transform.html
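
A small worked example of why the docs single out vertex normals. This is a toy 2D sketch in Python, not Unity's internal code, and the helper names are made up for illustration. It assumes a pure scale matrix with no rotation, so the inverse-transpose of diag(sx, sy) is just diag(1/sx, 1/sy). With a non-uniform scale, transforming a normal the same way as a position leaves it no longer perpendicular to the surface; with a uniform scale the problem does not appear.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def scale_vector(v, scale):
    """Transform a direction the same way vertex positions are transformed."""
    return (v[0] * scale[0], v[1] * scale[1])

def scale_normal(n, scale):
    """Transform a normal with the inverse-transpose of a pure scale matrix."""
    return (n[0] / scale[0], n[1] / scale[1])

tangent = (1.0, -1.0)  # direction along a 45-degree edge
normal = (1.0, 1.0)    # perpendicular to that edge (dot product is 0)

non_uniform = (2.0, 4.0)  # in the spirit of the docs' (2, 4, 2) example
uniform = (3.0, 3.0)      # in the spirit of the docs' (3, 3, 3) example

# Non-uniform scale: the naive normal is no longer perpendicular to the edge.
new_tangent = scale_vector(tangent, non_uniform)
naive_dot = dot(new_tangent, scale_vector(normal, non_uniform))
correct_dot = dot(new_tangent, scale_normal(normal, non_uniform))

# Uniform scale: the naive transform keeps the normal perpendicular.
uniform_dot = dot(scale_vector(tangent, uniform), scale_vector(normal, uniform))

print(naive_dot, correct_dot, uniform_dot)
```

A dot product of zero means the normal is still perpendicular to the edge. Only the non-uniform case needs the separate treatment, which is consistent with the extra work the docs describe.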

    I set up a test in which I eliminated all of our animated scale and profiled it. The constant scale changes are costing us over 20 milliseconds per frame. That is insane!

    In our game we don't use vertex normals for anything. We don't use dynamic lighting and none of our code or shaders relies on them. Is there any way we can get decent performance in Unity with frequent non-uniform scale by turning off the behavior described in the docs?

    While we could cut scale out of all of our animations (hundreds, ugh) it would dramatically impact the visual quality negatively.

    Any help is appreciated, thanks.
     
  2. Dreamora

    Dreamora

    Joined:
    Apr 5, 2008
    Posts:
    26,601
    The main problem with non-uniform scale is that dynamic batching is instantly broken, as non-uniformly scaled game objects are no longer batched (the transformation, as documented, has to be done on the CPU, yet batching is a pure GPU thing). That on its own can kill your performance on iOS, as you have severe drawcall limits.

    You can do non-uniform scaling if you want, but in that case you had best write a system that holds the whole data within a single mesh and uses bones to handle the 'sprites' within it, so you can still get away with only a few drawcalls.
    MikaMobile, the makers of Zombieville USA and Battleheart, have written various articles and posts on how they tackled it. They have used this approach since the Unity iPhone 1.x days, as it has its own benefits (like modelling the animations straight in your modeling app instead of using pixel art and losing performance to overdraw on the iPhone 4, iPad 1, and iPod touch 4).
     
    Last edited: Jun 13, 2012
  3. shinymark

    shinymark

    Joined:
    Aug 7, 2011
    Posts:
    66
    Hi Dreamora, thanks for the response.

    We are already very efficient in terms of how our character models are constructed. We use one texture atlas with two materials, both pointing to the same atlas. One material is for the transparent bits and one for the inner opaque parts. Here's a small bit of one of our characters showing what I mean:

    $meshshot.png

    The image isn't showing up for me for some reason in the post. I put it here in Dropbox as well: https://dl.dropbox.com/u/2003612/meshshot.png

    The triangles around the edge use a transparent shader and the inner triangles use an opaque shader. We're not vertex/triangle bound in terms of performance so this is a major performance win for us. We have very small amounts of overdraw thanks to this technique.

    Total triangles in our scenes are around 7k and draw calls hover around 70. We run under 16ms a frame on the iPhone 4S, iPad 2, and iPad 3 but on the 3GS, iPad 1, and iPhone 4 we run at about 35-45 ms a frame currently. I want to hit a consistent 33 ms or less.

    As I said in my original post our characters are animated in Maya. They aren't bound to joints, instead we create a hierarchy of transforms and animate the parent transforms of the geometry.

    We're getting killed in Mesh.CreateVBO. Eliminating the scale keys in our Maya animations completely removes all calls to this function. With scale in our animations, opaque render on an iPod Touch 4gen is about 20 ms a frame. After eliminating scale, it is around 1 ms a frame.

    Given the underlying architecture it's likely there is nothing we can do (except remove scale in animations) but I wanted to ask in case there was something we could do to modify the internal Unity engine behavior since we don't rely on vertex normals at all. The docs seem to imply all the extra CPU work is done to transform vertex normals correctly. Since we don't care about them, it's all wasted CPU time in our case.
     
  4. coolpowers

    coolpowers

    Joined:
    Mar 23, 2010
    Posts:
    125
    Pretty sure skinned meshes are the way to go for a lot of animated 2D objects, not hierarchical animation.

    Btw are you creating those 2D "hull" meshes manually, or is there a way to build a mesh that matches the texture shape automatically?
     
  5. Krobill

    Krobill

    Joined:
    Nov 10, 2008
    Posts:
    282
    I've never tested that extensively, but isn't Mesh.CreateVBO related to the number of vertices/triangles transformed? If so, you may be mistaken when you say you're not vertex/triangle bound in terms of performance. You might want to check whether fillrate is really a bottleneck without your 'trick', because you could save quite a lot of triangles without it.
     
  6. Dreamora

    Dreamora

    Joined:
    Apr 5, 2008
    Posts:
    26,601
    That doesn't matter, because an atlas only helps if you can make use of dynamic batching, which your project simply cannot do.

    If you use non-uniform scaling then every object will draw its own mesh (hence a new drawcall); it will not be drawn as a single mesh (batch) anymore. The explanation for that is in the text you quoted.

    And SiW is right: animating a hierarchy of transforms is definitely a bad way to do it, and going with skinned meshes is a much better idea. Skinned renderers don't batch either, but in your case that makes no difference because yours will not batch anyway. They are much more performant because the skeleton animation and vertex deformation are offloaded to the mathematical coprocessor or the second CPU core, which frees the main core for rendering and simulation, while your hierarchy-of-transforms animation likely runs on exactly this main core.

    If you have Pro you can use the profiler to see the drawcalls; if you don't have it, you need to enable the in-code one in the Xcode project. Check the number of drawcalls you have. If you exceed 50 or so then the 3GS is gonna die, and at 70-100 drawcalls you are starting to stress all newer currently available devices. Going by your numbers I would assume that you currently run at 100+ drawcalls.
     
    Last edited: Jun 14, 2012
  7. Krobill

    Krobill

    Joined:
    Nov 10, 2008
    Posts:
    282
    I'm not too sure about this.
    Quick examples with 2 Unity cubes and default material :
    - 2 unscaled ==> batch
    - 2 uniform identical scales ==> batch
    - 2 uniform different scales ==> don't batch
    - 1 uniform 1 non uniform scale ==> don't batch
    - 2 non uniform identical scales ==> batch
    - 2 non uniform different scales ==> batch
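
The six observations above can be summarized as a small rule. The Python sketch below is only a toy model that reproduces those results; it is not Unity's actual batching code, and `batch_key` and `would_batch` are hypothetical names. The assumption is that all non-uniformly scaled objects fall into one group, while uniformly scaled objects only group with others of the same scale factor.

```python
def is_uniform(scale, eps=1e-6):
    """True when x, y and z scale are (nearly) equal."""
    x, y, z = scale
    return abs(x - y) < eps and abs(y - z) < eps

def batch_key(scale):
    """Grouping key in this toy model (an assumption, not engine behavior)."""
    if is_uniform(scale):
        # Uniform scales only group with the same scale factor.
        return ("uniform", round(scale[0], 6))
    # Every non-uniform scale shares one group.
    return ("non-uniform",)

def would_batch(scale_a, scale_b):
    return batch_key(scale_a) == batch_key(scale_b)

cases = [
    ((1, 1, 1), (1, 1, 1)),  # 2 unscaled
    ((2, 2, 2), (2, 2, 2)),  # 2 uniform identical scales
    ((2, 2, 2), (3, 3, 3)),  # 2 uniform different scales
    ((2, 2, 2), (1, 2, 1)),  # 1 uniform, 1 non-uniform
    ((1, 2, 1), (1, 2, 1)),  # 2 non-uniform identical scales
    ((1, 2, 1), (1, 3, 1)),  # 2 non-uniform different scales
]
results = [would_batch(a, b) for a, b in cases]
print(results)
```

The material is assumed to be shared in every case, as in the cube test; objects with different materials are outside what this sketch models.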
     
  8. shinymark

    shinymark

    Joined:
    Aug 7, 2011
    Posts:
    66
    It's manual. We do it by tracing the character using the pen tool in Photoshop to make a path, then the path is brought into Maya to create a mesh.

    It's essentially this technique: http://www.worldofleveldesign.com/c...op_illustrator_paths_to_maya_environments.php
     
  9. shinymark

    shinymark

    Joined:
    Aug 7, 2011
    Posts:
    66
    Yeah, that's probably true. We were massively fillrate bound before building the geo in the way I showed though due to all the overdraw. We saved something like 20-30 ms a frame on the iPad 1 by going from quads to what I showed. I'll have to see if going back to quads is better or worse.
     
  10. shinymark

    shinymark

    Joined:
    Aug 7, 2011
    Posts:
    66
    It still matters. Not every mesh in the character is scaled non-uniformly all the time. A lot of the time they have uniform scale and those parts batch. Non-uniform scale is used for squash and stretch during an animation to improve the look of the character.

    Interesting, is that documented anywhere? I'd like to read more about that.

    Regardless, even if converting to skinned meshes made animation update an order of magnitude faster, it wouldn't matter in our project. Animation update time is only 1-2 ms a frame right now. We're spending the vast majority of frame time in Camera.Render and its children.

    Yeah, I have Pro. Like I said in my second post, we have around 65-70 draw calls a frame. We're running at about 40-45 ms a frame on the iPad 1 and 3GS currently, so about 23-25 fps. I want to get consistently under 33 ms. It looks like nuking scale out of all the animations might be the only way. I've confirmed that removing scale from an animation makes all the calls to Mesh.CreateVBO disappear.
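
For readers following the numbers, the frame-time and frame-rate figures quoted above are related by a simple reciprocal. A minimal Python sketch (the function names are just for illustration):

```python
def fps_from_frame_ms(frame_ms):
    """Frames per second for a given frame time in milliseconds."""
    return 1000.0 / frame_ms

def frame_budget_ms(target_fps):
    """Frame-time budget in milliseconds for a target frame rate."""
    return 1000.0 / target_fps

slow_fps = fps_from_frame_ms(45)   # lower end of the reported 40-45 ms range
fast_fps = fps_from_frame_ms(40)   # upper end of the reported 40-45 ms range
budget_30 = frame_budget_ms(30)    # the "33 ms" target corresponds to 30 fps
budget_60 = frame_budget_ms(60)    # the "under 16 ms" devices are near 60 fps

print(round(slow_fps, 1), round(fast_fps, 1), round(budget_30, 1), round(budget_60, 1))
```

So 40-45 ms a frame works out to roughly 22-25 fps, and staying under 33 ms means holding 30 fps.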
     
  11. mrmadprofessor

    mrmadprofessor

    Joined:
    May 22, 2012
    Posts:
    53
    Why aren't you using skinned meshes instead? Build a simple joint hierarchy and rig it. If you do this you can stretch and squash the character even better and it is probably simpler and faster to animate.

    With skinning each character will be one drawcall instead of one drawcall with many dynamic batches. Although if a lot of animations have already been done, you will have to redo them.
     
  12. shinymark

    shinymark

    Joined:
    Aug 7, 2011
    Posts:
    66
    Is this true? Does skinned vs. not-skinned affect mesh batching? I can't find anything in the docs that says that.
     
  13. shinymark

    shinymark

    Joined:
    Aug 7, 2011
    Posts:
    66
    I tested this in the editor as well and got the same result.

    The batching docs seem to say that result is expected. Two non-uniformly scaled meshes with the same material DO batch.

    Uniformly scaled objects won't be batched with non-uniformly scaled ones.
    Objects with scale (1,1,1) and (1,2,1) won't be batched. On the other hand (1,2,1) and (1,3,1) will be.​

    From: http://unity3d.com/support/documentation/Manual/iphone-DrawCall-Batching
     
  14. mrmadprofessor

    mrmadprofessor

    Joined:
    May 22, 2012
    Posts:
    53
    Well, since you have to keep the hierarchy-animated mesh as several separate parts, these parts will be dynamically batched. If you combine the mesh and skin it using joints, the parts will not need to be dynamically batched since they are one mesh. So you might be paying a lot for dynamically batching all the parts of a single character.

    Someone correct me if I'm wrong.

     
  15. coolpowers

    coolpowers

    Joined:
    Mar 23, 2010
    Posts:
    125
    Thanks. I had a bit of a brain fart thinking you'd have to do a lot of hand-editing of texture coordinates before realizing that duh, you'd just apply planar mapping to the mesh.

    Back to your problem, AFAIK by converting your hierarchical animation to a skinned mesh, you'll end up with 1 draw call for the whole object, because it IS just one mesh now. I can't say for certain whether these will then batch, but you've already saved yourself several draw calls per object.

    MikaMobile's post here basically repeats this and has information on how to solve the sorting problem you face when you go with a skinned mesh as opposed to hierarchies: http://forum.unity3d.com/threads/74...ith-3d-package?p=572367&viewfull=1#post572367
     
  16. Dreamora

    Dreamora

    Joined:
    Apr 5, 2008
    Posts:
    26,601
    Which one exactly?
    Render Transparent geometry would mean that even with your current way of approximating the outline there is still too much overdraw.
    While I doubt that the object shown and similar ones would cause it, given the way you described it (with 2 distinct sprites to handle opaque and transparent performantly), I would assume that an environment like yours also has backgrounds, at worst multiple ones for parallax scrolling and the like.
    Are those done like this too, or with fullscreen quads? (The latter would be bad.)

    Also, just in case you use them: stay away from particles, or at least seriously limit their size, as they can hit badly.


    If the profiler says opaque is the hit then there is little left to do besides reducing the amount being rendered. But that would be hard to believe, and in that case the question would be: are you profiling on iOS or desktop? The latter is totally useless when it comes to opaque profiling, as the tile-based GPU from Imagination Technologies is years ahead of desktop GPUs performance-wise at not rendering stuff that's occluded by opaque pixels (or should I say behind, because PowerVR cards already had this feature many years ago on the desktop).



    As for skinned meshes: they are by definition cut from batching, as dynamic batching ignores them flat out. Only regular renderers are even taken into account (that means Graphics.DrawMesh, GUITexture, GUIText etc. are also ignored, which is the reason OnGUI, which uses DrawMesh internally, doesn't batch, for example).
     
  17. hippocoder

    hippocoder

    Digital Ape Moderator

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    Nothing stopping you from using a skinned mesh with everything you need on it, one per bone. I do this in Physynth for just about everything, else I'd be looking at close to 200 draw calls instead of 30+
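
A rough sketch of the idea, in Python rather than Unity code: merge the separate parts into one vertex list, tag every vertex with the index of the single bone that drives it, and let each bone carry its own (possibly non-uniform) scale. The `Bone` class and the `merge_parts` and `skin` helpers are hypothetical names made up for this illustration, and the 2D scale-plus-offset bone is a simplifying assumption.

```python
from dataclasses import dataclass

@dataclass
class Bone:
    """Hypothetical 2D bone: a per-axis scale followed by a translation."""
    scale: tuple = (1.0, 1.0)
    offset: tuple = (0.0, 0.0)

def merge_parts(parts):
    """Concatenate per-part vertex lists into one mesh.

    Returns (vertices, bone_indices); vertex i is driven only by the bone
    at bone_indices[i], i.e. one bone per part with full weight.
    """
    vertices, bone_indices = [], []
    for bone_index, part in enumerate(parts):
        for vertex in part:
            vertices.append(vertex)
            bone_indices.append(bone_index)
    return vertices, bone_indices

def skin(vertices, bone_indices, bones):
    """Pose every vertex with the transform of its single bone."""
    posed = []
    for (x, y), index in zip(vertices, bone_indices):
        bone = bones[index]
        posed.append((x * bone.scale[0] + bone.offset[0],
                      y * bone.scale[1] + bone.offset[1]))
    return posed

head = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
body = [(0.0, 0.0), (2.0, 0.0), (2.0, 3.0), (0.0, 3.0)]

vertices, bone_indices = merge_parts([head, body])
# Squash the head non-uniformly and lift it above the body; leave the body alone.
bones = [Bone(scale=(1.5, 0.5), offset=(0.0, 3.0)), Bone()]
posed = skin(vertices, bone_indices, bones)
print(posed)
```

The point of the layout is that squash and stretch lives in the bone transforms while the character remains a single mesh, so it is submitted as one object instead of one object per part.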
     
  18. Krobill

    Krobill

    Joined:
    Nov 10, 2008
    Posts:
    282
    The problem with doing massive 'batching' into a single skinned mesh is that you lose Unity's native ability to z-order alpha-blended polygons. You can handle it during mesh creation if no z-swapping is needed, but it can cause problems even for a single 2D character. Skinning with a single bone weight per vertex is sometimes a cheap solution, but it's not the optimal solution for all configurations.
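
Handling the order "during the mesh creation" could look roughly like the Python sketch below. It assumes that triangles inside one mesh are drawn in the order they appear in the index list, and that each part has a fixed depth that never changes at runtime. The part names, depth values and `bake_draw_order` helper are invented for the example.

```python
def bake_draw_order(parts):
    """Build one index list with the farthest parts first.

    parts: list of (name, depth, triangles), where a larger depth means
    farther from the camera and triangles is a list of index triples.
    """
    back_to_front = sorted(parts, key=lambda part: part[1], reverse=True)
    indices = []
    for _name, _depth, triangles in back_to_front:
        for triangle in triangles:
            indices.extend(triangle)
    return indices

parts = [
    ("arm", 0.0, [(8, 9, 10)]),              # nearest, should be drawn last
    ("body", 1.0, [(0, 1, 2), (0, 2, 3)]),
    ("cape", 2.0, [(4, 5, 6)]),              # farthest, should be drawn first
]
order = bake_draw_order(parts)
print(order)
```

Because the order is fixed when the mesh is built, this only works when parts never need to swap depth during an animation, which is the limitation described above.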
     
  19. shinymark

    shinymark

    Joined:
    Aug 7, 2011
    Posts:
    66
    Thanks for all the help everyone. Looks like the main fix for us is to skin the meshes. We don't need to dynamically change mesh depth at runtime so that isn't an issue for us.

    Yeah, exactly. Our backgrounds are built in exactly the same way to reduce overdraw.

    Sure, I'm aware of that. I've been profiling on iOS. When we aren't doing any non-uniform scaling the opaque render is very fast, just a few milliseconds. During non-uniform scaling in our animations opaque rendering suddenly jumps to around 16-18ms a frame (on an iPod Touch 4gen). A lot of that time is spent in Mesh.CreateVBO under opaque render. Based on Unity's docs that I quoted in the original post, I'm guessing this is related to the statement "in order to transform vertex normals correctly, we transform the mesh on the CPU and create an extra copy of the data."

    That's why I was curious if we could just turn that behavior off since we don't use vertex normals for anything in our game. Presumably all of those transformations the engine is doing are wasted work since we're not using the output in any of our shaders.

    Good to know. On our next project we'll be sure to skin all of our characters.
     
  20. Dreamora

    Dreamora

    Joined:
    Apr 5, 2008
    Posts:
    26,601
    That indeed sounds like the problem caused by constantly recreating the meshes, be it through the Mesh class or, as in your case, with non-uniform scaling.

    In that case though: Did you upgrade to 3.5.2?
    UT has fixed a performance regression in this context from 3.5.0 / 3.5.1 with 3.5.2 that might potentially help significantly in your case
     
  21. shinymark

    shinymark

    Joined:
    Aug 7, 2011
    Posts:
    66
    Yeah, we are on 3.5.2. Our problem doesn't seem related to whatever fix Unity made.