Garbage Collection, Allocations, and Third Party Assets in the Asset Store

Discussion in 'General Discussion' started by Games-Foundry, Jun 21, 2012.

Page 2 of 7

Games-Foundry

Joined:

May 19, 2011

Posts:

632

Superpig said: ↑

Deep profiling is useful for tracking down memory allocations, and it's useful if you know that something is taking 20ms and you want to see exactly what proportion of that is going where, but in terms of absolute timings I wouldn't trust it for a nanosecond.
Click to expand...

Yeah, any future tests I run I'll do with normal and deep profiling to confirm conclusions aren't being skewed.

Games-Foundry, Jun 22, 2012

#51
tatoforever

Joined:

Apr 16, 2009

Posts:

4,368

gamesfoundry said: ↑

Great! Glad some of these tests are proving useful.

@Nomad I wondered when you would make an appearance. I've read your optimization posts in the past too, so your input to the discussion is most welcome. I'll give that struct code a read over when I'm less tired.

@tatoforever Well we have a challenge. Detailed outdoor scenes are a bitch, it has to be said. The amount of content on screen at any one time is quite high as you can probably tell from Folk Tale screenshots. I'm using every trick I know to squeeze more performance out.

I'm using character controllers for all characters at the moment, which are indeed expensive. I tried rigidbodies several months ago and just couldn't get them behaving exactly how I wanted. I may end up trying again now before beta starts. We use A*Pathfinding in grid graph mode because we have building construction and need to recalculate areas of the graph at runtime ( not an option with navmesh solutions ). Megafiers is used for some real-time deformation and all facial animation. iTween handles some things, including LOD transitions ( scale / fade ) and cutscene cameras, but its use has been declining as I rewrite. UnitySteer is still in there for water transport, but I'm migrating that code to use A*Pathfinding List Graphs as the routes are always pre-defined. And finally VLights for volumetric lighting, but that'll get the boot if the author doesn't make any progress with optimizations.
Click to expand...

I see, yeah, lots of stuff running simultaneously. The nature of navmeshes (being baked offline) doesn't allow you to make real-time modifications to the terrain. Aron Granberg's A* is fine as long as you don't go over a certain number of nodes (I don't know the exact numbers). But the fastest thing you can do is waypoint navigation.
Why don't you try waypoints? You can automatically re-place each waypoint on your terrain by raycasting (in case you modify the terrain at runtime) and then compute the links when done.
Looking at your videos/shots, you have lots of characters (which means lots of CharacterControllers), so you should find a way to create your own character controllers (kinematic rigidbody, freeze some of the rotation axes, and do front/floor raycast checks to detect steps once every X frames).
iTween is fast; I never had any problem with it (even if I'm not using it right now). Megafiers is mega heavy (because its vertex manipulation is done on the CPU), but I've never worked with it and don't know how it was built, so I can't suggest anything there. VLight is also very heavy. If you want better volumetric effects that only use your current camera (for depth testing), you should try this: http://u3d.as/content/stu-assets/volumetric-light-beam-kit

Btw, this kit allows you to have multiple volume lights without breaking your performance (aka adding extra buffers).

Last edited: Jun 22, 2012

tatoforever, Jun 22, 2012

#52
Games-Foundry

Joined:

May 19, 2011

Posts:

632

@tatoforever Waypoints are fixed navigation paths, whereas our characters are controlled by the player and can go anywhere they are commanded when not in automatic mode ( where they go about their occupational tasks ). We also make use of penalties to define preferred highways so our characters take terrain painted paths instead of simply heading cross-country.

Our level map is large at 2000x2000, with 1 million pathfinding nodes. We need the granularity to ensure complex environmental props are navigable, for instance where we have narrow wooden walkways on the swamp palisade. The memory footprint is 621,084 KB (Working Set). This editor shot might better demonstrate why I need to optimize everything.

I may go for a mix of navmesh in the non-constructable zones, and grid graph in the villages where graph modification happens.

The camera can go almost this high, right down to terrain level, and go anywhere, including inside buildings which will be set dressed with props.

Last edited: Jun 22, 2012

Games-Foundry, Jun 22, 2012

#53
tatoforever

Joined:

Apr 16, 2009

Posts:

4,368

gamesfoundry said: ↑

@tatoforever Waypoints are fixed navigation paths, whereas our characters are controlled by the player and can go anywhere they are commanded when not in automatic mode ( where they go about their occupational tasks ). We also make use of penalties to define preferred highways so our characters take terrain painted paths instead of simply heading cross-country.

Our level map is large at 2000x2000, with 1 million pathfinding nodes. We need the granularity to ensure complex environmental props are navigable, for instance where we have narrow wooden walkways on the swamp palisade. The memory footprint is 621,084 KB (Working Set). This editor shot might better demonstrate why I need to optimize everything.

I may go for a mix of navmesh in the non-constructable zones, and grid graph in the villages where graph modification happens.

The camera can go almost this high, right down to terrain level, and go anywhere, including inside buildings which will be set dressed with props.
Click to expand...

I see yeah,
Well, in that case, I truly suggest you try out Xaitment Map. Their navmesh system is powerful and better than the Unity one! It's also very scalable and supports very large terrains (and the license isn't that bad at $500; actually, I think they now offer both of their plugins, XaitControl and XaitMap, for $500). It also has LODs, and you can recalculate the navigation at runtime if you wish (useful for your style of game). I was one of the first to try it out and believe me, it is fast! It will solve all your navigation problems (including navigation agents, so you can get rid of your character controllers that way), which I guess is the most critical problem you have right now. I think you can try it out for 30 days. The API is quite simple and you have more control over your navigation than with any other navmesh system available for Unity. If you want to save a lot of headaches and time, try to get it.
Btw, your game is looking awesome, hopefully you will be able to finish it! Best wishes!

tatoforever, Jun 22, 2012

#54
Games-Foundry

Joined:

May 19, 2011

Posts:

632

Already tried it and provided feedback to them. With any navmesh, I'd have to make the mesh by hand as whenever I've tried various recast implementations it never copes very well. Admittedly that was when it first came out, so they may have done something about it.

Last edited: Jun 22, 2012

Games-Foundry, Jun 22, 2012

#55
tatoforever

Joined:

Apr 16, 2009

Posts:

4,368

Oh my,
I'll try to find out the video where i saw that, but I'm sure you can compute the navmesh at runtime.

tatoforever, Jun 22, 2012

#56
Arges

Joined:

Oct 5, 2008

Posts:

359

gamesfoundry said: ↑

Great! Glad some of these tests are proving useful.
Click to expand...

Definitely - it got me looking at where I was still using the gameObject/rigidbody accessors, and doing a whole bunch of refactoring for pre-allocating some values.

I'm curious: how many agents were you testing with, and with what behaviors? It's not obvious from the collapsed view on the screenshot. Also, were you using the UnitySteer version that's currently on the asset store?

Arges, Jun 23, 2012

#57
half_voxel

Joined:

Oct 20, 2007

Posts:

978
Interesting thread you have here.
My A* Pathfinding Project is one of the discussed 3rd party tools. So here's some information on what is actually allocated during normal use.
Basically it's the Path object (which is what you have highlighted in the second screenshot on the first page) and the path and vectorPath arrays (stored in the path object; these hold the calculated path as Node[] and Vector3[] representations). The first allocation I can do something about. I have actually tried; you can see the name of the function is GetFromPathPool. However, path pooling turned out to cause trouble too easily when users tried to use paths after they had been recycled (thinking they still contained the old info). I will try again to implement it in a better way.
The second allocation is the arrays. That's basically fixed, unless I store them as List<> objects and try to recycle those, but that would be quite messy I think.
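
As a rough illustration of the List<> recycling idea, here is a minimal sketch in plain C#. The names `ListPool<T>`, `Claim`, and `Release` are hypothetical and are not claimed to be part of the A* Pathfinding Project; the sketch is also not thread-safe, which would matter if paths are calculated on a worker thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not an A* Pathfinding Project API): a minimal pool that
// hands out reusable List<T> instances so repeated requests do not allocate
// a fresh list every time.
public static class ListPool<T>
{
    private static readonly Stack<List<T>> pool = new Stack<List<T>>();

    // Returns a pooled list if one is available, otherwise allocates a new one.
    public static List<T> Claim()
    {
        return pool.Count > 0 ? pool.Pop() : new List<T>();
    }

    // Clears the list (its capacity is kept) and stores it for reuse.
    // The caller must not touch the list again after releasing it.
    public static void Release(List<T> list)
    {
        if (list == null) throw new ArgumentNullException("list");
        list.Clear();
        pool.Push(list);
    }
}
```

A caller would `Claim()` a list, fill it, and `Release()` it once the path is no longer needed. The hazard described above still applies: code that keeps using a list after releasing it will see it cleared or refilled by someone else.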

That's basic use, however I think path modifiers are the cause of the most allocations, especially the Simple Smooth modifier since it allocates a path with a much higher resolution than the original path. That allocation is quite hard to reduce. I don't really know how to remove it.

Also, regarding other optimizations in the A* Pathfinding Project. Almost all optimizations you have brought up here are used to the fullest. Except the case of storing the array.length variable outside the loops. I did think that was optimized even in Mono, too bad it isn't.

I would really like the awesome optimizations which some C++ compilers, and to some extent Java, can do. Like optimizing this loop:

Code (csharp):

int x = 0;
for (int i = 0; i < n; i++) {
    x += i; // sums 0 + 1 + ... + (n-1)
}

to this:

Code (csharp):

int x = n*(n-1)/2;
half_voxel, Jun 23, 2012

#58
Games-Foundry

Joined:

May 19, 2011

Posts:

632

@Arges - around 50 active agents at one time I'd say using A*Pathfinding. However I finished up only using UnitySteer for two ( the ships ). It's probably an old version I was using as I haven't checked for an update in several months, so apologies if the tests highlight results that no longer exist - although from your posts it thankfully sounds like it was a useful exercise anyway for the non-cached transform. I've now migrated the ships to the List Graph in A*Pathfinding to reduce code dependencies.

@sturestone - more of a feature request to consider reducing the allocations through recycling than reporting any bad behaviour in A*Pathfinding. I've tried nearly every pathfinding package out there over the last 14 months and A*Pathfinding is still the best for RTS games imho. You'll notice from the 1million graph nodes that allocation is a significant challenge for us. Ideally I'd use a mix of navmesh and grid graphs, but connecting all the points together at such high resolution would be painful given the editor lag that occurs when you move around (even with no graphs shown, but the gameobject with the AstarPath component added selected ), and the fiddly connection system ( at least it's fiddly when dealing with 1 million nodes ). Any chance of an auto-connect feature? I also make use of the Texture penalties for the preferred highways which would need to work with navmeshes if we were to use them, and that may rule out their use. All our characters have the SimpleSmoothModifier further compounding the issue.

Games-Foundry, Jun 24, 2012

#59
Arges

Joined:

Oct 5, 2008

Posts:

359

gamesfoundry said: ↑

@Arges - around 50 active agents at one time I'd say using A*Pathfinding. However I finished up only using UnitySteer for two ( the ships ). It's probably an old version I was using as I haven't checked for an update in several months, so apologies if the tests highlight results that no longer exist - although from your posts it thankfully sounds like it was a useful exercise anyway for the non-cached transform.
Click to expand...

It probably was, but that's because the current version is still in development and likely to change, so I haven't made an asset store release. However, I very much appreciate it - I did find a few other cases where there were unnecessary allocations.

For reference, I was running some performance tests, and with 100 tightly-packed together boids (which increases the elements on the radar for each), each one with 4-5 active steering behaviors, the current version allocates about 0.9K on the very extreme case of every agent updating its radar on every frame.

These are likely mostly unavoidable now, since calls to Physics.OverlapSphere return an array (and we can't tell it to fill a pre-existing one), but can be significantly improved by a more reasonable approach of updating the radar only once or twice per second and staggering the updates, something that the queue settings on 2.5 allow.
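
The staggering idea can be sketched in plain C# as follows. This is a generic illustration, not UnitySteer's queue implementation; `StaggeredSchedule` and `IsDue` are hypothetical names, and the sketch assumes non-negative agent indices and frame numbers.

```csharp
using System;

// Hypothetical helper: spreads periodic work (for example radar refreshes)
// across frames so that only a fraction of the agents update on any one frame.
public sealed class StaggeredSchedule
{
    private readonly int intervalFrames;

    public StaggeredSchedule(int intervalFrames)
    {
        if (intervalFrames <= 0) throw new ArgumentOutOfRangeException("intervalFrames");
        this.intervalFrames = intervalFrames;
    }

    // True when the agent with the given index is due on the given frame.
    // Each agent is due once every intervalFrames frames, offset by its index.
    // Assumes agentIndex and frame are non-negative.
    public bool IsDue(int agentIndex, int frame)
    {
        return (frame + agentIndex) % intervalFrames == 0;
    }
}
```

In a Unity script the frame argument could come from `Time.frameCount`, with the expensive `Physics.OverlapSphere` call made only when `IsDue` returns true. With 100 agents and an interval of 30 frames, only three or four agents refresh on any given frame instead of all 100.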

Arges, Jun 25, 2012

#60
Games-Foundry

Joined:

May 19, 2011

Posts:

632

@Arges - UnitySteer is motion and steering though, right? Complementary to, but not including, pathfinding. One thing I was unsure of: let's say an agent is on the edge of a cliff walking towards another agent. I can't remember if this would be an issue, with the local avoidance potentially causing one agent to route off the cliff. How do you normally go about integrating UnitySteer local avoidance with, say, A*Pathfinding?

In my pursuit to improve performance in Folk Tale I've replaced the CharacterControllers with rigidbodies ( took about 30 mins mostly reconfiguring prefabs ) and there is a significant performance increase.

I've also moved LOD fading away from using iTween and into custom shaders. Feels like a decent performance boost and removes some allocations, so definitely worth doing. Cleaner all round.

Games-Foundry, Jun 25, 2012

#61
Arges

Joined:

Oct 5, 2008

Posts:

359

gamesfoundry said: ↑

@Arges - UnitySteer is motion and steering though, right? Complementary to, but not including, pathfinding. One thing I was unsure of: let's say an agent is on the edge of a cliff walking towards another agent. I can't remember if this would be an issue, with the local avoidance potentially causing one agent to route off the cliff. How do you normally go about integrating UnitySteer local avoidance with, say, A*Pathfinding?
Click to expand...

That's a bit of a complicated question, since it depends on the case. Check out the world for Hairy Tales for reference:

http://hairytalesgame.com/gallery/screenshots/

As you can see, there are empty spaces all around them that they could fall into, but I also want them walking around items in the scene (such as when they get into position to pick up the stone). In this case, what I did was implement my own RVO approach, which acts as a post-processing steering behavior (a new concept in UnitySteer 2.5) and makes the empty areas undesirable (unless they're jumping to their death). The way it works is that the main steering behaviors (path following, in your example) decide where the agent wants to go, while the post-processing ones correct that within certain parameters. My next project is likely to use UnitySteer even more than HairyTales, with more agents and more complex behaviors, so I expect I'll refine that further.

gamesfoundry said: ↑

In my pursuit to improve performance in Folk Tale I've replaced the CharacterControllers with rigidbodies ( took about 30 mins mostly reconfiguring prefabs ) and there is a significant performance increase.
Click to expand...

Definitely, in my tests agents with CharacterControllers are about four times as expensive as those with just a Rigidbody+Collider.

gamesfoundry said: ↑

I've also moved LOD fading away from using iTween and into custom shaders. Feels like a decent performance boost and removes some allocations, so definitely worth doing. Cleaner all round.
Click to expand...

You may also want to consider Prime31's GoKit: https://github.com/prime31/GoKit

By the way, if you're curious about the UnitySteer changes, here's a quick write-up: http://arges-systems.com/blog/2012/06/25/unitysteer-optimization-mobile-profiling/

Arges, Jun 25, 2012

#62
Games-Foundry

Joined:

May 19, 2011

Posts:

632

Arges said: ↑

By the way, if you're curious about the UnitySteer changes, here's a quick write-up: http://arges-systems.com/blog/2012/06/25/unitysteer-optimization-mobile-profiling/
Click to expand...

Now that's interesting...Unity's .rigidbody is more expensive than providing your own say .cachedRigidbody? I wonder if that is true for the other Unity variables like .audioSource and .collider. I'll have to run some tests and publish the results when I get some brain dead time.

32K to 1K is good going on the optimization front. Maybe it's time I try the local avoidance in UnitySteer again, especially since I'm now running on rigidbody characters.

Games-Foundry, Jun 25, 2012

#63
superpig

Drink more water! Unity Technologies

Joined:

Jan 16, 2011

Posts:

4,657

gamesfoundry said: ↑

Now that's interesting...Unity's .rigidbody is more expensive than providing your own say .cachedRigidbody? I wonder if that is true for the other Unity variables like .audioSource and .collider.
Click to expand...

Yes, I believe that's true - all those properties at least used to just do GetComponent() internally. I think the .transform got changed in 3.5 to cache the component, but I don't think they've changed the rest.

superpig, Jun 25, 2012

#64
Eric5h5

Volunteer Moderator Moderator

Joined:

Jul 19, 2006

Posts:

32,401

Superpig said: ↑

Yes, I believe that's true - all those properties at least used to just do GetComponent() internally. I think the .transform got changed in 3.5 to cache the component, but I don't think they've changed the rest.
Click to expand...

No, none of those are cached (not .transform either), they are the equivalent of using GetComponent.

--Eric

Eric5h5, Jun 25, 2012

#65
half_voxel

Joined:

Oct 20, 2007

Posts:

978

Why is that? For rigidbody and the like I can understand. But the transform component is always there, so why is it not cached? It is accessed so often that I don't see any reason not to cache it for a performance improvement, even if it is not large or even very noticeable.

half_voxel, Jun 25, 2012

#66
Eric5h5

Volunteer Moderator Moderator

Joined:

Jul 19, 2006

Posts:

32,401

Bluee.Eyess said: ↑

I do not really understand why people don't seem to use C++ plugins with Unity
Click to expand...

Not portable, more development effort, requires Pro, doesn't always actually gain enough speed to be worth it.

--Eric

Eric5h5, Jun 25, 2012

#67
npsf3000

Joined:

Sep 19, 2010

Posts:

3,830

Eric5h5 said: ↑

No, none of those are cached (not .transform either), they are the equivalent of using GetComponent.

--Eric
Click to expand...

Dude, I disproved that ages ago: http://forum.unity3d.com/threads/130365-CachedMB

npsf3000, Jun 25, 2012

#68
Eric5h5

Volunteer Moderator Moderator

Joined:

Jul 19, 2006

Posts:

32,401

You seem to have used the generic version of GetComponent, which is slower than the non-generic version. In any case, caching transform is faster than using .transform; nothing changed in this regard in Unity 3.5.

--Eric

Eric5h5, Jun 25, 2012

#69
n0mad

Joined:

Jan 27, 2009

Posts:

3,732

Eric5h5 said: ↑

You seem to have used the generic version of GetComponent, which is slower than the non-generic version.
Click to expand...

Oh crap ... :/

*looks at all the generics to convert to non-generic*
*take a shot of whisky*

n0mad, Jun 25, 2012

#70
npsf3000

Joined:

Sep 19, 2010

Posts:

3,830

Eric5h5 said: ↑

You seem to have used the generic version of GetComponent, which is slower than the non-generic version.
Click to expand...

Err... double check that. Last I checked, the generic GetComponent is only marginally slower, and the data I collected shows an order of magnitude difference.

The source is there - so feel free to run the tests.

npsf3000, Jun 25, 2012

#71
n0mad

Joined:

Jan 27, 2009

Posts:

3,732

NPSF3000 said: ↑

Err... double check that. Last I checked, the generic GetComponent is only marginally slower, and the data I collected shows an order of magnitude difference.

The source is there - so feel free to run the tests.
Click to expand...

I'd be interested in knowing that difference, even if it's measured in microseconds

n0mad, Jun 25, 2012

#72
npsf3000

Joined:

Sep 19, 2010

Posts:

3,830
n0mad said: ↑

I'd be interested in knowing that difference, even if it's measured in microseconds
Click to expand...

I can't do it now, got to go, but write something along the lines of:

Code (csharp):

//untested
void Start() {
    Transform trans;

    // Generic GetComponent<T>()
    var sw = System.Diagnostics.Stopwatch.StartNew();
    for (int i = 0; i < 1000000; i++) trans = GetComponent<Transform>();
    sw.Stop();
    print("GetComponent<> " + sw.ElapsedMilliseconds + "ms");

    // Non-generic GetComponent(Type)
    var sw2 = System.Diagnostics.Stopwatch.StartNew();
    for (int i = 0; i < 1000000; i++) trans = GetComponent(typeof(Transform)) as Transform;
    sw2.Stop();
    print("GetComponent " + sw2.ElapsedMilliseconds + "ms");

    // Built-in .transform property
    var sw3 = System.Diagnostics.Stopwatch.StartNew();
    for (int i = 0; i < 1000000; i++) trans = transform;
    sw3.Stop();
    print("Transform " + sw3.ElapsedMilliseconds + "ms");
}
npsf3000, Jun 25, 2012

#73
n0mad

Joined:

Jan 27, 2009

Posts:

3,732

Thanks for the effort, I'd test that and post it here asap.

result :

GetComponent<> : 767ms

GetComponent : 706ms

Transform : 68ms
Click to expand...

Ok, nothing to be afraid of, thank you.
(unit difference of 0.061 microseconds between generic and non-generic)

Last edited: Jun 25, 2012

n0mad, Jun 25, 2012

#74
Games-Foundry

Joined:

May 19, 2011

Posts:

632
Test Objective
To confirm that Unity's Component inherited variables .rigidbody, .transform etc are not cached

Case 1: Cached v Non-Cached Transform

Code (csharp):

public Transform cachedTransform;

public void Update ()
{
    NoCache ();
    Cached ();
}

// Reads the built-in .transform property on every iteration.
public void NoCache ()
{
    int i;
    Transform newTransform;
    for ( i=0; i<100000; i++ )
    {
        newTransform = transform;
    }
}

// Reads .transform once, then reuses the stored reference.
public void Cached ()
{
    int i;
    Transform newTransform;
    cachedTransform = transform;
    for ( i=0; i<100000; i++ )
    {
        newTransform = cachedTransform;
    }
}

Outcome:
* Metrics include deep profile code
- MonoBehaviour.transform takes 16.97ms ( 5.65 ms without deep profile )
- Caching takes 0.3ms ( 0.3ms to 0.8ms without deep profile )

Case 2: Cached v Non-Cached Rigidbody

Code (csharp):

public Rigidbody cachedRigidbody;

public void Update ()
{
    NoCache ();
    Cached ();
}

public void NoCache ()
{
    int i;
    Rigidbody newRigidbody;
    for ( i=0; i<100000; i++ )
    {
        newRigidbody = rigidbody;
    }
}

public void Cached ()
{
    int i;
    Rigidbody newRigidbody;
    cachedRigidbody = rigidbody;
    for ( i=0; i<100000; i++ )
    {
        newRigidbody = cachedRigidbody;
    }
}

Outcome:
* Metrics include deep profile code
- MonoBehaviour.rigidbody takes 17.84ms ( 6.87ms without deep profile )
- Caching takes 0.26ms ( 0.3ms to 0.8ms without deep profile )

Case 3: Cached v Non-Cached Audio

Code (csharp):

public AudioSource cachedAudioSource;

public void Update ()
{
    NoCache ();
    Cached ();
}

public void NoCache ()
{
    int i;
    AudioSource newAudioSource;
    for ( i=0; i<100000; i++ )
    {
        newAudioSource = audio;
    }
}

public void Cached ()
{
    int i;
    AudioSource newAudioSource;
    cachedAudioSource = audio;
    for ( i=0; i<100000; i++ )
    {
        newAudioSource = cachedAudioSource;
    }
}

Outcome:
* Metrics include deep profile code
- MonoBehaviour.audio takes 18.83ms ( 7.64ms without deep profile )
- Caching takes 0.36ms ( 0.3ms to 0.8ms without deep profile )

I think we've established a pattern here so I won't run all the tests.

Conclusion
Valid observation.
Component inherited variables are not cached.
Caching references is strongly recommended.
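
In practice the recommendation usually amounts to looking components up once and reusing the stored references. A minimal sketch follows, assuming a script attached to a GameObject that may or may not have a Rigidbody; the class and field names are illustrative only.

```csharp
using UnityEngine;

// Sketch: look the components up once in Awake and reuse the stored references.
public class CachedReferences : MonoBehaviour
{
    private Transform cachedTransform;
    private Rigidbody cachedRigidbody; // stays null if no Rigidbody is attached

    void Awake()
    {
        cachedTransform = transform;
        cachedRigidbody = GetComponent<Rigidbody>();
    }

    void Update()
    {
        // Hot path uses the cached field instead of the .transform property.
        cachedTransform.Rotate(0f, 90f * Time.deltaTime, 0f);
    }

    void FixedUpdate()
    {
        // Guard against the component being absent.
        if (cachedRigidbody != null)
        {
            cachedRigidbody.AddForce(Vector3.up);
        }
    }
}
```

Whether the saving matters depends on how often the accessor is hit per frame; the measurements above suggest it only becomes visible with many thousands of accesses.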
Last edited: Jun 26, 2012

Games-Foundry, Jun 25, 2012

#75
superpig

Drink more water! Unity Technologies

Joined:

Jan 16, 2011

Posts:

4,657

Ah, interesting. I could have sworn Aras or Lucas told me at one point that they'd optimized it in 3.5, but thinking about it, maybe I'm misremembering and they were talking about caching some of the properties on a Transform (like position/rotation properties etc).

superpig, Jun 25, 2012

#76
n0mad

Joined:

Jan 27, 2009

Posts:

3,732

Yup, interesting thanks.
(anyway caching is generally recommended to avoid mem bandwidth hammering)

n0mad, Jun 25, 2012

#77
Games-Foundry

Joined:

May 19, 2011

Posts:

632

n0mad said: ↑

Interesting topic.
While we're speaking about core data optimizations, I've shared a data struct format I came to create recently, which I'm using for very complicated AI calculations and predictions. In my project these operations are sometimes required more than once per frame (like in a row of deterministic calculation routines), so I had to find a way to cut the crap out of redundancy.
Click to expand...

Nice job n0mad. Filed that little gem away in the old gray matter should the need ever arise.

Games-Foundry, Jun 26, 2012

#78
Arowx

Joined:

Nov 12, 2009

Posts:

8,194

What about other optimisation tips?

E.g. using layers to identify types being faster than tags and names.

Atlasing Textures, Combining Meshes, Batching, Static, Reducing Draw Calls etc.?

Arowx, Jun 26, 2012

#79
angrypenguin

Joined:

Dec 29, 2011

Posts:

15,619

There's plenty of places for people to find out about them, but this thread is about scripting/allocations/garbage collection, rather than content generation.

angrypenguin, Jun 26, 2012

#80
jasonkaler

Joined:

Feb 14, 2011

Posts:

242

Superpig said: ↑

Ah, interesting. I could have sworn Aras or Lucas told me at one point that they'd optimized it in 3.5, but thinking about it, maybe I'm misremembering and they were talking about caching some of the properties on a Transform (like position/rotation properties etc).
Click to expand...

It is possible that there is a direct internal reference (ie cache in this context) in unity but that the real overhead is swapping between mono and the unity engine.

I would assume that mono referencing a unity object is slower than mono referencing a mono object.

Anyway, it seems like the best practice is to
a) cache references in script
b) fetch references as seldom as possible
c) allocate memory as seldom as possible
d) Just a jump back to that pathfinding tangent: if you have a huge list of stuff, e.g. 1 million nodes, try using a hierarchy instead of a single collection. I personally would try a high-level graph linking areas, with a separate graph within each area.

jasonkaler, Jun 26, 2012

#81
npsf3000

Joined:

Sep 19, 2010

Posts:

3,830

gamesfoundry said: ↑

Conclusion
...
Component inherited variables are not cached, simply calling GetComponent<T>.
Click to expand...

I love how after that entire post you didn't actually check this.

Even after sample code was posted.

Even after contrary results were obtained and posted.

Last edited: Jun 26, 2012

npsf3000, Jun 26, 2012

#82
Games-Foundry

Joined:

May 19, 2011

Posts:

632
NPSF3000 said: ↑

I love how after that entire post you didn't actually check this.

Even after sample code was posted.

Even after contrary results were obtained and posted.
Click to expand...

Took me a moment to catch on there. Ok, so the observation of what is happening is incorrect and needs revising, but the overall conclusion that developers should cache references still stands...

Code (csharp):

public Transform cachedTransform;

void Start()
{
    Transform trans;

    var sw = System.Diagnostics.Stopwatch.StartNew();
    for (int i=0; i<1000000; i++) trans = GetComponent<Transform>();
    sw.Stop();
    print("GetComponent<> " + sw.ElapsedMilliseconds + "ms");

    var sw2 = System.Diagnostics.Stopwatch.StartNew();
    for (int i=0; i<1000000; i++) trans = GetComponent(typeof(Transform)) as Transform;
    sw2.Stop();
    print("GetComponent " + sw2.ElapsedMilliseconds + "ms");

    var sw3 = System.Diagnostics.Stopwatch.StartNew();
    for (int i=0; i<1000000; i++) trans = transform;
    sw3.Stop();
    print("Transform " + sw3.ElapsedMilliseconds + "ms");

    var sw4 = System.Diagnostics.Stopwatch.StartNew();
    cachedTransform = transform;
    for (int i=0; i<1000000; i++) trans = cachedTransform;
    sw4.Stop();
    print("Cached Transform " + sw4.ElapsedMilliseconds + "ms");
}

GetComponent<> 663ms
GetComponent 592ms
Transform 54ms
Cached Transform 2ms
Games-Foundry, Jun 26, 2012

#83
superpig

Drink more water! Unity Technologies

Joined:

Jan 16, 2011

Posts:

4,657

Oh. Well that makes it look like .transform is cached on Unity's end, and the 54ms is managed<->native transition overhead.

superpig, Jun 26, 2012

#84
npsf3000

Joined:

Sep 19, 2010

Posts:

3,830
gamesfoundry said: ↑

Took me a moment to catch on there. Ok, so the observation of what is happening is incorrect and needs revising, but the overall conclusion that developers should cache references still stands...
Click to expand...

That was not disputed. What I disputed was the claim that transform is the same as GetComponent<Transform>() / GetComponent(typeof(Transform)) as Transform. This is borne out by the factor of 10 speed difference in these simple tests.

I have an entire thread about the idea of caching components, and while I do agree there may be benefits:

You are using more memory, and making code more complex.

There are issues around ensuring the cache is accurate - most components can be added/removed during runtime [transform is an exception AFAIK].

I personally think the concept of caching references often suffers from premature usage - where one focuses on one type of efficiency [saving a few ns per frame] and forgets about others [e.g. the dev time maintenance of the extra LOC]. Most of the time I believe the tradeoff isn't worth it.
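
One way to address the cache-accuracy concern is a lazily filled property that re-fetches when the stored reference has gone missing. This is a sketch under the assumption that the component may be removed and re-added at runtime; `LazyCachedBody` and `Body` are illustrative names.

```csharp
using UnityEngine;

public class LazyCachedBody : MonoBehaviour
{
    private Rigidbody body;

    // Re-fetches when the reference was never set or the component was destroyed.
    // Unity overloads == so that a destroyed component compares equal to null.
    protected Rigidbody Body
    {
        get
        {
            if (body == null)
            {
                body = GetComponent<Rigidbody>();
            }
            return body;
        }
    }
}
```

The tradeoff is the one raised above: an extra null check per access and a few more lines to maintain. If no Rigidbody is attached at all, every access falls through to GetComponent, so the pattern only pays off when the component is usually present.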
Last edited: Jun 26, 2012

npsf3000, Jun 26, 2012

#85
hippocoder

Digital Ape

Joined:

Apr 11, 2010

Posts:

29,723

no offence but made any mobile games recently that run at 60fps? I have, and believe me you need to be anal about caching. End of.

While I'm usually mr practical and I don't care much for wasting my time on useless efforts, recommending that people don't cache stuff or even implying it is pretty bad advice for mobile.

Let's look at functions which can get called several times per frame per object, such as collisions, and there becomes a real need to do so even if the object count is relatively low.

You know that, I know that, but I'm posting this for forum ref.

Last edited: Jun 26, 2012

hippocoder, Jun 26, 2012

#86
Games-Foundry

Joined:

May 19, 2011

Posts:

632

Let's try to keep the discussion away from speculative opinion and focus on discussing empirical evidence and points for consideration.

Anyone got any other interesting code optimizations that are worth testing?

Games-Foundry, Jun 26, 2012

#87
npsf3000

Joined:

Sep 19, 2010

Posts:

3,830

hippocoder said: ↑

no offence but made any mobile games recently that run at 60fps? I have, and believe me you need to be anal about caching. End of.
Click to expand...

Nope, I didn't know that was the only type of application one could release with unity

Seriously though, what performance gains did you get - and how many calls did you cache to get those improvements?

While I'm usually mr practical and I don't care much for wasting my time on useless efforts, recommending that people don't cache stuff or even implying it is pretty bad advice for mobile.
Click to expand...

What did I say?

I personally think the concept of caching references often suffers from premature usage - where one focuses on one type of efficiency [saving a few ns per frame] and forgets about others [e.g. the dev time maintenance of the extra LOC]. Most of the time I believe the tradeoff isn't worth it.

I do not limit myself to mobile, nor do I say caching is bad. What I do is highlight the negative side of the equation that people often forget. I also maintain it's my opinion based on my experiences. It's also important to note that this advice is based around the .transform variable, which is already cached and is an order of magnitude faster than GetComponent.

Last edited: Jun 26, 2012

npsf3000, Jun 26, 2012

#88
tatoforever

Joined:

Apr 16, 2009

Posts:

4,368

NPSF3000 said: ↑

Nope, I didn't know that was the only type of application one could release with Unity.

Seriously though, what performance gains did you get - and how many calls did you cache to get those improvements?

What did I say?

I personally think the concept of caching references often suffers from premature usage, where one focuses on one type of efficiency [saving a few ns per frame] and forgets about others [e.g. the dev time and maintenance of the extra LOC]. Most of the time I believe the tradeoff isn't worth it.

I do not limit myself to mobile, nor do I say caching is bad. What I do is highlight the negative side of the equation that people often forget. I also maintain it's my opinion based on my experiences. It's also important to note that this advice is based around the .transform variable, which is already cached and is an order of magnitude faster than GetComponent.

The Transform component is already cached? I don't think so.

tatoforever, Jun 26, 2012

#89
superpig

Drink more water! Unity Technologies

Joined:

Jan 16, 2011

Posts:

4,657

tatoforever said: ↑

The Transform component is already cached? I don't think so.

GetComponent 592ms
Transform 54ms

If .transform isn't cached, how do you explain the fact that it's ten times faster than GetComponent?

superpig, Jun 27, 2012

#90
Eric5h5

Volunteer Moderator Moderator

Joined:

Jul 19, 2006

Posts:

32,401

It's slower than manually storing transform in a variable, so I guess it's semi-cached. What is it actually doing behind the scenes?

--Eric

Eric5h5, Jun 27, 2012

#91
tatoforever

Joined:

Apr 16, 2009

Posts:

4,368

Superpig said: ↑

If .transform isn't cached, how do you explain the fact that it's ten times faster than GetComponent?

I think .transform gets cached the first time you use it (not quite sure though), while GetComponent<T>() actually gets the reference every time you use it (slower).
The best thing to do is cache the .transform in a variable before using it (that's what I do, and I found it even faster than the so-called "cached" .transform itself).

Last edited: Jun 27, 2012

tatoforever, Jun 27, 2012

#92
lilymontoute

Joined:

Feb 8, 2011

Posts:

1,181

tatoforever said: ↑

I think .transform gets cached the first time you use it (not quite sure though), while GetComponent<T>() actually gets the reference every time you use it (slower).
The best thing to do is cache the .transform in a variable before using it (that's what I do, and I found it even faster than the so-called "cached" .transform itself).

This. Caching the transform as a local variable is faster than storing it as a field in a class and caching it on initialization. There was some test of this somewhere on the forums, actually.

lilymontoute, Jun 27, 2012

#93
Games-Foundry

Joined:

May 19, 2011

Posts:

632

Thinksquirrel said: ↑

This. Caching the transform as a local variable is faster than storing it as a field in a class and caching it on initialization. There was some test of this somewhere on the forums, actually.

Based on the results of the local vs. member variable performance test, this would make sense.
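For reference, a comparison of that kind might look roughly like the sketch below. The code of the earlier test is not repeated on this page, so the class name `LocalVsMember`, its methods and the loop body are all hypothetical stand-ins, written in plain C# so they run outside Unity. Timings will vary by runtime and hardware, so none are quoted here.

```csharp
using System;
using System.Diagnostics;

// Hypothetical stand-in for a local vs. member variable test (not the original code).
public class LocalVsMember
{
    private int memberTotal; // instance field: each update is a load/store through `this`

    // Accumulates directly into the field.
    public int SumIntoMember(int iterations)
    {
        memberTotal = 0;
        for (int i = 0; i < iterations; i++) memberTotal += i & 1;
        return memberTotal;
    }

    // Accumulates into a local and writes the field once at the end.
    public int SumIntoLocal(int iterations)
    {
        int total = 0;
        for (int i = 0; i < iterations; i++) total += i & 1;
        memberTotal = total;
        return total;
    }

    // Times both variants with the same iteration count used elsewhere in the thread.
    public static void Run()
    {
        const int iterations = 10000000;
        var test = new LocalVsMember();

        var sw = Stopwatch.StartNew();
        int viaMember = test.SumIntoMember(iterations);
        sw.Stop();
        Console.WriteLine("Member variable " + sw.ElapsedMilliseconds + "ms");

        sw = Stopwatch.StartNew();
        int viaLocal = test.SumIntoLocal(iterations);
        sw.Stop();
        Console.WriteLine("Local variable " + sw.ElapsedMilliseconds + "ms");

        Console.WriteLine("Results match: " + (viaMember == viaLocal));
    }
}
```

Calling `LocalVsMember.Run()` from any entry point prints both timings. Both methods return the same sum, so the comparison only measures where the running total lives.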

Games-Foundry, Jun 27, 2012

#94
tatoforever

Joined:

Apr 16, 2009

Posts:

4,368

Thinksquirrel said: ↑

This. Caching the transform as a local variable is faster than storing it as a field in a class and caching it on initialization. There was some test of this somewhere on the forums, actually.
Click to expand...

Sorry,
What I meant was that storing .transform in a variable before using it (e.g. in Awake/Start) is actually faster than using .transform directly. I ran a bunch of tests and it is faster (but only at the beginning).
[EDIT]
Also, I can confirm (as you previously mentioned) that it is also faster than using a Transform variable declared as a public field. The reason is that it does exactly the same thing as .transform (it gets cached only the first time you use it).
[EDIT2]
I wonder whether using .transform makes the GC work a lot more. Imagine that the transform gets cached only when you use it the first time, but what if the GC deletes its reference after a little while, and then when you call .transform it gets cached again? This would result in some intermittent hiccups. While I cannot confirm that's how things actually behave, I know that caching the .transform in a variable made my game a lot faster, and lots of hiccups disappeared mysteriously. xD

Last edited: Jun 27, 2012

tatoforever, Jun 27, 2012

#95
angrypenguin

Joined:

Dec 29, 2011

Posts:

15,619

Why would the GC delete your reference? The GC's job is to delete data once all references to it are removed, not to go around randomly nerfing your references.

angrypenguin, Jun 27, 2012

#96
tatoforever

Joined:

Apr 16, 2009

Posts:

4,368

angrypenguin said: ↑

Why would the GC delete your reference? The GC's job is to delete data once all references to it are removed, not to go around randomly nerfing your references.

I'm perfectly aware of what the GC does. xD
But what if the .transform reference is used/cached only temporarily? It then gets wiped out by the GC later, and the next time you try to use it, it gets cached again. I'm not exactly sure what really happens when you use .transform, but why is storing it in a variable faster than simply using .transform?
Those are just some small reflections/assumptions about what could possibly be happening when you use .transform.

Last edited: Jun 27, 2012

tatoforever, Jun 27, 2012

#97
superpig

Drink more water! Unity Technologies

Joined:

Jan 16, 2011

Posts:

4,657

tatoforever said: ↑

I think .transform gets cached the first time you use it (not quite sure though), while GetComponent<T>() actually gets the reference every time you use it (slower).
The best thing to do is cache the .transform in a variable before using it (that's what I do, and I found it even faster than the so-called "cached" .transform itself).

Ah, I see. We're talking about different 'caching' situations: caching it ourselves on the managed side (which is by far the fastest option), versus Unity caching it on the native side (still faster than GetComponent, but a lot slower than managed-side caching due to the overhead of native<->managed transitions).
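A minimal sketch of the managed-side option, assuming a hypothetical component called `CachedTransformMover` (the name and the movement in `Update` are only for illustration and are not from anyone's project in this thread):

```csharp
using UnityEngine;

// Hypothetical example component; attach to any GameObject.
public class CachedTransformMover : MonoBehaviour
{
    // Managed-side cache: an ordinary C# field holding the reference.
    private Transform cachedTransform;

    private void Awake()
    {
        // One access to the .transform property up front.
        cachedTransform = transform;
    }

    private void Update()
    {
        // Each frame reads the field rather than going through the .transform property again.
        cachedTransform.Translate(Vector3.forward * Time.deltaTime);
    }
}
```

Whether the saving matters depends on how often the property would otherwise be hit per frame, so it is worth profiling before and after rather than assuming a gain.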

superpig, Jun 27, 2012

#98
tatoforever

Joined:

Apr 16, 2009

Posts:

4,368

Superpig said: ↑

Ah, I see. We're talking about different 'caching' situations: caching it ourselves on the managed side (which is by far the fastest option), versus Unity caching it on the native side (still faster than GetComponent, but a lot slower than managed-side caching due to the overhead of native<->managed transitions).

Yes

tatoforever, Jun 27, 2012

#99
Games-Foundry

Joined:

May 19, 2011

Posts:

632
An interesting, uncommon and possibly pointless architectural test here. While there are huge benefits to events that aren't covered here - such as any object being able to become a subscriber - I wanted to test if it would outperform class inheritance overrides, and thus provide an alternative architectural model with better performance.

If this is of interest to other developers, perhaps they can advise on more appropriate test code?

Test Objective
To test the performance of self-targeted events v. inheritance-based overrides.

Code (csharp):

using UnityEngine;

// Test harness. The class name is hypothetical; the original post shows only the members.
public class EventVsOverrideTest : MonoBehaviour
{
    public GoldenRetriever goldenRetriever;

    public void Awake ()
    {
        // AddComponent runs GoldenRetriever.Awake, which subscribes to Dog.Bark
        goldenRetriever = gameObject.AddComponent<GoldenRetriever>();
    }

    public void Start ()
    {
        // Event based calling
        var sw1 = System.Diagnostics.Stopwatch.StartNew();
        for ( int i=0; i<10000000; i++ ) goldenRetriever.DoBarkEvent ();
        sw1.Stop();
        print( "Event based calling " + sw1.ElapsedMilliseconds + "ms" );
        print( "Counter->" + Dog.counter );

        // Override based calling (counter keeps accumulating from the first loop)
        var sw2 = System.Diagnostics.Stopwatch.StartNew();
        for ( int i=0; i<10000000; i++ ) ( goldenRetriever as Dog ).DoBarkOverride ();
        sw2.Stop();
        print( "Override based calling " + sw2.ElapsedMilliseconds + "ms" );
        print( "Counter->" + Dog.counter );
    }
}

Code (csharp):

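using System;
using UnityEngine;

public class Dog : MonoBehaviour
{
    // Static event: shared by all Dog instances, so every subscriber runs on each raise
    public static event Action Bark;
    public static int counter;

    // Intentionally empty; subclasses supply the behaviour
    public virtual void DoBarkOverride ()
    {
    }

    public void DoBarkEvent ()
    {
        if ( Bark != null )
            Bark ();
    }
}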

Code (csharp):

public class GoldenRetriever : Dog
{
    public void Awake ()
    {
        Dog.Bark += OnBark;
    }

    public void OnDestroy ()
    {
        // Unsubscribe so the static event does not keep a destroyed object alive
        Dog.Bark -= OnBark;
    }

    // Event handler
    public void OnBark ()
    {
        counter++;
    }

    // Inheritance override version
    public override void DoBarkOverride ()
    {
        counter++;
    }
}

Outcome:
Inheritance-based overriding appears at first glance to be consistently quicker than self-targeted events.

Event-based calling 145ms
Override-based calling 80ms
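
For anyone who wants to repeat the comparison outside the Unity editor, here is a rough plain C# sketch of the same idea. It is an adaptation under assumptions, not the code that produced the numbers above: the classes drop MonoBehaviour, subscription moves into a constructor, `BarkBenchmark` is a hypothetical name, and one warm-up call is added so JIT compilation is not timed.

```csharp
using System;
using System.Diagnostics;

public class Dog
{
    public static event Action Bark;
    public static int counter;

    public virtual void DoBarkOverride() { }

    public void DoBarkEvent()
    {
        // Copy to a local so the null check and the invocation see the same delegate.
        Action handler = Bark;
        if (handler != null) handler();
    }
}

public class GoldenRetriever : Dog
{
    // Stands in for the Awake() subscription in the Unity version.
    public GoldenRetriever() { Dog.Bark += OnBark; }

    public void OnBark() { counter++; }

    public override void DoBarkOverride() { counter++; }
}

public static class BarkBenchmark
{
    public const int Iterations = 10000000;

    public static void Run()
    {
        Dog dog = new GoldenRetriever();

        // Warm-up: one call of each path, then reset the shared counter.
        dog.DoBarkEvent();
        dog.DoBarkOverride();
        Dog.counter = 0;

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++) dog.DoBarkEvent();
        sw.Stop();
        Console.WriteLine("Event based calling " + sw.ElapsedMilliseconds + "ms");
        Console.WriteLine("Counter->" + Dog.counter);

        sw = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++) dog.DoBarkOverride();
        sw.Stop();
        Console.WriteLine("Override based calling " + sw.ElapsedMilliseconds + "ms");
        Console.WriteLine("Counter->" + Dog.counter);
    }
}
```

Call `BarkBenchmark.Run()` from an entry point to print both timings. Because `Bark` is static, every additional `GoldenRetriever` instance adds another handler to each raise, so the sketch deliberately uses a single instance to keep the two paths doing the same amount of work.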
Last edited: Jun 28, 2012

Games-Foundry, Jun 28, 2012

#100
