Feedback How to survive without smart pointers?

SamOld · Oct 19, 2020

One thing that I repeatedly come up against as a pain point when trying to build larger scale projects in the DOTS ecosystem is the lack of smart pointers, which are C++'s solution to managing memory ownership.

Without GC, we spend a lot of time dealing with manually allocated memory which we must manually dispose of exactly once. This gets tricky if we want to share that memory in multiple places, or build modular systems that don't know the desired ownership semantics of their inputs.

C++ solves this classic problem with "smart pointers". These are types that thinly wrap pointers and track their ownership. unique_ptr<T> is a pointer type that restricts copying in such a way as to only ever have one owner, and shared_ptr<T> is a pointer type that does reference counting to allow multiple pieces of code to share ownership of one object and deallocate it when the last reference disappears.

C# does not have these types, and I believe that it is not possible to build good implementations of them due to the lack of C++ features like user-defined copy constructors and deterministic struct destructors.

How are we meant to manage memory resource ownership without smart pointers? What patterns are you using, and what does Unity recommend as the best practice here? I think it would be good to get some input from the Unity staff on this!

I'll give an example of a problem situation that I've hit quite often.

I have some small widely reusable type that needs to be allocated. Let's say a NativeAnimationCurve type that can be built from the standard managed AnimationCurve and holds its data in a private NativeArray. It's an exact - but perhaps immutable - AnimationCurve equivalent for use from DOTS code, and many different pieces of code may want to consume it in different ways.

I have some composition type that holds - amongst other things - one or more NativeAnimationCurves. Let's say that this composed type is ParticleBehaviour. When building this type, I need to pass in the curves. Now who owns them?

If ParticleBehaviour is disposed of, should it dispose the curves? If yes, then that design decision makes it impossible to share that curve instance anywhere else. If no, then I have to somehow track and dispose the curves independently of the ParticleBehaviour, even if I only end up using them in one place.

When I implement the NativeAnimationCurve, I don't know where and how I will be consuming it. In fact, it's probably in multiple places in different ways. When I implement the ParticleBehaviour, I don't know how I'll be providing the curves or whether they will be shared data. Separation of concerns says that neither should need to know that detail about the other, and this is a requirement for modular reusability and composition.

The clean solution to this is to use value semantics and allocate copies of the curves going into the composed type. Of course this can be a performance issue for both speed and memory if it's done either frequently or many times.

Even if I choose one of these options, I then have to find a way of cleanly documenting it on the API of the type, and ideally enforcing correct usage.
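
One way to put that decision on the API itself, rather than only in comments, is an explicit ownership flag at construction time, in the spirit of the leaveOpen parameter on .NET's StreamReader. What follows is a minimal sketch and not a Unity API: CurveData, ParticleBehaviourSketch and ownsCurve are hypothetical names, and a managed class stands in for the native container.

```csharp
using System;

// Hypothetical stand-in for the NativeAnimationCurve described above.
public sealed class CurveData : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

// Hypothetical composition type. The caller states ownership when constructing it.
public sealed class ParticleBehaviourSketch : IDisposable
{
    private readonly CurveData _curve;
    private readonly bool _ownsCurve;

    public ParticleBehaviourSketch(CurveData curve, bool ownsCurve)
    {
        _curve = curve ?? throw new ArgumentNullException(nameof(curve));
        _ownsCurve = ownsCurve;
    }

    public void Dispose()
    {
        // Only release the curve if ownership was handed over to this instance.
        if (_ownsCurve) _curve.Dispose();
    }
}
```

This does not remove the design decision, it only makes it visible at every call site. Nothing stops a caller from passing ownsCurve: true and then continuing to use the curve.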

This is cumbersome, bad for maintainability, and makes me need to pause and think about a far reaching design decision when doing something that should be as simple as creating a quick composition type.

How are people handling this practically, and what is the Unity staff recommended way to deal with this fundamental challenge introduced by the allocator / Dispose pattern?

I'm marking this as feedback rather than help wanted because I'd like staff attention and because I'm explaining a pain point, but I would also like practical tips. If there's a good solution available here, then it at least needs better documentation.

Joachim_Ante · Oct 19, 2020

unique_ptr & shared_ptr are terrible solutions. In our C++ unity codebase we ban their usage...
That said, the problem you describe is very real...

Memory allocations need to be hierarchical so that ownership can be handled hierarchically.
Safety Handles need to understand the hierarchical nature of allocations and find any incorrect usage. And they need to understand lifetime.

E.g. a world owns systems, and systems own sets of containers.
If I destroy the world, why do I have to dispose the containers in the systems explicitly?

Additionally, you might want to have allocations that have their lifetime predetermined up front. E.g. this allocation lives for exactly one frame, etc.

We are working on making this simpler. And in the process not only making it simpler but also making allocations much cheaper. Because when you destroy things in batch, you can save a lot of time...
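
The hierarchical ownership idea can be approximated today in plain managed C#. This is only a sketch of the ownership shape, under the assumption that each resource implements IDisposable. OwnershipScope, CreateChild, Own and TrackedBuffer are hypothetical names, not Unity types, and the sketch does nothing to make allocation or batch destruction cheaper.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical owner node: disposing it disposes everything registered under it.
public sealed class OwnershipScope : IDisposable
{
    private readonly List<IDisposable> _owned = new List<IDisposable>();
    private bool _disposed;

    // Creates a child scope whose lifetime is bounded by this scope.
    public OwnershipScope CreateChild() => Own(new OwnershipScope());

    // Registers a resource with this scope and returns it for convenience.
    public T Own<T>(T resource) where T : IDisposable
    {
        if (_disposed) throw new ObjectDisposedException(nameof(OwnershipScope));
        _owned.Add(resource);
        return resource;
    }

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;
        // Release in reverse order of registration, child scopes included.
        for (int i = _owned.Count - 1; i >= 0; i--) _owned[i].Dispose();
        _owned.Clear();
    }
}

// Hypothetical resource used only to observe disposal.
public sealed class TrackedBuffer : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}
```

With this shape, a "world" scope can create a "system" child scope that owns its buffers, and disposing the world alone releases everything below it.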

Lieene-Guo · Oct 19, 2020

SamOld said: ↑
One thing that I repeatedly come up against as a pain point when trying to build larger scale projects in the DOTS ecosystem is the lack of smart pointers, which are C++'s solution to managing memory ownership.

I made a game with OGRE 12 years ago. I didn't use smart pointers, and I survived...

Lieene-Guo · Oct 19, 2020

For curves, see https://forum.unity.com/threads/a-fast-blobcurve.985941/
For all shared data, use Blobs.

SamOld · Oct 19, 2020

Joachim_Ante said: ↑

unique_ptr & shared_ptr are terrible solutions. In our C++ unity codebase we ban their usage...

Fair enough! I don't have enough C++ experience to have come to that conclusion yet and have seen them widely lauded as the best solution without deep language integration.

Joachim_Ante said: ↑

We are working on making this simpler. And in the process not only making it simpler but also making allocations much cheaper. Because when you destroy things in batch, you can save a lot of time...

That sounds very exciting! I would love to know more about the approach that you're taking here if you're willing to share any details or have something hidden away in a blog post or something?

In parallel to this thread, I've also created a thread on the C# GitHub talking about how close we could get to a decent smart pointer implementation in C#, because I'm trying to find a practical solution for use today. I guess that I should link that from here.

snacktime · Oct 19, 2020

It's not so much a problem atm precisely because hierarchy is not really supported, so you just don't use it much.

That and games have a small set of dominant scopes like frame, scene, etc. So it's not nearly as bad as the generic case.

If you are building a framework/engine you leverage that context. Solving in a generic way doesn't really make sense, because most things fall within the specific patterns of the framework. Leveraging that, you can do things like create very efficient reference tracking for framework-specific scopes. Some things will always fall outside the norm, but that's just inherent in the approach. If you capture enough, it's still a huge win.

With hierarchy, at some point the correct approach is just to design it out. But I think this is challenging, because hierarchy can be a good approach or not depending on the specific context. So how do you guide people who, lacking experience, would just default to hierarchy where it's not the right choice, while still providing the choice where there are obvious use cases? Unity has shown a fairly consistent pattern here of defaulting to restrictive and then opening up based on actual use cases. I can't really fault that approach, even if it's at times frustrating to code around as they work stuff out.

burningmime · Oct 20, 2020

Joachim_Ante said: ↑

unique_ptr & shared_ptr are terrible solutions. In our C++ unity codebase we ban their usage...

I can agree about shared_ptr (it's useful occasionally in multithreading), but unique_ptr is just an RAII-friendly version of new/delete. When I interned at a C++ shop, the rule was the opposite: "no explicit new/delete without justification".

SamOld · Oct 20, 2020

snacktime said: ↑

It's not so much a problem atm precisely because hierarchy is not really supported, so you just don't use it much.

Perhaps I'm missing an architectural trick, but isn't this hugely restrictive? I don't know how to do this without sacrificing modular reusability, abstraction, and immutability. Obviously hierarchy isn't needed much for data in the ECS, but in other places I run up against it a lot. This is particularly true of small highly reusable utility types like NativeAnimationCurve which logically get assembled together. This is generally an easy problem to solve, but the solutions tend not to feel clean and maintainable, and instead rely heavily on documentation.

I find that my code is littered with param comments like "Curve is NOT disposed, and should be kept alive as long as this struct!", and "Curve is NOT disposed, but may be disposed as soon as this function returns!", and "Claims ownership of curve! Do NOT use or dispose of curve after this call!"

It works, and perhaps "survive" in the title is a little melodramatic, but it feels very unmaintainable and error prone. I try to overcome this with consistent patterns, but even then they must be documented everywhere.

I'm doing something that I'm prone to, which is letting the perfect be the enemy of the working. I'm sure that everyone with extensive C++ experience just thinks that I'm being a baby.

snacktime said: ↑

So how do you guide people who, lacking experience, would just default to hierarchy where it's not the right choice, while still providing the choice where there are obvious use cases?

Perhaps I'm "people" in this quote. I generally feel like I'm using hierarchy in only the right places, but this still comes up often enough to be an annoyance. This comes up more in the data loading stage pre-gameplay than during gameplay itself, where everything is usually blobified. But building those blobs can be a mess.

SamOld · Oct 20, 2020

burningmime said: ↑

I can agree about shared_ptr (it's useful occasionally in multithreading), but unique_ptr is just an RAII-friendly version of new/delete. When I interned at a C++ shop, the rule was the opposite: "no explicit new/delete without justification".

I admit that I was wanting to use shared_ptr in ways that those with more experience than me seem to think are a bad idea.

My aim was to be able to write code with the rule that each unit always disposes of what it's given, without concern for whether the calling code may want to keep using it. Disposing then means "I'm done with this", but reference counting or similar can allow the calling code to hang onto it if it wants to keep using it.
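
To illustrate the rule described here, this is roughly what manual reference counting looks like in C# today. It is a sketch under stated assumptions: SharedCurve, Retain and Release are hypothetical names, the type is not thread-safe, and a flag stands in for freeing native memory. Because C# cannot hook copies of a reference, the caller has to remember to call Retain before handing the curve to a consumer, which is exactly the part that shared_ptr automates.

```csharp
using System;

// Hypothetical shared resource with manual, single-threaded reference counting.
public sealed class SharedCurve
{
    private int _refCount = 1; // the creator holds the first reference

    public bool IsReleased { get; private set; }

    // Call before handing the curve to a consumer that will Release it.
    public SharedCurve Retain()
    {
        if (IsReleased) throw new ObjectDisposedException(nameof(SharedCurve));
        _refCount++;
        return this;
    }

    // "I'm done with this." The data is freed when the last holder releases.
    public void Release()
    {
        if (IsReleased) throw new InvalidOperationException("Released more times than retained.");
        _refCount--;
        if (_refCount == 0) IsReleased = true; // real code would free native memory here
    }
}
```

A consumer can then always call Release on what it is given, and the caller keeps the curve alive by holding its own reference.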

I know that normally it would be preferred to dispose in the same place where it's created, but I find that this gets messy when I sometimes want to move that data into a long-lived structure. I either have to make a copy of it and let the original be disposed, or document that in this case, calling the function actually claims ownership.

I'm sure that this is lack of experience speaking. I have plenty of experience in managed languages, and almost none in languages where I have to manage memory manually. I think it's a problem that Unity is currently targeting users with the former skillset without providing them with the tools or guidance to make the transition cleanly, but from Joachim_Ante's post in this thread it sounds like they have some sort of plan for that!

This thread was partly a plea for Unity to make this easier, but also a call for guidance for those of us in my situation from community members who have the relevant experience. I'm keen to hear what patterns people use to deal with this! I know that the C++ answer is generally RAII, but unless I'm mistaken we can't do that reliably without smart pointers, can we? RAII is about clean ownership management, which is the problem I'm struggling to find a solution for. Perhaps I've just missed a way to reliably apply that pattern in C#!

I don't know how to achieve RAII without struct destructors. The next best thing seems to be using scopes, but then I have to make assumptions in certain functions that they're going to be called in safe ways, which seems like a violation of separation of concerns. Perhaps that's where I'm worrying too much and experience would tell me to not care.

Joachim_Ante · Oct 20, 2020

I took a couple of code examples, that should illustrate concretely the direction we are trying to take with allocators going forward. This is currently work in progress. I hope the examples make the hierarchical nature of allocators clear.

As always, all of the samples show how to write safe multithreaded code. We will ensure that any incorrect usage is checked using the same system we already employ for containers: lifetime, double dispose, hierarchical destruction, and incorrect parallel access.

We think this is an optimal combination of ease of use / performance / safety.

Code (CSharp):

// **** 1. Bump Allocator usage in inner loops

// Fastpath - Allocation in 2-3 instructions
var bumpAlloc = new BumpAllocator(1024, Allocator.Persistent);
var array = new NativeArray<float>(1024, bumpAlloc);
bumpAlloc.Dispose();

// Generic path - slow (via function pointer); an alternative to the fast path above
Allocator bumpAlloc = Allocator.CreateAllocator(new BumpAllocator(1024), Allocator.Persistent);
var array = new NativeArray<float>(1024, bumpAlloc);
bumpAlloc.Dispose();

Code (CSharp):

// **** 2. Reusing memory with stack alloc
StackAllocator stack = new StackAllocator(1024, Allocator.Persistent);

var array = new NativeArray<float>(1024, ref stack);
array.Dispose(ref stack);

// reuse memory from previous allocation
var array2 = new NativeArray<float>(1024, ref stack);
array2.Dispose(ref stack);

Code (CSharp):

class BoidsSystem : ISystemBase
{
    void OnUpdate(ref SystemState state)
    {
        var jobHandle = new MyJob
        {
            // NOTE: no explicit deallocation because the life time is determined up front
            array = new NativeArray<float3>(1000, ref state.FrameLinearAllocatorFastPath),
        }.Schedule();
    }

    struct MyJob : IJobParallelFor
    {
        NativeArray<float3> array;
        ComponentDataFromEntity<Translation> Translation;

        void Execute(int index)
        {
            array[index] = 5;
            Translation[...] = array[...];
        }
    }
}

Code (CSharp):

// Usage code
struct MyJob : IJobFor
{
    BumpAllocator bumpAlloc;

    // This is called before the first Execute method on this thread
    void Begin()
    {
        bumpAlloc = new BumpAllocator(1024, Allocator.Temp);
    }

    // This is called after the last Execute method is called on this thread
    void End()
    {
        bumpAlloc.Dispose();
    }

    void Execute(int index)
    {
        // Destroys all previous allocations
        bumpAlloc.Reset();

        var array = new NativeArray<float>(100, ref bumpAlloc);
        var array2 = new NativeArray<int>(20, ref bumpAlloc);
    }
}

Code (CSharp):

struct MyJob : IJob
{
    void Execute()
    {
        var other = new NativeArray<float>(100, Allocator.Temporary);
        NativeArray<float> array;

        using (var scope = new TempScope())
        {
            array = new NativeArray<float>(100, Allocator.Temporary);
        }

        // Scope has destroyed array, but not the allocation made before the scope
        Assert.Throws(array[0]);
        Assert.AreEqual(0, other[0]);
    }
}

Code (CSharp):

// Memory is automatically returned on end temporary scope.
// Temporary scopes are nestable
Allocator.TemporaryFastPath.BeginTemporaryScope();
var arrayOuter = new NativeArray<float>(100, Allocator.TemporaryFastPath);

Allocator.TemporaryFastPath.BeginTemporaryScope();
var inner1 = new NativeArray<float>(100, Allocator.TemporaryFastPath);
var inner2 = new NativeArray<float>(100, Allocator.TemporaryFastPath);
Allocator.TemporaryFastPath.EndTemporaryScope();

arrayOuter[0] = 5;
Assert.Throws(inner1[5] = 5);

Allocator.TemporaryFastPath.EndTemporaryScope();
Assert.Throws(arrayOuter[5] = 5);
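
For readers unfamiliar with why a bump allocator can allocate in a handful of instructions and free everything at once, here is the idea sketched in plain managed C#. It is not the BumpAllocator from the examples above: BumpArena, Allocate, BytesUsed and Reset are hypothetical names, it hands out slices of a byte[] instead of native memory, it ignores alignment, and it has none of the safety checks described, so a slice taken before Reset remains usable afterwards.

```csharp
using System;

// Hypothetical managed stand-in for a bump (linear) allocator.
public sealed class BumpArena
{
    private readonly byte[] _buffer;
    private int _offset;

    public BumpArena(int capacityBytes)
    {
        _buffer = new byte[capacityBytes];
    }

    public int BytesUsed => _offset;

    // Allocation is a bounds check plus an offset increment.
    public ArraySegment<byte> Allocate(int sizeBytes)
    {
        if (sizeBytes < 0 || sizeBytes > _buffer.Length - _offset)
            throw new InvalidOperationException("Arena exhausted.");
        var block = new ArraySegment<byte>(_buffer, _offset, sizeBytes);
        _offset += sizeBytes;
        return block;
    }

    // Frees every allocation at once by rewinding the offset.
    public void Reset()
    {
        _offset = 0;
    }
}
```

Individual allocations are never freed on their own. The whole arena is rewound in one step, which is where the batch-destruction saving comes from.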

tertle · Oct 20, 2020

I am quite excited by this.

Also seems to solve the issues with allocations and Fixed Update?

Joachim_Ante · Oct 20, 2020

tertle said: ↑

I am quite excited by this.

Also seems to solve the issues with allocations and Fixed Update?

That's the intention, yes.

DreamingImLatios · Oct 20, 2020

Joachim_Ante said: ↑

I took a couple of code examples, that should illustrate concretely the direction we are trying to take with allocators going forward.

These are interesting, and look like they might directly solve some performance problems that are limiting the scale of my projects. However, I don't know if that discussion belongs in this thread, because...

SamOld said: ↑

I don't know how to do this without sacrificing modular reusability, abstraction, and immutability.

I think the real issue here is that it is way too easy to lose a reference to a blob and not have Unity report the memory leak.

Guidance as to how to track blobs, especially those created by ConvertToEntity at runtime, that applies either for Entities 0.14 or the upcoming Entities 0.16 would be appreciated!

Lieene-Guo · Oct 21, 2020

DreamingImLatios said: ↑

These are interesting, and look like they might directly solve some performance problems that are limiting the scale of my projects. However, I don't know if that discussion belongs in this thread, because...

I think the real issue here is that it is way too easy to lose a reference to a blob and not have Unity report the memory leak.

Guidance as to how to track blobs, especially those created by ConvertToEntity at runtime, that applies either for Entities 0.14 or the upcoming Entities 0.16 would be appreciated!

Looks like BlobAssetStore is doing the job.
But I have no idea what happens when a BlobAsset gets disposed while the world is still running.
The only reason for that should be a lack of memory, as far as I can think of.

DreamingImLatios · Oct 21, 2020

Lieene-Guo said: ↑

Looks like BlobAssetStore is doing the job.
But I have no idea what happens when a BlobAsset gets disposed while the world is still running.
The only reason for that should be a lack of memory, as far as I can think of.

Wait...
You're right!

I did not realize ConvertToEntitySystem was caching and reusing the BlobAssetStore every conversion. And because of that, my blobs were hitting cache (because I use BlobAssetComputationContext), which my tool that was supposed to tell me if it was bugged failed to consider.

So in conclusion:

My validator was bugged.

I was not seeing memory leak warnings because the blobs were actually being disposed on system destruction. I have no memory leaks.

I do not have to dispose blobs at runtime because my blobs are only generated from GameObjectConversion from the same prefabs so they are always hitting cache.

I actually have all the tap points I need to solve proper runtime generation of blobs.

Thank you so much for making me look at the code again!

@SamOld If you wish I will have plenty of time next week to discuss this solution with you.

nyanpath · Oct 22, 2020

I appreciate this being put in, but it is getting rather low-level for efficient usage. I would like easier methods that need fewer lines of code, and it would be beneficial if the game engine could provide more of these functions. Otherwise it's almost as if it is only funneling existing functionality derived from the language on which it is based.

I really hope Unity can provide more tools to make this more efficient and clean to include in projects, perhaps by default.

[edit] What I mean is: I use a game engine so that I will not have to think about this because the engine is supposed to handle all of this.

rauiz · Apr 23, 2021

Joachim_Ante said: ↑

I took a couple of code examples, that should illustrate concretely the direction we are trying to take with allocators going forward. This is currently work in progress. I hope the examples make the hierarchical nature of allocators clear.


Sorry for reviving this ~6-month old thread, but I'd like to know if there are any updates you can share on these allocators, @Joachim_Ante.

These seem like a great feature to have for anyone building their systems in DOTS and Unsafe/HPC# but without ECS's layout (as sometimes, it doesn't match all that well to the problem). Explicit custom allocators like these would be very helpful in several systems I'm looking to build: some networking stuff, some replay related stuff and some other AI-related stuff -- wish I could be more specific on this here on the forum (I'm planning on sharing most of this stuff eventually, just not right now).

Mostly though, just having that Bump Allocator, whose lifetime I can determine and document myself, would be enough for most cases (though that FrameLinear one, which I assume is a single-frame lifetime Bump Allocator, would be great as well). I could implement one, but I'd like to avoid implementing things that'll be natively supported by Unity at some point in the near future, as doing so would also make interacting with Native Collections (NativeArrays and others) a little more cumbersome. Hence the question.

On a side note, I'm curious if these would interact with the EntityManager, Chunks or BlobAssets in any way?

Thanks in advance
