Burst and BlobString Issue (Creating Entities in IJobParallelFor)

Discussion in 'Burst' started by MantridJones, Sep 28, 2020.

  1. MantridJones (Joined: Oct 9, 2014; Posts: 20)
    Hi there,

    I'm trying to create Entities in an IJobParallelFor and I would like to use Burst. One of the Components I need contains string data.

    Since the strings may have variable length, I figured I'd use BlobStrings instead of Dynamic Buffers, so that the (potentially long) strings won't use up space in those tiny 16 KB entity chunks.

    However, as far as I can see, the only way to fill a BlobString with data is the extension method:
    Code (CSharp):
    public static class BlobStringExtensions
    {
        unsafe public static void AllocateString(ref this BlobBuilder builder, ref BlobString blobStr, string value)
        {
            var res = builder.Allocate(ref blobStr.Data, value.Length);
            var len = value.Length;
            fixed (char* p = value)
            {
                UnsafeUtility.MemCpy(res.GetUnsafePtr(), p, sizeof(char) * len);
            }
        }
    }
    which takes a string as an argument. I'm wondering why there is no overload that takes a NativeArray<byte> or something else that Burst will be fine with?

    Unless I'm mistaken (please someone tell me that I am mistaken), the only choice I have is to use a BlobArray<byte> and, whenever I actually want to look at the data as a string, convert it using System.Text.Encoding.UTF8.GetString(blobArray.ToArray()) (which also doesn't work with Burst).
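
    The BlobArray<byte> route described above can be sketched without a managed string or unsafe code. This is only a minimal sketch: `BookTextBlob` and `BookTextBlobUtility.Build` are hypothetical names, the UTF-8 bytes are assumed to already sit in a NativeArray<byte>, and whether BlobBuilder can be used inside a Burst-compiled job depends on the Entities version, so that part should be verified.

    ```csharp
    using Unity.Collections;
    using Unity.Entities;

    // Hypothetical blob root; stores the text as raw UTF-8 bytes.
    public struct BookTextBlob
    {
        public BlobArray<byte> Utf8;
    }

    public static class BookTextBlobUtility
    {
        // Copies the bytes into a new blob asset without touching a managed string.
        public static BlobAssetReference<BookTextBlob> Build(NativeArray<byte> utf8)
        {
            var builder = new BlobBuilder(Allocator.Temp);
            ref BookTextBlob root = ref builder.ConstructRoot<BookTextBlob>();
            BlobBuilderArray<byte> dst = builder.Allocate(ref root.Utf8, utf8.Length);
            for (int i = 0; i < utf8.Length; i++)
            {
                dst[i] = utf8[i];
            }

            // The caller owns the returned reference and is responsible for disposing it.
            var result = builder.CreateBlobAssetReference<BookTextBlob>(Allocator.Persistent);
            builder.Dispose();
            return result;
        }
    }
    ```

    Decoding the bytes back into a managed string for display would still have to happen outside Burst.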

    Is there any good alternative (without using unsafe) to work with strings in ECS?

    Any help would be much appreciated.
     
  2. Antypodish (Joined: Apr 29, 2014; Posts: 10,769)
    Is there any reason you want to work with strings?
    Are you trying to do some string manipulation in Burst jobs?
     
  3. MantridJones (Joined: Oct 9, 2014; Posts: 20)
    I want to use ECS as much as possible. My entities are books. They may have a title, an author, a publisher, etc. Each of those is a component. I want to use the job system because of its efficiency and also for learning purposes. But even without Burst, AFAIK, I wouldn't be able to use managed types in a job. So it really surprises me that in two years they haven't added an overload to populate BlobStrings without the need to pass a string.
    No, after the component is created, the strings are immutable. No need for manipulation. But sometimes I need to read them to display them to the player.
     
  4. Antypodish (Joined: Apr 29, 2014; Posts: 10,769)
    I thought blobs with strings could be Burst-compiled too.
    But if not, I assume you are using the standard Unity UI, which runs on the main thread and uses MonoBehaviour, right? You could store the strings in a dictionary, for example, and read them that way, using the entity index as the key.
    Or you could store the string as bytes in a dynamic buffer? I haven't tried that.

    But yes, there were some discussions a while ago about strings and DOTS.
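
    The dictionary idea above might look like the following on the main thread. This is a rough sketch, not an established pattern: `BookTitleLookup` is a hypothetical name, and it keys on the whole `Entity` value (index plus version) rather than on the bare index, which is an assumption made here to avoid stale lookups after an entity is destroyed.

    ```csharp
    using System.Collections.Generic;
    using Unity.Entities;

    // Hypothetical main-thread lookup; managed, so not usable from Burst jobs.
    public sealed class BookTitleLookup
    {
        private readonly Dictionary<Entity, string> titles = new Dictionary<Entity, string>();

        // Stores or overwrites the title for a book entity.
        public void Set(Entity book, string title)
        {
            titles[book] = title;
        }

        // Returns false if no title was registered for the entity.
        public bool TryGet(Entity book, out string title)
        {
            return titles.TryGetValue(book, out title);
        }

        // Should be called when the entity is destroyed, otherwise the entry lingers.
        public bool Remove(Entity book)
        {
            return titles.Remove(book);
        }
    }
    ```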
     
  5. MantridJones (Joined: Oct 9, 2014; Posts: 20)
    I'm still in the concept / trial-and-error phase. Maybe after they are set up they might be Burstable, I don't know. But I can't build the BlobString inside a job, since I need a managed string to do that.
    I know how I could do this very easily with MonoBehaviours, that would not be an issue. But my goal is to use DOTS as much as possible.
    I have no UI yet. This book entity system is what I'm starting with on this project. I might have to write my own text rendering system for DOTS to avoid all that managed UI stuff...the rest of the UI should be possible with DOTS PhysicsColliders and raycasting.
    As far as I know, dynamic buffers are stored inside the chunks, at least up to a certain capacity. So a buffer either uses up space in the chunks or, if I set the capacity to something really low, each lookup would have to read part from the chunk and part from wherever the rest is stored. This option didn't really seem like a good idea for what I want to do.
     
  6. Antypodish (Joined: Apr 29, 2014; Posts: 10,769)
    A Dynamic Buffer is stored either in the chunk, as long as it stays within its declared capacity, or on the heap.
    So if you, let's say, define a buffer of bytes with capacity 10, it will be stored in the chunk. Then if you resize the buffer to, let's say, 1000, it will be moved outside the chunk, which then holds only a reference to that buffer. So your buffer is never split between the chunk and the heap at the same time.
     
  7. DreamingImLatios (Joined: Jun 3, 2017; Posts: 4,255)
    I 100% agree you found another occurrence of missing API. However, if you are generating blobs at runtime and not properly reference counting them and disposing them, you'll end up with major memory leak issues. This is not trivial to get right, and for that reason I suggest using FixedString variants instead directly in your IComponentData. While you have to make sure you use appropriate sizes for the components, they have the benefit of working in jobs with Burst in pretty much all contexts while having memory managed for you.
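
    The FixedString suggestion above could look roughly like this when creating entities from an IJobParallelFor. It is a sketch under assumptions: `BookTitle` and `CreateBooksJob` are hypothetical names, the titles are assumed to have been converted into a NativeArray beforehand, and the type is called `FixedString128Bytes` in recent Collections packages (older releases name it `FixedString128`).

    ```csharp
    using Unity.Burst;
    using Unity.Collections;
    using Unity.Entities;
    using Unity.Jobs;

    // Hypothetical component; the whole string lives inside the chunk.
    public struct BookTitle : IComponentData
    {
        public FixedString128Bytes Value;
    }

    [BurstCompile]
    public struct CreateBooksJob : IJobParallelFor
    {
        // Assumed input: one pre-converted title per book.
        [ReadOnly] public NativeArray<FixedString128Bytes> Titles;

        // Records structural changes; they take effect when the buffer is played back.
        public EntityCommandBuffer.ParallelWriter Ecb;

        public void Execute(int index)
        {
            // The job index doubles as the sort key for deterministic playback.
            Entity book = Ecb.CreateEntity(index);
            Ecb.AddComponent(index, book, new BookTitle { Value = Titles[index] });
        }
    }
    ```

    Scheduling would pass the result of `AsParallelWriter()` from an EntityCommandBuffer that is played back on the main thread afterwards.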
     
  8. MantridJones (Joined: Oct 9, 2014; Posts: 20)
    I thought about FixedStrings too. But since the strings I read in are of variable length, it seemed like a suboptimal idea. Either I use a large string size to allow for longer texts (which wastes chunk memory), or I only allow a small string size and risk the text not fitting (so that I would have to cut off the last part to make it fit).

    @Antypodish I always assumed it would be fragmented. Thank you for correcting me! With that in mind, Dynamic Buffers just got a whole lot more interesting. I could set the capacity to 1 (or maybe even 0 if that is possible) so I only have the 16 bytes for the header inside the chunks and the rest in the heap. I'm gonna try that next.
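
    The low-capacity buffer idea above might be declared as follows. This is a sketch, assuming the Entities version in use accepts a capacity of 0; `TitleByte` and `TitleBufferUtility.ReadTitle` are hypothetical names, and the text is assumed to be stored as UTF-8.

    ```csharp
    using System.Text;
    using Unity.Collections;
    using Unity.Entities;

    // Capacity 0 is an assumption: it is meant to keep all elements out of the chunk,
    // leaving only the buffer header there.
    [InternalBufferCapacity(0)]
    public struct TitleByte : IBufferElementData
    {
        public byte Value;
    }

    public static class TitleBufferUtility
    {
        // Main thread only: decoding into a managed string is not Burst-compatible.
        public static string ReadTitle(DynamicBuffer<TitleByte> buffer)
        {
            // TitleByte is a single byte, so the buffer can be viewed as raw bytes.
            byte[] bytes = buffer.Reinterpret<byte>().AsNativeArray().ToArray();
            return Encoding.UTF8.GetString(bytes);
        }
    }
    ```

    Inside a job, the bytes could be appended with the buffer's `Add` method; only the conversion for display needs managed code.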