Search Unity

Multithreading Solutions

Discussion in 'Scripting' started by NathanWarden, May 25, 2011.

  1. NathanWarden

    NathanWarden

    Joined:
    Oct 4, 2005
    Posts:
    663
    This post isn't a complaining thread, but instead is a solution thread.

    I've read several posts of people complaining about how there are many places where Unity isn't multi-threaded.

    The following are places where I've found that multi-threading can be safely used in Unity. Feel free to add ideas and uses to this thread :)


    Real-time Mesh Manipulation

    1) Store the Vector3 array of the mesh's verts property into an array in your script.
    2) Start a thread (or threads at N indexes) that manipulate the Vector3 data
    3) If the data needs to be 100% accurate, set a flag or do a Thread.Sleep in the main thread (such as Unity's Update function or whatever) until the data has been updated. In most cases I suspect the data doesn't need to be 100% accurate, but can be arbitrarily assigned back to the mesh.verts property
    4) In the main thread assign the Vector3 array back to the mesh's verts property.

    NOTE: This method would be beneficial for a single time mesh manipulation, but would need to be a large mesh to offset the cost of starting the threads.


    Image processing

    1) In the main thread, grab the Color[] data using the Texture2D.GetPixels function
    2) Start the thread/threads to manipulate the Color data
    3) Wait in main thread until the threads have completed
    4) assign the manipulated color data to the texture in the main thread using the Texture2D.SetPixels function

    NOTE: This is only beneficial on large images such as 512x512 or greater to offset the cost of starting the threads.


    Pathfinding

    If you have several characters you can store the nav mesh into thread safe data such as the Vector3 class and calculate each pathfinding actor into their own thread, or all in a single thread depending on your needs.


    Calculating Distances of Vector3's

    If you have a ton of objects (maybe a few hundred or more) you can store the positions into a Vector3 array every N frames and calculate the distances between them in another thread.

    Why? Because 3D distance calculations use the Pythagorean theorem which use a square root. This could possibly alleviate hundreds or thousands of square root calculations per check. You may not want to be pushing data around every frame however, because reading/writing data can be more expensive than the square root calculation itself. You definitely will need to do testing to see if it does more harm or good.

    NOTE: this wouldn't make much difference and may make your app slower if there are only a few distances to calculate.


    Sorting

    NOTE 1: This would be pretty rare since to balance the cost of starting the threads necessary you'd need a lot of data to sort.

    NOTE 2: I'm not writing a psuedo algorithm since it depends on the algorithm you use, however it's possible to use pretty much any algorithm to sort data. I recommend the merge sort for data that must be compared. I've written multi-threaded code for the Bubble sort that has sped the sorting up by 20x or more with a dual core CPU, but, this is still slower than sorting with the merge sort in the main thread.


    Again, please feel free to add to this thread as I'm curious where others use multi-threading in Unity :)


    Thanks,
    Nathan
     
    Last edited: May 25, 2011
  2. Eric5h5

    Eric5h5

    Volunteer Moderator Moderator

    Joined:
    Jul 19, 2006
    Posts:
    32,175
    See here for an example of using multi-threading for image processing. Fractscape uses multi-threading for the terrain calculations (useful for when you jack the resolution way up and it takes a little while to compute).

    --Eric
     
  3. NathanWarden

    NathanWarden

    Joined:
    Oct 4, 2005
    Posts:
    663
    Thanks Eric,

    I hadn't really thought about doing that with terrains, makes sense though. Are you using it for painting?


    For those interested in having an example of multithreaded code, here's my multithreaded merge sort:


    Code (csharp):
    1. using UnityEngine;
    2. using System.Collections;
    3. using System.Collections.Generic;
    4. using System.Threading;
    5.  
    6. public class MergeSort : MonoBehaviour {
    7.     float[] values = new float[5000000];
    8.  
    9.     float singleThreadedTime = -1.0f;
    10.     float multiThreadedTime = -1.0f;
    11.  
    12.     private int threadCount {
    13.         get {
    14.             if ( tCount == -1 )
    15.                 tCount = System.Environment.ProcessorCount;
    16.  
    17.             return tCount;
    18.         }
    19.     }
    20.     private int tCount = -1;
    21.  
    22.  
    23.     void Update() {
    24.         if ( Input.GetKeyDown(KeyCode.S) ) {
    25.             DoSorts();
    26.         }
    27.     }
    28.  
    29.    
    30.     void DoSorts() {
    31.         // Single threaded sort
    32.         InitValues();
    33.         DoSingleThreadMergeSort();
    34.  
    35.         // Multi threaded sort
    36.         InitValues();
    37.         StartMultiThreadedSort(threadCount);
    38.     }
    39.  
    40.  
    41.     /* ****** Inits the array with unsorted values ******* */
    42.  
    43.     void InitValues() {
    44.         for ( int i = 0; i < values.Length; i++ )
    45.             values[i] = (Random.value * 1000.0f) + ((float)i / (float)values.Length);
    46.     }
    47.  
    48.  
    49.     /* ****** Single threaded Sort ******* */
    50.  
    51.     void DoSingleThreadMergeSort() {
    52.         float startTime = Time.realtimeSinceStartup;
    53.  
    54.         if ( values.Length > 1 ) {
    55.             // Split the list in half
    56.             MergeSorter(0, values.Length-1);
    57.         }
    58.  
    59.         singleThreadedTime = Time.realtimeSinceStartup - startTime;
    60.     }
    61.  
    62.  
    63.     void MergeSorter(int fIndex, int lIndex) {
    64.         // range will be greater than 1 if there are more than 2 elements in this split
    65.         if ( lIndex-fIndex > 0 ) {
    66.             int firstIndexR = 0;
    67.             int lastIndexR = 0;
    68.             int firstIndexL = 0;
    69.             int lastIndexL = 0;
    70.  
    71.             GetArrayHalf(fIndex, lIndex, out firstIndexL, out lastIndexL, true);
    72.             GetArrayHalf(fIndex, lIndex, out firstIndexR, out lastIndexR, false);
    73.  
    74.             MergeSorter(firstIndexL, lastIndexL);
    75.             MergeSorter(firstIndexR, lastIndexR);
    76.  
    77.             MergeArrays(firstIndexL, lastIndexL, firstIndexR, lastIndexR);
    78.         }
    79.     }
    80.  
    81.  
    82.     void MergeArrays(int firstIndexL, int lastIndexL, int firstIndexR, int lastIndexR) {
    83.         int leftSize = (lastIndexL-firstIndexL)+1;
    84.         int rightSize = (lastIndexR-firstIndexR)+1;
    85.         int dataSize = leftSize + rightSize;
    86.         float[] results = new float[dataSize];
    87.         int leftIndex = 0;
    88.         int rightIndex = 0;
    89.  
    90.         for ( int i = 0; i < dataSize; i++ ) {
    91.             if ( leftIndex >= leftSize ) {
    92.                 results[i] = values[firstIndexR+rightIndex++];
    93.             }
    94.             else if ( rightIndex >= rightSize ) {
    95.                 results[i] = values[firstIndexL+leftIndex++];
    96.             }
    97.             else if ( values[firstIndexL+leftIndex] < values[firstIndexR+rightIndex] ) {
    98.                 results[i] = values[firstIndexL+leftIndex++];
    99.             }
    100.             else {
    101.                 results[i] = values[firstIndexR+rightIndex++];
    102.             }
    103.         }
    104.  
    105.         for ( int i = 0; i < dataSize; i++ ) {
    106.             values[firstIndexL+i] = results[i];
    107.         }
    108.     }
    109.  
    110.  
    111.     /* ****** Multithreaded Sort ******* */
    112.  
    113.     void StartMultiThreadedSort(int numThreads) {
    114.         float startTime = Time.realtimeSinceStartup;
    115.  
    116.         // Sort on multiple threads (sorts every X elements where X == threadCount)
    117.         HandleThreads(0, numThreads);
    118.  
    119.         multiThreadedTime = Time.realtimeSinceStartup - startTime;
    120.     }
    121.  
    122.  
    123.     void HandleThreads(int additionalThreads, int threadTotal) {
    124.         int threads = threadTotal + additionalThreads;
    125.         SortParameters[] sortParameters = new SortParameters[threads];
    126.         bool done = true;
    127.  
    128.         if ( threads > values.Length ) {
    129.             threads = values.Length;
    130.             sortParameters = new SortParameters[threads];
    131.         }
    132.  
    133.         // Start the threads
    134.         for ( int i = 0; i < threads; i++ ) {
    135.             float valuesPerThread = (float)values.Length/(float)threads;
    136.             int firstIndex = (int)Mathf.Round((float)i*valuesPerThread);
    137.             int lastIndex = (int)(Mathf.Round((float)(i+1.0f) * valuesPerThread)-1.0f);
    138.  
    139.             sortParameters[i] = new SortParameters();
    140.             sortParameters[i].threadNum = i;
    141.             sortParameters[i].threadCount = threadTotal;
    142.             sortParameters[i].firstIndex = firstIndex;
    143.             sortParameters[i].lastIndex = lastIndex;
    144.  
    145.             ThreadPool.QueueUserWorkItem(SortingThread, sortParameters[i]);
    146.         }
    147.  
    148.         // Wait for threads to finish
    149.         do {
    150.             done = true;
    151.  
    152.             for ( int i = 0; i < sortParameters.Length; i++ ) {
    153.                 if ( !sortParameters[i].isDone ) {
    154.                     done = false;
    155.                     break;
    156.                 }
    157.             }
    158.  
    159.             Thread.Sleep(10);
    160.         }
    161.         while ( !done );
    162.  
    163.         for ( int i = 0; i < sortParameters.Length-1; i++ ) {
    164.             SortParameters sp1 = sortParameters[i];
    165.             SortParameters sp2 = sortParameters[i+1];
    166.  
    167.             MergeArrays(sp1.firstIndex, sp1.lastIndex,sp2.firstIndex, sp2.lastIndex);
    168.         }
    169.     }
    170.  
    171.  
    172.     // Here's were the thread does its' work
    173.     void SortingThread(object parameters) {
    174.         SortParameters sp = parameters as SortParameters;
    175.  
    176.         MergeSorter(sp.firstIndex, sp.lastIndex);
    177.  
    178.         sp.isDone = true;
    179.     }
    180.  
    181.  
    182.     class SortParameters {
    183.         public int threadNum = 0;
    184.         public int threadCount = 1;
    185.         public bool isDone = false;
    186.         public int firstIndex = 0;
    187.         public int lastIndex = 0;
    188.     }
    189.  
    190.  
    191.     /* ****** Grabs the indexes for half of an array ******* */
    192.  
    193.     void GetArrayHalf(int fIndex, int lIndex, out int leftIndex, out int rightIndex, bool left) {
    194.         int halfPoint = (int)Mathf.Lerp(fIndex, lIndex, 0.5f);
    195.  
    196.         if ( left ) {
    197.             leftIndex = fIndex;
    198.             rightIndex = halfPoint;
    199.         }
    200.         else {
    201.             leftIndex = halfPoint+1;
    202.             rightIndex = lIndex;
    203.         }
    204.     }
    205.  
    206.  
    207.     void OnGUI() {
    208.         bool showSingleThreaded = singleThreadedTime >= 0.0f;
    209.         bool showMultiThreaded = multiThreadedTime >= 0.0f;
    210.  
    211.         if ( showSingleThreaded )
    212.             GUILayout.TextField("Single thread time: " + singleThreadedTime.ToString());
    213.  
    214.         if ( showMultiThreaded )
    215.             GUILayout.TextField("Multithreaded time: " + multiThreadedTime.ToString() + " (sec) on " + threadCount + " threads.");
    216.  
    217.         if ( showSingleThreaded  showMultiThreaded ) {
    218.             if ( multiThreadedTime < singleThreadedTime )
    219.                 GUILayout.TextField("Multithreaded was " + singleThreadedTime/multiThreadedTime + " times faster!");
    220.             else
    221.                 GUILayout.TextField("Single threaded was " + multiThreadedTime/singleThreadedTime + " times faster!");
    222.         }
    223.  
    224.         if ( GUILayout.Button("Update ('s')") ) {
    225.             DoSorts();
    226.         }
    227.     }
    228. }
     
  4. Eric5h5

    Eric5h5

    Volunteer Moderator Moderator

    Joined:
    Jul 19, 2006
    Posts:
    32,175
    It's for the fractal terrain calculations, and auto-texturing.

    --Eric
     
  5. Marrrk

    Marrrk

    Joined:
    Mar 21, 2011
    Posts:
    1,032
  6. NathanWarden

    NathanWarden

    Joined:
    Oct 4, 2005
    Posts:
    663
    Cool, nice tools Mark. Looks like you've put a lot of work into them. :)


    Okay, here's a code example of how to manipulate a meshes on threads. If you want to compare the difference between multithreaded and single, just check/uncheck the isMultithreaded checkbox on the AllMeshes object and the setting will propogate to the sub objects.

    My machine gets about a 10 fps increase. 18-19 for single threaded and 25-28 for multi.
     

    Attached Files:

  7. Nick-Wiggill

    Nick-Wiggill

    Joined:
    Jan 31, 2010
    Posts:
    46
    EDIT (after erasing the rest of my post): I see, System.Threading.ThreadPool is being used. A thread pool has no thread spawn and kill overhead. Thanks for sharing your code, Nathan. So is that what all of Unity uses? Mono thread pools? If so, that's a major win IMO since that's the design pattern I'd favour any day for MT.
     
    Last edited: Feb 2, 2012
  8. NathanWarden

    NathanWarden

    Joined:
    Oct 4, 2005
    Posts:
    663
    Hi Nick, the Unity Engine itself uses C++ at it's core, so the ThreadPool you'd use for your code would be a different pool than what they're using, which shouldn't be a big deal. But, the thread pool is definitely worth using. There is a slight overhead when you use the first few threads since it spawns them as you need them, but once they're spawned they're super fast to reuse.

    Here's what I was posting based on your original post and may be stuff you already know, but I hope it's helpful in some way:

    They don't need to release the functions every cycle, they can simply sleep the threads at the end of each frame and start them up again once it needs to be drawn again. Certainly there is a slight overhead for this, but, if you need to run multi-threaded code per frame you can only benefit from Unity having their code threaded as well since it'll take less time to draw a frame, leaving you with more time between frames to do more stuff. This is actually a fundamental rule of parallelism (IE. Amdahl's Law)

    Amdahl's Law in a very tight nutshell says that as you add cores a system cannot go any faster beyond it's sequential parts. So if most of the time is spent rendering a frame (which is usually the case) and if the rendering code and core engine code is not parallel you have a serious performance problem. For instance, if 50% of the time is spent in the engine and 50% of the time is in your code and if the engine code is all sequential and all of your code is all parallel, even if you have a trillion cores running your program it will never go faster than a 2x speedup.

    Here's a reference,
    http://en.wikipedia.org/wiki/Amdahl's_law


    Hope this helps you in some way,
    Nathan
     
  9. Nick-Wiggill

    Nick-Wiggill

    Joined:
    Jan 31, 2010
    Posts:
    46
    Thanks for your reply Nathan.

    I didn't know about Amdahl's law, but yes, it seems to formalise the limits of parallelisation. It's a lot like parallelising raytracing, for instance: while you can break it down to the pixel level, each ray you cast through those pixels from the near plane, and out into space, has to do sequential raymarch processing for pixel composition and early ray termination. In the Ahmdahl example, the difference is that one component becomes the bottleneck, rather than all of them.

    Re the initial overheads, yep, normal for threadpools and very acceptable considering the alternatives to threadpooling! I am actually very happy to have discovered threadpools here, and to know I don't have to write my own implementation.