OpenCL from Unity

Discussion in 'Editor & General Support' started by fct509, Aug 2, 2019.

  1. fct509

    Hi,

    I'm working on a Unity Editor extension for Unity 2018.4 (I expect it to work on newer versions too), and I need to do some computationally intensive work on the GPU. I already have a simple prototype working with a Compute Shader, but it turns out that the only way to run a Compute Shader is by freezing the main thread. I even tried dispatching the Compute Shader from an IJob, but all that did was let the Editor catch the exception without me wrapping the code in my own try-catch.

    I asked the folks over on the GPU Progressive Lightmapper thread how they managed to run code on the GPU without freezing the main thread and they said that this was done by using OpenCL from a background thread.

    Does anyone have any recent advice on how to run OpenCL from Unity? Please keep in mind that I've never used OpenCL before. Most of what I found was either people talking about error codes from the GPU Progressive Lightmapper or a few threads from 2012. Unity has changed a lot these last four years, so I'm not very hopeful when it comes to those older threads.

    Can anyone point me to some resources on running OpenCL from a background thread?

    Thanks,
    -Francisco
     
  2. chrisunityarto

    Unity Technologies

  3. fct509

    Well, I never did get OpenCL working from the Editor, but I did figure out how to start an external process from it. From there, I added UDP communication between the two apps, using FlatBuffers for the message encoding. After that, I looked into how to create a headless(?) Unity game: a Unity game that doesn't have a screen. So I ended up with a Unity Editor plugin that starts a new process and offloads its work to it. That process is a headless Unity-based app that runs the data through a Compute Shader and returns the results via UDP to the Unity Editor that started it. This let me use Unity's Compute Shaders asynchronously from the Unity Editor, which stops the Editor from freezing when a Compute Shader takes more than a few frames to run.

    So, instead of trying to get OpenCL to run in a background thread from the Editor, I just ran a separate program, which is apparently how the Unity team that created the Progressive Lightmapper does things.
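
    For anyone wanting to try the same route, here is a minimal sketch of the Editor side of that request/reply loop. It is not the code from my plugin: `WorkerClient`, `workerPath` and the port number are hypothetical names, the payload is raw bytes rather than FlatBuffers, and I'm assuming the worker is a Unity player that accepts the `-batchmode` command-line flag and answers each datagram with exactly one reply.

```csharp
using System;
using System.Diagnostics;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

// Hypothetical helper: talks to a worker process over UDP on localhost.
public sealed class WorkerClient : IDisposable
{
    // Bind to any free local port on the loopback interface.
    readonly UdpClient _udp = new UdpClient(new IPEndPoint(IPAddress.Loopback, 0));
    readonly IPEndPoint _workerEndPoint;
    Process _worker;

    public WorkerClient(int workerPort)
    {
        _workerEndPoint = new IPEndPoint(IPAddress.Loopback, workerPort);
    }

    // Starts the worker executable. "-batchmode" is assumed to make the
    // player run without a window.
    public void StartWorker(string workerPath)
    {
        _worker = Process.Start(new ProcessStartInfo(workerPath, "-batchmode")
        {
            UseShellExecute = false,
            CreateNoWindow = true
        });
    }

    // Sends one request and awaits one reply without blocking the caller.
    public async Task<byte[]> RequestAsync(byte[] payload)
    {
        await _udp.SendAsync(payload, payload.Length, _workerEndPoint);
        UdpReceiveResult reply = await _udp.ReceiveAsync();
        return reply.Buffer;
    }

    public void Dispose()
    {
        _udp.Close();
        if (_worker != null)
        {
            if (!_worker.HasExited) _worker.Kill();
            _worker.Dispose();
        }
    }
}
```

    A single UDP datagram is limited to roughly 64 KB and delivery is not guaranteed, so larger inputs would need to be split up, and a real client would want a timeout around the receive.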
     
  4. Przemyslaw_Zaworski

    OpenCL in Unity: Minimal Working Example.

    In Start(), we initialize OpenCL and compile the kernel.

    In Update(), we pass the current frame number to the kernel. The kernel multiplies it by the index of the current work item, so we read back three values: zero, the frame number, and the frame number multiplied by two.
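
    To make the expected output concrete, here is a CPU-side sketch of the same computation in plain C#. `KernelReference` is a hypothetical name used only for illustration; it is not part of the script in this post.

```csharp
using System;

// Hypothetical CPU reference for the kernel: one loop iteration per work item.
public static class KernelReference
{
    public static float[] Run(int number, int workItemCount)
    {
        float[] items = new float[workItemCount];
        for (int id = 0; id < workItemCount; id++)
        {
            // Same expression as in the kernel body.
            items[id] = number * id;
        }
        return items;
    }
}
```

    For frame 7 and three work items, `KernelReference.Run(7, 3)` yields 0, 7 and 14.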

    Reference:

    http://memeplex.blog.shinobi.jp/opencl/

    Full Source Code (OpenCL.cs):

    Code (CSharp):
    using System;
    using System.Linq;
    using System.Runtime.InteropServices;
    using UnityEngine;

    public class OpenCL : MonoBehaviour
    {
        // OpenCL C source. Each work item writes (number * its global index) into items.
        static string ComputeKernel =
        @"
            __kernel void myKernelFunction(__global float* items, const int number)
            {
                unsigned int id = get_global_id(0);
                items[id] = number * id;
            }
        ";

        Kernel _Kernel;
        CommandQueue _CommandQueue;
        Buffer _Buffer;

        void Start()
        {
            // Use the first device of the first platform, then compile the kernel.
            IntPtr device = getDevices(getPlatforms()[0], DeviceType.Default)[0];
            Context context = new Context(device);
            _CommandQueue = new CommandQueue(context, device);
            Program program = new Program(context, ComputeKernel);
            program.Build(device);
            _Kernel = new Kernel(program, "myKernelFunction");
            _Buffer = Buffer.FromCopiedHostMemory(context, new float[3]);
            _Kernel.SetArgument(0, _Buffer);
        }

        void Update()
        {
            // Pass the frame number, run three work items, and read the results back.
            _Kernel.SetArgument(1, Time.frameCount);
            _CommandQueue.EnqueueRange(_Kernel, new MultiDimension(3), new MultiDimension(1));
            float[] readBack = new float[3];
            _CommandQueue.ReadBuffer(_Buffer, readBack);
            string result = "";
            foreach (var number in readBack)
            {
                result = result + number.ToString("N0") + "; ";
            }
            Debug.Log(result);
        }

        IntPtr[] getDevices(IntPtr platform, DeviceType deviceType)
        {
            // First call returns the count, second call fills the array.
            int deviceCount;
            OpenCLFunctions.Check(
                OpenCLFunctions.clGetDeviceIDs(platform, deviceType, 0, null, out deviceCount),
                "clGetDeviceIDs");
            IntPtr[] result = new IntPtr[deviceCount];
            OpenCLFunctions.Check(
                OpenCLFunctions.clGetDeviceIDs(platform, deviceType, deviceCount, result, out deviceCount),
                "clGetDeviceIDs");
            return result;
        }

        IntPtr[] getPlatforms()
        {
            int platformCount;
            OpenCLFunctions.Check(
                OpenCLFunctions.clGetPlatformIDs(0, null, out platformCount),
                "clGetPlatformIDs");
            IntPtr[] result = new IntPtr[platformCount];
            OpenCLFunctions.Check(
                OpenCLFunctions.clGetPlatformIDs(platformCount, result, out platformCount),
                "clGetPlatformIDs");
            return result;
        }
    }

    class Context
    {
        public IntPtr InternalPointer { get; private set; }

        public Context(params IntPtr[] devices)
        {
            int error;
            InternalPointer = OpenCLFunctions.clCreateContext(null, devices.Length, devices, null, IntPtr.Zero, out error);
            OpenCLFunctions.Check(error, "clCreateContext");
        }

        ~Context()
        {
            if (InternalPointer != IntPtr.Zero) OpenCLFunctions.clReleaseContext(InternalPointer);
        }
    }

    class CommandQueue
    {
        public IntPtr InternalPointer { get; private set; }

        public CommandQueue(Context context, IntPtr device)
        {
            int error;
            InternalPointer = OpenCLFunctions.clCreateCommandQueue(context.InternalPointer, device, 0, out error);
            OpenCLFunctions.Check(error, "clCreateCommandQueue");
        }

        ~CommandQueue()
        {
            if (InternalPointer != IntPtr.Zero) OpenCLFunctions.clReleaseCommandQueue(InternalPointer);
        }

        public void ReadBuffer<T>(Buffer buffer, T[] systemBuffer) where T : struct
        {
            int byteCount = Math.Min(buffer.SizeInBytes, Marshal.SizeOf(typeof(T)) * systemBuffer.Length);

            // Pin the managed array so the driver can write into it directly.
            GCHandle handle = GCHandle.Alloc(systemBuffer, GCHandleType.Pinned);
            try
            {
                int error = OpenCLFunctions.clEnqueueReadBuffer(
                    InternalPointer,
                    buffer.InternalPointer,
                    true,                   // blocking read
                    IntPtr.Zero,            // offset
                    new IntPtr(byteCount),
                    handle.AddrOfPinnedObject(),
                    0,
                    IntPtr.Zero,
                    IntPtr.Zero
                    );
                OpenCLFunctions.Check(error, "clEnqueueReadBuffer");
            }
            finally
            {
                handle.Free();
            }
        }

        public void EnqueueRange(Kernel kernel, MultiDimension globalWorkSize, MultiDimension localWorkSize)
        {
            int error = OpenCLFunctions.clEnqueueNDRangeKernel(
                InternalPointer,
                kernel.InternalPointer,
                globalWorkSize.Dimension,
                null,                       // no global offset
                globalWorkSize.ToArray(),
                localWorkSize.ToArray(),
                0,
                null,
                IntPtr.Zero
                );
            OpenCLFunctions.Check(error, "clEnqueueNDRangeKernel");
        }
    }

    class Buffer
    {
        public IntPtr InternalPointer { get; private set; }
        public int SizeInBytes { get; private set; }

        private Buffer() { }

        ~Buffer()
        {
            if (InternalPointer != IntPtr.Zero) OpenCLFunctions.clReleaseMemObject(InternalPointer);
        }

        public static Buffer FromCopiedHostMemory<T>(Context context, T[] initialData) where T : struct
        {
            Buffer result = new Buffer();
            result.SizeInBytes = Marshal.SizeOf(typeof(T)) * initialData.Length;

            // The data is copied at creation time, so the pin can be released right after.
            GCHandle handle = GCHandle.Alloc(initialData, GCHandleType.Pinned);
            try
            {
                int errorCode;
                result.InternalPointer = OpenCLFunctions.clCreateBuffer(
                    context.InternalPointer,
                    MemoryFlags.CopyHostMemory,
                    new IntPtr(result.SizeInBytes),
                    handle.AddrOfPinnedObject(),
                    out errorCode
                    );
                OpenCLFunctions.Check(errorCode, "clCreateBuffer");
            }
            finally
            {
                handle.Free();
            }
            return result;
        }
    }

    class Program
    {
        public IntPtr InternalPointer { get; private set; }

        public Program(Context context, params string[] sources)
        {
            int errorCode;

            InternalPointer = OpenCLFunctions.clCreateProgramWithSource(
                context.InternalPointer,
                sources.Length,
                sources,
                null,                       // sources are null-terminated
                out errorCode
                );
            OpenCLFunctions.Check(errorCode, "clCreateProgramWithSource");
        }

        ~Program()
        {
            if (InternalPointer != IntPtr.Zero) OpenCLFunctions.clReleaseProgram(InternalPointer);
        }

        public void Build(params IntPtr[] devices)
        {
            int error = OpenCLFunctions.clBuildProgram(
                InternalPointer,
                devices.Length,
                devices,
                null,
                null,
                IntPtr.Zero
                );

            if (error != 0)
            {
                // Query the size of the build log first, then fetch the log itself.
                IntPtr logSize;
                OpenCLFunctions.clGetProgramBuildInfo(
                    InternalPointer,
                    devices.First(),
                    ProgramBuildInfoString.Log,
                    IntPtr.Zero,
                    null,
                    out logSize
                    );
                System.Text.StringBuilder text = new System.Text.StringBuilder(logSize.ToInt32());
                OpenCLFunctions.clGetProgramBuildInfo(
                    InternalPointer,
                    devices.First(),
                    ProgramBuildInfoString.Log,
                    logSize,
                    text,
                    out logSize);
                throw new Exception(text.ToString());
            }
        }
    }

    class Kernel
    {
        public IntPtr InternalPointer { get; private set; }

        public Kernel(Program program, string functionName)
        {
            int errorCode;
            InternalPointer = OpenCLFunctions.clCreateKernel(
                program.InternalPointer,
                functionName,
                out errorCode
                );
            OpenCLFunctions.Check(errorCode, "clCreateKernel");
        }

        ~Kernel()
        {
            if (InternalPointer != IntPtr.Zero) OpenCLFunctions.clReleaseKernel(InternalPointer);
        }

        // Binds a buffer: the argument value is the cl_mem handle itself.
        public void SetArgument(int argumentIndex, Buffer buffer)
        {
            IntPtr bufferPointer = buffer.InternalPointer;
            int error = OpenCLFunctions.clSetKernelArg(
                InternalPointer,
                argumentIndex,
                new IntPtr(IntPtr.Size),
                ref bufferPointer
                );
            OpenCLFunctions.Check(error, "clSetKernelArg");
        }

        // Binds a plain value (int, float, ...) by pinning a boxed copy of it.
        public void SetArgument<T>(int argumentIndex, T value) where T : struct
        {
            GCHandle handle = GCHandle.Alloc(value, GCHandleType.Pinned);
            try
            {
                int error = OpenCLFunctions.clSetKernelArg(
                    InternalPointer,
                    argumentIndex,
                    new IntPtr(Marshal.SizeOf(typeof(T))),
                    handle.AddrOfPinnedObject()
                    );
                OpenCLFunctions.Check(error, "clSetKernelArg");
            }
            finally
            {
                handle.Free();
            }
        }
    }

    // Raw bindings. "OpenCL.dll" is the Windows library name; other platforms need a different one.
    // Native size_t parameters are declared as IntPtr so they are pointer-sized in 32-bit and 64-bit builds.
    static class OpenCLFunctions
    {
        // Throws if an OpenCL call did not return CL_SUCCESS (0).
        public static void Check(int errorCode, string call)
        {
            if (errorCode != 0)
            {
                throw new Exception(call + " failed with OpenCL error " + errorCode);
            }
        }

        [DllImport("OpenCL.dll")]
        public static extern int clGetPlatformIDs(int entryCount, IntPtr[] platforms, out int platformCount);

        [DllImport("OpenCL.dll")]
        public static extern int clGetDeviceIDs(IntPtr platform, DeviceType deviceType, int entryCount, IntPtr[] devices, out int deviceCount);

        [DllImport("OpenCL.dll")]
        public static extern IntPtr clCreateContext(IntPtr[] properties, int deviceCount, IntPtr[] devices, NotifyContextCreated pfnNotify, IntPtr userData, out int errorCode);

        [DllImport("OpenCL.dll")]
        public static extern int clReleaseContext(IntPtr context);

        [DllImport("OpenCL.dll")]
        public static extern IntPtr clCreateCommandQueue(IntPtr context, IntPtr device, long properties, out int errorCodeReturn);

        [DllImport("OpenCL.dll")]
        public static extern int clReleaseCommandQueue(IntPtr commandQueue);

        [DllImport("OpenCL.dll")]
        public static extern IntPtr clCreateBuffer(IntPtr context, MemoryFlags allocationAndUsage, IntPtr sizeInBytes, IntPtr hostPtr, out int errorCodeReturn);

        [DllImport("OpenCL.dll")]
        public static extern int clReleaseMemObject(IntPtr memoryObject);

        [DllImport("OpenCL.dll")]
        public static extern int clEnqueueReadBuffer(IntPtr commandQueue, IntPtr buffer, bool isBlocking, IntPtr offset, IntPtr sizeInBytes, IntPtr result, int numberOfEventsInWaitList, IntPtr eventWaitList, IntPtr eventObjectOut);

        [DllImport("OpenCL.dll")]
        public static extern IntPtr clCreateProgramWithSource(IntPtr context, int count, string[] programSources, IntPtr[] sourceLengths, out int errorCode);

        [DllImport("OpenCL.dll")]
        public static extern int clBuildProgram(IntPtr program, int deviceCount, IntPtr[] deviceList, string buildOptions, NotifyProgramBuilt notify, IntPtr userData);

        [DllImport("OpenCL.dll")]
        public static extern int clReleaseProgram(IntPtr program);

        [DllImport("OpenCL.dll")]
        public static extern IntPtr clCreateKernel(IntPtr program, string functionName, out int errorCode);

        [DllImport("OpenCL.dll")]
        public static extern int clReleaseKernel(IntPtr kernel);

        [DllImport("OpenCL.dll")]
        public static extern int clSetKernelArg(IntPtr kernel, int argumentIndex, IntPtr size, ref IntPtr value);

        [DllImport("OpenCL.dll")]
        public static extern int clSetKernelArg(IntPtr kernel, int argumentIndex, IntPtr size, IntPtr value);

        [DllImport("OpenCL.dll")]
        public static extern int clEnqueueNDRangeKernel(IntPtr commandQueue, IntPtr kernel, int workDimension, IntPtr[] globalWorkOffset, IntPtr[] globalWorkSize, IntPtr[] localWorkSize, int countOfEventsInWaitList, IntPtr[] eventList, IntPtr eventObject);

        [DllImport("OpenCL.dll")]
        public static extern int clGetProgramBuildInfo(IntPtr program, IntPtr device, ProgramBuildInfoString paramName, IntPtr paramValueSize, System.Text.StringBuilder paramValue, out IntPtr paramValueSizeReturn);
    }

    delegate void NotifyContextCreated(string errorInfo, IntPtr privateInfo, IntPtr cb, IntPtr userData);
    delegate void NotifyProgramBuilt(IntPtr program, IntPtr userData);

    enum DeviceType : long
    {
        Default = (1 << 0),
        Cpu = (1 << 1),
        Gpu = (1 << 2),
        Accelerator = (1 << 3),
        All = 0xFFFFFFFF
    }

    enum MemoryFlags : long
    {
        ReadWrite = (1 << 0),
        WriteOnly = (1 << 1),
        ReadOnly = (1 << 2),
        UseHostMemory = (1 << 3),
        HostAccessible = (1 << 4),
        CopyHostMemory = (1 << 5)
    }

    struct MultiDimension
    {
        public int X;
        public int Y;
        public int Z;
        public int Dimension;

        public MultiDimension(int x)
        {
            X = x;
            Y = 0;
            Z = 0;
            Dimension = 1;
        }

        // Work sizes are size_t values natively, so each component is marshalled as an IntPtr.
        public IntPtr[] ToArray()
        {
            return new[] { new IntPtr(X), new IntPtr(Y), new IntPtr(Z) };
        }
    }

    enum ProgramBuildInfoString
    {
        Options = 0x1182,
        Log = 0x1183
    }
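
    The example above still blocks the main thread in Update(), because the buffer read is a blocking call. Since the original question was about background threads, here is a rough sketch of one way to move blocking work off the main thread and poll for the result. `BackgroundJob<T>` is a hypothetical helper, not a Unity or OpenCL API, and it assumes the project uses the .NET 4.x scripting runtime so that `System.Threading.Tasks` is available.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: runs blocking work on a thread-pool thread and lets
// the main thread poll for the result once per frame or editor update.
public sealed class BackgroundJob<T>
{
    readonly Task<T> _task;

    public BackgroundJob(Func<T> work)
    {
        _task = Task.Run(work);
    }

    public bool IsDone
    {
        get { return _task.IsCompleted; }
    }

    // Returns true once the work has finished. Any exception thrown by the
    // work is rethrown here, so failures surface on the polling thread.
    public bool TryGetResult(out T result)
    {
        if (!_task.IsCompleted)
        {
            result = default(T);
            return false;
        }
        result = _task.GetAwaiter().GetResult();
        return true;
    }
}
```

    The idea would be to put the EnqueueRange and ReadBuffer calls inside the delegate, return the read-back array from it, and call TryGetResult from Update(). The delegate must not touch Unity APIs that are restricted to the main thread, and I have not verified how every OpenCL driver behaves when called this way.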