
What do you think about integrating CUDA in Unity?

Discussion in 'General Discussion' started by bugfly, Dec 20, 2014.


Do you need CUDA integration in Unity?

  1. Yes! I need it now!

  2. Yes! It's a good feature and I will use it someday.

  3. No! I will never need this! I am sure!

  4. It does not matter to me, at least for now!

  5. Yes, I need real-time calculations on the GPU, but I would prefer OpenCL integration!

  6. Yes, I need real-time calculations on the GPU, but I would prefer DirectCompute integration!

  7. Yes, I need real-time calculations on the GPU, but I would prefer ATI Stream integration!

  8. Yes, I need real-time calculations on the GPU, but I would prefer OpenACC integration!

  9. Yes, I need real-time calculations on the GPU, but I would prefer C++ AMP integration!

  10. Yes, I need it and I do not care how you do it, as long as it supports every GPU and APU!

Multiple votes are allowed.
Results are only viewable after voting.
  1. bugfly

    bugfly

    Joined:
    Mar 15, 2014
    Posts:
    21
    Hi! I asked a question on AnswerHub. Here it is: http://answers.unity3d.com/questions/858680/will-unity-integrate-cuda-someday.html
    And I was advised to bring it to the forum so you could all weigh in on the topic. So here is the topic. If you are interested in CUDA integration, then let's discuss! I believe your interest, and your explanations of why you need it, will help lead to a positive decision on this.
     
  2. hippocoder

    hippocoder

    Digital Ape Moderator

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    I'm not very interested in CUDA or OpenCL themselves, but rather in an intermediate language that would export to both targets (compute shaders / OpenCL), because the world isn't on one GPU. I don't believe CUDA is a good fit for Unity at all, no.

    It's nice if you can calculate on AMD or NVIDIA, as opposed to only NVIDIA and, most likely, only on Windows.

    Done right, Unity can have a solution for Mac, Windows, PS4, Xbox One and upcoming mobiles.
     
  3. bugfly

    bugfly

    Joined:
    Mar 15, 2014
    Posts:
    21
    hippocoder, good idea! I added several answers to the list...
     
  4. bugfly

    bugfly

    Joined:
    Mar 15, 2014
    Posts:
    21
    The main advantage of solutions like CUDA and ATI Stream is speed!
    The work is done directly on the hardware, through the driver, whereas the alternatives go through extra Windows-level layers such as libraries, which is much slower.
     
    shkar-noori likes this.
  5. RockoDyne

    RockoDyne

    Joined:
    Apr 10, 2014
    Posts:
    2,234
    Maybe in the editor, and that is a very iffy maybe. The main issue, though, is that we are talking about a game engine that is already using the GPU. It just doesn't make much sense to me to bring in extra technology to leverage the GPU when the GPU is already being used.

    With that said, I don't know whether these are simply nice enough programming environments that they would make me eat my words should I ever really need the GPU for something (probably generation).
     
  6. bugfly

    bugfly

    Joined:
    Mar 15, 2014
    Posts:
    21
    RockoDyne, ok! Here's my case. I want to make a real-time clothes-modelling program.
    First we scan a human body and get a 3D model, a kind of virtual mannequin; then we start placing pieces of fabric on the mannequin and watch what happens. In this setup the game engine renders the mannequin, the menu items, some tools and so on. The fabric would have its own engine, and it would be calculated in CUDA, for example. CUDA is a good fit here because you can install two video cards and use the first one for Unity and the other for CUDA. CUDA can use the card that is not attached to the monitor and send the calculated data over to the other video card. The only problem is the GUI: unlike a graphics engine, CUDA is very poor at menu items and all the different tools for creating and managing 3D objects. The closest analogue of this type of program is Marvelous Designer. Is it great? It's really great! The menus and tools are fine, but it is not real time. A fabric engine requires enormous calculations, and it is better to run them on a second, separate video card. That is what the integration is needed for: to indicate within Unity that some objects should render on the first card while the calculations run on the other. If you want to do it yourself by combining two engines in MS Visual Studio, you have to be a really good programmer...
     
    Last edited: Dec 20, 2014
  7. RockoDyne

    RockoDyne

    Joined:
    Apr 10, 2014
    Posts:
    2,234
    So, you're going to leave a CPU that probably has at least four cores sitting mostly idle, while you force people to buy a second graphics card to run a high-end cloth physics simulation? Makes perfect sense to me...

    As for the idea itself, you're probably a decade too early to be able to do it easily on your own, and doing it today would take a few years of highly optimised code from a moderately sized team of devs. High-detail cloth physics (or soft-body physics in general) just hasn't been ready for prime time. There are reasons why most cloth physics in games is limited to fairly low joint-count capes rather than t-shirts.
     
  8. bugfly

    bugfly

    Joined:
    Mar 15, 2014
    Posts:
    21
    RockoDyne, yes, I understand that. There is a small misunderstanding here. Marvelous Designer makes clothes for games, while I want to make a program for constructing real clothes, like Bodymetrics:
    http://www.bodymetrics.com/


    Bodymetrics started in 2005! Now it looks like this:

    And since that time nobody has repeated it. All because of the calculations. Now, with CUDA, it can be done; more than that, a universal program can be made, a program for any clothes you want, not just jeans. Real-time calculation is the weak part here; everything else is already done, there are ready solutions on the market. CUDA is for the real world: it is so fast that it can help in many real simulations. It is a revolution in many areas of our lives, the automation of processes that could not be automated before. Everyone will understand this very soon, once the first successful automation programs appear; then CUDA will evolve and gain features like the ones game engines have. But for now it is just a compute platform, and it needs the help of game engines for the interface.
     
    Last edited: Dec 21, 2014
  9. danybittel

    danybittel

    Joined:
    Aug 1, 2013
    Posts:
    68
  10. Zerot

    Zerot

    Joined:
    Jul 13, 2011
    Posts:
    135
    Unity already has support for compute shaders when targeting DX11, so you can already use them for calculations on the GPU. You are limited to the Windows DX11 platform, though.
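    A minimal sketch of what that looks like from C#, assuming a .compute asset whose kernel "CSMain" runs with [numthreads(64,1,1)] over a RWStructuredBuffer<float> named "data" (the asset and the names are illustrative, not from this thread):

    Code (CSharp):
        using UnityEngine;

        // Minimal sketch: dispatch a DX11 compute shader from Unity and read the result back.
        public class ComputeSketch : MonoBehaviour
        {
            public ComputeShader shader;   // the hypothetical .compute asset, assigned in the Inspector

            void Start()
            {
                const int count = 1024;
                var data = new float[count];
                for (int i = 0; i < count; i++) data[i] = i;

                var buffer = new ComputeBuffer(count, sizeof(float));
                buffer.SetData(data);

                int kernel = shader.FindKernel("CSMain");
                shader.SetBuffer(kernel, "data", buffer);
                shader.Dispatch(kernel, count / 64, 1, 1);   // matches [numthreads(64,1,1)]

                buffer.GetData(data);   // results come back to the CPU here
                buffer.Release();
            }
        }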
     
  11. TylerPerry

    TylerPerry

    Joined:
    May 29, 2011
    Posts:
    5,577
    IIRC there are some assets out there for this.
     
  12. bugfly

    bugfly

    Joined:
    Mar 15, 2014
    Posts:
    21
    danybittel, real fabric has a density of about two thousand fibres per metre, which means roughly 4 million knots per square metre (2,000 threads in each direction, a knot being each point where fibres cross). None of the existing fabric engines provides that degree of discreteness, and for real fabric you need it. The calculations required for real fabric are far beyond that; Bodymetrics uses a computer cluster for them. That is expensive, but CUDA is a cheap solution in this case...

    Zerot, the DX11 platform is much slower than CUDA...
     
  13. Zerot

    Zerot

    Joined:
    Jul 13, 2011
    Posts:
    135
    Do you have any sources for that? Anyway, compute shaders can be used in Unity, so unless you are willing to write a native plugin for Unity (and I don't even know whether it is possible to expose CUDA through a native plugin), that is your only option for now.
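    For what it's worth, the C# half of such a native plugin would just be a P/Invoke declaration. A sketch, assuming a hypothetical C++/CUDA DLL named "ClothCuda" in Assets/Plugins that exports a StepCloth function (all of these names are made up for illustration):

    Code (CSharp):
        using System.Runtime.InteropServices;
        using UnityEngine;

        // Sketch of calling a hypothetical native CUDA plugin from Unity.
        // "ClothCuda" and "StepCloth" are illustrative names; the native side would be
        // a separate C++/CUDA DLL built against the CUDA toolkit.
        public class CudaPluginBridge : MonoBehaviour
        {
            [DllImport("ClothCuda")]
            static extern void StepCloth(Vector3[] vertices, int count, float deltaTime);

            public MeshFilter cloth;   // mesh to simulate, assigned in the Inspector
            Vector3[] vertices;

            void Start() { vertices = cloth.mesh.vertices; }

            void Update()
            {
                // One CUDA simulation step per frame; the plugin writes the new
                // positions back into the managed array.
                StepCloth(vertices, vertices.Length, Time.deltaTime);
                cloth.mesh.vertices = vertices;
                cloth.mesh.RecalculateNormals();
            }
        }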
     
  14. bugfly

    bugfly

    Joined:
    Mar 15, 2014
    Posts:
    21
    Zerot, when you use CUDA, you ship the program with all of its commands directly to the hardware; it runs there, the management is inside the hardware, and on the outside there is only the incoming data for the calculations and the output data that has already been calculated. When you use DX11, the management is not inside the hardware, it is inside the operating system; the operating system gives commands to the hardware, and that is a very expensive intermediary which slows the calculations down a lot.
     
  15. bugfly

    bugfly

    Joined:
    Mar 15, 2014
    Posts:
    21
  16. bugfly

    bugfly

    Joined:
    Mar 15, 2014
    Posts:
    21
    It seems that not everyone understands the real complexity of calculations in real time, and why they are so useful. To show it, let's cover any 3D object with points: you get many points with different coordinates. Then connect these points with segments so that you get a surface consisting of many small polygons attached to each other (triangles, quadrilaterals, pentagons and so on, irregular polygons!). This surface made of many polygons is called a mesh, and that's it: the mesh and every one of its polygons is stored in a file. You also have different options in that file; these options are the algorithms that modify the mesh, and they run on the video card. But the original data of a 3D object is the mesh; everything else is just the different algorithms applied to it.

    Any mesh must be uploaded to the video card's memory before rendering, and any game engine can do that: you give the engine the mesh and it renders it. But that only works when you already have the mesh. When you deal with fabric, you need to calculate all the bending, stretching and other parameters of the fabric; only then do you have the resulting mesh and can render it. That is far beyond rendering, it is a much more complex calculation. Rendering is simple and does not require hard calculations, but calculating the resulting mesh that will be rendered really is hard. And that is why CUDA is needed: the result mesh can be calculated in CUDA and then exported to Unity for rendering. That is the point of all GPU calculation. It is not for graphics, which we already have in every game engine; it is for real-time calculations.
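    In Unity terms that pipeline looks roughly like the sketch below: calculate the result mesh on the GPU, pull it back, and let the engine render it. It assumes a compute kernel "Deform" with a StructuredBuffer<float3> of rest positions, a RWStructuredBuffer<float3> of output positions and [numthreads(64,1,1)]; the kernel and the names are illustrative.

    Code (CSharp):
        using UnityEngine;

        // Sketch: calculate the deformed vertex positions on the GPU each frame,
        // then hand the resulting mesh back to Unity for rendering.
        public class GpuMeshDeform : MonoBehaviour
        {
            public ComputeShader deformShader;   // hypothetical .compute asset with kernel "Deform"
            Mesh mesh;
            ComputeBuffer restBuffer, outBuffer;
            Vector3[] result;

            void Start()
            {
                mesh = GetComponent<MeshFilter>().mesh;
                Vector3[] rest = mesh.vertices;
                result = new Vector3[rest.Length];

                restBuffer = new ComputeBuffer(rest.Length, 3 * sizeof(float));
                outBuffer  = new ComputeBuffer(rest.Length, 3 * sizeof(float));
                restBuffer.SetData(rest);
            }

            void Update()
            {
                int kernel = deformShader.FindKernel("Deform");
                deformShader.SetFloat("time", Time.time);
                deformShader.SetBuffer(kernel, "restPositions", restBuffer);
                deformShader.SetBuffer(kernel, "outPositions", outBuffer);
                deformShader.Dispatch(kernel, Mathf.CeilToInt(result.Length / 64f), 1, 1);

                outBuffer.GetData(result);   // the calculated result mesh
                mesh.vertices = result;      // hand it to the engine for rendering
                mesh.RecalculateNormals();
            }

            void OnDestroy() { restBuffer.Release(); outBuffer.Release(); }
        }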
     
    Last edited: Dec 21, 2014
  17. Zerot

    Zerot

    Joined:
    Jul 13, 2011
    Posts:
    135
    Do you have an actual source for that? Because AFAIK both DX11 compute shaders and CUDA generate an intermediate format that gets transformed into a command stream by the driver. In both cases the command stream gets uploaded to the GPU once and then executed (multiple times) by a dispatch.

    By all means correct me if I'm wrong. It is always good to learn new things or be corrected.
     
  18. bugfly

    bugfly

    Joined:
    Mar 15, 2014
    Posts:
    21
    Zerot, you are correct there! What I am worried about is transferring data from one GPU's memory to another GPU's memory. Is that handled at a low level by DX11? I do not know. If it is handled, then I am making problems for myself. But I know that CUDA can run a program on one GPU and transfer data from that GPU's memory to the memory of the other GPU, bypassing the operating system, straight from one PCIe slot to the other. I do not know how NVIDIA does it, but that is how SLI mode works, and CUDA can use this memory-transfer mechanism even outside SLI mode (you just need a motherboard with SLI support). I just hope Unity would be able to use that transferred memory for its own needs. That's why the integration is needed.
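    For reference, the CUDA runtime calls involved in such a device-to-device copy can even be reached from C# with P/Invoke. A rough sketch, assuming a Windows CUDA runtime DLL named cudart64_65 (the file name depends on the installed toolkit version) and ignoring error handling:

    Code (CSharp):
        using System;
        using System.Runtime.InteropServices;

        // Sketch: copy a buffer from GPU 0's memory to GPU 1's memory with the CUDA
        // runtime's peer-to-peer API, without staging through system RAM.
        static class CudaPeerCopySketch
        {
            const string CudaRt = "cudart64_65";   // assumption: CUDA 6.5 runtime on Windows

            [DllImport(CudaRt)] static extern int cudaSetDevice(int device);
            [DllImport(CudaRt)] static extern int cudaMalloc(out IntPtr devPtr, UIntPtr size);
            [DllImport(CudaRt)] static extern int cudaDeviceCanAccessPeer(out int canAccess, int device, int peer);
            [DllImport(CudaRt)] static extern int cudaDeviceEnablePeerAccess(int peer, uint flags);
            [DllImport(CudaRt)] static extern int cudaMemcpyPeer(IntPtr dst, int dstDevice, IntPtr src, int srcDevice, UIntPtr count);
            [DllImport(CudaRt)] static extern int cudaFree(IntPtr devPtr);

            public static void CopyBetweenGpus(uint bytes)
            {
                int canAccess;
                cudaDeviceCanAccessPeer(out canAccess, 0, 1);
                if (canAccess == 0) return;         // the two GPUs cannot reach each other directly

                cudaSetDevice(0);
                IntPtr src;
                cudaMalloc(out src, (UIntPtr)bytes);
                cudaDeviceEnablePeerAccess(1, 0);   // let GPU 0 address GPU 1's memory

                cudaSetDevice(1);
                IntPtr dst;
                cudaMalloc(out dst, (UIntPtr)bytes);

                // Device-to-device copy over PCIe, no round trip through system memory.
                cudaMemcpyPeer(dst, 1, src, 0, (UIntPtr)bytes);

                cudaFree(dst);
                cudaSetDevice(0);
                cudaFree(src);
            }
        }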
     
    Last edited: Dec 21, 2014
  19. Zerot

    Zerot

    Joined:
    Jul 13, 2011
    Posts:
    135
    Ah, if you are specifically talking about SLI setups, then afaik compute shaders are useless. IIRC dx11 CS can only switch between the GPUs when the data is presented(i.e. a display frame). So you can use it to precompute the data for the next frame on a gpu, but you can't use it for constant computation.
     
  20. danybittel

    danybittel

    Joined:
    Aug 1, 2013
    Posts:
    68
    Oh... you misunderstood. Fabric Engine is not an engine for calculating fabrics; "Fabric" is just its name. It's basically an engine that makes it easy to program GPUs and multithreaded CPUs. It's heavily used in visual effects for custom tools. Click the link and have a look; maybe it's a better fit.
     
  21. thxfoo

    thxfoo

    Joined:
    Apr 4, 2014
    Posts:
    515
    I understand that CUDA is NVIDIA-only, but it is more or less the standard for most of the cool things (with the exceptions of mining bitcoins and breaking hash functions).

    So if you look for implementations of machine-learning papers, crazy physics stuff or rendering straight from the scientific community, most of it will be CUDA. Some of the large projects are working on OpenCL support, but mostly the OpenCL versions are behind and not feature-complete.

    So if you can use CUDA directly, you can reuse a large amount of existing code. For creating non-game specialised stuff I would pick CUDA any day at the moment.
     
  22. hippocoder

    hippocoder

    Digital Ape Moderator

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    OpenCL is the standard simply because it *is* a standards-driven API.

    CUDA is proprietary, which is why Apple does not support it. You don't have this situation with OpenGL vs DirectX, because Apple does not support DX and MS wouldn't want them to.

    OpenCL is another matter. It's OS- and hardware-neutral. It used to lag behind in performance; it doesn't any more, last time I checked. The story will probably play out as DirectCompute on Windows and OpenCL on Mac. Probably :p

    http://www.dslrfilmnoob.com/2014/04/26/opencl-vs-cuda-adobe-premiere-cc-rendering-test/
     
  23. thxfoo

    thxfoo

    Joined:
    Apr 4, 2014
    Posts:
    515
    http://www.nvidia.com/object/mac-driver-archive.html

    I prefer open standards too. But atm much of the interesting stuff is just not available for OpenCL.
     
  24. hippocoder

    hippocoder

    Digital Ape Moderator

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    I think you'll find it is, like, oh, anything optimised for the Mac Pro. This stuff is changing now. You can't disregard the Apple Mac Pro, because it's a big deal with professional users, and it's all OpenCL.

    And I didn't say NVIDIA didn't support Mac, only that Apple didn't support CUDA, which they don't, at least OS- and driver-wise.

    CUDA is useless on a lot of Macs, purely because of the AMD GPU deal, which leaves CUDA on Mac pretty much relegated to hobbyist or enthusiast hackintoshes for modern GPUs.

    If Apple change their mind and use NVIDIA GPUs, it's unlikely they're going to rewrite the OS to support CUDA or encourage developers to adopt it.

    Sure, Windows is the biggest footprint, but Apple has a way of throwing spanners in the works and forcing the world to adapt, as with Flash, because its segment is too big to ignore.

    Still, I'm not going to be supporting CUDA unless Unity does it all behind the scenes for me, like HLSL and so forth. I don't want to worry about the background API, just my code... being all cross-platform and all :)
     
    Last edited: Dec 23, 2014
  25. bugfly

    bugfly

    Joined:
    Mar 15, 2014
    Posts:
    21
    hippocoder, your dispute is mainly about NVIDIA versus ATI, not about all the features of these technologies. From my point of view the main feature is the possibility of fast data transfer between two or more video cards. That can be done only with CUDA + SLI or ATI Stream + ATI CrossFire...
    These are the most powerful approaches in this case... Yes, Apple devices have a single video card, but that is only for now. If real-time calculation grows, one video card will not be enough, and then NVIDIA and ATI will carry their SLI and CrossFireX technologies further into APUs. They will not make yet another technology; what we have now is the beta test for the future, so it is wise to focus on these two technologies.
     
  26. hippocoder

    hippocoder

    Digital Ape Moderator

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    It's better that OpenCL evolves to deal with these optimisations, and/or vice versa, in the long run. With DX11 and Mantle we're seeing the same proprietary things pop up. This is normal. ATI (AMD) had tessellation first; it was fast, and nobody used it. Eventually, when things are standardised, they get used.

    The latest plan from Khronos is that OpenGL will be optimised so you can do things similar to Metal, DX11 and Mantle. The same thing is happening for OpenCL.

    These standards move more slowly, because they're run by committee, but they do move, and they do benefit us all. In any case, it's Unity's decision, not ours. They have to choose between supporting a number of conflicting APIs or settling for just one. It's unlikely they will stop supporting DirectCompute; more likely they would simply add OpenCL support for Windows, mobile and Apple.

    I'm not really disputing anything. I'm just saying what I believe to be the best foot forward.
     
  27. bugfly

    bugfly

    Joined:
    Mar 15, 2014
    Posts:
    21
    Last edited: Dec 23, 2014
  28. hippocoder

    hippocoder

    Digital Ape Moderator

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    I don't understand. What do you think CUDA does that OpenCL doesn't?
     
  29. bugfly

    bugfly

    Joined:
    Mar 15, 2014
    Posts:
    21
    hippocoder, CUDA already supports memory transfer between two video cards over SLI (6 Gbit/s). For two or more cards it's the better solution; for a single card it does not matter, and you can choose whichever technology you want.
     
    Last edited: Dec 24, 2014
  30. bugfly

    bugfly

    Joined:
    Mar 15, 2014
    Posts:
    21
    Ok! It seems that OpenCL is winning for now. But the better choice under these conditions would be a GPU calculation system based on Unity's own languages, don't you think? Unity could integrate a code translator into the engine so that it could translate the Unity languages (C#, JS and Boo) into the language of whichever library is chosen for integration (for example OpenCL).
    What do you think about this kind of integration?
     
  31. bugfly

    bugfly

    Joined:
    Mar 15, 2014
    Posts:
    21


    Hey, look at the tests. CUDA and GLSL show good results. But the potential of GLSL is much higher than OpenCL's. As you know, GLSL is a language for OpenGL and it is very awkward for parallel computing; that is what OpenCL was made for. But look at the difference between GLSL and OpenCL: it seems that OpenCL is not optimised yet. Theoretically OpenCL supports both CPUs and GPUs on many systems (not only NVIDIA and ATI: SpursEngine https://en.wikipedia.org/wiki/SpursEngine , Cell https://en.wikipedia.org/wiki/Cell_(microprocessor) , S3 Graphics Chrome https://en.wikipedia.org/wiki/S3_Chrome and others). Good idea! But NVIDIA is moving forward while OpenCL stands still. NVIDIA even handles support for NVIDIA cards in OpenCL with the best possible results. And what is AMD doing in OpenCL? Oh great, they already support AMD CPUs without bugs!!! Parallel computing is not the top priority for AMD; they are losing this match. It seems that only NVIDIA is pushing parallel computing forward, in both CUDA and OpenCL. OpenCL is multi-platform, but nobody except NVIDIA supports it properly. That's why CUDA is on top for now.
     
    Last edited: Dec 26, 2014
  32. blueLED

    blueLED

    Joined:
    Jan 5, 2014
    Posts:
    102
    I use CUDA with Blender for Cycles rendering (real-time raycast lighting). It would be great if there was an option to enable CUDA for the Unity lightmapping process to speed things up.
     
    shkar-noori likes this.
  33. hippocoder

    hippocoder

    Digital Ape Moderator

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    Uh, I think it's just a case of most likely using compute shader syntax and Unity converting it, like HLSL.

    Your comment regarding Enlighten bake times is a good one, but I wish you guys would realise CUDA isn't available to the vast majority; it's only available to a minority. You see, Intel chipsets and AMD combined outnumber NVIDIA.
     
  34. blueLED

    blueLED

    Joined:
    Jan 5, 2014
    Posts:
    102
    You're right, but does that mean it shouldn't be implemented at some level? I'm glad the people at Blender took the time to add it; it would be great if UT did too. If you could cut lightmapping bake time to a quarter or less, wouldn't you want that too? I know I would.
     
  35. hippocoder

    hippocoder

    Digital Ape Moderator

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    I really think it shouldn't be integrated at all, ever. I think (and hope) daily for less single-GPU proprietary crap, and more investment in something the entire world can use, instead of every company trying to lock you into an API just because it means more money for them, with zero actual benefit to the consumer.

    So, no.
     
  36. thxfoo

    thxfoo

    Joined:
    Apr 4, 2014
    Posts:
    515
    You are right for many cases of proprietary crap, but I think not in this case.

    NVIDIA is simply moving the fastest in this area, and I wouldn't want them to wait for the rest. Especially since they have done their homework: their OpenCL implementation seems solid. So as long as they also support the open standards well, I'm okay with them going ahead and rocking with their own stuff.

    CUDA may not be ideal for something general-purpose like Unity. But if you are implementing something that simply has to be the best thing available, then at the moment there is no way around CUDA.

    That's why much of the research in rendering, physics or machine learning uses it. My deep neural nets run on CUDA, for a reason. I evaluated many things, but it would just have been crazy not to use CUDA at the moment if you want to compete with the state of the art.
     
    bugfly and blueLED like this.
  37. bugfly

    bugfly

    Joined:
    Mar 15, 2014
    Posts:
    21
    Look here (page 10): http://www.nvidia.com/docs/IO/116711/sc11-multi-gpu.pdf
    It's about CUDA multi-GPU programming. See the data exchange rates? This is what I am talking about.
    They are fast!!! And it is only in CUDA, only from NVIDIA! Yes, it depends on the hardware, but look at the advantages... No third-party library will ever do that; only the manufacturer, with its own hardware design (not only software), can do it.
     
  38. darkhog

    darkhog

    Joined:
    Dec 4, 2012
    Posts:
    2,218
    Well, how about cases where your game doesn't use the GPU to its full extent? Some of that spare power could be used to compute things like world generation, AI, etc., so the CPU could do other things and the whole computer would be used to make the best possible game.
     
  39. bugfly

    bugfly

    Joined:
    Mar 15, 2014
    Posts:
    21
  40. liysmagic

    liysmagic

    Joined:
    Aug 19, 2015
    Posts:
    1
    Currently I want to do exactly this. In my case, I want to apply a GPU algorithm to an existing mesh and points in Unity, and after the calculation render the newly generated mesh. Can I implement this with a compute shader? Do I need to transfer the results to the CPU first and then pass them to Unity for rendering? Are there any restrictions compared to a normal GPU algorithm? Since I am new to this, I would appreciate any suggestions.
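    One possible shape of this, as a sketch rather than a definitive answer: keep the positions in a ComputeBuffer, bind the same buffer to a drawing material, and render with Graphics.DrawProcedural, so the results never have to come back to the CPU. It assumes a kernel "CSMain" that writes a RWStructuredBuffer<float3> named "positions" and a material whose vertex shader reads that buffer by SV_VertexID; all of the names are illustrative.

    Code (CSharp):
        using UnityEngine;

        // Sketch: compute vertex positions on the GPU and draw them without a CPU readback.
        public class DrawComputedPoints : MonoBehaviour
        {
            public ComputeShader compute;   // hypothetical kernel "CSMain" filling "positions"
            public Material pointMat;       // its shader declares StructuredBuffer<float3> positions
            ComputeBuffer positions;
            const int vertexCount = 65536;

            void Start()
            {
                positions = new ComputeBuffer(vertexCount, 3 * sizeof(float));
            }

            void Update()
            {
                int kernel = compute.FindKernel("CSMain");
                compute.SetBuffer(kernel, "positions", positions);
                compute.Dispatch(kernel, vertexCount / 64, 1, 1);   // assumes [numthreads(64,1,1)]
            }

            void OnRenderObject()
            {
                // The buffer stays in GPU memory; the vertex shader fetches it by SV_VertexID.
                pointMat.SetBuffer("positions", positions);
                pointMat.SetPass(0);
                Graphics.DrawProcedural(MeshTopology.Points, vertexCount);
            }

            void OnDestroy() { positions.Release(); }
        }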
     
  41. Aurore

    Aurore

    Director of Real-Time Learning Unity Technologies

    Joined:
    Aug 1, 2012
    Posts:
    3,106
    8 months isn't a terrible necro post, but it is still a necro. Please don't resurrect old discussion topics.
     
Thread Status:
Not open for further replies.