I cannot believe how slow WebGL builds are now...

Discussion in '2018.2 Beta' started by Arowx, May 31, 2018.

Thread Status:
Not open for further replies.
  1. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    Literally tens of minutes to build a simple scene: some of that time with very low CPU usage, and some with a multi-core CPU maxed out, so I can't do anything else until it finishes.

    Is this getting worse, is Unity now getting us to mine Bitcoin when we build, or has my computer got a virus?

    Note: very low GPU usage, so probably not Bitcoin mining!

    If you set up a new scene with a token cube and build to WebGL/WASM, how long does your PC/Mac take to build it?
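
    If you want a repeatable timing rather than eyeballing it, here's a minimal editor-script sketch (the menu path and scene path are placeholders):

    Code (CSharp):
    using System.Diagnostics;
    using UnityEditor;
    using UnityEditor.Build.Reporting;

    public static class TimedWebGLBuild
    {
        [MenuItem("Build/Timed WebGL Build")]
        public static void Build()
        {
            var sw = Stopwatch.StartNew();
            BuildReport report = BuildPipeline.BuildPlayer(
                new[] { "Assets/Scenes/CubeScene.unity" }, // hypothetical scene path
                "Builds/WebGL",                            // output folder
                BuildTarget.WebGL,
                BuildOptions.None);
            sw.Stop();
            UnityEngine.Debug.Log("WebGL build " + report.summary.result + " in " + sw.Elapsed);
        }
    }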
     
    MoonJellyGames likes this.
  2. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    Started a new project, set up a one-cube scene, and built to WebGL/WASM.

    Total build time over 58 minutes!

    Watched the process in Task Manager; for most of that time it used very little CPU, then for the last couple of minutes it flooded the CPU with high-load work.

    So is there some way to speed up the process by using more processing power in the earlier stages, and ideally limiting the max load in the final stages, e.g. to 80%?
     
  3. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    OK, the same build to PC takes about 30 seconds.

    Then I built to PC with IL2CPP and it took over 44 minutes.

    Interesting note: on the second build to PC, the build process was a lot faster.

    It looks like IL2CPP is not optimised, or Unity does not ship pre-IL2CPP'd assemblies; let's face facts, this example has no code in it other than the code built into Unity.

    Could Unity not pre-build the default assemblies via IL2CPP and include them in the install to save time?

    Or what about more multi-threading for IL2CPP? Separate assemblies will generate separate .cpp files, right? Then it's just compiling them to DLLs or the executable.
     
  4. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    6,588
    Are you sure it's really converting the managed assemblies to C++, or perhaps compiling and linking the massive amount of generated C++ code instead?

    In my projects, IL2CPP performance is acceptable, but compiling the generated code is what takes a significant amount of time.
     
  5. spacefrog

    spacefrog

    Joined:
    Jun 14, 2009
    Posts:
    734
    Just tested here:
    2018.2 b6 took less than 3 minutes to build a scene + cube to WebGL/WebAssembly.
    All other player/build settings were left at default values.
     
  6. JoshPeterson

    JoshPeterson

    Unity Technologies

    Joined:
    Jul 21, 2014
    Posts:
    6,920
    Yes, this is something we're looking at doing. At the moment it is difficult because the IL2CPP build toolchain removes unused IL bytecode from managed assemblies. That means the .NET class library and Unity Engine assemblies can be different for each build, depending on which code in them is used by the script code or assemblies in the project. So pre-compiling won't help. We're looking for better solutions, though.

    Also, you may want to consider what your Api Compatibility Level setting is. If you are using the new scripting runtime, try the .NET Standard 2.0 Api Compatibility Level instead of .NET 4.x. The .NET Standard 2.0 profile has significantly less code to convert and compile than .NET 4.x. In 2018.3 we'll be making that code even smaller.
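
    If you'd rather flip that setting from an editor script than through the Player Settings UI, a minimal sketch (the menu item name is just for illustration):

    Code (CSharp):
    using UnityEditor;

    public static class SetApiProfile
    {
        [MenuItem("Build/Use .NET Standard 2.0 Profile")]
        public static void Apply()
        {
            // Smaller class library profile: less IL to convert, less C++ to compile.
            PlayerSettings.SetApiCompatibilityLevel(
                BuildTargetGroup.WebGL,
                ApiCompatibilityLevel.NET_Standard_2_0);
        }
    }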
     
  7. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    OK, I get that: after the first run through, it's like the build process caches the old build and just rebuilds the files that changed.

    Try building to PC/Mac/Linux with IL2CPP set via the Player Settings instead of Mono?

    I found that switching between build targets that use IL2CPP triggered the slow build process, as opposed to the pre-cached build process you get when you re-build to the same target.
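
    For anyone trying this, the backend switch can be scripted too; a minimal sketch (the menu item name is mine):

    Code (CSharp):
    using UnityEditor;

    public static class UseIL2CPP
    {
        [MenuItem("Build/Switch Standalone to IL2CPP")]
        public static void Apply()
        {
            PlayerSettings.SetScriptingBackend(
                BuildTargetGroup.Standalone,
                ScriptingImplementation.IL2CPP);
        }
    }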
     
  8. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    I understand that; however, why does it use such low processor bandwidth? It's not even using the full capacity of a single core/thread, never mind taking advantage of the full capacity of all available cores.

    This strikes me as meaning either there are 'slowdown bugs' in the build process or it has not been optimised for throughput.

    For example, why am I seeing python.exe running in a build process? Surely the Unity Editor, or maybe a dedicated Unity build manager, should be running and managing the process.

    What would be a great help would be a better build-in-progress UI system.

     
    megame_dev likes this.
  9. JoshPeterson

    JoshPeterson

    Unity Technologies

    Joined:
    Jul 21, 2014
    Posts:
    6,920
    The code conversion step (literally IL -> C++) is mostly single-threaded currently. We're internally prepping the IL2CPP code to allow it to be multi-threaded, but I'm not sure if or when we will enable that. For most projects, though, the C++ compilation is the most time-consuming part. There we do run many instances of the C++ compiler in parallel.
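
    Roughly, the parallel part of the C++ compilation looks like the sketch below; this is an illustration of the idea, not our actual implementation, and the compiler name and flags are placeholders:

    Code (CSharp):
    using System;
    using System.Diagnostics;
    using System.IO;
    using System.Threading.Tasks;

    public static class ParallelCompile
    {
        public static void CompileAll(string sourceDir)
        {
            var options = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount };

            // One compiler process per generated .cpp file, up to one per core.
            Parallel.ForEach(Directory.GetFiles(sourceDir, "*.cpp"), options, file =>
            {
                // Placeholder compiler and flags; the real toolchain and arguments differ.
                using (var proc = Process.Start("clang++",
                    "-c \"" + file + "\" -o \"" + Path.ChangeExtension(file, ".o") + "\""))
                {
                    proc.WaitForExit();
                }
            });
        }
    }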

    Unity uses Emscripten to compile the generated C++ code for WebGL, and Emscripten uses python behind the scenes.

    I agree, this would be really nice to have. At the moment we're not working on this, but it is on our radar.
     
  10. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    The thing is, it seems very low bandwidth; even single-threaded it should be able to max out a core/thread on my CPU?

    It could be that you're processing one file at a time, so the bandwidth is limited by a load-file, parse/convert, write-file style system?

    Whereas asynchronously streaming into memory, converting in memory, and then asynchronously streaming out to files could be more productive; the bulk file load and save operations would only be limited by storage bandwidth?

    Then again, why use files at all? Surely you could run the process against memory-based file streams; if all the files were loaded from and saved to memory, this would massively boost performance?
     
  11. JoshPeterson

    JoshPeterson

    Unity Technologies

    Joined:
    Jul 21, 2014
    Posts:
    6,920
    We usually do see one core maxed out during code conversion. At least that is the case when the disk is an SSD. If you are using an HDD, then the performance characteristics might change. Additionally, you might want to look at disabling the virus scanner for the files in your Unity project (if you are running on Windows). We've found that virus scanners can significantly impact file-writing performance for IL2CPP.

    In general, we try to do as much in memory as possible, though. All of the C++ code generation is done in memory buffers which are written to files only once generation is complete, so I don't think we can gain much code-conversion performance from that aspect.
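
    The pattern described here (generate into a buffer, write once) looks roughly like this; a minimal illustration of the idea, not IL2CPP's actual code:

    Code (CSharp):
    using System.IO;
    using System.Text;

    public static class InMemoryCodeGen
    {
        public static void Emit(string outputPath)
        {
            // Build the whole translation unit in a memory buffer...
            var code = new StringBuilder();
            code.AppendLine("#include \"il2cpp-config.h\"");
            code.AppendLine("// ...generated method bodies go here...");

            // ...then hit the disk exactly once per file.
            File.WriteAllText(outputPath, code.ToString());
        }
    }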
     
  12. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    So if you could run IL2CPP completely in memory and stream the files it produces, in memory, to the C++ compiler, could this massively speed up the build process?

    What if I set up a RAM drive and stored my project there? Would that have a similar effect on build performance?
     
  13. JoshPeterson

    JoshPeterson

    Unity Technologies

    Joined:
    Jul 21, 2014
    Posts:
    6,920
    I don't know. I expect that avoiding the file writing would help. However, I don't think any C++ compilers have a direct memory-stream API. In addition, we get some benefits by writing the files to disk, like incremental compilation and debugging ability.

    I've not heard of anyone trying this, but I suppose it cannot hurt.
     
  14. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    6,588
    BTW, make sure to add an exception to your anti-virus program to ignore the project/build folder, Visual Studio (compiler+linker) and Unity. You basically don't want the AV software to interfere with the compiler, which often caused massive slow-downs for me in the past.
     
  15. Tautvydas-Zilys

    Tautvydas-Zilys

    Unity Technologies

    Joined:
    Jul 25, 2013
    Posts:
    10,644
    44 minutes sounds crazy. You might want to look at your system resource usage and figure out where the bottleneck lies. IL2CPP doesn't write that many files to disk (it used to in the past), so I doubt you're IO-bound unless you're using an old mechanical drive. I just checked: it takes 57 seconds to build a new Unity project with a cube to Windows IL2CPP on my machine. I doubt your computer is 46 times slower than mine, so I think some software is interfering with the build.
     
  16. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    So building to WebGL/WASM roughly involves...
    1. Stripping Assemblies
    2. Compiling C#/JS to IL
    3. Converting IL to C++
    4. Converting C++ to ASM.js
    5. Compiling ASM.js to WASM
    Could you not just compile straight from C# to WASM? After all, you are stepping through four stages of token conversion when you could just do one.

    It's like someone converting English to Chinese, but they only have English-to-Dutch, Dutch-to-German, German-to-French and French-to-Chinese dictionaries.

    In theory you have all the token conversion tables at each stage, so why not bypass the middle stages?
     
  17. Tautvydas-Zilys

    Tautvydas-Zilys

    Unity Technologies

    Joined:
    Jul 25, 2013
    Posts:
    10,644
    Step 5 doesn't exist, and steps 1 and 2 are swapped. This is what actually happens:

    1. Compile C# to IL
    2. Strip assemblies
    3. Convert IL to C++
    4. Compile C++ into WASM.

    You can't compile C# into WASM because stripping needs to be done after C# is compiled to assemblies. And while you could theoretically convert IL to WASM directly, we don't have such a technology available today.
     
  18. charlesb_rm

    charlesb_rm

    Joined:
    Jan 18, 2017
    Posts:
    485
    Some common performance issues include things like:
    - antivirus
    - drive compression
    - etc.
    Please try disabling those for the purpose of a test and see if builds improve.
     
  19. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    Update: Builds a lot faster now.

    Two things:
    1. I noticed an odd service/exe running, an OVR.... exe, so I think it was an Oculus Rift feature; as I'm not using the hardware, I uninstalled Oculus VR. It appeared to be sending lots of network data while I was building??
    2. I have been running 3, yes three, anti-virus systems, and as I noticed recently when I virus-scanned a folder with all 3, one is much slower than the others, so I removed that one.
    Side note: Unity Hub seems to be very active with network data. Why so much, and so often?

    Tried setting up a RAM drive and it seems a bit faster, but the gain is limited by the fact that Unity uses a Temp folder under the Windows User/Local path for all the intermediate file generation.

    That makes the idea of a super-fast project RAM drive for building with Unity a moot point, unless...

    Is there a way to specify where Unity locates the Temp (building/linking) folder?
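
    One thing I might try next: launching the editor with TEMP/TMP pointed at the RAM drive. A sketch, on the untested assumption that Unity's build tools honour the standard environment variables they inherit:

    Code (CSharp):
    using System.Diagnostics;

    public static class LaunchUnityWithRamTemp
    {
        public static void Main()
        {
            // ASSUMPTION: the intermediate files follow the standard TEMP/TMP
            // environment variables inherited from the editor process; untested.
            var psi = new ProcessStartInfo(
                @"C:\Program Files\Unity\Hub\Editor\2018.2.0b6\Editor\Unity.exe");
            psi.UseShellExecute = false; // required to pass a custom environment
            psi.EnvironmentVariables["TEMP"] = @"R:\Temp"; // R: = the RAM drive
            psi.EnvironmentVariables["TMP"] = @"R:\Temp";
            Process.Start(psi);
        }
    }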
     
    Last edited: Jun 2, 2018
  20. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    So what about limiting the speed of asm2wasm? This process pushes my CPU to 100%; could it be limited to 80%, or to max-1 cores, so at least other processes like watching videos or web browsing don't suffer?
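
    As a workaround, you can at least pin a heavy build process to all but one core from the outside; a minimal sketch, assuming asm2wasm shows up under that process name:

    Code (CSharp):
    using System;
    using System.Diagnostics;

    public static class TameBuildProcess
    {
        public static void Main()
        {
            // Affinity mask with every core enabled except core 0 (e.g. 0b1110 on 4 cores).
            long mask = ((1L << Environment.ProcessorCount) - 1) & ~1L;

            foreach (var p in Process.GetProcessesByName("asm2wasm"))
                p.ProcessorAffinity = (IntPtr)mask;
        }
    }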
     
  21. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    You do; it's just not done in one process. If you map out the IL-to-CPP conversion process, then the CPP-to-ASM, then the ASM-to-WASM process, you should be able to create a direct IL-to-ASM or IL-to-WASM conversion process.

    After all, we are just dealing with parsing and token conversion tables; in theory, if you cut out the middle stages you could even end up with better code. After all, your new Job system takes C# code and compiles it to native code via Burst; what if you looked at that technology and saw how it could be applied to ASM/WASM?
     
  22. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    6,588
    Please don't, or at least make it an option. I do want full build-server utilization; nobody is watching videos on those.
     
    dadude123 likes this.
  23. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    Good point; user-configurable CPU bandwidth would be ideal.
     
  24. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    What about C# to LLVM then s2wasm?

    https://www.dotnetfoundation.org/blog/2015/04/14/announcing-llilc-llvm-for-dotnet
     
    Last edited: Jun 3, 2018
  25. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    OK, I've tried supercharging the build process:
    1. Setting up a RAM disk.
    2. Relocating the TEMP folder to the RAM disk.
    3. Excluding development files from anti-virus scanning.
    And I can now build a scene with a cube in it to WebGL in under 10 minutes from a cold build, and under 4 minutes with a warm build (re-building after a cold build).

    Judging from Task Manager there is still a lot of slack in the build process (low CPU usage for minutes at a time) that could be improved upon.

    I'd be interested to see who has the fastest WebGL build times.

    Can Unity's asm2wasm.exe be replaced with one downloaded and compiled from GitHub? It would be interesting to see what the latest build adds, as well as to try a few compile options, as this process seems to be a bit of a CPU hog at the moment.
     
  26. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    Side note, but worth looking into: I ran a program called Process Monitor for the PC; it gives detailed information on which files each process is using/creating.

    I noticed that, at least for one build process, there were pages and pages of entries as a file was written in 4k chunks. As I have also been looking into the speed of my SSD and RAM disk: my SSD can only read/write about 77 MB/s* at 4k, and even my RAM drive only manages about 500 MB/s with 4k reads/writes.

    However, at 128k and over the SSD can read/write 1.4 GB/s / 700 MB/s (the RAM disk at 128k writes at 10 GB/s and reads at 6.8 GB/s).

    Even though there is almost a 10x speed difference between the SSD and the RAM disk, the actual build time only improves by a fraction; I think this implies that IO bandwidth is not the main issue and processing is the bottleneck.

    Check out ATTO Disk Benchmark, and of course profile and test to see if it's relevant to the build process.

    *This is half the speed of my HDD's max bandwidth!
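
    If anyone wants to reproduce the chunk-size numbers without ATTO, here's a minimal write benchmark sketch (sizes and file name are mine; OS write caching will flatter the results, which tools like ATTO can bypass):

    Code (CSharp):
    using System;
    using System.Diagnostics;
    using System.IO;

    public static class ChunkBench
    {
        public static void Main()
        {
            const long total = 256L * 1024 * 1024; // 256 MB per pass
            foreach (int chunk in new[] { 4 * 1024, 32 * 1024, 128 * 1024 })
            {
                var buffer = new byte[chunk];
                var sw = Stopwatch.StartNew();
                using (var fs = new FileStream("bench.tmp", FileMode.Create,
                           FileAccess.Write, FileShare.None, chunk))
                {
                    for (long written = 0; written < total; written += chunk)
                        fs.Write(buffer, 0, chunk);
                }
                sw.Stop();
                Console.WriteLine(chunk / 1024 + "k chunks: " +
                    (total / (1024.0 * 1024.0) / sw.Elapsed.TotalSeconds).ToString("F0") + " MB/s");
            }
            File.Delete("bench.tmp");
        }
    }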
     
    Last edited: Jun 4, 2018
  27. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    What about using your machine-learning system to solve/create the technology?

    Apparently NNs that are taught one language and then another learn the second more quickly, as they pick up common patterns; although this refers to human languages, it should still apply to programming languages with common structural elements.
     
  28. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194

    And why does Unity leave lots of inactive processes sitting around using memory during the build?
     
  29. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    Digging through Process Monitor's reporting on the build process:

    Why does Unity scan all the target directories when you are only building to a single target?
    Why do shader compilers launched by Unity use networking protocols to send data to/from Unity, e.g. TCP TCPCopy PC:50071->PC:50026?
    Why so many tiny .bin files that are generated and then re-read, when they could just be stored in memory, or in one larger file that can be read from and written to in larger chunks?
    Ditto for the .tmp files. If you can leave hundreds of megabytes in memory from the shader compilers and console window hosts, why can't you keep some .tmp and .bin files in memory?
    When you write to the data.unity3d and data.unity3d.tmp files you are writing in 32k chunks, which is better than 4k but not as good as 128k.
    At least when you copy mscorlib.dll you do so in 512k chunks!
    Then it looks like you're copying every single DLL to the staging area. Why copy them all (272 MB in total)?
    1. You already have one copy of them in the install?
    2. You probably don't need them all anyway?
    The Resources\unity default resources file is copied over in 7k chunks!
    Now, when you read in the mscorlib.dll that you copied over earlier, you're reading it in 8k chunks!
    It looks like you are reading all of the DLLs that you just copied over in 8k chunks!
    Hopefully I have this wrong, but you appear to be reading mscorlib and a couple of other core DLLs multiple times (double-checked: you do read this file twice so far, 4, 5, 6, 7, 8, 9, 10... (lost count) times)?
    What the flip are the console window hosts doing? It looks like they are reading core DLLs and accessing lots of system registry keys, and they really upset my anti-virus software.
    The UnityLinker steps in and starts reading the staging-area DLLs (again), in 8k chunks!
    Wait a minute: UnityLinker reads mscorlib.dll again, only in 4k chunks!
    It's re-reading the DLLs, but using 4k chunks this time!
    Unity tries to create/open this exe file: C:\Program Files\Unity\Hub\Editor\2018.2.0b6\Editor\Data\Tools\InternalCallRegistrationWriter\InternalCallRegistrationWriter.exe
    It really kicks up a fuss with my anti-virus software.
    Unity is back to reading the DLL files in 8k chunks.
    Then UnityLinker steps in and reads lots of files in 4k chunks, then steps up its game to 8k chunks of those staging-area DLLs.
    And it re-writes those DLLs to the staging area in 4k chunks!
    It looks like Unity or UnityLinker is using Explorer.exe to do lots of registry and core DLL reading, e.g. C:\Windows\System32\UIAutomationCore.dll multiple times!
    Back to Unity reading some DLLs at 8k!
    il2cpp.exe starts, and my anti-virus gets very interested in it as it starts launching conhost.exe processes.
    il2cpp reads staging-area DLLs in 8k chunks.
    il2cpp reads \libil2cpp.icalls in 4k chunks.
    il2cpp writes \Bulk_mscorlib_0.cpp (and Bulk_mscorlib_1 and 2 through 17) in 4k chunks!
    Ditto for all the other DLLs' .cpp files.
    Then the MethodMap.tsv file in 4k chunks!
    il2cpp does lots (tens? hundreds?) of create/open calls that return PATH NOT FOUND for C:\Program Files\Unity\Hub\Editor\2018.2.0b6\Editor\Data\il2cpp\external\mono\external\mbedtls
    il2cpp reads in some .h files in 4k chunks.
    il2cpp reads in those Bulk .cpp files (the converted DLLs) in 4k chunks; this part looks massively multi-threaded.
    It looks like il2cpp launches lots of python.exe and conhost.exe processes, which both need to load and read lots of other files and registry settings before doing any work, e.g. python.exe has to load lots of library files.
    I'm seeing lots of NAME NOT FOUND and, more worryingly, BUFFER OVERFLOW results from python.exe starting up?
    Also, how many python sessions do you open? Doesn't it have multi-threading?
    It looks like the python sessions are launching clang++.exe processes?
    Aren't there build tools like make to do this already?
    llc.exe starts writing .js files using mostly <1k chunks.
    python.exe then writes some files using 5k chunks.
    asm2wasm now reads in the .js files in 4k chunks.
    asm2wasm starts writing a .js.symbols file a few bytes at a time, like <100 bytes per write.
    asm2wasm then writes the .wasm file using 4k chunks.

    OK, in summary, from a file-IO view (so not in depth on processing), the Unity build process:
    • Reads and writes files with sub-optimal chunk sizes (modern drives work much faster with large chunks; 128k seems to be optimal on my PC).
    • Reads the same files over and over again.
    • Launches processes like python when a dedicated multi-threaded build system should really be doing the work.
    • Produces lots of NAME NOT FOUND and BUFFER OVERFLOW results around file access.
    • Uses about a dozen tools, some of them built-in system tools, for one build, making it difficult to isolate the processes and exclude them from anti-virus interference.
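
    In principle the chunk-size part is a cheap fix; here's a sketch of what 128k-buffered IO looks like, not Unity's code, just an illustration:

    Code (CSharp):
    using System.IO;

    public static class BigChunkCopy
    {
        // Copy with a 128k buffer instead of the 4k/8k reads seen in Process Monitor.
        public static void Copy(string src, string dst)
        {
            const int chunk = 128 * 1024;
            using (var input = new FileStream(src, FileMode.Open, FileAccess.Read, FileShare.Read, chunk))
            using (var output = new FileStream(dst, FileMode.Create, FileAccess.Write, FileShare.None, chunk))
            {
                input.CopyTo(output, chunk);
            }
        }
    }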
     
    ArkadiuszR likes this.
  30. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194


    I'm thinking that if you increase the file read/write buffer sizes to the ideal 128k, you will see a reduction in the fallow areas A-E.

    If you can also reduce the re-reading of DLLs, this will reduce the system load and the overall build time.

    If you multi-thread il2cpp, this should massively reduce C; if you can do the same for processes like llc in D, then tone down asm2wasm.

    If you can replace python with a dedicated multi-threaded build processor, you should also get big reductions in service usage (the red) and much higher throughput.

    Then get an asm2wasm build that doesn't thrash the CPU above 90% or have big pauses in its process cycle.

    I'm guessing you should be able to get this down to under 50% of the current build time.

    If you could run it all in memory, without all the file IO, you could probably get it down to 20% of the current build time.

    Hope this helps!
     
    Jelmer123 and ArkadiuszR like this.
  31. dadude123

    dadude123

    Joined:
    Feb 26, 2014
    Posts:
    789
    It feels to me like you don't realize how much work all of those "suggestions" would be.

    And also how difficult it is to not only get some build process working once, on your local dev machine, but to make it into a solid system: tested, ready for production, and working on all sorts of different setups.

    The people working at Unity are not dumb; they're actually incredibly talented and knowledgeable, and I'm certain they are very much aware of what exactly can be done and how to do it.

    Sure, just slap some of that magical machine-learning powder on the software and stir a few times...
    Read up on how those things are built and what their limitations are. This isn't feasible in the slightest for Unity at this point in time. Machine learning still has a very long way to go before we can even dare to think about things like this.
    In the meantime, maybe get yourself a book/tutorial on how to make a neural network yourself and see how hard it is to go from "wow, it works" to "OK, now it's actually fast enough AND deterministic/solid enough to be used in production AND I can add new features in a quick, deterministic, and controlled way, just like I can with traditional software".
     
  32. hippocoder

    hippocoder

    Digital Ape

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    Hey! Thanks for the feedback, it's all really great. However, as a reminder: the Beta forum is for feedback on beta issues. This thread began with feedback but sharply veered off into theoretical discussion and so forth, so it is adding noise, not signal, to the Beta forum, and I have locked it.

    It's still useful as a reference, but it should not be constantly bumped. Staff are aware of it, and thanks!
     
    MadeFromPolygons and dadude123 like this.