Are there instruction sets within modern CPUs that Burst could unlock for developers?

Discussion in 'Burst' started by Arowx, Apr 26, 2022.

  1. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    Burst currently uses SIMD instructions to speed up batches of 3D calculations.

    The latest version of Intel's Alder Lake CPU has the following instruction set extensions:
    • AES - Advanced Encryption Standard
    • CLMUL - Carry-less Multiplication - Useful for fast calculation of CRC values
    • RDRAND - On-chip Random Number Generator - Cryptographically compliant
    • SHA - SHA-1, SHA-256 - Hashing functions
    • TXT - Trusted Execution Technology - Security of OS/Apps
    • MMX, SSE, SSE2, SSE3, SSSE3, SSE4 (SSE4.1, SSE4.2) - SIMD instructions*
    • AVX, AVX2 - Wider (256-bit) SIMD instructions*
    • FMA3 - Fused multiply-add with three operands*?
    • AVX-VNNI - Vector Neural Network Instructions, integer dot-product SIMD aimed at neural network inference?*
    • VT-x, VT-d - Virtualisation, anyone want to do a Sim Computer within your Sim??
    *should be in Burst already

    And this is just Intel and AMD CPUs; there are probably many more instruction sets on modern mobile CPUs that could benefit game developers and become accessible via Burst compilation technology.
     
  2. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    And how well does Burst utilise the available SIMD instructions? For example, does it use the best instruction set available on the hardware?
     
  3. Zuntatos

    Zuntatos

    Joined:
    Nov 18, 2012
    Posts:
    612
  4. sheredom

    sheredom

    Unity Technologies

    Joined:
    Jul 15, 2019
    Posts:
    300
    I'll attempt an answer: what we provide access to is about the safest set we can offer while maximising the number of CPUs that can use the functionality. As @Zuntatos points out, we have four categories of functionality that are carefully layered on top of each other (so AVX2 includes all of SSE2, SSE4 and AVX, plus AVX2 itself, FMA, F16C, BMI1, and BMI2). We do this because every supported category of instructions means compiling all of your code one more time. This is also why the Burst AOT Settings default to SSE2 + AVX2 as the supported instruction sets for player builds: running the compiler twice instead of four times means... the compiler is twice as fast!

    No doubt in future there will be some post-AVX2 set of instructions that is supported widely enough across the CPUs gamers have to make it worth exposing. But if and when we did that, we'd again have to layer that functionality on top of AVX2 and group as many things together as was reasonable, to keep the number of compilations to a minimum.
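
    The layering and compile-cost trade-off described in this reply can be sketched as a small model. This is only an illustration in Python, not Burst code: the tier names and the extensions bundled into AVX2 are taken from the post above, while `implied_features`, `compile_passes` and `pick_tier` are hypothetical helper names, and the rule that a player picks the newest compiled tier the CPU supports is an assumption about how the layered builds would be used.

```python
# Illustrative model only. Tier names and bundled extensions follow the
# forum reply; the helper names are hypothetical and not part of Burst.

# Tiers ordered from oldest to newest; each tier implies all earlier ones.
TIERS = ["SSE2", "SSE4", "AVX", "AVX2"]

# Extra extensions the reply says are grouped into the AVX2 tier.
BUNDLED = {"AVX2": ["FMA", "F16C", "BMI1", "BMI2"]}


def implied_features(tier):
    """Return every feature a build targeting `tier` may assume is present."""
    index = TIERS.index(tier)
    features = []
    for name in TIERS[: index + 1]:
        features.append(name)
        features.extend(BUNDLED.get(name, []))
    return features


def compile_passes(selected_tiers):
    """Each selected tier costs one full compile of the code."""
    return len(set(selected_tiers))


def pick_tier(selected_tiers, cpu_best_tier):
    """Pick the newest compiled tier that the CPU is able to run.

    Assumption: the CPU supports `cpu_best_tier` and every older tier.
    """
    cpu_index = TIERS.index(cpu_best_tier)
    runnable = [t for t in selected_tiers if TIERS.index(t) <= cpu_index]
    if not runnable:
        raise ValueError("no compiled tier runs on this CPU")
    return max(runnable, key=TIERS.index)


if __name__ == "__main__":
    default_build = ["SSE2", "AVX2"]  # the default mentioned in the reply
    print(compile_passes(default_build), "passes vs", compile_passes(TIERS))
    print(pick_tier(default_build, "AVX"))  # an AVX-only CPU falls back
```

    Under these assumptions, a CPU that supports AVX but not AVX2 would fall back to the SSE2 build when only SSE2 + AVX2 are compiled, which is the cost of halving the number of compiler runs.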