
Question Optimal means of boolean to int?

Discussion in 'Burst' started by Unifikation, Jun 7, 2023.

  1. Unifikation

    Unifikation

    Joined:
    Jan 4, 2023
    Posts:
    1,086
    Is there some way with Mathematics or any other part of DOTS/Burst to optimally convert booleans to their int value?

    The best I've seen outside of something optimised and specific is this:


    using System;

    // Reads the bool's single byte directly; requires unsafe code to be enabled.
    static unsafe int H(bool b)
    {
        return *(Byte*)&b;
    }


    From here:

    https://stackoverflow.com/questions/66985162/how-to-convert-bool-to-int-efficiently

    but this could have some odd side effects, and it means hoping that working with a mere byte value works out.
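To make the comparison concrete, here is a minimal sketch that puts the reinterpreting approach next to a plain conditional. The class and method names are hypothetical, and it uses System.Runtime.CompilerServices.Unsafe.As in place of the pointer cast so that it compiles without the unsafe switch; that is a substitution, not the code from the linked answer. Both versions agree only as long as the bool holds a normalized 0 or 1.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class BoolToInt
{
    // Reinterprets the bool's storage as a byte, with no conditional in the source.
    // Assumption: the bool holds a normalized 0 or 1, as produced by ordinary C# comparisons.
    public static int ViaReinterpret(bool b)
    {
        return Unsafe.As<bool, byte>(ref b);
    }

    // Plain conditional. How this is lowered (jump, SETcc, CMOVcc) depends on the
    // compiler and the surrounding code, so it has to be checked in the generated assembly.
    public static int ViaTernary(bool b)
    {
        return b ? 1 : 0;
    }
}
```

Which of the two ends up cheaper is something only a measurement or a look at the generated assembly can settle.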
     
  2. brunocoimbra

    brunocoimbra

    Joined:
    Sep 2, 2015
    Posts:
    679
    There is math.select, but I'm not sure how its performance compares to your code; you would need to measure.
     
  3. Unifikation

    Unifikation

    Joined:
    Jan 4, 2023
    Posts:
    1,086
    THANK YOU!!!

    Will test and report back, once I've overcome all the AABB disasters of "upgrading" to 2022.LTS
     
  4. Mortuus17

    Mortuus17

    Joined:
    Jan 6, 2020
    Posts:
    105
    This is a highly case-specific question, where the assembly output is what matters.
    The variant you provided is most useful when performing arithmetic with 1 or 0 (or {0 or -1}, {0 or 1 << 31} etc.), and it will be translated into 1 CMP instruction (plus maybe an additional, unnecessary one; again, highly case-specific) and a SETcc instruction. That's the point where you have a 0 or 1 in a register...
    Unity.Mathematics' select is literally implemented as c ? b : a. Unity.Mathematics does perform some magic using Burst IL post-processing, but that's not what matters here. It looks like bad code because of the branch, but in most cases it will be translated into 1 CMP and 1 CMOVcc instruction. Due to ILP and OOOE (instruction-level parallelism and out-of-order execution, i.e. both possible values often actually being computed at the same time), this is, for most cases, by far the most optimal way of using booleans: a + 0 and a + 1 could have been calculated in parallel with the CMP instruction, and you'd be done after the CMOVcc, compared to having to perform a CMP and a SETcc followed by an ADD (so the "clever" code is 1 cycle, i.e. 33.3%, slower and utilizes less of the CPU). Dereferencing as a byte often comes in handy, but you really need to know what you're doing, or what the assembly looks like in the context of superscalar CPUs.
    And btw: since you're never actually avoiding branches, this is one of the most micro of micro-optimizations there is.
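The two shapes being compared can be sketched as follows. Select here is a hypothetical local stand-in that only mirrors the c ? b : a form described above, so the block runs without Unity.Mathematics; the class and method names are likewise illustrative. Whether a given compiler emits CMP + CMOVcc for the first form and CMP + SETcc + ADD for the second is an assumption to verify in the generated assembly (for Burst, in the Burst Inspector).

```csharp
using System;

public static class SelectSketch
{
    // Hypothetical stand-in with the argument order described above: returns b when c is true, else a.
    public static int Select(int a, int b, bool c)
    {
        return c ? b : a;
    }

    // Select form: both candidate results are written out up front, then one is chosen.
    public static int AddFlagWithSelect(int a, bool flag)
    {
        return Select(a, a + 1, flag);
    }

    // Arithmetic form: the flag is first turned into 0 or 1, then added.
    public static int AddFlagWithArithmetic(int a, bool flag)
    {
        int bit = flag ? 1 : 0;
        return a + bit;
    }
}
```

Both methods return the same value for every input; the only difference worth caring about is the instruction sequence they compile to in context.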
     
    Last edited: Jun 8, 2023
  5. Unifikation

    Unifikation

    Joined:
    Jan 4, 2023
    Posts:
    1,086
    This zero-or-one situation is the most common problem I'm facing: I want to stop branching during massive amounts of ADSR-like grooming of values, per frame, for some audio shaping. But I'm also just generally keen to see and learn exactly the sort of stuff you're talking about. THANK YOU!!!