    lavu/riscv: use Zbb REV8 at run-time · 324899b7
    Rémi Denis-Courmont authored
    This adds run-time support for using Zbb REV8 for 32- and 64-bit byte-wise
    swaps. The result is about five times slower than when targeting Zbb
    statically, but still a lot faster than the default bespoke C code or a
    call to GCC run-time functions.
    
    For 16-bit swaps, however, this is unsurprisingly a lot worse (about five
    times slower than the baseline), so this sticks to the baseline there. In
    fact, even using REV8 statically does not seem to be beneficial in that case.
    
               Zbb static   Zbb dynamic    I baseline
    bswap16:  0.668184765   3.340764069   0.668029012
    bswap32:  0.668174014   3.340763319   9.353855435
    bswap64:  0.668221765   3.340496313  14.698672283
    (seconds for 1 billion iterations on a SiFive-U74 core)
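
As a rough illustration of the run-time selection described above (a sketch only, not the code from this commit): the flag `cpu_has_zbb` and the function names are hypothetical stand-ins for whatever CPU detection and naming the real header uses. The REV8 path is only compiled on RISC-V; elsewhere the function always falls back to the baseline.

```c
#include <stdint.h>

/* Hypothetical flag: a real build would fill this in from CPU detection. */
static int cpu_has_zbb = 0;

/* Baseline 32-bit byte swap using only shifts and masks (plain RV I). */
static uint32_t bswap32_baseline(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* 16-bit swap: always the baseline, since REV8 did not help in that case. */
static uint16_t bswap16_baseline(uint16_t x)
{
    return (uint16_t)((x >> 8) | (x << 8));
}

/* 32-bit swap: use REV8 when the flag says Zbb is present, else baseline. */
static uint32_t bswap32_dispatch(uint32_t x)
{
#if defined(__riscv)
    if (cpu_has_zbb) {
        unsigned long y = x;

        /* REV8 reverses all XLEN/8 bytes of the register. */
        __asm__ (
            ".option push\n"
            ".option arch, +zbb\n"
            "rev8 %0, %1\n"
            ".option pop\n"
            : "=r" (y) : "r" (y));
        /* The swapped 32 bits end up in the top of the register. */
        return (uint32_t)(y >> (__riscv_xlen - 32));
    }
#endif
    return bswap32_baseline(x);
}
```

The extra test and branch on every call is one plausible source of the gap between the "Zbb dynamic" and "Zbb static" columns, which is why the 16-bit case is left on the baseline here.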
bswap.h 2.04 KB