    x86/tx_float: add 15xN PFA FFT AVX SIMD · ace42cf5
    Lynne authored
    ~4x faster than the C version.
    The shuffles in the 15pt dim1 are seriously expensive. Not happy with it,
    but I'm content.
    
    Can be easily converted to pure AVX by removing all vpermpd/vpermps
    instructions.
Name
api
checkasm
fate
filtergraphs
ref
.gitignore
Makefile
audiogen.c
audiomatch.c
base64.c
copycooker.sh
extended.ffconcat
fate-run.sh
fate-valgrind.supp
fate.sh
md5.sh
refcmp-metadata.awk
reference.pnm
rotozoom.c
simple1.ffconcat
simple2.ffconcat
test.ffmeta
tiny_psnr.c
tiny_ssim.c
utils.c
videogen.c