    aarch64: hevc: Produce plain neon versions of qpel_bi_hv · f872b197
    Martin Storsjö authored
    As the plain neon qpel_h functions process two rows at a time,
    we need to allocate storage for h+8 rows instead of h+7.
    
    With storage for h+8 rows allocated, incrementing the stack
    pointer no longer leaves it at the right spot at the end. Save
    the intended final stack pointer value in register x14, which
    we store on the stack.
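
    The row arithmetic above can be sketched in C. This is only an
    illustration, not the assembly from this commit: the helper names
    and the 128-byte row size (64 int16_t samples per temporary row)
    are assumptions, as is the idea that the consumer steps through
    h+7 rows of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one temporary row holds 64 int16_t samples (128 bytes). */
#define TMP_ROW_BYTES ((size_t)64 * sizeof(int16_t))

/* Rows an 8-tap vertical filter reads to produce h output rows. */
static int rows_needed(int h)
{
    assert(h > 0);
    return h + 7;
}

/* Rows written when the horizontal pass emits two rows per iteration:
 * the needed count rounded up to the next even number. */
static int rows_written(int h)
{
    return (rows_needed(h) + 1) & ~1;
}

/* Bytes of temporary storage to reserve for a block of height h. */
static size_t tmp_bytes(int h)
{
    return (size_t)rows_written(h) * TMP_ROW_BYTES;
}

/* Bytes still allocated if a cursor only advances over the rows that
 * are needed (hypothetical model of the stack pointer mismatch). */
static size_t leftover_bytes(int h)
{
    return tmp_bytes(h) - (size_t)rows_needed(h) * TMP_ROW_BYTES;
}
```

    For even h, rows_written(h) is h+8 and leftover_bytes(h) is one
    row, which is why walking the pointer forward alone cannot
    restore it and the final value has to be saved separately.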
    
    AWS Graviton 3:
    put_hevc_qpel_bi_hv4_8_c: 385.7
    put_hevc_qpel_bi_hv4_8_neon: 131.0
    put_hevc_qpel_bi_hv4_8_i8mm: 92.2
    put_hevc_qpel_bi_hv6_8_c: 701.0
    put_hevc_qpel_bi_hv6_8_neon: 239.5
    put_hevc_qpel_bi_hv6_8_i8mm: 191.0
    put_hevc_qpel_bi_hv8_8_c: 1162.0
    put_hevc_qpel_bi_hv8_8_neon: 228.0
    put_hevc_qpel_bi_hv8_8_i8mm: 225.2
    put_hevc_qpel_bi_hv12_8_c: 2305.0
    put_hevc_qpel_bi_hv12_8_neon: 558.0
    put_hevc_qpel_bi_hv12_8_i8mm: 483.2
    put_hevc_qpel_bi_hv16_8_c: 3965.2
    put_hevc_qpel_bi_hv16_8_neon: 732.7
    put_hevc_qpel_bi_hv16_8_i8mm: 656.5
    put_hevc_qpel_bi_hv24_8_c: 8709.7
    put_hevc_qpel_bi_hv24_8_neon: 1555.2
    put_hevc_qpel_bi_hv24_8_i8mm: 1448.7
    put_hevc_qpel_bi_hv32_8_c: 14818.0
    put_hevc_qpel_bi_hv32_8_neon: 2763.7
    put_hevc_qpel_bi_hv32_8_i8mm: 2468.0
    put_hevc_qpel_bi_hv48_8_c: 32855.5
    put_hevc_qpel_bi_hv48_8_neon: 6107.2
    put_hevc_qpel_bi_hv48_8_i8mm: 5452.7
    put_hevc_qpel_bi_hv64_8_c: 57591.5
    put_hevc_qpel_bi_hv64_8_neon: 10660.2
    put_hevc_qpel_bi_hv64_8_i8mm: 9580.0
    Signed-off-by: Martin Storsjö <martin@martin.st>
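
    To compare the variants, the Graviton 3 figures from the commit
    message can be turned into ratios. The sketch below copies those
    figures verbatim; the struct and function names are made up for
    illustration, and it assumes that a lower figure means a faster
    function.

```c
#include <stddef.h>
#include <stdio.h>

/* One benchmark row: block width and the three reported figures. */
struct bench {
    int width;
    double c, neon, i8mm;
};

/* Figures copied from the commit message (AWS Graviton 3). */
static const struct bench results[] = {
    {  4,   385.7,   131.0,    92.2 },
    {  6,   701.0,   239.5,   191.0 },
    {  8,  1162.0,   228.0,   225.2 },
    { 12,  2305.0,   558.0,   483.2 },
    { 16,  3965.2,   732.7,   656.5 },
    { 24,  8709.7,  1555.2,  1448.7 },
    { 32, 14818.0,  2763.7,  2468.0 },
    { 48, 32855.5,  6107.2,  5452.7 },
    { 64, 57591.5, 10660.2,  9580.0 },
};

static const size_t result_count = sizeof(results) / sizeof(results[0]);

/* Ratio of the baseline figure to the optimized figure. */
static double speedup(double baseline, double optimized)
{
    return baseline / optimized;
}

/* Print the C-to-neon and C-to-i8mm ratios for every block width. */
static void print_speedups(void)
{
    for (size_t i = 0; i < result_count; i++) {
        const struct bench *b = &results[i];
        printf("hv%-2d  neon %.2fx  i8mm %.2fx\n", b->width,
               speedup(b->c, b->neon), speedup(b->c, b->i8mm));
    }
}
```

    Under that assumption, the plain neon version is roughly 5.1
    times faster than C for hv8 (1162.0 / 228.0), and nearly matches
    the i8mm version at that width.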
Name
Makefile
aacpsdsp_init_aarch64.c
aacpsdsp_neon.S
cabac.h
fmtconvert_init.c
fmtconvert_neon.S
h264chroma_init_aarch64.c
h264cmc_neon.S
h264dsp_init_aarch64.c
h264dsp_neon.S
h264idct_neon.S
h264pred_init.c
h264pred_neon.S
h264qpel_init_aarch64.c
h264qpel_neon.S
hevcdsp_deblock_neon.S
hevcdsp_epel_neon.S
hevcdsp_idct_neon.S
hevcdsp_init_aarch64.c
hevcdsp_qpel_neon.S
hevcdsp_sao_neon.S
hpeldsp_init_aarch64.c
hpeldsp_neon.S
idct.h
idctdsp_init_aarch64.c
idctdsp_neon.S
me_cmp_init_aarch64.c
me_cmp_neon.S
mpegaudiodsp_init.c
mpegaudiodsp_neon.S
neon.S
neontest.c
opusdsp_init.c
opusdsp_neon.S
pixblockdsp_init_aarch64.c
pixblockdsp_neon.S
rv40dsp_init_aarch64.c
sbrdsp_init_aarch64.c
sbrdsp_neon.S
simple_idct_neon.S
synth_filter_init.c
synth_filter_neon.S
vc1dsp_init_aarch64.c
vc1dsp_neon.S
videodsp.S
videodsp_init.c
vorbisdsp_init.c
vorbisdsp_neon.S
vp8dsp.h
vp8dsp_init_aarch64.c
vp8dsp_neon.S
vp9dsp_init.h
vp9dsp_init_10bpp_aarch64.c
vp9dsp_init_12bpp_aarch64.c
vp9dsp_init_16bpp_aarch64_template.c
vp9dsp_init_aarch64.c
vp9itxfm_16bpp_neon.S
vp9itxfm_neon.S
vp9lpf_16bpp_neon.S
vp9lpf_neon.S
vp9mc_16bpp_neon.S
vp9mc_aarch64.S
vp9mc_neon.S