    swscale: aarch64: Optimize the final summation in the hscale routine · 70db1437
    Martin Storsjö authored
    Before:                     Cortex A53      A72      A73  Graviton 2  Graviton 3
    hscale_8_to_15_width8_neon:     8273.0   4602.5   4289.5      2429.7      1629.1
    hscale_8_to_15_width16_neon:   12405.7   6803.0   6359.0      3549.0      2378.4
    hscale_8_to_15_width32_neon:   21258.7  11491.7  11469.2      5797.2      3919.6
    hscale_8_to_15_width40_neon:   25652.0  14173.7  12488.2      6893.5      4810.4
    
    After:
    hscale_8_to_15_width8_neon:     7633.0   3981.5   3350.2      1980.7      1261.1
    hscale_8_to_15_width16_neon:   11666.7   5951.0   5512.0      3080.7      2131.4
    hscale_8_to_15_width32_neon:   20900.7  10733.2   9481.7      5275.2      3862.1
    hscale_8_to_15_width40_neon:   24826.0  13536.2  11502.0      6397.2      4731.9
    
    Thus, this gives an overall 8-29% speedup for the smaller filter
    sizes, and around 1-8% for the larger filter sizes.
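
    As a rough cross-check of the percentages, the sketch below recomputes
    the speedup per core from the figures in the two tables. It assumes
    "speedup" means before / after - 1 and that lower figures are better;
    the helper names are illustrative and not part of the commit.

```python
# Benchmark figures copied from the Before/After tables (lower is better).
# Column order: Cortex A53, A72, A73, Graviton 2, Graviton 3.
BEFORE = {
    "hscale_8_to_15_width8_neon": [8273.0, 4602.5, 4289.5, 2429.7, 1629.1],
    "hscale_8_to_15_width16_neon": [12405.7, 6803.0, 6359.0, 3549.0, 2378.4],
    "hscale_8_to_15_width32_neon": [21258.7, 11491.7, 11469.2, 5797.2, 3919.6],
    "hscale_8_to_15_width40_neon": [25652.0, 14173.7, 12488.2, 6893.5, 4810.4],
}
AFTER = {
    "hscale_8_to_15_width8_neon": [7633.0, 3981.5, 3350.2, 1980.7, 1261.1],
    "hscale_8_to_15_width16_neon": [11666.7, 5951.0, 5512.0, 3080.7, 2131.4],
    "hscale_8_to_15_width32_neon": [20900.7, 10733.2, 9481.7, 5275.2, 3862.1],
    "hscale_8_to_15_width40_neon": [24826.0, 13536.2, 11502.0, 6397.2, 4731.9],
}


def speedup_percent(before, after):
    """Speedup in percent, assuming the definition before / after - 1."""
    return (before / after - 1.0) * 100.0


def speedup_range(name):
    """Smallest and largest speedup across the five cores for one test."""
    values = [speedup_percent(b, a) for b, a in zip(BEFORE[name], AFTER[name])]
    return min(values), max(values)


if __name__ == "__main__":
    for name in BEFORE:
        lo, hi = speedup_range(name)
        print(f"{name}: {lo:.1f}% to {hi:.1f}%")
```

    Under that definition, the width8 row spans roughly 8% (Cortex A53) to
    29% (Graviton 3), which matches the range quoted for the smaller filter
    sizes.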
    
    Inspired by a patch by Jonathan Swinney <jswinney@amazon.com>.
    Signed-off-by: Martin Storsjö <martin@martin.st>
Name
aarch64
arm
ppc
tests
x86
Makefile
alphablend.c
bayer_template.c
gamma.c
hscale.c
hscale_fast_bilinear.c
input.c
libswscale.v
log2_tab.c
options.c
output.c
rgb2rgb.c
rgb2rgb.h
rgb2rgb_template.c
slice.c
swscale.c
swscale.h
swscale_internal.h
swscale_unscaled.c
swscaleres.rc
utils.c
version.h
version_major.h
vscale.c
yuv2rgb.c