    swscale/aarch64: add hscale specializations · 0ea61725
    Swinney, Jonathan authored
    This patch adds code to support specializations of the hscale function
    and adds a specialization for filterSize == 4.
    
    ff_hscale8to15_4_neon is a complete rewrite. Since the main bottleneck
    here is loading the data from src, this data is loaded a whole block
    ahead and stored back to the stack to be loaded again with ld4. This
    arranges the data for most efficient use of the vector instructions and
    removes the need for completion adds at the end. Each iteration of the
    assembly loop now covers 8 iterations of the C loop instead of 4, but
    because of the prefetching, there must be a special section without
    prefetching when dstW < 16.
    
    This improves speed on Graviton 2 (Neoverse N1) dramatically in the case
    where previously fs=8 would have been required.
    
    before: hscale_8_to_15__fs_8_dstW_512_neon: 1962.8
    after : hscale_8_to_15__fs_4_dstW_512_neon: 1220.9
    Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
    Signed-off-by: Martin Storsjö <martin@martin.st>
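The commit message describes two ideas: dispatching to a specialization when filterSize == 4, and processing 8 outputs per iteration with the source bytes arranged so that each lane accumulates exactly one output. The sketch below illustrates both in plain C. It is not the NEON code from the commit. The function names `hscale8to15_ref`, `hscale8to15_4_blocked` and `hscale8to15` are hypothetical, the `gathered` array merely stands in for the stack scratch area the message mentions, and the prefetching itself is not modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Generic scalar reference (hypothetical name): one output per iteration. */
static void hscale8to15_ref(int16_t *dst, int dstW, const uint8_t *src,
                            const int16_t *filter, const int32_t *filterPos,
                            int filterSize)
{
    assert(filterSize > 0);
    for (int i = 0; i < dstW; i++) {
        int val = 0;
        for (int j = 0; j < filterSize; j++)
            val += src[filterPos[i] + j] * filter[filterSize * i + j];
        val >>= 7;
        dst[i] = (int16_t)(val > 32767 ? 32767 : val); /* clip to 15 bits */
    }
}

/* Hypothetical filterSize == 4 specialization: 8 outputs per block. */
static void hscale8to15_4_blocked(int16_t *dst, int dstW, const uint8_t *src,
                                  const int16_t *filter,
                                  const int32_t *filterPos)
{
    int i = 0;
    for (; i + 8 <= dstW; i += 8) {
        uint8_t gathered[8][4];  /* assumption: models the stack scratch */
        int32_t acc[8] = { 0 };

        /* Gather the 4 source bytes needed by each of the 8 outputs. */
        for (int k = 0; k < 8; k++)
            memcpy(gathered[k], src + filterPos[i + k], 4);

        /* Walk tap by tap. Lane k only ever accumulates output k, so no
         * cross-lane reduction is needed once the taps are done. */
        for (int j = 0; j < 4; j++)
            for (int k = 0; k < 8; k++)
                acc[k] += gathered[k][j] * filter[4 * (i + k) + j];

        for (int k = 0; k < 8; k++) {
            int32_t val = acc[k] >> 7;
            dst[i + k] = (int16_t)(val > 32767 ? 32767 : val);
        }
    }
    /* Outputs left over when dstW is not a multiple of 8: scalar path. */
    if (i < dstW)
        hscale8to15_ref(dst + i, dstW - i, src, filter + 4 * i,
                        filterPos + i, 4);
}

/* Hypothetical dispatcher: pick the specialization when it applies. */
static void hscale8to15(int16_t *dst, int dstW, const uint8_t *src,
                        const int16_t *filter, const int32_t *filterPos,
                        int filterSize)
{
    if (filterSize == 4)
        hscale8to15_4_blocked(dst, dstW, src, filter, filterPos);
    else
        hscale8to15_ref(dst, dstW, src, filter, filterPos, filterSize);
}
```

Calling `hscale8to15` with filterSize set to 4 should produce the same output as the generic loop, which makes the scalar reference a convenient check for any blocked or vectorized variant.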
Name
aarch64
arm
ppc
tests
x86
Makefile
alphablend.c
bayer_template.c
gamma.c
hscale.c
hscale_fast_bilinear.c
input.c
libswscale.v
log2_tab.c
options.c
output.c
rgb2rgb.c
rgb2rgb.h
rgb2rgb_template.c
slice.c
swscale.c
swscale.h
swscale_internal.h
swscale_unscaled.c
swscaleres.rc
utils.c
version.c
version.h
version_major.h
vscale.c
yuv2rgb.c