Commit 1e9cfa5b authored by Hubert Mazur, committed by Martin Storsjö

sw_scale: Add specializations for hscale 8 to 19

Add arm64 NEON implementations of hscale 8 to 19 for filter
sizes 4, 4X and 8. They are based on the very similar implementations
dedicated to hscale 8 to 15. The main change is in how the data is
stored: the result is written as int32_t instead of int16_t.
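
To make the int16_t versus int32_t difference concrete, here is a minimal scalar sketch of both variants. It is not the NEON code from this commit. The function names are hypothetical, and the shift amounts (7 and 3) and clip limits are assumptions modelled on the C reference routines in libswscale.

```c
#include <stdint.h>

/* Scalar sketch, NOT the NEON code from this commit.
 * Names are hypothetical; shifts and clip limits are assumptions
 * modelled on the C reference routines in libswscale. */

/* 8 -> 15 bit: the clipped result fits in int16_t. */
static void hscale8to15_ref(int16_t *dst, int dstW, const uint8_t *src,
                            const int16_t *filter, const int32_t *filterPos,
                            int filterSize)
{
    for (int i = 0; i < dstW; i++) {
        int val = 0;
        /* Weighted sum of filterSize source pixels starting at filterPos[i]. */
        for (int j = 0; j < filterSize; j++)
            val += src[filterPos[i] + j] * filter[filterSize * i + j];
        val >>= 7;
        dst[i] = (int16_t)(val < (1 << 15) - 1 ? val : (1 << 15) - 1);
    }
}

/* 8 -> 19 bit: same accumulation, smaller shift, stored as int32_t. */
static void hscale8to19_ref(int32_t *dst, int dstW, const uint8_t *src,
                            const int16_t *filter, const int32_t *filterPos,
                            int filterSize)
{
    for (int i = 0; i < dstW; i++) {
        int val = 0;
        for (int j = 0; j < filterSize; j++)
            val += src[filterPos[i] + j] * filter[filterSize * i + j];
        val >>= 3;
        dst[i] = val < (1 << 19) - 1 ? val : (1 << 19) - 1;
    }
}
```

With four taps of 4096 applied to four source pixels of value 255, the first sketch stores 32640 and the second stores 522240. The second value does not fit in int16_t, which is why the output type has to change.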

These functions are heavily inspired by the patches provided by
J. Swinney and M. Storsjö for hscale8to15, which were slightly adapted
for hscale8to19.

The tests and benchmarks were run on AWS Graviton 2 instances. The
results from the checkasm tool are shown below.

hscale_8_to_19__fs_4_dstW_512_c: 5663.2
hscale_8_to_19__fs_4_dstW_512_neon: 1259.7
hscale_8_to_19__fs_8_dstW_512_c: 9306.0
hscale_8_to_19__fs_8_dstW_512_neon: 2020.2
hscale_8_to_19__fs_12_dstW_512_c: 12932.7
hscale_8_to_19__fs_12_dstW_512_neon: 2462.5
hscale_8_to_19__fs_16_dstW_512_c: 16844.2
hscale_8_to_19__fs_16_dstW_512_neon: 4671.2
hscale_8_to_19__fs_32_dstW_512_c: 32803.7
hscale_8_to_19__fs_32_dstW_512_neon: 5474.2
hscale_8_to_19__fs_40_dstW_512_c: 40948.0
hscale_8_to_19__fs_40_dstW_512_neon: 6669.7
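
The speedup of each NEON routine over the C version can be read off the numbers above by dividing the two timings. The sketch below does this arithmetic. The struct and function names are hypothetical, and it assumes the checkasm figures are timer counts where lower is better.

```c
#include <stddef.h>

/* One checkasm result pair, copied from the benchmark listing above. */
struct bench {
    int    filter_size;
    double c;     /* timing of the C version    */
    double neon;  /* timing of the NEON version */
};

static const struct bench results[] = {
    {  4,  5663.2, 1259.7 },
    {  8,  9306.0, 2020.2 },
    { 12, 12932.7, 2462.5 },
    { 16, 16844.2, 4671.2 },
    { 32, 32803.7, 5474.2 },
    { 40, 40948.0, 6669.7 },
};

static const size_t num_results = sizeof(results) / sizeof(results[0]);

/* Speedup factor of NEON over C for one result pair. */
static double speedup(const struct bench *b)
{
    return b->c / b->neon;
}
```

Under that assumption the factors range from roughly 3.6x (filter size 16) to roughly 6.1x (filter size 40).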
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
parent 16af424b
@@ -29,7 +29,8 @@ void ff_hscale ## from_bpc ## to ## to_bpc ## _ ## filter_n ## _ ## opt( \
                                                 const int16_t *filter, \
                                                 const int32_t *filterPos, int filterSize)
 #define SCALE_FUNCS(filter_n, opt) \
-    SCALE_FUNC(filter_n, 8, 15, opt);
+    SCALE_FUNC(filter_n, 8, 15, opt); \
+    SCALE_FUNC(filter_n, 8, 19, opt);
 #define ALL_SCALE_FUNCS(opt) \
     SCALE_FUNCS(4, opt); \
     SCALE_FUNCS(X8, opt); \
@@ -48,9 +49,13 @@ void ff_yuv2plane1_8_neon(
         int offset);
 #define ASSIGN_SCALE_FUNC2(hscalefn, filtersize, opt) do { \
-    if (c->srcBpc == 8 && c->dstBpc <= 14) { \
-        hscalefn = \
-            ff_hscale8to15_ ## filtersize ## _ ## opt; \
+    if (c->srcBpc == 8) { \
+        if(c->dstBpc <= 14) { \
+            hscalefn = \
+                ff_hscale8to15_ ## filtersize ## _ ## opt; \
+        } else \
+            hscalefn = \
+                ff_hscale8to19_ ## filtersize ## _ ## opt; \
     } \
 } while (0)
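
The dispatch in the new ASSIGN_SCALE_FUNC2 can be tried in isolation with the sketch below. The macro logic follows the hunk above. DemoContext, the stub functions and pick_hscale are hypothetical stand-ins for illustration, not part of the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the scaler context; only the two fields
 * read by the macro are modelled. */
typedef struct DemoContext {
    int srcBpc;
    int dstBpc;
} DemoContext;

typedef int (*hscale_fn)(void);

/* Stubs standing in for the NEON routines; each returns the
 * intermediate bit depth it represents. */
static int ff_hscale8to15_4_neon(void) { return 15; }
static int ff_hscale8to19_4_neon(void) { return 19; }

/* Same selection logic as the patched macro: 8-bit sources get the
 * 8to15 routine for destinations up to 14 bits, otherwise 8to19.
 * Token pasting builds the function name from filtersize and opt. */
#define ASSIGN_SCALE_FUNC2(hscalefn, filtersize, opt) do {          \
    if (c->srcBpc == 8) {                                           \
        if (c->dstBpc <= 14) {                                      \
            hscalefn = ff_hscale8to15_ ## filtersize ## _ ## opt;   \
        } else                                                      \
            hscalefn = ff_hscale8to19_ ## filtersize ## _ ## opt;   \
    }                                                               \
} while (0)

/* Returns the selected routine, or NULL when the source is not 8-bit
 * (the macro then leaves the pointer untouched). */
static hscale_fn pick_hscale(const DemoContext *c)
{
    hscale_fn fn = NULL;
    ASSIGN_SCALE_FUNC2(fn, 4, neon);
    return fn;
}
```

In this sketch a context with srcBpc 8 and dstBpc 14 selects the 8to15 stub, srcBpc 8 with dstBpc 16 selects the 8to19 stub, and a 16-bit source leaves the pointer as NULL, meaning whatever was assigned before the macro stays in place.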