    swscale/aarch64: Add rgb24 to yuv implementation · 9dac8495
    Zhao Zhili authored
    Tested on Apple M1:
    
    rgb24_to_uv_8_c: 0.0
    rgb24_to_uv_8_neon: 0.2
    rgb24_to_uv_128_c: 1.0
    rgb24_to_uv_128_neon: 0.5
    rgb24_to_uv_1080_c: 7.0
    rgb24_to_uv_1080_neon: 5.7
    rgb24_to_uv_1920_c: 12.5
    rgb24_to_uv_1920_neon: 9.5
    rgb24_to_uv_half_8_c: 0.2
    rgb24_to_uv_half_8_neon: 0.2
    rgb24_to_uv_half_128_c: 1.0
    rgb24_to_uv_half_128_neon: 0.5
    rgb24_to_uv_half_1080_c: 6.2
    rgb24_to_uv_half_1080_neon: 3.0
    rgb24_to_uv_half_1920_c: 11.2
    rgb24_to_uv_half_1920_neon: 5.2
    rgb24_to_y_8_c: 0.2
    rgb24_to_y_8_neon: 0.0
    rgb24_to_y_128_c: 0.5
    rgb24_to_y_128_neon: 0.5
    rgb24_to_y_1080_c: 4.7
    rgb24_to_y_1080_neon: 3.2
    rgb24_to_y_1920_c: 8.0
    rgb24_to_y_1920_neon: 5.7
    
    On Pixel 6:
    
    rgb24_to_uv_8_c: 30.7
    rgb24_to_uv_8_neon: 56.9
    rgb24_to_uv_128_c: 213.9
    rgb24_to_uv_128_neon: 173.2
    rgb24_to_uv_1080_c: 1649.9
    rgb24_to_uv_1080_neon: 1424.4
    rgb24_to_uv_1920_c: 2907.9
    rgb24_to_uv_1920_neon: 2480.7
    rgb24_to_uv_half_8_c: 36.2
    rgb24_to_uv_half_8_neon: 33.4
    rgb24_to_uv_half_128_c: 167.9
    rgb24_to_uv_half_128_neon: 99.4
    rgb24_to_uv_half_1080_c: 1293.9
    rgb24_to_uv_half_1080_neon: 778.7
    rgb24_to_uv_half_1920_c: 2292.7
    rgb24_to_uv_half_1920_neon: 1328.7
    rgb24_to_y_8_c: 19.7
    rgb24_to_y_8_neon: 27.7
    rgb24_to_y_128_c: 129.9
    rgb24_to_y_128_neon: 96.7
    rgb24_to_y_1080_c: 995.4
    rgb24_to_y_1080_neon: 767.7
    rgb24_to_y_1920_c: 1747.4
    rgb24_to_y_1920_neon: 1337.2
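The timings above are easier to compare as ratios. The helper below is a minimal sketch for that arithmetic; the name `speedup` is hypothetical and is not part of swscale or checkasm.

```c
#include <stdio.h>

/* Hypothetical helper: ratio of the C reference time to the NEON time.
 * A value above 1.0 means the NEON version is faster. */
static double speedup(double c_time, double neon_time)
{
    return c_time / neon_time;
}

/* Print one benchmark pair together with its ratio. */
static void print_speedup(const char *name, double c_time, double neon_time)
{
    printf("%s: %.2fx\n", name, speedup(c_time, neon_time));
}
```

For example, `speedup(2292.7, 1328.7)` for rgb24_to_uv_half_1920 on Pixel 6 is roughly 1.73, while `speedup(30.7, 56.9)` for rgb24_to_uv_8 is roughly 0.54, i.e. the NEON version is slower at width 8 in that run.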
    
    Note that both tests were built with clang, which enables
    auto-vectorization by default at -O3.
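For context, the sketch below shows the general shape of a scalar RGB24-to-luma loop, which is the kind of loop a compiler may auto-vectorize. It is an illustration only: the function name `rgb24_to_y_sketch` and the fixed 8-bit BT.601 limited-range coefficients are assumptions, not the swscale C reference, which takes its coefficients from the conversion context and uses a different intermediate format.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: packed RGB24 to 8-bit luma using the common
 * BT.601 limited-range integer approximation
 *   Y = ((66*R + 129*G + 25*B + 128) >> 8) + 16
 * This is NOT the swscale implementation. */
static void rgb24_to_y_sketch(uint8_t *dst, const uint8_t *src, size_t width)
{
    for (size_t i = 0; i < width; i++) {
        int r = src[3 * i + 0];
        int g = src[3 * i + 1];
        int b = src[3 * i + 2];
        /* Each output depends only on its own pixel, so iterations
         * are independent of one another. */
        dst[i] = (uint8_t)(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
    }
}
```

With these coefficients a white pixel (255, 255, 255) maps to 235 and a black pixel (0, 0, 0) maps to 16.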
    Reviewed-by: Rémi Denis-Courmont <remi@remlab.net>
    Reviewed-by: Martin Storsjö <martin@martin.st>
    Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
input.S 8.9 KB