    libswscale/ppc: VSX-optimize 9-16 bit yuv2planeX · 8522d219
    Lauri Kasanen authored
    ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p16be \
    -s 1920x1728 -f null -vframes 100 -v error -nostats -
    
    The 9-14 bit functions get about a 6x speedup; the 16-bit function gets about 15x.
    FATE passes, and each format was tested with an image-to-video conversion.
    
    32-bit vector multiplies are available on POWER8 but not on POWER7, so
    POWER7 is locked out of the 16-bit function. This applies to the 32-bit
    vec_mulo/vec_mule forms too, not just vmuluwm.
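To illustrate why only the 16-bit path needs a 32-bit multiply, here is a minimal scalar sketch of the two vertical-filter loops. It is not the VSX code from this commit. The function names, the native-endian store, and the shift and bias constants are assumptions based on how the scalar C path is understood to work: 16-bit intermediates for 9-14 bit output, 32-bit intermediates for 16-bit output.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp v to [lo, hi]. */
static int clip_range(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical sketch, 9-14 bit output: samples and coefficients are both
 * int16_t, so every product is a 16x16 -> 32-bit multiply. */
static void planex_9to14_sketch(const int16_t *filter, int filter_size,
                                const int16_t **src, uint16_t *dest,
                                int dst_w, int output_bits)
{
    assert(output_bits >= 9 && output_bits <= 14);
    int shift = 11 + 16 - output_bits;      /* assumed scaling */

    for (int i = 0; i < dst_w; i++) {
        int val = 1 << (shift - 1);         /* rounding term */
        for (int j = 0; j < filter_size; j++)
            val += src[j][i] * filter[j];
        dest[i] = (uint16_t)clip_range(val >> shift, 0, (1 << output_bits) - 1);
    }
}

/* Hypothetical sketch, 16-bit output: samples are int32_t, so every product
 * is a 32x32 -> 32-bit (modulo) multiply. That is the operation the commit
 * message says POWER7 lacks in vector form. */
static void planex_16_sketch(const int16_t *filter, int filter_size,
                             const int32_t **src, uint16_t *dest, int dst_w)
{
    const int shift = 15;                   /* assumed scaling */

    for (int i = 0; i < dst_w; i++) {
        /* Unsigned accumulator with a bias keeps the wrap-around defined. */
        uint32_t acc = (1u << (shift - 1)) - 0x40000000u;
        for (int j = 0; j < filter_size; j++)
            acc += (uint32_t)src[j][i] * (uint32_t)filter[j];
        /* Conversion to signed and the arithmetic right shift assume a
         * two's-complement target, as on mainstream compilers. */
        int32_t val = (int32_t)acc;
        dest[i] = (uint16_t)(clip_range(val >> shift, -32768, 32767) + 0x8000);
    }
}
```

With a single filter tap of 4096, a 10-bit call maps an intermediate sample of 0x4000 to 512, and the 16-bit sketch maps 0x40000 to 32768. In the first loop the operands fit in 16-bit lanes. In the second they do not, which is where an instruction such as vmuluwm comes in.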
    
    With TIMER_REPORT skips disabled (for each format, the lower figure is
    presumably the VSX path and the higher one the C reference):
    yuv420p9le
      12412 UNITS in planarX,  131072 runs,      0 skips
      73136 UNITS in planarX,  131072 runs,      0 skips
    yuv420p9be
      12481 UNITS in planarX,  131072 runs,      0 skips
      73410 UNITS in planarX,  131072 runs,      0 skips
    yuv420p10le
      12322 UNITS in planarX,  131072 runs,      0 skips
      72546 UNITS in planarX,  131072 runs,      0 skips
    yuv420p10be
      12291 UNITS in planarX,  131072 runs,      0 skips
      72935 UNITS in planarX,  131072 runs,      0 skips
    yuv420p12le
      12316 UNITS in planarX,  131072 runs,      0 skips
      72708 UNITS in planarX,  131072 runs,      0 skips
    yuv420p12be
      12319 UNITS in planarX,  131072 runs,      0 skips
      72577 UNITS in planarX,  131072 runs,      0 skips
    yuv420p14le
      12259 UNITS in planarX,  131072 runs,      0 skips
      72516 UNITS in planarX,  131072 runs,      0 skips
    yuv420p14be
      12440 UNITS in planarX,  131072 runs,      0 skips
      72962 UNITS in planarX,  131072 runs,      0 skips
    yuv420p16le
      10548 UNITS in planarX,  131072 runs,      0 skips
      73429 UNITS in planarX,  131072 runs,      0 skips
    yuv420p16be
      10634 UNITS in planarX,  131072 runs,      0 skips
     150959 UNITS in planarX,  131072 runs,      0 skips
    Signed-off-by: Lauri Kasanen <cand@gmx.com>
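The quoted speedups can be checked against the timer output with a small calculation. The sketch below copies the UNITS figures from the commit message and divides the larger by the smaller for each format. Pairing the lower figure with the VSX path is an assumption, since the log does not label the two lines, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* One pair of timer readings per pixel format, copied from the log above.
 * Assumption: fast_units is the VSX run, slow_units the C reference. */
struct timing {
    const char *fmt;
    double fast_units;
    double slow_units;
};

static const struct timing timings[] = {
    { "yuv420p9le",  12412,  73136 },
    { "yuv420p9be",  12481,  73410 },
    { "yuv420p10le", 12322,  72546 },
    { "yuv420p10be", 12291,  72935 },
    { "yuv420p12le", 12316,  72708 },
    { "yuv420p12be", 12319,  72577 },
    { "yuv420p14le", 12259,  72516 },
    { "yuv420p14be", 12440,  72962 },
    { "yuv420p16le", 10548,  73429 },
    { "yuv420p16be", 10634, 150959 },
};

/* Return slow/fast for the named format, or -1.0 if it is not listed. */
static double speedup_for(const char *fmt)
{
    assert(fmt != NULL);
    for (size_t i = 0; i < sizeof timings / sizeof timings[0]; i++)
        if (strcmp(timings[i].fmt, fmt) == 0)
            return timings[i].slow_units / timings[i].fast_units;
    return -1.0;
}
```

For example, `speedup_for("yuv420p9le")` comes out a little under 6, in line with the "about 6x" figure. With the numbers as listed, yuv420p16be works out to roughly 14x and yuv420p16le to roughly 7x.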
Name
aarch64
arm
ppc
tests
x86
Makefile
alphablend.c
bayer_template.c
gamma.c
hscale.c
hscale_fast_bilinear.c
input.c
libswscale.v
log2_tab.c
options.c
output.c
rgb2rgb.c
rgb2rgb.h
rgb2rgb_template.c
slice.c
swscale.c
swscale.h
swscale_internal.h
swscale_unscaled.c
swscaleres.rc
utils.c
version.h
vscale.c
yuv2rgb.c