    avutil/cpu_internal: Fix check for SSE2SLOW · ac322ec2
    Andreas Rheinhardt authored
    For SSE2 and SSE3, there are four states that the two flags
    involved (AV_CPU_FLAG_SSE[23] and AV_CPU_FLAG_SSE[23]SLOW) can convey.
    When ordered from worst to best they are:
    1. both flags unset (SSE[23] unavailable)
    2. the slow flag set, the ordinary flag unset (this is designed
    for cases where SSE2 is available, but so slow that MMX(EXT)/SSE
    code is usually faster)
    3. both flags set (SSE2 is available, but there might be scenarios
    where MMX(EXT)/SSE code is faster)
    4. the ordinary flag set, the slow flag unset (this is the normal case)
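
The four states above can be sketched as a small classifier. This is only an illustration: FLAG_SSE2 and FLAG_SSE2SLOW are hypothetical stand-ins for AV_CPU_FLAG_SSE2 and AV_CPU_FLAG_SSE2SLOW, their bit values are made up, and sse2_state() is not an FFmpeg function.

```c
/* Hypothetical stand-ins for AV_CPU_FLAG_SSE2 and AV_CPU_FLAG_SSE2SLOW.
 * The bit values are illustrative only, not the real ones. */
#define FLAG_SSE2     (1u << 0)
#define FLAG_SSE2SLOW (1u << 1)

/* Map a flag word to the state number (1..4) used in the list above,
 * ordered from worst to best. */
static int sse2_state(unsigned flags)
{
    int ordinary = (flags & FLAG_SSE2)     != 0;
    int slow     = (flags & FLAG_SSE2SLOW) != 0;

    if (!ordinary && !slow)
        return 1;   /* SSE2 unavailable */
    if (!ordinary && slow)
        return 2;   /* available, but usually slower than MMX(EXT)/SSE */
    if (ordinary && slow)
        return 3;   /* available, sometimes slower than MMX(EXT)/SSE */
    return 4;       /* the normal case */
}
```

The same reasoning applies to SSE3 with the corresponding pair of flags.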
    
    The ordinary macros for checking cpuflags return true
    in the last two cases; the fast macros only return true for
    the last case. Yet the macros to check for slow currently
    only return true in case three.
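
As a rough sketch of the behaviour just described (before this change), the three kinds of checks could be modelled as plain functions. The names and bit values here are hypothetical, and the sketch ignores any build-time gating the real macros apply; it is not the actual content of cpu_internal.h.

```c
/* Hypothetical stand-ins for AV_CPU_FLAG_SSE2 and AV_CPU_FLAG_SSE2SLOW;
 * the bit values are illustrative only. */
#define FLAG_SSE2     (1u << 0)
#define FLAG_SSE2SLOW (1u << 1)

/* Ordinary check: true in states 3 and 4 (ordinary flag set). */
static int has_sse2(unsigned flags)
{
    return (flags & FLAG_SSE2) != 0;
}

/* Fast check: true only in state 4 (ordinary flag set, slow flag unset). */
static int has_sse2_fast(unsigned flags)
{
    return (flags & FLAG_SSE2) && !(flags & FLAG_SSE2SLOW);
}

/* Slow check as it behaved before this commit: true only in state 3
 * (both flags set). State 2 (slow flag only) is not matched. */
static int has_sse2_slow_old(unsigned flags)
{
    return (flags & FLAG_SSE2) && (flags & FLAG_SSE2SLOW);
}
```

Under this model, no check passes for a CPU in state 2, although that state also means the instructions exist.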
    
    This seems unintended. In fact, the only uses of the slow macros
    are all of the form
    if (EXTERNAL_SSE2(cpu_flags) || EXTERNAL_SSE2_SLOW(cpu_flags))
    where the check for EXTERNAL_SSE2_SLOW is completely redundant.
    Even more importantly, it is not what was intended. Before
    6369ba3c, the checks passed
    in cases 2 to 4. Said commit changed this to something that
    only passes for the third case. Commits
    7fb758cd and
    c1913064 restored the old behaviour,
    yet merging 4efab893 (in commit
    ac774cfa) broke this again
    by changing it to what it is now.*
    
    This commit changes the macros to make the slow macros check
    whether a specific instruction is supported, even if slow.
    This restores the intended meaning to all uses of the SLOW macros
    and is generally more natural.
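
One way to picture the changed meaning is the sketch below, which models the slow check as "either flag is set". All names and bit values are hypothetical, any build-time gating is ignored, and the code is an illustration of the described semantics rather than the actual macro definitions.

```c
/* Hypothetical stand-ins for AV_CPU_FLAG_SSE2 and AV_CPU_FLAG_SSE2SLOW;
 * the bit values are illustrative only. */
#define FLAG_SSE2     (1u << 0)
#define FLAG_SSE2SLOW (1u << 1)

/* Ordinary check: true in states 3 and 4. */
static int has_sse2(unsigned flags)
{
    return (flags & FLAG_SSE2) != 0;
}

/* Slow check before the change: true only in state 3. */
static int has_sse2_slow_old(unsigned flags)
{
    return (flags & FLAG_SSE2) && (flags & FLAG_SSE2SLOW);
}

/* Slow check after the change, as described above: true whenever the
 * instructions are supported at all, even if slow (states 2, 3 and 4). */
static int has_sse2_slow_new(unsigned flags)
{
    return (flags & (FLAG_SSE2 | FLAG_SSE2SLOW)) != 0;
}

/* The call-site pattern "ordinary || slow" with the old slow check.
 * It is equivalent to has_sse2() alone, so state 2 is rejected. */
static int call_site_old(unsigned flags)
{
    return has_sse2(flags) || has_sse2_slow_old(flags);
}

/* The same pattern with the new slow check: passes in states 2 to 4. */
static int call_site_new(unsigned flags)
{
    return has_sse2(flags) || has_sse2_slow_new(flags);
}
```

With the old slow check, the combined condition at the call sites never differs from the ordinary check; with the new one, it also accepts state 2.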
    
    *: Libav only checks for EXTERNAL_SSE2_SLOW, i.e. for the third case
    only.
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
cpu_internal.h 2.41 KB