Module sse41

Available on x86 or x86-64 only.

Streaming SIMD Extensions 4.1 (SSE4.1)

Constants

_MM_FROUND_CEIL
round up and do not suppress exceptions
_MM_FROUND_CUR_DIRECTION
use MXCSR.RC; see vendor::_MM_SET_ROUNDING_MODE
_MM_FROUND_FLOOR
round down and do not suppress exceptions
_MM_FROUND_NEARBYINT
use MXCSR.RC and suppress exceptions; see vendor::_MM_SET_ROUNDING_MODE
_MM_FROUND_NINT
round to nearest and do not suppress exceptions
_MM_FROUND_NO_EXC
suppress exceptions
_MM_FROUND_RAISE_EXC
do not suppress exceptions
_MM_FROUND_RINT
use MXCSR.RC and do not suppress exceptions; see vendor::_MM_SET_ROUNDING_MODE
_MM_FROUND_TO_NEAREST_INT
round to nearest
_MM_FROUND_TO_NEG_INF
round down
_MM_FROUND_TO_POS_INF
round up
_MM_FROUND_TO_ZERO
truncate
_MM_FROUND_TRUNC
truncate and do not suppress exceptions

Functions

_mm_blend_epi16 sse4.1
Blends packed 16-bit integers from a and b using the mask IMM8.
_mm_blend_pd sse4.1
Blends packed double-precision (64-bit) floating-point elements from a and b using the control mask IMM2.
_mm_blend_ps sse4.1
Blends packed single-precision (32-bit) floating-point elements from a and b using the mask IMM4.
_mm_blendv_epi8 sse4.1
Blends packed 8-bit integers from a and b using mask.
_mm_blendv_pd sse4.1
Blends packed double-precision (64-bit) floating-point elements from a and b using mask.
_mm_blendv_ps sse4.1
Blends packed single-precision (32-bit) floating-point elements from a and b using mask.
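A minimal sketch of how the immediate mask of _mm_blend_epi16 selects lanes, assuming an x86-64 target. The helper names blend_upper_from_b and blend_checked are hypothetical and exist only for this illustration; the runtime check keeps the call sound on CPUs without SSE4.1.

```rust
use std::arch::x86_64::*;

/// Hypothetical helper: takes 16-bit lanes 0..=3 from `a` and lanes 4..=7 from `b`.
/// Callers must ensure SSE4.1 is available.
#[target_feature(enable = "sse4.1")]
unsafe fn blend_upper_from_b(a: &[i16; 8], b: &[i16; 8]) -> [i16; 8] {
    let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
    let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
    // Bit i of the immediate picks lane i: 0 keeps `a`, 1 takes `b`.
    let r = _mm_blend_epi16::<0b1111_0000>(va, vb);
    let mut out = [0i16; 8];
    _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, r);
    out
}

/// Safe wrapper: returns `None` when the CPU lacks SSE4.1.
pub fn blend_checked(a: &[i16; 8], b: &[i16; 8]) -> Option<[i16; 8]> {
    if is_x86_feature_detected!("sse4.1") {
        // SAFETY: the feature check above guarantees SSE4.1 support.
        Some(unsafe { blend_upper_from_b(a, b) })
    } else {
        None
    }
}
```

The _mm_blendv_* variants take the selection from a runtime mask vector instead of an immediate.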
_mm_ceil_pd sse4.1
Rounds the packed double-precision (64-bit) floating-point elements in a up to an integer value, and stores the results as packed double-precision floating-point elements.
_mm_ceil_ps sse4.1
Rounds the packed single-precision (32-bit) floating-point elements in a up to an integer value, and stores the results as packed single-precision floating-point elements.
_mm_ceil_sd sse4.1
Rounds the lower double-precision (64-bit) floating-point element in b up to an integer value, stores the result as a double-precision floating-point element in the lower element of the intrinsic result, and copies the upper element from a to the upper element of the intrinsic result.
_mm_ceil_ss sse4.1
Rounds the lower single-precision (32-bit) floating-point element in b up to an integer value, stores the result as a single-precision floating-point element in the lower element of the intrinsic result, and copies the upper 3 packed elements from a to the upper elements of the intrinsic result.
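The scalar variants mix two inputs, which is easy to get backwards. The sketch below, assuming an x86-64 target, shows that _mm_ceil_sd rounds the lower lane of b and keeps the upper lane of a. The helper names are hypothetical.

```rust
use std::arch::x86_64::*;

/// Hypothetical helper around `_mm_ceil_sd`; callers must ensure SSE4.1.
#[target_feature(enable = "sse4.1")]
unsafe fn ceil_low_lane(a: &[f64; 2], b: &[f64; 2]) -> [f64; 2] {
    let va = _mm_loadu_pd(a.as_ptr());
    let vb = _mm_loadu_pd(b.as_ptr());
    // Lane 0 becomes ceil(b[0]); lane 1 is copied from a[1].
    let r = _mm_ceil_sd(va, vb);
    let mut out = [0.0f64; 2];
    _mm_storeu_pd(out.as_mut_ptr(), r);
    out
}

/// Safe wrapper: returns `None` when the CPU lacks SSE4.1.
pub fn ceil_sd_checked(a: &[f64; 2], b: &[f64; 2]) -> Option<[f64; 2]> {
    if is_x86_feature_detected!("sse4.1") {
        // SAFETY: SSE4.1 support was verified at runtime.
        Some(unsafe { ceil_low_lane(a, b) })
    } else {
        None
    }
}
```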
_mm_cmpeq_epi64 sse4.1
Compares packed 64-bit integers in a and b for equality.
_mm_cvtepi8_epi16 sse4.1
Sign-extends packed 8-bit integers in a to packed 16-bit integers.
_mm_cvtepi8_epi32 sse4.1
Sign-extends packed 8-bit integers in a to packed 32-bit integers.
_mm_cvtepi8_epi64 sse4.1
Sign-extends packed 8-bit integers in the low 2 bytes of a to packed 64-bit integers.
_mm_cvtepi16_epi32 sse4.1
Sign-extends packed 16-bit integers in a to packed 32-bit integers.
_mm_cvtepi16_epi64 sse4.1
Sign-extends packed 16-bit integers in a to packed 64-bit integers.
_mm_cvtepi32_epi64 sse4.1
Sign-extends packed 32-bit integers in a to packed 64-bit integers.
_mm_cvtepu8_epi16 sse4.1
Zero-extends packed unsigned 8-bit integers in a to packed 16-bit integers.
_mm_cvtepu8_epi32 sse4.1
Zero-extends packed unsigned 8-bit integers in a to packed 32-bit integers.
_mm_cvtepu8_epi64 sse4.1
Zero-extends packed unsigned 8-bit integers in a to packed 64-bit integers.
_mm_cvtepu16_epi32 sse4.1
Zero-extends packed unsigned 16-bit integers in a to packed 32-bit integers.
_mm_cvtepu16_epi64 sse4.1
Zero-extends packed unsigned 16-bit integers in a to packed 64-bit integers.
_mm_cvtepu32_epi64 sse4.1
Zero-extends packed unsigned 32-bit integers in a to packed 64-bit integers.
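To illustrate the difference between the sign-extending and zero-extending conversions, the following sketch widens the same four low bytes both ways. It assumes an x86-64 target; widen_low_four and widen_checked are hypothetical names.

```rust
use std::arch::x86_64::*;

/// Hypothetical helper: widens the four lowest bytes to 32-bit lanes,
/// once with sign extension and once with zero extension.
/// Callers must ensure SSE4.1 is available.
#[target_feature(enable = "sse4.1")]
unsafe fn widen_low_four(bytes: &[u8; 16]) -> ([i32; 4], [i32; 4]) {
    let v = _mm_loadu_si128(bytes.as_ptr() as *const __m128i);
    // Only the low 4 bytes of `v` take part in either conversion.
    let signed = _mm_cvtepi8_epi32(v);
    let unsigned = _mm_cvtepu8_epi32(v);
    let mut s = [0i32; 4];
    let mut u = [0i32; 4];
    _mm_storeu_si128(s.as_mut_ptr() as *mut __m128i, signed);
    _mm_storeu_si128(u.as_mut_ptr() as *mut __m128i, unsigned);
    (s, u)
}

/// Safe wrapper: returns `None` when the CPU lacks SSE4.1.
pub fn widen_checked(bytes: &[u8; 16]) -> Option<([i32; 4], [i32; 4])> {
    if is_x86_feature_detected!("sse4.1") {
        // SAFETY: SSE4.1 support was verified at runtime.
        Some(unsafe { widen_low_four(bytes) })
    } else {
        None
    }
}
```

With the low bytes 0xFF, 0x02, 0x80, 0x7F, the signed result is [-1, 2, -128, 127] and the unsigned result is [255, 2, 128, 127].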
_mm_dp_pd sse4.1
Returns the dot product of two __m128d vectors.
_mm_dp_ps sse4.1
Returns the dot product of two __m128 vectors.
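The dot-product intrinsics take an immediate that selects which lanes are multiplied and where the sum is written. A minimal sketch for a full four-lane dot product, assuming an x86-64 target; dot4 and dot4_checked are hypothetical names.

```rust
use std::arch::x86_64::*;

/// Hypothetical helper: four-lane dot product via `_mm_dp_ps`.
/// Callers must ensure SSE4.1 is available.
#[target_feature(enable = "sse4.1")]
unsafe fn dot4(a: &[f32; 4], b: &[f32; 4]) -> f32 {
    let va = _mm_loadu_ps(a.as_ptr());
    let vb = _mm_loadu_ps(b.as_ptr());
    // High nibble 0xF: multiply all four lane pairs.
    // Low nibble 0x1: write the sum to lane 0 only; other lanes become 0.0.
    let r = _mm_dp_ps::<0xF1>(va, vb);
    // Read lane 0 back as a scalar.
    _mm_cvtss_f32(r)
}

/// Safe wrapper: returns `None` when the CPU lacks SSE4.1.
pub fn dot4_checked(a: &[f32; 4], b: &[f32; 4]) -> Option<f32> {
    if is_x86_feature_detected!("sse4.1") {
        // SAFETY: SSE4.1 support was verified at runtime.
        Some(unsafe { dot4(a, b) })
    } else {
        None
    }
}
```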
_mm_extract_epi8 sse4.1
Extracts an 8-bit integer from a, selected with IMM8. Returns a 32-bit integer containing the zero-extended integer data.
_mm_extract_epi32 sse4.1
Extracts a 32-bit integer from a, selected with IMM8.
_mm_extract_ps sse4.1
Extracts a single-precision (32-bit) floating-point element from a, selected with IMM8. The returned i32 stores the float’s bit pattern, and may be converted back to a floating-point number with f32::from_bits after casting to u32 (a numeric `as f32` cast would convert the integer value instead of reinterpreting the bits).
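Because _mm_extract_ps returns the raw bit pattern, the result has to be reinterpreted rather than numerically converted. A short sketch, assuming an x86-64 target; the helper names are hypothetical.

```rust
use std::arch::x86_64::*;

/// Hypothetical helper: reads lane 2 of a four-float vector.
/// Callers must ensure SSE4.1 is available.
#[target_feature(enable = "sse4.1")]
unsafe fn lane_two(v: &[f32; 4]) -> f32 {
    let x = _mm_loadu_ps(v.as_ptr());
    // The intrinsic yields the IEEE 754 bit pattern as an i32.
    let bits: i32 = _mm_extract_ps::<2>(x);
    // Reinterpret the bits; `bits as f32` would convert the integer value.
    f32::from_bits(bits as u32)
}

/// Safe wrapper: returns `None` when the CPU lacks SSE4.1.
pub fn lane_two_checked(v: &[f32; 4]) -> Option<f32> {
    if is_x86_feature_detected!("sse4.1") {
        // SAFETY: SSE4.1 support was verified at runtime.
        Some(unsafe { lane_two(v) })
    } else {
        None
    }
}
```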
_mm_floor_pd sse4.1
Rounds the packed double-precision (64-bit) floating-point elements in a down to an integer value, and stores the results as packed double-precision floating-point elements.
_mm_floor_ps sse4.1
Rounds the packed single-precision (32-bit) floating-point elements in a down to an integer value, and stores the results as packed single-precision floating-point elements.
_mm_floor_sd sse4.1
Rounds the lower double-precision (64-bit) floating-point element in b down to an integer value, stores the result as a double-precision floating-point element in the lower element of the intrinsic result, and copies the upper element from a to the upper element of the intrinsic result.
_mm_floor_ss sse4.1
Rounds the lower single-precision (32-bit) floating-point element in b down to an integer value, stores the result as a single-precision floating-point element in the lower element of the intrinsic result, and copies the upper 3 packed elements from a to the upper elements of the intrinsic result.
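"Down" here means toward negative infinity, not toward zero, which matters for negative inputs. A small sketch with the packed variant, assuming an x86-64 target; floor4 and floor4_checked are hypothetical names.

```rust
use std::arch::x86_64::*;

/// Hypothetical helper: floors four floats at once.
/// Callers must ensure SSE4.1 is available.
#[target_feature(enable = "sse4.1")]
unsafe fn floor4(v: &[f32; 4]) -> [f32; 4] {
    let x = _mm_loadu_ps(v.as_ptr());
    // Each lane is rounded toward negative infinity.
    let r = _mm_floor_ps(x);
    let mut out = [0.0f32; 4];
    _mm_storeu_ps(out.as_mut_ptr(), r);
    out
}

/// Safe wrapper: returns `None` when the CPU lacks SSE4.1.
pub fn floor4_checked(v: &[f32; 4]) -> Option<[f32; 4]> {
    if is_x86_feature_detected!("sse4.1") {
        // SAFETY: SSE4.1 support was verified at runtime.
        Some(unsafe { floor4(v) })
    } else {
        None
    }
}
```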
_mm_insert_epi8 sse4.1
Returns a copy of a with the 8-bit integer from i inserted at a location specified by IMM8.
_mm_insert_epi32 sse4.1
Returns a copy of a with the 32-bit integer from i inserted at a location specified by IMM8.
_mm_insert_ps sse4.1
Selects a single value in b to store at some position in a, then zeroes elements according to IMM8.
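The immediate of _mm_insert_ps packs three fields: bits 7:6 select the source lane in b, bits 5:4 select the destination lane in a, and bits 3:0 form a mask of lanes to zero afterwards. A minimal sketch under that layout, assuming an x86-64 target; the helper names are hypothetical.

```rust
use std::arch::x86_64::*;

/// Hypothetical helper: copies b[3] into lane 0 of `a`, then zeroes lane 2.
/// Callers must ensure SSE4.1 is available.
#[target_feature(enable = "sse4.1")]
unsafe fn insert_and_zero(a: &[f32; 4], b: &[f32; 4]) -> [f32; 4] {
    let va = _mm_loadu_ps(a.as_ptr());
    let vb = _mm_loadu_ps(b.as_ptr());
    // 0b11 = source lane 3, 0b00 = destination lane 0, 0b0100 = zero lane 2.
    let r = _mm_insert_ps::<0b11_00_0100>(va, vb);
    let mut out = [0.0f32; 4];
    _mm_storeu_ps(out.as_mut_ptr(), r);
    out
}

/// Safe wrapper: returns `None` when the CPU lacks SSE4.1.
pub fn insert_ps_checked(a: &[f32; 4], b: &[f32; 4]) -> Option<[f32; 4]> {
    if is_x86_feature_detected!("sse4.1") {
        // SAFETY: SSE4.1 support was verified at runtime.
        Some(unsafe { insert_and_zero(a, b) })
    } else {
        None
    }
}
```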
_mm_max_epi8 sse4.1
Compares packed 8-bit integers in a and b, and returns packed maximum values.
_mm_max_epi32 sse4.1
Compares packed 32-bit integers in a and b, and returns packed maximum values.
_mm_max_epu16 sse4.1
Compares packed unsigned 16-bit integers in a and b, and returns packed maximum values.
_mm_max_epu32 sse4.1
Compares packed unsigned 32-bit integers in a and b, and returns packed maximum values.
_mm_min_epi8 sse4.1
Compares packed 8-bit integers in a and b, and returns packed minimum values.
_mm_min_epi32 sse4.1
Compares packed 32-bit integers in a and b, and returns packed minimum values.
_mm_min_epu16 sse4.1
Compares packed unsigned 16-bit integers in a and b, and returns packed minimum values.
_mm_min_epu32 sse4.1
Compares packed unsigned 32-bit integers in a and b, and returns packed minimum values.
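The signed and unsigned variants differ only in how the same bits are ordered. The sketch below runs both maxima over identical inputs, assuming an x86-64 target; max_both and max_both_checked are hypothetical names, and the unsigned result is shown reinterpreted as i32 for easy comparison.

```rust
use std::arch::x86_64::*;

/// Hypothetical helper: returns (signed max, unsigned max) lane by lane.
/// Callers must ensure SSE4.1 is available.
#[target_feature(enable = "sse4.1")]
unsafe fn max_both(a: &[i32; 4], b: &[i32; 4]) -> ([i32; 4], [i32; 4]) {
    let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
    let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
    let signed = _mm_max_epi32(va, vb);
    // Same bits, but compared as u32: negative values rank highest.
    let unsigned = _mm_max_epu32(va, vb);
    let mut s = [0i32; 4];
    let mut u = [0i32; 4];
    _mm_storeu_si128(s.as_mut_ptr() as *mut __m128i, signed);
    _mm_storeu_si128(u.as_mut_ptr() as *mut __m128i, unsigned);
    (s, u)
}

/// Safe wrapper: returns `None` when the CPU lacks SSE4.1.
pub fn max_both_checked(a: &[i32; 4], b: &[i32; 4]) -> Option<([i32; 4], [i32; 4])> {
    if is_x86_feature_detected!("sse4.1") {
        // SAFETY: SSE4.1 support was verified at runtime.
        Some(unsafe { max_both(a, b) })
    } else {
        None
    }
}
```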
_mm_minpos_epu16 sse4.1
Finds the minimum unsigned 16-bit element in the 128-bit __m128i vector, returning a vector containing its value in its first position, and its index in its second position; all other elements are set to zero.
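A short sketch of reading the value and index back out of the result of _mm_minpos_epu16, assuming an x86-64 target; min_with_index and min_with_index_checked are hypothetical names.

```rust
use std::arch::x86_64::*;

/// Hypothetical helper: returns (minimum value, index of the minimum).
/// Callers must ensure SSE4.1 is available.
#[target_feature(enable = "sse4.1")]
unsafe fn min_with_index(v: &[u16; 8]) -> (u16, u16) {
    let x = _mm_loadu_si128(v.as_ptr() as *const __m128i);
    let r = _mm_minpos_epu16(x);
    let mut out = [0u16; 8];
    _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, r);
    // Lane 0 holds the minimum, lane 1 its index; lanes 2..=7 are zero.
    (out[0], out[1])
}

/// Safe wrapper: returns `None` when the CPU lacks SSE4.1.
pub fn min_with_index_checked(v: &[u16; 8]) -> Option<(u16, u16)> {
    if is_x86_feature_detected!("sse4.1") {
        // SAFETY: SSE4.1 support was verified at runtime.
        Some(unsafe { min_with_index(v) })
    } else {
        None
    }
}
```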
_mm_mpsadbw_epu8 sse4.1
Subtracts 8-bit unsigned integer values and computes the absolute values of the differences. Sums of the absolute differences are then returned according to the bit fields in the immediate operand.
_mm_mul_epi32 sse4.1
Multiplies the low 32-bit integers from each packed 64-bit element in a and b, and returns the signed 64-bit results.
_mm_mullo_epi32 sse4.1
Multiplies the packed 32-bit integers in a and b, producing intermediate 64-bit integers, and returns the low 32 bits of each product, reinterpreted as a signed integer. While pmulld __m128i::splat(2), __m128i::splat(2) returns the obvious __m128i::splat(4), wrapping arithmetic means that pmulld __m128i::splat(i32::MAX), __m128i::splat(2) returns a negative number.
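The two multiplies treat overflow differently: _mm_mullo_epi32 keeps four wrapped 32-bit products, while _mm_mul_epi32 keeps two full 64-bit products taken from lanes 0 and 2. A sketch comparing them on the same inputs, assuming an x86-64 target; the helper names are hypothetical.

```rust
use std::arch::x86_64::*;

/// Hypothetical helper: returns (wrapped 32-bit products, widened 64-bit products).
/// Callers must ensure SSE4.1 is available.
#[target_feature(enable = "sse4.1")]
unsafe fn mul_both(a: &[i32; 4], b: &[i32; 4]) -> ([i32; 4], [i64; 2]) {
    let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
    let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
    // Four products, each truncated to its low 32 bits.
    let low = _mm_mullo_epi32(va, vb);
    // Two products from 32-bit lanes 0 and 2, kept at full 64-bit width.
    let wide = _mm_mul_epi32(va, vb);
    let mut l = [0i32; 4];
    let mut w = [0i64; 2];
    _mm_storeu_si128(l.as_mut_ptr() as *mut __m128i, low);
    _mm_storeu_si128(w.as_mut_ptr() as *mut __m128i, wide);
    (l, w)
}

/// Safe wrapper: returns `None` when the CPU lacks SSE4.1.
pub fn mul_both_checked(a: &[i32; 4], b: &[i32; 4]) -> Option<([i32; 4], [i64; 2])> {
    if is_x86_feature_detected!("sse4.1") {
        // SAFETY: SSE4.1 support was verified at runtime.
        Some(unsafe { mul_both(a, b) })
    } else {
        None
    }
}
```

For lane 0 holding i32::MAX times 2, the wrapped product is -2 while the widened product is 4294967294.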
_mm_packus_epi32 sse4.1
Converts packed 32-bit integers from a and b to packed 16-bit integers using unsigned saturation.
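Unsigned saturation clamps each signed 32-bit input to the range 0..=65535 before narrowing. A minimal sketch, assuming an x86-64 target; pack_saturating and pack_checked are hypothetical names.

```rust
use std::arch::x86_64::*;

/// Hypothetical helper: narrows eight i32 values to u16 with saturation.
/// Callers must ensure SSE4.1 is available.
#[target_feature(enable = "sse4.1")]
unsafe fn pack_saturating(a: &[i32; 4], b: &[i32; 4]) -> [u16; 8] {
    let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
    let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
    // Lanes 0..=3 come from `a`, lanes 4..=7 from `b`.
    // Negative inputs clamp to 0, inputs above 65535 clamp to 65535.
    let r = _mm_packus_epi32(va, vb);
    let mut out = [0u16; 8];
    _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, r);
    out
}

/// Safe wrapper: returns `None` when the CPU lacks SSE4.1.
pub fn pack_checked(a: &[i32; 4], b: &[i32; 4]) -> Option<[u16; 8]> {
    if is_x86_feature_detected!("sse4.1") {
        // SAFETY: SSE4.1 support was verified at runtime.
        Some(unsafe { pack_saturating(a, b) })
    } else {
        None
    }
}
```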
_mm_round_pd sse4.1
Rounds the packed double-precision (64-bit) floating-point elements in a using the ROUNDING parameter, and stores the results as packed double-precision floating-point elements. ROUNDING is built from the _MM_FROUND_* constants listed under Constants.
_mm_round_ps sse4.1
Rounds the packed single-precision (32-bit) floating-point elements in a using the ROUNDING parameter, and stores the results as packed single-precision floating-point elements. ROUNDING is built from the _MM_FROUND_* constants listed under Constants.
_mm_round_sd sse4.1
Rounds the lower double-precision (64-bit) floating-point element in b using the ROUNDING parameter, stores the result as a double-precision floating-point element in the lower element of the intrinsic result, and copies the upper element from a to the upper element of the intrinsic result. ROUNDING is built from the _MM_FROUND_* constants listed under Constants.
_mm_round_ss sse4.1
Rounds the lower single-precision (32-bit) floating-point element in b using the ROUNDING parameter, stores the result as a single-precision floating-point element in the lower element of the intrinsic result, and copies the upper 3 packed elements from a to the upper elements of the intrinsic result. ROUNDING is built from the _MM_FROUND_* constants listed under Constants.
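The sketch below applies _mm_round_pd with each of the four explicit rounding directions, each combined with _MM_FROUND_NO_EXC to suppress exceptions. It assumes an x86-64 target; round_all_modes and round_all_checked are hypothetical names.

```rust
use std::arch::x86_64::*;

/// Hypothetical helper: rounds `v` to nearest, down, up, and toward zero,
/// in that order. Callers must ensure SSE4.1 is available.
#[target_feature(enable = "sse4.1")]
unsafe fn round_all_modes(v: &[f64; 2]) -> [[f64; 2]; 4] {
    let x = _mm_loadu_pd(v.as_ptr());
    let nearest = _mm_round_pd::<{ _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC }>(x);
    let down = _mm_round_pd::<{ _MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC }>(x);
    let up = _mm_round_pd::<{ _MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC }>(x);
    let trunc = _mm_round_pd::<{ _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC }>(x);
    let mut out = [[0.0f64; 2]; 4];
    _mm_storeu_pd(out[0].as_mut_ptr(), nearest);
    _mm_storeu_pd(out[1].as_mut_ptr(), down);
    _mm_storeu_pd(out[2].as_mut_ptr(), up);
    _mm_storeu_pd(out[3].as_mut_ptr(), trunc);
    out
}

/// Safe wrapper: returns `None` when the CPU lacks SSE4.1.
pub fn round_all_checked(v: &[f64; 2]) -> Option<[[f64; 2]; 4]> {
    if is_x86_feature_detected!("sse4.1") {
        // SAFETY: SSE4.1 support was verified at runtime.
        Some(unsafe { round_all_modes(v) })
    } else {
        None
    }
}
```

Round-to-nearest resolves ties to the even integer, so 2.5 becomes 2.0 in that mode.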
_mm_stream_load_si128 sse4.1
Loads 128 bits of integer data from memory. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated. To minimize caching, the data is flagged as non-temporal (unlikely to be used again soon).
_mm_test_all_ones sse4.1
Tests whether the specified bits in a 128-bit integer vector are all ones.
_mm_test_all_zeros sse4.1
Tests whether the specified bits in a 128-bit integer vector are all zeros.
_mm_test_mix_ones_zeros sse4.1
Tests whether the specified bits in a 128-bit integer vector are neither all zeros nor all ones.
_mm_testc_si128 sse4.1
Tests whether the specified bits in a 128-bit integer vector are all ones.
_mm_testnzc_si128 sse4.1
Tests whether the specified bits in a 128-bit integer vector are neither all zeros nor all ones.
_mm_testz_si128 sse4.1
Tests whether the specified bits in a 128-bit integer vector are all zeros.
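In the two-operand test intrinsics, the second argument is the mask that specifies which bits of the first are examined. A sketch that reports all three outcomes for one value and mask broadcast across the vector, assuming an x86-64 target; mask_flags and mask_flags_checked are hypothetical names.

```rust
use std::arch::x86_64::*;

/// Hypothetical helper: returns (all masked bits zero, all masked bits one,
/// masked bits mixed). Callers must ensure SSE4.1 is available.
#[target_feature(enable = "sse4.1")]
unsafe fn mask_flags(a: i32, mask: i32) -> (bool, bool, bool) {
    let va = _mm_set1_epi32(a);
    let vm = _mm_set1_epi32(mask);
    (
        // 1 when (a & mask) == 0.
        _mm_testz_si128(va, vm) == 1,
        // 1 when (!a & mask) == 0, i.e. every masked bit of `a` is set.
        _mm_testc_si128(va, vm) == 1,
        // 1 when neither of the two conditions above holds.
        _mm_testnzc_si128(va, vm) == 1,
    )
}

/// Safe wrapper: returns `None` when the CPU lacks SSE4.1.
pub fn mask_flags_checked(a: i32, mask: i32) -> Option<(bool, bool, bool)> {
    if is_x86_feature_detected!("sse4.1") {
        // SAFETY: SSE4.1 support was verified at runtime.
        Some(unsafe { mask_flags(a, mask) })
    } else {
        None
    }
}
```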
dppd 🔒
dpps 🔒
insertps 🔒
mpsadbw 🔒
packusdw 🔒
phminposuw 🔒
ptestc 🔒
ptestnzc 🔒
ptestz 🔒
roundpd 🔒
roundps 🔒
roundsd 🔒
roundss 🔒