Module sse

Available on x86 or x86-64 only.

Streaming SIMD Extensions (SSE)

Constants§

_MM_EXCEPT_DENORM
See _mm_setcsr
_MM_EXCEPT_DIV_ZERO
See _mm_setcsr
_MM_EXCEPT_INEXACT
See _mm_setcsr
_MM_EXCEPT_INVALID
See _mm_setcsr
_MM_EXCEPT_MASK
See _MM_GET_EXCEPTION_STATE
_MM_EXCEPT_OVERFLOW
See _mm_setcsr
_MM_EXCEPT_UNDERFLOW
See _mm_setcsr
_MM_FLUSH_ZERO_MASK
See _MM_GET_FLUSH_ZERO_MODE
_MM_FLUSH_ZERO_OFF
See _mm_setcsr
_MM_FLUSH_ZERO_ON
See _mm_setcsr
_MM_HINT_ET0
See _mm_prefetch.
_MM_HINT_ET1
See _mm_prefetch.
_MM_HINT_NTA
See _mm_prefetch.
_MM_HINT_T0
See _mm_prefetch.
_MM_HINT_T1
See _mm_prefetch.
_MM_HINT_T2
See _mm_prefetch.
_MM_MASK_DENORM
See _mm_setcsr
_MM_MASK_DIV_ZERO
See _mm_setcsr
_MM_MASK_INEXACT
See _mm_setcsr
_MM_MASK_INVALID
See _mm_setcsr
_MM_MASK_MASK
See _MM_GET_EXCEPTION_MASK
_MM_MASK_OVERFLOW
See _mm_setcsr
_MM_MASK_UNDERFLOW
See _mm_setcsr
_MM_ROUND_DOWN
See _mm_setcsr
_MM_ROUND_MASK
See _MM_GET_ROUNDING_MODE
_MM_ROUND_NEAREST
See _mm_setcsr
_MM_ROUND_TOWARD_ZERO
See _mm_setcsr
_MM_ROUND_UP
See _mm_setcsr

Functions§

_MM_GET_EXCEPTION_MASKDeprecatedsse
See _mm_setcsr
_MM_GET_EXCEPTION_STATEDeprecatedsse
See _mm_setcsr
_MM_GET_FLUSH_ZERO_MODEDeprecatedsse
See _mm_setcsr
_MM_GET_ROUNDING_MODEDeprecatedsse
See _mm_setcsr
_MM_SET_EXCEPTION_MASKDeprecatedsse
See _mm_setcsr
_MM_SET_EXCEPTION_STATEDeprecatedsse
See _mm_setcsr
_MM_SET_FLUSH_ZERO_MODEDeprecatedsse
See _mm_setcsr
_MM_SET_ROUNDING_MODEDeprecatedsse
See _mm_setcsr
_MM_TRANSPOSE4_PSsse
Transpose the 4x4 matrix formed by 4 rows of __m128 in place.
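As a rough illustration of the in-place transpose, the sketch below loads four rows, calls `_MM_TRANSPOSE4_PS` on them, and reads the rows back. It assumes an x86-64 target (where SSE is always available); `lanes` and `transpose4` are hypothetical helper names introduced only for this example.

```rust
// Assumption: x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Hypothetical helper: copies the lanes of `v` into an array, lowest lane first.
fn lanes(v: __m128) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    // SAFETY: `out` holds four f32 values and `_mm_storeu_ps` needs no alignment.
    unsafe { _mm_storeu_ps(out.as_mut_ptr(), v) };
    out
}

/// Hypothetical wrapper: transposes a row-major 4x4 matrix.
fn transpose4(m: [[f32; 4]; 4]) -> [[f32; 4]; 4] {
    // SAFETY: SSE is baseline on x86-64; every load reads exactly four f32 values.
    unsafe {
        let mut r0 = _mm_loadu_ps(m[0].as_ptr());
        let mut r1 = _mm_loadu_ps(m[1].as_ptr());
        let mut r2 = _mm_loadu_ps(m[2].as_ptr());
        let mut r3 = _mm_loadu_ps(m[3].as_ptr());
        // The four row registers are rewritten in place.
        _MM_TRANSPOSE4_PS(&mut r0, &mut r1, &mut r2, &mut r3);
        [lanes(r0), lanes(r1), lanes(r2), lanes(r3)]
    }
}
```

After the call, row `i` of the result holds what used to be column `i` of the input.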
_mm_add_pssse
Adds packed single-precision (32-bit) floating-point elements in a and b.
_mm_add_sssse
Adds the first components of a and b; the other components are copied from a.
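The difference between the packed (`_ps`) and scalar (`_ss`) forms can be sketched as follows. This assumes an x86-64 target; `lanes`, `add_packed`, and `add_scalar` are hypothetical names used only for illustration.

```rust
// Assumption: x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Hypothetical helper: copies the lanes of `v` into an array, lowest lane first.
fn lanes(v: __m128) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    // SAFETY: `out` holds four f32 values and `_mm_storeu_ps` needs no alignment.
    unsafe { _mm_storeu_ps(out.as_mut_ptr(), v) };
    out
}

/// Packed add: every lane of `a` is added to the matching lane of `b`.
fn add_packed(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    // SAFETY: SSE is baseline on x86-64; both loads read four f32 values.
    unsafe { lanes(_mm_add_ps(_mm_loadu_ps(a.as_ptr()), _mm_loadu_ps(b.as_ptr()))) }
}

/// Scalar add: only lane 0 is summed; lanes 1..=3 are copied from `a`.
fn add_scalar(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    // SAFETY: as above.
    unsafe { lanes(_mm_add_ss(_mm_loadu_ps(a.as_ptr()), _mm_loadu_ps(b.as_ptr()))) }
}
```

The same packed/scalar split applies to the `sub`, `mul`, `div`, `min`, `max`, and `sqrt` families listed on this page.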
_mm_and_pssse
Bitwise AND of packed single-precision (32-bit) floating-point elements.
_mm_andnot_pssse
Bitwise AND-NOT of packed single-precision (32-bit) floating-point elements.
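One common use of the bitwise operations is sign manipulation. The sketch below computes a lane-wise absolute value with `_mm_andnot_ps`, which computes `(!a) & b`, so the complement applies to the first operand. It assumes an x86-64 target; `lanes` and `abs4` are hypothetical names.

```rust
// Assumption: x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Hypothetical helper: copies the lanes of `v` into an array, lowest lane first.
fn lanes(v: __m128) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    // SAFETY: `out` holds four f32 values and `_mm_storeu_ps` needs no alignment.
    unsafe { _mm_storeu_ps(out.as_mut_ptr(), v) };
    out
}

/// Lane-wise absolute value, implemented by clearing the sign bit.
fn abs4(v: [f32; 4]) -> [f32; 4] {
    // SAFETY: SSE is baseline on x86-64; the load reads four f32 values.
    unsafe {
        let x = _mm_loadu_ps(v.as_ptr());
        // -0.0 has only the sign bit (0x8000_0000) set in every lane.
        let sign = _mm_set1_ps(-0.0);
        // (!sign) & x keeps every bit of x except the sign bit.
        lanes(_mm_andnot_ps(sign, x))
    }
}
```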
_mm_cmpeq_pssse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input elements were equal, or 0 otherwise.
_mm_cmpeq_sssse
Compares the lowest f32 of both inputs for equality. The lowest 32 bits of the result will be 0xffffffff if the two inputs are equal, or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpge_pssse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is greater than or equal to the corresponding element in b, or 0 otherwise.
_mm_cmpge_sssse
Compares the lowest f32 of both inputs for greater than or equal. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is greater than or equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpgt_pssse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is greater than the corresponding element in b, or 0 otherwise.
_mm_cmpgt_sssse
Compares the lowest f32 of both inputs for greater than. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is greater than b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmple_pssse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is less than or equal to the corresponding element in b, or 0 otherwise.
_mm_cmple_sssse
Compares the lowest f32 of both inputs for less than or equal. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is less than or equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmplt_pssse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is less than the corresponding element in b, or 0 otherwise.
_mm_cmplt_sssse
Compares the lowest f32 of both inputs for less than. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is less than b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpneq_pssse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input elements are not equal, or 0 otherwise.
_mm_cmpneq_sssse
Compares the lowest f32 of both inputs for inequality. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpnge_pssse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is not greater than or equal to the corresponding element in b, or 0 otherwise.
_mm_cmpnge_sssse
Compares the lowest f32 of both inputs for not-greater-than-or-equal. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not greater than or equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpngt_pssse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is not greater than the corresponding element in b, or 0 otherwise.
_mm_cmpngt_sssse
Compares the lowest f32 of both inputs for not-greater-than. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not greater than b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpnle_pssse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is not less than or equal to the corresponding element in b, or 0 otherwise.
_mm_cmpnle_sssse
Compares the lowest f32 of both inputs for not-less-than-or-equal. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not less than or equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpnlt_pssse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is not less than the corresponding element in b, or 0 otherwise.
_mm_cmpnlt_sssse
Compares the lowest f32 of both inputs for not-less-than. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not less than b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpord_pssse
Compares each of the four floats in a to the corresponding element in b. Returns four floats that have one of two possible bit patterns. The element in the output vector will be 0xffffffff if the input elements in a and b are ordered (i.e., neither of them is a NaN), or 0 otherwise.
_mm_cmpord_sssse
Checks if the lowest f32 of both inputs are ordered. The lowest 32 bits of the result will be 0xffffffff if neither of a.extract(0) or b.extract(0) is a NaN, or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpunord_pssse
Compares each of the four floats in a to the corresponding element in b. Returns four floats that have one of two possible bit patterns. The element in the output vector will be 0xffffffff if the input elements in a and b are unordered (i.e., at least one of them is a NaN), or 0 otherwise.
_mm_cmpunord_sssse
Checks if the lowest f32 of both inputs are unordered. The lowest 32 bits of the result will be 0xffffffff if any of a.extract(0) or b.extract(0) is a NaN, or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
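The comparison intrinsics return lane masks rather than booleans. A minimal sketch of reading those masks, using `_mm_movemask_ps` to collapse each lane's top bit into an integer, is shown below. It also illustrates why the negated predicates exist: with a NaN operand, "not less than" is true while "greater than or equal" is false. The sketch assumes an x86-64 target; `load`, `lt_mask`, `nlt_mask`, and `ge_mask` are hypothetical names.

```rust
// Assumption: x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Hypothetical helper: loads four f32 values without alignment requirements.
fn load(a: &[f32; 4]) -> __m128 {
    // SAFETY: `a` points to exactly four readable f32 values.
    unsafe { _mm_loadu_ps(a.as_ptr()) }
}

/// Bit i of the result is set when a[i] < b[i].
fn lt_mask(a: [f32; 4], b: [f32; 4]) -> i32 {
    // SAFETY: SSE is baseline on x86-64.
    unsafe { _mm_movemask_ps(_mm_cmplt_ps(load(&a), load(&b))) }
}

/// Bit i of the result is set when !(a[i] < b[i]), which includes NaN operands.
fn nlt_mask(a: [f32; 4], b: [f32; 4]) -> i32 {
    // SAFETY: as above.
    unsafe { _mm_movemask_ps(_mm_cmpnlt_ps(load(&a), load(&b))) }
}

/// Bit i of the result is set when a[i] >= b[i]; NaN operands give 0.
fn ge_mask(a: [f32; 4], b: [f32; 4]) -> i32 {
    // SAFETY: as above.
    unsafe { _mm_movemask_ps(_mm_cmpge_ps(load(&a), load(&b))) }
}
```

For `a = [1.0, 5.0, 3.0, NaN]` and `b = [2.0, 2.0, 3.0, 0.0]`, the NaN lane (bit 3) is set in the `nlt` mask but clear in the `ge` mask.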
_mm_comieq_sssse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if they are equal, or 0 otherwise.
_mm_comige_sssse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is greater than or equal to the one from b, or 0 otherwise.
_mm_comigt_sssse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is greater than the one from b, or 0 otherwise.
_mm_comile_sssse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is less than or equal to the one from b, or 0 otherwise.
_mm_comilt_sssse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is less than the one from b, or 0 otherwise.
_mm_comineq_sssse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if they are not equal, or 0 otherwise.
_mm_cvt_si2sssse
Alias for _mm_cvtsi32_ss.
_mm_cvt_ss2sisse
Alias for _mm_cvtss_si32.
_mm_cvtsi32_sssse
Converts a 32 bit integer to a 32 bit float. The result vector is the input vector a with the lowest 32 bit float replaced by the converted integer.
_mm_cvtss_f32sse
Extracts the lowest 32 bit float from the input vector.
_mm_cvtss_si32sse
Converts the lowest 32 bit float in the input vector to a 32 bit integer.
_mm_cvtt_ss2sisse
Alias for _mm_cvttss_si32.
_mm_cvttss_si32sse
Converts the lowest 32 bit float in the input vector to a 32 bit integer with truncation.
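The two float-to-integer conversions differ only in rounding. The sketch below assumes an x86-64 target and that the MXCSR rounding mode is still at its default (round to nearest, ties to even); `round_to_i32` and `truncate_to_i32` are hypothetical names.

```rust
// Assumption: x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Converts using the current MXCSR rounding mode
/// (assumed here to be the default, round-to-nearest-even).
fn round_to_i32(x: f32) -> i32 {
    // SAFETY: SSE is baseline on x86-64.
    unsafe { _mm_cvtss_si32(_mm_set_ss(x)) }
}

/// Converts by truncating toward zero, regardless of the rounding mode.
fn truncate_to_i32(x: f32) -> i32 {
    // SAFETY: as above.
    unsafe { _mm_cvttss_si32(_mm_set_ss(x)) }
}
```

Under the default mode, `round_to_i32(2.5)` yields 2 and `round_to_i32(3.5)` yields 4 because ties go to the even integer, whereas `truncate_to_i32` simply drops the fractional part.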
_mm_div_pssse
Divides packed single-precision (32-bit) floating-point elements in a and b.
_mm_div_sssse
Divides the first component of a by the first component of b; the other components are copied from a.
_mm_getcsrDeprecatedsse
Gets the unsigned 32-bit value of the MXCSR control and status register.
_mm_load1_pssse
Construct a __m128 by duplicating the value read from p into all elements.
_mm_load_pssse
Loads four f32 values from aligned memory into a __m128. If the pointer is not aligned to a 128-bit boundary (16 bytes) a general protection fault will be triggered (fatal program crash).
_mm_load_ps1sse
Alias for _mm_load1_ps
_mm_load_sssse
Construct a __m128 with the lowest element read from p and the other elements set to zero.
_mm_loadr_pssse
Loads four f32 values from aligned memory into a __m128 in reverse order.
_mm_loadu_pssse
Loads four f32 values from memory into a __m128. There are no restrictions on memory alignment. For aligned memory _mm_load_ps may be faster.
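The load variants can be compared side by side. In this sketch the 16-byte alignment that `_mm_load_ps` and `_mm_loadr_ps` require is provided by a `#[repr(align(16))]` wrapper. It assumes an x86-64 target; `Aligned`, `lanes`, and the four wrapper functions are hypothetical names.

```rust
// Assumption: x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Hypothetical wrapper type that guarantees 16-byte alignment.
#[repr(C, align(16))]
struct Aligned([f32; 4]);

/// Hypothetical helper: copies the lanes of `v` into an array, lowest lane first.
fn lanes(v: __m128) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    // SAFETY: `out` holds four f32 values and `_mm_storeu_ps` needs no alignment.
    unsafe { _mm_storeu_ps(out.as_mut_ptr(), v) };
    out
}

/// Aligned load; the pointer must be 16-byte aligned.
fn load_aligned(data: &Aligned) -> [f32; 4] {
    // SAFETY: `Aligned` is 16-byte aligned and holds four f32 values.
    unsafe { lanes(_mm_load_ps(data.0.as_ptr())) }
}

/// Aligned load with the element order reversed.
fn load_reversed(data: &Aligned) -> [f32; 4] {
    // SAFETY: as above.
    unsafe { lanes(_mm_loadr_ps(data.0.as_ptr())) }
}

/// Duplicates one value into all four lanes.
fn broadcast(x: &f32) -> [f32; 4] {
    // SAFETY: `x` is a valid reference to a single f32.
    unsafe { lanes(_mm_load1_ps(x)) }
}

/// Loads one value into lane 0 and zeroes the other lanes.
fn load_low(x: &f32) -> [f32; 4] {
    // SAFETY: as above.
    unsafe { lanes(_mm_load_ss(x)) }
}
```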
_mm_max_pssse
Compares packed single-precision (32-bit) floating-point elements in a and b, and returns the corresponding maximum values.
_mm_max_sssse
Compares the first single-precision (32-bit) floating-point elements of a and b, and returns the maximum value in the first element of the return value; the other elements are copied from a.
_mm_min_pssse
Compares packed single-precision (32-bit) floating-point elements in a and b, and returns the corresponding minimum values.
_mm_min_sssse
Compares the first single-precision (32-bit) floating-point elements of a and b, and returns the minimum value in the first element of the return value; the other elements are copied from a.
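A typical use of the packed min/max pair is a branch-free clamp. The following is a minimal sketch, assuming an x86-64 target and NaN-free inputs; `lanes` and `clamp4` are hypothetical names.

```rust
// Assumption: x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Hypothetical helper: copies the lanes of `v` into an array, lowest lane first.
fn lanes(v: __m128) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    // SAFETY: `out` holds four f32 values and `_mm_storeu_ps` needs no alignment.
    unsafe { _mm_storeu_ps(out.as_mut_ptr(), v) };
    out
}

/// Clamps every lane of `v` into the range [lo, hi] (inputs assumed NaN-free).
fn clamp4(v: [f32; 4], lo: f32, hi: f32) -> [f32; 4] {
    // SAFETY: SSE is baseline on x86-64; the load reads four f32 values.
    unsafe {
        let x = _mm_loadu_ps(v.as_ptr());
        // Raise values below `lo`, then lower values above `hi`.
        let raised = _mm_max_ps(x, _mm_set1_ps(lo));
        lanes(_mm_min_ps(raised, _mm_set1_ps(hi)))
    }
}
```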
_mm_move_sssse
Returns a __m128 with the first component from b and the remaining components from a.
_mm_movehl_pssse
Combine higher half of a and b. The higher half of b occupies the lower half of result.
_mm_movelh_pssse
Combine lower half of a and b. The lower half of b occupies the higher half of result.
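The lane placement of the two half-moves is easy to misread, so a small sketch may help. It assumes an x86-64 target; `lanes`, `load`, `high_halves`, and `low_halves` are hypothetical names.

```rust
// Assumption: x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Hypothetical helper: copies the lanes of `v` into an array, lowest lane first.
fn lanes(v: __m128) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    // SAFETY: `out` holds four f32 values and `_mm_storeu_ps` needs no alignment.
    unsafe { _mm_storeu_ps(out.as_mut_ptr(), v) };
    out
}

/// Hypothetical helper: loads four f32 values without alignment requirements.
fn load(a: &[f32; 4]) -> __m128 {
    // SAFETY: `a` points to exactly four readable f32 values.
    unsafe { _mm_loadu_ps(a.as_ptr()) }
}

/// Result lanes: [b[2], b[3], a[2], a[3]].
fn high_halves(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    // SAFETY: SSE is baseline on x86-64.
    unsafe { lanes(_mm_movehl_ps(load(&a), load(&b))) }
}

/// Result lanes: [a[0], a[1], b[0], b[1]].
fn low_halves(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    // SAFETY: as above.
    unsafe { lanes(_mm_movelh_ps(load(&a), load(&b))) }
}
```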
_mm_movemask_pssse
Returns a mask of the most significant bit of each element in a.
_mm_mul_pssse
Multiplies packed single-precision (32-bit) floating-point elements in a and b.
_mm_mul_sssse
Multiplies the first components of a and b; the other components are copied from a.
_mm_or_pssse
Bitwise OR of packed single-precision (32-bit) floating-point elements.
_mm_prefetchsse
Fetch the cache line that contains address p using the given STRATEGY.
_mm_rcp_pssse
Returns the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a.
_mm_rcp_sssse
Returns the approximate reciprocal of the first single-precision (32-bit) floating-point element in a, the other elements are unchanged.
_mm_rsqrt_pssse
Returns the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a.
_mm_rsqrt_sssse
Returns the approximate reciprocal square root of the first single-precision (32-bit) floating-point element in a, the other elements are unchanged.
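Because the reciprocal intrinsics are only approximate, callers that need more precision often refine the estimate. The sketch below applies one Newton-Raphson step, `r1 = r0 * (2 - x * r0)`. That refinement is a common technique added here for illustration and is not something the intrinsic performs itself. It assumes an x86-64 target; `approx_recip` and `refined_recip` are hypothetical names.

```rust
// Assumption: x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Raw hardware estimate of 1/x (low precision).
fn approx_recip(x: f32) -> f32 {
    // SAFETY: SSE is baseline on x86-64.
    unsafe { _mm_cvtss_f32(_mm_rcp_ss(_mm_set_ss(x))) }
}

/// Estimate of 1/x improved by one Newton-Raphson step.
fn refined_recip(x: f32) -> f32 {
    // SAFETY: as above.
    unsafe {
        let a = _mm_set_ss(x);
        let r0 = _mm_rcp_ss(a);
        let two = _mm_set_ss(2.0);
        // r1 = r0 * (2 - a * r0), computed in lane 0 only.
        let r1 = _mm_mul_ss(r0, _mm_sub_ss(two, _mm_mul_ss(a, r0)));
        _mm_cvtss_f32(r1)
    }
}
```

Each refinement step roughly squares the relative error of the estimate, at the cost of two multiplications and one subtraction.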
_mm_set1_pssse
Construct a __m128 with all elements set to a.
_mm_set_pssse
Construct a __m128 from four floating point values highest to lowest.
_mm_set_ps1sse
Alias for _mm_set1_ps
_mm_set_sssse
Construct a __m128 with the lowest element set to a and the rest set to zero.
_mm_setcsrDeprecatedsse
Sets the MXCSR register with the 32-bit unsigned integer value.
_mm_setr_pssse
Construct a __m128 from four floating point values lowest to highest.
_mm_setzero_pssse
Construct a __m128 with all elements initialized to zero.
_mm_sfencesse
Performs a serializing operation on all non-temporal (“streaming”) store instructions that were issued by the current thread prior to this instruction.
_mm_shuffle_pssse
Shuffles packed single-precision (32-bit) floating-point elements in a and b using MASK.
_mm_sqrt_pssse
Returns the square root of packed single-precision (32-bit) floating-point elements in a.
_mm_sqrt_sssse
Returns the square root of the first single-precision (32-bit) floating-point element in a, the other elements are unchanged.
_mm_store1_pssse
Stores the lowest 32 bit float of a repeated four times into aligned memory.
_mm_store_pssse
Stores four 32-bit floats into aligned memory.
_mm_store_ps1sse
Alias for _mm_store1_ps
_mm_store_sssse
Stores the lowest 32 bit float of a into memory.
_mm_storer_pssse
Stores four 32-bit floats into aligned memory in reverse order.
_mm_storeu_pssse
Stores four 32-bit floats into memory. There are no restrictions on memory alignment. For aligned memory _mm_store_ps may be faster.
_mm_stream_pssse
Stores a into the memory at mem_addr using a non-temporal memory hint.
_mm_sub_pssse
Subtracts packed single-precision (32-bit) floating-point elements in a and b.
_mm_sub_sssse
Subtracts the first component of b from the first component of a; the other components are copied from a.
_mm_ucomieq_sssse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if they are equal, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
_mm_ucomige_sssse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is greater than or equal to the one from b, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
_mm_ucomigt_sssse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is greater than the one from b, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
_mm_ucomile_sssse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is less than or equal to the one from b, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
_mm_ucomilt_sssse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is less than the one from b, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
_mm_ucomineq_sssse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if they are not equal, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
_mm_undefined_pssse
Returns a vector of type __m128 with indeterminate elements. Despite using the word “undefined” (following Intel’s naming scheme), this non-deterministically picks some valid value and is not equivalent to mem::MaybeUninit. In practice, this is typically equivalent to mem::zeroed.
_mm_unpackhi_pssse
Unpacks and interleaves single-precision (32-bit) floating-point elements from the higher half of a and b.
_mm_unpacklo_pssse
Unpacks and interleaves single-precision (32-bit) floating-point elements from the lower half of a and b.
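The interleaving order of the two unpack intrinsics can be sketched as follows, assuming an x86-64 target; `lanes`, `load`, `interleave_low`, and `interleave_high` are hypothetical names.

```rust
// Assumption: x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Hypothetical helper: copies the lanes of `v` into an array, lowest lane first.
fn lanes(v: __m128) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    // SAFETY: `out` holds four f32 values and `_mm_storeu_ps` needs no alignment.
    unsafe { _mm_storeu_ps(out.as_mut_ptr(), v) };
    out
}

/// Hypothetical helper: loads four f32 values without alignment requirements.
fn load(a: &[f32; 4]) -> __m128 {
    // SAFETY: `a` points to exactly four readable f32 values.
    unsafe { _mm_loadu_ps(a.as_ptr()) }
}

/// Result lanes: [a[0], b[0], a[1], b[1]].
fn interleave_low(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    // SAFETY: SSE is baseline on x86-64.
    unsafe { lanes(_mm_unpacklo_ps(load(&a), load(&b))) }
}

/// Result lanes: [a[2], b[2], a[3], b[3]].
fn interleave_high(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    // SAFETY: as above.
    unsafe { lanes(_mm_unpackhi_ps(load(&a), load(&b))) }
}
```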
_mm_xor_pssse
Bitwise exclusive OR of packed single-precision (32-bit) floating-point elements.
cmpps 🔒
cmpss 🔒
comieq_ss 🔒
comige_ss 🔒
comigt_ss 🔒
comile_ss 🔒
comilt_ss 🔒
comineq_ss 🔒
cvtsi2ss 🔒
cvtss2si 🔒
cvttss2si 🔒
ldmxcsr 🔒
maxps 🔒
maxss 🔒
minps 🔒
minss 🔒
prefetch 🔒
rcpps 🔒
rcpss 🔒
rsqrtps 🔒
rsqrtss 🔒
sfence 🔒
stmxcsr 🔒
ucomieq_ss 🔒
ucomige_ss 🔒
ucomigt_ss 🔒
ucomile_ss 🔒
ucomilt_ss 🔒
ucomineq_ss 🔒
_MM_SHUFFLEExperimental
A utility function for creating masks to use with Intel shuffle and permute intrinsics.
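Since `_MM_SHUFFLE` is marked experimental, stable code usually builds the mask by hand. The mask packs four 2-bit lane indices: bits 1:0 and 3:2 pick the two low result lanes from a, and bits 5:4 and 7:6 pick the two high result lanes from b. The sketch below assumes an x86-64 target; `shuffle_mask` is a hypothetical stand-in that mirrors the `(z, y, x, w)` argument order, and `lanes`, `load`, and `reverse_mix` are likewise illustrative names.

```rust
// Assumption: x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Hypothetical stable stand-in for `_MM_SHUFFLE(z, y, x, w)`.
const fn shuffle_mask(z: u32, y: u32, x: u32, w: u32) -> i32 {
    ((z << 6) | (y << 4) | (x << 2) | w) as i32
}

/// Hypothetical helper: copies the lanes of `v` into an array, lowest lane first.
fn lanes(v: __m128) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    // SAFETY: `out` holds four f32 values and `_mm_storeu_ps` needs no alignment.
    unsafe { _mm_storeu_ps(out.as_mut_ptr(), v) };
    out
}

/// Hypothetical helper: loads four f32 values without alignment requirements.
fn load(a: &[f32; 4]) -> __m128 {
    // SAFETY: `a` points to exactly four readable f32 values.
    unsafe { _mm_loadu_ps(a.as_ptr()) }
}

/// Result lanes: [a[3], a[2], b[1], b[0]].
fn reverse_mix(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    // SAFETY: SSE is baseline on x86-64.
    unsafe { lanes(_mm_shuffle_ps::<{ shuffle_mask(0, 1, 2, 3) }>(load(&a), load(&b))) }
}
```

Passing the same vector for both operands turns `reverse_mix` into a plain lane reversal.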