Module sse

Available on x86 or x86-64 only.

Streaming SIMD Extensions (SSE)

Constants§

_MM_EXCEPT_DENORM: See _mm_setcsr
_MM_EXCEPT_DIV_ZERO: See _mm_setcsr
_MM_EXCEPT_INEXACT: See _mm_setcsr
_MM_EXCEPT_INVALID: See _mm_setcsr
_MM_EXCEPT_MASK: See _MM_GET_EXCEPTION_STATE
_MM_EXCEPT_OVERFLOW: See _mm_setcsr
_MM_EXCEPT_UNDERFLOW: See _mm_setcsr
_MM_FLUSH_ZERO_MASK: See _MM_GET_FLUSH_ZERO_MODE
_MM_FLUSH_ZERO_OFF: See _mm_setcsr
_MM_FLUSH_ZERO_ON: See _mm_setcsr
_MM_HINT_ET0: See _mm_prefetch.
_MM_HINT_ET1: See _mm_prefetch.
_MM_HINT_NTA: See _mm_prefetch.
_MM_HINT_T0: See _mm_prefetch.
_MM_HINT_T1: See _mm_prefetch.
_MM_HINT_T2: See _mm_prefetch.
_MM_MASK_DENORM: See _mm_setcsr
_MM_MASK_DIV_ZERO: See _mm_setcsr
_MM_MASK_INEXACT: See _mm_setcsr
_MM_MASK_INVALID: See _mm_setcsr
_MM_MASK_MASK: See _MM_GET_EXCEPTION_MASK
_MM_MASK_OVERFLOW: See _mm_setcsr
_MM_MASK_UNDERFLOW: See _mm_setcsr
_MM_ROUND_DOWN: See _mm_setcsr
_MM_ROUND_MASK: See _MM_GET_ROUNDING_MODE
_MM_ROUND_NEAREST: See _mm_setcsr
_MM_ROUND_TOWARD_ZERO: See _mm_setcsr
_MM_ROUND_UP: See _mm_setcsr
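As an illustration of how the `_MM_HINT_*` constants are used, the sketch below passes `_MM_HINT_T0` as the const generic `STRATEGY` argument of `_mm_prefetch`. It assumes an x86-64 target, where SSE is part of the baseline. The function name `sum_with_prefetch` and the distance `PREFETCH_AHEAD` are hypothetical and chosen only for this example; whether prefetching helps at all depends on the workload.

```rust
use std::arch::x86_64::{_mm_prefetch, _MM_HINT_T0};

/// Sums a slice while hinting that an element further ahead will be needed soon.
/// (Hypothetical helper for illustration; not part of the `sse` module.)
pub fn sum_with_prefetch(data: &[f32]) -> f32 {
    // Hypothetical look-ahead distance in elements; tune or remove for real code.
    const PREFETCH_AHEAD: usize = 64;

    let mut total = 0.0f32;
    for (i, &x) in data.iter().enumerate() {
        // Issue one hint per 16 floats (64 bytes, a common cache-line size).
        if i % 16 == 0 {
            if let Some(ahead) = data.get(i + PREFETCH_AHEAD) {
                // SAFETY: SSE is available on every x86-64 CPU, and the pointer
                // is derived from a valid reference into `data`.
                unsafe { _mm_prefetch::<_MM_HINT_T0>(ahead as *const f32 as *const i8) };
            }
        }
        total += x;
    }
    total
}
```

A prefetch is only a hint: it does not change the result of the computation, so the function returns the same sum with or without it.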

Functions§

_MM_GET_EXCEPTION_MASK^⚠Deprecatedsse: See _mm_setcsr
_MM_GET_EXCEPTION_STATE^⚠Deprecatedsse: See _mm_setcsr
_MM_GET_FLUSH_ZERO_MODE^⚠Deprecatedsse: See _mm_setcsr
_MM_GET_ROUNDING_MODE^⚠Deprecatedsse: See _mm_setcsr
_MM_SET_EXCEPTION_MASK^⚠Deprecatedsse: See _mm_setcsr
_MM_SET_EXCEPTION_STATE^⚠Deprecatedsse: See _mm_setcsr
_MM_SET_FLUSH_ZERO_MODE^⚠Deprecatedsse: See _mm_setcsr
_MM_SET_ROUNDING_MODE^⚠Deprecatedsse: See _mm_setcsr
_MM_TRANSPOSE4_PS^⚠sse: Transpose the 4x4 matrix formed by 4 rows of __m128 in place.
_mm_add_ps^⚠sse: Adds packed single-precision (32-bit) floating-point elements in a and b.
_mm_add_ss^⚠sse: Adds the first component of a and b, the other components are copied from a.
_mm_and_ps^⚠sse: Bitwise AND of packed single-precision (32-bit) floating-point elements.
_mm_andnot_ps^⚠sse: Bitwise AND-NOT of packed single-precision (32-bit) floating-point elements.
_mm_cmpeq_ps^⚠sse: Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input elements were equal, or 0 otherwise.
_mm_cmpeq_ss^⚠sse: Compares the lowest f32 of both inputs for equality. The lowest 32 bits of the result will be 0xffffffff if the two inputs are equal, or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpge_ps^⚠sse: Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is greater than or equal to the corresponding element in b, or 0 otherwise.
_mm_cmpge_ss^⚠sse: Compares the lowest f32 of both inputs for greater than or equal. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is greater than or equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpgt_ps^⚠sse: Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is greater than the corresponding element in b, or 0 otherwise.
_mm_cmpgt_ss^⚠sse: Compares the lowest f32 of both inputs for greater than. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is greater than b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmple_ps^⚠sse: Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is less than or equal to the corresponding element in b, or 0 otherwise.
_mm_cmple_ss^⚠sse: Compares the lowest f32 of both inputs for less than or equal. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is less than or equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmplt_ps^⚠sse: Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is less than the corresponding element in b, or 0 otherwise.
_mm_cmplt_ss^⚠sse: Compares the lowest f32 of both inputs for less than. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is less than b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpneq_ps^⚠sse: Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input elements are not equal, or 0 otherwise.
_mm_cmpneq_ss^⚠sse: Compares the lowest f32 of both inputs for inequality. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpnge_ps^⚠sse: Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is not greater than or equal to the corresponding element in b, or 0 otherwise.
_mm_cmpnge_ss^⚠sse: Compares the lowest f32 of both inputs for not-greater-than-or-equal. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not greater than or equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpngt_ps^⚠sse: Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is not greater than the corresponding element in b, or 0 otherwise.
_mm_cmpngt_ss^⚠sse: Compares the lowest f32 of both inputs for not-greater-than. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not greater than b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpnle_ps^⚠sse: Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is not less than or equal to the corresponding element in b, or 0 otherwise.
_mm_cmpnle_ss^⚠sse: Compares the lowest f32 of both inputs for not-less-than-or-equal. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not less than or equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpnlt_ps^⚠sse: Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is not less than the corresponding element in b, or 0 otherwise.
_mm_cmpnlt_ss^⚠sse: Compares the lowest f32 of both inputs for not-less-than. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not less than b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpord_ps^⚠sse: Compares each of the four floats in a to the corresponding element in b. Returns four floats that have one of two possible bit patterns. The element in the output vector will be 0xffffffff if the input elements in a and b are ordered (i.e., neither of them is a NaN), or 0 otherwise.
_mm_cmpord_ss^⚠sse: Checks if the lowest f32 of both inputs are ordered. The lowest 32 bits of the result will be 0xffffffff if neither of a.extract(0) or b.extract(0) is a NaN, or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpunord_ps^⚠sse: Compares each of the four floats in a to the corresponding element in b. Returns four floats that have one of two possible bit patterns. The element in the output vector will be 0xffffffff if the input elements in a and b are unordered (i.e., at least one of them is a NaN), or 0 otherwise.
_mm_cmpunord_ss^⚠sse: Checks if the lowest f32 of both inputs are unordered. The lowest 32 bits of the result will be 0xffffffff if any of a.extract(0) or b.extract(0) is a NaN, or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_comieq_ss^⚠sse: Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if they are equal, or 0 otherwise.
_mm_comige_ss^⚠sse: Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is greater than or equal to the one from b, or 0 otherwise.
_mm_comigt_ss^⚠sse: Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is greater than the one from b, or 0 otherwise.
_mm_comile_ss^⚠sse: Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is less than or equal to the one from b, or 0 otherwise.
_mm_comilt_ss^⚠sse: Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is less than the one from b, or 0 otherwise.
_mm_comineq_ss^⚠sse: Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if they are not equal, or 0 otherwise.
_mm_cvt_si2ss^⚠sse: Alias for _mm_cvtsi32_ss.
_mm_cvt_ss2si^⚠sse: Alias for _mm_cvtss_si32.
_mm_cvtsi32_ss^⚠sse: Converts a 32-bit integer to a 32-bit float. The result vector is the input vector a with the lowest 32-bit float replaced by the converted integer.
_mm_cvtss_f32^⚠sse: Extracts the lowest 32-bit float from the input vector.
_mm_cvtss_si32^⚠sse: Converts the lowest 32-bit float in the input vector to a 32-bit integer.
_mm_cvtt_ss2si^⚠sse: Alias for _mm_cvttss_si32.
_mm_cvttss_si32^⚠sse: Converts the lowest 32-bit float in the input vector to a 32-bit integer with truncation.
_mm_div_ps^⚠sse: Divides packed single-precision (32-bit) floating-point elements in a and b.
_mm_div_ss^⚠sse: Divides the first component of a by the first component of b, the other components are copied from a.
_mm_getcsr^⚠Deprecatedsse: Gets the unsigned 32-bit value of the MXCSR control and status register.
_mm_load1_ps^⚠sse: Construct a __m128 by duplicating the value read from p into all elements.
_mm_load_ps^⚠sse: Loads four f32 values from aligned memory into a __m128. If the pointer is not aligned to a 128-bit boundary (16 bytes) a general protection fault will be triggered (fatal program crash).
_mm_load_ps1^⚠sse: Alias for _mm_load1_ps
_mm_load_ss^⚠sse: Construct a __m128 with the lowest element read from p and the other elements set to zero.
_mm_loadr_ps^⚠sse: Loads four f32 values from aligned memory into a __m128 in reverse order.
_mm_loadu_ps^⚠sse: Loads four f32 values from memory into a __m128. There are no restrictions on memory alignment. For aligned memory _mm_load_ps may be faster.
_mm_max_ps^⚠sse: Compares packed single-precision (32-bit) floating-point elements in a and b, and returns the corresponding maximum values.
_mm_max_ss^⚠sse: Compares the first single-precision (32-bit) floating-point element of a and b, and returns the maximum value in the first element of the return value; the other elements are copied from a.
_mm_min_ps^⚠sse: Compares packed single-precision (32-bit) floating-point elements in a and b, and returns the corresponding minimum values.
_mm_min_ss^⚠sse: Compares the first single-precision (32-bit) floating-point element of a and b, and returns the minimum value in the first element of the return value; the other elements are copied from a.
_mm_move_ss^⚠sse: Returns a __m128 with the first component from b and the remaining components from a.
_mm_movehl_ps^⚠sse: Combine higher half of a and b. The higher half of b occupies the lower half of result.
_mm_movelh_ps^⚠sse: Combine lower half of a and b. The lower half of b occupies the higher half of result.
_mm_movemask_ps^⚠sse: Returns a mask of the most significant bit of each element in a.
_mm_mul_ps^⚠sse: Multiplies packed single-precision (32-bit) floating-point elements in a and b.
_mm_mul_ss^⚠sse: Multiplies the first component of a and b, the other components are copied from a.
_mm_or_ps^⚠sse: Bitwise OR of packed single-precision (32-bit) floating-point elements.
_mm_prefetch^⚠sse: Fetch the cache line that contains address p using the given STRATEGY.
_mm_rcp_ps^⚠sse: Returns the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a.
_mm_rcp_ss^⚠sse: Returns the approximate reciprocal of the first single-precision (32-bit) floating-point element in a, the other elements are unchanged.
_mm_rsqrt_ps^⚠sse: Returns the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a.
_mm_rsqrt_ss^⚠sse: Returns the approximate reciprocal square root of the first single-precision (32-bit) floating-point element in a, the other elements are unchanged.
_mm_set1_ps^⚠sse: Construct a __m128 with all elements set to a.
_mm_set_ps^⚠sse: Construct a __m128 from four floating point values highest to lowest.
_mm_set_ps1^⚠sse: Alias for _mm_set1_ps
_mm_set_ss^⚠sse: Construct a __m128 with the lowest element set to a and the rest set to zero.
_mm_setcsr^⚠Deprecatedsse: Sets the MXCSR register with the 32-bit unsigned integer value.
_mm_setr_ps^⚠sse: Construct a __m128 from four floating point values lowest to highest.
_mm_setzero_ps^⚠sse: Construct a __m128 with all elements initialized to zero.
_mm_sfence^⚠sse: Performs a serializing operation on all non-temporal (“streaming”) store instructions that were issued by the current thread prior to this instruction.
_mm_shuffle_ps^⚠sse: Shuffles packed single-precision (32-bit) floating-point elements in a and b using MASK.
_mm_sqrt_ps^⚠sse: Returns the square root of packed single-precision (32-bit) floating-point elements in a.
_mm_sqrt_ss^⚠sse: Returns the square root of the first single-precision (32-bit) floating-point element in a, the other elements are unchanged.
_mm_store1_ps^⚠sse: Stores the lowest 32-bit float of a repeated four times into aligned memory.
_mm_store_ps^⚠sse: Stores four 32-bit floats into aligned memory.
_mm_store_ps1^⚠sse: Alias for _mm_store1_ps
_mm_store_ss^⚠sse: Stores the lowest 32-bit float of a into memory.
_mm_storer_ps^⚠sse: Stores four 32-bit floats into aligned memory in reverse order.
_mm_storeu_ps^⚠sse: Stores four 32-bit floats into memory. There are no restrictions on memory alignment. For aligned memory _mm_store_ps may be faster.
_mm_stream_ps^⚠sse: Stores a into the memory at mem_addr using a non-temporal memory hint.
_mm_sub_ps^⚠sse: Subtracts packed single-precision (32-bit) floating-point elements in a and b.
_mm_sub_ss^⚠sse: Subtracts the first component of b from a, the other components are copied from a.
_mm_ucomieq_ss^⚠sse: Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if they are equal, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
_mm_ucomige_ss^⚠sse: Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is greater than or equal to the one from b, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
_mm_ucomigt_ss^⚠sse: Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is greater than the one from b, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
_mm_ucomile_ss^⚠sse: Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is less than or equal to the one from b, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
_mm_ucomilt_ss^⚠sse: Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is less than the one from b, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
_mm_ucomineq_ss^⚠sse: Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if they are not equal, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
_mm_undefined_ps^⚠sse: Returns a vector of type __m128 with indeterminate elements. Despite using the word “undefined” (following Intel’s naming scheme), this non-deterministically picks some valid value and is not equivalent to mem::MaybeUninit. In practice, this is typically equivalent to mem::zeroed.
_mm_unpackhi_ps^⚠sse: Unpacks and interleaves single-precision (32-bit) floating-point elements from the higher half of a and b.
_mm_unpacklo_ps^⚠sse: Unpacks and interleaves single-precision (32-bit) floating-point elements from the lower half of a and b.
_mm_xor_ps^⚠sse: Bitwise exclusive OR of packed single-precision (32-bit) floating-point elements.
cmpps 🔒 ^⚠
cmpss 🔒 ^⚠
comieq_ss 🔒 ^⚠
comige_ss 🔒 ^⚠
comigt_ss 🔒 ^⚠
comile_ss 🔒 ^⚠
comilt_ss 🔒 ^⚠
comineq_ss 🔒 ^⚠
cvtsi2ss 🔒 ^⚠
cvtss2si 🔒 ^⚠
cvttss2si 🔒 ^⚠
ldmxcsr 🔒 ^⚠
maxps 🔒 ^⚠
maxss 🔒 ^⚠
minps 🔒 ^⚠
minss 🔒 ^⚠
prefetch 🔒 ^⚠
rcpps 🔒 ^⚠
rcpss 🔒 ^⚠
rsqrtps 🔒 ^⚠
rsqrtss 🔒 ^⚠
sfence 🔒 ^⚠
stmxcsr 🔒 ^⚠
ucomieq_ss 🔒 ^⚠
ucomige_ss 🔒 ^⚠
ucomigt_ss 🔒 ^⚠
ucomile_ss 🔒 ^⚠
ucomilt_ss 🔒 ^⚠
ucomineq_ss 🔒 ^⚠
_MM_SHUFFLE (Experimental): A utility function for creating masks to use with Intel shuffle and permute intrinsics.
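To show how several of the functions listed above fit together, here is a minimal sketch that combines the unaligned load and store intrinsics with a packed add, and a packed comparison with `_mm_movemask_ps`. It assumes an x86-64 target, where SSE is always available. The helper names `add4` and `lanes_below` are hypothetical and exist only for this example.

```rust
use std::arch::x86_64::{
    _mm_add_ps, _mm_cmplt_ps, _mm_loadu_ps, _mm_movemask_ps, _mm_set1_ps, _mm_storeu_ps,
};

/// Adds two arrays of four floats lane by lane.
/// (Hypothetical helper for illustration.)
pub fn add4(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    // SAFETY: SSE is available on every x86-64 CPU. Each pointer refers to
    // four valid f32 values, and the unaligned variants place no alignment
    // requirement on them.
    unsafe {
        let va = _mm_loadu_ps(a.as_ptr());
        let vb = _mm_loadu_ps(b.as_ptr());
        _mm_storeu_ps(out.as_mut_ptr(), _mm_add_ps(va, vb));
    }
    out
}

/// Returns a 4-bit mask in which bit `i` is set when `v[i] < limit`.
/// (Hypothetical helper for illustration.)
pub fn lanes_below(v: [f32; 4], limit: f32) -> i32 {
    // SAFETY: same reasoning as in `add4`.
    unsafe {
        // Each lane of `cmp` is 0xffffffff where the comparison holds, else 0.
        let cmp = _mm_cmplt_ps(_mm_loadu_ps(v.as_ptr()), _mm_set1_ps(limit));
        // Collect the most significant bit of each lane into the low four bits.
        _mm_movemask_ps(cmp)
    }
}
```

For example, `lanes_below([1.0, 5.0, 2.0, 9.0], 3.0)` sets bits 0 and 2, because only the first and third elements are below the limit. The unaligned variants are used here because a plain `[f32; 4]` is not guaranteed to be 16-byte aligned, which `_mm_load_ps` and `_mm_store_ps` require.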
