Module avx


Available on x86 or x86-64 only.


Advanced Vector Extensions (AVX)

The references are:

- Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 2: Instruction Set Reference, A-Z.
- AMD64 Architecture Programmer’s Manual, Volume 3: General-Purpose and System Instructions.

Wikipedia provides a quick overview of the instructions available.
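Because the module exists only on x86 and x86-64 and every intrinsic listed below requires the `avx` target feature, callers usually gate AVX code behind a run-time check. The following is a minimal sketch of that pattern, not the module's prescribed usage; the function names `add_f64x4` and `add_f64x4_avx` are hypothetical and chosen only for illustration.

```rust
/// Adds two arrays of four f64 values, using AVX when the CPU supports it.
/// (Hypothetical helper; not part of the avx module.)
pub fn add_f64x4(a: [f64; 4], b: [f64; 4]) -> [f64; 4] {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx") {
            // SAFETY: AVX support was verified at run time just above.
            return unsafe { add_f64x4_avx(a, b) };
        }
    }
    // Portable scalar fallback for other architectures or CPUs without AVX.
    [a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3]]
}

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx")]
unsafe fn add_f64x4_avx(a: [f64; 4], b: [f64; 4]) -> [f64; 4] {
    #[cfg(target_arch = "x86")]
    use std::arch::x86::*;
    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;

    // Unaligned loads: plain arrays are not guaranteed to be 32-byte aligned.
    let va = _mm256_loadu_pd(a.as_ptr());
    let vb = _mm256_loadu_pd(b.as_ptr());
    let sum = _mm256_add_pd(va, vb);

    let mut out = [0.0f64; 4];
    _mm256_storeu_pd(out.as_mut_ptr(), sum);
    out
}
```

The unaligned load and store variants are used here because `_mm256_load_pd` and `_mm256_store_pd` require 32-byte alignment, which a plain `[f64; 4]` does not guarantee.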

Constants§

_CMP_EQ_OQ: Equal (ordered, non-signaling)
_CMP_EQ_OS: Equal (ordered, signaling)
_CMP_EQ_UQ: Equal (unordered, non-signaling)
_CMP_EQ_US: Equal (unordered, signaling)
_CMP_FALSE_OQ: False (ordered, non-signaling)
_CMP_FALSE_OS: False (ordered, signaling)
_CMP_GE_OQ: Greater-than-or-equal (ordered, non-signaling)
_CMP_GE_OS: Greater-than-or-equal (ordered, signaling)
_CMP_GT_OQ: Greater-than (ordered, non-signaling)
_CMP_GT_OS: Greater-than (ordered, signaling)
_CMP_LE_OQ: Less-than-or-equal (ordered, non-signaling)
_CMP_LE_OS: Less-than-or-equal (ordered, signaling)
_CMP_LT_OQ: Less-than (ordered, non-signaling)
_CMP_LT_OS: Less-than (ordered, signaling)
_CMP_NEQ_OQ: Not-equal (ordered, non-signaling)
_CMP_NEQ_OS: Not-equal (ordered, signaling)
_CMP_NEQ_UQ: Not-equal (unordered, non-signaling)
_CMP_NEQ_US: Not-equal (unordered, signaling)
_CMP_NGE_UQ: Not-greater-than-or-equal (unordered, non-signaling)
_CMP_NGE_US: Not-greater-than-or-equal (unordered, signaling)
_CMP_NGT_UQ: Not-greater-than (unordered, non-signaling)
_CMP_NGT_US: Not-greater-than (unordered, signaling)
_CMP_NLE_UQ: Not-less-than-or-equal (unordered, non-signaling)
_CMP_NLE_US: Not-less-than-or-equal (unordered, signaling)
_CMP_NLT_UQ: Not-less-than (unordered, non-signaling)
_CMP_NLT_US: Not-less-than (unordered, signaling)
_CMP_ORD_Q: Ordered (non-signaling)
_CMP_ORD_S: Ordered (signaling)
_CMP_TRUE_UQ: True (unordered, non-signaling)
_CMP_TRUE_US: True (unordered, signaling)
_CMP_UNORD_Q: Unordered (non-signaling)
_CMP_UNORD_S: Unordered (signaling)
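The ordered and unordered predicates differ only when an operand is NaN: an ordered predicate is false for a NaN pair, an unordered one is true. The sketch below illustrates this by comparing `_CMP_LT_OQ` with `_CMP_NGE_UQ` through `_mm256_cmp_pd` and `_mm256_movemask_pd`. The wrapper names `lt_and_nge_masks` and `masks_avx` are hypothetical, and the scalar fallback is an assumption added so the example also runs without AVX.

```rust
/// Returns bit masks (bit i = element i) for `a < b` under _CMP_LT_OQ and
/// for `!(a >= b)` under _CMP_NGE_UQ. (Hypothetical helper.)
pub fn lt_and_nge_masks(a: [f64; 4], b: [f64; 4]) -> (i32, i32) {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx") {
            // SAFETY: AVX support was verified at run time just above.
            return unsafe { masks_avx(a, b) };
        }
    }
    // Scalar fallback with the same semantics: `<` is false for NaN
    // (ordered), while `!(x >= y)` is true for NaN (unordered).
    let mut lt = 0;
    let mut nge = 0;
    for i in 0..4 {
        if a[i] < b[i] {
            lt |= 1 << i;
        }
        if !(a[i] >= b[i]) {
            nge |= 1 << i;
        }
    }
    (lt, nge)
}

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx")]
unsafe fn masks_avx(a: [f64; 4], b: [f64; 4]) -> (i32, i32) {
    #[cfg(target_arch = "x86")]
    use std::arch::x86::*;
    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;

    let va = _mm256_loadu_pd(a.as_ptr());
    let vb = _mm256_loadu_pd(b.as_ptr());
    // The predicate is passed as a const generic argument.
    let lt = _mm256_cmp_pd::<_CMP_LT_OQ>(va, vb);
    let nge = _mm256_cmp_pd::<_CMP_NGE_UQ>(va, vb);
    // A true comparison sets every bit of the element, including the sign
    // bit that movemask collects.
    (_mm256_movemask_pd(lt), _mm256_movemask_pd(nge))
}
```

For `a = [1.0, NaN, 3.0, 5.0]` and `b = [2.0, 2.0, 3.0, 4.0]`, the two masks agree on every element except the NaN one, where only the unordered predicate reports true.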

Functions§

_mm256_add_pd^⚠avx: Adds packed double-precision (64-bit) floating-point elements in a and b.
_mm256_add_ps^⚠avx: Adds packed single-precision (32-bit) floating-point elements in a and b.
_mm256_addsub_pd^⚠avx: Alternately adds and subtracts packed double-precision (64-bit) floating-point elements in a to/from packed elements in b.
_mm256_addsub_ps^⚠avx: Alternately adds and subtracts packed single-precision (32-bit) floating-point elements in a to/from packed elements in b.
_mm256_and_pd^⚠avx: Computes the bitwise AND of packed double-precision (64-bit) floating-point elements in a and b.
_mm256_and_ps^⚠avx: Computes the bitwise AND of packed single-precision (32-bit) floating-point elements in a and b.
_mm256_andnot_pd^⚠avx: Computes the bitwise NOT of packed double-precision (64-bit) floating-point elements in a, and then AND with b.
_mm256_andnot_ps^⚠avx: Computes the bitwise NOT of packed single-precision (32-bit) floating-point elements in a and then AND with b.
_mm256_blend_pd^⚠avx: Blends packed double-precision (64-bit) floating-point elements from a and b using control mask imm8.
_mm256_blend_ps^⚠avx: Blends packed single-precision (32-bit) floating-point elements from a and b using control mask imm8.
_mm256_blendv_pd^⚠avx: Blends packed double-precision (64-bit) floating-point elements from a and b using c as a mask.
_mm256_blendv_ps^⚠avx: Blends packed single-precision (32-bit) floating-point elements from a and b using c as a mask.
_mm256_broadcast_pd^⚠avx: Broadcasts 128 bits from memory (composed of 2 packed double-precision (64-bit) floating-point elements) to all elements of the returned vector.
_mm256_broadcast_ps^⚠avx: Broadcasts 128 bits from memory (composed of 4 packed single-precision (32-bit) floating-point elements) to all elements of the returned vector.
_mm256_broadcast_sd^⚠avx: Broadcasts a double-precision (64-bit) floating-point element from memory to all elements of the returned vector.
_mm256_broadcast_ss^⚠avx: Broadcasts a single-precision (32-bit) floating-point element from memory to all elements of the returned vector.
_mm256_castpd128_pd256^⚠avx: Casts vector of type __m128d to type __m256d; the upper 128 bits of the result are undefined.
_mm256_castpd256_pd128^⚠avx: Casts vector of type __m256d to type __m128d.
_mm256_castpd_ps^⚠avx: Casts vector of type __m256d to type __m256.
_mm256_castpd_si256^⚠avx: Casts vector of type __m256d to type __m256i.
_mm256_castps128_ps256^⚠avx: Casts vector of type __m128 to type __m256; the upper 128 bits of the result are undefined.
_mm256_castps256_ps128^⚠avx: Casts vector of type __m256 to type __m128.
_mm256_castps_pd^⚠avx: Casts vector of type __m256 to type __m256d.
_mm256_castps_si256^⚠avx: Casts vector of type __m256 to type __m256i.
_mm256_castsi128_si256^⚠avx: Casts vector of type __m128i to type __m256i; the upper 128 bits of the result are undefined.
_mm256_castsi256_pd^⚠avx: Casts vector of type __m256i to type __m256d.
_mm256_castsi256_ps^⚠avx: Casts vector of type __m256i to type __m256.
_mm256_castsi256_si128^⚠avx: Casts vector of type __m256i to type __m128i.
_mm256_ceil_pd^⚠avx: Rounds packed double-precision (64-bit) floating point elements in a toward positive infinity.
_mm256_ceil_ps^⚠avx: Rounds packed single-precision (32-bit) floating point elements in a toward positive infinity.
_mm256_cmp_pd^⚠avx: Compares packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by IMM5.
_mm256_cmp_ps^⚠avx: Compares packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by IMM5.
_mm256_cvtepi32_pd^⚠avx: Converts packed 32-bit integers in a to packed double-precision (64-bit) floating-point elements.
_mm256_cvtepi32_ps^⚠avx: Converts packed 32-bit integers in a to packed single-precision (32-bit) floating-point elements.
_mm256_cvtpd_epi32^⚠avx: Converts packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers.
_mm256_cvtpd_ps^⚠avx: Converts packed double-precision (64-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements.
_mm256_cvtps_epi32^⚠avx: Converts packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers.
_mm256_cvtps_pd^⚠avx: Converts packed single-precision (32-bit) floating-point elements in a to packed double-precision (64-bit) floating-point elements.
_mm256_cvtsd_f64^⚠avx: Returns the first element of the input vector of [4 x double].
_mm256_cvtsi256_si32^⚠avx: Returns the first element of the input vector of [8 x i32].
_mm256_cvtss_f32^⚠avx: Returns the first element of the input vector of [8 x float].
_mm256_cvttpd_epi32^⚠avx: Converts packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation.
_mm256_cvttps_epi32^⚠avx: Converts packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation.
_mm256_div_pd^⚠avx: Computes the division of each of the 4 packed 64-bit floating-point elements in a by the corresponding packed elements in b.
_mm256_div_ps^⚠avx: Computes the division of each of the 8 packed 32-bit floating-point elements in a by the corresponding packed elements in b.
_mm256_dp_ps^⚠avx: Conditionally multiplies the packed single-precision (32-bit) floating-point elements in a and b using the high 4 bits in imm8, sums the four products, and conditionally returns the sum using the low 4 bits of imm8.
_mm256_extract_epi32^⚠avx: Extracts a 32-bit integer from a, selected with INDEX.
_mm256_extractf128_pd^⚠avx: Extracts 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a, selected with imm8.
_mm256_extractf128_ps^⚠avx: Extracts 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a, selected with imm8.
_mm256_extractf128_si256^⚠avx: Extracts 128 bits (composed of integer data) from a, selected with imm8.
_mm256_floor_pd^⚠avx: Rounds packed double-precision (64-bit) floating point elements in a toward negative infinity.
_mm256_floor_ps^⚠avx: Rounds packed single-precision (32-bit) floating point elements in a toward negative infinity.
_mm256_hadd_pd^⚠avx: Horizontal addition of adjacent pairs in the two packed vectors of 4 64-bit floating points a and b. In the result, sums of elements from a are returned in even locations, while sums of elements from b are returned in odd locations.
_mm256_hadd_ps^⚠avx: Horizontal addition of adjacent pairs in the two packed vectors of 8 32-bit floating points a and b. In the result, sums of elements from a are returned in locations of indices 0, 1, 4, 5, while sums of elements from b are returned in locations 2, 3, 6, 7.
_mm256_hsub_pd^⚠avx: Horizontal subtraction of adjacent pairs in the two packed vectors of 4 64-bit floating points a and b. In the result, differences of elements from a are returned in even locations, while differences of elements from b are returned in odd locations.
_mm256_hsub_ps^⚠avx: Horizontal subtraction of adjacent pairs in the two packed vectors of 8 32-bit floating points a and b. In the result, differences of elements from a are returned in locations of indices 0, 1, 4, 5, while differences of elements from b are returned in locations 2, 3, 6, 7.
_mm256_insert_epi8^⚠avx: Copies a to result, and inserts the 8-bit integer i into result at the location specified by index.
_mm256_insert_epi16^⚠avx: Copies a to result, and inserts the 16-bit integer i into result at the location specified by index.
_mm256_insert_epi32^⚠avx: Copies a to result, and inserts the 32-bit integer i into result at the location specified by index.
_mm256_insertf128_pd^⚠avx: Copies a to result, then inserts 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from b into result at the location specified by imm8.
_mm256_insertf128_ps^⚠avx: Copies a to result, then inserts 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from b into result at the location specified by imm8.
_mm256_insertf128_si256^⚠avx: Copies a to result, then inserts 128 bits from b into result at the location specified by imm8.
_mm256_lddqu_si256^⚠avx: Loads 256-bits of integer data from unaligned memory into result. This intrinsic may perform better than _mm256_loadu_si256 when the data crosses a cache line boundary.
_mm256_load_pd^⚠avx: Loads 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from memory into result. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
_mm256_load_ps^⚠avx: Loads 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from memory into result. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
_mm256_load_si256^⚠avx: Loads 256-bits of integer data from memory into result. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
_mm256_loadu2_m128^⚠avx: Loads two 128-bit values (composed of 4 packed single-precision (32-bit) floating-point elements) from memory, and combines them into a 256-bit value. hiaddr and loaddr do not need to be aligned on any particular boundary.
_mm256_loadu2_m128d^⚠avx: Loads two 128-bit values (composed of 2 packed double-precision (64-bit) floating-point elements) from memory, and combines them into a 256-bit value. hiaddr and loaddr do not need to be aligned on any particular boundary.
_mm256_loadu2_m128i^⚠avx: Loads two 128-bit values (composed of integer data) from memory, and combines them into a 256-bit value. hiaddr and loaddr do not need to be aligned on any particular boundary.
_mm256_loadu_pd^⚠avx: Loads 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from memory into result. mem_addr does not need to be aligned on any particular boundary.
_mm256_loadu_ps^⚠avx: Loads 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from memory into result. mem_addr does not need to be aligned on any particular boundary.
_mm256_loadu_si256^⚠avx: Loads 256-bits of integer data from memory into result. mem_addr does not need to be aligned on any particular boundary.
_mm256_maskload_pd^⚠avx: Loads packed double-precision (64-bit) floating-point elements from memory into result using mask (elements are zeroed out when the high bit of the corresponding element is not set).
_mm256_maskload_ps^⚠avx: Loads packed single-precision (32-bit) floating-point elements from memory into result using mask (elements are zeroed out when the high bit of the corresponding element is not set).
_mm256_maskstore_pd^⚠avx: Stores packed double-precision (64-bit) floating-point elements from a into memory using mask.
_mm256_maskstore_ps^⚠avx: Stores packed single-precision (32-bit) floating-point elements from a into memory using mask.
_mm256_max_pd^⚠avx: Compares packed double-precision (64-bit) floating-point elements in a and b, and returns packed maximum values.
_mm256_max_ps^⚠avx: Compares packed single-precision (32-bit) floating-point elements in a and b, and returns packed maximum values.
_mm256_min_pd^⚠avx: Compares packed double-precision (64-bit) floating-point elements in a and b, and returns packed minimum values.
_mm256_min_ps^⚠avx: Compares packed single-precision (32-bit) floating-point elements in a and b, and returns packed minimum values.
_mm256_movedup_pd^⚠avx: Duplicates even-indexed double-precision (64-bit) floating-point elements from a, and returns the results.
_mm256_movehdup_ps^⚠avx: Duplicates odd-indexed single-precision (32-bit) floating-point elements from a, and returns the results.
_mm256_moveldup_ps^⚠avx: Duplicates even-indexed single-precision (32-bit) floating-point elements from a, and returns the results.
_mm256_movemask_pd^⚠avx: Sets each bit of the returned mask based on the most significant bit of the corresponding packed double-precision (64-bit) floating-point element in a.
_mm256_movemask_ps^⚠avx: Sets each bit of the returned mask based on the most significant bit of the corresponding packed single-precision (32-bit) floating-point element in a.
_mm256_mul_pd^⚠avx: Multiplies packed double-precision (64-bit) floating-point elements in a and b.
_mm256_mul_ps^⚠avx: Multiplies packed single-precision (32-bit) floating-point elements in a and b.
_mm256_or_pd^⚠avx: Computes the bitwise OR of packed double-precision (64-bit) floating-point elements in a and b.
_mm256_or_ps^⚠avx: Computes the bitwise OR of packed single-precision (32-bit) floating-point elements in a and b.
_mm256_permute2f128_pd^⚠avx: Shuffles 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) selected by imm8 from a and b.
_mm256_permute2f128_ps^⚠avx: Shuffles 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) selected by imm8 from a and b.
_mm256_permute2f128_si256^⚠avx: Shuffles 128-bits (composed of integer data) selected by imm8 from a and b.
_mm256_permute_pd^⚠avx: Shuffles double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in imm8.
_mm256_permute_ps^⚠avx: Shuffles single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8.
_mm256_permutevar_pd^⚠avx: Shuffles double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in b.
_mm256_permutevar_ps^⚠avx: Shuffles single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in b.
_mm256_rcp_ps^⚠avx: Computes the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a, and returns the results. The maximum relative error for this approximation is less than 1.5*2^-12.
_mm256_round_pd^⚠avx: Rounds packed double-precision (64-bit) floating-point elements in a according to the rounding mode selected by the flag ROUNDING.
_mm256_round_ps^⚠avx: Rounds packed single-precision (32-bit) floating-point elements in a according to the rounding mode selected by the flag ROUNDING.
_mm256_rsqrt_ps^⚠avx: Computes the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and returns the results. The maximum relative error for this approximation is less than 1.5*2^-12.
_mm256_set1_epi8^⚠avx: Broadcasts 8-bit integer a to all elements of returned vector. This intrinsic may generate the vpbroadcastb instruction.
_mm256_set1_epi16^⚠avx: Broadcasts 16-bit integer a to all elements of returned vector. This intrinsic may generate the vpbroadcastw instruction.
_mm256_set1_epi32^⚠avx: Broadcasts 32-bit integer a to all elements of returned vector. This intrinsic may generate the vpbroadcastd instruction.
_mm256_set1_epi64x^⚠avx: Broadcasts 64-bit integer a to all elements of returned vector. This intrinsic may generate the vpbroadcastq instruction.
_mm256_set1_pd^⚠avx: Broadcasts double-precision (64-bit) floating-point value a to all elements of returned vector.
_mm256_set1_ps^⚠avx: Broadcasts single-precision (32-bit) floating-point value a to all elements of returned vector.
_mm256_set_epi8^⚠avx: Sets packed 8-bit integers in returned vector with the supplied values.
_mm256_set_epi16^⚠avx: Sets packed 16-bit integers in returned vector with the supplied values.
_mm256_set_epi32^⚠avx: Sets packed 32-bit integers in returned vector with the supplied values.
_mm256_set_epi64x^⚠avx: Sets packed 64-bit integers in returned vector with the supplied values.
_mm256_set_m128^⚠avx: Sets the returned __m256 vector from the two supplied 128-bit values.
_mm256_set_m128d^⚠avx: Sets the returned __m256d vector from the two supplied 128-bit values.
_mm256_set_m128i^⚠avx: Sets the returned __m256i vector from the two supplied 128-bit values.
_mm256_set_pd^⚠avx: Sets packed double-precision (64-bit) floating-point elements in returned vector with the supplied values.
_mm256_set_ps^⚠avx: Sets packed single-precision (32-bit) floating-point elements in returned vector with the supplied values.
_mm256_setr_epi8^⚠avx: Sets packed 8-bit integers in returned vector with the supplied values in reverse order.
_mm256_setr_epi16^⚠avx: Sets packed 16-bit integers in returned vector with the supplied values in reverse order.
_mm256_setr_epi32^⚠avx: Sets packed 32-bit integers in returned vector with the supplied values in reverse order.
_mm256_setr_epi64x^⚠avx: Sets packed 64-bit integers in returned vector with the supplied values in reverse order.
_mm256_setr_m128^⚠avx: Sets the returned __m256 vector from the two supplied 128-bit values in reverse order.
_mm256_setr_m128d^⚠avx: Sets the returned __m256d vector from the two supplied 128-bit values in reverse order.
_mm256_setr_m128i^⚠avx: Sets the returned __m256i vector from the two supplied 128-bit values in reverse order.
_mm256_setr_pd^⚠avx: Sets packed double-precision (64-bit) floating-point elements in returned vector with the supplied values in reverse order.
_mm256_setr_ps^⚠avx: Sets packed single-precision (32-bit) floating-point elements in returned vector with the supplied values in reverse order.
_mm256_setzero_pd^⚠avx: Returns vector of type __m256d with all elements set to zero.
_mm256_setzero_ps^⚠avx: Returns vector of type __m256 with all elements set to zero.
_mm256_setzero_si256^⚠avx: Returns vector of type __m256i with all elements set to zero.
_mm256_shuffle_pd^⚠avx: Shuffles double-precision (64-bit) floating-point elements within 128-bit lanes using the control in imm8.
_mm256_shuffle_ps^⚠avx: Shuffles single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8.
_mm256_sqrt_pd^⚠avx: Returns the square root of packed double-precision (64-bit) floating point elements in a.
_mm256_sqrt_ps^⚠avx: Returns the square root of packed single-precision (32-bit) floating point elements in a.
_mm256_store_pd^⚠avx: Stores 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a into memory. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
_mm256_store_ps^⚠avx: Stores 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a into memory. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
_mm256_store_si256^⚠avx: Stores 256-bits of integer data from a into memory. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
_mm256_storeu2_m128^⚠avx: Stores the high and low 128-bit halves (each composed of 4 packed single-precision (32-bit) floating-point elements) from a into two different 128-bit memory locations. hiaddr and loaddr do not need to be aligned on any particular boundary.
_mm256_storeu2_m128d^⚠avx: Stores the high and low 128-bit halves (each composed of 2 packed double-precision (64-bit) floating-point elements) from a into two different 128-bit memory locations. hiaddr and loaddr do not need to be aligned on any particular boundary.
_mm256_storeu2_m128i^⚠avx: Stores the high and low 128-bit halves (each composed of integer data) from a into two different 128-bit memory locations. hiaddr and loaddr do not need to be aligned on any particular boundary.
_mm256_storeu_pd^⚠avx: Stores 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a into memory. mem_addr does not need to be aligned on any particular boundary.
_mm256_storeu_ps^⚠avx: Stores 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a into memory. mem_addr does not need to be aligned on any particular boundary.
_mm256_storeu_si256^⚠avx: Stores 256-bits of integer data from a into memory. mem_addr does not need to be aligned on any particular boundary.
_mm256_stream_pd^⚠avx: Moves double-precision values from a 256-bit vector of [4 x double] to a 32-byte aligned memory location. To minimize caching, the data is flagged as non-temporal (unlikely to be used again soon).
_mm256_stream_ps^⚠avx: Moves single-precision floating point values from a 256-bit vector of [8 x float] to a 32-byte aligned memory location. To minimize caching, the data is flagged as non-temporal (unlikely to be used again soon).
_mm256_stream_si256^⚠avx: Moves integer data from a 256-bit integer vector to a 32-byte aligned memory location. To minimize caching, the data is flagged as non-temporal (unlikely to be used again soon).
_mm256_sub_pd^⚠avx: Subtracts packed double-precision (64-bit) floating-point elements in b from packed elements in a.
_mm256_sub_ps^⚠avx: Subtracts packed single-precision (32-bit) floating-point elements in b from packed elements in a.
_mm256_testc_pd^⚠avx: Computes the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.
_mm256_testc_ps^⚠avx: Computes the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.
_mm256_testc_si256^⚠avx: Computes the bitwise AND of 256 bits (representing integer data) in a and b, and set ZF to 1 if the result is zero, otherwise set ZF to 0. Computes the bitwise NOT of a and then AND with b, and set CF to 1 if the result is zero, otherwise set CF to 0. Return the CF value.
_mm256_testnzc_pd^⚠avx: Computes the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.
_mm256_testnzc_ps^⚠avx: Computes the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.
_mm256_testnzc_si256^⚠avx: Computes the bitwise AND of 256 bits (representing integer data) in a and b, and set ZF to 1 if the result is zero, otherwise set ZF to 0. Computes the bitwise NOT of a and then AND with b, and set CF to 1 if the result is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.
_mm256_testz_pd^⚠avx: Computes the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.
_mm256_testz_ps^⚠avx: Computes the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.
_mm256_testz_si256^⚠avx: Computes the bitwise AND of 256 bits (representing integer data) in a and b, and set ZF to 1 if the result is zero, otherwise set ZF to 0. Computes the bitwise NOT of a and then AND with b, and set CF to 1 if the result is zero, otherwise set CF to 0. Return the ZF value.
_mm256_undefined_pd^⚠avx: Returns vector of type __m256d with indeterminate elements. Despite using the word “undefined” (following Intel’s naming scheme), this non-deterministically picks some valid value and is not equivalent to mem::MaybeUninit. In practice, this is typically equivalent to mem::zeroed.
_mm256_undefined_ps^⚠avx: Returns vector of type __m256 with indeterminate elements. Despite using the word “undefined” (following Intel’s naming scheme), this non-deterministically picks some valid value and is not equivalent to mem::MaybeUninit. In practice, this is typically equivalent to mem::zeroed.
_mm256_undefined_si256^⚠avx: Returns vector of type __m256i with indeterminate elements. Despite using the word “undefined” (following Intel’s naming scheme), this non-deterministically picks some valid value and is not equivalent to mem::MaybeUninit. In practice, this is typically equivalent to mem::zeroed.
_mm256_unpackhi_pd^⚠avx: Unpacks and interleaves double-precision (64-bit) floating-point elements from the high half of each 128-bit lane in a and b.
_mm256_unpackhi_ps^⚠avx: Unpacks and interleaves single-precision (32-bit) floating-point elements from the high half of each 128-bit lane in a and b.
_mm256_unpacklo_pd^⚠avx: Unpacks and interleaves double-precision (64-bit) floating-point elements from the low half of each 128-bit lane in a and b.
_mm256_unpacklo_ps^⚠avx: Unpacks and interleaves single-precision (32-bit) floating-point elements from the low half of each 128-bit lane in a and b.
_mm256_xor_pd^⚠avx: Computes the bitwise XOR of packed double-precision (64-bit) floating-point elements in a and b.
_mm256_xor_ps^⚠avx: Computes the bitwise XOR of packed single-precision (32-bit) floating-point elements in a and b.
_mm256_zeroall^⚠avx: Zeroes the contents of all XMM or YMM registers.
_mm256_zeroupper^⚠avx: Zeroes the upper 128 bits of all YMM registers; the lower 128-bits of the registers are unmodified.
_mm256_zextpd128_pd256^⚠avx: Constructs a 256-bit floating-point vector of [4 x double] from a 128-bit floating-point vector of [2 x double]. The lower 128 bits contain the value of the source vector. The upper 128 bits are set to zero.
_mm256_zextps128_ps256^⚠avx: Constructs a 256-bit floating-point vector of [8 x float] from a 128-bit floating-point vector of [4 x float]. The lower 128 bits contain the value of the source vector. The upper 128 bits are set to zero.
_mm256_zextsi128_si256^⚠avx: Constructs a 256-bit integer vector from a 128-bit integer vector. The lower 128 bits contain the value of the source vector. The upper 128 bits are set to zero.
_mm_broadcast_ss^⚠avx: Broadcasts a single-precision (32-bit) floating-point element from memory to all elements of the returned vector.
_mm_cmp_pd^⚠avx: Compares packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by IMM5.
_mm_cmp_ps^⚠avx: Compares packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by IMM5.
_mm_cmp_sd^⚠avx: Compares the lower double-precision (64-bit) floating-point element in a and b based on the comparison operand specified by IMM5, stores the result in the lower element of returned vector, and copies the upper element from a to the upper element of returned vector.
_mm_cmp_ss^⚠avx: Compares the lower single-precision (32-bit) floating-point element in a and b based on the comparison operand specified by IMM5, stores the result in the lower element of returned vector, and copies the upper 3 packed elements from a to the upper elements of returned vector.
_mm_maskload_pd^⚠avx: Loads packed double-precision (64-bit) floating-point elements from memory into result using mask (elements are zeroed out when the high bit of the corresponding element is not set).
_mm_maskload_ps^⚠avx: Loads packed single-precision (32-bit) floating-point elements from memory into result using mask (elements are zeroed out when the high bit of the corresponding element is not set).
_mm_maskstore_pd^⚠avx: Stores packed double-precision (64-bit) floating-point elements from a into memory using mask.
_mm_maskstore_ps^⚠avx: Stores packed single-precision (32-bit) floating-point elements from a into memory using mask.
_mm_permute_pd^⚠avx: Shuffles double-precision (64-bit) floating-point elements in a using the control in imm8.
_mm_permute_ps^⚠avx: Shuffles single-precision (32-bit) floating-point elements in a using the control in imm8.
_mm_permutevar_pd^⚠avx: Shuffles double-precision (64-bit) floating-point elements in a using the control in b.
_mm_permutevar_ps^⚠avx: Shuffles single-precision (32-bit) floating-point elements in a using the control in b.
_mm_testc_pd^⚠avx: Computes the bitwise AND of 128 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.
_mm_testc_ps^⚠avx: Computes the bitwise AND of 128 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.
_mm_testnzc_pd^⚠avx: Computes the bitwise AND of 128 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.
_mm_testnzc_ps^⚠avx: Computes the bitwise AND of 128 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.
_mm_testz_pd^⚠avx: Computes the bitwise AND of 128 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.
_mm_testz_ps^⚠avx: Computes the bitwise AND of 128 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.
maskloadpd 🔒 ^⚠
maskloadpd256 🔒 ^⚠
maskloadps 🔒 ^⚠
maskloadps256 🔒 ^⚠
maskstorepd 🔒 ^⚠
maskstorepd256 🔒 ^⚠
maskstoreps 🔒 ^⚠
maskstoreps256 🔒 ^⚠
ptestc256 🔒 ^⚠
ptestnzc256 🔒 ^⚠
ptestz256 🔒 ^⚠
roundpd256 🔒 ^⚠
roundps256 🔒 ^⚠
vcmppd 🔒 ^⚠
vcmppd256 🔒 ^⚠
vcmpps 🔒 ^⚠
vcmpps256 🔒 ^⚠
vcmpsd 🔒 ^⚠
vcmpss 🔒 ^⚠
vcvtpd2dq 🔒 ^⚠
vcvtps2dq 🔒 ^⚠
vcvttpd2dq 🔒 ^⚠
vcvttps2dq 🔒 ^⚠
vdpps 🔒 ^⚠
vhaddpd 🔒 ^⚠
vhaddps 🔒 ^⚠
vhsubpd 🔒 ^⚠
vhsubps 🔒 ^⚠
vlddqu 🔒 ^⚠
vmaxpd 🔒 ^⚠
vmaxps 🔒 ^⚠
vminpd 🔒 ^⚠
vminps 🔒 ^⚠
vperm2f128pd256 🔒 ^⚠
vperm2f128ps256 🔒 ^⚠
vperm2f128si256 🔒 ^⚠
vpermilpd 🔒 ^⚠
vpermilpd256 🔒 ^⚠
vpermilps 🔒 ^⚠
vpermilps256 🔒 ^⚠
vrcpps 🔒 ^⚠
vrsqrtps 🔒 ^⚠
vtestcpd 🔒 ^⚠
vtestcpd256 🔒 ^⚠
vtestcps 🔒 ^⚠
vtestcps256 🔒 ^⚠
vtestnzcpd 🔒 ^⚠
vtestnzcpd256 🔒 ^⚠
vtestnzcps 🔒 ^⚠
vtestnzcps256 🔒 ^⚠
vtestzpd 🔒 ^⚠
vtestzpd256 🔒 ^⚠
vtestzps 🔒 ^⚠
vtestzps256 🔒 ^⚠
vzeroall 🔒 ^⚠
vzeroupper 🔒 ^⚠
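The horizontal operations listed above interleave results from a and b per 128-bit lane, which is easy to misread. The sketch below shows the element layout of `_mm256_hadd_pd` and `_mm256_hsub_pd` on four-element vectors. The helper names `hadd_hsub_f64x4` and `hadd_hsub_avx` are hypothetical, and the scalar fallback is an assumption that spells out the same layout for machines without AVX.

```rust
/// Returns (horizontal sums, horizontal differences) of adjacent pairs.
/// (Hypothetical helper; not part of the avx module.)
pub fn hadd_hsub_f64x4(a: [f64; 4], b: [f64; 4]) -> ([f64; 4], [f64; 4]) {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx") {
            // SAFETY: AVX support was verified at run time just above.
            return unsafe { hadd_hsub_avx(a, b) };
        }
    }
    // Scalar fallback spelling out the layout: results from `a` land in
    // even positions, results from `b` in odd positions.
    (
        [a[0] + a[1], b[0] + b[1], a[2] + a[3], b[2] + b[3]],
        [a[0] - a[1], b[0] - b[1], a[2] - a[3], b[2] - b[3]],
    )
}

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx")]
unsafe fn hadd_hsub_avx(a: [f64; 4], b: [f64; 4]) -> ([f64; 4], [f64; 4]) {
    #[cfg(target_arch = "x86")]
    use std::arch::x86::*;
    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;

    let va = _mm256_loadu_pd(a.as_ptr());
    let vb = _mm256_loadu_pd(b.as_ptr());

    let mut sums = [0.0f64; 4];
    let mut diffs = [0.0f64; 4];
    _mm256_storeu_pd(sums.as_mut_ptr(), _mm256_hadd_pd(va, vb));
    _mm256_storeu_pd(diffs.as_mut_ptr(), _mm256_hsub_pd(va, vb));
    (sums, diffs)
}
```

With `a = [1.0, 2.0, 3.0, 4.0]` and `b = [10.0, 20.0, 30.0, 40.0]`, the sums are `[3.0, 30.0, 7.0, 70.0]` and the differences are `[-1.0, -10.0, -1.0, -10.0]`.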
