Available on x86 or x86-64 only.
Functions
- vcvtpd2qq_128 🔒
- vcvtpd2qq_256 🔒
- vcvtpd2qq_512 🔒
- vcvtpd2uqq_128 🔒
- vcvtpd2uqq_256 🔒
- vcvtpd2uqq_512 🔒
- vcvtps2qq_128 🔒
- vcvtps2qq_256 🔒
- vcvtps2qq_512 🔒
- vcvtps2uqq_128 🔒
- vcvtps2uqq_256 🔒
- vcvtps2uqq_512 🔒
- vcvtqq2pd_128 🔒
- vcvtqq2pd_256 🔒
- vcvtqq2pd_512 🔒
- vcvtqq2ps_128 🔒
- vcvtqq2ps_256 🔒
- vcvtqq2ps_512 🔒
- vcvttpd2qq_128 🔒
- vcvttpd2qq_256 🔒
- vcvttpd2qq_512 🔒
- vcvttpd2uqq_128 🔒
- vcvttpd2uqq_256 🔒
- vcvttpd2uqq_512 🔒
- vcvttps2qq_128 🔒
- vcvttps2qq_256 🔒
- vcvttps2qq_512 🔒
- vcvttps2uqq_128 🔒
- vcvttps2uqq_256 🔒
- vcvttps2uqq_512 🔒
- vcvtuqq2pd_128 🔒
- vcvtuqq2pd_256 🔒
- vcvtuqq2pd_512 🔒
- vcvtuqq2ps_128 🔒
- vcvtuqq2ps_256 🔒
- vcvtuqq2ps_512 🔒
- vfpclasspd_128 🔒
- vfpclasspd_256 🔒
- vfpclasspd_512 🔒
- vfpclassps_128 🔒
- vfpclassps_256 🔒
- vfpclassps_512 🔒
- vfpclasssd 🔒
- vfpclassss 🔒
- vrangepd_128 🔒
- vrangepd_256 🔒
- vrangepd_512 🔒
- vrangeps_128 🔒
- vrangeps_256 🔒
- vrangeps_512 🔒
- vrangesd 🔒
- vrangess 🔒
- vreducepd_128 🔒
- vreducepd_256 🔒
- vreducepd_512 🔒
- vreduceps_128 🔒
- vreduceps_256 🔒
- vreduceps_512 🔒
- vreducesd 🔒
- vreducess 🔒
- _cvtmask8_u32 Experimental avx512dq
- Convert 8-bit mask a to a 32-bit integer value and store the result in dst.
- _cvtu32_mask8 Experimental avx512dq
- Convert 32-bit integer value a to an 8-bit mask and store the result in dst.
- _kadd_mask8 Experimental avx512dq
- Add 8-bit masks a and b, and store the result in dst.
- _kadd_mask16 Experimental avx512dq
- Add 16-bit masks a and b, and store the result in dst.
- _kand_mask8 Experimental avx512dq
- Bitwise AND of 8-bit masks a and b, and store the result in dst.
- _kandn_mask8 Experimental avx512dq
- Bitwise AND NOT of 8-bit masks a and b, and store the result in dst.
- _knot_mask8 Experimental avx512dq
- Bitwise NOT of 8-bit mask a, and store the result in dst.
- _kor_mask8 Experimental avx512dq
- Bitwise OR of 8-bit masks a and b, and store the result in dst.
- _kortest_mask8_u8 Experimental avx512dq
- Compute the bitwise OR of 8-bit masks a and b. If the result is all zeros, store 1 in dst, otherwise store 0 in dst. If the result is all ones, store 1 in all_ones, otherwise store 0 in all_ones.
- _kortestc_mask8_u8 Experimental avx512dq
- Compute the bitwise OR of 8-bit masks a and b. If the result is all ones, store 1 in dst, otherwise store 0 in dst.
- _kortestz_mask8_u8 Experimental avx512dq
- Compute the bitwise OR of 8-bit masks a and b. If the result is all zeros, store 1 in dst, otherwise store 0 in dst.
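The three `_kortest*` entries above differ only in which flag they report. A minimal scalar sketch of that behavior in plain Rust follows; `kortest_mask8_model` is a hypothetical helper written for illustration, not one of the intrinsics, and it needs no AVX-512 hardware.

```rust
/// Scalar model of the `_kortest*_mask8_u8` family (illustrative only).
/// Returns `(zero, all_ones)`:
/// - `zero` is 1 when `a | b` has no bits set (what `_kortestz_mask8_u8` reports),
/// - `all_ones` is 1 when `a | b` is 0xFF (what `_kortestc_mask8_u8` reports).
fn kortest_mask8_model(a: u8, b: u8) -> (u8, u8) {
    let or = a | b;
    let zero = (or == 0) as u8;
    let all_ones = (or == 0xFF) as u8;
    (zero, all_ones)
}
```

For example, `kortest_mask8_model(0xF0, 0x0F)` gives `(0, 1)` because the two masks together cover all eight bits.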
- _kshiftli_mask8 Experimental avx512dq
- Shift 8-bit mask a left by count bits while shifting in zeros, and store the result in dst.
- _kshiftri_mask8 Experimental avx512dq
- Shift 8-bit mask a right by count bits while shifting in zeros, and store the result in dst.
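The mask shifts above can be sketched on plain `u8` values. The helper names are hypothetical, and the rule that a count above 7 yields zero is my reading of the underlying KSHIFTLB/KSHIFTRB instructions rather than something stated in the listing, so treat it as an assumption.

```rust
/// Scalar model of `_kshiftli_mask8` (illustrative only).
/// Assumption: a shift count larger than 7 clears the whole mask.
fn kshiftli_mask8_model(a: u8, count: u32) -> u8 {
    // `checked_shl` returns None when count >= 8, which maps to an all-zero mask.
    a.checked_shl(count).unwrap_or(0)
}

/// Scalar model of `_kshiftri_mask8` (illustrative only), same assumption.
fn kshiftri_mask8_model(a: u8, count: u32) -> u8 {
    a.checked_shr(count).unwrap_or(0)
}
```

Using the checked shifts avoids the overflow panic that a bare `a << 8` on `u8` would trigger in debug builds.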
- _ktest_mask8_u8 Experimental avx512dq
- Compute the bitwise AND of 8-bit masks a and b, and if the result is all zeros, store 1 in dst, otherwise store 0 in dst. Compute the bitwise NOT of a and then AND with b, if the result is all zeros, store 1 in and_not, otherwise store 0 in and_not.
- _ktest_mask16_u8 Experimental avx512dq
- Compute the bitwise AND of 16-bit masks a and b, and if the result is all zeros, store 1 in dst, otherwise store 0 in dst. Compute the bitwise NOT of a and then AND with b, if the result is all zeros, store 1 in and_not, otherwise store 0 in and_not.
- _ktestc_mask8_u8 Experimental avx512dq
- Compute the bitwise NOT of 8-bit mask a and then AND with 8-bit mask b, if the result is all zeros, store 1 in dst, otherwise store 0 in dst.
- _ktestc_mask16_u8 Experimental avx512dq
- Compute the bitwise NOT of 16-bit mask a and then AND with 16-bit mask b, if the result is all zeros, store 1 in dst, otherwise store 0 in dst.
- _ktestz_mask8_u8 Experimental avx512dq
- Compute the bitwise AND of 8-bit masks a and b, if the result is all zeros, store 1 in dst, otherwise store 0 in dst.
- _ktestz_mask16_u8 Experimental avx512dq
- Compute the bitwise AND of 16-bit masks a and b, if the result is all zeros, store 1 in dst, otherwise store 0 in dst.
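The `_ktest*` entries above compute two independent flags from the same pair of masks. As a rough scalar sketch for the 8-bit case (the function name is hypothetical and this is not the intrinsic itself):

```rust
/// Scalar model of the `_ktest*_mask8_u8` family (illustrative only).
/// Returns `(z, c)`:
/// - `z` is 1 when `a & b` is all zeros (what `_ktestz_mask8_u8` reports),
/// - `c` is 1 when `!a & b` is all zeros (what `_ktestc_mask8_u8` reports).
fn ktest_mask8_model(a: u8, b: u8) -> (u8, u8) {
    let z = ((a & b) == 0) as u8;
    let c = ((!a & b) == 0) as u8;
    (z, c)
}
```

Read this way, `z` says the masks are disjoint and `c` says every bit set in `b` is also set in `a`.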
- _kxnor_mask8 Experimental avx512dq
- Bitwise XNOR of 8-bit masks a and b, and store the result in dst.
- _kxor_mask8 Experimental avx512dq
- Bitwise XOR of 8-bit masks a and b, and store the result in dst.
- _load_mask8 Experimental avx512dq
- Load 8-bit mask from memory.
- _mm256_broadcast_f32x2 Experimental avx512dq,avx512vl
- Broadcasts the lower 2 packed single-precision (32-bit) floating-point elements from a to all elements of dst.
- _mm256_broadcast_f64x2 Experimental avx512dq,avx512vl
- Broadcasts the 2 packed double-precision (64-bit) floating-point elements from a to all elements of dst.
- _mm256_broadcast_i32x2 Experimental avx512dq,avx512vl
- Broadcasts the lower 2 packed 32-bit integers from a to all elements of dst.
- _mm256_broadcast_i64x2 Experimental avx512dq,avx512vl
- Broadcasts the 2 packed 64-bit integers from a to all elements of dst.
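The 256-bit broadcasts above repeat a two-element pattern across the destination. A small sketch on plain arrays, assuming lane 0 is the lowest element, may make the layout concrete; both function names are hypothetical stand-ins, not the intrinsics.

```rust
/// Scalar model of `_mm256_broadcast_i32x2` (illustrative only):
/// the lower two 32-bit lanes of a 128-bit input are repeated across 8 lanes.
fn broadcast_i32x2_model(a: [i32; 4]) -> [i32; 8] {
    core::array::from_fn(|i| a[i % 2])
}

/// Scalar model of `_mm256_broadcast_i64x2` (illustrative only):
/// both 64-bit lanes of a 128-bit input are repeated across 4 lanes.
fn broadcast_i64x2_model(a: [i64; 2]) -> [i64; 4] {
    core::array::from_fn(|i| a[i % 2])
}
```

Note that the `x2` 32-bit form ignores the upper two input lanes, which is why the description says "lower 2".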
- _mm256_cvtepi64_pd Experimental avx512dq,avx512vl
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.
- _mm256_cvtepi64_ps Experimental avx512dq,avx512vl
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
- _mm256_cvtepu64_pd Experimental avx512dq,avx512vl
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.
- _mm256_cvtepu64_ps Experimental avx512dq,avx512vl
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
- _mm256_cvtpd_epi64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst.
- _mm256_cvtpd_epu64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst.
- _mm256_cvtps_epi64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst.
- _mm256_cvtps_epu64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst.
- _mm256_cvttpd_epi64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst.
- _mm256_cvttpd_epu64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst.
- _mm256_cvttps_epi64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst.
- _mm256_cvttps_epu64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst.
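The `cvt` and `cvtt` conversions above differ in how they round. The sketch below models that difference on in-range values only. It assumes the non-truncating form rounds to nearest with ties to even, which is the default hardware rounding mode but is not stated in the listing, and it does not model out-of-range or NaN inputs. All names are hypothetical.

```rust
/// Round to nearest, ties to even, written out by hand for portability.
fn round_ties_even_model(x: f64) -> f64 {
    let r = x.round(); // rounds ties away from zero
    // On an exact tie, step back toward zero if `r` landed on an odd integer.
    if (x - x.trunc()).abs() == 0.5 && r % 2.0 != 0.0 {
        r - x.signum()
    } else {
        r
    }
}

/// Scalar model of `_mm256_cvtpd_epi64` under the default rounding mode (assumption).
fn cvtpd_epi64_model(a: [f64; 4]) -> [i64; 4] {
    a.map(|x| round_ties_even_model(x) as i64)
}

/// Scalar model of `_mm256_cvttpd_epi64`: always rounds toward zero.
fn cvttpd_epi64_model(a: [f64; 4]) -> [i64; 4] {
    a.map(|x| x.trunc() as i64)
}
```

With the input `[2.5, -2.5, 1.7, -1.7]`, the first model gives `[2, -2, 2, -2]` and the truncating model gives `[2, -2, 1, -1]`.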
- _mm256_extractf64x2_pd Experimental avx512dq,avx512vl
- Extracts 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a, selected with IMM8, and stores the result in dst.
- _mm256_extracti64x2_epi64 Experimental avx512dq,avx512vl
- Extracts 128 bits (composed of 2 packed 64-bit integers) from a, selected with IMM8, and stores the result in dst.
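For the 256-bit extracts above, IMM8 picks one of two 128-bit halves. A minimal array-based sketch, assuming lane 0 is the lowest element and using a hypothetical function name:

```rust
/// Scalar model of `_mm256_extracti64x2_epi64` (illustrative only).
/// Bit 0 of IMM8 selects the lower (0) or upper (1) pair of 64-bit lanes.
fn extracti64x2_model<const IMM8: i32>(a: [i64; 4]) -> [i64; 2] {
    let half = (IMM8 & 1) as usize;
    [a[2 * half], a[2 * half + 1]]
}
```

The selector is a const generic here to mirror how the listing names it `IMM8`, a compile-time immediate rather than a runtime argument.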
- _mm256_fpclass_pd_mask Experimental avx512dq,avx512vl
- Test packed double-precision (64-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k. imm can be a combination of:
- _mm256_fpclass_ps_mask Experimental avx512dq,avx512vl
- Test packed single-precision (32-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k. imm can be a combination of:
- _mm256_insertf64x2 Experimental avx512dq,avx512vl
- Copy a to dst, then insert 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from b into dst at the location specified by IMM8.
- _mm256_inserti64x2 Experimental avx512dq,avx512vl
- Copy a to dst, then insert 128 bits (composed of 2 packed 64-bit integers) from b into dst at the location specified by IMM8.
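The inserts above are the mirror image of the extracts: copy `a`, then overwrite one 128-bit half with `b`. A small sketch on arrays, with a hypothetical function name and lane 0 assumed to be the lowest element:

```rust
/// Scalar model of `_mm256_inserti64x2` (illustrative only).
/// Bit 0 of IMM8 selects which pair of 64-bit lanes in the copy of `a`
/// is replaced by the two lanes of `b`.
fn inserti64x2_model<const IMM8: i32>(a: [i64; 4], b: [i64; 2]) -> [i64; 4] {
    let half = (IMM8 & 1) as usize;
    let mut dst = a;
    dst[2 * half] = b[0];
    dst[2 * half + 1] = b[1];
    dst
}
```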
- _mm256_mask_and_pd Experimental avx512dq,avx512vl
- Compute the bitwise AND of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_and_ps Experimental avx512dq,avx512vl
- Compute the bitwise AND of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_andnot_pd Experimental avx512dq,avx512vl
- Compute the bitwise NOT of packed double-precision (64-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_andnot_ps Experimental avx512dq,avx512vl
- Compute the bitwise NOT of packed single-precision (32-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
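The `mask_` entries above combine two ideas: a bitwise operation on the raw bits of floating-point lanes, and a writemask that falls back to `src`. The sketch below models both for the double-precision AND NOT case. The function name is hypothetical, and bit `i` of `k` is assumed to control lane `i`.

```rust
/// Scalar model of `_mm256_mask_andnot_pd` (illustrative only).
/// For each lane: if bit `i` of `k` is set, the result is `(!a) & b` on the
/// raw 64-bit patterns; otherwise the lane is copied from `src`.
fn mask_andnot_pd_model(src: [f64; 4], k: u8, a: [f64; 4], b: [f64; 4]) -> [f64; 4] {
    core::array::from_fn(|i| {
        if (k >> i) & 1 == 1 {
            f64::from_bits(!a[i].to_bits() & b[i].to_bits())
        } else {
            src[i]
        }
    })
}
```

A common use of this pattern is clearing the sign bit: with `a` set to `-0.0` in every lane, `(!a) & b` yields the absolute value of `b` in the lanes that `k` selects.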
- _mm256_mask_broadcast_f32x2 Experimental avx512dq,avx512vl
- Broadcasts the lower 2 packed single-precision (32-bit) floating-point elements from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_broadcast_f64x2 Experimental avx512dq,avx512vl
- Broadcasts the 2 packed double-precision (64-bit) floating-point elements from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_broadcast_i32x2 Experimental avx512dq,avx512vl
- Broadcasts the lower 2 packed 32-bit integers from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_broadcast_i64x2 Experimental avx512dq,avx512vl
- Broadcasts the 2 packed 64-bit integers from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_cvtepi64_pd Experimental avx512dq,avx512vl
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_cvtepi64_ps Experimental avx512dq,avx512vl
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_cvtepu64_pd Experimental avx512dq,avx512vl
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_cvtepu64_ps Experimental avx512dq,avx512vl
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_cvtpd_epi64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_cvtpd_epu64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_cvtps_epi64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_cvtps_epu64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_cvttpd_epi64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_cvttpd_epu64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_cvttps_epi64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_cvttps_epu64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_extractf64x2_pd Experimental avx512dq,avx512vl
- Extracts 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a, selected with IMM8, and stores the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_extracti64x2_epi64 Experimental avx512dq,avx512vl
- Extracts 128 bits (composed of 2 packed 64-bit integers) from a, selected with IMM8, and stores the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_fpclass_pd_mask Experimental avx512dq,avx512vl
- Test packed double-precision (64-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). imm can be a combination of:
- _mm256_mask_fpclass_ps_mask Experimental avx512dq,avx512vl
- Test packed single-precision (32-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). imm can be a combination of:
- _mm256_mask_insertf64x2 Experimental avx512dq,avx512vl
- Copy a to tmp, then insert 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from b into tmp at the location specified by IMM8, and copy tmp to dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_inserti64x2 Experimental avx512dq,avx512vl
- Copy a to tmp, then insert 128 bits (composed of 2 packed 64-bit integers) from b into tmp at the location specified by IMM8, and copy tmp to dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_mullo_epi64 Experimental avx512dq,avx512vl
- Multiply packed 64-bit integers in a and b, producing intermediate 128-bit integers, and store the low 64 bits of the intermediate integers in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_or_pd Experimental avx512dq,avx512vl
- Compute the bitwise OR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_or_ps Experimental avx512dq,avx512vl
- Compute the bitwise OR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_range_pd Experimental avx512dq,avx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm256_mask_range_ps Experimental avx512dq,avx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm256_mask_reduce_pd Experimental avx512dq,avx512vl
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm256_mask_reduce_ps Experimental avx512dq,avx512vl
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm256_mask_xor_pd Experimental avx512dq,avx512vl
- Compute the bitwise XOR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_mask_xor_ps Experimental avx512dq,avx512vl
- Compute the bitwise XOR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_maskz_and_pd Experimental avx512dq,avx512vl
- Compute the bitwise AND of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_and_ps Experimental avx512dq,avx512vl
- Compute the bitwise AND of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_andnot_pd Experimental avx512dq,avx512vl
- Compute the bitwise NOT of packed double-precision (64-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_andnot_ps Experimental avx512dq,avx512vl
- Compute the bitwise NOT of packed single-precision (32-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_broadcast_f32x2 Experimental avx512dq,avx512vl
- Broadcasts the lower 2 packed single-precision (32-bit) floating-point elements from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_broadcast_f64x2 Experimental avx512dq,avx512vl
- Broadcasts the 2 packed double-precision (64-bit) floating-point elements from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_broadcast_i32x2 Experimental avx512dq,avx512vl
- Broadcasts the lower 2 packed 32-bit integers from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_broadcast_i64x2 Experimental avx512dq,avx512vl
- Broadcasts the 2 packed 64-bit integers from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_cvtepi64_pd Experimental avx512dq,avx512vl
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_cvtepi64_ps Experimental avx512dq,avx512vl
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_cvtepu64_pd Experimental avx512dq,avx512vl
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_cvtepu64_ps Experimental avx512dq,avx512vl
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_cvtpd_epi64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_cvtpd_epu64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_cvtps_epi64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_cvtps_epu64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_cvttpd_epi64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_cvttpd_epu64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_cvttps_epi64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_cvttps_epu64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_extractf64x2_pd Experimental avx512dq,avx512vl
- Extracts 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a, selected with IMM8, and stores the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_extracti64x2_epi64 Experimental avx512dq,avx512vl
- Extracts 128 bits (composed of 2 packed 64-bit integers) from a, selected with IMM8, and stores the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_insertf64x2 Experimental avx512dq,avx512vl
- Copy a to tmp, then insert 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from b into tmp at the location specified by IMM8, and copy tmp to dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_inserti64x2 Experimental avx512dq,avx512vl
- Copy a to tmp, then insert 128 bits (composed of 2 packed 64-bit integers) from b into tmp at the location specified by IMM8, and copy tmp to dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_mullo_epi64 Experimental avx512dq,avx512vl
- Multiply packed 64-bit integers in a and b, producing intermediate 128-bit integers, and store the low 64 bits of the intermediate integers in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_or_pd Experimental avx512dq,avx512vl
- Compute the bitwise OR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_or_ps Experimental avx512dq,avx512vl
- Compute the bitwise OR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_range_pd Experimental avx512dq,avx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm256_maskz_range_ps Experimental avx512dq,avx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm256_maskz_reduce_pd Experimental avx512dq,avx512vl
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm256_maskz_reduce_ps Experimental avx512dq,avx512vl
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm256_maskz_xor_pd Experimental avx512dq,avx512vl
- Compute the bitwise XOR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_maskz_xor_ps Experimental avx512dq,avx512vl
- Compute the bitwise XOR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
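The `maskz_` entries listed above all apply the same zeromask rule: lanes whose bit in `k` is clear become zero instead of being copied from a `src` operand. The sketch below models that rule using the low-64-bit multiply as the lane operation. The function name is hypothetical, and bit `i` of `k` is assumed to control lane `i`.

```rust
/// Scalar model of `_mm256_maskz_mullo_epi64` (illustrative only).
/// Selected lanes hold the low 64 bits of `a[i] * b[i]`, which is what
/// `wrapping_mul` computes; unselected lanes are zeroed.
fn maskz_mullo_epi64_model(k: u8, a: [i64; 4], b: [i64; 4]) -> [i64; 4] {
    core::array::from_fn(|i| {
        if (k >> i) & 1 == 1 {
            a[i].wrapping_mul(b[i])
        } else {
            0
        }
    })
}
```

Compared with the writemask (`mask_`) forms, there is no `src` parameter at all, which is usually the easiest way to tell the two families apart in a signature.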
- _mm256_movepi32_mask Experimental avx512dq,avx512vl
- Set each bit of mask register k based on the most significant bit of the corresponding packed 32-bit integer in a.
- _mm256_movepi64_mask Experimental avx512dq,avx512vl
- Set each bit of mask register k based on the most significant bit of the corresponding packed 64-bit integer in a.
- _mm256_movm_epi32 Experimental avx512dq,avx512vl
- Set each packed 32-bit integer in dst to all ones or all zeros based on the value of the corresponding bit in k.
- _mm256_movm_epi64 Experimental avx512dq,avx512vl
- Set each packed 64-bit integer in dst to all ones or all zeros based on the value of the corresponding bit in k.
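The `movepi*_mask` and `movm_*` entries above convert between vectors and masks in opposite directions. A scalar sketch of the 64-bit pair follows; both names are hypothetical, and bit `i` of the mask is assumed to correspond to lane `i`.

```rust
/// Scalar model of `_mm256_movepi64_mask` (illustrative only):
/// bit `i` of the result is the sign (most significant) bit of lane `i`.
fn movepi64_mask_model(a: [i64; 4]) -> u8 {
    a.iter()
        .enumerate()
        .fold(0u8, |k, (i, &x)| k | (((x < 0) as u8) << i))
}

/// Scalar model of `_mm256_movm_epi64` (illustrative only):
/// lane `i` is all ones (-1) when bit `i` of `k` is set, else all zeros.
fn movm_epi64_model(k: u8) -> [i64; 4] {
    core::array::from_fn(|i| if (k >> i) & 1 == 1 { -1 } else { 0 })
}
```

Applying the first model to the output of the second returns the low four bits of the original mask.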
- _mm256_mullo_epi64 Experimental avx512dq,avx512vl
- Multiply packed 64-bit integers in a and b, producing intermediate 128-bit integers, and store the low 64 bits of the intermediate integers in dst.
- _mm256_range_pd Experimental avx512dq,avx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst. The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm256_range_ps Experimental avx512dq,avx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst. The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
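The IMM8 encoding described for the range entries above can be decoded per lane roughly as follows. This is a simplified sketch for ordinary finite inputs only: it does not model the hardware's special handling of NaNs, signed zeros, or operands of equal magnitude, and `range_model` is a hypothetical name rather than an intrinsic.

```rust
/// Simplified one-lane model of the range operation (illustrative only).
/// Bits 1:0 of `imm8` pick the operation, bits 3:2 pick the sign control.
fn range_model(a: f64, b: f64, imm8: u8) -> f64 {
    // Operation control: choose one of the two inputs.
    let picked = match imm8 & 0b11 {
        0b00 => if a <= b { a } else { b },             // min
        0b01 => if a >= b { a } else { b },             // max
        0b10 => if a.abs() <= b.abs() { a } else { b }, // absolute min
        _ => if a.abs() >= b.abs() { a } else { b },    // absolute max
    };
    // Sign control: decide which sign the chosen value ends up with.
    match (imm8 >> 2) & 0b11 {
        0b00 => picked.copysign(a), // sign from a
        0b01 => picked,             // sign from compare result
        0b10 => picked.abs(),       // clear sign bit
        _ => -picked.abs(),         // set sign bit
    }
}
```

For instance, `range_model(-3.0, 2.0, 0b1011)` selects the absolute max (`-3.0`) and then clears the sign, giving `3.0`, which is the usual way to get a max-of-magnitudes out of a single operation.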
- _mm256_reduce_pd Experimental avx512dq,avx512vl
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm256_reduce_ps Experimental avx512dq,avx512vl
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_and_pd Experimental avx512dq
- Compute the bitwise AND of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst.
- _mm512_and_ps ⚠ Experimental avx512dq
- Compute the bitwise AND of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst.
- _mm512_andnot_pd ⚠ Experimental avx512dq
- Compute the bitwise NOT of packed double-precision (64-bit) floating point numbers in a and then bitwise AND with b and store the results in dst.
- _mm512_andnot_ps ⚠ Experimental avx512dq
- Compute the bitwise NOT of packed single-precision (32-bit) floating point numbers in a and then bitwise AND with b and store the results in dst.
- _mm512_broadcast_f32x2 ⚠ Experimental avx512dq
- Broadcasts the lower 2 packed single-precision (32-bit) floating-point elements from a to all elements of dst.
- _mm512_broadcast_f32x8 ⚠ Experimental avx512dq
- Broadcasts the 8 packed single-precision (32-bit) floating-point elements from a to all elements of dst.
- _mm512_broadcast_f64x2 ⚠ Experimental avx512dq
- Broadcasts the 2 packed double-precision (64-bit) floating-point elements from a to all elements of dst.
- _mm512_broadcast_i32x2 ⚠ Experimental avx512dq
- Broadcasts the lower 2 packed 32-bit integers from a to all elements of dst.
- _mm512_broadcast_i32x8 ⚠ Experimental avx512dq
- Broadcasts the 8 packed 32-bit integers from a to all elements of dst.
- _mm512_broadcast_i64x2 ⚠ Experimental avx512dq
- Broadcasts the 2 packed 64-bit integers from a to all elements of dst.
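The broadcast entries above all repeat a small group of source elements across the full 512-bit destination. A minimal scalar sketch of that pattern follows, using plain arrays in place of vector registers; the `*_model` function names are hypothetical and are not the intrinsics themselves.

```rust
/// Scalar model of `broadcast_i32x2` into a 512-bit vector
/// (hypothetical helper, arrays stand in for vector registers).
fn broadcast_i32x2_model(a: [i32; 4]) -> [i32; 16] {
    let mut dst = [0i32; 16];
    for (i, lane) in dst.iter_mut().enumerate() {
        // Only the lower two elements of `a` are used.
        *lane = a[i % 2];
    }
    dst
}

/// Scalar model of `broadcast_i64x2` into a 512-bit vector
/// (hypothetical helper).
fn broadcast_i64x2_model(a: [i64; 2]) -> [i64; 8] {
    let mut dst = [0i64; 8];
    for (i, lane) in dst.iter_mut().enumerate() {
        // The 128-bit pair is repeated four times.
        *lane = a[i % 2];
    }
    dst
}
```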
- _mm512_cvt_roundepi64_pd ⚠ Experimental avx512dq
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst. Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_cvt_roundepi64_ps ⚠ Experimental avx512dq
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst. Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_cvt_roundepu64_pd ⚠ Experimental avx512dq
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst. Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_cvt_roundepu64_ps ⚠ Experimental avx512dq
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst. Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_cvt_roundpd_epi64 ⚠ Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst. Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_cvt_roundpd_epu64 ⚠ Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst. Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_cvt_roundps_epi64 ⚠ Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst. Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_cvt_roundps_epu64 ⚠ Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst. Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_cvtepi64_pd ⚠ Experimental avx512dq
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.
- _mm512_cvtepi64_ps ⚠ Experimental avx512dq
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
- _mm512_cvtepu64_pd ⚠ Experimental avx512dq
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.
- _mm512_cvtepu64_ps ⚠ Experimental avx512dq
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
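The signed (`epi64`) and unsigned (`epu64`) conversions differ only in how the same 64-bit pattern is interpreted. The sketch below shows this with ordinary Rust casts on arrays. The `*_model` names are hypothetical, and the model assumes the default round-to-nearest-even mode, which is also what Rust's `as` cast uses for integer-to-float conversion.

```rust
/// Scalar model of a signed 64-bit integer to f64 conversion
/// (hypothetical helper; assumes round-to-nearest-even).
fn cvtepi64_pd_model(a: [i64; 8]) -> [f64; 8] {
    a.map(|x| x as f64)
}

/// Scalar model of an unsigned 64-bit integer to f64 conversion
/// (hypothetical helper; assumes round-to-nearest-even).
fn cvtepu64_pd_model(a: [u64; 8]) -> [f64; 8] {
    a.map(|x| x as f64)
}
```

An all-ones bit pattern converts to `-1.0` under the signed model, and to 2^64 under the unsigned model (the nearest representable `f64` to `u64::MAX`).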
- _mm512_cvtpd_epi64 ⚠ Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst.
- _mm512_cvtpd_epu64 ⚠ Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst.
- _mm512_cvtps_epi64 ⚠ Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst.
- _mm512_cvtps_epu64 ⚠ Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst.
- _mm512_cvtt_roundpd_epi64 ⚠ Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_cvtt_roundpd_epu64 ⚠ Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_cvtt_roundps_epi64 ⚠ Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_cvtt_roundps_epu64 ⚠ Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_cvttpd_epi64 ⚠ Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst.
- _mm512_cvttpd_epu64 ⚠ Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst.
- _mm512_cvttps_epi64 ⚠ Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst.
- _mm512_cvttps_epu64 ⚠ Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst.
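The practical difference between the `cvt` and `cvtt` families is the rounding step. The scalar sketch below contrasts the two, assuming the `cvt` form runs under the default round-to-nearest-even mode. All function names here are hypothetical. The model is only meaningful for in-range, non-NaN inputs, because Rust's saturating `as` cast does not reproduce the hardware's behaviour for values that cannot be represented.

```rust
/// Round to nearest, ties to even (assumed default rounding mode).
/// Hypothetical helper written out by hand for illustration.
fn nearest_even(x: f64) -> f64 {
    let r = x.round(); // `round` breaks ties away from zero
    if (x - x.trunc()).abs() == 0.5 && r % 2.0 != 0.0 {
        r - x.signum() // step back to the even neighbour
    } else {
        r
    }
}

/// Scalar model of the rounding conversion (`cvtpd_epi64` style).
fn cvtpd_epi64_model(a: [f64; 8]) -> [i64; 8] {
    a.map(|x| nearest_even(x) as i64)
}

/// Scalar model of the truncating conversion (`cvttpd_epi64` style).
fn cvttpd_epi64_model(a: [f64; 8]) -> [i64; 8] {
    a.map(|x| x.trunc() as i64)
}
```

Under this model, `2.5` converts to `2` either way, while `3.5` becomes `4` with rounding and `3` with truncation.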
- _mm512_extractf32x8_ps ⚠ Experimental avx512dq
- Extracts 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a, selected with IMM8, and stores the result in dst.
- _mm512_extractf64x2_pd ⚠ Experimental avx512dq
- Extracts 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a, selected with IMM8, and stores the result in dst.
- _mm512_extracti32x8_epi32 ⚠ Experimental avx512dq
- Extracts 256 bits (composed of 8 packed 32-bit integers) from a, selected with IMM8, and stores the result in dst.
- _mm512_extracti64x2_epi64 ⚠ Experimental avx512dq
- Extracts 128 bits (composed of 2 packed 64-bit integers) from a, selected with IMM8, and stores the result in dst.
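The extract entries above select one fixed-size block of the 512-bit source by index. A minimal scalar sketch follows, assuming blocks are numbered from the lowest elements upward. The `*_model` names are hypothetical, and `imm8` is an ordinary argument here rather than a compile-time constant.

```rust
/// Scalar model of `extracti64x2`: pick one of four 128-bit blocks
/// (hypothetical helper, arrays stand in for vector registers).
fn extract_i64x2_model(a: [i64; 8], imm8: usize) -> [i64; 2] {
    let base = (imm8 & 0b11) * 2; // two 64-bit elements per 128-bit block
    [a[base], a[base + 1]]
}

/// Scalar model of `extracti32x8`: pick one of two 256-bit halves
/// (hypothetical helper).
fn extract_i32x8_model(a: [i32; 16], imm8: usize) -> [i32; 8] {
    let base = (imm8 & 0b1) * 8; // eight 32-bit elements per 256-bit half
    let mut dst = [0i32; 8];
    dst.copy_from_slice(&a[base..base + 8]);
    dst
}
```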
- _mm512_fpclass_pd_mask ⚠ Experimental avx512dq
- Test packed double-precision (64-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k. imm can be a combination of:
- _mm512_fpclass_ps_mask ⚠ Experimental avx512dq
- Test packed single-precision (32-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k. imm can be a combination of:
- _mm512_insertf32x8 ⚠ Experimental avx512dq
- Copy a to dst, then insert 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) from b into dst at the location specified by IMM8.
- _mm512_insertf64x2 ⚠ Experimental avx512dq
- Copy a to dst, then insert 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from b into dst at the location specified by IMM8.
- _mm512_inserti32x8 ⚠ Experimental avx512dq
- Copy a to dst, then insert 256 bits (composed of 8 packed 32-bit integers) from b into dst at the location specified by IMM8.
- _mm512_inserti64x2 ⚠ Experimental avx512dq
- Copy a to dst, then insert 128 bits (composed of 2 packed 64-bit integers) from b into dst at the location specified by IMM8.
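The insert entries above copy `a` and then overwrite one block with `b`. The scalar sketch below models the 128-bit integer case, assuming blocks are numbered from the lowest elements upward; `insert_i64x2_model` is a hypothetical name, not the intrinsic.

```rust
/// Scalar model of `inserti64x2`: copy `a`, then overwrite one of the
/// four 128-bit blocks with `b` (hypothetical helper).
fn insert_i64x2_model(a: [i64; 8], b: [i64; 2], imm8: usize) -> [i64; 8] {
    let mut dst = a; // copy a to dst
    let base = (imm8 & 0b11) * 2; // two 64-bit elements per 128-bit block
    dst[base..base + 2].copy_from_slice(&b);
    dst
}
```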
- _mm512_mask_and_pd ⚠ Experimental avx512dq
- Compute the bitwise AND of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_and_ps ⚠ Experimental avx512dq
- Compute the bitwise AND of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_andnot_pd ⚠ Experimental avx512dq
- Compute the bitwise NOT of packed double-precision (64-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_andnot_ps ⚠ Experimental avx512dq
- Compute the bitwise NOT of packed single-precision (32-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
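Writemask semantics can be sketched in scalar form: each bit of `k` decides whether a lane receives the computed value or keeps the value from `src`. The example below models the masked AND NOT on double-precision lanes by operating on their raw bit patterns. The name `mask_andnot_pd_model` is hypothetical, and arrays stand in for vector registers.

```rust
/// Scalar model of `mask_andnot_pd` (hypothetical helper).
/// Lane i is `(!a[i]) & b[i]` on the raw bits if bit i of `k` is set,
/// otherwise it is copied from `src`.
fn mask_andnot_pd_model(src: [f64; 8], k: u8, a: [f64; 8], b: [f64; 8]) -> [f64; 8] {
    let mut dst = src;
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            dst[i] = f64::from_bits(!a[i].to_bits() & b[i].to_bits());
        }
    }
    dst
}
```

A common use of this pattern is clearing the sign bit: passing `-0.0` (only the sign bit set) as `a` yields the absolute value of `b` in every selected lane.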
- _mm512_mask_broadcast_f32x2 ⚠ Experimental avx512dq
- Broadcasts the lower 2 packed single-precision (32-bit) floating-point elements from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_broadcast_f32x8 ⚠ Experimental avx512dq
- Broadcasts the 8 packed single-precision (32-bit) floating-point elements from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_broadcast_f64x2 ⚠ Experimental avx512dq
- Broadcasts the 2 packed double-precision (64-bit) floating-point elements from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_broadcast_i32x2 ⚠ Experimental avx512dq
- Broadcasts the lower 2 packed 32-bit integers from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_broadcast_i32x8 ⚠ Experimental avx512dq
- Broadcasts the 8 packed 32-bit integers from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_broadcast_i64x2 ⚠ Experimental avx512dq
- Broadcasts the 2 packed 64-bit integers from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_cvt_roundepi64_pd ⚠ Experimental avx512dq
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_mask_cvt_roundepi64_ps ⚠ Experimental avx512dq
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_mask_cvt_roundepu64_pd ⚠ Experimental avx512dq
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_mask_cvt_roundepu64_ps ⚠ Experimental avx512dq
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_mask_cvt_roundpd_epi64 ⚠ Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_mask_cvt_roundpd_epu64 ⚠ Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_mask_cvt_roundps_epi64 ⚠ Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_mask_cvt_roundps_epu64 ⚠ Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_mask_cvtepi64_pd ⚠ Experimental avx512dq
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_cvtepi64_ps ⚠ Experimental avx512dq
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_cvtepu64_pd ⚠ Experimental avx512dq
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_cvtepu64_ps ⚠ Experimental avx512dq
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_cvtpd_epi64 ⚠ Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_cvtpd_epu64 ⚠ Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_cvtps_epi64 ⚠ Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_cvtps_epu64 ⚠ Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_cvtt_roundpd_epi64 ⚠ Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_mask_cvtt_roundpd_epu64 ⚠ Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_mask_cvtt_roundps_epi64 ⚠ Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_mask_cvtt_roundps_epu64 ⚠ Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_mask_cvttpd_epi64 ⚠ Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_cvttpd_epu64 ⚠ Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_cvttps_epi64 ⚠ Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_cvttps_epu64 ⚠ Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_extractf32x8_ps ⚠ Experimental avx512dq
- Extracts 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a, selected with IMM8, and stores the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_extractf64x2_pd ⚠ Experimental avx512dq
- Extracts 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a, selected with IMM8, and stores the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_extracti32x8_epi32 ⚠ Experimental avx512dq
- Extracts 256 bits (composed of 8 packed 32-bit integers) from a, selected with IMM8, and stores the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_extracti64x2_epi64 ⚠ Experimental avx512dq
- Extracts 128 bits (composed of 2 packed 64-bit integers) from a, selected with IMM8, and stores the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_fpclass_pd_mask ⚠ Experimental avx512dq
- Test packed double-precision (64-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). imm can be a combination of:
- _mm512_mask_fpclass_ps_mask ⚠ Experimental avx512dq
- Test packed single-precision (32-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). imm can be a combination of:
- _mm512_mask_insertf32x8 ⚠ Experimental avx512dq
- Copy a to tmp, then insert 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) from b into tmp at the location specified by IMM8, and copy tmp to dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_insertf64x2 ⚠ Experimental avx512dq
- Copy a to tmp, then insert 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from b into tmp at the location specified by IMM8, and copy tmp to dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_inserti32x8 ⚠ Experimental avx512dq
- Copy a to tmp, then insert 256 bits (composed of 8 packed 32-bit integers) from b into tmp at the location specified by IMM8, and copy tmp to dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_inserti64x2 ⚠ Experimental avx512dq
- Copy a to tmp, then insert 128 bits (composed of 2 packed 64-bit integers) from b into tmp at the location specified by IMM8, and copy tmp to dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_mullo_epi64 ⚠ Experimental avx512dq
- Multiply packed 64-bit integers in a and b, producing intermediate 128-bit integers, and store the low 64 bits of the intermediate integers in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_or_pd ⚠ Experimental avx512dq
- Compute the bitwise OR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_or_ps ⚠ Experimental avx512dq
- Compute the bitwise OR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_range_pd ⚠ Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on the control in IMM8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm512_mask_range_ps ⚠ Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on the control in IMM8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm512_mask_range_round_pd ⚠ Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on the control in IMM8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_mask_range_round_ps ⚠ Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on the control in IMM8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm512_mask_reduce_pd ⚠ Experimental avx512dq
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_mask_reduce_ps ⚠ Experimental avx512dq
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_mask_reduce_round_pd ⚠ Experimental avx512dq
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_mask_reduce_round_ps ⚠ Experimental avx512dq
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_mask_xor_pd ⚠ Experimental avx512dq
- Compute the bitwise XOR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_mask_xor_ps ⚠ Experimental avx512dq
- Compute the bitwise XOR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_maskz_and_pd ⚠ Experimental avx512dq
- Compute the bitwise AND of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_and_ps ⚠ Experimental avx512dq
- Compute the bitwise AND of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_andnot_pd ⚠ Experimental avx512dq
- Compute the bitwise NOT of packed double-precision (64-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_andnot_ps ⚠ Experimental avx512dq
- Compute the bitwise NOT of packed single-precision (32-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
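Zeromask variants differ from writemask variants only in what an unselected lane receives: zero instead of a value from `src`. The scalar sketch below models the zero-masked AND on double-precision lanes; `maskz_and_pd_model` is a hypothetical name, and arrays stand in for vector registers.

```rust
/// Scalar model of `maskz_and_pd` (hypothetical helper).
/// Lane i is `a[i] & b[i]` on the raw bits if bit i of `k` is set,
/// otherwise it is zeroed.
fn maskz_and_pd_model(k: u8, a: [f64; 8], b: [f64; 8]) -> [f64; 8] {
    let mut dst = [0.0f64; 8]; // unselected lanes stay zero
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            dst[i] = f64::from_bits(a[i].to_bits() & b[i].to_bits());
        }
    }
    dst
}
```

With `b` set to a pattern that has every bit except the sign bit set, the selected lanes hold the absolute value of `a` and every other lane is `0.0`.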
- _mm512_maskz_broadcast_f32x2 ⚠ Experimental avx512dq
- Broadcasts the lower 2 packed single-precision (32-bit) floating-point elements from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_broadcast_f32x8 ⚠ Experimental avx512dq
- Broadcasts the 8 packed single-precision (32-bit) floating-point elements from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_broadcast_f64x2 ⚠ Experimental avx512dq
- Broadcasts the 2 packed double-precision (64-bit) floating-point elements from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_broadcast_i32x2 ⚠ Experimental avx512dq
- Broadcasts the lower 2 packed 32-bit integers from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_broadcast_i32x8 ⚠ Experimental avx512dq
- Broadcasts the 8 packed 32-bit integers from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_broadcast_i64x2 ⚠ Experimental avx512dq
- Broadcasts the 2 packed 64-bit integers from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_cvt_roundepi64_pd ⚠ Experimental avx512dq
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_maskz_cvt_roundepi64_ps ⚠ Experimental avx512dq
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_maskz_cvt_roundepu64_pd ⚠ Experimental avx512dq
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_maskz_cvt_roundepu64_ps ⚠ Experimental avx512dq
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_maskz_cvt_roundpd_epi64 ⚠ Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_maskz_cvt_roundpd_epu64 ⚠ Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_maskz_cvt_roundps_epi64 ⚠ Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_maskz_cvt_roundps_epu64 Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_maskz_cvtepi64_pd Experimental avx512dq
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_cvtepi64_ps Experimental avx512dq
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_cvtepu64_pd Experimental avx512dq
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_cvtepu64_ps Experimental avx512dq
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_cvtpd_epi64 Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_cvtpd_epu64 Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_cvtps_epi64 Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_cvtps_epu64 Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
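The zeromask behaviour shared by the `maskz` conversions above can be sketched as a scalar model. This is plain Rust, not a call to the intrinsic: the name `maskz_cvtepi64_pd_model` and the use of arrays to stand in for 512-bit vectors are assumptions made for illustration only.

```rust
/// Hypothetical scalar model of a zeromasked i64 -> f64 conversion over 8 lanes.
/// Lane `i` receives the converted value when bit `i` of `k` is set, else 0.0.
fn maskz_cvtepi64_pd_model(k: u8, a: [i64; 8]) -> [f64; 8] {
    let mut dst = [0.0f64; 8];
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            // `as f64` rounds to nearest, ties to even; this model does not
            // cover any other rounding mode.
            dst[i] = a[i] as f64;
        }
    }
    dst
}
```

With `k = 0b0000_0101`, only lanes 0 and 2 carry converted values; every other lane is zero.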
- _mm512_maskz_cvtt_roundpd_epi64 Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_maskz_cvtt_roundpd_epu64 Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_maskz_cvtt_roundps_epi64 Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_maskz_cvtt_roundps_epu64 Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_maskz_cvttpd_epi64 Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_cvttpd_epu64 Experimental avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_cvttps_epi64 Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_cvttps_epu64 Experimental avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
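"With truncation" in the `cvtt` entries above means rounding toward zero. A minimal scalar sketch follows; the function name and array representation are hypothetical, and the model only covers finite inputs that fit in an `i64`. What the hardware returns for NaN or out-of-range inputs is deliberately not modelled here.

```rust
/// Hypothetical scalar model of a zeromasked f64 -> i64 conversion with truncation.
/// Assumes every selected input is finite and representable as an i64.
fn maskz_cvttpd_epi64_model(k: u8, a: [f64; 8]) -> [i64; 8] {
    let mut dst = [0i64; 8];
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            // Drop the fractional part (round toward zero), then convert.
            dst[i] = a[i].trunc() as i64;
        }
    }
    dst
}
```

Note that `-1.9` truncates to `-1`, not `-2`, and that a cleared mask bit yields `0` regardless of the input lane.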
- _mm512_maskz_extractf32x8_ps Experimental avx512dq
- Extracts 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a, selected with IMM8, and stores the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_extractf64x2_pd Experimental avx512dq
- Extracts 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a, selected with IMM8, and stores the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_extracti32x8_epi32 Experimental avx512dq
- Extracts 256 bits (composed of 8 packed 32-bit integers) from a, selected with IMM8, and stores the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_extracti64x2_epi64 Experimental avx512dq
- Extracts 128 bits (composed of 2 packed 64-bit integers) from a, selected with IMM8, and stores the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
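The extract entries above select one fixed-size chunk of the source by index and then apply the zeromask to the extracted lanes. A scalar sketch for the 64x2 integer case, with a hypothetical function name and arrays standing in for vectors, and with `imm8` passed as a runtime value rather than a const generic:

```rust
/// Hypothetical scalar model: extract the 128-bit chunk `imm8` (two i64 lanes)
/// from an 8-lane source, then zero the lanes whose mask bit is clear.
fn maskz_extracti64x2_model(k: u8, a: [i64; 8], imm8: usize) -> [i64; 2] {
    assert!(imm8 < 4, "a 512-bit vector holds four 128-bit chunks");
    let mut dst = [0i64; 2];
    for j in 0..2 {
        if (k >> j) & 1 == 1 {
            dst[j] = a[imm8 * 2 + j];
        }
    }
    dst
}
```

The mask bits index the lanes of the extracted result, not the lanes of the 512-bit source.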
- _mm512_maskz_insertf32x8 Experimental avx512dq
- Copy a to tmp, then insert 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) from b into tmp at the location specified by IMM8, and copy tmp to dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_insertf64x2 Experimental avx512dq
- Copy a to tmp, then insert 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from b into tmp at the location specified by IMM8, and copy tmp to dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_inserti32x8 Experimental avx512dq
- Copy a to tmp, then insert 256 bits (composed of 8 packed 32-bit integers) from b into tmp at the location specified by IMM8, and copy tmp to dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_inserti64x2 Experimental avx512dq
- Copy a to tmp, then insert 128 bits (composed of 2 packed 64-bit integers) from b into tmp at the location specified by IMM8, and copy tmp to dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
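The "copy a to tmp, insert b, then mask" sequence described for the insert entries can be written out step by step. The sketch below models the 64x2 integer case; the function name is hypothetical and `imm8` is a runtime value here purely for illustration.

```rust
/// Hypothetical scalar model of a zeromasked 128-bit insert into a 512-bit vector.
fn maskz_inserti64x2_model(k: u8, a: [i64; 8], b: [i64; 2], imm8: usize) -> [i64; 8] {
    assert!(imm8 < 4, "a 512-bit vector holds four 128-bit chunks");
    // Step 1: copy a to tmp.
    let mut tmp = a;
    // Step 2: overwrite the chunk selected by imm8 with b.
    tmp[imm8 * 2] = b[0];
    tmp[imm8 * 2 + 1] = b[1];
    // Step 3: copy tmp to dst under the zeromask.
    let mut dst = [0i64; 8];
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            dst[i] = tmp[i];
        }
    }
    dst
}
```

Here the mask spans all eight result lanes, so it can zero both inserted and untouched elements.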
- _mm512_maskz_mullo_epi64 Experimental avx512dq
- Multiply packed 64-bit integers in a and b, producing intermediate 128-bit integers, and store the low 64 bits of the intermediate integers in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_or_pd Experimental avx512dq
- Compute the bitwise OR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_or_ps Experimental avx512dq
- Compute the bitwise OR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_range_pd Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm512_maskz_range_ps Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm512_maskz_range_round_pd Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_maskz_range_round_ps Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
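The two IMM8 fields described for the range entries can be decoded as in the scalar sketch below. It assumes IMM8 is a 4-bit value whose bits 1:0 are the operation control and whose bits 3:2 are the sign control, and it covers ordinary finite inputs only: NaN handling and the special cases for zeros or equal magnitudes with opposite signs are not modelled. The function name is hypothetical.

```rust
/// Hypothetical scalar model of the range operation for one pair of f64 lanes.
/// Not modelled: NaN inputs, signed zeros, equal-magnitude opposite-sign pairs.
fn range_model(a: f64, b: f64, imm8: u8) -> f64 {
    // Bits 1:0: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max.
    let tmp = match imm8 & 0b11 {
        0b00 => if a <= b { a } else { b },
        0b01 => if a <= b { b } else { a },
        0b10 => if a.abs() <= b.abs() { a } else { b },
        _ => if a.abs() <= b.abs() { b } else { a },
    };
    let sign_bit = 1u64 << 63;
    let magnitude = tmp.to_bits() & !sign_bit;
    // Bits 3:2: 00 = sign from a, 01 = sign from compare result,
    //           10 = clear sign bit, 11 = set sign bit.
    let bits = match (imm8 >> 2) & 0b11 {
        0b00 => (a.to_bits() & sign_bit) | magnitude,
        0b01 => tmp.to_bits(),
        0b10 => magnitude,
        _ => sign_bit | magnitude,
    };
    f64::from_bits(bits)
}
```

One consequence worth noticing: with sign control `00` the result takes its sign from `a` even when the selected value came from `b`, so `range_model(-3.0, 2.0, 0b0001)` selects the max `2.0` but returns `-2.0`.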
- _mm512_maskz_reduce_pd Experimental avx512dq
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_maskz_reduce_ps Experimental avx512dq
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_maskz_reduce_round_pd Experimental avx512dq
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_maskz_reduce_round_ps Experimental avx512dq
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_maskz_xor_pd Experimental avx512dq
- Compute the bitwise XOR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_maskz_xor_ps Experimental avx512dq
- Compute the bitwise XOR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_movepi32_mask Experimental avx512dq
- Set each bit of mask register k based on the most significant bit of the corresponding packed 32-bit integer in a.
- _mm512_movepi64_mask Experimental avx512dq
- Set each bit of mask register k based on the most significant bit of the corresponding packed 64-bit integer in a.
- _mm512_movm_epi32 Experimental avx512dq
- Set each packed 32-bit integer in dst to all ones or all zeros based on the value of the corresponding bit in k.
- _mm512_movm_epi64 Experimental avx512dq
- Set each packed 64-bit integer in dst to all ones or all zeros based on the value of the corresponding bit in k.
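The `movepi64_mask` and `movm_epi64` entries above describe opposite directions of the same mapping between sign bits and mask bits. A scalar sketch, with hypothetical function names and arrays in place of vectors:

```rust
/// Hypothetical model: bit i of the mask is the most significant bit of lane i.
fn movepi64_mask_model(a: [i64; 8]) -> u8 {
    let mut k = 0u8;
    for i in 0..8 {
        // For a two's-complement i64 the most significant bit is set
        // exactly when the value is negative.
        if a[i] < 0 {
            k |= 1 << i;
        }
    }
    k
}

/// Hypothetical model: lane i is all ones (-1) if bit i of k is set, else all zeros.
fn movm_epi64_model(k: u8) -> [i64; 8] {
    let mut dst = [0i64; 8];
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            dst[i] = -1;
        }
    }
    dst
}
```

In this model, expanding a mask with `movm_epi64_model` and collapsing it again with `movepi64_mask_model` returns the original mask.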
- _mm512_mullo_epi64 Experimental avx512dq
- Multiply packed 64-bit integers in a and b, producing intermediate 128-bit integers, and store the low 64 bits of the intermediate integers in dst.
- _mm512_or_pd Experimental avx512dq
- Compute the bitwise OR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst.
- _mm512_or_ps Experimental avx512dq
- Compute the bitwise OR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst.
- _mm512_range_pd Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst. Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm512_range_ps Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst. Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm512_range_round_pd Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst. Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_range_round_ps Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst. Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_reduce_pd Experimental avx512dq
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_reduce_ps Experimental avx512dq
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_reduce_round_pd Experimental avx512dq
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_reduce_round_ps Experimental avx512dq
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst. Rounding is done according to the imm8 parameter, which can be one of:
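The "reduced argument" in the reduce entries above is what remains after subtracting the input rounded to a fixed number of fraction bits. The scalar sketch below rests on an assumption about the IMM8 layout that this listing does not spell out: bits 7:4 give the number of fraction bits M to keep, and bits 1:0 select the rounding (0 = nearest even, 1 = toward negative infinity, 2 = toward positive infinity, 3 = toward zero). Bit 2 (use the current rounding direction) and exception suppression are ignored, as are NaN inputs. All names are hypothetical.

```rust
/// Round to the nearest integer, with ties going to the even neighbour.
fn round_ties_even_model(x: f64) -> f64 {
    let r = x.round(); // ties away from zero
    if (x - x.trunc()).abs() == 0.5 && r % 2.0 != 0.0 {
        r - x.signum() // step back to the even neighbour
    } else {
        r
    }
}

/// Hypothetical scalar model: x - round(x * 2^M) / 2^M, with M = imm8[7:4]
/// and the rounding chosen by imm8[1:0]. NaN inputs are not modelled.
fn reduce_model(x: f64, imm8: u8) -> f64 {
    let scale = 2f64.powi((imm8 >> 4) as i32);
    let scaled = x * scale;
    let rounded = match imm8 & 0b11 {
        0 => round_ties_even_model(scaled),
        1 => scaled.floor(),
        2 => scaled.ceil(),
        _ => scaled.trunc(),
    };
    x - rounded / scale
}
```

For `x = 1.75` with M = 1, truncation rounds to 1.5 and leaves 0.25, while round-to-nearest rounds to 2.0 and leaves -0.25, so the sign of the result depends on the rounding control.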
- _mm512_xor_pd Experimental avx512dq
- Compute the bitwise XOR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst.
- _mm512_xor_ps Experimental avx512dq
- Compute the bitwise XOR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst.
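The bitwise OR and XOR entries above operate on the raw bit patterns of the floating-point elements, not on their numeric values. A scalar sketch for one f64 lane, using hypothetical function names:

```rust
/// Hypothetical model: XOR of the IEEE 754 bit patterns of two f64 values.
fn xor_pd_model(a: f64, b: f64) -> f64 {
    f64::from_bits(a.to_bits() ^ b.to_bits())
}

/// Hypothetical model: OR of the IEEE 754 bit patterns of two f64 values.
fn or_pd_model(a: f64, b: f64) -> f64 {
    f64::from_bits(a.to_bits() | b.to_bits())
}
```

Because `-0.0` has only the sign bit set, XOR with `-0.0` flips the sign of a value and OR with `-0.0` forces the result to be negative; this is a common use of these operations.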
- _mm_broadcast_i32x2 Experimental avx512dq,avx512vl
- Broadcasts the lower 2 packed 32-bit integers from a to all elements of dst.
- _mm_cvtepi64_pd Experimental avx512dq,avx512vl
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.
- _mm_cvtepi64_ps Experimental avx512dq,avx512vl
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
- _mm_cvtepu64_pd Experimental avx512dq,avx512vl
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.
- _mm_cvtepu64_ps Experimental avx512dq,avx512vl
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
- _mm_cvtpd_epi64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst.
- _mm_cvtpd_epu64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst.
- _mm_cvtps_epi64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst.
- _mm_cvtps_epu64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst.
- _mm_cvttpd_epi64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst.
- _mm_cvttpd_epu64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst.
- _mm_cvttps_epi64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst.
- _mm_cvttps_epu64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst.
- _mm_fpclass_pd_mask Experimental avx512dq,avx512vl
- Test packed double-precision (64-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k. imm can be a combination of:
- _mm_fpclass_ps_mask Experimental avx512dq,avx512vl
- Test packed single-precision (32-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k. imm can be a combination of:
- _mm_fpclass_sd_mask Experimental avx512dq
- Test the lower double-precision (64-bit) floating-point element in a for special categories specified by imm8, and store the results in mask vector k. imm can be a combination of:
- _mm_fpclass_ss_mask Experimental avx512dq
- Test the lower single-precision (32-bit) floating-point element in a for special categories specified by imm8, and store the results in mask vector k. imm can be a combination of:
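The category list for the fpclass entries above is cut off in this listing. The scalar sketch below assumes the category bits follow Intel's published description of the underlying instruction: 0x01 QNaN, 0x02 positive zero, 0x04 negative zero, 0x08 positive infinity, 0x10 negative infinity, 0x20 denormal, 0x40 negative finite, 0x80 SNaN. Treat both the bit assignments and every name in the block as assumptions to check against the per-function documentation. The denormals-are-zero mode is not modelled.

```rust
// Assumed category bits (not taken from this listing).
const QNAN: u8 = 0x01;
const POS_ZERO: u8 = 0x02;
const NEG_ZERO: u8 = 0x04;
const POS_INF: u8 = 0x08;
const NEG_INF: u8 = 0x10;
const DENORMAL: u8 = 0x20;
const NEGATIVE: u8 = 0x40;
const SNAN: u8 = 0x80;

/// Hypothetical scalar model: true if `x` falls in any category selected by `imm8`.
fn fpclass_sd_model(x: f64, imm8: u8) -> bool {
    let bits = x.to_bits();
    let neg = (bits >> 63) == 1;
    let exp = (bits >> 52) & 0x7ff;
    let mant = bits & ((1u64 << 52) - 1);
    let quiet = (mant >> 51) & 1 == 1;
    let is_nan = exp == 0x7ff && mant != 0;
    let is_inf = exp == 0x7ff && mant == 0;
    let is_zero = exp == 0 && mant == 0;

    let mut class = 0u8;
    if is_nan && quiet { class |= QNAN; }
    if is_nan && !quiet { class |= SNAN; }
    if is_zero && !neg { class |= POS_ZERO; }
    if is_zero && neg { class |= NEG_ZERO; }
    if is_inf && !neg { class |= POS_INF; }
    if is_inf && neg { class |= NEG_INF; }
    if exp == 0 && mant != 0 { class |= DENORMAL; }
    if neg && exp != 0x7ff && !is_zero { class |= NEGATIVE; }
    (class & imm8) != 0
}
```

Categories combine with bitwise OR, so `POS_INF | NEG_INF` tests for an infinity of either sign.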
- _mm_mask_and_pd Experimental avx512dq,avx512vl
- Compute the bitwise AND of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_mask_and_ps Experimental avx512dq,avx512vl
- Compute the bitwise AND of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_mask_andnot_pd Experimental avx512dq,avx512vl
- Compute the bitwise NOT of packed double-precision (64-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_mask_andnot_ps Experimental avx512dq,avx512vl
- Compute the bitwise NOT of packed single-precision (32-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_mask_broadcast_i32x2 Experimental avx512dq,avx512vl
- Broadcasts the lower 2 packed 32-bit integers from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
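Unlike the zeromask forms, a writemask keeps the corresponding element of `src` wherever the mask bit is clear. The sketch below models the masked broadcast entry above for the 128-bit case; the function name and array representation are hypothetical.

```rust
/// Hypothetical scalar model: repeat the lower two i32 lanes of `a` across all
/// four lanes, keeping `src[i]` wherever bit `i` of `k` is clear.
fn mask_broadcast_i32x2_model(src: [i32; 4], k: u8, a: [i32; 4]) -> [i32; 4] {
    let mut dst = src;
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            // Lanes 0 and 2 take a[0]; lanes 1 and 3 take a[1].
            dst[i] = a[i % 2];
        }
    }
    dst
}
```

The upper two lanes of `a` never reach the result, and lane 1 in the example `k = 0b1101` keeps its `src` value.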
- _mm_mask_cvtepi64_pd Experimental avx512dq,avx512vl
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_mask_cvtepi64_ps Experimental avx512dq,avx512vl
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_mask_cvtepu64_pd Experimental avx512dq,avx512vl
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_mask_cvtepu64_ps Experimental avx512dq,avx512vl
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_mask_cvtpd_epi64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_mask_cvtpd_epu64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_mask_cvtps_epi64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_mask_cvtps_epu64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_mask_cvttpd_epi64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_mask_cvttpd_epu64 Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_mask_cvttps_epi64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_mask_cvttps_epu64 Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
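In the 128-bit single-precision to 64-bit integer conversions above, the result holds only two 64-bit lanes, so only the lower two single-precision elements of the source are converted. The sketch below models the truncating signed form with a writemask; the function name is hypothetical, and inputs are assumed finite and within `i64` range.

```rust
/// Hypothetical scalar model of a writemasked f32 -> i64 truncating conversion.
/// Only a[0] and a[1] are read; a[2] and a[3] do not affect the result.
fn mask_cvttps_epi64_model(src: [i64; 2], k: u8, a: [f32; 4]) -> [i64; 2] {
    let mut dst = src;
    for j in 0..2 {
        if (k >> j) & 1 == 1 {
            dst[j] = a[j].trunc() as i64;
        }
    }
    dst
}
```

With `k = 0b01`, lane 0 is converted and lane 1 is copied unchanged from `src`.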
- _mm_mask_fpclass_pd_mask Experimental avx512dq,avx512vl
- Test packed double-precision (64-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). imm can be a combination of:
- _mm_mask_fpclass_ps_mask Experimental avx512dq,avx512vl
- Test packed single-precision (32-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). imm can be a combination of:
- _mm_mask_fpclass_sd_mask Experimental avx512dq
- Test the lower double-precision (64-bit) floating-point element in a for special categories specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). imm can be a combination of:
- _mm_mask_fpclass_ss_mask Experimental avx512dq
- Test the lower single-precision (32-bit) floating-point element in a for special categories specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). imm can be a combination of:
- _mm_mask_mullo_epi64 Experimental avx512dq,avx512vl
- Multiply packed 64-bit integers in a and b, producing intermediate 128-bit integers, and store the low 64 bits of the intermediate integers in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_mask_or_pd Experimental avx512dq,avx512vl
- Compute the bitwise OR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_mask_or_ps Experimental avx512dq,avx512vl
- Compute the bitwise OR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_mask_range_pd Experimental avx512dq,avx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm_mask_range_ps Experimental avx512dq,avx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm_mask_range_round_sd Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for the lower double-precision (64-bit) floating-point element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_mask_range_round_ss Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_mask_range_sd Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for the lower double-precision (64-bit) floating-point element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm_mask_range_ss Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
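The scalar (`sd`/`ss`) range entries above share one layout rule: the operation runs on the lower element only, mask bit 0 decides between the computed value and `src`, and the upper element comes from `a`. The sketch below isolates that rule for the double-precision case and takes the per-element operation as a closure, so it models the data movement rather than the range operation itself. The function name is hypothetical.

```rust
/// Hypothetical scalar model of the lower-element merge used by the `sd` forms.
/// `op` stands in for whatever the instruction computes on the lower elements.
fn mask_lower_sd_model(
    src: [f64; 2],
    k: u8,
    a: [f64; 2],
    b: [f64; 2],
    op: impl Fn(f64, f64) -> f64,
) -> [f64; 2] {
    // Only mask bit 0 matters for a scalar operation.
    let lower = if k & 1 == 1 { op(a[0], b[0]) } else { src[0] };
    // The upper element is always copied from a, never from src or b.
    [lower, a[1]]
}
```

Passing `f64::min` as `op` with `k = 1` yields the smaller lower element paired with `a[1]`; with `k = 0` the lower element falls back to `src[0]`.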
- _mm_mask_reduce_pd Experimental avx512dq,avx512vl
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm_mask_reduce_ps Experimental avx512dq,avx512vl
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm_mask_reduce_round_sd ⚠ Experimental avx512dq
- Extract the reduced argument of the lower double-precision (64-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_mask_reduce_round_ss ⚠ Experimental avx512dq
- Extract the reduced argument of the lower single-precision (32-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_mask_reduce_sd ⚠ Experimental avx512dq
- Extract the reduced argument of the lower double-precision (64-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_mask_reduce_ss ⚠ Experimental avx512dq
- Extract the reduced argument of the lower single-precision (32-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_mask_xor_pd ⚠ Experimental avx512dq,avx512vl
- Compute the bitwise XOR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_mask_xor_ps ⚠ Experimental avx512dq,avx512vl
- Compute the bitwise XOR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
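The writemask behaviour described for the `mask_` variants can be sketched with a scalar model. This is an illustration only, not the intrinsic itself: `mask_xor_pd_model` is a hypothetical helper that represents a 128-bit vector as `[f64; 2]` and the mask `k` as a `u8`.

```rust
/// Hypothetical scalar model of `_mm_mask_xor_pd` (not the real intrinsic).
/// Lane `i` receives `a[i] XOR b[i]` (computed on the raw bit patterns) when
/// bit `i` of `k` is set; otherwise it keeps the value from `src`.
fn mask_xor_pd_model(src: [f64; 2], k: u8, a: [f64; 2], b: [f64; 2]) -> [f64; 2] {
    // Start from `src` so that unselected lanes are "copied from src".
    let mut dst = src;
    for i in 0..2 {
        if (k >> i) & 1 == 1 {
            dst[i] = f64::from_bits(a[i].to_bits() ^ b[i].to_bits());
        }
    }
    dst
}
```

In this model, calling it with `k = 0b01` writes the XOR result to lane 0 only and leaves lane 1 equal to `src[1]`.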
- _mm_maskz_and_pd ⚠ Experimental avx512dq,avx512vl
- Compute the bitwise AND of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_maskz_and_ps ⚠ Experimental avx512dq,avx512vl
- Compute the bitwise AND of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_maskz_andnot_pd ⚠ Experimental avx512dq,avx512vl
- Compute the bitwise NOT of packed double-precision (64-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_maskz_andnot_ps ⚠ Experimental avx512dq,avx512vl
- Compute the bitwise NOT of packed single-precision (32-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
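The zeromask form of the AND NOT operation can be modelled in scalar code as follows. This is a sketch under stated assumptions rather than the intrinsic: `maskz_andnot_ps_model` is a hypothetical name, the vector is represented as `[f32; 4]`, and only the low 4 bits of `k` are used.

```rust
/// Hypothetical scalar model of `_mm_maskz_andnot_ps` (not the real intrinsic).
/// Lane `i` receives `(NOT a[i]) AND b[i]` on the raw bit patterns when bit `i`
/// of `k` is set; otherwise the lane is zeroed.
fn maskz_andnot_ps_model(k: u8, a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    // Start from all zeros so that unselected lanes are "zeroed out".
    let mut dst = [0.0f32; 4];
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[i] = f32::from_bits(!a[i].to_bits() & b[i].to_bits());
        }
    }
    dst
}
```

One familiar use of this pattern is clearing sign bits: with every lane of `a` set to `-0.0` (only the sign bit set), each selected lane of the result is the absolute value of the matching lane of `b`.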
- _mm_maskz_broadcast_i32x2 ⚠ Experimental avx512dq,avx512vl
- Broadcasts the lower 2 packed 32-bit integers from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_maskz_cvtepi64_pd ⚠ Experimental avx512dq,avx512vl
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_maskz_cvtepi64_ps ⚠ Experimental avx512dq,avx512vl
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_maskz_cvtepu64_pd ⚠ Experimental avx512dq,avx512vl
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_maskz_cvtepu64_ps ⚠ Experimental avx512dq,avx512vl
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_maskz_cvtpd_epi64 ⚠ Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_maskz_cvtpd_epu64 ⚠ Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_maskz_cvtps_epi64 ⚠ Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_maskz_cvtps_epu64 ⚠ Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_maskz_cvttpd_epi64 ⚠ Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_maskz_cvttpd_epu64 ⚠ Experimental avx512dq,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_maskz_cvttps_epi64 ⚠ Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_maskz_cvttps_epu64 ⚠ Experimental avx512dq,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
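The difference between the plain and the "with truncation" conversions can be illustrated with a scalar sketch. All names here are hypothetical, and the model assumes that the non-truncating conversion runs under the default round-to-nearest-even rounding mode. Out-of-range and NaN inputs are deliberately not modelled, because Rust's `as` cast saturates and may differ from what the hardware returns.

```rust
/// Hypothetical model of one lane of a non-truncating f64 -> i64 conversion,
/// ASSUMING the default round-to-nearest-even rounding mode.
/// Out-of-range and NaN inputs are not modelled.
fn cvtpd_epi64_lane(x: f64) -> i64 {
    x.round_ties_even() as i64
}

/// Hypothetical model of one lane of the truncating conversion: the
/// fractional part is discarded (round toward zero).
/// Out-of-range and NaN inputs are not modelled.
fn cvttpd_epi64_lane(x: f64) -> i64 {
    x.trunc() as i64
}

/// Applies a per-lane conversion under a zeromask: lanes whose mask bit is
/// clear are zeroed, mirroring the `maskz_` descriptions.
fn maskz_convert(k: u8, a: [f64; 2], lane: fn(f64) -> i64) -> [i64; 2] {
    let mut dst = [0i64; 2];
    for i in 0..2 {
        if (k >> i) & 1 == 1 {
            dst[i] = lane(a[i]);
        }
    }
    dst
}
```

Under these assumptions, the inputs `[2.7, -2.7]` with `k = 0b11` give `[3, -3]` through the rounding model and `[2, -2]` through the truncating model.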
- _mm_maskz_mullo_epi64 ⚠ Experimental avx512dq,avx512vl
- Multiply packed 64-bit integers in a and b, producing intermediate 128-bit integers, and store the low 64 bits of the intermediate integers in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_maskz_or_pd ⚠ Experimental avx512dq,avx512vl
- Compute the bitwise OR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_maskz_or_ps ⚠ Experimental avx512dq,avx512vl
- Compute the bitwise OR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_maskz_range_pd ⚠ Experimental avx512dq,avx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). The lower 2 bits of imm8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. The upper 2 bits of imm8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm_maskz_range_ps ⚠ Experimental avx512dq,avx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). The lower 2 bits of imm8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. The upper 2 bits of imm8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm_maskz_range_round_sd ⚠ Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for the lower double-precision (64-bit) floating-point element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. The lower 2 bits of imm8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. The upper 2 bits of imm8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_maskz_range_round_ss ⚠ Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. The lower 2 bits of imm8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. The upper 2 bits of imm8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_maskz_range_sd ⚠ Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for the lower double-precision (64-bit) floating-point element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. The lower 2 bits of imm8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. The upper 2 bits of imm8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm_maskz_range_ss ⚠ Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. The lower 2 bits of imm8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. The upper 2 bits of imm8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
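The imm8 decoding used by the range family can be sketched as a scalar model. This is a simplified illustration, not the intrinsic: `range_lane` and `maskz_range_ss_model` are hypothetical names, the sign control is assumed to occupy bits 3:2 of imm8, and the special handling of NaN inputs and signed zeros is not modelled.

```rust
/// Hypothetical scalar model of one lane of the range operation.
/// ASSUMPTIONS: operation control is imm8 bits 1:0, sign control is imm8
/// bits 3:2; NaN and signed-zero special cases are not modelled.
fn range_lane(a: f32, b: f32, imm8: u8) -> f32 {
    // Operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max.
    let pick = match imm8 & 0b11 {
        0b00 => if a <= b { a } else { b },
        0b01 => if a <= b { b } else { a },
        0b10 => if a.abs() <= b.abs() { a } else { b },
        _ => if a.abs() <= b.abs() { b } else { a },
    };
    const SIGN: u32 = 0x8000_0000;
    let magnitude = pick.to_bits() & !SIGN;
    // Sign control: 00 = sign from a, 01 = sign from compare result,
    // 10 = clear sign bit, 11 = set sign bit.
    let sign = match (imm8 >> 2) & 0b11 {
        0b00 => a.to_bits() & SIGN,
        0b01 => pick.to_bits() & SIGN,
        0b10 => 0,
        _ => SIGN,
    };
    f32::from_bits(magnitude | sign)
}

/// Hypothetical model of the zeromasked scalar form: lane 0 is the range
/// result or zero (depending on mask bit 0); the upper 3 lanes come from `a`.
fn maskz_range_ss_model(k: u8, a: [f32; 4], b: [f32; 4], imm8: u8) -> [f32; 4] {
    let mut dst = a;
    dst[0] = if k & 1 == 1 { range_lane(a[0], b[0], imm8) } else { 0.0 };
    dst
}
```

In this model, `imm8 = 0b1011` selects the absolute max and clears the sign bit, so the lane pair `(-3.0, 2.0)` produces `3.0`.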
- _mm_maskz_reduce_pd ⚠ Experimental avx512dq,avx512vl
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm_maskz_reduce_ps ⚠ Experimental avx512dq,avx512vl
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm_maskz_reduce_round_sd ⚠ Experimental avx512dq
- Extract the reduced argument of the lower double-precision (64-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_maskz_reduce_round_ss ⚠ Experimental avx512dq
- Extract the reduced argument of the lower single-precision (32-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_maskz_reduce_sd ⚠ Experimental avx512dq
- Extract the reduced argument of the lower double-precision (64-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_maskz_reduce_ss ⚠ Experimental avx512dq
- Extract the reduced argument of the lower single-precision (32-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_maskz_xor_pd ⚠ Experimental avx512dq,avx512vl
- Compute the bitwise XOR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_maskz_xor_ps ⚠ Experimental avx512dq,avx512vl
- Compute the bitwise XOR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_movepi32_mask ⚠ Experimental avx512dq,avx512vl
- Set each bit of mask register k based on the most significant bit of the corresponding packed 32-bit integer in a.
- _mm_movepi64_mask ⚠ Experimental avx512dq,avx512vl
- Set each bit of mask register k based on the most significant bit of the corresponding packed 64-bit integer in a.
- _mm_movm_epi32 ⚠ Experimental avx512dq,avx512vl
- Set each packed 32-bit integer in dst to all ones or all zeros based on the value of the corresponding bit in k.
- _mm_movm_epi64 ⚠ Experimental avx512dq,avx512vl
- Set each packed 64-bit integer in dst to all ones or all zeros based on the value of the corresponding bit in k.
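The two directions of the vector/mask conversion described above can be sketched in scalar form. The helper names are hypothetical, the 128-bit vector is represented as `[i32; 4]`, and the mask register as a `u8` whose low 4 bits are used.

```rust
/// Hypothetical scalar model of `_mm_movepi32_mask`: bit `i` of the result is
/// the most significant (sign) bit of lane `i`.
fn movepi32_mask_model(a: [i32; 4]) -> u8 {
    let mut k = 0u8;
    for (i, lane) in a.iter().enumerate() {
        if *lane < 0 {
            k |= 1 << i;
        }
    }
    k
}

/// Hypothetical scalar model of `_mm_movm_epi32`: lane `i` is all ones (-1)
/// when bit `i` of `k` is set, and all zeros otherwise.
fn movm_epi32_model(k: u8) -> [i32; 4] {
    let mut dst = [0i32; 4];
    for (i, lane) in dst.iter_mut().enumerate() {
        if (k >> i) & 1 == 1 {
            *lane = -1;
        }
    }
    dst
}
```

In this model the two helpers round-trip: converting a 4-bit mask to a vector and back yields the same mask.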
- _mm_mullo_epi64 ⚠ Experimental avx512dq,avx512vl
- Multiply packed 64-bit integers in a and b, producing intermediate 128-bit integers, and store the low 64 bits of the intermediate integers in dst.
- _mm_range_pd ⚠ Experimental avx512dq,avx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst. The lower 2 bits of imm8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. The upper 2 bits of imm8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm_range_ps ⚠ Experimental avx512dq,avx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst. The lower 2 bits of imm8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. The upper 2 bits of imm8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm_range_round_sd ⚠ Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for the lower double-precision (64-bit) floating-point element in a and b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. The lower 2 bits of imm8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. The upper 2 bits of imm8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_range_round_ss ⚠ Experimental avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. The lower 2 bits of imm8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. The upper 2 bits of imm8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_reduce_pd ⚠ Experimental avx512dq,avx512vl
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_reduce_ps ⚠ Experimental avx512dq,avx512vl
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_reduce_round_sd ⚠ Experimental avx512dq
- Extract the reduced argument of the lower double-precision (64-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_reduce_round_ss ⚠ Experimental avx512dq
- Extract the reduced argument of the lower single-precision (32-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_reduce_sd ⚠ Experimental avx512dq
- Extract the reduced argument of the lower double-precision (64-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_reduce_ss ⚠ Experimental avx512dq
- Extract the reduced argument of the lower single-precision (32-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _store_mask8 ⚠ Experimental avx512dq
- Store 8-bit mask to memory