Available on x86 or x86-64 only.
Constants
- _MM_CMPINT_EQ Experimental - Equal
- _MM_CMPINT_FALSE Experimental - False
- _MM_CMPINT_LE Experimental - Less-than-or-equal
- _MM_CMPINT_LT Experimental - Less-than
- _MM_CMPINT_NE Experimental - Not-equal
- _MM_CMPINT_NLE Experimental - Not less-than-or-equal
- _MM_CMPINT_NLT Experimental - Not less-than
- _MM_CMPINT_TRUE Experimental - True
- _MM_MANT_NORM_1_2 Experimental - interval [1, 2)
- _MM_MANT_NORM_P5_1 Experimental - interval [0.5, 1)
- _MM_MANT_NORM_P5_2 Experimental - interval [0.5, 2)
- _MM_MANT_NORM_P75_1P5 Experimental - interval [0.75, 1.5)
- _MM_MANT_SIGN_NAN Experimental - DEST = NaN if sign(SRC) = 1
- _MM_MANT_SIGN_SRC Experimental - sign = sign(SRC)
- _MM_MANT_SIGN_ZERO Experimental - sign = 0
- _MM_PERM_AAAA Experimental
- _MM_PERM_AAAB Experimental
- _MM_PERM_AAAC Experimental
- _MM_PERM_AAAD Experimental
- _MM_PERM_AABA Experimental
- _MM_PERM_AABB Experimental
- _MM_PERM_AABC Experimental
- _MM_PERM_AABD Experimental
- _MM_PERM_AACA Experimental
- _MM_PERM_AACB Experimental
- _MM_PERM_AACC Experimental
- _MM_PERM_AACD Experimental
- _MM_PERM_AADA Experimental
- _MM_PERM_AADB Experimental
- _MM_PERM_AADC Experimental
- _MM_PERM_AADD Experimental
- _MM_PERM_ABAA Experimental
- _MM_PERM_ABAB Experimental
- _MM_PERM_ABAC Experimental
- _MM_PERM_ABAD Experimental
- _MM_PERM_ABBA Experimental
- _MM_PERM_ABBB Experimental
- _MM_PERM_ABBC Experimental
- _MM_PERM_ABBD Experimental
- _MM_PERM_ABCA Experimental
- _MM_PERM_ABCB Experimental
- _MM_PERM_ABCC Experimental
- _MM_PERM_ABCD Experimental
- _MM_PERM_ABDA Experimental
- _MM_PERM_ABDB Experimental
- _MM_PERM_ABDC Experimental
- _MM_PERM_ABDD Experimental
- _MM_PERM_ACAA Experimental
- _MM_PERM_ACAB Experimental
- _MM_PERM_ACAC Experimental
- _MM_PERM_ACAD Experimental
- _MM_PERM_ACBA Experimental
- _MM_PERM_ACBB Experimental
- _MM_PERM_ACBC Experimental
- _MM_PERM_ACBD Experimental
- _MM_PERM_ACCA Experimental
- _MM_PERM_ACCB Experimental
- _MM_PERM_ACCC Experimental
- _MM_PERM_ACCD Experimental
- _MM_PERM_ACDA Experimental
- _MM_PERM_ACDB Experimental
- _MM_PERM_ACDC Experimental
- _MM_PERM_ACDD Experimental
- _MM_PERM_ADAA Experimental
- _MM_PERM_ADAB Experimental
- _MM_PERM_ADAC Experimental
- _MM_PERM_ADAD Experimental
- _MM_PERM_ADBA Experimental
- _MM_PERM_ADBB Experimental
- _MM_PERM_ADBC Experimental
- _MM_PERM_ADBD Experimental
- _MM_PERM_ADCA Experimental
- _MM_PERM_ADCB Experimental
- _MM_PERM_ADCC Experimental
- _MM_PERM_ADCD Experimental
- _MM_PERM_ADDA Experimental
- _MM_PERM_ADDB Experimental
- _MM_PERM_ADDC Experimental
- _MM_PERM_ADDD Experimental
- _MM_PERM_BAAA Experimental
- _MM_PERM_BAAB Experimental
- _MM_PERM_BAAC Experimental
- _MM_PERM_BAAD Experimental
- _MM_PERM_BABA Experimental
- _MM_PERM_BABB Experimental
- _MM_PERM_BABC Experimental
- _MM_PERM_BABD Experimental
- _MM_PERM_BACA Experimental
- _MM_PERM_BACB Experimental
- _MM_PERM_BACC Experimental
- _MM_PERM_BACD Experimental
- _MM_PERM_BADA Experimental
- _MM_PERM_BADB Experimental
- _MM_PERM_BADC Experimental
- _MM_PERM_BADD Experimental
- _MM_PERM_BBAA Experimental
- _MM_PERM_BBAB Experimental
- _MM_PERM_BBAC Experimental
- _MM_PERM_BBAD Experimental
- _MM_PERM_BBBA Experimental
- _MM_PERM_BBBB Experimental
- _MM_PERM_BBBC Experimental
- _MM_PERM_BBBD Experimental
- _MM_PERM_BBCA Experimental
- _MM_PERM_BBCB Experimental
- _MM_PERM_BBCC Experimental
- _MM_PERM_BBCD Experimental
- _MM_PERM_BBDA Experimental
- _MM_PERM_BBDB Experimental
- _MM_PERM_BBDC Experimental
- _MM_PERM_BBDD Experimental
- _MM_PERM_BCAA Experimental
- _MM_PERM_BCAB Experimental
- _MM_PERM_BCAC Experimental
- _MM_PERM_BCAD Experimental
- _MM_PERM_BCBA Experimental
- _MM_PERM_BCBB Experimental
- _MM_PERM_BCBC Experimental
- _MM_PERM_BCBD Experimental
- _MM_PERM_BCCA Experimental
- _MM_PERM_BCCB Experimental
- _MM_PERM_BCCC Experimental
- _MM_PERM_BCCD Experimental
- _MM_PERM_BCDA Experimental
- _MM_PERM_BCDB Experimental
- _MM_PERM_BCDC Experimental
- _MM_PERM_BCDD Experimental
- _MM_PERM_BDAA Experimental
- _MM_PERM_BDAB Experimental
- _MM_PERM_BDAC Experimental
- _MM_PERM_BDAD Experimental
- _MM_PERM_BDBA Experimental
- _MM_PERM_BDBB Experimental
- _MM_PERM_BDBC Experimental
- _MM_PERM_BDBD Experimental
- _MM_PERM_BDCA Experimental
- _MM_PERM_BDCB Experimental
- _MM_PERM_BDCC Experimental
- _MM_PERM_BDCD Experimental
- _MM_PERM_BDDA Experimental
- _MM_PERM_BDDB Experimental
- _MM_PERM_BDDC Experimental
- _MM_PERM_BDDD Experimental
- _MM_PERM_CAAA Experimental
- _MM_PERM_CAAB Experimental
- _MM_PERM_CAAC Experimental
- _MM_PERM_CAAD Experimental
- _MM_PERM_CABA Experimental
- _MM_PERM_CABB Experimental
- _MM_PERM_CABC Experimental
- _MM_PERM_CABD Experimental
- _MM_PERM_CACA Experimental
- _MM_PERM_CACB Experimental
- _MM_PERM_CACC Experimental
- _MM_PERM_CACD Experimental
- _MM_PERM_CADA Experimental
- _MM_PERM_CADB Experimental
- _MM_PERM_CADC Experimental
- _MM_PERM_CADD Experimental
- _MM_PERM_CBAA Experimental
- _MM_PERM_CBAB Experimental
- _MM_PERM_CBAC Experimental
- _MM_PERM_CBAD Experimental
- _MM_PERM_CBBA Experimental
- _MM_PERM_CBBB Experimental
- _MM_PERM_CBBC Experimental
- _MM_PERM_CBBD Experimental
- _MM_PERM_CBCA Experimental
- _MM_PERM_CBCB Experimental
- _MM_PERM_CBCC Experimental
- _MM_PERM_CBCD Experimental
- _MM_PERM_CBDA Experimental
- _MM_PERM_CBDB Experimental
- _MM_PERM_CBDC Experimental
- _MM_PERM_CBDD Experimental
- _MM_PERM_CCAA Experimental
- _MM_PERM_CCAB Experimental
- _MM_PERM_CCAC Experimental
- _MM_PERM_CCAD Experimental
- _MM_PERM_CCBA Experimental
- _MM_PERM_CCBB Experimental
- _MM_PERM_CCBC Experimental
- _MM_PERM_CCBD Experimental
- _MM_PERM_CCCA Experimental
- _MM_PERM_CCCB Experimental
- _MM_PERM_CCCC Experimental
- _MM_PERM_CCCD Experimental
- _MM_PERM_CCDA Experimental
- _MM_PERM_CCDB Experimental
- _MM_PERM_CCDC Experimental
- _MM_PERM_CCDD Experimental
- _MM_PERM_CDAA Experimental
- _MM_PERM_CDAB Experimental
- _MM_PERM_CDAC Experimental
- _MM_PERM_CDAD Experimental
- _MM_PERM_CDBA Experimental
- _MM_PERM_CDBB Experimental
- _MM_PERM_CDBC Experimental
- _MM_PERM_CDBD Experimental
- _MM_PERM_CDCA Experimental
- _MM_PERM_CDCB Experimental
- _MM_PERM_CDCC Experimental
- _MM_PERM_CDCD Experimental
- _MM_PERM_CDDA Experimental
- _MM_PERM_CDDB Experimental
- _MM_PERM_CDDC Experimental
- _MM_PERM_CDDD Experimental
- _MM_PERM_DAAA Experimental
- _MM_PERM_DAAB Experimental
- _MM_PERM_DAAC Experimental
- _MM_PERM_DAAD Experimental
- _MM_PERM_DABA Experimental
- _MM_PERM_DABB Experimental
- _MM_PERM_DABC Experimental
- _MM_PERM_DABD Experimental
- _MM_PERM_DACA Experimental
- _MM_PERM_DACB Experimental
- _MM_PERM_DACC Experimental
- _MM_PERM_DACD Experimental
- _MM_PERM_DADA Experimental
- _MM_PERM_DADB Experimental
- _MM_PERM_DADC Experimental
- _MM_PERM_DADD Experimental
- _MM_PERM_DBAA Experimental
- _MM_PERM_DBAB Experimental
- _MM_PERM_DBAC Experimental
- _MM_PERM_DBAD Experimental
- _MM_PERM_DBBA Experimental
- _MM_PERM_DBBB Experimental
- _MM_PERM_DBBC Experimental
- _MM_PERM_DBBD Experimental
- _MM_PERM_DBCA Experimental
- _MM_PERM_DBCB Experimental
- _MM_PERM_DBCC Experimental
- _MM_PERM_DBCD Experimental
- _MM_PERM_DBDA Experimental
- _MM_PERM_DBDB Experimental
- _MM_PERM_DBDC Experimental
- _MM_PERM_DBDD Experimental
- _MM_PERM_DCAA Experimental
- _MM_PERM_DCAB Experimental
- _MM_PERM_DCAC Experimental
- _MM_PERM_DCAD Experimental
- _MM_PERM_DCBA Experimental
- _MM_PERM_DCBB Experimental
- _MM_PERM_DCBC Experimental
- _MM_PERM_DCBD Experimental
- _MM_PERM_DCCA Experimental
- _MM_PERM_DCCB Experimental
- _MM_PERM_DCCC Experimental
- _MM_PERM_DCCD Experimental
- _MM_PERM_DCDA Experimental
- _MM_PERM_DCDB Experimental
- _MM_PERM_DCDC Experimental
- _MM_PERM_DCDD Experimental
- _MM_PERM_DDAA Experimental
- _MM_PERM_DDAB Experimental
- _MM_PERM_DDAC Experimental
- _MM_PERM_DDAD Experimental
- _MM_PERM_DDBA Experimental
- _MM_PERM_DDBB Experimental
- _MM_PERM_DDBC Experimental
- _MM_PERM_DDBD Experimental
- _MM_PERM_DDCA Experimental
- _MM_PERM_DDCB Experimental
- _MM_PERM_DDCC Experimental
- _MM_PERM_DDCD Experimental
- _MM_PERM_DDDA Experimental
- _MM_PERM_DDDB Experimental
- _MM_PERM_DDDC Experimental
- _MM_PERM_DDDD Experimental
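The 256 `_MM_PERM_*` names enumerate every four-letter word over A to D. The sketch below shows how such a name could map to an 8-bit shuffle control, assuming the usual Intel convention: A, B, C, D stand for element selectors 0 to 3 and the leftmost letter occupies the two most significant bits. `perm_code` is a hypothetical helper for illustration, not an item of this module.

```rust
/// Hypothetical helper: encode a four-letter permutation word such as "DCBA"
/// into an 8-bit control value, two bits per letter, leftmost letter highest.
fn perm_code(word: &str) -> Option<u8> {
    if word.len() != 4 {
        return None;
    }
    let mut code = 0u8;
    for ch in word.chars() {
        // Assumed selector values: A = 0, B = 1, C = 2, D = 3.
        let sel: u8 = match ch {
            'A' => 0,
            'B' => 1,
            'C' => 2,
            'D' => 3,
            _ => return None,
        };
        code = (code << 2) | sel;
    }
    Some(code)
}
```

Under this assumption `DCBA` is the identity-style pattern `0xE4` and `AAAA` broadcasts selector 0.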
Functions
- expandloadd_128 🔒 ⚠
- expandloadd_256 🔒 ⚠
- expandloadd_512 🔒 ⚠
- expandloadpd_128 🔒 ⚠
- expandloadpd_256 🔒 ⚠
- expandloadpd_512 🔒 ⚠
- expandloadps_128 🔒 ⚠
- expandloadps_256 🔒 ⚠
- expandloadps_512 🔒 ⚠
- expandloadq_128 🔒 ⚠
- expandloadq_256 🔒 ⚠
- expandloadq_512 🔒 ⚠
- loadapd_128 🔒 ⚠
- loadapd_256 🔒 ⚠
- loadapd_512 🔒 ⚠
- loadaps_128 🔒 ⚠
- loadaps_256 🔒 ⚠
- loadaps_512 🔒 ⚠
- loaddqa32_128 🔒 ⚠
- loaddqa32_256 🔒 ⚠
- loaddqa32_512 🔒 ⚠
- loaddqa64_128 🔒 ⚠
- loaddqa64_256 🔒 ⚠
- loaddqa64_512 🔒 ⚠
- loaddqu32_128 🔒 ⚠
- loaddqu32_256 🔒 ⚠
- loaddqu32_512 🔒 ⚠
- loaddqu64_128 🔒 ⚠
- loaddqu64_256 🔒 ⚠
- loaddqu64_512 🔒 ⚠
- loadupd_128 🔒 ⚠
- loadupd_256 🔒 ⚠
- loadupd_512 🔒 ⚠
- loadups_128 🔒 ⚠
- loadups_256 🔒 ⚠
- loadups_512 🔒 ⚠
- storeapd_128 🔒 ⚠
- storeapd_256 🔒 ⚠
- storeapd_512 🔒 ⚠
- storeaps_128 🔒 ⚠
- storeaps_256 🔒 ⚠
- storeaps_512 🔒 ⚠
- storedqa32_128 🔒 ⚠
- storedqa32_256 🔒 ⚠
- storedqa32_512 🔒 ⚠
- storedqa64_128 🔒 ⚠
- storedqa64_256 🔒 ⚠
- storedqa64_512 🔒 ⚠
- storedqu32_128 🔒 ⚠
- storedqu32_256 🔒 ⚠
- storedqu32_512 🔒 ⚠
- storedqu64_128 🔒 ⚠
- storedqu64_256 🔒 ⚠
- storedqu64_512 🔒 ⚠
- storeupd_128 🔒 ⚠
- storeupd_256 🔒 ⚠
- storeupd_512 🔒 ⚠
- storeups_128 🔒 ⚠
- storeups_256 🔒 ⚠
- storeups_512 🔒 ⚠
- vaddpd 🔒 ⚠
- vaddps 🔒 ⚠
- vaddsd 🔒 ⚠
- vaddss 🔒 ⚠
- vcmppd 🔒 ⚠
- vcmppd128 🔒 ⚠
- vcmppd256 🔒 ⚠
- vcmpps 🔒 ⚠
- vcmpps128 🔒 ⚠
- vcmpps256 🔒 ⚠
- vcmpsd 🔒 ⚠
- vcmpss 🔒 ⚠
- vcomisd 🔒 ⚠
- vcomiss 🔒 ⚠
- vcompresspd 🔒 ⚠
- vcompresspd128 🔒 ⚠
- vcompresspd256 🔒 ⚠
- vcompressps 🔒 ⚠
- vcompressps128 🔒 ⚠
- vcompressps256 🔒 ⚠
- vcompressstored 🔒 ⚠
- vcompressstored128 🔒 ⚠
- vcompressstored256 🔒 ⚠
- vcompressstorepd 🔒 ⚠
- vcompressstorepd128 🔒 ⚠
- vcompressstorepd256 🔒 ⚠
- vcompressstoreps 🔒 ⚠
- vcompressstoreps128 🔒 ⚠
- vcompressstoreps256 🔒 ⚠
- vcompressstoreq 🔒 ⚠
- vcompressstoreq128 🔒 ⚠
- vcompressstoreq256 🔒 ⚠
- vcvtdq2ps 🔒 ⚠
- vcvtpd2dq 🔒 ⚠
- vcvtpd2ps 🔒 ⚠
- vcvtpd2udq 🔒 ⚠
- vcvtpd2udq128 🔒 ⚠
- vcvtpd2udq256 🔒 ⚠
- vcvtph2ps 🔒 ⚠
- vcvtps2dq 🔒 ⚠
- vcvtps2pd 🔒 ⚠
- vcvtps2ph 🔒 ⚠
- vcvtps2ph128 🔒 ⚠
- vcvtps2ph256 🔒 ⚠
- vcvtps2udq 🔒 ⚠
- vcvtps2udq128 🔒 ⚠
- vcvtps2udq256 🔒 ⚠
- vcvtsd2si 🔒 ⚠
- vcvtsd2ss 🔒 ⚠
- vcvtsd2usi 🔒 ⚠
- vcvtsi2ss 🔒 ⚠
- vcvtss2sd 🔒 ⚠
- vcvtss2si 🔒 ⚠
- vcvtss2usi 🔒 ⚠
- vcvttpd2dq 🔒 ⚠
- vcvttpd2dq128 🔒 ⚠
- vcvttpd2dq256 🔒 ⚠
- vcvttpd2udq 🔒 ⚠
- vcvttpd2udq128 🔒 ⚠
- vcvttpd2udq256 🔒 ⚠
- vcvttps2dq 🔒 ⚠
- vcvttps2dq128 🔒 ⚠
- vcvttps2dq256 🔒 ⚠
- vcvttps2udq 🔒 ⚠
- vcvttps2udq128 🔒 ⚠
- vcvttps2udq256 🔒 ⚠
- vcvttsd2si 🔒 ⚠
- vcvttsd2usi 🔒 ⚠
- vcvttss2si 🔒 ⚠
- vcvttss2usi 🔒 ⚠
- vcvtudq2ps 🔒 ⚠
- vcvtusi2ss 🔒 ⚠
- vdivpd 🔒 ⚠
- vdivps 🔒 ⚠
- vdivsd 🔒 ⚠
- vdivss 🔒 ⚠
- vexpandpd 🔒 ⚠
- vexpandpd128 🔒 ⚠
- vexpandpd256 🔒 ⚠
- vexpandps 🔒 ⚠
- vexpandps128 🔒 ⚠
- vexpandps256 🔒 ⚠
- vfixupimmpd 🔒 ⚠
- vfixupimmpd128 🔒 ⚠
- vfixupimmpd256 🔒 ⚠
- vfixupimmpdz 🔒 ⚠
- vfixupimmpdz128 🔒 ⚠
- vfixupimmpdz256 🔒 ⚠
- vfixupimmps 🔒 ⚠
- vfixupimmps128 🔒 ⚠
- vfixupimmps256 🔒 ⚠
- vfixupimmpsz 🔒 ⚠
- vfixupimmpsz128 🔒 ⚠
- vfixupimmpsz256 🔒 ⚠
- vfixupimmsd 🔒 ⚠
- vfixupimmsdz 🔒 ⚠
- vfixupimmss 🔒 ⚠
- vfixupimmssz 🔒 ⚠
- vfmadd132pdround 🔒 ⚠
- vfmadd132psround 🔒 ⚠
- vfmaddsdround 🔒 ⚠
- vfmaddssround 🔒 ⚠
- vfmaddsubpdround 🔒 ⚠
- vfmaddsubpsround 🔒 ⚠
- vgatherdpd 🔒 ⚠
- vgatherdpd_128 🔒 ⚠
- vgatherdpd_256 🔒 ⚠
- vgatherdps 🔒 ⚠
- vgatherdps_128 🔒 ⚠
- vgatherdps_256 🔒 ⚠
- vgatherqpd 🔒 ⚠
- vgatherqpd_128 🔒 ⚠
- vgatherqpd_256 🔒 ⚠
- vgatherqps 🔒 ⚠
- vgatherqps_128 🔒 ⚠
- vgatherqps_256 🔒 ⚠
- vgetexppd 🔒 ⚠
- vgetexppd128 🔒 ⚠
- vgetexppd256 🔒 ⚠
- vgetexpps 🔒 ⚠
- vgetexpps128 🔒 ⚠
- vgetexpps256 🔒 ⚠
- vgetexpsd 🔒 ⚠
- vgetexpss 🔒 ⚠
- vgetmantpd 🔒 ⚠
- vgetmantpd128 🔒 ⚠
- vgetmantpd256 🔒 ⚠
- vgetmantps 🔒 ⚠
- vgetmantps128 🔒 ⚠
- vgetmantps256 🔒 ⚠
- vgetmantsd 🔒 ⚠
- vgetmantss 🔒 ⚠
- vmaxpd 🔒 ⚠
- vmaxps 🔒 ⚠
- vmaxsd 🔒 ⚠
- vmaxss 🔒 ⚠
- vminpd 🔒 ⚠
- vminps 🔒 ⚠
- vminsd 🔒 ⚠
- vminss 🔒 ⚠
- vmulpd 🔒 ⚠
- vmulps 🔒 ⚠
- vmulsd 🔒 ⚠
- vmulss 🔒 ⚠
- vpcompressd 🔒 ⚠
- vpcompressd128 🔒 ⚠
- vpcompressd256 🔒 ⚠
- vpcompressq 🔒 ⚠
- vpcompressq128 🔒 ⚠
- vpcompressq256 🔒 ⚠
- vpermd 🔒 ⚠
- vpermi2d 🔒 ⚠
- vpermi2d128 🔒 ⚠
- vpermi2d256 🔒 ⚠
- vpermi2pd 🔒 ⚠
- vpermi2pd128 🔒 ⚠
- vpermi2pd256 🔒 ⚠
- vpermi2ps 🔒 ⚠
- vpermi2ps128 🔒 ⚠
- vpermi2ps256 🔒 ⚠
- vpermi2q 🔒 ⚠
- vpermi2q128 🔒 ⚠
- vpermi2q256 🔒 ⚠
- vpermilpd 🔒 ⚠
- vpermilps 🔒 ⚠
- vpermpd 🔒 ⚠
- vpermpd256 🔒 ⚠
- vpermps 🔒 ⚠
- vpermq 🔒 ⚠
- vpermq256 🔒 ⚠
- vpexpandd 🔒 ⚠
- vpexpandd128 🔒 ⚠
- vpexpandd256 🔒 ⚠
- vpexpandq 🔒 ⚠
- vpexpandq128 🔒 ⚠
- vpexpandq256 🔒 ⚠
- vpgatherdd 🔒 ⚠
- vpgatherdd_128 🔒 ⚠
- vpgatherdd_256 🔒 ⚠
- vpgatherdq 🔒 ⚠
- vpgatherdq_128 🔒 ⚠
- vpgatherdq_256 🔒 ⚠
- vpgatherqd 🔒 ⚠
- vpgatherqd_128 🔒 ⚠
- vpgatherqd_256 🔒 ⚠
- vpgatherqq 🔒 ⚠
- vpgatherqq_128 🔒 ⚠
- vpgatherqq_256 🔒 ⚠
- vpmovdb128 🔒 ⚠
- vpmovdb256 🔒 ⚠
- vpmovdbmem 🔒 ⚠
- vpmovdbmem128 🔒 ⚠
- vpmovdbmem256 🔒 ⚠
- vpmovdw128 🔒 ⚠
- vpmovdwmem 🔒 ⚠
- vpmovdwmem128 🔒 ⚠
- vpmovdwmem256 🔒 ⚠
- vpmovqb 🔒 ⚠
- vpmovqb128 🔒 ⚠
- vpmovqb256 🔒 ⚠
- vpmovqbmem 🔒 ⚠
- vpmovqbmem128 🔒 ⚠
- vpmovqbmem256 🔒 ⚠
- vpmovqd128 🔒 ⚠
- vpmovqdmem 🔒 ⚠
- vpmovqdmem128 🔒 ⚠
- vpmovqdmem256 🔒 ⚠
- vpmovqw128 🔒 ⚠
- vpmovqw256 🔒 ⚠
- vpmovqwmem 🔒 ⚠
- vpmovqwmem128 🔒 ⚠
- vpmovqwmem256 🔒 ⚠
- vpmovsdb 🔒 ⚠
- vpmovsdb128 🔒 ⚠
- vpmovsdb256 🔒 ⚠
- vpmovsdbmem 🔒 ⚠
- vpmovsdbmem128 🔒 ⚠
- vpmovsdbmem256 🔒 ⚠
- vpmovsdw 🔒 ⚠
- vpmovsdw128 🔒 ⚠
- vpmovsdw256 🔒 ⚠
- vpmovsdwmem 🔒 ⚠
- vpmovsdwmem128 🔒 ⚠
- vpmovsdwmem256 🔒 ⚠
- vpmovsqb 🔒 ⚠
- vpmovsqb128 🔒 ⚠
- vpmovsqb256 🔒 ⚠
- vpmovsqbmem 🔒 ⚠
- vpmovsqbmem128 🔒 ⚠
- vpmovsqbmem256 🔒 ⚠
- vpmovsqd 🔒 ⚠
- vpmovsqd128 🔒 ⚠
- vpmovsqd256 🔒 ⚠
- vpmovsqdmem 🔒 ⚠
- vpmovsqdmem128 🔒 ⚠
- vpmovsqdmem256 🔒 ⚠
- vpmovsqw 🔒 ⚠
- vpmovsqw128 🔒 ⚠
- vpmovsqw256 🔒 ⚠
- vpmovsqwmem 🔒 ⚠
- vpmovsqwmem128 🔒 ⚠
- vpmovsqwmem256 🔒 ⚠
- vpmovusdb 🔒 ⚠
- vpmovusdb128 🔒 ⚠
- vpmovusdb256 🔒 ⚠
- vpmovusdbmem 🔒 ⚠
- vpmovusdbmem128 🔒 ⚠
- vpmovusdbmem256 🔒 ⚠
- vpmovusdw 🔒 ⚠
- vpmovusdw128 🔒 ⚠
- vpmovusdw256 🔒 ⚠
- vpmovusdwmem 🔒 ⚠
- vpmovusdwmem128 🔒 ⚠
- vpmovusdwmem256 🔒 ⚠
- vpmovusqb 🔒 ⚠
- vpmovusqb128 🔒 ⚠
- vpmovusqb256 🔒 ⚠
- vpmovusqbmem 🔒 ⚠
- vpmovusqbmem128 🔒 ⚠
- vpmovusqbmem256 🔒 ⚠
- vpmovusqd 🔒 ⚠
- vpmovusqd128 🔒 ⚠
- vpmovusqd256 🔒 ⚠
- vpmovusqdmem 🔒 ⚠
- vpmovusqdmem128 🔒 ⚠
- vpmovusqdmem256 🔒 ⚠
- vpmovusqw 🔒 ⚠
- vpmovusqw128 🔒 ⚠
- vpmovusqw256 🔒 ⚠
- vpmovusqwmem 🔒 ⚠
- vpmovusqwmem128 🔒 ⚠
- vpmovusqwmem256 🔒 ⚠
- vprold 🔒 ⚠
- vprold128 🔒 ⚠
- vprold256 🔒 ⚠
- vprolq 🔒 ⚠
- vprolq128 🔒 ⚠
- vprolq256 🔒 ⚠
- vprolvd 🔒 ⚠
- vprolvd128 🔒 ⚠
- vprolvd256 🔒 ⚠
- vprolvq 🔒 ⚠
- vprolvq128 🔒 ⚠
- vprolvq256 🔒 ⚠
- vprord 🔒 ⚠
- vprord128 🔒 ⚠
- vprord256 🔒 ⚠
- vprorq 🔒 ⚠
- vprorq128 🔒 ⚠
- vprorq256 🔒 ⚠
- vprorvd 🔒 ⚠
- vprorvd128 🔒 ⚠
- vprorvd256 🔒 ⚠
- vprorvq 🔒 ⚠
- vprorvq128 🔒 ⚠
- vprorvq256 🔒 ⚠
- vpscatterdd 🔒 ⚠
- vpscatterdd_128 🔒 ⚠
- vpscatterdd_256 🔒 ⚠
- vpscatterdq 🔒 ⚠
- vpscatterdq_128 🔒 ⚠
- vpscatterdq_256 🔒 ⚠
- vpscatterqd 🔒 ⚠
- vpscatterqd_128 🔒 ⚠
- vpscatterqd_256 🔒 ⚠
- vpscatterqq 🔒 ⚠
- vpscatterqq_128 🔒 ⚠
- vpscatterqq_256 🔒 ⚠
- vpslld 🔒 ⚠
- vpsllq 🔒 ⚠
- vpsllvd 🔒 ⚠
- vpsllvq 🔒 ⚠
- vpsrad 🔒 ⚠
- vpsraq 🔒 ⚠
- vpsraq128 🔒 ⚠
- vpsraq256 🔒 ⚠
- vpsravd 🔒 ⚠
- vpsravq 🔒 ⚠
- vpsravq128 🔒 ⚠
- vpsravq256 🔒 ⚠
- vpsrld 🔒 ⚠
- vpsrlq 🔒 ⚠
- vpsrlvd 🔒 ⚠
- vpsrlvq 🔒 ⚠
- vpternlogd 🔒 ⚠
- vpternlogd128 🔒 ⚠
- vpternlogd256 🔒 ⚠
- vpternlogq 🔒 ⚠
- vpternlogq128 🔒 ⚠
- vpternlogq256 🔒 ⚠
- vrcp14pd 🔒 ⚠
- vrcp14pd128 🔒 ⚠
- vrcp14pd256 🔒 ⚠
- vrcp14ps 🔒 ⚠
- vrcp14ps128 🔒 ⚠
- vrcp14ps256 🔒 ⚠
- vrcp14sd 🔒 ⚠
- vrcp14ss 🔒 ⚠
- vrndscalepd 🔒 ⚠
- vrndscalepd128 🔒 ⚠
- vrndscalepd256 🔒 ⚠
- vrndscaleps 🔒 ⚠
- vrndscaleps128 🔒 ⚠
- vrndscaleps256 🔒 ⚠
- vrndscalesd 🔒 ⚠
- vrndscaless 🔒 ⚠
- vrsqrt14pd 🔒 ⚠
- vrsqrt14pd128 🔒 ⚠
- vrsqrt14pd256 🔒 ⚠
- vrsqrt14ps 🔒 ⚠
- vrsqrt14ps128 🔒 ⚠
- vrsqrt14ps256 🔒 ⚠
- vrsqrt14sd 🔒 ⚠
- vrsqrt14ss 🔒 ⚠
- vscalefpd 🔒 ⚠
- vscalefpd128 🔒 ⚠
- vscalefpd256 🔒 ⚠
- vscalefps 🔒 ⚠
- vscalefps128 🔒 ⚠
- vscalefps256 🔒 ⚠
- vscalefsd 🔒 ⚠
- vscalefss 🔒 ⚠
- vscatterdpd 🔒 ⚠
- vscatterdpd_128 🔒 ⚠
- vscatterdpd_256 🔒 ⚠
- vscatterdps 🔒 ⚠
- vscatterdps_128 🔒 ⚠
- vscatterdps_256 🔒 ⚠
- vscatterqpd 🔒 ⚠
- vscatterqpd_128 🔒 ⚠
- vscatterqpd_256 🔒 ⚠
- vscatterqps 🔒 ⚠
- vscatterqps_128 🔒 ⚠
- vscatterqps_256 🔒 ⚠
- vsqrtpd 🔒 ⚠
- vsqrtps 🔒 ⚠
- vsqrtsd 🔒 ⚠
- vsqrtss 🔒 ⚠
- vsubpd 🔒 ⚠
- vsubps 🔒 ⚠
- vsubsd 🔒 ⚠
- vsubss 🔒 ⚠
- _cvtmask16_u32 ⚠ Experimental avx512f
- Convert 16-bit mask a into an integer value, and store the result in dst.
- _cvtu32_mask16 ⚠ Experimental avx512f
- Convert 32-bit integer value a to a 16-bit mask and store the result in dst.
- _kand_mask16 ⚠ Experimental avx512f
- Compute the bitwise AND of 16-bit masks a and b, and store the result in k.
- _kandn_mask16 ⚠ Experimental avx512f
- Compute the bitwise NOT of 16-bit mask a and then AND with b, and store the result in k.
- _knot_mask16 ⚠ Experimental avx512f
- Compute the bitwise NOT of 16-bit mask a, and store the result in k.
- _kor_mask16 ⚠ Experimental avx512f
- Compute the bitwise OR of 16-bit masks a and b, and store the result in k.
- _kortest_mask16_u8 ⚠ Experimental avx512f
- Compute the bitwise OR of 16-bit masks a and b. If the result is all zeros, store 1 in dst, otherwise store 0 in dst. If the result is all ones, store 1 in all_ones, otherwise store 0 in all_ones.
- _kortestc_mask16_u8 ⚠ Experimental avx512f
- Compute the bitwise OR of 16-bit masks a and b. If the result is all ones, store 1 in dst, otherwise store 0 in dst.
- _kortestz_mask16_u8 ⚠ Experimental avx512f
- Compute the bitwise OR of 16-bit masks a and b. If the result is all zeros, store 1 in dst, otherwise store 0 in dst.
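The three OR-test descriptions above differ only in which flag they report. A scalar sketch, with plain `u16` values standing in for the mask type and hypothetical `*_model` names that are not part of this module:

```rust
/// Scalar model of `_kortest_mask16_u8`: returns (zero flag, all-ones flag).
fn kortest_model(a: u16, b: u16) -> (u8, u8) {
    let or = a | b;
    let zero = (or == 0) as u8;
    let all_ones = (or == 0xFFFF) as u8;
    (zero, all_ones)
}

/// Scalar model of `_kortestz_mask16_u8`: 1 when a | b is all zeros.
fn kortestz_model(a: u16, b: u16) -> u8 {
    kortest_model(a, b).0
}

/// Scalar model of `_kortestc_mask16_u8`: 1 when a | b is all ones.
fn kortestc_model(a: u16, b: u16) -> u8 {
    kortest_model(a, b).1
}
```

The two flags are mutually exclusive for 16-bit masks, so at most one of them is 1 for any input pair.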
- _kshiftli_mask16 ⚠ Experimental avx512f
- Shift 16-bit mask a left by count bits while shifting in zeros, and store the result in dst.
- _kshiftri_mask16 ⚠ Experimental avx512f
- Shift 16-bit mask a right by count bits while shifting in zeros, and store the result in dst.
- _kxnor_mask16 ⚠ Experimental avx512f
- Compute the bitwise XNOR of 16-bit masks a and b, and store the result in k.
- _kxor_mask16 ⚠ Experimental avx512f
- Compute the bitwise XOR of 16-bit masks a and b, and store the result in k.
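The less obvious mask operations (AND-NOT, XNOR and the two shifts) can be sketched on plain `u16` values. The function names are hypothetical, and the rule that a shift count above 15 yields zero is an assumption taken from Intel's pseudocode for the underlying instructions rather than from the text above.

```rust
/// Scalar model of `_kandn_mask16`: NOT is applied to `a` only, then AND with `b`.
fn kandn_model(a: u16, b: u16) -> u16 {
    !a & b
}

/// Scalar model of `_kxnor_mask16`: bits are set where `a` and `b` agree.
fn kxnor_model(a: u16, b: u16) -> u16 {
    !(a ^ b)
}

/// Scalar model of `_kshiftli_mask16`; counts above 15 are assumed to give 0.
fn kshiftli_model(a: u16, count: u32) -> u16 {
    if count > 15 { 0 } else { a << count }
}

/// Scalar model of `_kshiftri_mask16`; counts above 15 are assumed to give 0.
fn kshiftri_model(a: u16, count: u32) -> u16 {
    if count > 15 { 0 } else { a >> count }
}
```

Note the operand order of `kandn_model`: swapping `a` and `b` changes the result.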
- _load_mask16 ⚠ Experimental avx512f
- Load 16-bit mask from memory.
- _mm256_abs_epi64 ⚠ Experimental avx512f,avx512vl
- Compute the absolute value of packed signed 64-bit integers in a, and store the unsigned results in dst.
- _mm256_alignr_epi32 ⚠ Experimental avx512f,avx512vl
- Concatenate a and b into a 64-byte intermediate result, shift the result right by imm8 32-bit elements, and store the low 32 bytes (8 elements) in dst.
- _mm256_alignr_epi64 ⚠ Experimental avx512f,avx512vl
- Concatenate a and b into a 64-byte intermediate result, shift the result right by imm8 64-bit elements, and store the low 32 bytes (4 elements) in dst.
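The concatenate-and-shift behavior of the two `alignr` entries is easier to see on arrays. The sketch below models the 32-bit variant, assuming (per Intel's pseudocode) that a forms the upper half of the concatenation, b the lower half, and that only the low 3 bits of imm8 are used. `alignr_epi32_model` is a hypothetical name.

```rust
/// Scalar model of `_mm256_alignr_epi32` on arrays of eight 32-bit lanes.
fn alignr_epi32_model(a: [i32; 8], b: [i32; 8], imm8: u32) -> [i32; 8] {
    // Assumption: only imm8[2:0] selects the element shift.
    let shift = (imm8 & 7) as usize;
    // Build the 16-element concatenation with b in the low lanes and a in the high lanes.
    let mut concat = [0i32; 16];
    concat[..8].copy_from_slice(&b);
    concat[8..].copy_from_slice(&a);
    // Shifting right by `shift` elements means starting the window `shift` lanes up.
    let mut dst = [0i32; 8];
    dst.copy_from_slice(&concat[shift..shift + 8]);
    dst
}
```

With a shift of 0 the result is b unchanged; each increment drops one lane of b and pulls in one lane of a.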
- _mm256_broadcast_f32x4 ⚠ Experimental avx512f,avx512vl
- Broadcast the 4 packed single-precision (32-bit) floating-point elements from a to all elements of dst.
- _mm256_broadcast_i32x4 ⚠ Experimental avx512f,avx512vl
- Broadcast the 4 packed 32-bit integers from a to all elements of dst.
- _mm256_cmp_epi32_mask ⚠ Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k.
- _mm256_cmp_epi64_mask ⚠ Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k.
- _mm256_cmp_epu32_mask ⚠ Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k.
- _mm256_cmp_epu64_mask ⚠ Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k.
- _mm256_cmp_pd_mask ⚠ Experimental avx512f,avx512vl
- Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k.
- _mm256_cmp_ps_mask ⚠ Experimental avx512f,avx512vl
- Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k.
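For the integer compares, imm8 selects one of the eight `_MM_CMPINT_*` predicates and each lane contributes one bit to the mask. The sketch below models the signed 32-bit case; the constant values are those of the `_MM_CMPINT_*` encoding, while the function and constant names here are hypothetical stand-ins. The floating-point `cmp_pd`/`cmp_ps` variants use a different, larger predicate set and are not modeled.

```rust
// Predicate encodings mirroring the _MM_CMPINT_* constants.
const CMPINT_EQ: i32 = 0;
const CMPINT_LT: i32 = 1;
const CMPINT_LE: i32 = 2;
const CMPINT_FALSE: i32 = 3;
const CMPINT_NE: i32 = 4;
const CMPINT_NLT: i32 = 5;
const CMPINT_NLE: i32 = 6;
const CMPINT_TRUE: i32 = 7;

/// Scalar model of `_mm256_cmp_epi32_mask`: bit i of the result is set when
/// the predicate holds for lane i.
fn cmp_epi32_mask_model(a: [i32; 8], b: [i32; 8], imm8: i32) -> u8 {
    let mut k = 0u8;
    for i in 0..8 {
        let hit = match imm8 & 7 {
            CMPINT_EQ => a[i] == b[i],
            CMPINT_LT => a[i] < b[i],
            CMPINT_LE => a[i] <= b[i],
            CMPINT_FALSE => false,
            CMPINT_NE => a[i] != b[i],
            CMPINT_NLT => a[i] >= b[i],
            CMPINT_NLE => a[i] > b[i],
            CMPINT_TRUE => true,
            _ => unreachable!(), // imm8 & 7 is always in 0..=7
        };
        if hit {
            k |= 1 << i;
        }
    }
    k
}
```

The named `cmpeq`, `cmplt`, `cmple`, `cmpge`, `cmpgt` and `cmpneq` entries that follow correspond to fixed choices of this predicate.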
- _mm256_cmpeq_epi32_mask ⚠ Experimental avx512f,avx512vl
- Compare packed 32-bit integers in a and b for equality, and store the results in mask vector k.
- _mm256_cmpeq_epi64_mask ⚠ Experimental avx512f,avx512vl
- Compare packed 64-bit integers in a and b for equality, and store the results in mask vector k.
- _mm256_cmpeq_epu32_mask ⚠ Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for equality, and store the results in mask vector k.
- _mm256_cmpeq_epu64_mask ⚠ Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for equality, and store the results in mask vector k.
- _mm256_cmpge_epi32_mask ⚠ Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k.
- _mm256_cmpge_epi64_mask ⚠ Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k.
- _mm256_cmpge_epu32_mask ⚠ Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k.
- _mm256_cmpge_epu64_mask ⚠ Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k.
- _mm256_cmpgt_epi32_mask ⚠ Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b for greater-than, and store the results in mask vector k.
- _mm256_cmpgt_epi64_mask ⚠ Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for greater-than, and store the results in mask vector k.
- _mm256_cmpgt_epu32_mask ⚠ Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for greater-than, and store the results in mask vector k.
- _mm256_cmpgt_epu64_mask ⚠ Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for greater-than, and store the results in mask vector k.
- _mm256_cmple_epi32_mask ⚠ Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b for less-than-or-equal, and store the results in mask vector k.
- _mm256_cmple_epi64_mask ⚠ Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for less-than-or-equal, and store the results in mask vector k.
- _mm256_cmple_epu32_mask ⚠ Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for less-than-or-equal, and store the results in mask vector k.
- _mm256_cmple_epu64_mask ⚠ Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for less-than-or-equal, and store the results in mask vector k.
- _mm256_cmplt_epi32_mask ⚠ Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b for less-than, and store the results in mask vector k.
- _mm256_cmplt_epi64_mask ⚠ Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for less-than, and store the results in mask vector k.
- _mm256_cmplt_epu32_mask ⚠ Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for less-than, and store the results in mask vector k.
- _mm256_cmplt_epu64_mask ⚠ Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for less-than, and store the results in mask vector k.
- _mm256_cmpneq_epi32_mask ⚠ Experimental avx512f,avx512vl
- Compare packed 32-bit integers in a and b for not-equal, and store the results in mask vector k.
- _mm256_cmpneq_epi64_mask ⚠ Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for not-equal, and store the results in mask vector k.
- _mm256_cmpneq_epu32_mask ⚠ Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for not-equal, and store the results in mask vector k.
- _mm256_cmpneq_epu64_mask ⚠ Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for not-equal, and store the results in mask vector k.
- _mm256_cvtepi32_epi8 ⚠ Experimental avx512f,avx512vl
- Convert packed 32-bit integers in a to packed 8-bit integers with truncation, and store the results in dst.
- _mm256_cvtepi32_epi16 ⚠ Experimental avx512f,avx512vl
- Convert packed 32-bit integers in a to packed 16-bit integers with truncation, and store the results in dst.
- _mm256_cvtepi64_epi8 ⚠ Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 8-bit integers with truncation, and store the results in dst.
- _mm256_cvtepi64_epi16 ⚠ Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 16-bit integers with truncation, and store the results in dst.
- _mm256_cvtepi64_epi32 ⚠ Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 32-bit integers with truncation, and store the results in dst.
- _mm256_cvtepu32_pd ⚠ Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.
- _mm256_cvtpd_epu32 ⚠ Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst.
- _mm256_cvtps_epu32 ⚠ Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst.
- _mm256_cvtsepi32_epi8 ⚠ Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst.
- _mm256_cvtsepi32_epi16 ⚠ Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst.
- _mm256_cvtsepi64_epi8 ⚠ Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst.
- _mm256_cvtsepi64_epi16 ⚠ Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst.
- _mm256_cvtsepi64_epi32 ⚠ Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 32-bit integers with signed saturation, and store the results in dst.
- _mm256_cvttpd_epu32 ⚠ Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst.
- _mm256_cvttps_epu32 ⚠ Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst.
- _mm256_cvtusepi32_epi8 ⚠ Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst.
- _mm256_cvtusepi32_epi16 ⚠ Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst.
- _mm256_cvtusepi64_epi8 ⚠ Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst.
- _mm256_cvtusepi64_epi16 ⚠ Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst.
- _mm256_cvtusepi64_epi32 ⚠ Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed unsigned 32-bit integers with unsigned saturation, and store the results in dst.
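The three narrowing families above (`cvtepi*` with truncation, `cvtsepi*` with signed saturation, `cvtusepi*` with unsigned saturation) differ only in what happens to out-of-range values. A per-lane sketch for the 32-bit to 8-bit case, using hypothetical helper names:

```rust
/// One lane of a truncating conversion: keep the low 8 bits, discard the rest.
fn truncate_lane(x: i32) -> i8 {
    x as i8
}

/// One lane of a signed-saturating conversion: clamp into i8's range.
fn signed_saturate_lane(x: i32) -> i8 {
    x.clamp(i8::MIN as i32, i8::MAX as i32) as i8
}

/// One lane of an unsigned-saturating conversion: clamp into u8's range.
fn unsigned_saturate_lane(x: u32) -> u8 {
    x.min(u8::MAX as u32) as u8
}
```

For example, 300 truncates to 44 (its low byte, 0x2C) but saturates to 127 as signed and to 255 as unsigned.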
- _mm256_extractf32x4_ps ⚠ Experimental avx512f,avx512vl
- Extract 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a, selected with imm8, and store the result in dst.
- _mm256_extracti32x4_epi32 ⚠ Experimental avx512f,avx512vl
- Extract 128 bits (composed of 4 packed 32-bit integers) from a, selected with IMM1, and store the result in dst.
- _mm256_fixupimm_pd ⚠ Experimental avx512f,avx512vl
- Fix up packed double-precision (64-bit) floating-point elements in a and b using packed 64-bit integers in c, and store the results in dst. imm8 is used to set the required flags reporting.
- _mm256_fixupimm_ps ⚠ Experimental avx512f,avx512vl
- Fix up packed single-precision (32-bit) floating-point elements in a and b using packed 32-bit integers in c, and store the results in dst. imm8 is used to set the required flags reporting.
- _mm256_getexp_pd ⚠ Experimental avx512f,avx512vl
- Convert the exponent of each packed double-precision (64-bit) floating-point element in a to a double-precision (64-bit) floating-point number representing the integer exponent, and store the results in dst. This intrinsic essentially calculates floor(log2(x)) for each element.
- _mm256_getexp_ps ⚠ Experimental avx512f,avx512vl
- Convert the exponent of each packed single-precision (32-bit) floating-point element in a to a single-precision (32-bit) floating-point number representing the integer exponent, and store the results in dst. This intrinsic essentially calculates floor(log2(x)) for each element.
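The floor(log2(x)) reading of `getexp` can be checked per lane by pulling the exponent field out of the IEEE 754 bit pattern. This is a minimal sketch that assumes a normal, finite, nonzero `f32` input; zeros, denormals, infinities and NaNs are handled specially by the real instruction and are not modeled. `getexp_ps_lane` is a hypothetical name.

```rust
/// Scalar model of one `getexp` lane for normal, finite, nonzero f32 inputs.
fn getexp_ps_lane(x: f32) -> f32 {
    // Bits 23..=30 hold the exponent, biased by 127; the sign bit is ignored.
    let biased = ((x.to_bits() >> 23) & 0xFF) as i32;
    (biased - 127) as f32
}
```

Because the sign bit is masked off, the result depends only on the magnitude of the input.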
- _mm256_getmant_pd ⚠ Experimental avx512f,avx512vl
- Normalize the mantissas of packed double-precision (64-bit) floating-point elements in a, and store the results in dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_p5_2 // interval [0.5, 2)
_MM_MANT_NORM_p5_1 // interval [0.5, 1)
_MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_src // sign = sign(src)
_MM_MANT_SIGN_zero // sign = 0
_MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1
- _mm256_getmant_ps ⚠ Experimental avx512f,avx512vl
- Normalize the mantissas of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_p5_2 // interval [0.5, 2)
_MM_MANT_NORM_p5_1 // interval [0.5, 1)
_MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_src // sign = sign(src)
_MM_MANT_SIGN_zero // sign = 0
_MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1
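Mantissa normalization amounts to rewriting the exponent field while keeping the fraction bits. The sketch below covers only two of the four intervals, [1, 2) and [0.5, 1), with the sign taken from the source, and assumes a normal, finite, nonzero `f32` input. The [0.5, 2) and [0.75, 1.5) intervals and the other sign modes need extra case analysis and are left out. `getmant_ps_lane` and its `half_to_one` flag are hypothetical.

```rust
/// Scalar model of one `getmant` lane for normal, finite, nonzero f32 inputs.
/// `half_to_one == false` targets [1, 2); `true` targets [0.5, 1).
fn getmant_ps_lane(x: f32, half_to_one: bool) -> f32 {
    let bits = x.to_bits();
    // Keep the sign bit and the 23 fraction bits, clear the exponent field.
    let sign_and_fraction = bits & 0x807F_FFFF;
    // Biased exponent 127 encodes 2^0, 126 encodes 2^-1.
    let exponent: u32 = if half_to_one { 126 } else { 127 };
    f32::from_bits(sign_and_fraction | (exponent << 23))
}
```

For 10.0, which is 1.25 × 2^3, the [1, 2) interval gives 1.25 and the [0.5, 1) interval gives 0.625.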
- _mm256_i32scatter_epi32 ⚠ Experimental avx512f,avx512vl
- Stores 8 32-bit integer elements from a to memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale.
- _mm256_
i32scatter_ epi64 Experimental avx512f,avx512vl
- Scatter 64-bit integers from a into memory using 32-bit indices. 64-bit elements are stored at addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). scale should be 1, 2, 4 or 8.
- _mm256_
i32scatter_ pd Experimental avx512f,avx512vl
- Store 4 double-precision (64-bit) floating-point elements from a to memory starting at location base_addr, at the packed 32-bit integer indices stored in vindex, each scaled by scale.
- _mm256_
i32scatter_ ps Experimental avx512f,avx512vl
- Store 8 single-precision (32-bit) floating-point elements from a to memory starting at location base_addr, at the packed 32-bit integer indices stored in vindex, each scaled by scale.
- _mm256_
i64scatter_ epi32 Experimental avx512f,avx512vl
- Store 4 32-bit integer elements from a to memory starting at location base_addr, at the packed 64-bit integer indices stored in vindex, each scaled by scale.
- _mm256_
i64scatter_ epi64 Experimental avx512f,avx512vl
- Store 4 64-bit integer elements from a to memory starting at location base_addr, at the packed 64-bit integer indices stored in vindex, each scaled by scale.
- _mm256_
i64scatter_ pd Experimental avx512f,avx512vl
- Store 4 double-precision (64-bit) floating-point elements from a to memory starting at location base_addr, at the packed 64-bit integer indices stored in vindex, each scaled by scale.
- _mm256_
i64scatter_ ps Experimental avx512f,avx512vl
- Store 4 single-precision (32-bit) floating-point elements from a to memory starting at location base_addr, at the packed 64-bit integer indices stored in vindex, each scaled by scale.
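The scatter entries above all share one addressing rule: element i is written at byte offset vindex[i] * scale from base_addr. A minimal scalar sketch of that rule for eight 32-bit integers follows. The name `scatter_i32`, the byte-slice stand-in for memory, and the little-endian byte order are assumptions made for illustration, and indices are assumed to be non-negative.

```rust
/// Scalar model of a 32-bit-index scatter of eight `i32` elements.
/// Hypothetical helper: `base` stands in for the memory at `base_addr`.
fn scatter_i32(base: &mut [u8], vindex: [i32; 8], a: [i32; 8], scale: usize) {
    // The scale is a byte multiplier and should be 1, 2, 4 or 8.
    assert!(matches!(scale, 1 | 2 | 4 | 8));
    for i in 0..8 {
        // Assumes non-negative indices; the byte offset is index * scale.
        let offset = vindex[i] as usize * scale;
        base[offset..offset + 4].copy_from_slice(&a[i].to_le_bytes());
    }
}
```

With `scale = 4` the indices behave like ordinary `i32` array indices; bytes that no index points at are left untouched.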
- _mm256_
insertf32x4 Experimental avx512f,avx512vl
- Copy a to dst, then insert 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from b into dst at the location specified by imm8.
- _mm256_
inserti32x4 Experimental avx512f,avx512vl
- Copy a to dst, then insert 128 bits (composed of 4 packed 32-bit integers) from b into dst at the location specified by imm8.
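The two insert entries above can be pictured as overwriting one 128-bit half of a 256-bit vector. The sketch below models this on plain arrays; `insert_lane` is a hypothetical name, and the assumption is that only the lowest bit of imm8 selects the half (0 for the low four elements, 1 for the high four).

```rust
/// Scalar model of inserting four `f32` elements into one half of eight.
/// Hypothetical helper; assumes bit 0 of `imm8` selects the 128-bit half.
fn insert_lane(a: [f32; 8], b: [f32; 4], imm8: u8) -> [f32; 8] {
    let mut dst = a; // start from a copy of `a`
    let start = ((imm8 & 1) as usize) * 4; // 0 => low half, 1 => high half
    dst[start..start + 4].copy_from_slice(&b);
    dst
}
```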
- _mm256_
load_ epi32 Experimental avx512f,avx512vl
- Load 256-bits (composed of 8 packed 32-bit integers) from memory into dst. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_
load_ epi64 Experimental avx512f,avx512vl
- Load 256-bits (composed of 4 packed 64-bit integers) from memory into dst. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_
loadu_ epi32 Experimental avx512f,avx512vl
- Load 256-bits (composed of 8 packed 32-bit integers) from memory into dst. mem_addr does not need to be aligned on any particular boundary.
- _mm256_
loadu_ epi64 Experimental avx512f,avx512vl
- Load 256-bits (composed of 4 packed 64-bit integers) from memory into dst. mem_addr does not need to be aligned on any particular boundary.
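The aligned load entries above require a 32-byte-aligned address, while the loadu variants do not. One way to obtain such an address in Rust is a wrapper type with an explicit alignment attribute. The sketch below is only an illustration of that idea; `Aligned32` and `is_aligned_32` are hypothetical names, not part of `core::arch`.

```rust
/// Wrapper that forces 32-byte alignment on eight `i32` values.
/// Hypothetical type for illustration.
#[repr(C, align(32))]
struct Aligned32([i32; 8]);

/// Returns true when the pointer sits on a 32-byte boundary.
fn is_aligned_32<T>(p: *const T) -> bool {
    (p as usize) % 32 == 0
}
```

A pointer taken from `Aligned32(..).0.as_ptr()` satisfies the check, whereas a pointer into an arbitrary `[i32]` slice may or may not.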
- _mm256_
mask2_ permutex2var_ epi32 Experimental avx512f,avx512vl
- Shuffle 32-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set).
- _mm256_
mask2_ permutex2var_ epi64 Experimental avx512f,avx512vl
- Shuffle 64-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set).
- _mm256_
mask2_ permutex2var_ pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set).
- _mm256_
mask2_ permutex2var_ ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set).
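The "selector and index in idx" wording above can be sketched for eight 32-bit elements as follows. The name `mask2_permutex2var_i32` and the argument order are hypothetical, and the sketch assumes that bit 3 of each idx element selects the source (a or b) and bits 2:0 select the element within it.

```rust
/// Scalar model of a two-source permute with a writemask, where unselected
/// lanes keep the value from `idx`. Hypothetical helper for illustration.
fn mask2_permutex2var_i32(a: [i32; 8], idx: [i32; 8], k: u8, b: [i32; 8]) -> [i32; 8] {
    let mut dst = idx; // lanes with a clear mask bit are copied from idx
    for i in 0..8 {
        if ((k >> i) & 1) == 1 {
            let sel = idx[i] as usize;
            // Assumed layout: bit 3 picks the source, bits 2:0 pick the element.
            let src = if (sel & 0b1000) != 0 { &b } else { &a };
            dst[i] = src[sel & 0b0111];
        }
    }
    dst
}
```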
- _mm256_
mask3_ fmadd_ pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm256_
mask3_ fmadd_ ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm256_
mask3_ fmaddsub_ pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm256_
mask3_ fmaddsub_ ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm256_
mask3_ fmsub_ pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm256_
mask3_ fmsub_ ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm256_
mask3_ fmsubadd_ pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm256_
mask3_ fmsubadd_ ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm256_
mask3_ fnmadd_ pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm256_
mask3_ fnmadd_ ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm256_
mask3_ fnmsub_ pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm256_
mask3_ fnmsub_ ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
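The mask3 entries above differ from other masked forms in that unselected lanes are copied from c rather than from a separate src. A scalar sketch of two of them on four `f64` lanes follows. The function names are hypothetical, and the fmaddsub variant assumes that even-numbered lanes subtract c while odd-numbered lanes add it.

```rust
/// Scalar model of a mask3 fused multiply-add: a * b + c on selected lanes,
/// c passed through elsewhere. Hypothetical helper for illustration.
fn mask3_fmadd(a: [f64; 4], b: [f64; 4], c: [f64; 4], k: u8) -> [f64; 4] {
    let mut dst = c;
    for i in 0..4 {
        if ((k >> i) & 1) == 1 {
            dst[i] = a[i].mul_add(b[i], c[i]); // single rounding, like an FMA
        }
    }
    dst
}

/// Scalar model of a mask3 fmaddsub. Assumption: even lanes compute
/// a * b - c and odd lanes compute a * b + c.
fn mask3_fmaddsub(a: [f64; 4], b: [f64; 4], c: [f64; 4], k: u8) -> [f64; 4] {
    let mut dst = c;
    for i in 0..4 {
        if ((k >> i) & 1) == 1 {
            let addend = if i % 2 == 0 { -c[i] } else { c[i] };
            dst[i] = a[i].mul_add(b[i], addend);
        }
    }
    dst
}
```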
- _mm256_
mask_ abs_ epi32 Experimental avx512f,avx512vl
- Compute the absolute value of packed signed 32-bit integers in a, and store the unsigned results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ abs_ epi64 Experimental avx512f,avx512vl
- Compute the absolute value of packed signed 64-bit integers in a, and store the unsigned results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ add_ epi32 Experimental avx512f,avx512vl
- Add packed 32-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ add_ epi64 Experimental avx512f,avx512vl
- Add packed 64-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ add_ pd Experimental avx512f,avx512vl
- Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ add_ ps Experimental avx512f,avx512vl
- Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
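Most of the mask_ entries in this list follow the same writemask pattern: compute the result per lane, keep it where the mask bit is set, and copy from src elsewhere. A scalar sketch for 32-bit integer addition is shown below. The name `mask_add_i32` is hypothetical, and wrapping arithmetic is used on the assumption that packed integer addition wraps on overflow.

```rust
/// Scalar model of a writemasked packed add of eight `i32` lanes.
/// Hypothetical helper for illustration.
fn mask_add_i32(src: [i32; 8], k: u8, a: [i32; 8], b: [i32; 8]) -> [i32; 8] {
    let mut dst = src; // lanes with a clear mask bit come from src
    for i in 0..8 {
        if ((k >> i) & 1) == 1 {
            dst[i] = a[i].wrapping_add(b[i]); // assumed wrap-around on overflow
        }
    }
    dst
}
```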
- _mm256_
mask_ alignr_ epi32 Experimental avx512f,avx512vl
- Concatenate a and b into a 64-byte intermediate result, shift the result right by imm8 32-bit elements, and store the low 32 bytes (8 elements) in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ alignr_ epi64 Experimental avx512f,avx512vl
- Concatenate a and b into a 64-byte intermediate result, shift the result right by imm8 64-bit elements, and store the low 32 bytes (4 elements) in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
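The concatenate-and-shift step in the alignr entries above can be sketched on plain arrays. The sketch shows only the unmasked core for 32-bit elements; `alignr_i32` is a hypothetical name, and it assumes that b occupies the low half of the concatenation, a the high half, and that only the low three bits of imm8 are used.

```rust
/// Scalar model of the unmasked core of a 32-bit element "alignr".
/// Hypothetical helper; assumes b is the low half of the concatenation.
fn alignr_i32(a: [i32; 8], b: [i32; 8], imm8: u8) -> [i32; 8] {
    let mut concat = [0i32; 16];
    concat[..8].copy_from_slice(&b); // low 32 bytes
    concat[8..].copy_from_slice(&a); // high 32 bytes
    let shift = (imm8 & 0b111) as usize; // shift count in elements
    let mut dst = [0i32; 8];
    dst.copy_from_slice(&concat[shift..shift + 8]);
    dst
}
```

A writemasked variant would then keep these values only in lanes whose mask bit is set and copy the rest from src.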
- _mm256_
mask_ and_ epi32 Experimental avx512f,avx512vl
- Performs element-by-element bitwise AND between packed 32-bit integer elements of a and b, storing the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ and_ epi64 Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 64-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ andnot_ epi32 Experimental avx512f,avx512vl
- Compute the bitwise NOT of packed 32-bit integers in a and then AND with b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ andnot_ epi64 Experimental avx512f,avx512vl
- Compute the bitwise NOT of packed 64-bit integers in a and then AND with b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ blend_ epi32 Experimental avx512f,avx512vl
- Blend packed 32-bit integers from a and b using control mask k, and store the results in dst.
- _mm256_
mask_ blend_ epi64 Experimental avx512f,avx512vl
- Blend packed 64-bit integers from a and b using control mask k, and store the results in dst.
- _mm256_
mask_ blend_ pd Experimental avx512f,avx512vl
- Blend packed double-precision (64-bit) floating-point elements from a and b using control mask k, and store the results in dst.
- _mm256_
mask_ blend_ ps Experimental avx512f,avx512vl
- Blend packed single-precision (32-bit) floating-point elements from a and b using control mask k, and store the results in dst.
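The blend entries above use k as a per-lane selector rather than as a writemask over a computed result. A scalar sketch on four `f64` lanes follows; `mask_blend` is a hypothetical name, and the sketch assumes that a set bit selects the element from b and a clear bit selects the element from a.

```rust
/// Scalar model of a mask-controlled blend of four `f64` lanes.
/// Hypothetical helper; assumes a set bit picks `b`, a clear bit picks `a`.
fn mask_blend(k: u8, a: [f64; 4], b: [f64; 4]) -> [f64; 4] {
    let mut dst = a;
    for i in 0..4 {
        if ((k >> i) & 1) == 1 {
            dst[i] = b[i];
        }
    }
    dst
}
```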
- _mm256_
mask_ broadcast_ f32x4 Experimental avx512f,avx512vl
- Broadcast the 4 packed single-precision (32-bit) floating-point elements from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ broadcast_ i32x4 Experimental avx512f,avx512vl
- Broadcast the 4 packed 32-bit integers from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ broadcastd_ epi32 Experimental avx512f,avx512vl
- Broadcast the low packed 32-bit integer from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ broadcastq_ epi64 Experimental avx512f,avx512vl
- Broadcast the low packed 64-bit integer from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ broadcastsd_ pd Experimental avx512f,avx512vl
- Broadcast the low double-precision (64-bit) floating-point element from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ broadcastss_ ps Experimental avx512f,avx512vl
- Broadcast the low single-precision (32-bit) floating-point element from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cmp_ epi32_ mask Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmp_ epi64_ mask Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmp_ epu32_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmp_ epu64_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
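The integer compare entries above take a predicate in imm8 and an input zeromask k1. The sketch below models that for eight signed 32-bit lanes. The name `mask_cmp_i32` is hypothetical, and the numeric predicate encodings in the `match` (0 = EQ, 1 = LT, 2 = LE, 3 = FALSE, 4 = NE, 5 = NLT, 6 = NLE, 7 = TRUE) are stated here as an assumption about the `_MM_CMPINT_*` constants rather than taken from this listing.

```rust
/// Scalar model of a zeromasked integer compare producing a bitmask.
/// Hypothetical helper; predicate encodings are an assumption (see above).
fn mask_cmp_i32(k1: u8, a: [i32; 8], b: [i32; 8], imm8: u8) -> u8 {
    let mut k = 0u8;
    for i in 0..8 {
        let hit = match imm8 & 0b111 {
            0 => a[i] == b[i], // EQ
            1 => a[i] < b[i],  // LT
            2 => a[i] <= b[i], // LE
            3 => false,        // FALSE
            4 => a[i] != b[i], // NE
            5 => a[i] >= b[i], // NLT
            6 => a[i] > b[i],  // NLE
            _ => true,         // TRUE
        };
        // Zeromask: a result bit can only be set where k1 has its bit set.
        if hit && ((k1 >> i) & 1) == 1 {
            k |= 1 << i;
        }
    }
    k
}
```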
- _mm256_
mask_ cmp_ pd_ mask Experimental avx512f,avx512vl
- Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmp_ ps_ mask Experimental avx512f,avx512vl
- Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmpeq_ epi32_ mask Experimental avx512f,avx512vl
- Compare packed 32-bit integers in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmpeq_ epi64_ mask Experimental avx512f,avx512vl
- Compare packed 64-bit integers in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmpeq_ epu32_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmpeq_ epu64_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmpge_ epi32_ mask Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmpge_ epi64_ mask Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmpge_ epu32_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmpge_ epu64_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmpgt_ epi32_ mask Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b for greater-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmpgt_ epi64_ mask Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for greater-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmpgt_ epu32_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for greater-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmpgt_ epu64_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for greater-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmple_ epi32_ mask Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b for less-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmple_ epi64_ mask Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for less-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmple_ epu32_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for less-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmple_ epu64_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for less-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmplt_ epi32_ mask Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b for less-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmplt_ epi64_ mask Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for less-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmplt_ epu32_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for less-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmplt_ epu64_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for less-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmpneq_ epi32_ mask Experimental avx512f,avx512vl
- Compare packed 32-bit integers in a and b for not-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmpneq_ epi64_ mask Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for not-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmpneq_ epu32_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for not-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ cmpneq_ epu64_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for not-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
mask_ compress_ epi32 Experimental avx512f,avx512vl
- Contiguously store the active 32-bit integers in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src.
- _mm256_
mask_ compress_ epi64 Experimental avx512f,avx512vl
- Contiguously store the active 64-bit integers in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src.
- _mm256_
mask_ compress_ pd Experimental avx512f,avx512vl
- Contiguously store the active double-precision (64-bit) floating-point elements in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src.
- _mm256_
mask_ compress_ ps Experimental avx512f,avx512vl
- Contiguously store the active single-precision (32-bit) floating-point elements in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src.
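The compress entries above pack the active elements toward the low end of dst and fill the rest from src. A scalar sketch for eight 32-bit integers follows; `mask_compress_i32` is a hypothetical name, and the sketch assumes the pass-through elements keep their own positions from src.

```rust
/// Scalar model of a masked compress of eight `i32` lanes.
/// Hypothetical helper for illustration.
fn mask_compress_i32(src: [i32; 8], k: u8, a: [i32; 8]) -> [i32; 8] {
    let mut dst = src; // positions past the packed run keep src values
    let mut next = 0; // next free slot at the low end of dst
    for i in 0..8 {
        if ((k >> i) & 1) == 1 {
            dst[next] = a[i];
            next += 1;
        }
    }
    dst
}
```

The compressstoreu variants would write only the packed run to memory, so the number of elements stored equals the number of set mask bits.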
- _mm256_
mask_ compressstoreu_ epi32 Experimental avx512f,avx512vl
- Contiguously store the active 32-bit integers in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm256_
mask_ compressstoreu_ epi64 Experimental avx512f,avx512vl
- Contiguously store the active 64-bit integers in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm256_
mask_ compressstoreu_ pd Experimental avx512f,avx512vl
- Contiguously store the active double-precision (64-bit) floating-point elements in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm256_
mask_ compressstoreu_ ps Experimental avx512f,avx512vl
- Contiguously store the active single-precision (32-bit) floating-point elements in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm256_
mask_ cvt_ roundps_ ph Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed half-precision (16-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm256_
mask_ cvtepi8_ epi32 Experimental avx512f,avx512vl
- Sign extend packed 8-bit integers in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtepi8_ epi64 Experimental avx512f,avx512vl
- Sign extend packed 8-bit integers in the low 4 bytes of a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtepi16_ epi32 Experimental avx512f,avx512vl
- Sign extend packed 16-bit integers in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtepi16_ epi64 Experimental avx512f,avx512vl
- Sign extend packed 16-bit integers in a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtepi32_ epi8 Experimental avx512f,avx512vl
- Convert packed 32-bit integers in a to packed 8-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtepi32_ epi16 Experimental avx512f,avx512vl
- Convert packed 32-bit integers in a to packed 16-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtepi32_ epi64 Experimental avx512f,avx512vl
- Sign extend packed 32-bit integers in a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtepi32_ pd Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtepi32_ ps Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtepi32_ storeu_ epi8 Experimental avx512f,avx512vl
- Convert packed 32-bit integers in a to packed 8-bit integers with truncation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm256_
mask_ cvtepi32_ storeu_ epi16 Experimental avx512f,avx512vl
- Convert packed 32-bit integers in a to packed 16-bit integers with truncation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm256_
mask_ cvtepi64_ epi8 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 8-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtepi64_ epi16 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 16-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtepi64_ epi32 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtepi64_ storeu_ epi8 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 8-bit integers with truncation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm256_
mask_ cvtepi64_ storeu_ epi16 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 16-bit integers with truncation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm256_
mask_ cvtepi64_ storeu_ epi32 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 32-bit integers with truncation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm256_
mask_ cvtepu8_ epi32 Experimental avx512f,avx512vl
- Zero extend packed unsigned 8-bit integers in the low 8 bytes of a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtepu8_ epi64 Experimental avx512f,avx512vl
- Zero extend packed unsigned 8-bit integers in the low 4 bytes of a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtepu16_ epi32 Experimental avx512f,avx512vl
- Zero extend packed unsigned 16-bit integers in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtepu16_ epi64 Experimental avx512f,avx512vl
- Zero extend packed unsigned 16-bit integers in the low 8 bytes of a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtepu32_ epi64 Experimental avx512f,avx512vl
- Zero extend packed unsigned 32-bit integers in a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
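The cvtepi and cvtepu widening entries above differ only in how the upper bits are filled: sign extension replicates the top bit of the narrow value, zero extension fills with zeros. A scalar sketch for the 8-bit to 32-bit case follows, with hypothetical function names; masking is left out to keep the contrast visible.

```rust
/// Scalar model of sign extension from 8 to 32 bits.
/// Hypothetical helper for illustration.
fn sign_extend_8_to_32(bytes: [u8; 8]) -> [i32; 8] {
    // Reinterpret each byte as signed, then widen.
    bytes.map(|x| x as i8 as i32)
}

/// Scalar model of zero extension from 8 to 32 bits.
/// Hypothetical helper for illustration.
fn zero_extend_8_to_32(bytes: [u8; 8]) -> [i32; 8] {
    // Widen the unsigned byte directly; the upper bits become zero.
    bytes.map(|x| x as i32)
}
```

The byte `0xFF` therefore becomes `-1` under sign extension and `255` under zero extension.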
- _mm256_
mask_ cvtepu32_ pd Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtpd_ epi32 Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtpd_ epu32 Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtpd_ ps Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtph_ ps Experimental avx512f,avx512vl
- Convert packed half-precision (16-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtps_ epi32 Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtps_ epu32 Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mask_ cvtps_ ph Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed half-precision (16-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm256_
mask_ โcvtsepi32_ epi8 Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
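The combination of signed saturation and a writemask described above can be sketched in plain Rust. This is a scalar model, not the intrinsic itself: the function name is hypothetical, arrays stand in for the vector registers, and only the eight result lanes are modelled.

```rust
/// Scalar model of a merge-masked, signed-saturating i32 -> i8 narrowing.
/// `src`, `k`, and `a` mirror the parameter names used in the description;
/// the function name is hypothetical.
fn mask_cvtsepi32_epi8_model(src: [i8; 8], k: u8, a: [i32; 8]) -> [i8; 8] {
    // Lanes whose mask bit is clear keep the value from `src`.
    let mut dst = src;
    for j in 0..8 {
        if (k >> j) & 1 == 1 {
            // Signed saturation: clamp into the i8 range before narrowing.
            dst[j] = a[j].clamp(i8::MIN as i32, i8::MAX as i32) as i8;
        }
    }
    dst
}
```

With mask 0b1010_1111, lanes 4 and 6 keep their src values, while out-of-range inputs such as 300 and -300 saturate to 127 and -128.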
- _mm256_mask_cvtsepi32_epi16 Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_cvtsepi32_storeu_epi8 Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed 8-bit integers with signed saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm256_mask_cvtsepi32_storeu_epi16 Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed 16-bit integers with signed saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
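The storeu variants write to memory instead of returning a vector, and only active lanes are written. A rough scalar sketch follows, assuming that lane j is stored at element offset j from base_addr (the results are not compacted); the function name is hypothetical and a fixed-size array stands in for the memory at base_addr.

```rust
/// Scalar model of a masked, signed-saturating i32 -> i16 store.
/// Assumption: lane `j` lands at element offset `j`; inactive lanes
/// leave the destination memory untouched.
fn mask_cvtsepi32_storeu_epi16_model(base: &mut [i16; 8], k: u8, a: [i32; 8]) {
    for j in 0..8 {
        if (k >> j) & 1 == 1 {
            // Saturate into the i16 range, then store the active result.
            base[j] = a[j].clamp(i16::MIN as i32, i16::MAX as i32) as i16;
        }
    }
}
```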
- _mm256_mask_cvtsepi64_epi8 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_cvtsepi64_epi16 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_cvtsepi64_epi32 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 32-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_cvtsepi64_storeu_epi8 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 8-bit integers with signed saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm256_mask_cvtsepi64_storeu_epi16 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 16-bit integers with signed saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm256_mask_cvtsepi64_storeu_epi32 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 32-bit integers with signed saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm256_mask_cvttpd_epi32 Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_cvttpd_epu32 Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_cvttps_epi32 Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
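Truncation here means rounding toward zero regardless of the current rounding mode. The sketch below models one lane and then the masked vector form. It assumes that NaN and out-of-range inputs produce the "integer indefinite" value 0x80000000 (i32::MIN), which is not stated in the description above; the function names are hypothetical.

```rust
/// One lane of a truncating f32 -> i32 conversion.
/// Assumption: NaN and out-of-range inputs yield i32::MIN.
fn cvtt_lane(x: f32) -> i32 {
    if x.is_nan() || x >= 2147483648.0_f32 || x < -2147483648.0_f32 {
        i32::MIN
    } else {
        // For in-range values, Rust's `as` cast truncates toward zero.
        x as i32
    }
}

/// Masked vector form: inactive lanes are copied from `src`.
fn mask_cvttps_epi32_model(src: [i32; 8], k: u8, a: [f32; 8]) -> [i32; 8] {
    let mut dst = src;
    for j in 0..8 {
        if (k >> j) & 1 == 1 {
            dst[j] = cvtt_lane(a[j]);
        }
    }
    dst
}
```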
- _mm256_mask_cvttps_epu32 Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_cvtusepi32_epi8 Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_cvtusepi32_epi16 Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_cvtusepi32_storeu_epi8 Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed 8-bit integers with unsigned saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm256_mask_cvtusepi32_storeu_epi16 Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm256_mask_cvtusepi64_epi8 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_cvtusepi64_epi16 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_cvtusepi64_epi32 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed unsigned 32-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_cvtusepi64_storeu_epi8 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed 8-bit integers with unsigned saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm256_mask_cvtusepi64_storeu_epi16 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed 16-bit integers with unsigned saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm256_mask_cvtusepi64_storeu_epi32 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed 32-bit integers with unsigned saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm256_mask_div_pd Experimental avx512f,avx512vl
- Divide packed double-precision (64-bit) floating-point elements in a by packed elements in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_div_ps Experimental avx512f,avx512vl
- Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_expand_epi32 Experimental avx512f,avx512vl
- Load contiguous active 32-bit integers from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_expand_epi64 Experimental avx512f,avx512vl
- Load contiguous active 64-bit integers from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_expand_pd Experimental avx512f,avx512vl
- Load contiguous active double-precision (64-bit) floating-point elements from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_expand_ps Experimental avx512f,avx512vl
- Load contiguous active single-precision (32-bit) floating-point elements from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
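The expand operations above are easy to misread, so here is a scalar sketch of the 32-bit integer case. It assumes the usual reading: elements are taken from the low end of a in order, and each one is placed into the next lane whose mask bit is set. The function name is hypothetical.

```rust
/// Scalar model of a merge-masked expand on eight 32-bit lanes.
/// Consecutive elements of `a` (starting at lane 0) are written to the
/// lanes selected by `k`; the remaining lanes are copied from `src`.
fn mask_expand_epi32_model(src: [i32; 8], k: u8, a: [i32; 8]) -> [i32; 8] {
    let mut dst = src;
    let mut m = 0; // index of the next contiguous element of `a` to consume
    for j in 0..8 {
        if (k >> j) & 1 == 1 {
            dst[j] = a[m];
            m += 1;
        }
    }
    dst
}
```

For k = 0b1001_0010, the first three elements of a end up in lanes 1, 4, and 7.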
- _mm256_mask_expandloadu_epi32 Experimental avx512f,avx512vl
- Load contiguous active 32-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_expandloadu_epi64 Experimental avx512f,avx512vl
- Load contiguous active 64-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_expandloadu_pd Experimental avx512f,avx512vl
- Load contiguous active double-precision (64-bit) floating-point elements from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_expandloadu_ps Experimental avx512f,avx512vl
- Load contiguous active single-precision (32-bit) floating-point elements from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_extractf32x4_ps Experimental avx512f,avx512vl
- Extract 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a, selected with imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_extracti32x4_epi32 Experimental avx512f,avx512vl
- Extract 128 bits (composed of 4 packed 32-bit integers) from a, selected with IMM1, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_fixupimm_pd Experimental avx512f,avx512vl
- Fix up packed double-precision (64-bit) floating-point elements in a and b using packed 64-bit integers in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
- _mm256_mask_fixupimm_ps Experimental avx512f,avx512vl
- Fix up packed single-precision (32-bit) floating-point elements in a and b using packed 32-bit integers in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
- _mm256_mask_fmadd_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm256_mask_fmadd_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm256_mask_fmaddsub_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm256_mask_fmaddsub_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm256_mask_fmsub_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm256_mask_fmsub_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm256_mask_fmsubadd_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm256_mask_fmsubadd_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm256_mask_fnmadd_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm256_mask_fnmadd_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm256_mask_fnmsub_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm256_mask_fnmsub_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
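The fused multiply-add family above differs only in the signs applied to the product and to c, and all of the mask variants fall back to a rather than to a separate src. The following scalar sketch covers the double-precision forms. The enum and function names are hypothetical, and the lane parity used for the alternating forms (fmaddsub subtracts on even lanes and adds on odd lanes, fmsubadd does the opposite) is an assumption not spelled out in the descriptions.

```rust
/// Which member of the family to model (names are hypothetical).
#[derive(Clone, Copy)]
enum Fma { Madd, Msub, Nmadd, Nmsub, MaddSub, MsubAdd }

/// Scalar model of the merge-masked FMA family on four f64 lanes.
/// Inactive lanes are copied from `a`, as the descriptions state.
fn mask_fma_pd_model(op: Fma, a: [f64; 4], k: u8, b: [f64; 4], c: [f64; 4]) -> [f64; 4] {
    let mut dst = a;
    for j in 0..4 {
        if (k >> j) & 1 == 1 {
            let odd = j % 2 == 1;
            // `mul_add` computes x * y + z with a single rounding.
            dst[j] = match op {
                Fma::Madd => a[j].mul_add(b[j], c[j]),
                Fma::Msub => a[j].mul_add(b[j], -c[j]),
                Fma::Nmadd => (-a[j]).mul_add(b[j], c[j]),
                Fma::Nmsub => (-a[j]).mul_add(b[j], -c[j]),
                // Assumed parity: even lanes subtract, odd lanes add.
                Fma::MaddSub => a[j].mul_add(b[j], if odd { c[j] } else { -c[j] }),
                // Assumed parity: even lanes add, odd lanes subtract.
                Fma::MsubAdd => a[j].mul_add(b[j], if odd { -c[j] } else { c[j] }),
            };
        }
    }
    dst
}
```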
- _mm256_mask_getexp_pd Experimental avx512f,avx512vl
- Convert the exponent of each packed double-precision (64-bit) floating-point element in a to a double-precision (64-bit) floating-point number representing the integer exponent, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
- _mm256_mask_getexp_ps Experimental avx512f,avx512vl
- Convert the exponent of each packed single-precision (32-bit) floating-point element in a to a single-precision (32-bit) floating-point number representing the integer exponent, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
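To make the floor(log2(x)) remark concrete, the sketch below computes the unbiased exponent of an f64 from its bit pattern, ignoring the sign. The special-case results (zero gives negative infinity, infinity gives positive infinity, NaN stays NaN) are assumptions that are not stated in the description, and the function names are hypothetical.

```rust
/// One lane: floor(log2(|x|)) returned as a float.
/// Assumed special cases: 0 -> -inf, +/-inf -> +inf, NaN -> NaN.
fn getexp_lane(x: f64) -> f64 {
    if x.is_nan() {
        return f64::NAN;
    }
    if x.is_infinite() {
        return f64::INFINITY;
    }
    if x == 0.0 {
        return f64::NEG_INFINITY;
    }
    let bits = x.to_bits();
    let biased = ((bits >> 52) & 0x7ff) as i64;
    if biased != 0 {
        // Normal number: remove the exponent bias.
        (biased - 1023) as f64
    } else {
        // Subnormal: value = frac * 2^-1074, so use the highest set bit.
        let frac = bits & ((1u64 << 52) - 1);
        let top = 63 - frac.leading_zeros() as i64;
        (top - 1074) as f64
    }
}

/// Masked vector form on four f64 lanes; inactive lanes come from `src`.
fn mask_getexp_pd_model(src: [f64; 4], k: u8, a: [f64; 4]) -> [f64; 4] {
    let mut dst = src;
    for j in 0..4 {
        if (k >> j) & 1 == 1 {
            dst[j] = getexp_lane(a[j]);
        }
    }
    dst
}
```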
- _mm256_mask_getmant_pd Experimental avx512f,avx512vl
- Normalize the mantissas of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc, which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
- _mm256_mask_getmant_ps Experimental avx512f,avx512vl
- Normalize the mantissas of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc, which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
- _mm256_mask_i32scatter_epi32 Experimental avx512f,avx512vl
- Stores 8 32-bit integer elements from a to memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale using writemask k (elements whose corresponding mask bit is not set are not written to memory).
- _mm256_mask_i32scatter_epi64 Experimental avx512f,avx512vl
- Stores 4 64-bit integer elements from a to memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale using writemask k (elements whose corresponding mask bit is not set are not written to memory).
- _mm256_mask_i32scatter_pd Experimental avx512f,avx512vl
- Stores 4 double-precision (64-bit) floating-point elements from a to memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale using writemask k (elements whose corresponding mask bit is not set are not written to memory).
- _mm256_mask_i32scatter_ps Experimental avx512f,avx512vl
- Stores 8 single-precision (32-bit) floating-point elements from a to memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale using writemask k (elements whose corresponding mask bit is not set are not written to memory).
- _mm256_mask_i64scatter_epi32 Experimental avx512f,avx512vl
- Stores 4 32-bit integer elements from a to memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale using writemask k (elements whose corresponding mask bit is not set are not written to memory).
- _mm256_mask_i64scatter_epi64 Experimental avx512f,avx512vl
- Stores 4 64-bit integer elements from a to memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale using writemask k (elements whose corresponding mask bit is not set are not written to memory).
- _mm256_mask_i64scatter_pd Experimental avx512f,avx512vl
- Stores 4 double-precision (64-bit) floating-point elements from a to memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale using writemask k (elements whose corresponding mask bit is not set are not written to memory).
- _mm256_mask_i64scatter_ps Experimental avx512f,avx512vl
- Stores 4 single-precision (32-bit) floating-point elements from a to memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale using writemask k (elements whose corresponding mask bit is not set are not written to memory).
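The masked scatters above can be pictured as a loop of independent indexed stores. The sketch below models the 8-lane 32-bit integer case under two simplifying assumptions: scale equals the element size, so each entry of vindex counts whole elements, and every active index is non-negative and in bounds. A slice stands in for the memory at base_addr, and the function name is hypothetical.

```rust
/// Scalar model of a masked scatter of eight 32-bit integers.
/// Assumptions: `scale` equals the element size (indices count elements)
/// and all active indices are non-negative and within `base`.
fn mask_i32scatter_epi32_model(base: &mut [i32], k: u8, vindex: [i32; 8], a: [i32; 8]) {
    for j in 0..8 {
        if (k >> j) & 1 == 1 {
            // Only lanes with their mask bit set are written to memory.
            base[vindex[j] as usize] = a[j];
        }
    }
}
```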
- _mm256_mask_insertf32x4 Experimental avx512f,avx512vl
- Copy a to tmp, then insert 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from b into tmp at the location specified by imm8. Store tmp to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_inserti32x4 Experimental avx512f,avx512vl
- Copy a to tmp, then insert 128 bits (composed of 4 packed 32-bit integers) from b into tmp at the location specified by imm8. Store tmp to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
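The two-step description above (build tmp, then blend with src under the mask) maps directly onto a scalar sketch. It assumes that bit 0 of imm8 selects which 128-bit half of the 256-bit value is replaced; the function name is hypothetical.

```rust
/// Scalar model of a merge-masked 128-bit insert into a 256-bit vector.
/// Assumption: bit 0 of `imm8` picks the low (0) or high (1) half.
fn mask_inserti32x4_model(src: [i32; 8], k: u8, a: [i32; 8], b: [i32; 4], imm8: u8) -> [i32; 8] {
    // Step 1: copy a to tmp and overwrite the selected 128-bit half with b.
    let mut tmp = a;
    let half = (imm8 & 1) as usize;
    tmp[half * 4..half * 4 + 4].copy_from_slice(&b);

    // Step 2: store tmp to dst under the writemask, falling back to src.
    let mut dst = src;
    for j in 0..8 {
        if (k >> j) & 1 == 1 {
            dst[j] = tmp[j];
        }
    }
    dst
}
```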
- _mm256_mask_load_epi32 Experimental avx512f,avx512vl
- Load packed 32-bit integers from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_mask_load_epi64 Experimental avx512f,avx512vl
- Load packed 64-bit integers from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_mask_load_pd Experimental avx512f,avx512vl
- Load packed double-precision (64-bit) floating-point elements from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_mask_load_ps Experimental avx512f,avx512vl
- Load packed single-precision (32-bit) floating-point elements from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_mask_loadu_epi32 Experimental avx512f,avx512vl
- Load packed 32-bit integers from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm256_mask_loadu_epi64 Experimental avx512f,avx512vl
- Load packed 64-bit integers from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm256_mask_loadu_pd Experimental avx512f,avx512vl
- Load packed double-precision (64-bit) floating-point elements from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm256_mask_loadu_ps Experimental avx512f,avx512vl
- Load packed single-precision (32-bit) floating-point elements from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
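The load and loadu entries above share the same masking rule and differ only in the 32-byte alignment requirement. The sketch below shows one way to check that requirement before choosing the aligned form, together with a scalar model of the masked load itself. The type and function names are hypothetical, and an array reference stands in for mem_addr.

```rust
/// A buffer type that forces 32-byte alignment (hypothetical helper).
#[repr(align(32))]
struct Aligned32([i32; 8]);

/// True when `p` satisfies the 32-byte alignment the load_* forms require.
fn is_aligned_32<T>(p: *const T) -> bool {
    (p as usize) % 32 == 0
}

/// Scalar model of a merge-masked load of eight 32-bit integers:
/// active lanes come from memory, inactive lanes are copied from `src`.
fn mask_loadu_epi32_model(src: [i32; 8], k: u8, mem: &[i32; 8]) -> [i32; 8] {
    let mut dst = src;
    for j in 0..8 {
        if (k >> j) & 1 == 1 {
            dst[j] = mem[j];
        }
    }
    dst
}
```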
- _mm256_mask_max_epi32 Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_max_epi64 Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_max_epu32 Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_max_epu64 Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_max_pd Experimental avx512f,avx512vl
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_max_ps Experimental avx512f,avx512vl
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_min_epi32 Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_min_epi64 Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_min_epu32 Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_min_epu64 Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_min_pd Experimental avx512f,avx512vl
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_min_ps Experimental avx512f,avx512vl
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_mov_epi32 Experimental avx512f,avx512vl
- Move packed 32-bit integers from a to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_mov_epi64 Experimental avx512f,avx512vl
- Move packed 64-bit integers from a to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_mov_pd Experimental avx512f,avx512vl
- Move packed double-precision (64-bit) floating-point elements from a to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_mov_ps Experimental avx512f,avx512vl
- Move packed single-precision (32-bit) floating-point elements from a to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_movedup_pd Experimental avx512f,avx512vl
- Duplicate even-indexed double-precision (64-bit) floating-point elements from a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_movehdup_ps Experimental avx512f,avx512vl
- Duplicate odd-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_moveldup_ps Experimental avx512f,avx512vl
- Duplicate even-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
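The duplicate operations above copy one element of each adjacent pair into both positions of that pair. A scalar sketch of the single-precision forms follows, where odd = false models the even-indexed duplicate (moveldup) and odd = true models the odd-indexed one (movehdup). The function name and the odd flag are hypothetical.

```rust
/// Scalar model of the merge-masked movehdup/moveldup forms on eight f32 lanes.
/// `odd = false` duplicates even-indexed elements, `odd = true` odd-indexed ones.
fn mask_dup_ps_model(src: [f32; 8], k: u8, a: [f32; 8], odd: bool) -> [f32; 8] {
    let mut dst = src;
    for j in 0..8 {
        if (k >> j) & 1 == 1 {
            // First index of the pair that lane j belongs to.
            let pair = j & !1;
            dst[j] = a[pair + odd as usize];
        }
    }
    dst
}
```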
- _mm256_mask_mul_epi32 Experimental avx512f,avx512vl
- Multiply the low signed 32-bit integers from each packed 64-bit element in a and b, and store the signed 64-bit results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_mul_epu32 Experimental avx512f,avx512vl
- Multiply the low unsigned 32-bit integers from each packed 64-bit element in a and b, and store the unsigned 64-bit results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
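The two widening multiplies above read only the low 32 bits of each 64-bit element and differ in whether those bits are sign-extended or zero-extended. The sketch below models one lane of each and the masked signed form on four lanes; all function names are hypothetical.

```rust
/// One lane of the signed form: sign-extend the low 32 bits, then multiply.
fn mul_epi32_lane(a: i64, b: i64) -> i64 {
    (a as i32 as i64) * (b as i32 as i64)
}

/// One lane of the unsigned form: zero-extend the low 32 bits, then multiply.
fn mul_epu32_lane(a: u64, b: u64) -> u64 {
    (a as u32 as u64) * (b as u32 as u64)
}

/// Masked signed form on four 64-bit lanes; inactive lanes come from `src`.
fn mask_mul_epi32_model(src: [i64; 4], k: u8, a: [i64; 4], b: [i64; 4]) -> [i64; 4] {
    let mut dst = src;
    for j in 0..4 {
        if (k >> j) & 1 == 1 {
            dst[j] = mul_epi32_lane(a[j], b[j]);
        }
    }
    dst
}
```

The same input bits give different products: 0x1_FFFF_FFFF has low bits that read as -1 when signed and as 4294967295 when unsigned.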
- _mm256_mask_mul_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_mul_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_mullo_epi32 Experimental avx512f,avx512vl
- Multiply the packed 32-bit integers in a and b, producing intermediate 64-bit integers, and store the low 32 bits of the intermediate integers in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_or_epi32 Experimental avx512f,avx512vl
- Compute the bitwise OR of packed 32-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_or_epi64 Experimental avx512f,avx512vl
- Compute the bitwise OR of packed 64-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_permute_pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_permute_ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
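"Within 128-bit lanes" means an element can only move among the four positions of its own lane. The sketch below models the single-precision imm8 form, assuming the common encoding in which each pair of imm8 bits selects the source element for one position and the same control is reused for both lanes. The function name is hypothetical.

```rust
/// Scalar model of a merge-masked in-lane permute on eight f32 lanes.
/// Assumption: bits [2p+1:2p] of `imm8` choose the source element for
/// position p of every 128-bit lane.
fn mask_permute_ps_model(src: [f32; 8], k: u8, a: [f32; 8], imm8: u8) -> [f32; 8] {
    let mut dst = src;
    for j in 0..8 {
        if (k >> j) & 1 == 1 {
            let lane_base = j & !3; // first element of the 128-bit lane holding j
            let sel = ((imm8 >> (2 * (j & 3))) & 3) as usize;
            dst[j] = a[lane_base + sel];
        }
    }
    dst
}
```

With imm8 = 0b00_01_10_11 each lane is reversed independently, and no element crosses from one lane to the other.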
- _mm256_mask_permutevar_pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_permutevar_ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_permutex2var_epi32 Experimental avx512f,avx512vl
- Shuffle 32-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm256_mask_permutex2var_epi64 Experimental avx512f,avx512vl
- Shuffle 64-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm256_mask_permutex2var_pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm256_mask_permutex2var_ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
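The two-source permute entries above can be pictured with a scalar model. The sketch below is only an illustration for eight 32-bit lanes: `permutex2var_model` is a hypothetical helper name, and the split of each index into a 3-bit element offset plus a source-selector bit (bit 3) is an assumption taken from Intel's published pseudocode rather than from this listing.

```rust
/// Scalar model of a masked two-source permute over eight 32-bit lanes.
/// Hypothetical helper for illustration; it does not call the real intrinsic.
fn permutex2var_model(a: [i32; 8], k: u8, idx: [i32; 8], b: [i32; 8]) -> [i32; 8] {
    let mut dst = a; // lanes with a clear mask bit keep the value from `a`
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            let offset = (idx[i] & 0b111) as usize; // bits [2:0]: element index
            let from_b = (idx[i] >> 3) & 1 == 1; // bit 3: selects `b` over `a`
            dst[i] = if from_b { b[offset] } else { a[offset] };
        }
    }
    dst
}
```

With `idx[0] = 8` and mask bit 0 set, lane 0 of the result would come from `b[0]`; lanes whose mask bit is clear simply repeat the matching lane of `a`.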
- _mm256_mask_permutex_epi64 Experimental avx512f,avx512vl
- Shuffle 64-bit integers in a within 256-bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_permutex_pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a within 256-bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_permutexvar_epi32 Experimental avx512f,avx512vl
- Shuffle 32-bit integers in a across lanes using the corresponding index in idx, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_permutexvar_epi64 Experimental avx512f,avx512vl
- Shuffle 64-bit integers in a across lanes using the corresponding index in idx, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_permutexvar_pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a across lanes using the corresponding index in idx, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_permutexvar_ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a across lanes using the corresponding index in idx, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_rcp14_pd Experimental avx512f,avx512vl
- Compute the approximate reciprocal of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm256_mask_rcp14_ps Experimental avx512f,avx512vl
- Compute the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm256_mask_rol_epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_rol_epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_rolv_epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_rolv_epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_ror_epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_ror_epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_rorv_epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_rorv_epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
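The masked rotates above differ from shifts in that bits leaving one end re-enter at the other. A minimal scalar sketch of the 32-bit left rotate follows; `mask_rol_epi32_model` is a hypothetical name, and reducing the count modulo 32 is an assumption based on Intel's pseudocode.

```rust
/// Scalar model of a masked left rotate over eight 32-bit lanes.
/// Hypothetical helper for illustration; it does not call the real intrinsic.
fn mask_rol_epi32_model(src: [u32; 8], k: u8, a: [u32; 8], imm8: u32) -> [u32; 8] {
    let mut dst = src; // lanes with a clear mask bit keep the value from `src`
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            // Bits shifted out on the left re-enter on the right.
            dst[i] = a[i].rotate_left(imm8 % 32);
        }
    }
    dst
}
```

A right rotate could be modelled the same way with `rotate_right`, and the `rolv`/`rorv` forms would read the count from the corresponding lane of `b` instead of an immediate.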
- _mm256_mask_roundscale_pd Experimental avx512f,avx512vl
- Round packed double-precision (64-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm256_mask_roundscale_ps Experimental avx512f,avx512vl
- Round packed single-precision (32-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm256_mask_rsqrt14_pd Experimental avx512f,avx512vl
- Compute the approximate reciprocal square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm256_mask_rsqrt14_ps Experimental avx512f,avx512vl
- Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm256_mask_scalef_pd Experimental avx512f,avx512vl
- Scale the packed double-precision (64-bit) floating-point elements in a by 2 raised to the floor of the corresponding element in b (a * 2^floor(b)), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_scalef_ps Experimental avx512f,avx512vl
- Scale the packed single-precision (32-bit) floating-point elements in a by 2 raised to the floor of the corresponding element in b (a * 2^floor(b)), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
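As a rough illustration of the scaling operation, the sketch below models the double-precision form as `a * 2^floor(b)` per lane. The formula is an assumption taken from Intel's description of the underlying instruction, `mask_scalef_pd_model` is a hypothetical name, and special cases (NaN, infinities, denormals, very large exponents) are deliberately ignored.

```rust
/// Scalar model of a masked scale over four 64-bit float lanes.
/// Hypothetical helper; assumes finite inputs and small exponents only.
fn mask_scalef_pd_model(src: [f64; 4], k: u8, a: [f64; 4], b: [f64; 4]) -> [f64; 4] {
    let mut dst = src; // lanes with a clear mask bit keep the value from `src`
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            // Multiply by 2 raised to floor(b); a fractional part of b is discarded.
            dst[i] = a[i] * 2f64.powi(b[i].floor() as i32);
        }
    }
    dst
}
```

Under this model a scale factor of `1.9` behaves like `1.0`, so `3.0` would be scaled to `6.0`.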
- _mm256_mask_set1_epi32 Experimental avx512f,avx512vl
- Broadcast 32-bit integer a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_set1_epi64 Experimental avx512f,avx512vl
- Broadcast 64-bit integer a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_shuffle_epi32 Experimental avx512f,avx512vl
- Shuffle 32-bit integers in a within 128-bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_shuffle_f32x4 Experimental avx512f,avx512vl
- Shuffle 128-bit blocks (each composed of 4 single-precision (32-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_shuffle_f64x2 Experimental avx512f,avx512vl
- Shuffle 128-bit blocks (each composed of 2 double-precision (64-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_shuffle_i32x4 Experimental avx512f,avx512vl
- Shuffle 128-bit blocks (each composed of 4 32-bit integers) selected by imm8 from a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_shuffle_i64x2 Experimental avx512f,avx512vl
- Shuffle 128-bit blocks (each composed of 2 64-bit integers) selected by imm8 from a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_shuffle_pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements from a and b within 128-bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_shuffle_ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements from a and b within 128-bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_sll_epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a left by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_sll_epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a left by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_slli_epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a left by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_slli_epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a left by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_sllv_epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_sllv_epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_sqrt_pd Experimental avx512f,avx512vl
- Compute the square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_sqrt_ps Experimental avx512f,avx512vl
- Compute the square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_sra_epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_sra_epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_srai_epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_srai_epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_srav_epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_srav_epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_srl_epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_srl_epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_srli_epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_srli_epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_srlv_epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_srlv_epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
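The difference between shifting in zeros (`srl*`) and shifting in sign bits (`sra*`) is easiest to see on a single lane. The per-lane helpers below are hypothetical; their handling of counts of 32 or more (all zeros for the logical shift, all sign bits for the arithmetic shift) is an assumption based on Intel's pseudocode, since the listing above does not spell it out.

```rust
/// One lane of a variable logical right shift: zeros are shifted in.
/// Hypothetical helper; counts of 32 or more are assumed to yield zero.
fn srlv_epi32_lane(a: u32, count: u32) -> u32 {
    if count < 32 { a >> count } else { 0 }
}

/// One lane of a variable arithmetic right shift: sign bits are shifted in.
/// Hypothetical helper; counts of 32 or more are assumed to fill with the sign bit.
fn srav_epi32_lane(a: i32, count: u32) -> i32 {
    if count < 32 {
        a >> count // `>>` on a signed integer is an arithmetic shift in Rust
    } else if a < 0 {
        -1
    } else {
        0
    }
}
```

A masked wrapper would apply one of these helpers to each lane whose bit in k is set and copy the matching lane of src otherwise.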
- _mm256_mask_store_epi32 Experimental avx512f,avx512vl
- Store packed 32-bit integers from a into memory using writemask k. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_mask_store_epi64 Experimental avx512f,avx512vl
- Store packed 64-bit integers from a into memory using writemask k. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_mask_store_pd Experimental avx512f,avx512vl
- Store packed double-precision (64-bit) floating-point elements from a into memory using writemask k. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_mask_store_ps Experimental avx512f,avx512vl
- Store packed single-precision (32-bit) floating-point elements from a into memory using writemask k. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_mask_storeu_epi32 Experimental avx512f,avx512vl
- Store packed 32-bit integers from a into memory using writemask k. mem_addr does not need to be aligned on any particular boundary.
- _mm256_mask_storeu_epi64 Experimental avx512f,avx512vl
- Store packed 64-bit integers from a into memory using writemask k. mem_addr does not need to be aligned on any particular boundary.
- _mm256_mask_storeu_pd Experimental avx512f,avx512vl
- Store packed double-precision (64-bit) floating-point elements from a into memory using writemask k. mem_addr does not need to be aligned on any particular boundary.
- _mm256_mask_storeu_ps Experimental avx512f,avx512vl
- Store packed single-precision (32-bit) floating-point elements from a into memory using writemask k. mem_addr does not need to be aligned on any particular boundary.
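For the masked stores above, only the lanes selected by k are written; the other memory locations are left untouched. The following scalar sketch models that behaviour on a plain array standing in for memory. `mask_storeu_epi32_model` is a hypothetical name, and alignment (the only difference between the `store` and `storeu` forms) is not modelled.

```rust
/// Scalar model of a masked store of eight 32-bit integers.
/// Hypothetical helper; `mem` stands in for the destination memory.
fn mask_storeu_epi32_model(mem: &mut [i32; 8], k: u8, a: [i32; 8]) {
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            mem[i] = a[i]; // only selected lanes are written
        }
    }
}
```

Calling it with `k = 0b1000_0001` would overwrite only the first and last elements of `mem`.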
- _mm256_mask_sub_epi32 Experimental avx512f,avx512vl
- Subtract packed 32-bit integers in b from packed 32-bit integers in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_sub_epi64 Experimental avx512f,avx512vl
- Subtract packed 64-bit integers in b from packed 64-bit integers in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_sub_pd Experimental avx512f,avx512vl
- Subtract packed double-precision (64-bit) floating-point elements in b from packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_sub_ps Experimental avx512f,avx512vl
- Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_ternarylogic_epi32 Experimental avx512f,avx512vl
- Bitwise ternary logic that can implement any three-operand Boolean function; the specific function is selected by the value in imm8. For each bit in each packed 32-bit integer, the corresponding bits from src, a, and b form a 3-bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst using writemask k at 32-bit granularity (32-bit elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_ternarylogic_epi64 Experimental avx512f,avx512vl
- Bitwise ternary logic that can implement any three-operand Boolean function; the specific function is selected by the value in imm8. For each bit in each packed 64-bit integer, the corresponding bits from src, a, and b form a 3-bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst using writemask k at 64-bit granularity (64-bit elements are copied from src when the corresponding mask bit is not set).
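The truth-table lookup described for the ternary-logic entries can be sketched bit by bit. In the model below, `mask_ternarylogic_epi32_model` is a hypothetical name, and the ordering of the index bits (src as the most significant bit, then a, then b) is an assumption taken from Intel's pseudocode.

```rust
/// Scalar model of masked ternary logic over eight 32-bit lanes.
/// Hypothetical helper for illustration; it does not call the real intrinsic.
fn mask_ternarylogic_epi32_model(
    src: [u32; 8],
    k: u8,
    a: [u32; 8],
    b: [u32; 8],
    imm8: u8,
) -> [u32; 8] {
    let mut dst = src; // lanes with a clear mask bit keep the value from `src`
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            let mut out = 0u32;
            for bit in 0..32 {
                let s = (src[i] >> bit) & 1;
                let x = (a[i] >> bit) & 1;
                let y = (b[i] >> bit) & 1;
                // The three input bits form an index into the 8-entry truth table.
                let index = (s << 2) | (x << 1) | y;
                out |= (((imm8 as u32) >> index) & 1) << bit;
            }
            dst[i] = out;
        }
    }
    dst
}
```

Under this model an `imm8` of `0x96` yields the three-way XOR of the inputs, because the table entry is 1 exactly when an odd number of the input bits are set.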
- _mm256_mask_test_epi32_mask Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 32-bit integers in a and b, producing intermediate 32-bit values, and set the corresponding bit in result mask k (subject to writemask k) if the intermediate value is non-zero.
- _mm256_mask_test_epi64_mask Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 64-bit integers in a and b, producing intermediate 64-bit values, and set the corresponding bit in result mask k (subject to writemask k) if the intermediate value is non-zero.
- _mm256_mask_testn_epi32_mask Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 32-bit integers in a and b, producing intermediate 32-bit values, and set the corresponding bit in result mask k (subject to writemask k) if the intermediate value is zero.
- _mm256_mask_testn_epi64_mask Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 64-bit integers in a and b, producing intermediate 64-bit values, and set the corresponding bit in result mask k (subject to writemask k) if the intermediate value is zero.
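The `test` and `testn` entries return a bitmask rather than a vector. The sketch below models both for eight 32-bit lanes: a result bit is set only when the matching input mask bit is set and the lane-wise AND is non-zero (`test`) or zero (`testn`). The function names are hypothetical, and clearing result bits for unselected lanes is an assumption based on Intel's pseudocode.

```rust
/// Scalar model of a masked "test": bit i is set when k1 selects lane i
/// and `a[i] & b[i]` is non-zero. Hypothetical helper for illustration.
fn mask_test_epi32_mask_model(k1: u8, a: [u32; 8], b: [u32; 8]) -> u8 {
    let mut k = 0u8;
    for i in 0..8 {
        if (k1 >> i) & 1 == 1 && (a[i] & b[i]) != 0 {
            k |= 1 << i;
        }
    }
    k
}

/// Scalar model of a masked "testn": bit i is set when k1 selects lane i
/// and `a[i] & b[i]` is zero. Hypothetical helper for illustration.
fn mask_testn_epi32_mask_model(k1: u8, a: [u32; 8], b: [u32; 8]) -> u8 {
    let mut k = 0u8;
    for i in 0..8 {
        if (k1 >> i) & 1 == 1 && (a[i] & b[i]) == 0 {
            k |= 1 << i;
        }
    }
    k
}
```

For any inputs, the two results are disjoint and together cover exactly the bits set in `k1`.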
- _mm256_mask_unpackhi_epi32 Experimental avx512f,avx512vl
- Unpack and interleave 32-bit integers from the high half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_unpackhi_epi64 Experimental avx512f,avx512vl
- Unpack and interleave 64-bit integers from the high half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_unpackhi_pd Experimental avx512f,avx512vl
- Unpack and interleave double-precision (64-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_unpackhi_ps Experimental avx512f,avx512vl
- Unpack and interleave single-precision (32-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_unpacklo_epi32 Experimental avx512f,avx512vl
- Unpack and interleave 32-bit integers from the low half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_unpacklo_epi64 Experimental avx512f,avx512vl
- Unpack and interleave 64-bit integers from the low half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_unpacklo_pd Experimental avx512f,avx512vl
- Unpack and interleave double-precision (64-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_unpacklo_ps Experimental avx512f,avx512vl
- Unpack and interleave single-precision (32-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
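The phrase "low half of each 128-bit lane" means the interleave is done independently in each group of four 32-bit elements. A scalar sketch for the 32-bit `unpacklo` form follows; `mask_unpacklo_epi32_model` is a hypothetical name and the element order within each lane (a0, b0, a1, b1) is an assumption based on Intel's pseudocode.

```rust
/// Scalar model of a masked low-half interleave over eight 32-bit lanes.
/// Hypothetical helper for illustration; it does not call the real intrinsic.
fn mask_unpacklo_epi32_model(src: [i32; 8], k: u8, a: [i32; 8], b: [i32; 8]) -> [i32; 8] {
    let mut dst = src; // lanes with a clear mask bit keep the value from `src`
    for lane in 0..2 {
        let base = lane * 4; // first element of this 128-bit lane
        // Interleave the two lowest elements of `a` and `b` within the lane.
        let interleaved = [a[base], b[base], a[base + 1], b[base + 1]];
        for j in 0..4 {
            if (k >> (base + j)) & 1 == 1 {
                dst[base + j] = interleaved[j];
            }
        }
    }
    dst
}
```

The `unpackhi` forms would use elements `base + 2` and `base + 3` of each lane instead.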
- _mm256_mask_xor_epi32 Experimental avx512f,avx512vl
- Compute the bitwise XOR of packed 32-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_mask_xor_epi64 Experimental avx512f,avx512vl
- Compute the bitwise XOR of packed 64-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_maskz_abs_epi32 Experimental avx512f,avx512vl
- Compute the absolute value of packed signed 32-bit integers in a, and store the unsigned results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_abs_epi64 Experimental avx512f,avx512vl
- Compute the absolute value of packed signed 64-bit integers in a, and store the unsigned results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_add_epi32 Experimental avx512f,avx512vl
- Add packed 32-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_add_epi64 Experimental avx512f,avx512vl
- Add packed 64-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_add_pd Experimental avx512f,avx512vl
- Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_add_ps Experimental avx512f,avx512vl
- Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
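The `mask_` and `maskz_` prefixes differ only in what happens to lanes whose mask bit is clear: a writemask copies them from src, a zeromask sets them to zero. The scalar sketch below contrasts the two for a 32-bit integer add; both function names are hypothetical, and wrapping arithmetic is assumed for overflow.

```rust
/// Writemask form: lanes with a clear mask bit keep the value from `src`.
/// Hypothetical helper for illustration; it does not call the real intrinsic.
fn mask_add_epi32_model(src: [i32; 8], k: u8, a: [i32; 8], b: [i32; 8]) -> [i32; 8] {
    let mut dst = src;
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i].wrapping_add(b[i]);
        }
    }
    dst
}

/// Zeromask form: lanes with a clear mask bit are set to zero,
/// which is the writemask form with an all-zero `src`.
fn maskz_add_epi32_model(k: u8, a: [i32; 8], b: [i32; 8]) -> [i32; 8] {
    mask_add_epi32_model([0; 8], k, a, b)
}
```

The same pattern applies to every `mask_`/`maskz_` pair in this listing, with only the per-lane operation changing.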
- _mm256_maskz_alignr_epi32 Experimental avx512f,avx512vl
- Concatenate a and b into a 64-byte intermediate result, shift the result right by imm8 32-bit elements, and store the low 32 bytes (8 elements) in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_alignr_epi64 Experimental avx512f,avx512vl
- Concatenate a and b into a 64-byte intermediate result, shift the result right by imm8 64-bit elements, and store the low 32 bytes (4 elements) in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
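The concatenate-and-shift behaviour of the `alignr` entries can be modelled on arrays. In this sketch, `maskz_alignr_epi32_model` is a hypothetical name; placing b in the low half and a in the high half of the concatenation, and using only the low three bits of imm8, are assumptions based on Intel's pseudocode.

```rust
/// Scalar model of a zero-masked element-wise align over eight 32-bit lanes.
/// Hypothetical helper for illustration; it does not call the real intrinsic.
fn maskz_alignr_epi32_model(k: u8, a: [i32; 8], b: [i32; 8], imm8: u32) -> [i32; 8] {
    // Build the 16-element concatenation: `b` is the low half, `a` the high half.
    let mut concat = [0i32; 16];
    concat[..8].copy_from_slice(&b);
    concat[8..].copy_from_slice(&a);
    let shift = (imm8 & 0b111) as usize; // shift amount in 32-bit elements
    let mut dst = [0i32; 8]; // lanes with a clear mask bit stay zero
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            dst[i] = concat[i + shift];
        }
    }
    dst
}
```

With a shift of 3, the result would start at element 3 of b and continue into the first elements of a.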
- _mm256_maskz_and_epi32 Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 32-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_and_epi64 Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 64-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_andnot_epi32 Experimental avx512f,avx512vl
- Compute the bitwise NOT of packed 32-bit integers in a and then AND with b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_andnot_epi64 Experimental avx512f,avx512vl
- Compute the bitwise NOT of packed 64-bit integers in a and then AND with b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_broadcast_f32x4 Experimental avx512f,avx512vl
- Broadcast the 4 packed single-precision (32-bit) floating-point elements from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_broadcast_i32x4 Experimental avx512f,avx512vl
- Broadcast the 4 packed 32-bit integers from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_broadcastd_epi32 Experimental avx512f,avx512vl
- Broadcast the low packed 32-bit integer from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_broadcastq_epi64 Experimental avx512f,avx512vl
- Broadcast the low packed 64-bit integer from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_broadcastsd_pd Experimental avx512f,avx512vl
- Broadcast the low double-precision (64-bit) floating-point element from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_broadcastss_ps Experimental avx512f,avx512vl
- Broadcast the low single-precision (32-bit) floating-point element from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_compress_epi32 Experimental avx512f,avx512vl
- Contiguously store the active 32-bit integers in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero.
- _mm256_maskz_compress_epi64 Experimental avx512f,avx512vl
- Contiguously store the active 64-bit integers in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero.
- _mm256_maskz_compress_pd Experimental avx512f,avx512vl
- Contiguously store the active double-precision (64-bit) floating-point elements in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero.
- _mm256_maskz_compress_ps Experimental avx512f,avx512vl
- Contiguously store the active single-precision (32-bit) floating-point elements in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero.
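Unlike the other zero-masked operations, compress moves the selected elements: active lanes are packed toward the low end of dst and the tail is zeroed. A scalar sketch for eight 32-bit integers follows; `maskz_compress_epi32_model` is a hypothetical name.

```rust
/// Scalar model of a zero-masked compress over eight 32-bit lanes.
/// Hypothetical helper for illustration; it does not call the real intrinsic.
fn maskz_compress_epi32_model(k: u8, a: [i32; 8]) -> [i32; 8] {
    let mut dst = [0i32; 8]; // unused tail elements stay zero
    let mut next = 0; // next free slot at the low end of `dst`
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            dst[next] = a[i];
            next += 1;
        }
    }
    dst
}
```

This is the usual building block for filtering a vector by a comparison mask: the survivors end up contiguous and in their original order.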
- _mm256_maskz_cvt_roundps_ph Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed half-precision (16-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm256_maskz_cvtepi8_epi32 Experimental avx512f,avx512vl
- Sign extend packed 8-bit integers in a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtepi8_epi64 Experimental avx512f,avx512vl
- Sign extend packed 8-bit integers in the low 4 bytes of a to packed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtepi16_epi32 Experimental avx512f,avx512vl
- Sign extend packed 16-bit integers in a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtepi16_epi64 Experimental avx512f,avx512vl
- Sign extend packed 16-bit integers in a to packed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtepi32_epi8 Experimental avx512f,avx512vl
- Convert packed 32-bit integers in a to packed 8-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtepi32_epi16 Experimental avx512f,avx512vl
- Convert packed 32-bit integers in a to packed 16-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtepi32_epi64 Experimental avx512f,avx512vl
- Sign extend packed 32-bit integers in a to packed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtepi32_pd Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtepi32_ps Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtepi64_epi8 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 8-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtepi64_epi16 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 16-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtepi64_epi32 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 32-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtepu8_epi32 Experimental avx512f,avx512vl
- Zero extend packed unsigned 8-bit integers in the low 8 bytes of a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtepu8_epi64 Experimental avx512f,avx512vl
- Zero extend packed unsigned 8-bit integers in the low 4 bytes of a to packed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtepu16_epi32 Experimental avx512f,avx512vl
- Zero extend packed unsigned 16-bit integers in a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtepu16_epi64 Experimental avx512f,avx512vl
- Zero extend packed unsigned 16-bit integers in the low 8 bytes of a to packed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtepu32_epi64 Experimental avx512f,avx512vl
- Zero extend packed unsigned 32-bit integers in a to packed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtepu32_pd Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtpd_epi32 Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtpd_epu32 Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtpd_ps Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtph_ps Experimental avx512f,avx512vl
- Convert packed half-precision (16-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtps_epi32 Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtps_epu32 Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtps_ph Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed half-precision (16-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm256_maskz_cvtsepi32_epi8 Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtsepi32_epi16 Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtsepi64_epi8 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtsepi64_epi16 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtsepi64_epi32 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 32-bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
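The `cvtepi64_epi32` entries narrow with truncation, while the `cvtsepi64_epi32` entries narrow with signed saturation. A minimal per-lane sketch of the difference follows; the helper names are hypothetical and this is scalar Rust, not a call into the intrinsics.

```rust
/// Truncation: keep only the low 32 bits of the 64-bit value.
fn truncate_i64_to_i32(x: i64) -> i32 {
    x as i32
}

/// Signed saturation: clamp to the i32 range before narrowing.
fn saturate_i64_to_i32(x: i64) -> i32 {
    x.clamp(i32::MIN as i64, i32::MAX as i64) as i32
}
```

For the input 0x1_0000_0005, truncation yields 5 while saturation yields `i32::MAX`; values already inside the i32 range come out the same either way.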
- _mm256_maskz_cvttpd_epi32 Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvttpd_epu32 Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvttps_epi32 Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvttps_epu32 Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtusepi32_epi8 Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtusepi32_epi16 Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtusepi64_epi8 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtusepi64_epi16 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_cvtusepi64_epi32 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed unsigned 32-bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_div_pd Experimental avx512f,avx512vl
- Divide packed double-precision (64-bit) floating-point elements in a by packed elements in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_div_ps Experimental avx512f,avx512vl
- Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_expand_epi32 Experimental avx512f,avx512vl
- Load contiguous active 32-bit integers from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
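The expand operation is easier to follow as a loop: consecutive low elements of a are consumed in order and placed into the lanes whose mask bit is set, while the remaining lanes are zeroed. The following is a scalar sketch under that reading, with a hypothetical helper name, not the intrinsic itself.

```rust
/// Scalar model (hypothetical helper) of a zero-masked expand
/// over eight 32-bit lanes.
fn maskz_expand_epi32_model(k: u8, a: [i32; 8]) -> [i32; 8] {
    let mut dst = [0i32; 8];
    let mut next = 0; // index of the next unread element of `a`
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[next];
            next += 1;
        }
    }
    dst
}
```

With k = 0b1010_0110 the first four elements of a land in lanes 1, 2, 5 and 7.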
- _mm256_maskz_expand_epi64 Experimental avx512f,avx512vl
- Load contiguous active 64-bit integers from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_expand_pd Experimental avx512f,avx512vl
- Load contiguous active double-precision (64-bit) floating-point elements from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_expand_ps Experimental avx512f,avx512vl
- Load contiguous active single-precision (32-bit) floating-point elements from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_expandloadu_epi32 Experimental avx512f,avx512vl
- Load contiguous active 32-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_expandloadu_epi64 Experimental avx512f,avx512vl
- Load contiguous active 64-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_expandloadu_pd Experimental avx512f,avx512vl
- Load contiguous active double-precision (64-bit) floating-point elements from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_expandloadu_ps Experimental avx512f,avx512vl
- Load contiguous active single-precision (32-bit) floating-point elements from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_extractf32x4_ps Experimental avx512f,avx512vl
- Extract 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a, selected with imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_extracti32x4_epi32 Experimental avx512f,avx512vl
- Extract 128 bits (composed of 4 packed 32-bit integers) from a, selected with IMM1, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_fixupimm_pd Experimental avx512f,avx512vl
- Fix up packed double-precision (64-bit) floating-point elements in a and b using packed 64-bit integers in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
- _mm256_maskz_fixupimm_ps Experimental avx512f,avx512vl
- Fix up packed single-precision (32-bit) floating-point elements in a and b using packed 32-bit integers in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
- _mm256_maskz_fmadd_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_fmadd_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_fmaddsub_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_fmaddsub_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_fmsub_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_fmsub_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_fmsubadd_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_fmsubadd_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
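The difference between the fmaddsub and fmsubadd entries is which lanes add and which subtract. In the sketch below, which assumes Intel's convention that fmaddsub subtracts c in even-indexed lanes and adds it in odd-indexed lanes (fmsubadd is the reverse), the helper names are hypothetical and the code is a scalar model over four f64 lanes.

```rust
/// Scalar model: even lanes compute a*b - c, odd lanes a*b + c.
fn maskz_fmaddsub_pd_model(k: u8, a: [f64; 4], b: [f64; 4], c: [f64; 4]) -> [f64; 4] {
    let mut dst = [0.0f64; 4];
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            // mul_add performs a fused multiply-add with a single rounding.
            let addend = if i % 2 == 0 { -c[i] } else { c[i] };
            dst[i] = a[i].mul_add(b[i], addend);
        }
    }
    dst
}

/// Scalar model: even lanes compute a*b + c, odd lanes a*b - c.
fn maskz_fmsubadd_pd_model(k: u8, a: [f64; 4], b: [f64; 4], c: [f64; 4]) -> [f64; 4] {
    let mut dst = [0.0f64; 4];
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            let addend = if i % 2 == 0 { c[i] } else { -c[i] };
            dst[i] = a[i].mul_add(b[i], addend);
        }
    }
    dst
}
```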
- _mm256_maskz_fnmadd_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_fnmadd_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_fnmsub_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_fnmsub_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_getexp_pd Experimental avx512f,avx512vl
- Convert the exponent of each packed double-precision (64-bit) floating-point element in a to a double-precision (64-bit) floating-point number representing the integer exponent, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
- _mm256_maskz_getexp_ps Experimental avx512f,avx512vl
- Convert the exponent of each packed single-precision (32-bit) floating-point element in a to a single-precision (32-bit) floating-point number representing the integer exponent, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
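For normal, finite inputs the floor(log2(x)) result described above equals the unbiased exponent field of the value. A per-element sketch for f64 follows; `getexp_normal` is a hypothetical helper, and it deliberately does not model zero, denormal, infinite or NaN inputs, which the hardware treats specially.

```rust
/// Scalar sketch: unbiased binary exponent of a normal, finite f64,
/// returned as a floating-point number. The sign bit is ignored.
fn getexp_normal(x: f64) -> f64 {
    // Bits 52..=62 hold the exponent with a bias of 1023.
    let biased = ((x.to_bits() >> 52) & 0x7ff) as i64;
    (biased - 1023) as f64
}
```

For example, 10.0 is 1.25 * 2^3, so its result is 3.0, and 0.75 is 1.5 * 2^-1, so its result is -1.0.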
- _mm256_maskz_getmant_pd Experimental avx512f,avx512vl
- Normalize the mantissas of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc, which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
- _mm256_maskz_getmant_ps Experimental avx512f,avx512vl
- Normalize the mantissas of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc, which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
- _mm256_maskz_insertf32x4 Experimental avx512f,avx512vl
- Copy a to tmp, then insert 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from b into tmp at the location specified by imm8. Store tmp to dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_inserti32x4 Experimental avx512f,avx512vl
- Copy a to tmp, then insert 128 bits (composed of 4 packed 32-bit integers) from b into tmp at the location specified by imm8. Store tmp to dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
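The copy, insert, then mask sequence described for the insert entries can be written out step by step. The sketch below is a scalar model with a hypothetical name; it assumes that bit 0 of imm8 selects the lower (0) or upper (1) 128-bit half.

```rust
/// Scalar model (hypothetical helper) of a zero-masked 128-bit insert
/// into a 256-bit vector of eight 32-bit lanes.
fn maskz_inserti32x4_model(k: u8, a: [i32; 8], b: [i32; 4], imm8: u8) -> [i32; 8] {
    // Step 1: copy a to tmp.
    let mut tmp = a;
    // Step 2: overwrite the selected 128-bit half with b.
    let base = if imm8 & 1 == 0 { 0 } else { 4 };
    tmp[base..base + 4].copy_from_slice(&b);
    // Step 3: store tmp to dst under the zeromask.
    let mut dst = [0i32; 8];
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            dst[i] = tmp[i];
        }
    }
    dst
}
```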
- _mm256_maskz_load_epi32 Experimental avx512f,avx512vl
- Load packed 32-bit integers from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_maskz_load_epi64 Experimental avx512f,avx512vl
- Load packed 64-bit integers from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_maskz_load_pd Experimental avx512f,avx512vl
- Load packed double-precision (64-bit) floating-point elements from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_maskz_load_ps Experimental avx512f,avx512vl
- Load packed single-precision (32-bit) floating-point elements from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_maskz_loadu_epi32 Experimental avx512f,avx512vl
- Load packed 32-bit integers from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm256_maskz_loadu_epi64 Experimental avx512f,avx512vl
- Load packed 64-bit integers from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm256_maskz_loadu_pd Experimental avx512f,avx512vl
- Load packed double-precision (64-bit) floating-point elements from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm256_maskz_loadu_ps Experimental avx512f,avx512vl
- Load packed single-precision (32-bit) floating-point elements from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm256_maskz_max_epi32 Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_max_epi64 Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_max_epu32 Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_max_epu64 Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_max_pd Experimental avx512f,avx512vl
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_max_ps Experimental avx512f,avx512vl
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_min_epi32 Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_min_epi64 Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_min_epu32 Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_min_epu64 Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_min_pd Experimental avx512f,avx512vl
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_min_ps Experimental avx512f,avx512vl
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_mov_epi32 Experimental avx512f,avx512vl
- Move packed 32-bit integers from a into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_mov_epi64 Experimental avx512f,avx512vl
- Move packed 64-bit integers from a into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_mov_pd Experimental avx512f,avx512vl
- Move packed double-precision (64-bit) floating-point elements from a into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_mov_ps Experimental avx512f,avx512vl
- Move packed single-precision (32-bit) floating-point elements from a into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_movedup_pd Experimental avx512f,avx512vl
- Duplicate even-indexed double-precision (64-bit) floating-point elements from a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_movehdup_ps Experimental avx512f,avx512vl
- Duplicate odd-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_moveldup_ps Experimental avx512f,avx512vl
- Duplicate even-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
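The even-indexed and odd-indexed duplication performed by the moveldup and movehdup entries amounts to a simple index mapping. A scalar sketch over eight f32 lanes follows; the helper names are hypothetical and the zeromask step is left out for brevity.

```rust
/// Each output lane takes the even-indexed element of its pair.
fn moveldup_model(a: [f32; 8]) -> [f32; 8] {
    std::array::from_fn(|i| a[i & !1]) // clear bit 0: 0,0,2,2,4,4,6,6
}

/// Each output lane takes the odd-indexed element of its pair.
fn movehdup_model(a: [f32; 8]) -> [f32; 8] {
    std::array::from_fn(|i| a[i | 1]) // set bit 0: 1,1,3,3,5,5,7,7
}
```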
- _mm256_maskz_mul_epi32 Experimental avx512f,avx512vl
- Multiply the low signed 32-bit integers from each packed 64-bit element in a and b, and store the signed 64-bit results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_mul_epu32 Experimental avx512f,avx512vl
- Multiply the low unsigned 32-bit integers from each packed 64-bit element in a and b, and store the unsigned 64-bit results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
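The mul_epi32 and mul_epu32 entries read only the low 32 bits of each 64-bit element and produce a full 64-bit product. A per-lane scalar sketch, with hypothetical helper names:

```rust
/// Signed variant: the low 32 bits of each operand are reinterpreted
/// as i32, sign-extended, and multiplied. The product always fits in i64.
fn mul_epi32_lane(a: i64, b: i64) -> i64 {
    (a as i32 as i64) * (b as i32 as i64)
}

/// Unsigned variant: the low 32 bits are zero-extended and multiplied.
/// The product always fits in u64.
fn mul_epu32_lane(a: u64, b: u64) -> u64 {
    (a as u32 as u64) * (b as u32 as u64)
}
```

The upper halves of the inputs are ignored: an element whose low half is 0xFFFF_FFFF behaves as -1 in the signed variant and as 4294967295 in the unsigned one.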
- _mm256_maskz_mul_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_mul_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_mullo_epi32 Experimental avx512f,avx512vl
- Multiply the packed 32-bit integers in a and b, producing intermediate 64-bit integers, and store the low 32 bits of the intermediate integers in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_or_epi32 Experimental avx512f,avx512vl
- Compute the bitwise OR of packed 32-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_or_epi64 Experimental avx512f,avx512vl
- Compute the bitwise OR of packed 64-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_permute_pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_permute_ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_permutevar_pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_permutevar_ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_permutex2var_epi32 Experimental avx512f,avx512vl
- Shuffle 32-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_permutex2var_epi64 Experimental avx512f,avx512vl
- Shuffle 64-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_permutex2var_pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_permutex2var_ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
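The "selector and index" wording of the permutex2var entries can be made concrete with a scalar model. The sketch below covers the eight-lane 32-bit integer case and assumes that, in each idx element, bits 2:0 give the source lane and bit 3 selects between a (0) and b (1); the helper name is hypothetical.

```rust
/// Scalar model (hypothetical helper) of a zero-masked two-source
/// permute over eight 32-bit lanes.
fn maskz_permutex2var_epi32_model(k: u8, a: [i32; 8], idx: [i32; 8], b: [i32; 8]) -> [i32; 8] {
    let mut dst = [0i32; 8];
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            let offset = (idx[i] & 0b111) as usize; // index: which lane
            let from_b = idx[i] & 0b1000 != 0; // selector: which source
            dst[i] = if from_b { b[offset] } else { a[offset] };
        }
    }
    dst
}
```

Higher bits of each idx element are ignored in this model.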
- _mm256_maskz_permutex_epi64 Experimental avx512f,avx512vl
- Shuffle 64-bit integers in a within 256-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_permutex_pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a within 256-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_permutexvar_epi32 Experimental avx512f,avx512vl
- Shuffle 32-bit integers in a across lanes using the corresponding index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_permutexvar_epi64 Experimental avx512f,avx512vl
- Shuffle 64-bit integers in a across lanes using the corresponding index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_permutexvar_pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a across lanes using the corresponding index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_permutexvar_ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a across lanes using the corresponding index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_maskz_rcp14_pd Experimental avx512f,avx512vl
- Compute the approximate reciprocal of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm256_maskz_rcp14_ps Experimental avx512f,avx512vl
- Compute the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm256_
maskz_ rol_ epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ rol_ epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ rolv_ epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ rolv_ epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ ror_ epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ ror_ epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ rorv_ epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ rorv_ epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ roundscale_ pd Experimental avx512f,avx512vl
- Round packed double-precision (64-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm256_
maskz_ roundscale_ ps Experimental avx512f,avx512vl
- Round packed single-precision (32-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm256_
maskz_ rsqrt14_ pd Experimental avx512f,avx512vl
- Compute the approximate reciprocal square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm256_
maskz_ rsqrt14_ ps Experimental avx512f,avx512vl
- Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm256_
maskz_ scalef_ pd Experimental avx512f,avx512vl
- Scale the packed double-precision (64-bit) floating-point elements in a using values from b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ scalef_ ps Experimental avx512f,avx512vl
- Scale the packed single-precision (32-bit) floating-point elements in a using values from b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ set1_ epi32 Experimental avx512f,avx512vl
- Broadcast 32-bit integer a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ set1_ epi64 Experimental avx512f,avx512vl
- Broadcast 64-bit integer a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ shuffle_ epi32 Experimental avx512f,avx512vl
- Shuffle 32-bit integers in a within 128-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ shuffle_ f32x4 Experimental avx512f,avx512vl
- Shuffle 128-bits (composed of 4 single-precision (32-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ shuffle_ f64x2 Experimental avx512f,avx512vl
- Shuffle 128-bits (composed of 2 double-precision (64-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ shuffle_ i32x4 Experimental avx512f,avx512vl
- Shuffle 128-bits (composed of 4 32-bit integers) selected by imm8 from a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ shuffle_ i64x2 Experimental avx512f,avx512vl
- Shuffle 128-bits (composed of 2 64-bit integers) selected by imm8 from a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ shuffle_ pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements within 128-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ shuffle_ ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ sll_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a left by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ sll_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a left by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ slli_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a left by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ slli_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a left by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ sllv_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ sllv_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ sqrt_ pd Experimental avx512f,avx512vl
- Compute the square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ sqrt_ ps Experimental avx512f,avx512vl
- Compute the square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ sra_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ sra_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ srai_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ srai_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ srav_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ srav_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ srl_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ srl_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ srli_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ srli_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ srlv_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ srlv_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ sub_ epi32 Experimental avx512f,avx512vl
- Subtract packed 32-bit integers in b from packed 32-bit integers in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ sub_ epi64 Experimental avx512f,avx512vl
- Subtract packed 64-bit integers in b from packed 64-bit integers in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ sub_ pd Experimental avx512f,avx512vl
- Subtract packed double-precision (64-bit) floating-point elements in b from packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ sub_ ps Experimental avx512f,avx512vl
- Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ ternarylogic_ epi32 Experimental avx512f,avx512vl
- Bitwise ternary logic that provides the capability to implement any three-operand binary function; the specific binary function is specified by the value in imm8. For each bit in each packed 32-bit integer, the corresponding bits from a, b, and c are used to form a 3-bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst using zeromask k at 32-bit granularity (32-bit elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ ternarylogic_ epi64 Experimental avx512f,avx512vl
- Bitwise ternary logic that provides the capability to implement any three-operand binary function; the specific binary function is specified by the value in imm8. For each bit in each packed 64-bit integer, the corresponding bits from a, b, and c are used to form a 3-bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst using zeromask k at 64-bit granularity (64-bit elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ unpackhi_ epi32 Experimental avx512f,avx512vl
- Unpack and interleave 32-bit integers from the high half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ unpackhi_ epi64 Experimental avx512f,avx512vl
- Unpack and interleave 64-bit integers from the high half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ unpackhi_ pd Experimental avx512f,avx512vl
- Unpack and interleave double-precision (64-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ unpackhi_ ps Experimental avx512f,avx512vl
- Unpack and interleave single-precision (32-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ unpacklo_ epi32 Experimental avx512f,avx512vl
- Unpack and interleave 32-bit integers from the low half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ unpacklo_ epi64 Experimental avx512f,avx512vl
- Unpack and interleave 64-bit integers from the low half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ unpacklo_ pd Experimental avx512f,avx512vl
- Unpack and interleave double-precision (64-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ unpacklo_ ps Experimental avx512f,avx512vl
- Unpack and interleave single-precision (32-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ xor_ epi32 Experimental avx512f,avx512vl
- Compute the bitwise XOR of packed 32-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm256_
maskz_ xor_ epi64 Experimental avx512f,avx512vl
- Compute the bitwise XOR of packed 64-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
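The zeromask behaviour shared by the maskz_ entries above can be illustrated with a scalar model. This is only a sketch in plain Rust, not the intrinsic itself: `maskz_apply` is a hypothetical helper, eight 32-bit lanes are assumed (the 256-bit epi32 case), and bit i of k is assumed to control lane i.

```rust
/// Scalar model of a zeromasked lane-wise operation on eight 32-bit lanes.
/// `maskz_apply` is a hypothetical helper, not part of `core::arch`.
/// Lanes whose mask bit is set receive `op(a[i], b[i])`; all others are zero.
fn maskz_apply(k: u8, a: [i32; 8], b: [i32; 8], op: fn(i32, i32) -> i32) -> [i32; 8] {
    let mut dst = [0i32; 8]; // zeromask: unselected lanes stay zero
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            dst[i] = op(a[i], b[i]);
        }
    }
    dst
}
```

For instance, passing `|x, y| x ^ y` as `op` models the masked XOR entry, and `i32::wrapping_sub` models the masked subtraction.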
- _mm256_
max_ epi64 Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b, and store packed maximum values in dst.
- _mm256_
max_ epu64 Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b, and store packed maximum values in dst.
- _mm256_
min_ epi64 Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b, and store packed minimum values in dst.
- _mm256_
min_ epu64 Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b, and store packed minimum values in dst.
- _mm256_
mmask_ i32gather_ epi32 Experimental avx512f,avx512vl
- Loads 8 32-bit integer elements from memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mmask_ i32gather_ epi64 Experimental avx512f,avx512vl
- Loads 4 64-bit integer elements from memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mmask_ i32gather_ pd Experimental avx512f,avx512vl
- Loads 4 double-precision (64-bit) floating-point elements from memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mmask_ i32gather_ ps Experimental avx512f,avx512vl
- Loads 8 single-precision (32-bit) floating-point elements from memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mmask_ i64gather_ epi32 Experimental avx512f,avx512vl
- Loads 4 32-bit integer elements from memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mmask_ i64gather_ epi64 Experimental avx512f,avx512vl
- Loads 4 64-bit integer elements from memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mmask_ i64gather_ pd Experimental avx512f,avx512vl
- Loads 4 double-precision (64-bit) floating-point elements from memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm256_
mmask_ i64gather_ ps Experimental avx512f,avx512vl
- Loads 4 single-precision (32-bit) floating-point elements from memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale using writemask k (elements are copied from src when the corresponding mask bit is not set).
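As a rough illustration of the masked gather entries above, the following scalar sketch models the four-lane, 64-bit-index, 32-bit-element case. It is not the intrinsic: `mmask_gather_i32` is a hypothetical name, a byte slice `mem` stands in for the memory at base_addr, scale is taken as a byte multiplier, indices are assumed non-negative and in bounds, and little-endian layout is assumed (as on x86).

```rust
/// Scalar model of a writemasked gather of four 32-bit integers.
/// `mmask_gather_i32` is a hypothetical helper, not part of `core::arch`.
/// `mem` stands in for the bytes at base_addr; `scale` multiplies each index.
fn mmask_gather_i32(src: [i32; 4], k: u8, vindex: [i64; 4], mem: &[u8], scale: usize) -> [i32; 4] {
    let mut dst = src; // writemask: lanes with a clear mask bit keep src
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            // Assumption: indices are non-negative and stay inside `mem`.
            let o = vindex[i] as usize * scale;
            dst[i] = i32::from_le_bytes([mem[o], mem[o + 1], mem[o + 2], mem[o + 3]]);
        }
    }
    dst
}
```

With `scale` set to 4, each index addresses whole 32-bit elements; the real intrinsics accept scales of 1, 2, 4, or 8.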
- _mm256_
or_ epi32 Experimental avx512f,avx512vl
- Compute the bitwise OR of packed 32-bit integers in a and b, and store the results in dst.
- _mm256_
or_ epi64 Experimental avx512f,avx512vl
- Compute the bitwise OR of packed 64-bit integers in a and b, and store the result in dst.
- _mm256_
permutex2var_ epi32 Experimental avx512f,avx512vl
- Shuffle 32-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst.
- _mm256_
permutex2var_ epi64 Experimental avx512f,avx512vl
- Shuffle 64-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst.
- _mm256_
permutex2var_ pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst.
- _mm256_
permutex2var_ ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst.
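The "selector and index" wording in the permutex2var entries can be made concrete with a scalar sketch of the eight-lane 32-bit case. The function name is hypothetical, and the model assumes that bits 2:0 of each idx element give the source lane while bit 3 selects between a (0) and b (1).

```rust
/// Scalar model of a two-source permute over eight 32-bit lanes.
/// Hypothetical helper; assumes idx bits 2:0 = lane, idx bit 3 = source select.
fn permutex2var_epi32_model(a: [i32; 8], idx: [u32; 8], b: [i32; 8]) -> [i32; 8] {
    let mut dst = [0i32; 8];
    for i in 0..8 {
        let lane = (idx[i] & 7) as usize;    // index: which lane to read
        let from_b = (idx[i] >> 3) & 1 == 1; // selector: read from b instead of a
        dst[i] = if from_b { b[lane] } else { a[lane] };
    }
    dst
}
```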
- _mm256_
permutex_ epi64 Experimental avx512f,avx512vl
- Shuffle 64-bit integers in a within 256-bit lanes using the control in imm8, and store the results in dst.
- _mm256_
permutex_ pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a within 256-bit lanes using the control in imm8, and store the results in dst.
- _mm256_
permutexvar_ epi32 Experimental avx512f,avx512vl
- Shuffle 32-bit integers in a across lanes using the corresponding index in idx, and store the results in dst.
- _mm256_
permutexvar_ epi64 Experimental avx512f,avx512vl
- Shuffle 64-bit integers in a across lanes using the corresponding index in idx, and store the results in dst.
- _mm256_
permutexvar_ pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a across lanes using the corresponding index in idx, and store the results in dst.
- _mm256_
permutexvar_ ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a across lanes using the corresponding index in idx, and store the results in dst.
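A cross-lane permute by index, as in the permutexvar entries above, amounts to a table lookup per output lane. The sketch below models the eight-lane 32-bit case in scalar Rust; the function name is hypothetical, and it assumes only the low 3 bits of each index are used.

```rust
/// Scalar model of a single-source cross-lane permute over eight 32-bit lanes.
/// Hypothetical helper; assumes only idx bits 2:0 are significant.
fn permutexvar_epi32_model(idx: [u32; 8], a: [i32; 8]) -> [i32; 8] {
    let mut dst = [0i32; 8];
    for i in 0..8 {
        dst[i] = a[(idx[i] & 7) as usize]; // output lane i reads input lane idx[i]
    }
    dst
}
```

An index vector of 7 down to 0 reverses the lanes, and repeating one index broadcasts that lane.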
- _mm256_
rcp14_ pd Experimental avx512f,avx512vl
- Compute the approximate reciprocal of packed double-precision (64-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14.
- _mm256_
rcp14_ ps Experimental avx512f,avx512vl
- Compute the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14.
- _mm256_
rol_ epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst.
- _mm256_
rol_ epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst.
- _mm256_
rolv_ epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst.
- _mm256_
rolv_ epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst.
- _mm256_
ror_ epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst.
- _mm256_
ror_ epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst.
- _mm256_
rorv_ epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst.
- _mm256_
rorv_ epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst.
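The rotate entries above differ only in direction and in whether the count is a single immediate (rol, ror) or a per-lane vector (rolv, rorv). A scalar sketch of both shapes for eight 32-bit lanes follows; the function names are hypothetical, and the count is assumed to be taken modulo the lane width, which is what Rust's `rotate_left` and `rotate_right` do.

```rust
/// Scalar model of an immediate-count left rotate over eight 32-bit lanes.
/// Hypothetical helper, not part of `core::arch`.
fn rol_epi32_model(a: [u32; 8], imm8: u32) -> [u32; 8] {
    let mut dst = [0u32; 8];
    for i in 0..8 {
        dst[i] = a[i].rotate_left(imm8); // same count for every lane
    }
    dst
}

/// Scalar model of a variable-count right rotate: lane i is rotated by b[i].
fn rorv_epi32_model(a: [u32; 8], b: [u32; 8]) -> [u32; 8] {
    let mut dst = [0u32; 8];
    for i in 0..8 {
        dst[i] = a[i].rotate_right(b[i]); // per-lane count
    }
    dst
}
```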
- _mm256_
roundscale_ pd Experimental avx512f,avx512vl
- Round packed double-precision (64-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst.
Rounding is done according to the imm8[2:0] parameter.
- _mm256_
roundscale_ ps Experimental avx512f,avx512vl
- Round packed single-precision (32-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst.
Rounding is done according to the imm8[2:0] parameter.
- _mm256_
rsqrt14_ pd Experimental avx512f,avx512vl
- Compute the approximate reciprocal square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14.
- _mm256_
rsqrt14_ ps Experimental avx512f,avx512vl
- Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14.
- _mm256_
scalef_ pd Experimental avx512f,avx512vl
- Scale the packed double-precision (64-bit) floating-point elements in a using values from b, and store the results in dst.
- _mm256_
scalef_ ps Experimental avx512f,avx512vl
- Scale the packed single-precision (32-bit) floating-point elements in a using values from b, and store the results in dst.
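"Scale ... using values from b" is terse; for ordinary finite inputs the operation is understood to compute a × 2^floor(b) per lane. The scalar sketch below models that for four double-precision lanes. The function name is hypothetical, and the handling of NaN, infinities, and denormals by the real instruction is deliberately not modelled.

```rust
/// Scalar model of scalef on four f64 lanes: dst[i] = a[i] * 2^floor(b[i]).
/// Hypothetical helper; special values (NaN, infinity, denormals) are not modelled.
fn scalef_pd_model(a: [f64; 4], b: [f64; 4]) -> [f64; 4] {
    let mut dst = [0.0f64; 4];
    for i in 0..4 {
        let exp = b[i].floor() as i32; // assumption: b[i] is finite and small
        dst[i] = a[i] * 2.0f64.powi(exp);
    }
    dst
}
```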
- _mm256_
shuffle_ f32x4 Experimental avx512f,avx512vl
- Shuffle 128-bits (composed of 4 single-precision (32-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst.
- _mm256_
shuffle_ f64x2 Experimental avx512f,avx512vl
- Shuffle 128-bits (composed of 2 double-precision (64-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst.
- _mm256_
shuffle_ i32x4 Experimental avx512f,avx512vl
- Shuffle 128-bits (composed of 4 32-bit integers) selected by imm8 from a and b, and store the results in dst.
- _mm256_
shuffle_ i64x2 Experimental avx512f,avx512vl
- Shuffle 128-bits (composed of 2 64-bit integers) selected by imm8 from a and b, and store the results in dst.
- _mm256_
sra_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by count while shifting in sign bits, and store the results in dst.
- _mm256_
srai_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst.
- _mm256_
srav_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst.
- _mm256_
store_ epi32 Experimental avx512f,avx512vl
- Store 256-bits (composed of 8 packed 32-bit integers) from a into memory. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_
store_ epi64 Experimental avx512f,avx512vl
- Store 256-bits (composed of 4 packed 64-bit integers) from a into memory. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_
storeu_ epi32 Experimental avx512f,avx512vl
- Store 256-bits (composed of 8 packed 32-bit integers) from a into memory. mem_addr does not need to be aligned on any particular boundary.
- _mm256_
storeu_ epi64 Experimental avx512f,avx512vl
- Store 256-bits (composed of 4 packed 64-bit integers) from a into memory. mem_addr does not need to be aligned on any particular boundary.
- _mm256_
ternarylogic_ epi32 Experimental avx512f,avx512vl
- Bitwise ternary logic that provides the capability to implement any three-operand binary function; the specific binary function is specified by the value in imm8. For each bit in each packed 32-bit integer, the corresponding bits from a, b, and c are used to form a 3-bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst.
- _mm256_
ternarylogic_ epi64 Experimental avx512f,avx512vl
- Bitwise ternary logic that provides the capability to implement any three-operand binary function; the specific binary function is specified by the value in imm8. For each bit in each packed 64-bit integer, the corresponding bits from a, b, and c are used to form a 3-bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst.
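The imm8 lookup described for the ternarylogic entries is easiest to see on a single lane. The scalar sketch below treats imm8 as an 8-entry truth table; the function name is hypothetical, and it assumes the index is formed with a as the most significant bit and c as the least significant.

```rust
/// Scalar model of ternary logic on one 32-bit lane.
/// Hypothetical helper; assumes the truth-table index is (a << 2) | (b << 1) | c.
fn ternarylogic_u32(a: u32, b: u32, c: u32, imm8: u8) -> u32 {
    let mut dst = 0u32;
    for bit in 0..32 {
        // Gather one bit from each operand into a 3-bit index.
        let idx = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1);
        // Look that index up in imm8 and write the result bit.
        dst |= (((imm8 >> idx) & 1) as u32) << bit;
    }
    dst
}
```

Under that indexing, 0x96 yields a XOR b XOR c, and 0xCA yields a bitwise select that takes b where a is set and c elsewhere.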
- _mm256_
test_ epi32_ mask Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 32-bit integers in a and b, producing intermediate 32-bit values, and set the corresponding bit in result mask k if the intermediate value is non-zero.
- _mm256_
test_ epi64_ mask Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 64-bit integers in a and b, producing intermediate 64-bit values, and set the corresponding bit in result mask k if the intermediate value is non-zero.
- _mm256_
testn_ epi32_ mask Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 32-bit integers in a and b, producing intermediate 32-bit values, and set the corresponding bit in result mask k if the intermediate value is zero.
- _mm256_
testn_ epi64_ mask Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 64-bit integers in a and b, producing intermediate 64-bit values, and set the corresponding bit in result mask k if the intermediate value is zero.
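The test and testn entries above both AND the lanes of a and b and differ only in which outcome sets the mask bit. A scalar sketch for eight 32-bit lanes, with hypothetical function names and assuming bit i of the result corresponds to lane i:

```rust
/// Scalar model of test: mask bit i is set when (a[i] & b[i]) is non-zero.
/// Hypothetical helper, not part of `core::arch`.
fn test_epi32_mask_model(a: [u32; 8], b: [u32; 8]) -> u8 {
    let mut k = 0u8;
    for i in 0..8 {
        if a[i] & b[i] != 0 {
            k |= 1 << i;
        }
    }
    k
}

/// Scalar model of testn: mask bit i is set when (a[i] & b[i]) is zero.
fn testn_epi32_mask_model(a: [u32; 8], b: [u32; 8]) -> u8 {
    !test_epi32_mask_model(a, b) // exact complement over the eight lanes
}
```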
- _mm256_
xor_ epi32 Experimental avx512f,avx512vl
- Compute the bitwise XOR of packed 32-bit integers in a and b, and store the results in dst.
- _mm256_
xor_ epi64 Experimental avx512f,avx512vl
- Compute the bitwise XOR of packed 64-bit integers in a and b, and store the results in dst.
- _mm512_
abs_ epi32 Experimental avx512f
- Computes the absolute values of packed 32-bit integers in a.
- _mm512_
abs_ epi64 Experimental avx512f
- Compute the absolute value of packed signed 64-bit integers in a, and store the unsigned results in dst.
- _mm512_
abs_ pd Experimental avx512f
- Finds the absolute value of each packed double-precision (64-bit) floating-point element in v2, storing the results in dst.
- _mm512_
abs_ ps Experimental avx512f
- Finds the absolute value of each packed single-precision (32-bit) floating-point element in v2, storing the results in dst.
- _mm512_
add_ epi32 Experimental avx512f
- Add packed 32-bit integers in a and b, and store the results in dst.
- _mm512_
add_ epi64 Experimental avx512f
- Add packed 64-bit integers in a and b, and store the results in dst.
- _mm512_
add_ pd Experimental avx512f
- Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.
- _mm512_
add_ ps Experimental avx512f
- Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.
- _mm512_
add_ round_ pd Experimental avx512f
- Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.
- _mm512_
add_ round_ ps Experimental avx512f
- Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.
- _mm512_
alignr_ epi32 Experimental avx512f
- Concatenate a and b into a 128-byte intermediate result, shift the result right by imm8 32-bit elements, and store the low 64 bytes (16 elements) in dst.
- _mm512_
alignr_ epi64 Experimental avx512f
- Concatenate a and b into a 128-byte intermediate result, shift the result right by imm8 64-bit elements, and store the low 64 bytes (8 elements) in dst.
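The concatenate-and-shift behaviour of the alignr entries can be modelled per element. The sketch below covers the sixteen-lane 32-bit case; the function name is hypothetical, and it assumes that b forms the low half of the concatenation, a the high half, and that only the low 4 bits of imm8 are used as the element shift.

```rust
/// Scalar model of alignr over sixteen 32-bit lanes.
/// Hypothetical helper; assumes b is the low half, a the high half,
/// and the shift count is imm8 & 15 elements.
fn alignr_epi32_model(a: [i32; 16], b: [i32; 16], imm8: u32) -> [i32; 16] {
    let shift = (imm8 & 15) as usize;
    let mut dst = [0i32; 16];
    for i in 0..16 {
        let j = i + shift; // position in the 32-element concatenation
        dst[i] = if j < 16 { b[j] } else { a[j - 16] };
    }
    dst
}
```

A shift of 0 returns b unchanged; larger shifts pull progressively more of the low elements of a into the top of the result.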
- _mm512_
and_ epi32 Experimental avx512f
- Compute the bitwise AND of packed 32-bit integers in a and b, and store the results in dst.
- _mm512_
and_ epi64 Experimental avx512f
- Compute the bitwise AND of 512 bits (composed of packed 64-bit integers) in a and b, and store the results in dst.
- _mm512_
and_ si512 Experimental avx512f
- Compute the bitwise AND of 512 bits (representing integer data) in a and b, and store the result in dst.
- _mm512_
andnot_ epi32 Experimental avx512f
- Compute the bitwise NOT of packed 32-bit integers in a and then AND with b, and store the results in dst.
- _mm512_
andnot_ epi64 Experimental avx512f
- Compute the bitwise NOT of 512 bits (composed of packed 64-bit integers) in a and then AND with b, and store the results in dst.
- _mm512_
andnot_ si512 Experimental avx512f
- Compute the bitwise NOT of 512 bits (representing integer data) in a and then AND with b, and store the result in dst.
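The operand order of the andnot entries is a common source of mistakes: it is the first operand that is inverted. A minimal scalar sketch for one 32-bit lane, in plain Rust with the hypothetical helper name `andnot_lane`:

```rust
/// Scalar model of one lane of andnot: NOT is applied to `a`, then AND with `b`.
fn andnot_lane(a: u32, b: u32) -> u32 {
    !a & b
}
```

For example, `andnot_lane(0b1100, 0b1010)` gives `0b0010`: the bits of `b` that are not set in `a`.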
- _mm512_
broadcast_ f32x4 Experimental avx512f
- Broadcast the 4 packed single-precision (32-bit) floating-point elements from a to all elements of dst.
- _mm512_
broadcast_ f64x4 Experimental avx512f
- Broadcast the 4 packed double-precision (64-bit) floating-point elements from a to all elements of dst.
- _mm512_
broadcast_ i32x4 Experimental avx512f
- Broadcast the 4 packed 32-bit integers from a to all elements of dst.
- _mm512_
broadcast_ i64x4 Experimental avx512f
- Broadcast the 4 packed 64-bit integers from a to all elements of dst.
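The four-element broadcasts above repeat a group of 4 elements until the 512-bit destination is full. A minimal scalar sketch in plain Rust follows; the function names are hypothetical and the arrays merely stand in for the vector types.

```rust
/// Scalar model of broadcasting 4 single-precision elements into 16 lanes.
fn broadcast_f32x4_model(a: [f32; 4]) -> [f32; 16] {
    let mut dst = [0.0f32; 16];
    for i in 0..16 {
        dst[i] = a[i % 4]; // the 4-element group repeats four times
    }
    dst
}

/// Scalar model of broadcasting 4 64-bit integers into 8 lanes.
fn broadcast_i64x4_model(a: [i64; 4]) -> [i64; 8] {
    let mut dst = [0i64; 8];
    for i in 0..8 {
        dst[i] = a[i % 4]; // the 4-element group repeats twice
    }
    dst
}
```

The 32-bit group fills the destination with four copies, while the 64-bit group fits only twice.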
- _mm512_
broadcastd_ epi32 Experimental avx512f
- Broadcast the low packed 32-bit integer from a to all elements of dst.
- _mm512_
broadcastq_ epi64 Experimental avx512f
- Broadcast the low packed 64-bit integer from a to all elements of dst.
- _mm512_
broadcastsd_ pd Experimental avx512f
- Broadcast the low double-precision (64-bit) floating-point element from a to all elements of dst.
- _mm512_
broadcastss_ ps Experimental avx512f
- Broadcast the low single-precision (32-bit) floating-point element from a to all elements of dst.
- _mm512_
castpd128_ pd512 Experimental avx512f
- Cast vector of type __m128d to type __m512d; the upper 384 bits of the result are undefined. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_
castpd256_ pd512 Experimental avx512f
- Cast vector of type __m256d to type __m512d; the upper 256 bits of the result are undefined. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_
castpd512_ pd128 Experimental avx512f
- Cast vector of type __m512d to type __m128d. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_
castpd512_ pd256 Experimental avx512f
- Cast vector of type __m512d to type __m256d. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_
castpd_ ps Experimental avx512f
- Cast vector of type __m512d to type __m512. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_
castpd_ si512 Experimental avx512f
- Cast vector of type __m512d to type __m512i. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_
castps128_ ps512 Experimental avx512f
- Cast vector of type __m128 to type __m512; the upper 384 bits of the result are undefined. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_
castps256_ ps512 Experimental avx512f
- Cast vector of type __m256 to type __m512; the upper 256 bits of the result are undefined. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_
castps512_ ps128 Experimental avx512f
- Cast vector of type __m512 to type __m128. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_
castps512_ ps256 Experimental avx512f
- Cast vector of type __m512 to type __m256. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_
castps_ pd Experimental avx512f
- Cast vector of type __m512 to type __m512d. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_
castps_ si512 Experimental avx512f
- Cast vector of type __m512 to type __m512i. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_
castsi128_ si512 Experimental avx512f
- Cast vector of type __m128i to type __m512i; the upper 384 bits of the result are undefined. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_
castsi256_ si512 Experimental avx512f
- Cast vector of type __m256i to type __m512i; the upper 256 bits of the result are undefined. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_
castsi512_ pd Experimental avx512f
- Cast vector of type __m512i to type __m512d. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_
castsi512_ ps Experimental avx512f
- Cast vector of type __m512i to type __m512. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_
castsi512_ si128 Experimental avx512f
- Cast vector of type __m512i to type __m128i. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_
castsi512_ si256 Experimental avx512f
- Cast vector of type __m512i to type __m256i. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_
cmp_ epi32_ mask Experimental avx512f
- Compare packed signed 32-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k.
- _mm512_
cmp_ epi64_ mask Experimental avx512f
- Compare packed signed 64-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k.
- _mm512_
cmp_ epu32_ mask Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k.
- _mm512_
cmp_ epu64_ mask Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k.
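The integer compare entries above take a predicate in imm8 and return one mask bit per lane. The following is a minimal scalar sketch in plain Rust of how such a mask could be built for 16 lanes. The names `predicate` and `cmp_mask_model` are hypothetical, and the numeric values given to the `CMPINT_*` constants are an assumption about the usual `_MM_CMPINT_*` encoding, not something this listing states.

```rust
// Assumed predicate encodings (mirroring the `_MM_CMPINT_*` constants).
const CMPINT_EQ: u8 = 0;
const CMPINT_LT: u8 = 1;
const CMPINT_LE: u8 = 2;
const CMPINT_FALSE: u8 = 3;
const CMPINT_NE: u8 = 4;
const CMPINT_NLT: u8 = 5;
const CMPINT_NLE: u8 = 6;
const CMPINT_TRUE: u8 = 7;

/// Evaluate one lane. Signedness comes from the element type `T`.
fn predicate<T: PartialOrd>(a: T, b: T, imm8: u8) -> bool {
    match imm8 & 0x7 {
        CMPINT_EQ => a == b,
        CMPINT_LT => a < b,
        CMPINT_LE => a <= b,
        CMPINT_FALSE => false,
        CMPINT_NE => a != b,
        CMPINT_NLT => a >= b, // not less-than
        CMPINT_NLE => a > b,  // not less-than-or-equal
        CMPINT_TRUE => true,
        _ => unreachable!(), // `imm8 & 0x7` is always in 0..=7
    }
}

/// Build a 16-bit mask: bit i is set when lane i satisfies the predicate.
fn cmp_mask_model<T: PartialOrd + Copy>(a: [T; 16], b: [T; 16], imm8: u8) -> u16 {
    let mut k = 0u16;
    for i in 0..16 {
        if predicate(a[i], b[i], imm8) {
            k |= 1 << i;
        }
    }
    k
}
```

Instantiating the model with `i32` or `u32` shows why the signed and unsigned variants are listed separately: the bit pattern `0xFFFFFFFF` is less than 1 when read as signed, but not when read as unsigned.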
- _mm512_
cmp_ pd_ mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k.
- _mm512_
cmp_ ps_ mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k.
- _mm512_
cmp_ round_ pd_ mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
cmp_ round_ ps_ mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
cmpeq_ epi32_ mask Experimental avx512f
- Compare packed 32-bit integers in a and b for equality, and store the results in mask vector k.
- _mm512_
cmpeq_ epi64_ mask Experimental avx512f
- Compare packed 64-bit integers in a and b for equality, and store the results in mask vector k.
- _mm512_
cmpeq_ epu32_ mask Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b for equality, and store the results in mask vector k.
- _mm512_
cmpeq_ epu64_ mask Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b for equality, and store the results in mask vector k.
- _mm512_
cmpeq_ pd_ mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b for equality, and store the results in mask vector k.
- _mm512_
cmpeq_ ps_ mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b for equality, and store the results in mask vector k.
- _mm512_
cmpge_ epi32_ mask Experimental avx512f
- Compare packed signed 32-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k.
- _mm512_
cmpge_ epi64_ mask Experimental avx512f
- Compare packed signed 64-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k.
- _mm512_
cmpge_ epu32_ mask Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k.
- _mm512_
cmpge_ epu64_ mask Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k.
- _mm512_
cmpgt_ epi32_ mask Experimental avx512f
- Compare packed signed 32-bit integers in a and b for greater-than, and store the results in mask vector k.
- _mm512_
cmpgt_ epi64_ mask Experimental avx512f
- Compare packed signed 64-bit integers in a and b for greater-than, and store the results in mask vector k.
- _mm512_
cmpgt_ epu32_ mask Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b for greater-than, and store the results in mask vector k.
- _mm512_
cmpgt_ epu64_ mask Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b for greater-than, and store the results in mask vector k.
- _mm512_
cmple_ epi32_ mask Experimental avx512f
- Compare packed signed 32-bit integers in a and b for less-than-or-equal, and store the results in mask vector k.
- _mm512_
cmple_ epi64_ mask Experimental avx512f
- Compare packed signed 64-bit integers in a and b for less-than-or-equal, and store the results in mask vector k.
- _mm512_
cmple_ epu32_ mask Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b for less-than-or-equal, and store the results in mask vector k.
- _mm512_
cmple_ epu64_ mask Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b for less-than-or-equal, and store the results in mask vector k.
- _mm512_
cmple_ pd_ mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b for less-than-or-equal, and store the results in mask vector k.
- _mm512_
cmple_ ps_ mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b for less-than-or-equal, and store the results in mask vector k.
- _mm512_
cmplt_ epi32_ mask Experimental avx512f
- Compare packed signed 32-bit integers in a and b for less-than, and store the results in mask vector k.
- _mm512_
cmplt_ epi64_ mask Experimental avx512f
- Compare packed signed 64-bit integers in a and b for less-than, and store the results in mask vector k.
- _mm512_
cmplt_ epu32_ mask Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b for less-than, and store the results in mask vector k.
- _mm512_
cmplt_ epu64_ mask Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b for less-than, and store the results in mask vector k.
- _mm512_
cmplt_ pd_ mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b for less-than, and store the results in mask vector k.
- _mm512_
cmplt_ ps_ mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b for less-than, and store the results in mask vector k.
- _mm512_
cmpneq_ epi32_ mask Experimental avx512f
- Compare packed 32-bit integers in a and b for not-equal, and store the results in mask vector k.
- _mm512_
cmpneq_ epi64_ mask Experimental avx512f
- Compare packed signed 64-bit integers in a and b for not-equal, and store the results in mask vector k.
- _mm512_
cmpneq_ epu32_ mask Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b for not-equal, and store the results in mask vector k.
- _mm512_
cmpneq_ epu64_ mask Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b for not-equal, and store the results in mask vector k.
- _mm512_
cmpneq_ pd_ mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b for not-equal, and store the results in mask vector k.
- _mm512_
cmpneq_ ps_ mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b for not-equal, and store the results in mask vector k.
- _mm512_
cmpnle_ pd_ mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b for not-less-than-or-equal, and store the results in mask vector k.
- _mm512_
cmpnle_ ps_ mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b for not-less-than-or-equal, and store the results in mask vector k.
- _mm512_
cmpnlt_ pd_ mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b for not-less-than, and store the results in mask vector k.
- _mm512_
cmpnlt_ ps_ mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b for not-less-than, and store the results in mask vector k.
- _mm512_
cmpord_ pd_ mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b to see if neither is NaN, and store the results in mask vector k.
- _mm512_
cmpord_ ps_ mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b to see if neither is NaN, and store the results in mask vector k.
- _mm512_
cmpunord_ pd_ mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b to see if either is NaN, and store the results in mask vector k.
- _mm512_
cmpunord_ ps_ mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b to see if either is NaN, and store the results in mask vector k.
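The floating-point predicates above differ mainly in how they treat NaN. The sketch below models the double-precision ordered, unordered, less-than-or-equal and not-less-than-or-equal comparisons on 8 lanes in plain Rust. All function names are hypothetical, and treating not-less-than-or-equal as true for NaN inputs is an assumption about the underlying predicate, not something this listing spells out.

```rust
/// Build an 8-bit mask from a per-lane predicate.
fn mask_from(a: [f64; 8], b: [f64; 8], pred: fn(f64, f64) -> bool) -> u8 {
    let mut k = 0u8;
    for i in 0..8 {
        if pred(a[i], b[i]) {
            k |= 1 << i;
        }
    }
    k
}

/// Ordered: neither operand is NaN.
fn cmpord_pd_mask_model(a: [f64; 8], b: [f64; 8]) -> u8 {
    mask_from(a, b, |x, y| !x.is_nan() && !y.is_nan())
}

/// Unordered: at least one operand is NaN.
fn cmpunord_pd_mask_model(a: [f64; 8], b: [f64; 8]) -> u8 {
    mask_from(a, b, |x, y| x.is_nan() || y.is_nan())
}

/// Less-than-or-equal: false whenever an operand is NaN.
fn cmple_pd_mask_model(a: [f64; 8], b: [f64; 8]) -> u8 {
    mask_from(a, b, |x, y| x <= y)
}

/// Not-less-than-or-equal: the negation, so assumed true for NaN operands.
fn cmpnle_pd_mask_model(a: [f64; 8], b: [f64; 8]) -> u8 {
    mask_from(a, b, |x, y| !(x <= y))
}
```

Under these assumptions a lane containing NaN sets its bit in the unordered and not-less-than-or-equal masks and clears it in the ordered and less-than-or-equal masks, which is why "not less-than-or-equal" is not interchangeable with "greater-than".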
- _mm512_
cvt_ roundepi32_ ps Experimental avx512f
- Convert packed signed 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
- _mm512_
cvt_ roundepu32_ ps Experimental avx512f
- Convert packed unsigned 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
- _mm512_
cvt_ roundpd_ epi32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst.
- _mm512_
cvt_ roundpd_ epu32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst.
- _mm512_
cvt_ roundpd_ ps Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
- _mm512_
cvt_ roundph_ ps Experimental avx512f
- Convert packed half-precision (16-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
cvt_ roundps_ epi32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst.
- _mm512_
cvt_ roundps_ epu32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst.
- _mm512_
cvt_ roundps_ pd Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed double-precision (64-bit) floating-point elements, and store the results in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
cvt_ roundps_ ph Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed half-precision (16-bit) floating-point elements, and store the results in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
cvtepi8_ epi32 Experimental avx512f
- Sign extend packed 8-bit integers in a to packed 32-bit integers, and store the results in dst.
- _mm512_
cvtepi8_ epi64 Experimental avx512f
- Sign extend packed 8-bit integers in the low 8 bytes of a to packed 64-bit integers, and store the results in dst.
- _mm512_
cvtepi16_ epi32 Experimental avx512f
- Sign extend packed 16-bit integers in a to packed 32-bit integers, and store the results in dst.
- _mm512_
cvtepi16_ epi64 Experimental avx512f
- Sign extend packed 16-bit integers in a to packed 64-bit integers, and store the results in dst.
- _mm512_
cvtepi32_ epi8 Experimental avx512f
- Convert packed 32-bit integers in a to packed 8-bit integers with truncation, and store the results in dst.
- _mm512_
cvtepi32_ epi16 Experimental avx512f
- Convert packed 32-bit integers in a to packed 16-bit integers with truncation, and store the results in dst.
- _mm512_
cvtepi32_ epi64 Experimental avx512f
- Sign extend packed 32-bit integers in a to packed 64-bit integers, and store the results in dst.
- _mm512_
cvtepi32_ pd Experimental avx512f
- Convert packed signed 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.
- _mm512_
cvtepi32_ ps Experimental avx512f
- Convert packed signed 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
- _mm512_
cvtepi32lo_ pd Experimental avx512f
- Performs element-by-element conversion of the lower half of packed 32-bit integer elements in v2 to packed double-precision (64-bit) floating-point elements, storing the results in dst.
- _mm512_
cvtepi64_ epi8 Experimental avx512f
- Convert packed 64-bit integers in a to packed 8-bit integers with truncation, and store the results in dst.
- _mm512_
cvtepi64_ epi16 Experimental avx512f
- Convert packed 64-bit integers in a to packed 16-bit integers with truncation, and store the results in dst.
- _mm512_
cvtepi64_ epi32 Experimental avx512f
- Convert packed 64-bit integers in a to packed 32-bit integers with truncation, and store the results in dst.
- _mm512_
cvtepu8_ epi32 Experimental avx512f
- Zero extend packed unsigned 8-bit integers in a to packed 32-bit integers, and store the results in dst.
- _mm512_
cvtepu8_ epi64 Experimental avx512f
- Zero extend packed unsigned 8-bit integers in the low 8 bytes of a to packed 64-bit integers, and store the results in dst.
- _mm512_
cvtepu16_ epi32 Experimental avx512f
- Zero extend packed unsigned 16-bit integers in a to packed 32-bit integers, and store the results in dst.
- _mm512_
cvtepu16_ epi64 Experimental avx512f
- Zero extend packed unsigned 16-bit integers in a to packed 64-bit integers, and store the results in dst.
- _mm512_
cvtepu32_ epi64 Experimental avx512f
- Zero extend packed unsigned 32-bit integers in a to packed 64-bit integers, and store the results in dst.
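The widening conversions above come in sign-extending (`cvtepi*`) and zero-extending (`cvtepu*`) forms. A minimal scalar sketch for one 8-bit lane widened to 32 bits, in plain Rust with hypothetical helper names:

```rust
/// Scalar model of sign extension: the top bit of the byte is replicated.
fn sign_extend_8_to_32(x: i8) -> i32 {
    x as i32
}

/// Scalar model of zero extension: the upper bits are filled with zeros.
fn zero_extend_8_to_32(x: u8) -> i32 {
    x as i32
}
```

The same byte `0xF0` widens to -16 under sign extension and to 240 under zero extension, so the choice of variant depends on how the source bytes are meant to be interpreted.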
- _mm512_
cvtepu32_ pd Experimental avx512f
- Convert packed unsigned 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.
- _mm512_
cvtepu32_ ps Experimental avx512f
- Convert packed unsigned 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
- _mm512_
cvtepu32lo_ pd Experimental avx512f
- Performs element-by-element conversion of the lower half of packed 32-bit unsigned integer elements in v2 to packed double-precision (64-bit) floating-point elements, storing the results in dst.
- _mm512_
cvtpd_ epi32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst.
- _mm512_
cvtpd_ epu32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst.
- _mm512_
cvtpd_ ps Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
- _mm512_
cvtpd_ pslo Experimental avx512f
- Performs an element-by-element conversion of packed double-precision (64-bit) floating-point elements in v2 to single-precision (32-bit) floating-point elements and stores them in dst. The elements are stored in the lower half of the results vector, while the remaining upper half locations are set to 0.
- _mm512_
cvtph_ ps Experimental avx512f
- Convert packed half-precision (16-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
- _mm512_
cvtps_ epi32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst.
- _mm512_
cvtps_ epu32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst.
- _mm512_
cvtps_ pd Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.
- _mm512_
cvtps_ ph Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed half-precision (16-bit) floating-point elements, and store the results in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
cvtpslo_ pd Experimental avx512f
- Performs element-by-element conversion of the lower half of packed single-precision (32-bit) floating-point elements in v2 to packed double-precision (64-bit) floating-point elements, storing the results in dst.
- _mm512_
cvtsd_ f64 Experimental avx512f
- Copy the lower double-precision (64-bit) floating-point element of a to dst.
- _mm512_
cvtsepi32_ epi8 Experimental avx512f
- Convert packed signed 32-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst.
- _mm512_
cvtsepi32_ epi16 Experimental avx512f
- Convert packed signed 32-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst.
- _mm512_
cvtsepi64_ epi8 Experimental avx512f
- Convert packed signed 64-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst.
- _mm512_
cvtsepi64_ epi16 Experimental avx512f
- Convert packed signed 64-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst.
- _mm512_
cvtsepi64_ epi32 Experimental avx512f
- Convert packed signed 64-bit integers in a to packed 32-bit integers with signed saturation, and store the results in dst.
- _mm512_
cvtsi512_ si32 Experimental avx512f
- Copy the lower 32-bit integer in a to dst.
- _mm512_
cvtss_ f32 Experimental avx512f
- Copy the lower single-precision (32-bit) floating-point element of a to dst.
- _mm512_
cvtt_ roundpd_ epi32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
cvtt_ roundpd_ epu32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
cvtt_ roundps_ epi32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
cvtt_ roundps_ epu32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
cvttpd_ epi32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst.
- _mm512_
cvttpd_ epu32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst.
- _mm512_
cvttps_ epi32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst.
- _mm512_
cvttps_ epu32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst.
- _mm512_
cvtusepi32_ epi8 Experimental avx512f
- Convert packed unsigned 32-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst.
- _mm512_
cvtusepi32_ epi16 Experimental avx512f
- Convert packed unsigned 32-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst.
- _mm512_
cvtusepi64_ epi8 Experimental avx512f
- Convert packed unsigned 64-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst.
- _mm512_
cvtusepi64_ epi16 Experimental avx512f
- Convert packed unsigned 64-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst.
- _mm512_
cvtusepi64_ epi32 Experimental avx512f
- Convert packed unsigned 64-bit integers in a to packed unsigned 32-bit integers with unsigned saturation, and store the results in dst.
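This section lists three families of narrowing conversions: truncating (`cvtepi*`), signed-saturating (`cvtsepi*`) and unsigned-saturating (`cvtusepi*`). The following minimal scalar sketch, in plain Rust with hypothetical helper names, contrasts the three for one 32-bit lane narrowed to 8 bits.

```rust
/// Truncation: keep only the low 8 bits of the input.
fn narrow_truncate(x: i32) -> i8 {
    x as i8
}

/// Signed saturation: clamp to the range of i8 before narrowing.
fn narrow_saturate_signed(x: i32) -> i8 {
    x.clamp(i8::MIN as i32, i8::MAX as i32) as i8
}

/// Unsigned saturation: clamp to the range of u8 before narrowing.
fn narrow_saturate_unsigned(x: u32) -> u8 {
    x.min(u8::MAX as u32) as u8
}
```

For an input of 300, truncation yields 44 (the low byte `0x2C`), signed saturation yields 127, and unsigned saturation yields 255. Saturation is usually the safer choice when out-of-range values should stay at the nearest representable extreme.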
- _mm512_
div_ pd Experimental avx512f
- Divide packed double-precision (64-bit) floating-point elements in a by packed elements in b, and store the results in dst.
- _mm512_
div_ ps Experimental avx512f
- Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b, and store the results in dst.
- _mm512_
div_ round_ pd Experimental avx512f
- Divide packed double-precision (64-bit) floating-point elements in a by packed elements in b, and store the results in dst.
- _mm512_
div_ round_ ps Experimental avx512f
- Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b, and store the results in dst.
- _mm512_
extractf32x4_ ps Experimental avx512f
- Extract 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a, selected with imm8, and store the result in dst.
- _mm512_
extractf64x4_ pd Experimental avx512f
- Extract 256 bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a, selected with imm8, and store the result in dst.
- _mm512_
extracti32x4_ epi32 Experimental avx512f
- Extract 128 bits (composed of 4 packed 32-bit integers) from a, selected with IMM2, and store the result in dst.
- _mm512_
extracti64x4_ epi64 Experimental avx512f
- Extract 256 bits (composed of 4 packed 64-bit integers) from a, selected with IMM1, and store the result in dst.
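The extract entries above select one fixed-size group of elements from the 512-bit source. A minimal scalar sketch in plain Rust follows. The function names are hypothetical, the selector is modelled as an ordinary runtime argument for readability, and masking it to 2 bits (128-bit groups) or 1 bit (256-bit groups) is an assumption.

```rust
/// Scalar model of extracting one 128-bit group (4 x f32) out of 16 lanes.
fn extractf32x4_ps_model(a: [f32; 16], imm8: u32) -> [f32; 4] {
    let start = (imm8 & 0x3) as usize * 4; // assumed: 2 selector bits, 4 groups
    let mut dst = [0.0f32; 4];
    dst.copy_from_slice(&a[start..start + 4]);
    dst
}

/// Scalar model of extracting one 256-bit group (4 x i64) out of 8 lanes.
fn extracti64x4_epi64_model(a: [i64; 8], imm8: u32) -> [i64; 4] {
    let start = (imm8 & 0x1) as usize * 4; // assumed: 1 selector bit, 2 groups
    let mut dst = [0i64; 4];
    dst.copy_from_slice(&a[start..start + 4]);
    dst
}
```

With a source holding 0.0 through 15.0, selector 2 returns the elements 8.0 to 11.0; for the 256-bit case, selector 1 returns the upper half.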
- _mm512_
fixupimm_ pd Experimental avx512f
- Fix up packed double-precision (64-bit) floating-point elements in a and b using packed 64-bit integers in c, and store the results in dst. imm8 is used to set the required flags reporting.
- _mm512_
fixupimm_ ps Experimental avx512f
- Fix up packed single-precision (32-bit) floating-point elements in a and b using packed 32-bit integers in c, and store the results in dst. imm8 is used to set the required flags reporting.
- _mm512_
fixupimm_ round_ pd Experimental avx512f
- Fix up packed double-precision (64-bit) floating-point elements in a and b using packed 64-bit integers in c, and store the results in dst. imm8 is used to set the required flags reporting.
- _mm512_
fixupimm_ round_ ps Experimental avx512f
- Fix up packed single-precision (32-bit) floating-point elements in a and b using packed 32-bit integers in c, and store the results in dst. imm8 is used to set the required flags reporting.
- _mm512_
fmadd_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst.
- _mm512_
fmadd_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst.
- _mm512_
fmadd_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst.
- _mm512_
fmadd_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst.
- _mm512_
fmaddsub_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst.
- _mm512_
fmaddsub_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst.
- _mm512_
fmaddsub_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst.
- _mm512_
fmaddsub_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst.
- _mm512_
fmsub_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst.
- _mm512_
fmsub_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst.
- _mm512_
fmsub_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst.
- _mm512_
fmsub_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst.
- _mm512_
fmsubadd_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst.
- _mm512_
fmsubadd_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst.
- _mm512_
fmsubadd_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst.
- _mm512_
fmsubadd_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst.
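The add/subtract alternation described for the fmaddsub and fmsubadd entries can be sketched with a portable scalar model. The function names below are hypothetical helpers, not intrinsics, and the lane convention (even lanes subtract for fmaddsub, even lanes add for fmsubadd) is stated here as my understanding of Intel's pseudocode.

```rust
/// Scalar model of an 8-lane `fmaddsub`: even lanes compute a*b - c,
/// odd lanes compute a*b + c. `mul_add` rounds once, like a fused operation.
/// Hypothetical helper for illustration only.
fn fmaddsub_pd_model(a: &[f64; 8], b: &[f64; 8], c: &[f64; 8]) -> [f64; 8] {
    let mut dst = [0.0; 8];
    for i in 0..8 {
        let sign = if i % 2 == 0 { -1.0 } else { 1.0 };
        dst[i] = a[i].mul_add(b[i], sign * c[i]);
    }
    dst
}

/// Scalar model of an 8-lane `fmsubadd`: the mirror image of the above,
/// even lanes compute a*b + c, odd lanes compute a*b - c.
fn fmsubadd_pd_model(a: &[f64; 8], b: &[f64; 8], c: &[f64; 8]) -> [f64; 8] {
    let mut dst = [0.0; 8];
    for i in 0..8 {
        let sign = if i % 2 == 0 { 1.0 } else { -1.0 };
        dst[i] = a[i].mul_add(b[i], sign * c[i]);
    }
    dst
}
```

With a = 2, b = 3 and c = 1 in every lane, the first model yields 5 in even lanes and 7 in odd lanes, and the second model yields the opposite pattern.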
- _mm512_
fnmadd_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst.
- _mm512_
fnmadd_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst.
- _mm512_
fnmadd_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst.
- _mm512_
fnmadd_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst.
- _mm512_
fnmsub_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst.
- _mm512_
fnmsub_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst.
- _mm512_
fnmsub_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst.
- _mm512_
fnmsub_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst.
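The sign conventions of the fmsub, fnmadd and fnmsub entries are easy to mix up, so here is a per-element scalar sketch of the three formulas. The helper names are hypothetical; each function models one lane using `f64::mul_add`, which rounds once.

```rust
/// One lane of `fmsub`: (a * b) - c.
fn fmsub_lane(a: f64, b: f64, c: f64) -> f64 {
    a.mul_add(b, -c)
}

/// One lane of `fnmadd`: -(a * b) + c.
fn fnmadd_lane(a: f64, b: f64, c: f64) -> f64 {
    (-a).mul_add(b, c)
}

/// One lane of `fnmsub`: -(a * b) - c.
fn fnmsub_lane(a: f64, b: f64, c: f64) -> f64 {
    (-a).mul_add(b, -c)
}
```

For a = 2, b = 3, c = 1 these give 5, -5 and -7 respectively.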
- _mm512_
getexp_ pd Experimental avx512f
- Convert the exponent of each packed double-precision (64-bit) floating-point element in a to a double-precision (64-bit) floating-point number representing the integer exponent, and store the results in dst. This intrinsic essentially calculates floor(log2(x)) for each element.
- _mm512_
getexp_ ps Experimental avx512f
- Convert the exponent of each packed single-precision (32-bit) floating-point element in a to a single-precision (32-bit) floating-point number representing the integer exponent, and store the results in dst. This intrinsic essentially calculates floor(log2(x)) for each element.
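The floor(log2(x)) reading of getexp can be illustrated on a single f64 by extracting the biased exponent field directly. This is a sketch under the assumption that the input is a normal, finite, non-zero value; zeros, denormals, infinities and NaNs are not modeled, and `getexp_lane_model` is a hypothetical name.

```rust
/// Scalar model of one `getexp` lane for normal, finite, non-zero f64 inputs.
/// Reads the 11-bit biased exponent and removes the bias of 1023, which
/// equals floor(log2(|x|)) for such inputs. The sign of x is ignored.
fn getexp_lane_model(x: f64) -> f64 {
    let biased = ((x.to_bits() >> 52) & 0x7ff) as i64;
    (biased - 1023) as f64
}
```

For example, 8.0 maps to 3.0, 0.75 maps to -1.0, and -10.0 maps to 3.0.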
- _mm512_
getexp_ round_ pd Experimental avx512f
- Convert the exponent of each packed double-precision (64-bit) floating-point element in a to a double-precision (64-bit) floating-point number representing the integer exponent, and store the results in dst. This intrinsic essentially calculates floor(log2(x)) for each element.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
getexp_ round_ ps Experimental avx512f
- Convert the exponent of each packed single-precision (32-bit) floating-point element in a to a single-precision (32-bit) floating-point number representing the integer exponent, and store the results in dst. This intrinsic essentially calculates floor(log2(x)) for each element.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
getmant_ pd Experimental avx512f
- Normalize the mantissas of packed double-precision (64-bit) floating-point elements in a, and store the results in dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc, which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
- _mm512_
getmant_ ps Experimental avx512f
- Normalize the mantissas of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc, which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
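A scalar sketch can make the mantissa normalization concrete. It models only two of the four intervals ([1, 2) and [0.5, 1)) and two of the three sign modes (source sign and zero sign), and only for normal, finite, non-zero f64 inputs. The enums and the function name are hypothetical stand-ins for the interv and sc constants.

```rust
/// Hypothetical stand-in for two of the `interv` choices.
#[derive(Clone, Copy)]
enum Interval {
    OneToTwo,  // [1, 2)
    HalfToOne, // [0.5, 1)
}

/// Hypothetical stand-in for two of the `sc` choices.
#[derive(Clone, Copy)]
enum SignMode {
    Src,  // keep the sign of the input
    Zero, // force a positive result
}

/// Scalar model of one `getmant` lane: keep the 52 fraction bits and
/// overwrite the exponent field so the value falls in the chosen interval.
fn getmant_lane_model(x: f64, interv: Interval, sc: SignMode) -> f64 {
    let bits = x.to_bits();
    let fraction = bits & ((1u64 << 52) - 1);
    let exponent: u64 = match interv {
        Interval::OneToTwo => 1023,  // 2^0 * 1.f
        Interval::HalfToOne => 1022, // 2^-1 * 1.f
    };
    let sign = match sc {
        SignMode::Src => bits & (1u64 << 63),
        SignMode::Zero => 0,
    };
    f64::from_bits(sign | (exponent << 52) | fraction)
}
```

Since 12.0 is 1.5 * 2^3, the model returns 1.5 for the [1, 2) interval and 0.75 for the [0.5, 1) interval.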
- _mm512_
getmant_ round_ pd Experimental avx512f
- Normalize the mantissas of packed double-precision (64-bit) floating-point elements in a, and store the results in dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc, which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
getmant_ round_ ps Experimental avx512f
- Normalize the mantissas of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc, which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
i32gather_ epi32 Experimental avx512f
- Gather 32-bit integers from memory using 32-bit indices. 32-bit elements are loaded from addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8.
- _mm512_
i32gather_ epi64 Experimental avx512f
- Gather 64-bit integers from memory using 32-bit indices. 64-bit elements are loaded from addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8.
- _mm512_
i32gather_ pd Experimental avx512f
- Gather double-precision (64-bit) floating-point elements from memory using 32-bit indices. 64-bit elements are loaded from addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8.
- _mm512_
i32gather_ ps Experimental avx512f
- Gather single-precision (32-bit) floating-point elements from memory using 32-bit indices. 32-bit elements are loaded from addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8.
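The addressing rule shared by the i32gather entries (byte address = base_addr + index * scale) can be sketched as a scalar loop over a byte slice. This is an illustrative model with a hypothetical name, assuming non-negative indices and little-endian memory as on x86; it performs bounds-checked slice reads rather than raw pointer loads.

```rust
/// Scalar model of a 16-lane 32-bit gather with 32-bit indices.
/// Each lane reads 4 bytes at byte offset `vindex[lane] * scale` from `base`.
/// Negative indices are valid for the instruction but not modeled here.
fn i32gather_epi32_model(vindex: &[i32; 16], base: &[u8], scale: usize) -> [i32; 16] {
    assert!(matches!(scale, 1 | 2 | 4 | 8), "scale should be 1, 2, 4 or 8");
    let mut dst = [0i32; 16];
    for lane in 0..16 {
        assert!(vindex[lane] >= 0, "negative indices are not modeled");
        let off = vindex[lane] as usize * scale;
        let bytes = [base[off], base[off + 1], base[off + 2], base[off + 3]];
        dst[lane] = i32::from_le_bytes(bytes);
    }
    dst
}
```

With scale = 4 the indices behave like ordinary array subscripts into a table of 32-bit values; smaller scales let the indices address unaligned or overlapping elements.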
- _mm512_
i32logather_ epi64 Experimental avx512f
- Loads 8 64-bit integer elements from memory starting at location base_addr at packed 32-bit integer indices stored in the lower half of vindex scaled by scale and stores them in dst.
- _mm512_
i32logather_ pd Experimental avx512f
- Loads 8 double-precision (64-bit) floating-point elements from memory starting at location base_addr at packed 32-bit integer indices stored in the lower half of vindex scaled by scale and stores them in dst.
- _mm512_
i32loscatter_ epi64 Experimental avx512f
- Stores 8 64-bit integer elements from a to memory starting at location base_addr at packed 32-bit integer indices stored in the lower half of vindex scaled by scale.
- _mm512_
i32loscatter_ pd Experimental avx512f
- Stores 8 double-precision (64-bit) floating-point elements from a to memory starting at location base_addr at packed 32-bit integer indices stored in the lower half of vindex scaled by scale.
- _mm512_
i32scatter_ epi32 Experimental avx512f
- Scatter 32-bit integers from a into memory using 32-bit indices. 32-bit elements are stored at addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). scale should be 1, 2, 4 or 8.
- _mm512_
i32scatter_ epi64 Experimental avx512f
- Scatter 64-bit integers from a into memory using 32-bit indices. 64-bit elements are stored at addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). scale should be 1, 2, 4 or 8.
- _mm512_
i32scatter_ pd Experimental avx512f
- Scatter double-precision (64-bit) floating-point elements from a into memory using 32-bit indices. 64-bit elements are stored at addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). scale should be 1, 2, 4 or 8.
- _mm512_
i32scatter_ ps Experimental avx512f
- Scatter single-precision (32-bit) floating-point elements from a into memory using 32-bit indices. 32-bit elements are stored at addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). scale should be 1, 2, 4 or 8.
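The i32scatter entries mirror the gathers: each lane of a is written to base_addr + index * scale. A scalar sketch, again with a hypothetical name and assuming non-negative indices and little-endian memory, might look like this.

```rust
/// Scalar model of a 16-lane 32-bit scatter with 32-bit indices.
/// Lane `i` of `a` is written as 4 bytes at byte offset `vindex[i] * scale`.
/// In this model lanes are written in ascending order, so when two indices
/// collide the higher lane's value is the one left in memory.
fn i32scatter_epi32_model(base: &mut [u8], vindex: &[i32; 16], a: &[i32; 16], scale: usize) {
    assert!(matches!(scale, 1 | 2 | 4 | 8), "scale should be 1, 2, 4 or 8");
    for lane in 0..16 {
        assert!(vindex[lane] >= 0, "negative indices are not modeled");
        let off = vindex[lane] as usize * scale;
        base[off..off + 4].copy_from_slice(&a[lane].to_le_bytes());
    }
}
```

Scattering with a reversed index vector, for instance, stores the lanes of a in reverse order in memory.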
- _mm512_
i64gather_ epi32 Experimental avx512f
- Gather 32-bit integers from memory using 64-bit indices. 32-bit elements are loaded from addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8.
- _mm512_
i64gather_ epi64 Experimental avx512f
- Gather 64-bit integers from memory using 64-bit indices. 64-bit elements are loaded from addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8.
- _mm512_
i64gather_ pd Experimental avx512f
- Gather double-precision (64-bit) floating-point elements from memory using 64-bit indices. 64-bit elements are loaded from addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8.
- _mm512_
i64gather_ ps Experimental avx512f
- Gather single-precision (32-bit) floating-point elements from memory using 64-bit indices. 32-bit elements are loaded from addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8.
- _mm512_
i64scatter_ epi32 Experimental avx512f
- Scatter 32-bit integers from a into memory using 64-bit indices. 32-bit elements are stored at addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). scale should be 1, 2, 4 or 8.
- _mm512_
i64scatter_ epi64 Experimental avx512f
- Scatter 64-bit integers from a into memory using 64-bit indices. 64-bit elements are stored at addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). scale should be 1, 2, 4 or 8.
- _mm512_
i64scatter_ pd Experimental avx512f
- Scatter double-precision (64-bit) floating-point elements from a into memory using 64-bit indices. 64-bit elements are stored at addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). scale should be 1, 2, 4 or 8.
- _mm512_
i64scatter_ ps Experimental avx512f
- Scatter single-precision (32-bit) floating-point elements from a into memory using 64-bit indices. 32-bit elements are stored at addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). scale should be 1, 2, 4 or 8.
- _mm512_
insertf32x4 Experimental avx512f
- Copy a to dst, then insert 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from b into dst at the location specified by imm8.
- _mm512_
insertf64x4 Experimental avx512f
- Copy a to dst, then insert 256 bits (composed of 4 packed double-precision (64-bit) floating-point elements) from b into dst at the location specified by imm8.
- _mm512_
inserti32x4 Experimental avx512f
- Copy a to dst, then insert 128 bits (composed of 4 packed 32-bit integers) from b into dst at the location specified by imm8.
- _mm512_
inserti64x4 Experimental avx512f
- Copy a to dst, then insert 256 bits (composed of 4 packed 64-bit integers) from b into dst at the location specified by imm8.
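The "copy a, then overwrite one block" behaviour of the insert entries can be sketched on plain arrays. The function below is a hypothetical scalar model of the 128-bit variant; it assumes that only the low two bits of imm8 select the destination block, which is how I read Intel's pseudocode for the four possible positions.

```rust
/// Scalar model of `insertf32x4`: copy all 16 floats of `a`, then replace
/// the 4-float block selected by the low two bits of `imm8` with `b`.
fn insertf32x4_model(a: &[f32; 16], b: &[f32; 4], imm8: usize) -> [f32; 16] {
    let block = imm8 & 0b11; // assumption: four 128-bit positions, 0..=3
    let mut dst = *a;
    dst[block * 4..block * 4 + 4].copy_from_slice(b);
    dst
}
```

With imm8 = 2, elements 8 through 11 come from b and every other element comes from a.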
- _mm512_
int2mask Experimental avx512f
- Converts integer mask into bitmask, storing the result in dst.
- _mm512_
kand Experimental avx512f
- Compute the bitwise AND of 16-bit masks a and b, and store the result in k.
- _mm512_
kandn Experimental avx512f
- Compute the bitwise NOT of 16-bit masks a and then AND with b, and store the result in k.
- _mm512_
kmov Experimental avx512f
- Copy 16-bit mask a to k.
- _mm512_
knot Experimental avx512f
- Compute the bitwise NOT of 16-bit mask a, and store the result in k.
- _mm512_
kor Experimental avx512f
- Compute the bitwise OR of 16-bit masks a and b, and store the result in k.
- _mm512_
kortestc Experimental avx512f
- Performs bitwise OR between k1 and k2, storing the result in dst. CF flag is set if dst consists of all 1's.
- _mm512_
kortestz Experimental avx512f
- Performs bitwise OR between k1 and k2, storing the result in dst. ZF flag is set if dst is 0.
- _mm512_
kunpackb Experimental avx512f
- Unpack and interleave 8 bits from masks a and b, and store the 16-bit result in k.
- _mm512_
kxnor Experimental avx512f
- Compute the bitwise XNOR of 16-bit masks a and b, and store the result in k.
- _mm512_
kxor Experimental avx512f
- Compute the bitwise XOR of 16-bit masks a and b, and store the result in k.
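The 16-bit mask operations above reduce to ordinary integer bit manipulation, which the following sketch shows on plain `u16` values. The function names are hypothetical models rather than the intrinsics themselves, and the byte order used for the unpack (low byte from b, high byte from the low byte of a) is my reading of Intel's pseudocode.

```rust
/// Model of `kandn`: NOT of a, then AND with b.
fn kandn_model(a: u16, b: u16) -> u16 {
    !a & b
}

/// Model of `kxnor`: NOT of (a XOR b).
fn kxnor_model(a: u16, b: u16) -> u16 {
    !(a ^ b)
}

/// Model of `kunpackb`: low 8 bits of b become bits 0..8 of the result,
/// low 8 bits of a become bits 8..16.
fn kunpackb_model(a: u16, b: u16) -> u16 {
    ((a & 0x00ff) << 8) | (b & 0x00ff)
}

/// Model of the ZF outcome of `kortestz`: true when k1 | k2 is zero.
fn kortestz_model(k1: u16, k2: u16) -> bool {
    (k1 | k2) == 0
}

/// Model of the CF outcome of `kortestc`: true when k1 | k2 is all ones.
fn kortestc_model(k1: u16, k2: u16) -> bool {
    (k1 | k2) == 0xffff
}
```

For example, unpacking a = 0x12AB and b = 0x34CD yields 0xABCD, and two masks that together cover all 16 lanes make the carry-style test return true.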
- _mm512_
load_ epi32 Experimental avx512f
- Load 512-bits (composed of 16 packed 32-bit integers) from memory into dst. mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_
load_ epi64 Experimental avx512f
- Load 512-bits (composed of 8 packed 64-bit integers) from memory into dst. mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_
load_ pd Experimental avx512f
- Load 512-bits (composed of 8 packed double-precision (64-bit) floating-point elements) from memory into dst. mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_
load_ ps Experimental avx512f
- Load 512-bits (composed of 16 packed single-precision (32-bit) floating-point elements) from memory into dst. mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_
load_ si512 Experimental avx512f
- Load 512-bits of integer data from memory into dst. mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_
loadu_ epi32 Experimental avx512f
- Load 512-bits (composed of 16 packed 32-bit integers) from memory into dst. mem_addr does not need to be aligned on any particular boundary.
- _mm512_
loadu_ epi64 Experimental avx512f
- Load 512-bits (composed of 8 packed 64-bit integers) from memory into dst. mem_addr does not need to be aligned on any particular boundary.
- _mm512_
loadu_ pd Experimental avx512f
- Loads 512-bits (composed of 8 packed double-precision (64-bit) floating-point elements) from memory into result. mem_addr does not need to be aligned on any particular boundary.
- _mm512_
loadu_ ps Experimental avx512f
- Loads 512-bits (composed of 16 packed single-precision (32-bit) floating-point elements) from memory into result. mem_addr does not need to be aligned on any particular boundary.
- _mm512_
loadu_ si512 Experimental avx512f
- Load 512-bits of integer data from memory into dst. mem_addr does not need to be aligned on any particular boundary.
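The difference between the load_ and loadu_ families is the 64-byte alignment requirement on mem_addr. One way to obtain and check such alignment in safe Rust is sketched below; `Aligned64` and `is_aligned_64` are hypothetical helpers, not part of the intrinsics API.

```rust
/// A 64-byte-aligned buffer holding 16 floats (512 bits).
/// Hypothetical wrapper type for illustration.
#[repr(C, align(64))]
struct Aligned64([f32; 16]);

/// Returns true when `ptr` sits on a 64-byte boundary, which is what the
/// aligned `load_` entries require of `mem_addr`.
fn is_aligned_64<T>(ptr: *const T) -> bool {
    (ptr as usize) % 64 == 0
}
```

A pointer to the start of an `Aligned64` passes the check, while a pointer one element further in does not and would only be suitable for the loadu_ variants.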
- _mm512_
mask2_ permutex2var_ epi32 Experimental avx512f
- Shuffle 32-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set).
- _mm512_
mask2_ permutex2var_ epi64 Experimental avx512f
- Shuffle 64-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set).
- _mm512_
mask2_ permutex2var_ pd Experimental avx512f
- Shuffle double-precision (64-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set).
- _mm512_
mask2_ permutex2var_ ps Experimental avx512f
- Shuffle single-precision (32-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set).
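The "selector and index" wording of the mask2_permutex2var entries can be unpacked with a scalar sketch of the 32-bit integer form. It assumes, per my reading of Intel's pseudocode for 16 lanes, that bits 0..4 of each idx element pick the source lane and bit 4 picks between a and b. The function name is a hypothetical model.

```rust
/// Scalar model of `mask2_permutex2var_epi32`.
/// For each lane whose mask bit is set: the low 4 bits of `idx[i]` choose a
/// source lane, and bit 4 chooses the source vector (0 = a, 1 = b).
/// Lanes whose mask bit is clear keep the value of `idx[i]`.
fn mask2_permutex2var_epi32_model(
    a: &[i32; 16],
    idx: &[i32; 16],
    k: u16,
    b: &[i32; 16],
) -> [i32; 16] {
    let mut dst = *idx;
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            let lane = (idx[i] & 0xF) as usize;
            dst[i] = if idx[i] & 0x10 != 0 { b[lane] } else { a[lane] };
        }
    }
    dst
}
```

An index of 16 therefore selects lane 0 of b, and an index of 31 selects lane 15 of b.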
- _mm512_
mask2int Experimental avx512f
- Converts bit mask k1 into an integer value, storing the results in dst.
- _mm512_
mask3_ fmadd_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fmadd_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fmadd_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fmadd_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
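What distinguishes the mask3_ forms is that unselected lanes take their value from c rather than from a separate src operand. A scalar sketch of the 8-lane double-precision fmadd form, with a hypothetical function name, shows the pattern.

```rust
/// Scalar model of `mask3_fmadd_pd`: lanes whose bit in `k` is set receive
/// a*b + c (rounded once via `mul_add`); all other lanes are copied from c.
fn mask3_fmadd_pd_model(a: &[f64; 8], b: &[f64; 8], c: &[f64; 8], k: u8) -> [f64; 8] {
    let mut dst = *c;
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i].mul_add(b[i], c[i]);
        }
    }
    dst
}
```

With a = 2, b = 3, c = 1 and k = 0b0101, lanes 0 and 2 become 7 while the remaining lanes stay at 1.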
- _mm512_
mask3_ fmaddsub_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fmaddsub_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fmaddsub_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fmaddsub_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fmsub_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fmsub_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fmsub_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fmsub_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fmsubadd_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fmsubadd_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fmsubadd_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fmsubadd_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fnmadd_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fnmadd_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fnmadd_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fnmadd_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fnmsub_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fnmsub_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fnmsub_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask3_ fnmsub_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm512_
mask_ abs_ epi32 Experimental avx512f
- Compute the absolute value of packed 32-bit integers in a, and store the unsigned results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ abs_ epi64 Experimental avx512f
- Compute the absolute value of packed signed 64-bit integers in a, and store the unsigned results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ abs_ pd Experimental avx512f
- Finds the absolute value of each packed double-precision (64-bit) floating-point element in v2, storing the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ abs_ ps Experimental avx512f
- Finds the absolute value of each packed single-precision (32-bit) floating-point element in v2, storing the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ add_ epi32 Experimental avx512f
- Add packed 32-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ add_ epi64 Experimental avx512f
- Add packed 64-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ add_ pd Experimental avx512f
- Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ add_ ps Experimental avx512f
- Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ add_ round_ pd Experimental avx512f
- Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ add_ round_ ps Experimental avx512f
- Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
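The writemask rule that recurs throughout the mask_ entries (compute the lane when its bit in k is set, otherwise copy the lane from src) can be sketched once on the 32-bit integer add. The function name is a hypothetical model, and wrapping arithmetic is used because packed integer addition does not trap on overflow.

```rust
/// Scalar model of `mask_add_epi32`: lanes selected by `k` receive the
/// wrapping sum of `a` and `b`; unselected lanes are copied from `src`.
fn mask_add_epi32_model(src: &[i32; 16], k: u16, a: &[i32; 16], b: &[i32; 16]) -> [i32; 16] {
    let mut dst = *src;
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i].wrapping_add(b[i]);
        }
    }
    dst
}
```

The same skeleton applies to the other mask_ arithmetic entries by swapping the per-lane operation.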
- _mm512_
mask_ alignr_ epi32 Experimental avx512f
- Concatenate a and b into a 128-byte intermediate result, shift the result right by imm8 32-bit elements, and store the low 64 bytes (16 elements) in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ alignr_ epi64 Experimental avx512f
- Concatenate a and b into a 128-byte intermediate result, shift the result right by imm8 64-bit elements, and store the low 64 bytes (8 elements) in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
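The concatenate-and-shift behaviour of alignr is easier to see as array indexing. The sketch below models the 32-bit form with a hypothetical function name. It assumes that b occupies the low half of the concatenation and a the high half, and that only the low four bits of imm8 are used as the element shift; both are my reading of Intel's pseudocode rather than something stated in the entries above.

```rust
/// Scalar model of `mask_alignr_epi32`.
/// Conceptually builds the 32-element sequence `b ++ a`, drops the first
/// `imm8 & 0xF` elements, and keeps the next 16. Lanes whose mask bit is
/// clear are copied from `src`.
fn mask_alignr_epi32_model(
    src: &[i32; 16],
    k: u16,
    a: &[i32; 16],
    b: &[i32; 16],
    imm8: usize,
) -> [i32; 16] {
    let shift = imm8 & 0xF; // assumption: only the low 4 bits count
    let mut dst = *src;
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            let j = i + shift; // position inside b ++ a
            dst[i] = if j < 16 { b[j] } else { a[j - 16] };
        }
    }
    dst
}
```

With a shift of 3 and a full mask, dst starts at element 3 of b and ends with the first three elements of a.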
- _mm512_
mask_ and_ epi32 Experimental avx512f
- Performs element-by-element bitwise AND between packed 32-bit integer elements of a and b, storing the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ and_ epi64 Experimental avx512f
- Compute the bitwise AND of packed 64-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ andnot_ epi32 Experimental avx512f
- Compute the bitwise NOT of packed 32-bit integers in a and then AND with b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ andnot_ epi64 Experimental avx512f
- Compute the bitwise NOT of packed 64-bit integers in a and then AND with b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ blend_ epi32 Experimental avx512f
- Blend packed 32-bit integers from a and b using control mask k, and store the results in dst.
- _mm512_
mask_ blend_ epi64 Experimental avx512f
- Blend packed 64-bit integers from a and b using control mask k, and store the results in dst.
- _mm512_
mask_ blend_ pd Experimental avx512f
- Blend packed double-precision (64-bit) floating-point elements from a and b using control mask k, and store the results in dst.
- _mm512_
mask_ blend_ ps Experimental avx512f
- Blend packed single-precision (32-bit) floating-point elements from a and b using control mask k, and store the results in dst.
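The blend entries do not say which operand a set mask bit selects. The scalar sketch below uses the convention from Intel's pseudocode as I understand it, where a set bit takes the lane from b and a clear bit takes it from a. The function name is a hypothetical model.

```rust
/// Scalar model of `mask_blend_epi32`: lane i comes from `b` when bit i of
/// `k` is set, and from `a` otherwise.
fn mask_blend_epi32_model(k: u16, a: &[i32; 16], b: &[i32; 16]) -> [i32; 16] {
    let mut dst = *a;
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            dst[i] = b[i];
        }
    }
    dst
}
```

A mask of 0x8001 therefore takes only the first and last lanes from b.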
- _mm512_
mask_ broadcast_ f32x4 Experimental avx512f
- Broadcast the 4 packed single-precision (32-bit) floating-point elements from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ broadcast_ f64x4 Experimental avx512f
- Broadcast the 4 packed double-precision (64-bit) floating-point elements from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ broadcast_ i32x4 Experimental avx512f
- Broadcast the 4 packed 32-bit integers from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_broadcast_i64x4 Experimental avx512f
- Broadcast the 4 packed 64-bit integers from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_broadcastd_epi32 Experimental avx512f
- Broadcast the low packed 32-bit integer from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_broadcastq_epi64 Experimental avx512f
- Broadcast the low packed 64-bit integer from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_broadcastsd_pd Experimental avx512f
- Broadcast the low double-precision (64-bit) floating-point element from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_broadcastss_ps Experimental avx512f
- Broadcast the low single-precision (32-bit) floating-point element from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
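The two broadcast shapes listed here (a 4-element group repeated across the vector, versus the single low element repeated) can be contrasted with a scalar sketch. This is an illustrative model only: the function names are hypothetical, `[i32; 4]` stands in for the 128-bit source and `[i32; 16]` for the 512-bit destination.

```rust
/// Model of broadcasting a 4-element group: active lane `i` receives `a[i % 4]`.
fn mask_broadcast_i32x4_model(src: [i32; 16], k: u16, a: [i32; 4]) -> [i32; 16] {
    std::array::from_fn(|i| if (k >> i) & 1 == 1 { a[i % 4] } else { src[i] })
}

/// Model of broadcasting the low element: every active lane receives `a[0]`.
fn mask_broadcastd_epi32_model(src: [i32; 16], k: u16, a: [i32; 4]) -> [i32; 16] {
    std::array::from_fn(|i| if (k >> i) & 1 == 1 { a[0] } else { src[i] })
}
```

In both cases inactive lanes keep the value from src, which is the writemask rule stated in each description.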
- _mm512_mask_cmp_epi32_mask Experimental avx512f
- Compare packed signed 32-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmp_epi64_mask Experimental avx512f
- Compare packed signed 64-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmp_epu32_mask Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmp_epu64_mask Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
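The integer cmp entries combine an imm8 predicate with a zeromask. The scalar sketch below shows how the two interact, and why the signed and unsigned variants can disagree on the same bits. It assumes the usual 3-bit encodings of the _MM_CMPINT_* constants (EQ = 0 through TRUE = 7); the constant and function names in the code are local, hypothetical stand-ins rather than items from core::arch.

```rust
// Local copies of the 3-bit predicate encodings (assumed values 0..=7).
const CMPINT_EQ: u8 = 0;
const CMPINT_LT: u8 = 1;
const CMPINT_LE: u8 = 2;
const CMPINT_FALSE: u8 = 3;
const CMPINT_NE: u8 = 4;
const CMPINT_NLT: u8 = 5;
const CMPINT_NLE: u8 = 6;
const CMPINT_TRUE: u8 = 7;

/// Evaluate the predicate per lane, then zero every bit whose `k1` bit is clear.
fn cmp_mask_model<T: PartialOrd + Copy>(k1: u16, a: [T; 16], b: [T; 16], imm8: u8) -> u16 {
    let mut k = 0u16;
    for i in 0..16 {
        let hit = match imm8 & 0x7 {
            CMPINT_EQ => a[i] == b[i],
            CMPINT_LT => a[i] < b[i],
            CMPINT_LE => a[i] <= b[i],
            CMPINT_FALSE => false,
            CMPINT_NE => a[i] != b[i],
            CMPINT_NLT => a[i] >= b[i],
            CMPINT_NLE => a[i] > b[i],
            CMPINT_TRUE => true,
            _ => unreachable!(), // imm8 & 0x7 is always in 0..=7
        };
        if hit && (k1 >> i) & 1 == 1 {
            k |= 1 << i;
        }
    }
    k
}

/// Signed model: lanes are compared as `i32`.
fn mask_cmp_epi32_mask_model(k1: u16, a: [i32; 16], b: [i32; 16], imm8: u8) -> u16 {
    cmp_mask_model(k1, a, b, imm8)
}

/// Unsigned model: the same bits are reinterpreted as `u32` before comparing.
fn mask_cmp_epu32_mask_model(k1: u16, a: [i32; 16], b: [i32; 16], imm8: u8) -> u16 {
    cmp_mask_model(k1, a.map(|x| x as u32), b.map(|x| x as u32), imm8)
}
```

With a lane holding -1 and the other operand 0, the signed model reports less-than while the unsigned model does not, because the same bit pattern reads as the largest u32.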
- _mm512_mask_cmp_pd_mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmp_ps_mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmp_round_pd_mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_mask_cmp_round_ps_mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_mask_cmpeq_epi32_mask Experimental avx512f
- Compare packed 32-bit integers in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpeq_epi64_mask Experimental avx512f
- Compare packed 64-bit integers in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpeq_epu32_mask Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpeq_epu64_mask Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpeq_pd_mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpeq_ps_mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpge_epi32_mask Experimental avx512f
- Compare packed signed 32-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpge_epi64_mask Experimental avx512f
- Compare packed signed 64-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpge_epu32_mask Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpge_epu64_mask Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpgt_epi32_mask Experimental avx512f
- Compare packed signed 32-bit integers in a and b for greater-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpgt_epi64_mask Experimental avx512f
- Compare packed signed 64-bit integers in a and b for greater-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpgt_epu32_mask Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b for greater-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpgt_epu64_mask Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b for greater-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmple_epi32_mask Experimental avx512f
- Compare packed signed 32-bit integers in a and b for less-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmple_epi64_mask Experimental avx512f
- Compare packed signed 64-bit integers in a and b for less-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmple_epu32_mask Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b for less-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmple_epu64_mask Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b for less-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmple_pd_mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b for less-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmple_ps_mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b for less-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmplt_epi32_mask Experimental avx512f
- Compare packed signed 32-bit integers in a and b for less-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmplt_epi64_mask Experimental avx512f
- Compare packed signed 64-bit integers in a and b for less-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmplt_epu32_mask Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b for less-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmplt_epu64_mask Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b for less-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmplt_pd_mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b for less-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmplt_ps_mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b for less-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpneq_epi32_mask Experimental avx512f
- Compare packed 32-bit integers in a and b for not-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpneq_epi64_mask Experimental avx512f
- Compare packed signed 64-bit integers in a and b for not-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpneq_epu32_mask Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b for not-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpneq_epu64_mask Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b for not-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpneq_pd_mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b for not-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpneq_ps_mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b for not-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpnle_pd_mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b for not-less-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpnle_ps_mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b for not-less-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpnlt_pd_mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b for not-less-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpnlt_ps_mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b for not-less-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpord_pd_mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b to see if neither is NaN, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpord_ps_mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b to see if neither is NaN, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpunord_pd_mask Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b to see if either is NaN, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_mask_cmpunord_ps_mask Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b to see if either is NaN, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
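The ordered, unordered and negated floating-point comparisons differ only in how they treat NaN. A scalar sketch over eight f64 lanes follows. The function names are hypothetical, and the not-less-than model assumes the negated predicate reports true when either operand is NaN (written here as `!(a < b)`), which is an assumption about the underlying predicate rather than something the descriptions state.

```rust
/// Shared driver: evaluate `pred` per lane and apply the zeromask `k1`.
fn float_mask_model(k1: u8, a: [f64; 8], b: [f64; 8], pred: fn(f64, f64) -> bool) -> u8 {
    let mut k = 0u8;
    for i in 0..8 {
        if (k1 >> i) & 1 == 1 && pred(a[i], b[i]) {
            k |= 1 << i;
        }
    }
    k
}

/// Ordered: neither operand is NaN.
fn mask_cmpord_pd_mask_model(k1: u8, a: [f64; 8], b: [f64; 8]) -> u8 {
    float_mask_model(k1, a, b, |x, y| !x.is_nan() && !y.is_nan())
}

/// Unordered: at least one operand is NaN.
fn mask_cmpunord_pd_mask_model(k1: u8, a: [f64; 8], b: [f64; 8]) -> u8 {
    float_mask_model(k1, a, b, |x, y| x.is_nan() || y.is_nan())
}

/// Not-less-than, modelled as the negation of `<` (assumed true on NaN).
fn mask_cmpnlt_pd_mask_model(k1: u8, a: [f64; 8], b: [f64; 8]) -> u8 {
    float_mask_model(k1, a, b, |x, y| !(x < y))
}
```

For any pair of inputs the ordered and unordered results are complements of each other within the lanes enabled by k1.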
- _mm512_mask_compress_epi32 Experimental avx512f
- Contiguously store the active 32-bit integers in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src.
- _mm512_mask_compress_epi64 Experimental avx512f
- Contiguously store the active 64-bit integers in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src.
- _mm512_mask_compress_pd Experimental avx512f
- Contiguously store the active double-precision (64-bit) floating-point elements in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src.
- _mm512_mask_compress_ps Experimental avx512f
- Contiguously store the active single-precision (32-bit) floating-point elements in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src.
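"Contiguously store the active elements" means the selected lanes are packed toward lane 0, and the lanes above the packed run keep their values from src. A scalar sketch of the 32-bit integer case follows; `mask_compress_epi32_model` is a hypothetical name and the arrays stand in for `__m512i`.

```rust
/// Scalar model of compress: pack the lanes of `a` selected by `k` into the low
/// lanes of the result, leaving the remaining upper lanes equal to `src`.
fn mask_compress_epi32_model(src: [i32; 16], k: u16, a: [i32; 16]) -> [i32; 16] {
    let mut dst = src;
    let mut next = 0; // index of the next free low lane
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            dst[next] = a[i];
            next += 1;
        }
    }
    dst
}
```

Note that the pass-through lanes are taken from src at their own positions, not from the inactive lanes of a.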
- _mm512_mask_compressstoreu_epi32 Experimental avx512f
- Contiguously store the active 32-bit integers in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm512_mask_compressstoreu_epi64 Experimental avx512f
- Contiguously store the active 64-bit integers in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm512_mask_compressstoreu_pd Experimental avx512f
- Contiguously store the active double-precision (64-bit) floating-point elements in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm512_mask_compressstoreu_ps Experimental avx512f
- Contiguously store the active single-precision (32-bit) floating-point elements in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
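The compressstoreu entries write only the active elements to memory, so the number of elements written equals the number of set mask bits. The sketch below models that with a slice in place of base_addr. The function name and the choice to return the element count are assumptions made for illustration; the caller is assumed to supply a buffer with room for every active lane.

```rust
/// Scalar model of a compressing store: write the lanes of `a` selected by `k`
/// to the front of `out`, and return how many elements were written.
/// Panics if `out` is shorter than the number of set bits in `k`.
fn mask_compressstoreu_epi32_model(out: &mut [i32], k: u16, a: [i32; 16]) -> usize {
    let mut next = 0;
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            out[next] = a[i];
            next += 1;
        }
    }
    next
}
```

Memory past the written run is left untouched, which is what makes this shape useful for filtering a stream into a contiguous output buffer.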
- _mm512_mask_cvt_roundepi32_ps Experimental avx512f
- Convert packed signed 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvt_roundepu32_ps Experimental avx512f
- Convert packed unsigned 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvt_roundpd_epi32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvt_roundpd_epu32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvt_roundpd_ps Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvt_roundph_ps Experimental avx512f
- Convert packed half-precision (16-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_mask_cvt_roundps_epi32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvt_roundps_epu32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvt_roundps_pd Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_mask_cvt_roundps_ph Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed half-precision (16-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_mask_cvtepi8_epi32 Experimental avx512f
- Sign extend packed 8-bit integers in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtepi8_epi64 Experimental avx512f
- Sign extend packed 8-bit integers in the low 8 bytes of a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtepi16_epi32 Experimental avx512f
- Sign extend packed 16-bit integers in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtepi16_epi64 Experimental avx512f
- Sign extend packed 16-bit integers in a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtepi32_epi8 Experimental avx512f
- Convert packed 32-bit integers in a to packed 8-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtepi32_epi16 Experimental avx512f
- Convert packed 32-bit integers in a to packed 16-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtepi32_epi64 Experimental avx512f
- Sign extend packed 32-bit integers in a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtepi32_pd Experimental avx512f
- Convert packed signed 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtepi32_ps Experimental avx512f
- Convert packed signed 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtepi32_storeu_epi8 Experimental avx512f
- Convert packed 32-bit integers in a to packed 8-bit integers with truncation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm512_mask_cvtepi32_storeu_epi16 Experimental avx512f
- Convert packed 32-bit integers in a to packed 16-bit integers with truncation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm512_mask_cvtepi32lo_pd Experimental avx512f
- Performs element-by-element conversion of the lower half of packed 32-bit integer elements in v2 to packed double-precision (64-bit) floating-point elements, storing the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtepi64_epi8 Experimental avx512f
- Convert packed 64-bit integers in a to packed 8-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtepi64_epi16 Experimental avx512f
- Convert packed 64-bit integers in a to packed 16-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtepi64_epi32 Experimental avx512f
- Convert packed 64-bit integers in a to packed 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtepi64_storeu_epi8 Experimental avx512f
- Convert packed 64-bit integers in a to packed 8-bit integers with truncation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm512_mask_cvtepi64_storeu_epi16 Experimental avx512f
- Convert packed 64-bit integers in a to packed 16-bit integers with truncation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm512_mask_cvtepi64_storeu_epi32 Experimental avx512f
- Convert packed 64-bit integers in a to packed 32-bit integers with truncation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm512_mask_cvtepu8_epi32 Experimental avx512f
- Zero extend packed unsigned 8-bit integers in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtepu8_epi64 Experimental avx512f
- Zero extend packed unsigned 8-bit integers in the low 8 bytes of a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtepu16_epi32 Experimental avx512f
- Zero extend packed unsigned 16-bit integers in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtepu16_epi64 Experimental avx512f
- Zero extend packed unsigned 16-bit integers in a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtepu32_epi64 Experimental avx512f
- Zero extend packed unsigned 32-bit integers in a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
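The cvtepi and cvtepu widening entries differ only in how the upper bits of each widened lane are filled. A scalar sketch of the 8-bit to 32-bit pair follows. The function names are hypothetical, and `[i8; 16]` stands in for the 128-bit source whose bytes the unsigned variant reinterprets as `u8`.

```rust
/// Sign extension model: the upper bits replicate the sign bit of each byte.
fn mask_cvtepi8_epi32_model(src: [i32; 16], k: u16, a: [i8; 16]) -> [i32; 16] {
    std::array::from_fn(|i| if (k >> i) & 1 == 1 { a[i] as i32 } else { src[i] })
}

/// Zero extension model: the same byte is read as unsigned, so upper bits are 0.
fn mask_cvtepu8_epi32_model(src: [i32; 16], k: u16, a: [i8; 16]) -> [i32; 16] {
    std::array::from_fn(|i| if (k >> i) & 1 == 1 { a[i] as u8 as i32 } else { src[i] })
}
```

A byte with all bits set widens to -1 under sign extension and to 255 under zero extension, which is the case to check when choosing between the two families.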
- _mm512_mask_cvtepu32_pd Experimental avx512f
- Convert packed unsigned 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtepu32_ps Experimental avx512f
- Convert packed unsigned 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtepu32lo_pd Experimental avx512f
- Performs element-by-element conversion of the lower half of 32-bit unsigned integer elements in v2 to packed double-precision (64-bit) floating-point elements, storing the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtpd_epi32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtpd_epu32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtpd_ps Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtpd_pslo Experimental avx512f
- Performs an element-by-element conversion of packed double-precision (64-bit) floating-point elements in v2 to single-precision (32-bit) floating-point elements and stores them in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The elements are stored in the lower half of the results vector, while the remaining upper half locations are set to 0.
- _mm512_mask_cvtph_ps Experimental avx512f
- Convert packed half-precision (16-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtps_epi32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtps_epu32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtps_pd Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtps_ph Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed half-precision (16-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_mask_cvtpslo_pd Experimental avx512f
- Performs element-by-element conversion of the lower half of packed single-precision (32-bit) floating-point elements in v2 to packed double-precision (64-bit) floating-point elements, storing the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtsepi32_epi8 Experimental avx512f
- Convert packed signed 32-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtsepi32_epi16 Experimental avx512f
- Convert packed signed 32-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtsepi32_storeu_epi8 Experimental avx512f
- Convert packed signed 32-bit integers in a to packed 8-bit integers with signed saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm512_mask_cvtsepi32_storeu_epi16 Experimental avx512f
- Convert packed signed 32-bit integers in a to packed 16-bit integers with signed saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm512_mask_cvtsepi64_epi8 Experimental avx512f
- Convert packed signed 64-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtsepi64_epi16 Experimental avx512f
- Convert packed signed 64-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtsepi64_epi32 Experimental avx512f
- Convert packed signed 64-bit integers in a to packed 32-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtsepi64_storeu_epi8 Experimental avx512f
- Convert packed signed 64-bit integers in a to packed 8-bit integers with signed saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm512_mask_cvtsepi64_storeu_epi16 Experimental avx512f
- Convert packed signed 64-bit integers in a to packed 16-bit integers with signed saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm512_mask_cvtsepi64_storeu_epi32 Experimental avx512f
- Convert packed signed 64-bit integers in a to packed 32-bit integers with signed saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm512_mask_cvtt_roundpd_epi32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_mask_cvtt_roundpd_epu32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_mask_cvtt_roundps_epi32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_mask_cvtt_roundps_epu32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_mask_cvttpd_epi32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvttpd_epu32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvttps_epi32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
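Truncating conversion under a writemask combines two ideas: rounding toward zero, and merging with src where the mask bit is clear. A minimal scalar sketch follows, assuming in-range finite inputs only; NaN and out-of-range inputs are not modelled, because Rust's `as` cast saturates for those and the hardware result may differ. The helper name is hypothetical.

```rust
/// Scalar model (hypothetical helper, not an intrinsic) of a truncating
/// f32 -> i32 conversion merged into `src` under a 16-bit writemask.
fn mask_cvttps_epi32_model(src: [i32; 16], k: u16, a: [f32; 16]) -> [i32; 16] {
    let mut dst = src; // lanes with a clear mask bit keep the value from `src`
    for lane in 0..16 {
        if (k >> lane) & 1 == 1 {
            // For in-range finite values, `as` rounds toward zero (truncation).
            dst[lane] = a[lane] as i32;
        }
    }
    dst
}
```

For example, `2.9` converts to `2` and `-2.9` to `-2`, while unselected lanes return whatever src held.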
- _mm512_mask_cvttps_epu32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtusepi32_epi8 Experimental avx512f
- Convert packed unsigned 32-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtusepi32_epi16 Experimental avx512f
- Convert packed unsigned 32-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtusepi32_storeu_epi8 Experimental avx512f
- Convert packed unsigned 32-bit integers in a to packed 8-bit integers with unsigned saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm512_mask_cvtusepi32_storeu_epi16 Experimental avx512f
- Convert packed unsigned 32-bit integers in a to packed 16-bit integers with unsigned saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm512_mask_cvtusepi64_epi8 Experimental avx512f
- Convert packed unsigned 64-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtusepi64_epi16 Experimental avx512f
- Convert packed unsigned 64-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtusepi64_epi32 Experimental avx512f
- Convert packed unsigned 64-bit integers in a to packed unsigned 32-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_cvtusepi64_storeu_epi8 Experimental avx512f
- Convert packed unsigned 64-bit integers in a to packed 8-bit integers with unsigned saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm512_mask_cvtusepi64_storeu_epi16 Experimental avx512f
- Convert packed unsigned 64-bit integers in a to packed 16-bit integers with unsigned saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm512_mask_cvtusepi64_storeu_epi32 Experimental avx512f
- Convert packed unsigned 64-bit integers in a to packed 32-bit integers with unsigned saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm512_mask_div_pd Experimental avx512f
- Divide packed double-precision (64-bit) floating-point elements in a by packed elements in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_div_ps Experimental avx512f
- Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_div_round_pd Experimental avx512f
- Divide packed double-precision (64-bit) floating-point elements in a by packed elements in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_div_round_ps Experimental avx512f
- Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_expand_epi32 Experimental avx512f
- Load contiguous active 32-bit integers from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_expand_epi64 Experimental avx512f
- Load contiguous active 64-bit integers from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_expand_pd Experimental avx512f
- Load contiguous active double-precision (64-bit) floating-point elements from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_expand_ps Experimental avx512f
- Load contiguous active single-precision (32-bit) floating-point elements from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
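The expand operations read elements contiguously from the low end of a and place them into the lanes selected by k. A minimal scalar sketch of that data movement for 32-bit integers, with a hypothetical helper name, might look like this:

```rust
/// Scalar model (hypothetical helper, not an intrinsic) of a masked expand:
/// consecutive low elements of `a` are placed into the lanes selected by `k`.
fn mask_expand_epi32_model(src: [i32; 16], k: u16, a: [i32; 16]) -> [i32; 16] {
    let mut dst = src; // lanes with a clear mask bit keep the value from `src`
    let mut next = 0; // index of the next unread element of `a`
    for lane in 0..16 {
        if (k >> lane) & 1 == 1 {
            dst[lane] = a[next];
            next += 1;
        }
    }
    dst
}
```

With `k = 0b1010`, lane 1 receives `a[0]` and lane 3 receives `a[1]`; the source index advances only when a lane is selected.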
- _mm512_mask_expandloadu_epi32 Experimental avx512f
- Load contiguous active 32-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_expandloadu_epi64 Experimental avx512f
- Load contiguous active 64-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_expandloadu_pd Experimental avx512f
- Load contiguous active double-precision (64-bit) floating-point elements from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_expandloadu_ps Experimental avx512f
- Load contiguous active single-precision (32-bit) floating-point elements from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_extractf32x4_ps Experimental avx512f
- Extract 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a, selected with imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_extractf64x4_pd Experimental avx512f
- Extract 256 bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a, selected with imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_extracti32x4_epi32 Experimental avx512f
- Extract 128 bits (composed of 4 packed 32-bit integers) from a, selected with IMM2, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_extracti64x4_epi64 Experimental avx512f
- Extract 256 bits (composed of 4 packed 64-bit integers) from a, selected with IMM1, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
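For the masked extract operations, the immediate picks one fixed-width chunk of a, and the writemask then applies to the lanes of the narrower result. A minimal scalar sketch for the 128-bit single-precision case follows. It assumes only the low two bits of the immediate are used to choose among four chunks, and the helper name is hypothetical.

```rust
/// Scalar model (hypothetical helper, not an intrinsic) of extracting one
/// 4-element chunk from a 16-element vector under a 4-bit writemask.
fn mask_extractf32x4_ps_model(src: [f32; 4], k: u8, a: [f32; 16], imm8: usize) -> [f32; 4] {
    let chunk = imm8 & 0b11; // assumption: the low two bits select one of four chunks
    let mut dst = src; // lanes with a clear mask bit keep the value from `src`
    for lane in 0..4 {
        if (k >> lane) & 1 == 1 {
            dst[lane] = a[chunk * 4 + lane];
        }
    }
    dst
}
```

Selecting chunk 2 with mask `0b0101` yields elements 8 and 10 of a in lanes 0 and 2, with src filling lanes 1 and 3.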
- _mm512_mask_fixupimm_pd Experimental avx512f
- Fix up packed double-precision (64-bit) floating-point elements in a and b using packed 64-bit integers in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
- _mm512_mask_fixupimm_ps Experimental avx512f
- Fix up packed single-precision (32-bit) floating-point elements in a and b using packed 32-bit integers in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
- _mm512_mask_fixupimm_round_pd Experimental avx512f
- Fix up packed double-precision (64-bit) floating-point elements in a and b using packed 64-bit integers in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
- _mm512_mask_fixupimm_round_ps Experimental avx512f
- Fix up packed single-precision (32-bit) floating-point elements in a and b using packed 32-bit integers in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
- _mm512_mask_fmadd_pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fmadd_ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
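Unlike the conversions earlier in this list, the masked fused multiply-add variants fall back to a rather than to a separate src operand. A minimal scalar sketch, using `f32::mul_add` to stand in for the fused operation and a hypothetical helper name:

```rust
/// Scalar model (hypothetical helper, not an intrinsic) of a masked fused
/// multiply-add: selected lanes get a*b + c, the rest keep `a`.
fn mask_fmadd_ps_model(a: [f32; 16], k: u16, b: [f32; 16], c: [f32; 16]) -> [f32; 16] {
    let mut dst = a; // unselected lanes are copied from `a`, not from a `src` argument
    for lane in 0..16 {
        if (k >> lane) & 1 == 1 {
            // `mul_add` computes a*b + c with a single rounding step.
            dst[lane] = a[lane].mul_add(b[lane], c[lane]);
        }
    }
    dst
}
```

With a = 2, b = 3, c = 1 in every lane and only bit 0 set, lane 0 becomes 7 and all other lanes remain 2.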
- _mm512_mask_fmadd_round_pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fmadd_round_ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fmaddsub_pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fmaddsub_ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
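The fmaddsub family switches between adding and subtracting c from one lane to the next. The sketch below assumes the usual convention that even-indexed lanes subtract c and odd-indexed lanes add it; the helper name is hypothetical and the model is scalar, not the intrinsic itself.

```rust
/// Scalar model (hypothetical helper, not an intrinsic) of a masked
/// fmaddsub over 8 double-precision lanes.
fn mask_fmaddsub_pd_model(a: [f64; 8], k: u8, b: [f64; 8], c: [f64; 8]) -> [f64; 8] {
    let mut dst = a; // unselected lanes are copied from `a`
    for lane in 0..8 {
        if (k >> lane) & 1 == 1 {
            // Assumption: even lanes compute a*b - c, odd lanes compute a*b + c.
            let signed_c = if lane % 2 == 0 { -c[lane] } else { c[lane] };
            dst[lane] = a[lane].mul_add(b[lane], signed_c);
        }
    }
    dst
}
```

With a = 2, b = 3, c = 1 and mask `0b11`, lane 0 yields 5, lane 1 yields 7, and lane 2 stays at 2.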
- _mm512_mask_fmaddsub_round_pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fmaddsub_round_ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fmsub_pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fmsub_ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fmsub_round_pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fmsub_round_ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fmsubadd_pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fmsubadd_ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fmsubadd_round_pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fmsubadd_round_ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fnmadd_pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fnmadd_ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fnmadd_round_pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fnmadd_round_ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fnmsub_pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fnmsub_ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
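The four fused families (fmadd, fmsub, fnmadd, fnmsub) differ only in the signs applied to the product and to c. A per-lane sketch, with hypothetical function names and `f64::mul_add` standing in for the fused operation, makes the sign pattern explicit:

```rust
// Per-lane scalar models; the names are hypothetical and not part of `core::arch`.

/// (a * b) + c
fn fmadd_lane(a: f64, b: f64, c: f64) -> f64 {
    a.mul_add(b, c)
}

/// (a * b) - c
fn fmsub_lane(a: f64, b: f64, c: f64) -> f64 {
    a.mul_add(b, -c)
}

/// -(a * b) + c
fn fnmadd_lane(a: f64, b: f64, c: f64) -> f64 {
    (-a).mul_add(b, c)
}

/// -(a * b) - c
fn fnmsub_lane(a: f64, b: f64, c: f64) -> f64 {
    (-a).mul_add(b, -c)
}
```

For a = 2, b = 3, c = 1 these evaluate to 7, 5, -5 and -7 respectively.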
- _mm512_mask_fnmsub_round_pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_fnmsub_round_ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_getexp_pd Experimental avx512f
- Convert the exponent of each packed double-precision (64-bit) floating-point element in a to a double-precision (64-bit) floating-point number representing the integer exponent, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
- _mm512_mask_getexp_ps Experimental avx512f
- Convert the exponent of each packed single-precision (32-bit) floating-point element in a to a single-precision (32-bit) floating-point number representing the integer exponent, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
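The floor(log2(x)) description can be checked against the IEEE 754 bit layout. The sketch below reads the biased exponent field directly; it assumes normal, finite, non-zero inputs only (zeros, denormals, infinities and NaN are not modelled), and both helper names are hypothetical.

```rust
/// Unbiased exponent of a normal, finite, non-zero f32, returned as f32.
/// Hypothetical helper; special values are deliberately not handled.
fn getexp_lane(x: f32) -> f32 {
    // The biased exponent occupies bits 23..=30 of a binary32 value.
    let biased = ((x.to_bits() >> 23) & 0xff) as i32;
    (biased - 127) as f32
}

/// Scalar model (not an intrinsic) of the masked form over 16 lanes.
fn mask_getexp_ps_model(src: [f32; 16], k: u16, a: [f32; 16]) -> [f32; 16] {
    let mut dst = src; // lanes with a clear mask bit keep the value from `src`
    for lane in 0..16 {
        if (k >> lane) & 1 == 1 {
            dst[lane] = getexp_lane(a[lane]);
        }
    }
    dst
}
```

Both 8.0 and 10.0 map to 3.0 and 0.75 maps to -1.0; the sign bit is ignored, so -8.0 also maps to 3.0.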
- _mm512_mask_getexp_round_pd Experimental avx512f
- Convert the exponent of each packed double-precision (64-bit) floating-point element in a to a double-precision (64-bit) floating-point number representing the integer exponent, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_mask_getexp_round_ps Experimental avx512f
- Convert the exponent of each packed single-precision (32-bit) floating-point element in a to a single-precision (32-bit) floating-point number representing the integer exponent, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_mask_getmant_pd Experimental avx512f
- Normalize the mantissas of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
- _mm512_mask_getmant_ps Experimental avx512f
- Normalize the mantissas of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
- _mm512_mask_getmant_round_pd Experimental avx512f
- Normalize the mantissas of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_mask_getmant_round_ps Experimental avx512f
- Normalize the mantissas of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_mask_i32gather_epi32 Experimental avx512f
- Gather 32-bit integers from memory using 32-bit indices. 32-bit elements are loaded from addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
- _mm512_mask_i32gather_epi64 Experimental avx512f
- Gather 64-bit integers from memory using 32-bit indices. 64-bit elements are loaded from addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
- _mm512_mask_i32gather_pd Experimental avx512f
- Gather double-precision (64-bit) floating-point elements from memory using 32-bit indices. 64-bit elements are loaded from addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
- _mm512_mask_i32gather_ps Experimental avx512f
- Gather single-precision (32-bit) floating-point elements from memory using 32-bit indices. 32-bit elements are loaded from addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
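The gather addressing rule (base address plus index times scale, in bytes) can be sketched in scalar form over a byte buffer. This is an illustrative model only: it assumes non-negative indices, little-endian element storage, and an in-bounds buffer, and the helper name and argument order are hypothetical.

```rust
/// Scalar model (hypothetical helper, not an intrinsic) of a masked gather of
/// 32-bit integers. `mem` stands in for memory starting at `base_addr`.
fn mask_i32gather_epi32_model(
    src: [i32; 16],
    k: u16,
    vindex: [i32; 16],
    mem: &[u8],
    scale: usize,
) -> [i32; 16] {
    assert!(matches!(scale, 1 | 2 | 4 | 8), "scale should be 1, 2, 4 or 8");
    let mut dst = src; // lanes with a clear mask bit keep the value from `src`
    for lane in 0..16 {
        if (k >> lane) & 1 == 1 {
            // Byte offset = index * scale (this model assumes a non-negative index).
            let off = vindex[lane] as usize * scale;
            dst[lane] = i32::from_le_bytes([mem[off], mem[off + 1], mem[off + 2], mem[off + 3]]);
        }
    }
    dst
}
```

With `scale = 4` the indices behave like ordinary `i32` array indices; lanes whose mask bit is clear never touch memory in this model.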
- _mm512_mask_i32logather_epi64 Experimental avx512f
- Loads 8 64-bit integer elements from memory starting at location base_addr at packed 32-bit integer indices stored in the lower half of vindex scaled by scale and stores them in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_i32logather_pd Experimental avx512f
- Loads 8 double-precision (64-bit) floating-point elements from memory starting at location base_addr at packed 32-bit integer indices stored in the lower half of vindex scaled by scale and stores them in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_i32loscatter_epi64 Experimental avx512f
- Stores 8 64-bit integer elements from a to memory starting at location base_addr at packed 32-bit integer indices stored in the lower half of vindex scaled by scale using writemask k (elements whose corresponding mask bit is not set are not written to memory).
- _mm512_mask_i32loscatter_pd Experimental avx512f
- Stores 8 double-precision (64-bit) floating-point elements from a to memory starting at location base_addr at packed 32-bit integer indices stored in the lower half of vindex scaled by scale using writemask k (elements whose corresponding mask bit is not set are not written to memory).
- _mm512_mask_i32scatter_epi32 Experimental avx512f
- Scatter 32-bit integers from a into memory using 32-bit indices. 32-bit elements are stored at addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale) subject to mask k (elements are not stored when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
- _mm512_mask_i32scatter_epi64 Experimental avx512f
- Scatter 64-bit integers from a into memory using 32-bit indices. 64-bit elements are stored at addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale) subject to mask k (elements are not stored when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
- _mm512_mask_i32scatter_pd Experimental avx512f
- Scatter double-precision (64-bit) floating-point elements from a into memory using 32-bit indices. 64-bit elements are stored at addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale) subject to mask k (elements are not stored when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
- _mm512_mask_i32scatter_ps Experimental avx512f
- Scatter single-precision (32-bit) floating-point elements from a into memory using 32-bit indices. 32-bit elements are stored at addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale) subject to mask k (elements are not stored when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
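Scatter is the store-side counterpart of gather: each selected lane of a is written to base address plus index times scale, and unselected lanes write nothing. A minimal scalar sketch over a byte buffer, under the same assumptions as before (non-negative indices, little-endian storage, in-bounds buffer, hypothetical helper name and argument order):

```rust
/// Scalar model (hypothetical helper, not an intrinsic) of a masked scatter of
/// 32-bit integers. `mem` stands in for memory starting at `base_addr`.
fn mask_i32scatter_epi32_model(
    mem: &mut [u8],
    k: u16,
    vindex: [i32; 16],
    a: [i32; 16],
    scale: usize,
) {
    assert!(matches!(scale, 1 | 2 | 4 | 8), "scale should be 1, 2, 4 or 8");
    for lane in 0..16 {
        // Lanes with a clear mask bit are skipped, so their target bytes are untouched.
        if (k >> lane) & 1 == 1 {
            let off = vindex[lane] as usize * scale;
            mem[off..off + 4].copy_from_slice(&a[lane].to_le_bytes());
        }
    }
}
```

Only the four bytes addressed by each selected lane change; everything else in the buffer keeps its previous contents.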
- _mm512_mask_i64gather_epi32 Experimental avx512f
- Gather 32-bit integers from memory using 64-bit indices. 32-bit elements are loaded from addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
- _mm512_mask_i64gather_epi64 Experimental avx512f
- Gather 64-bit integers from memory using 64-bit indices. 64-bit elements are loaded from addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
- _mm512_mask_i64gather_pd Experimental avx512f
- Gather double-precision (64-bit) floating-point elements from memory using 64-bit indices. 64-bit elements are loaded from addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
- _mm512_mask_i64gather_ps Experimental avx512f
- Gather single-precision (32-bit) floating-point elements from memory using 64-bit indices. 32-bit elements are loaded from addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
- _mm512_mask_i64scatter_epi32 Experimental avx512f
- Scatter 32-bit integers from a into memory using 64-bit indices. 32-bit elements are stored at addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale) subject to mask k (elements are not stored when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
- _mm512_mask_i64scatter_epi64 Experimental avx512f
- Scatter 64-bit integers from a into memory using 64-bit indices. 64-bit elements are stored at addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale) subject to mask k (elements are not stored when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
- _mm512_mask_i64scatter_pd Experimental avx512f
- Scatter double-precision (64-bit) floating-point elements from a into memory using 64-bit indices. 64-bit elements are stored at addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale) subject to mask k (elements are not stored when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
- _mm512_mask_i64scatter_ps Experimental avx512f
- Scatter single-precision (32-bit) floating-point elements from a into memory using 64-bit indices. 32-bit elements are stored at addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale) subject to mask k (elements are not stored when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
- _mm512_mask_insertf32x4 Experimental avx512f
- Copy a to tmp, then insert 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from b into tmp at the location specified by imm8. Store tmp to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_insertf64x4 Experimental avx512f
- Copy a to tmp, then insert 256 bits (composed of 4 packed double-precision (64-bit) floating-point elements) from b into tmp at the location specified by imm8. Store tmp to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_inserti32x4 Experimental avx512f
- Copy a to tmp, then insert 128 bits (composed of 4 packed 32-bit integers) from b into tmp at the location specified by imm8. Store tmp to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_inserti64x4 Experimental avx512f
- Copy a to tmp, then insert 256 bits (composed of 4 packed 64-bit integers) from b into tmp at the location specified by imm8. Store tmp to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
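The "copy, insert, then merge" sequence used by the masked insert operations can be modelled in scalar Rust. The sketch below covers the 256-bit integer case only; the name `mask_inserti64x4_model` is hypothetical, `imm8` is taken as an ordinary argument for simplicity, and only its lowest bit is assumed to select the half that is overwritten.

```rust
/// Scalar model of a masked insert of four 64-bit integers into an 8-lane vector.
fn mask_inserti64x4_model(
    src: [i64; 8],
    k: u8,
    a: [i64; 8],
    b: [i64; 4],
    imm8: u8,
) -> [i64; 8] {
    // Step 1: copy a to tmp.
    let mut tmp = a;
    // Step 2: overwrite the lower (imm8 bit 0 == 0) or upper (== 1) 256 bits with b.
    let start = usize::from(imm8 & 1) * 4;
    tmp[start..start + 4].copy_from_slice(&b);
    // Step 3: merge tmp into src under the writemask.
    let mut dst = src;
    for lane in 0..8 {
        if (k >> lane) & 1 == 1 {
            dst[lane] = tmp[lane];
        }
    }
    dst
}
```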
- _mm512_mask_load_epi32 Experimental avx512f
- Load packed 32-bit integers from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_mask_load_epi64 Experimental avx512f
- Load packed 64-bit integers from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_mask_load_pd Experimental avx512f
- Load packed double-precision (64-bit) floating-point elements from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_mask_load_ps Experimental avx512f
- Load packed single-precision (32-bit) floating-point elements from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_mask_loadu_epi32 Experimental avx512f
- Load packed 32-bit integers from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm512_mask_loadu_epi64 Experimental avx512f
- Load packed 64-bit integers from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm512_mask_loadu_pd Experimental avx512f
- Load packed double-precision (64-bit) floating-point elements from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm512_mask_loadu_ps Experimental avx512f
- Load packed single-precision (32-bit) floating-point elements from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm512_mask_max_epi32 Experimental avx512f
- Compare packed signed 32-bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_max_epi64 Experimental avx512f
- Compare packed signed 64-bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_max_epu32 Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_max_epu64 Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_max_pd Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_max_ps Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
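The writemask merge shared by the masked max operations can be illustrated with a scalar model. The sketch below covers the signed 32-bit case; the name `mask_max_epi32_model` is hypothetical, and it assumes that bit `i` of `k` controls lane `i`.

```rust
/// Scalar model of a masked lane-wise maximum of signed 32-bit integers.
fn mask_max_epi32_model(src: [i32; 16], k: u16, a: [i32; 16], b: [i32; 16]) -> [i32; 16] {
    // Start from src so that masked-off lanes are copied from it unchanged.
    let mut dst = src;
    for lane in 0..16 {
        if (k >> lane) & 1 == 1 {
            dst[lane] = a[lane].max(b[lane]);
        }
    }
    dst
}
```

The unsigned variants differ only in comparing the same bit patterns as unsigned values, so a lane holding `-1` as `i32` would compare as the largest `u32`.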
- _mm512_mask_max_round_pd Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_mask_max_round_ps Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_mask_min_epi32 Experimental avx512f
- Compare packed signed 32-bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_min_epi64 Experimental avx512f
- Compare packed signed 64-bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_min_epu32 Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_min_epu64 Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_min_pd Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_min_ps Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_min_round_pd Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_mask_min_round_ps Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_mask_mov_epi32 Experimental avx512f
- Move packed 32-bit integers from a to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_mov_epi64 Experimental avx512f
- Move packed 64-bit integers from a to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_mov_pd Experimental avx512f
- Move packed double-precision (64-bit) floating-point elements from a to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_mov_ps Experimental avx512f
- Move packed single-precision (32-bit) floating-point elements from a to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_movedup_pd Experimental avx512f
- Duplicate even-indexed double-precision (64-bit) floating-point elements from a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_movehdup_ps Experimental avx512f
- Duplicate odd-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_moveldup_ps Experimental avx512f
- Duplicate even-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
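The even-indexed and odd-indexed duplication performed by the `moveldup` and `movehdup` operations can be pictured with one scalar model. The helper name `mask_dup_model` and its `odd` flag are hypothetical; the real intrinsics are separate functions.

```rust
/// Scalar model of masked element duplication for sixteen f32 lanes.
/// `odd == false` models the even-indexed (moveldup) case,
/// `odd == true` models the odd-indexed (movehdup) case.
fn mask_dup_model(src: [f32; 16], k: u16, a: [f32; 16], odd: bool) -> [f32; 16] {
    let mut dst = src;
    for lane in 0..16 {
        if (k >> lane) & 1 == 1 {
            // Each pair (2i, 2i + 1) receives the same source element.
            let pick = (lane & !1) | usize::from(odd);
            dst[lane] = a[pick];
        }
    }
    dst
}
```

The double-precision `movedup` case follows the even-indexed pattern over eight 64-bit lanes.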
- _mm512_mask_mul_epi32 Experimental avx512f
- Multiply the low signed 32-bit integers from each packed 64-bit element in a and b, and store the signed 64-bit results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_mul_epu32 Experimental avx512f
- Multiply the low unsigned 32-bit integers from each packed 64-bit element in a and b, and store the unsigned 64-bit results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
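The phrase "low 32-bit integers from each packed 64-bit element" means that the upper half of every 64-bit lane is ignored before the widening multiply. A per-lane scalar sketch follows, with hypothetical function names and the writemask merge omitted for brevity.

```rust
/// Per-lane model of the signed case: take the low 32 bits of each 64-bit
/// element, sign-extend them, and form the full 64-bit product.
fn mul_lo32_signed(a: i64, b: i64) -> i64 {
    i64::from(a as i32) * i64::from(b as i32)
}

/// Per-lane model of the unsigned case: the low 32 bits are zero-extended.
fn mul_lo32_unsigned(a: u64, b: u64) -> u64 {
    u64::from(a as u32) * u64::from(b as u32)
}
```

The same input bits therefore give different products: a lane holding `0x1_FFFF_FFFF` has low half `-1` when read as signed and `4294967295` when read as unsigned.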
- _mm512_mask_mul_pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_mul_ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_mul_round_pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_mul_round_ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_mullo_epi32 Experimental avx512f
- Multiply the packed 32-bit integers in a and b, producing intermediate 64-bit integers, and store the low 32 bits of the intermediate integers in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_mullox_epi64 Experimental avx512f
- Multiplies elements in packed 64-bit integer vectors a and b together, storing the lower 64 bits of the result in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_or_epi32 Experimental avx512f
- Compute the bitwise OR of packed 32-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_or_epi64 Experimental avx512f
- Compute the bitwise OR of packed 64-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_permute_pd Experimental avx512f
- Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_permute_ps Experimental avx512f
- Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_permutevar_epi32 Experimental avx512f
- Shuffle 32-bit integers in a across lanes using the corresponding index in idx, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). Note that this intrinsic shuffles across 128-bit lanes, unlike past intrinsics that use the permutevar name. This intrinsic is identical to _mm512_mask_permutexvar_epi32, and it is recommended that you use that intrinsic name.
- _mm512_mask_permutevar_pd Experimental avx512f
- Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_permutevar_ps Experimental avx512f
- Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_permutex2var_epi32 Experimental avx512f
- Shuffle 32-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_permutex2var_epi64 Experimental avx512f
- Shuffle 64-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_permutex2var_pd Experimental avx512f
- Shuffle double-precision (64-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm512_mask_permutex2var_ps Experimental avx512f
- Shuffle single-precision (32-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
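The "selector and index" wording of the two-source permutes can be made concrete with a scalar model of the 32-bit integer case. This is a sketch under stated assumptions: the name `mask_permutex2var_epi32_model` is hypothetical, the low four bits of each `idx` element are taken as the source lane index, and bit 4 is taken as the selector between `a` and `b`.

```rust
/// Scalar model of a masked two-source permute over sixteen 32-bit lanes.
fn mask_permutex2var_epi32_model(
    a: [i32; 16],
    k: u16,
    idx: [i32; 16],
    b: [i32; 16],
) -> [i32; 16] {
    // Masked-off lanes are copied from a, not from a separate src operand.
    let mut dst = a;
    for lane in 0..16 {
        if (k >> lane) & 1 == 1 {
            let off = (idx[lane] & 0xF) as usize; // index: which lane to read
            let from_b = (idx[lane] >> 4) & 1 == 1; // selector: a or b
            dst[lane] = if from_b { b[off] } else { a[off] };
        }
    }
    dst
}
```

Because any lane of either source can be read, the operation works across 128-bit lanes, in contrast to the in-lane `permutevar` variants for floating-point elements.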
- _mm512_mask_permutex_epi64 Experimental avx512f
- Shuffle 64-bit integers in a within 256-bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_permutex_pd Experimental avx512f
- Shuffle double-precision (64-bit) floating-point elements in a within 256-bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_permutexvar_epi32 Experimental avx512f
- Shuffle 32-bit integers in a across lanes using the corresponding index in idx, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_permutexvar_epi64 Experimental avx512f
- Shuffle 64-bit integers in a across lanes using the corresponding index in idx, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_permutexvar_pd Experimental avx512f
- Shuffle double-precision (64-bit) floating-point elements in a across lanes using the corresponding index in idx, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_permutexvar_ps Experimental avx512f
- Shuffle single-precision (32-bit) floating-point elements in a across lanes using the corresponding index in idx, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_rcp14_pd Experimental avx512f
- Compute the approximate reciprocal of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm512_mask_rcp14_ps Experimental avx512f
- Compute the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm512_mask_reduce_add_epi32 Experimental avx512f
- Reduce the packed 32-bit integers in a by addition using mask k. Returns the sum of all active elements in a.
- _mm512_mask_reduce_add_epi64 Experimental avx512f
- Reduce the packed 64-bit integers in a by addition using mask k. Returns the sum of all active elements in a.
- _mm512_mask_reduce_add_pd Experimental avx512f
- Reduce the packed double-precision (64-bit) floating-point elements in a by addition using mask k. Returns the sum of all active elements in a.
- _mm512_mask_reduce_add_ps Experimental avx512f
- Reduce the packed single-precision (32-bit) floating-point elements in a by addition using mask k. Returns the sum of all active elements in a.
- _mm512_mask_reduce_and_epi32 Experimental avx512f
- Reduce the packed 32-bit integers in a by bitwise AND using mask k. Returns the bitwise AND of all active elements in a.
- _mm512_mask_reduce_and_epi64 Experimental avx512f
- Reduce the packed 64-bit integers in a by bitwise AND using mask k. Returns the bitwise AND of all active elements in a.
- _mm512_mask_reduce_max_epi32 Experimental avx512f
- Reduce the packed signed 32-bit integers in a by maximum using mask k. Returns the maximum of all active elements in a.
- _mm512_mask_reduce_max_epi64 Experimental avx512f
- Reduce the packed signed 64-bit integers in a by maximum using mask k. Returns the maximum of all active elements in a.
- _mm512_mask_reduce_max_epu32 Experimental avx512f
- Reduce the packed unsigned 32-bit integers in a by maximum using mask k. Returns the maximum of all active elements in a.
- _mm512_mask_reduce_max_epu64 Experimental avx512f
- Reduce the packed unsigned 64-bit integers in a by maximum using mask k. Returns the maximum of all active elements in a.
- _mm512_mask_reduce_max_pd Experimental avx512f
- Reduce the packed double-precision (64-bit) floating-point elements in a by maximum using mask k. Returns the maximum of all active elements in a.
- _mm512_mask_reduce_max_ps Experimental avx512f
- Reduce the packed single-precision (32-bit) floating-point elements in a by maximum using mask k. Returns the maximum of all active elements in a.
- _mm512_mask_reduce_min_epi32 Experimental avx512f
- Reduce the packed signed 32-bit integers in a by minimum using mask k. Returns the minimum of all active elements in a.
- _mm512_mask_reduce_min_epi64 Experimental avx512f
- Reduce the packed signed 64-bit integers in a by minimum using mask k. Returns the minimum of all active elements in a.
- _mm512_mask_reduce_min_epu32 Experimental avx512f
- Reduce the packed unsigned 32-bit integers in a by minimum using mask k. Returns the minimum of all active elements in a.
- _mm512_mask_reduce_min_epu64 Experimental avx512f
- Reduce the packed unsigned 64-bit integers in a by minimum using mask k. Returns the minimum of all active elements in a.
- _mm512_mask_reduce_min_pd Experimental avx512f
- Reduce the packed double-precision (64-bit) floating-point elements in a by minimum using mask k. Returns the minimum of all active elements in a.
- _mm512_mask_reduce_min_ps Experimental avx512f
- Reduce the packed single-precision (32-bit) floating-point elements in a by minimum using mask k. Returns the minimum of all active elements in a.
- _mm512_mask_reduce_mul_epi32 Experimental avx512f
- Reduce the packed 32-bit integers in a by multiplication using mask k. Returns the product of all active elements in a.
- _mm512_mask_reduce_mul_epi64 Experimental avx512f
- Reduce the packed 64-bit integers in a by multiplication using mask k. Returns the product of all active elements in a.
- _mm512_mask_reduce_mul_pd Experimental avx512f
- Reduce the packed double-precision (64-bit) floating-point elements in a by multiplication using mask k. Returns the product of all active elements in a.
- _mm512_mask_reduce_mul_ps Experimental avx512f
- Reduce the packed single-precision (32-bit) floating-point elements in a by multiplication using mask k. Returns the product of all active elements in a.
- _mm512_mask_reduce_or_epi32 Experimental avx512f
- Reduce the packed 32-bit integers in a by bitwise OR using mask k. Returns the bitwise OR of all active elements in a.
- _mm512_mask_reduce_or_epi64 Experimental avx512f
- Reduce the packed 64-bit integers in a by bitwise OR using mask k. Returns the bitwise OR of all active elements in a.
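The masked reductions all follow one pattern: only lanes whose mask bit is set ("active" elements) take part. A scalar sketch for sixteen 32-bit lanes follows. It assumes that inactive lanes behave as the identity element of the operation (0 for addition, all ones for AND, `i32::MAX` for a signed minimum), so an empty mask yields that identity; the name `mask_reduce_model` is hypothetical.

```rust
/// Scalar model of a masked horizontal reduction over sixteen i32 lanes.
/// `identity` is the value that leaves `op` unchanged (an assumption of this
/// model for how inactive lanes are treated).
fn mask_reduce_model(
    k: u16,
    a: [i32; 16],
    identity: i32,
    op: impl Fn(i32, i32) -> i32,
) -> i32 {
    let mut acc = identity;
    for lane in 0..16 {
        if (k >> lane) & 1 == 1 {
            acc = op(acc, a[lane]);
        }
    }
    acc
}
```

For example, a masked sum would pass `0` and `i32::wrapping_add`, a masked AND would pass `-1` and `|x, y| x & y`, and a masked signed minimum would pass `i32::MAX` and `i32::min`.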
- _mm512_mask_rol_epi32 Experimental avx512f
- Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_rol_epi64 Experimental avx512f
- Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_rolv_epi32 Experimental avx512f
- Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_rolv_epi64 Experimental avx512f
- Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_ror_epi32 Experimental avx512f
- Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_ror_epi64 Experimental avx512f
- Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_rorv_epi32 Experimental avx512f
- Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_rorv_epi64 Experimental avx512f
- Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
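Unlike a shift, a rotate moves bits that leave one end back in at the other. A scalar sketch of the masked variable left rotate on 32-bit lanes follows; the name `mask_rolv_epi32_model` is hypothetical, and the model assumes the rotate count is taken modulo the element width.

```rust
/// Scalar model of a masked per-lane left rotate of 32-bit integers.
fn mask_rolv_epi32_model(src: [u32; 16], k: u16, a: [u32; 16], b: [u32; 16]) -> [u32; 16] {
    let mut dst = src;
    for lane in 0..16 {
        if (k >> lane) & 1 == 1 {
            // Count taken modulo 32 (assumption of this model).
            dst[lane] = a[lane].rotate_left(b[lane] % 32);
        }
    }
    dst
}
```

The right-rotate variants correspond to `rotate_right`, and the `rol`/`ror` forms use one immediate count for every lane instead of a per-lane count.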
- _mm512_mask_roundscale_pd Experimental avx512f
- Round packed double-precision (64-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm512_mask_roundscale_ps Experimental avx512f
- Round packed single-precision (32-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm512_mask_roundscale_round_pd Experimental avx512f
- Round packed double-precision (64-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm512_mask_roundscale_round_ps Experimental avx512f
- Round packed single-precision (32-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm512_mask_rsqrt14_pd Experimental avx512f
- Compute the approximate reciprocal square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm512_mask_rsqrt14_ps Experimental avx512f
- Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm512_mask_scalef_pd Experimental avx512f
- Scale the packed double-precision (64-bit) floating-point elements in a using values from b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_scalef_ps Experimental avx512f
- Scale the packed single-precision (32-bit) floating-point elements in a using values from b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
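"Scale ... using values from b" is terse, so a scalar sketch may help. It assumes that each lane computes `a * 2^floor(b)`, which is how Intel's instruction reference describes the underlying scale operation for ordinary finite inputs. NaN, infinity, and denormal handling are not modelled, and the function names are hypothetical.

```rust
/// Per-lane model: multiply a by two raised to floor(b).
/// Special values are not handled (assumption of this sketch).
fn scalef_lane_model(a: f64, b: f64) -> f64 {
    a * 2f64.powi(b.floor() as i32)
}

/// Masked model over eight f64 lanes.
fn mask_scalef_pd_model(src: [f64; 8], k: u8, a: [f64; 8], b: [f64; 8]) -> [f64; 8] {
    let mut dst = src;
    for lane in 0..8 {
        if (k >> lane) & 1 == 1 {
            dst[lane] = scalef_lane_model(a[lane], b[lane]);
        }
    }
    dst
}
```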
- _mm512_mask_scalef_round_pd Experimental avx512f
- Scale the packed double-precision (64-bit) floating-point elements in a using values from b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_scalef_round_ps Experimental avx512f
- Scale the packed single-precision (32-bit) floating-point elements in a using values from b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_set1_epi32 Experimental avx512f
- Broadcast 32-bit integer a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_set1_epi64 Experimental avx512f
- Broadcast 64-bit integer a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_shuffle_epi32 Experimental avx512f
- Shuffle 32-bit integers in a within 128-bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
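The in-lane shuffle can be sketched in scalar Rust. The model assumes the usual encoding of the control byte: each pair of bits in `imm8` selects one of the four elements of the same 128-bit lane, and the same control is reused for every lane. The name `mask_shuffle_epi32_model` is hypothetical, and `imm8` is passed as a plain `u8` for simplicity.

```rust
/// Scalar model of a masked shuffle of 32-bit integers within 128-bit lanes.
fn mask_shuffle_epi32_model(src: [i32; 16], k: u16, a: [i32; 16], imm8: u8) -> [i32; 16] {
    let mut dst = src;
    for lane in 0..16 {
        if (k >> lane) & 1 == 1 {
            let base = lane & !3; // first element of this 128-bit lane
            let j = lane & 3; // position inside the lane
            let sel = usize::from((imm8 >> (2 * j)) & 3); // 2-bit selector
            dst[lane] = a[base + sel];
        }
    }
    dst
}
```

With `imm8 = 0x1B` (binary `00 01 10 11`), each group of four elements is reversed.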
- _mm512_mask_shuffle_f32x4 Experimental avx512f
- Shuffle 128-bits (composed of 4 single-precision (32-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_shuffle_f64x2 Experimental avx512f
- Shuffle 128-bits (composed of 2 double-precision (64-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_shuffle_i32x4 Experimental avx512f
- Shuffle 128-bits (composed of 4 32-bit integers) selected by imm8 from a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_shuffle_i64x2 Experimental avx512f
- Shuffle 128-bits (composed of 2 64-bit integers) selected by imm8 from a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_shuffle_pd Experimental avx512f
- Shuffle double-precision (64-bit) floating-point elements within 128-bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_shuffle_ps Experimental avx512f
- Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_sll_epi32 Experimental avx512f
- Shift packed 32-bit integers in a left by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_sll_epi64 Experimental avx512f
- Shift packed 64-bit integers in a left by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_slli_epi32 Experimental avx512f
- Shift packed 32-bit integers in a left by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_slli_epi64 Experimental avx512f
- Shift packed 64-bit integers in a left by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_sllv_epi32 Experimental avx512f
- Shift packed 32-bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_mask_sllv_epi64 Experimental avx512f
- Shift packed 64-bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
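A scalar sketch of the per-lane variable left shift on 32-bit elements follows. It assumes that a count of 32 or more produces zero in that lane, since every bit is shifted out, and the name `mask_sllv_epi32_model` is hypothetical. Rust's plain `<<` would panic or wrap for such counts, so the model uses `checked_shl`.

```rust
/// Scalar model of a masked per-lane logical left shift of 32-bit integers.
fn mask_sllv_epi32_model(
    src: [u32; 16],
    k: u16,
    a: [u32; 16],
    count: [u32; 16],
) -> [u32; 16] {
    let mut dst = src;
    for lane in 0..16 {
        if (k >> lane) & 1 == 1 {
            // checked_shl returns None when count >= 32; model that as zero.
            dst[lane] = a[lane].checked_shl(count[lane]).unwrap_or(0);
        }
    }
    dst
}
```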
- _mm512_mask_sqrt_pd Experimental avx512f
- Compute the square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ sqrt_ ps Experimental avx512f
- Compute the square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ sqrt_ round_ pd Experimental avx512f
- Compute the square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ sqrt_ round_ ps Experimental avx512f
- Compute the square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ sra_ epi32 Experimental avx512f
- Shift packed 32-bit integers in a right by count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ sra_ epi64 Experimental avx512f
- Shift packed 64-bit integers in a right by count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ srai_ epi32 Experimental avx512f
- Shift packed 32-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ srai_ epi64 Experimental avx512f
- Shift packed 64-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ srav_ epi32 Experimental avx512f
- Shift packed 32-bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ srav_ epi64 Experimental avx512f
- Shift packed 64-bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ srl_ epi32 Experimental avx512f
- Shift packed 32-bit integers in a right by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ srl_ epi64 Experimental avx512f
- Shift packed 64-bit integers in a right by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ srli_ epi32 Experimental avx512f
- Shift packed 32-bit integers in a right by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ srli_ epi64 Experimental avx512f
- Shift packed 64-bit integers in a right by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ srlv_ epi32 Experimental avx512f
- Shift packed 32-bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ srlv_ epi64 Experimental avx512f
- Shift packed 64-bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
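The variable-count shifts above differ in what they shift in and in how they treat out-of-range counts. The following is a minimal scalar sketch of a single 32-bit lane, assuming the behaviour given in Intel's pseudocode (counts of 32 or more yield zero for the logical shifts and a lane filled with the sign bit for the arithmetic shift). The helper names are hypothetical and are not part of `core::arch`.

```rust
/// One 32-bit lane of a variable logical left shift (`sllv`): zeros are shifted in.
/// Assumption: counts of 32 or more produce 0.
fn sllv_lane(a: u32, count: u32) -> u32 {
    if count < 32 { a << count } else { 0 }
}

/// One 32-bit lane of a variable logical right shift (`srlv`): zeros are shifted in.
/// Assumption: counts of 32 or more produce 0.
fn srlv_lane(a: u32, count: u32) -> u32 {
    if count < 32 { a >> count } else { 0 }
}

/// One 32-bit lane of a variable arithmetic right shift (`srav`): sign bits are shifted in.
/// Assumption: counts of 32 or more fill the lane with the sign bit,
/// which is the same result as shifting by 31.
fn srav_lane(a: i32, count: u32) -> i32 {
    a >> count.min(31)
}
```

Plain Rust `<<` and `>>` panic (in debug builds) on out-of-range counts, which is why the sketch checks the count explicitly.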
- _mm512_
mask_ store_ epi32 Experimental avx512f
- Store packed 32-bit integers from a into memory using writemask k. mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_
mask_ store_ epi64 Experimental avx512f
- Store packed 64-bit integers from a into memory using writemask k. mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_
mask_ store_ pd Experimental avx512f
- Store packed double-precision (64-bit) floating-point elements from a into memory using writemask k. mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_
mask_ store_ ps Experimental avx512f
- Store packed single-precision (32-bit) floating-point elements from a into memory using writemask k. mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_
mask_ storeu_ epi32 Experimental avx512f
- Store packed 32-bit integers from a into memory using writemask k. mem_addr does not need to be aligned on any particular boundary.
- _mm512_
mask_ storeu_ epi64 Experimental avx512f
- Store packed 64-bit integers from a into memory using writemask k. mem_addr does not need to be aligned on any particular boundary.
- _mm512_
mask_ storeu_ pd Experimental avx512f
- Store packed double-precision (64-bit) floating-point elements from a into memory using writemask k. mem_addr does not need to be aligned on any particular boundary.
- _mm512_
mask_ storeu_ ps Experimental avx512f
- Store packed single-precision (32-bit) floating-point elements from a into memory using writemask k. mem_addr does not need to be aligned on any particular boundary.
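For the masked stores above, only the elements whose mask bit is set are written; the remaining memory locations keep their previous contents. A scalar sketch of that behaviour for sixteen 32-bit integers follows. The function name and the use of a fixed-size array in place of `mem_addr` are illustrative assumptions, not the real intrinsic signature.

```rust
/// Scalar model of a masked store of sixteen 32-bit integers.
/// `mem` stands in for the 64 bytes at `mem_addr`; bit `i` of `k` controls element `i`.
fn mask_storeu_epi32_model(mem: &mut [i32; 16], k: u16, a: [i32; 16]) {
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            // Mask bit set: the element is written to memory.
            mem[i] = a[i];
        }
        // Mask bit clear: memory is left untouched.
    }
}
```

The aligned (`store`) and unaligned (`storeu`) forms differ only in the alignment requirement on `mem_addr`, which this model does not capture.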
- _mm512_
mask_ sub_ epi32 Experimental avx512f
- Subtract packed 32-bit integers in b from packed 32-bit integers in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ sub_ epi64 Experimental avx512f
- Subtract packed 64-bit integers in b from packed 64-bit integers in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ sub_ pd Experimental avx512f
- Subtract packed double-precision (64-bit) floating-point elements in b from packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ sub_ ps Experimental avx512f
- Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ sub_ round_ pd Experimental avx512f
- Subtract packed double-precision (64-bit) floating-point elements in b from packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ sub_ round_ ps Experimental avx512f
- Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ ternarylogic_ epi32 Experimental avx512f
- Bitwise ternary logic that provides the capability to implement any three-operand binary function; the specific binary function is specified by the value in imm8. For each bit in each packed 32-bit integer, the corresponding bits from src, a, and b are used to form a 3-bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst using writemask k at 32-bit granularity (32-bit elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ ternarylogic_ epi64 Experimental avx512f
- Bitwise ternary logic that provides the capability to implement any three-operand binary function; the specific binary function is specified by the value in imm8. For each bit in each packed 64-bit integer, the corresponding bits from src, a, and b are used to form a 3-bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst using writemask k at 64-bit granularity (64-bit elements are copied from src when the corresponding mask bit is not set).
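The ternary-logic lookup described above can be modelled in scalar Rust. This is a sketch only: it assumes the index ordering used in Intel's pseudocode (the src bit is the most significant index bit, then a, then b), and the function names are hypothetical rather than the intrinsic itself.

```rust
/// Scalar model of ternary logic on one 32-bit lane.
/// For every bit position, (src, a, b) form a 3-bit index into `imm8`.
/// Assumption: src supplies index bit 2, a bit 1, and b bit 0.
fn ternarylogic_lane(src: u32, a: u32, b: u32, imm8: u8) -> u32 {
    let mut out = 0u32;
    for bit in 0..32 {
        let s = (src >> bit) & 1;
        let x = (a >> bit) & 1;
        let y = (b >> bit) & 1;
        let index = (s << 2) | (x << 1) | y;
        // Look up the truth-table entry and place it at this bit position.
        out |= ((u32::from(imm8) >> index) & 1) << bit;
    }
    out
}

/// Writemask wrapper: lanes whose mask bit is clear keep the value from `src`.
fn mask_ternarylogic_epi32_model(
    src: [u32; 16],
    k: u16,
    a: [u32; 16],
    b: [u32; 16],
    imm8: u8,
) -> [u32; 16] {
    let mut dst = src;
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            dst[i] = ternarylogic_lane(src[i], a[i], b[i], imm8);
        }
    }
    dst
}
```

Under this ordering, `imm8 = 0x96` behaves as a three-way XOR and `imm8 = 0xCA` as a bitwise select (`src ? a : b`).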
- _mm512_
mask_ test_ epi32_ mask Experimental avx512f
- Compute the bitwise AND of packed 32-bit integers in a and b, producing intermediate 32-bit values, and set the corresponding bit in result mask k (subject to writemask k) if the intermediate value is non-zero.
- _mm512_
mask_ test_ epi64_ mask Experimental avx512f
- Compute the bitwise AND of packed 64-bit integers in a and b, producing intermediate 64-bit values, and set the corresponding bit in result mask k (subject to writemask k) if the intermediate value is non-zero.
- _mm512_
mask_ testn_ epi32_ mask Experimental avx512f
- Compute the bitwise AND of packed 32-bit integers in a and b, producing intermediate 32-bit values, and set the corresponding bit in result mask k (subject to writemask k) if the intermediate value is zero.
- _mm512_
mask_ testn_ epi64_ mask Experimental avx512f
- Compute the bitwise AND of packed 64-bit integers in a and b, producing intermediate 64-bit values, and set the corresponding bit in result mask k (subject to writemask k) if the intermediate value is zero.
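The test and testn mask operations can be read as a per-lane check on `a & b`. The sketch below is a scalar model under that reading (test sets a bit when the AND is non-zero, testn when it is zero, and lanes excluded by the writemask always produce 0). The function names and the `k1` parameter name are illustrative assumptions.

```rust
/// Scalar model of a masked "test": result bit i is set when lane i is enabled
/// by `k1` and `a[i] & b[i]` is non-zero.
fn mask_test_epi32_mask_model(k1: u16, a: [u32; 16], b: [u32; 16]) -> u16 {
    let mut k = 0u16;
    for i in 0..16 {
        if (k1 >> i) & 1 == 1 && (a[i] & b[i]) != 0 {
            k |= 1 << i;
        }
    }
    k
}

/// Scalar model of a masked "testn": result bit i is set when lane i is enabled
/// by `k1` and `a[i] & b[i]` is zero.
fn mask_testn_epi32_mask_model(k1: u16, a: [u32; 16], b: [u32; 16]) -> u16 {
    let mut k = 0u16;
    for i in 0..16 {
        if (k1 >> i) & 1 == 1 && (a[i] & b[i]) == 0 {
            k |= 1 << i;
        }
    }
    k
}
```

For any writemask, the two results are disjoint and together cover exactly the enabled lanes.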
- _mm512_
mask_ unpackhi_ epi32 Experimental avx512f
- Unpack and interleave 32-bit integers from the high half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ unpackhi_ epi64 Experimental avx512f
- Unpack and interleave 64-bit integers from the high half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ unpackhi_ pd Experimental avx512f
- Unpack and interleave double-precision (64-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ unpackhi_ ps Experimental avx512f
- Unpack and interleave single-precision (32-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ unpacklo_ epi32 Experimental avx512f
- Unpack and interleave 32-bit integers from the low half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ unpacklo_ epi64 Experimental avx512f
- Unpack and interleave 64-bit integers from the low half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ unpacklo_ pd Experimental avx512f
- Unpack and interleave double-precision (64-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ unpacklo_ ps Experimental avx512f
- Unpack and interleave single-precision (32-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
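The unpack operations above interleave within each 128-bit lane rather than across the whole 512-bit vector. A scalar sketch for 32-bit elements follows (four elements per lane, four lanes). The function name and the `high` flag are hypothetical conveniences used to cover both the `unpacklo` and `unpackhi` forms in one model.

```rust
/// Scalar model of masked unpack-and-interleave on sixteen 32-bit elements.
/// Within each 128-bit lane (4 elements), the low form produces [a0, b0, a1, b1]
/// and the high form produces [a2, b2, a3, b3]. Lanes with a clear mask bit keep `src`.
fn mask_unpack_epi32_model(
    src: [i32; 16],
    k: u16,
    a: [i32; 16],
    b: [i32; 16],
    high: bool,
) -> [i32; 16] {
    let mut dst = src;
    let offset = if high { 2 } else { 0 };
    for i in 0..16 {
        if (k >> i) & 1 == 0 {
            continue; // writemask bit clear: keep the element from src
        }
        let lane_base = (i / 4) * 4; // first element of this 128-bit lane
        let pos = i % 4; // position inside the lane
        let source_index = lane_base + offset + pos / 2;
        // Even positions come from a, odd positions from b.
        dst[i] = if pos % 2 == 0 { a[source_index] } else { b[source_index] };
    }
    dst
}
```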
- _mm512_
mask_ xor_ epi32 Experimental avx512f
- Compute the bitwise XOR of packed 32-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
mask_ xor_ epi64 Experimental avx512f
- Compute the bitwise XOR of packed 64-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm512_
maskz_ abs_ epi32 Experimental avx512f
- Compute the absolute value of packed signed 32-bit integers in a, and store the unsigned results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ abs_ epi64 Experimental avx512f
- Compute the absolute value of packed signed 64-bit integers in a, and store the unsigned results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ add_ epi32 Experimental avx512f
- Add packed 32-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
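The `mask_` (writemask) and `maskz_` (zeromask) forms differ only in what happens to lanes whose mask bit is clear: the writemask form copies the lane from src, the zeromask form writes zero. A scalar sketch for 32-bit integer addition follows. The function names are hypothetical models, and wrapping arithmetic is assumed for the packed integer add.

```rust
/// Scalar model of the writemask form: disabled lanes keep the value from `src`.
fn mask_add_epi32_model(src: [i32; 16], k: u16, a: [i32; 16], b: [i32; 16]) -> [i32; 16] {
    let mut dst = src;
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i].wrapping_add(b[i]);
        }
    }
    dst
}

/// Scalar model of the zeromask form: disabled lanes become zero.
/// It is the writemask form with an all-zero `src`.
fn maskz_add_epi32_model(k: u16, a: [i32; 16], b: [i32; 16]) -> [i32; 16] {
    mask_add_epi32_model([0; 16], k, a, b)
}
```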
- _mm512_
maskz_ add_ epi64 Experimental avx512f
- Add packed 64-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ add_ pd Experimental avx512f
- Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ add_ ps Experimental avx512f
- Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ add_ round_ pd Experimental avx512f
- Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ add_ round_ ps Experimental avx512f
- Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ alignr_ epi32 Experimental avx512f
- Concatenate a and b into a 128-byte intermediate result, shift the result right by imm8 32-bit elements, and store the low 64 bytes (16 elements) in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ alignr_ epi64 Experimental avx512f
- Concatenate a and b into a 128-byte intermediate result, shift the result right by imm8 64-bit elements, and store the low 64 bytes (8 elements) in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
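The concatenate-and-shift behaviour of alignr can be sketched in scalar Rust for the 32-bit form. The model assumes, following Intel's pseudocode, that a supplies the upper 64 bytes and b the lower 64 bytes of the concatenation, and that only the low four bits of imm8 are used as the element count. The function name is hypothetical.

```rust
/// Scalar model of zero-masked alignr on sixteen 32-bit elements.
/// The 32-element concatenation is laid out as b (elements 0..16) then a (elements 16..32).
fn maskz_alignr_epi32_model(k: u16, a: [i32; 16], b: [i32; 16], imm8: u8) -> [i32; 16] {
    // Assumption: only imm8[3:0] selects the shift amount for 32-bit elements.
    let shift = (imm8 & 0x0F) as usize;
    let mut dst = [0i32; 16];
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            let j = i + shift; // index into the concatenation after the right shift
            dst[i] = if j < 16 { b[j] } else { a[j - 16] };
        }
        // Mask bit clear: the element stays zero.
    }
    dst
}
```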
- _mm512_
maskz_ and_ epi32 Experimental avx512f
- Compute the bitwise AND of packed 32-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ and_ epi64 Experimental avx512f
- Compute the bitwise AND of packed 64-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ andnot_ epi32 Experimental avx512f
- Compute the bitwise NOT of packed 32-bit integers in a and then AND with b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ andnot_ epi64 Experimental avx512f
- Compute the bitwise NOT of packed 64-bit integers in a and then AND with b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ broadcast_ f32x4 Experimental avx512f
- Broadcast the 4 packed single-precision (32-bit) floating-point elements from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ broadcast_ f64x4 Experimental avx512f
- Broadcast the 4 packed double-precision (64-bit) floating-point elements from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ broadcast_ i32x4 Experimental avx512f
- Broadcast the 4 packed 32-bit integers from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ broadcast_ i64x4 Experimental avx512f
- Broadcast the 4 packed 64-bit integers from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ broadcastd_ epi32 Experimental avx512f
- Broadcast the low packed 32-bit integer from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ broadcastq_ epi64 Experimental avx512f
- Broadcast the low packed 64-bit integer from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ broadcastsd_ pd Experimental avx512f
- Broadcast the low double-precision (64-bit) floating-point element from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ broadcastss_ ps Experimental avx512f
- Broadcast the low single-precision (32-bit) floating-point element from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ compress_ epi32 Experimental avx512f
- Contiguously store the active 32-bit integers in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero.
- _mm512_
maskz_ compress_ epi64 Experimental avx512f
- Contiguously store the active 64-bit integers in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero.
- _mm512_
maskz_ compress_ pd Experimental avx512f
- Contiguously store the active double-precision (64-bit) floating-point elements in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero.
- _mm512_
maskz_ compress_ ps Experimental avx512f
- Contiguously store the active single-precision (32-bit) floating-point elements in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero.
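Unlike the other zeromask operations, compress moves elements: the active elements are packed towards the low end of dst in their original order, and everything after them is zero. A scalar sketch for 32-bit integers, with a hypothetical function name:

```rust
/// Scalar model of zero-masked compress on sixteen 32-bit integers.
/// Elements of `a` whose mask bit is set are written contiguously from index 0;
/// the remaining elements of the result are zero.
fn maskz_compress_epi32_model(k: u16, a: [i32; 16]) -> [i32; 16] {
    let mut dst = [0i32; 16];
    let mut next = 0; // next free slot at the low end of dst
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            dst[next] = a[i];
            next += 1;
        }
    }
    dst
}
```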
- _mm512_
maskz_ cvt_ roundepi32_ ps Experimental avx512f
- Convert packed signed 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvt_ roundepu32_ ps Experimental avx512f
- Convert packed unsigned 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvt_ roundpd_ epi32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvt_ roundpd_ epu32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvt_ roundpd_ ps Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvt_ roundph_ ps Experimental avx512f
- Convert packed half-precision (16-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
maskz_ cvt_ roundps_ epi32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvt_ roundps_ epu32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvt_ roundps_ pd Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
maskz_ cvt_ roundps_ ph Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed half-precision (16-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
maskz_ cvtepi8_ epi32 Experimental avx512f
- Sign extend packed 8-bit integers in a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtepi8_ epi64 Experimental avx512f
- Sign extend packed 8-bit integers in the low 8 bytes of a to packed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtepi16_ epi32 Experimental avx512f
- Sign extend packed 16-bit integers in a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtepi16_ epi64 Experimental avx512f
- Sign extend packed 16-bit integers in a to packed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtepi32_ epi8 Experimental avx512f
- Convert packed 32-bit integers in a to packed 8-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtepi32_ epi16 Experimental avx512f
- Convert packed 32-bit integers in a to packed 16-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtepi32_ epi64 Experimental avx512f
- Sign extend packed 32-bit integers in a to packed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtepi32_ pd Experimental avx512f
- Convert packed signed 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtepi32_ ps Experimental avx512f
- Convert packed signed 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtepi64_ epi8 Experimental avx512f
- Convert packed 64-bit integers in a to packed 8-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtepi64_ epi16 Experimental avx512f
- Convert packed 64-bit integers in a to packed 16-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtepi64_ epi32 Experimental avx512f
- Convert packed 64-bit integers in a to packed 32-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtepu8_ epi32 Experimental avx512f
- Zero extend packed unsigned 8-bit integers in a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtepu8_ epi64 Experimental avx512f
- Zero extend packed unsigned 8-bit integers in the low 8 bytes of a to packed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtepu16_ epi32 Experimental avx512f
- Zero extend packed unsigned 16-bit integers in a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtepu16_ epi64 Experimental avx512f
- Zero extend packed unsigned 16-bit integers in a to packed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtepu32_ epi64 Experimental avx512f
- Zero extend packed unsigned 32-bit integers in a to packed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtepu32_ pd Experimental avx512f
- Convert packed unsigned 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtepu32_ ps Experimental avx512f
- Convert packed unsigned 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtpd_ epi32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtpd_ epu32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtpd_ ps Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtph_ ps Experimental avx512f
- Convert packed half-precision (16-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtps_ epi32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtps_ epu32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtps_ pd Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtps_ ph Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed half-precision (16-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
maskz_ cvtsepi32_ epi8 Experimental avx512f
- Convert packed signed 32-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtsepi32_ epi16 Experimental avx512f
- Convert packed signed 32-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtsepi64_ epi8 Experimental avx512f
- Convert packed signed 64-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtsepi64_ epi16 Experimental avx512f
- Convert packed signed 64-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtsepi64_ epi32 Experimental avx512f
- Convert packed signed 64-bit integers in a to packed 32-bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
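The narrowing conversions above come in a truncating form (`cvtepi32_epi8` and relatives) and a signed-saturating form (`cvtsepi32_epi8` and relatives). The difference is easiest to see per element, as in the scalar sketch below. The helper names are hypothetical, and the zeromask wrapper models sixteen 32-bit inputs narrowed to 8 bits.

```rust
/// Truncation: keep only the low 8 bits of the input.
fn truncate_i32_to_i8(x: i32) -> i8 {
    x as i8
}

/// Signed saturation: clamp to the range of i8 before narrowing.
fn saturate_i32_to_i8(x: i32) -> i8 {
    x.clamp(i32::from(i8::MIN), i32::from(i8::MAX)) as i8
}

/// Zeromask wrapper: lanes with a clear mask bit are zero in the result.
fn maskz_narrow_model(k: u16, a: [i32; 16], convert: fn(i32) -> i8) -> [i8; 16] {
    let mut dst = [0i8; 16];
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            dst[i] = convert(a[i]);
        }
    }
    dst
}
```

For an input of 300, truncation yields 44 (300 modulo 256) while signed saturation yields 127.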
- _mm512_
maskz_ cvtt_ roundpd_ epi32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
maskz_ cvtt_ roundpd_ epu32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
maskz_ cvtt_ roundps_ epi32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
maskz_ cvtt_ roundps_ epu32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
maskz_ cvttpd_ epi32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvttpd_ epu32 Experimental avx512f
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvttps_ epi32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvttps_ epu32 Experimental avx512f
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtusepi32_ epi8 Experimental avx512f
- Convert packed unsigned 32-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtusepi32_ epi16 Experimental avx512f
- Convert packed unsigned 32-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtusepi64_ epi8 Experimental avx512f
- Convert packed unsigned 64-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtusepi64_ epi16 Experimental avx512f
- Convert packed unsigned 64-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ cvtusepi64_ epi32 Experimental avx512f
- Convert packed unsigned 64-bit integers in a to packed unsigned 32-bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
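The zeromask and unsigned-saturation behaviour described for the cvtusepi64 conversions above can be sketched in scalar Rust. This is an illustrative model rather than the intrinsic itself: the function name `maskz_cvtus_u64_to_u32` is hypothetical, plain arrays stand in for the vector types, and it assumes bit i of k controls lane i.

```rust
/// Scalar model (assumption: bit `i` of `k` controls lane `i`).
/// Each selected u64 lane is clamped to u32::MAX before narrowing;
/// lanes whose mask bit is clear are left as zero.
pub fn maskz_cvtus_u64_to_u32(k: u8, a: [u64; 8]) -> [u32; 8] {
    let mut dst = [0u32; 8];
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            // Unsigned saturation: values above u32::MAX become u32::MAX.
            dst[i] = a[i].min(u32::MAX as u64) as u32;
        }
    }
    dst
}
```

With k = 0b0000_0111, only the first three lanes are converted; a lane holding 5_000_000_000 saturates to u32::MAX.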
- _mm512_
maskz_ div_ pd Experimental avx512f
- Divide packed double-precision (64-bit) floating-point elements in a by packed elements in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ div_ ps Experimental avx512f
- Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ div_ round_ pd Experimental avx512f
- Divide packed double-precision (64-bit) floating-point elements in a by packed elements in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ div_ round_ ps Experimental avx512f
- Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ expand_ epi32 Experimental avx512f
- Load contiguous active 32-bit integers from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ expand_ epi64 Experimental avx512f
- Load contiguous active 64-bit integers from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ expand_ pd Experimental avx512f
- Load contiguous active double-precision (64-bit) floating-point elements from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ expand_ ps Experimental avx512f
- Load contiguous active single-precision (32-bit) floating-point elements from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
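The expand operation is easier to follow in scalar form. The sketch below models the 32-bit integer variant under the assumption that bit i of k controls lane i and that source elements are consumed from the low end of a in order; `maskz_expand_i32` is a hypothetical name and arrays stand in for the vector types.

```rust
/// Scalar model of a zero-masked expand over sixteen 32-bit lanes.
/// Consecutive elements from the low end of `a` are placed into the
/// lanes of `dst` whose mask bit is set; all other lanes stay zero.
pub fn maskz_expand_i32(k: u16, a: [i32; 16]) -> [i32; 16] {
    let mut dst = [0i32; 16];
    let mut next = 0; // index of the next unread element of `a`
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[next];
            next += 1;
        }
    }
    dst
}
```

With k = 0b1010, the first two elements of a land in lanes 1 and 3, and every other lane is zero.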
- _mm512_
maskz_ expandloadu_ epi32 Experimental avx512f
- Load contiguous active 32-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ expandloadu_ epi64 Experimental avx512f
- Load contiguous active 64-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ expandloadu_ pd Experimental avx512f
- Load contiguous active double-precision (64-bit) floating-point elements from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ expandloadu_ ps Experimental avx512f
- Load contiguous active single-precision (32-bit) floating-point elements from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ extractf32x4_ ps Experimental avx512f
- Extract 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a, selected with imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ extractf64x4_ pd Experimental avx512f
- Extract 256 bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a, selected with imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ extracti32x4_ epi32 Experimental avx512f
- Extract 128 bits (composed of 4 packed 32-bit integers) from a, selected with IMM2, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ extracti64x4_ epi64 Experimental avx512f
- Extract 256 bits (composed of 4 packed 64-bit integers) from a, selected with IMM1, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fixupimm_ pd Experimental avx512f
- Fix up packed double-precision (64-bit) floating-point elements in a and b using packed 64-bit integers in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
- _mm512_
maskz_ fixupimm_ ps Experimental avx512f
- Fix up packed single-precision (32-bit) floating-point elements in a and b using packed 32-bit integers in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
- _mm512_
maskz_ fixupimm_ round_ pd Experimental avx512f
- Fix up packed double-precision (64-bit) floating-point elements in a and b using packed 64-bit integers in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
- _mm512_
maskz_ fixupimm_ round_ ps Experimental avx512f
- Fix up packed single-precision (32-bit) floating-point elements in a and b using packed 32-bit integers in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
- _mm512_
maskz_ fmadd_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fmadd_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fmadd_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fmadd_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fmaddsub_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fmaddsub_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fmaddsub_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fmaddsub_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fmsub_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fmsub_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fmsub_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fmsub_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fmsubadd_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fmsubadd_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fmsubadd_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fmsubadd_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
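The difference between the fmaddsub and fmsubadd families is which lanes add and which subtract. The scalar sketch below assumes the lane order documented for the underlying VFMADDSUB and VFMSUBADD instructions: fmaddsub subtracts c in even-indexed lanes and adds it in odd-indexed lanes, and fmsubadd does the opposite. The function names are hypothetical, arrays stand in for the vector types, and bit i of k is assumed to control lane i.

```rust
/// Scalar model of zero-masked fmaddsub on eight f64 lanes.
/// Assumption: even lanes compute a*b - c, odd lanes compute a*b + c.
pub fn maskz_fmaddsub_f64(k: u8, a: [f64; 8], b: [f64; 8], c: [f64; 8]) -> [f64; 8] {
    let mut dst = [0.0f64; 8];
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            // mul_add performs the multiply and add with a single rounding.
            let addend = if i % 2 == 0 { -c[i] } else { c[i] };
            dst[i] = a[i].mul_add(b[i], addend);
        }
    }
    dst
}

/// Scalar model of zero-masked fmsubadd on eight f64 lanes.
/// Assumption: even lanes compute a*b + c, odd lanes compute a*b - c.
pub fn maskz_fmsubadd_f64(k: u8, a: [f64; 8], b: [f64; 8], c: [f64; 8]) -> [f64; 8] {
    let mut dst = [0.0f64; 8];
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            let addend = if i % 2 == 0 { c[i] } else { -c[i] };
            dst[i] = a[i].mul_add(b[i], addend);
        }
    }
    dst
}
```

With a = 2, b = 3, c = 1 in every lane and the low four mask bits set, the first model yields 5, 7, 5, 7 and the second yields 7, 5, 7, 5, followed by zeros.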
- _mm512_
maskz_ fnmadd_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fnmadd_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fnmadd_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fnmadd_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fnmsub_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fnmsub_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fnmsub_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ fnmsub_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ getexp_ pd Experimental avx512f
- Convert the exponent of each packed double-precision (64-bit) floating-point element in a to a double-precision (64-bit) floating-point number representing the integer exponent, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
- _mm512_
maskz_ getexp_ ps Experimental avx512f
- Convert the exponent of each packed single-precision (32-bit) floating-point element in a to a single-precision (32-bit) floating-point number representing the integer exponent, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
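The floor(log2(x)) description can be checked with a scalar model that reads the exponent field directly. The sketch below is deliberately limited to normal (finite, non-zero, non-subnormal) inputs; the special-case handling of the real instruction is not modeled. The name `maskz_getexp_f32` is hypothetical, and bit i of k is assumed to control lane i.

```rust
/// Scalar model of zero-masked getexp on sixteen f32 lanes.
/// Only normal inputs are modeled: the unbiased exponent is taken from
/// bits 23..=30 of the IEEE 754 encoding and returned as a float.
pub fn maskz_getexp_f32(k: u16, a: [f32; 16]) -> [f32; 16] {
    let mut dst = [0.0f32; 16];
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            debug_assert!(a[i].is_normal(), "sketch only covers normal inputs");
            let biased = ((a[i].to_bits() >> 23) & 0xff) as i32;
            dst[i] = (biased - 127) as f32; // equals floor(log2(|x|)) for normal x
        }
    }
    dst
}
```

For example, 8.0 maps to 3.0, 0.75 maps to -1.0, and -10.0 maps to 3.0, since the sign of the input is ignored.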
- _mm512_
maskz_ getexp_ round_ pd Experimental avx512f
- Convert the exponent of each packed double-precision (64-bit) floating-point element in a to a double-precision (64-bit) floating-point number representing the integer exponent, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
maskz_ getexp_ round_ ps Experimental avx512f
- Convert the exponent of each packed single-precision (32-bit) floating-point element in a to a single-precision (32-bit) floating-point number representing the integer exponent, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
maskz_ getmant_ pd Experimental avx512f
- Normalize the mantissas of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
- _mm512_
maskz_ getmant_ ps Experimental avx512f
- Normalize the mantissas of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
- _mm512_
maskz_ getmant_ round_ pd Experimental avx512f
- Normalize the mantissas of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
maskz_ getmant_ round_ ps Experimental avx512f
- Normalize the mantissas of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
maskz_ insertf32x4 Experimental avx512f
- Copy a to tmp, then insert 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from b into tmp at the location specified by imm8. Store tmp to dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ insertf64x4 Experimental avx512f
- Copy a to tmp, then insert 256 bits (composed of 4 packed double-precision (64-bit) floating-point elements) from b into tmp at the location specified by imm8. Store tmp to dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ inserti32x4 Experimental avx512f
- Copy a to tmp, then insert 128 bits (composed of 4 packed 32-bit integers) from b into tmp at the location specified by imm8. Store tmp to dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ inserti64x4 Experimental avx512f
- Copy a to tmp, then insert 256 bits (composed of 4 packed 64-bit integers) from b into tmp at the location specified by imm8. Store tmp to dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
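The copy, insert, then mask sequence used by the insert family can be sketched in scalar Rust. The model below follows the 128-bit single-precision variant; it assumes the low two bits of imm8 pick one of the four 128-bit chunks counted from lane 0, and that bit i of k controls lane i. The name `maskz_insert_4xf32` is hypothetical and arrays stand in for the vector types.

```rust
/// Scalar model of a zero-masked 128-bit insert into a 512-bit vector
/// of sixteen f32 lanes.
pub fn maskz_insert_4xf32(k: u16, a: [f32; 16], b: [f32; 4], imm8: u8) -> [f32; 16] {
    // Step 1: copy `a` to a temporary.
    let mut tmp = a;
    // Step 2: overwrite the selected 4-lane chunk with `b`.
    let chunk = (imm8 & 0b11) as usize;
    tmp[chunk * 4..chunk * 4 + 4].copy_from_slice(&b);
    // Step 3: apply the zeromask while storing to `dst`.
    let mut dst = [0.0f32; 16];
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            dst[i] = tmp[i];
        }
    }
    dst
}
```

Note that the mask is applied after the insert, so a lane inside the inserted chunk is still zeroed if its mask bit is clear.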
- _mm512_
maskz_ load_ epi32 Experimental avx512f
- Load packed 32-bit integers from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_
maskz_ load_ epi64 Experimental avx512f
- Load packed 64-bit integers from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_
maskz_ load_ pd Experimental avx512f
- Load packed double-precision (64-bit) floating-point elements from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_
maskz_ load_ ps Experimental avx512f
- Load packed single-precision (32-bit) floating-point elements from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_
maskz_ loadu_ epi32 Experimental avx512f
- Load packed 32-bit integers from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm512_
maskz_ loadu_ epi64 Experimental avx512f
- Load packed 64-bit integers from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm512_
maskz_ loadu_ pd Experimental avx512f
- Load packed double-precision (64-bit) floating-point elements from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm512_
maskz_ loadu_ ps Experimental avx512f
- Load packed single-precision (32-bit) floating-point elements from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm512_
maskz_ max_ epi32 Experimental avx512f
- Compare packed signed 32-bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ max_ epi64 Experimental avx512f
- Compare packed signed 64-bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ max_ epu32 Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ max_ epu64 Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ max_ pd Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ max_ ps Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ max_ round_ pd Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
maskz_ max_ round_ ps Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
maskz_ min_ epi32 Experimental avx512f
- Compare packed signed 32-bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ min_ epi64 Experimental avx512f
- Compare packed signed 64-bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ min_ epu32 Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ min_ epu64 Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ min_ pd Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ min_ ps Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ min_ round_ pd Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
maskz_ min_ round_ ps Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
maskz_ mov_ epi32 Experimental avx512f
- Move packed 32-bit integers from a into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ mov_ epi64 Experimental avx512f
- Move packed 64-bit integers from a into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ mov_ pd Experimental avx512f
- Move packed double-precision (64-bit) floating-point elements from a into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ mov_ ps Experimental avx512f
- Move packed single-precision (32-bit) floating-point elements from a into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ movedup_ pd Experimental avx512f
- Duplicate even-indexed double-precision (64-bit) floating-point elements from a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ movehdup_ ps Experimental avx512f
- Duplicate odd-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ moveldup_ ps Experimental avx512f
- Duplicate even-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
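The even-indexed and odd-indexed duplication performed by moveldup and movehdup reduces to a one-line index calculation per lane. The scalar sketch below uses hypothetical function names, arrays in place of the vector types, and assumes bit i of k controls lane i.

```rust
/// Scalar model of zero-masked moveldup: each pair of lanes (2j, 2j+1)
/// receives the even-indexed source element a[2j].
pub fn maskz_moveldup_f32(k: u16, a: [f32; 16]) -> [f32; 16] {
    let mut dst = [0.0f32; 16];
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i & !1]; // clear bit 0 to reach the even neighbour
        }
    }
    dst
}

/// Scalar model of zero-masked movehdup: each pair of lanes (2j, 2j+1)
/// receives the odd-indexed source element a[2j+1].
pub fn maskz_movehdup_f32(k: u16, a: [f32; 16]) -> [f32; 16] {
    let mut dst = [0.0f32; 16];
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i | 1]; // set bit 0 to reach the odd neighbour
        }
    }
    dst
}
```

For a source holding 0, 1, 2, 3, ... the first model produces 0, 0, 2, 2, ... and the second produces 1, 1, 3, 3, ... in the lanes selected by k.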
- _mm512_
maskz_ mul_ epi32 Experimental avx512f
- Multiply the low signed 32-bit integers from each packed 64-bit element in a and b, and store the signed 64-bit results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ mul_ epu32 Experimental avx512f
- Multiply the low unsigned 32-bit integers from each packed 64-bit element in a and b, and store the unsigned 64-bit results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
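The phrase "low 32-bit integers from each packed 64-bit element" means the upper half of every 64-bit lane is ignored before multiplying. A scalar sketch of both the signed and unsigned forms follows; the function names are hypothetical, arrays stand in for the vector types, and bit i of k is assumed to control lane i.

```rust
/// Scalar model of zero-masked mul_epi32: the low 32 bits of each
/// 64-bit lane are sign-extended and multiplied into a 64-bit result.
pub fn maskz_mul_low_i32(k: u8, a: [i64; 8], b: [i64; 8]) -> [i64; 8] {
    let mut dst = [0i64; 8];
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            // `as i32` keeps the low 32 bits; `as i64` sign-extends them.
            dst[i] = (a[i] as i32 as i64) * (b[i] as i32 as i64);
        }
    }
    dst
}

/// Scalar model of zero-masked mul_epu32: the low 32 bits of each
/// 64-bit lane are zero-extended and multiplied into a 64-bit result.
pub fn maskz_mul_low_u32(k: u8, a: [u64; 8], b: [u64; 8]) -> [u64; 8] {
    let mut dst = [0u64; 8];
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            // `as u32` keeps the low 32 bits; `as u64` zero-extends them.
            dst[i] = (a[i] as u32 as u64) * (b[i] as u32 as u64);
        }
    }
    dst
}
```

A lane holding 0x1_FFFF_FFFF illustrates the difference: its low half reads as -1 in the signed model and as 4294967295 in the unsigned one, so multiplying by 2 gives -2 and 8589934590 respectively.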
- _mm512_
maskz_ mul_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ mul_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ mul_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ mul_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ mullo_ epi32 Experimental avx512f
- Multiply the packed 32-bit integers in a and b, producing intermediate 64-bit integers, and store the low 32 bits of the intermediate integers in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ or_ epi32 Experimental avx512f
- Compute the bitwise OR of packed 32-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ or_ epi64 Experimental avx512f
- Compute the bitwise OR of packed 64-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ permute_ pd Experimental avx512f
- Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ permute_ ps Experimental avx512f
- Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ permutevar_ pd Experimental avx512f
- Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ permutevar_ ps Experimental avx512f
- Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ permutex2var_ epi32 Experimental avx512f
- Shuffle 32-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ permutex2var_ epi64 Experimental avx512f
- Shuffle 64-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ permutex2var_ pd Experimental avx512f
- Shuffle double-precision (64-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ permutex2var_ ps Experimental avx512f
- Shuffle single-precision (32-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ permutex_ epi64 Experimental avx512f
- Shuffle 64-bit integers in a within 256-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ permutex_ pd Experimental avx512f
- Shuffle double-precision (64-bit) floating-point elements in a within 256-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ permutexvar_ epi32 Experimental avx512f
- Shuffle 32-bit integers in a across lanes using the corresponding index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ permutexvar_ epi64 Experimental avx512f
- Shuffle 64-bit integers in a across lanes using the corresponding index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ permutexvar_ pd Experimental avx512f
- Shuffle double-precision (64-bit) floating-point elements in a across lanes using the corresponding index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ permutexvar_ ps Experimental avx512f
- Shuffle single-precision (32-bit) floating-point elements in a across lanes using the corresponding index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ rcp14_ pd Experimental avx512f
- Compute the approximate reciprocal of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm512_
maskz_ rcp14_ ps Experimental avx512f
- Compute the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm512_
maskz_ rol_ epi32 Experimental avx512f
- Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ rol_ epi64 Experimental avx512f
- Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ rolv_ epi32 Experimental avx512f
- Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ rolv_ epi64 Experimental avx512f
- Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ ror_ epi32 Experimental avx512f
- Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ ror_ epi64 Experimental avx512f
- Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ rorv_ epi32 Experimental avx512f
- Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ rorv_ epi64 Experimental avx512f
- Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
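The per-element rotates above can be modelled lane by lane with Rust's built-in `rotate_left` and `rotate_right`. The sketch below is a scalar illustration only; the `*_model` names are hypothetical, and it assumes the rotate count is taken modulo the element width (32 here).

```rust
/// Scalar model (hypothetical helper) of a variable left rotate on
/// sixteen 32-bit lanes. The count is reduced modulo 32 (assumption).
pub fn rolv_epi32_model(a: [u32; 16], b: [u32; 16]) -> [u32; 16] {
    core::array::from_fn(|i| a[i].rotate_left(b[i] % 32))
}

/// Same idea for a variable right rotate.
pub fn rorv_epi32_model(a: [u32; 16], b: [u32; 16]) -> [u32; 16] {
    core::array::from_fn(|i| a[i].rotate_right(b[i] % 32))
}

/// Zero-masked variant: lanes whose bit in `k` is clear become 0.
pub fn maskz_rolv_epi32_model(k: u16, a: [u32; 16], b: [u32; 16]) -> [u32; 16] {
    core::array::from_fn(|i| {
        if (k >> i) & 1 == 1 {
            a[i].rotate_left(b[i] % 32)
        } else {
            0
        }
    })
}
```

Unlike a shift, a rotate loses no bits: the bit leaving one end re-enters at the other, so rotating `0x8000_0001` left by one gives `0x0000_0003`.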
- _mm512_
maskz_ roundscale_ pd Experimental avx512f
- Round packed double-precision (64-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm512_
maskz_ roundscale_ ps Experimental avx512f
- Round packed single-precision (32-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm512_
maskz_ roundscale_ round_ pd Experimental avx512f
- Round packed double-precision (64-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm512_
maskz_ roundscale_ round_ ps Experimental avx512f
- Round packed single-precision (32-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm512_
maskz_ rsqrt14_ pd Experimental avx512f
- Compute the approximate reciprocal square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm512_
maskz_ rsqrt14_ ps Experimental avx512f
- Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm512_
maskz_ scalef_ pd Experimental avx512f
- Scale the packed double-precision (64-bit) floating-point elements in a using values from b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ scalef_ ps Experimental avx512f
- Scale the packed single-precision (32-bit) floating-point elements in a using values from b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
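"Scale a using values from b" is terse. A common reading of the underlying operation is `a * 2^floor(b)` per lane. The sketch below models that reading in plain Rust; the names `scalef_lane` and `maskz_scalef_pd_model` are hypothetical, and special cases (NaN, infinities, denormals) are deliberately ignored.

```rust
/// Scalar model (hypothetical helper) of one `scalef` lane:
/// a * 2^floor(b). Special values are not handled here.
pub fn scalef_lane(a: f64, b: f64) -> f64 {
    a * 2.0_f64.powi(b.floor() as i32)
}

/// Zero-masked model over eight double-precision lanes.
pub fn maskz_scalef_pd_model(k: u8, a: [f64; 8], b: [f64; 8]) -> [f64; 8] {
    core::array::from_fn(|i| {
        if (k >> i) & 1 == 1 {
            scalef_lane(a[i], b[i])
        } else {
            0.0 // lane is zeroed when its mask bit is clear
        }
    })
}
```

For example, `scalef_lane(3.0, 2.5)` scales by `2^2` and `scalef_lane(1.0, -1.5)` scales by `2^-2`, because `floor` rounds toward negative infinity.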
- _mm512_
maskz_ scalef_ round_ pd Experimental avx512f
- Scale the packed double-precision (64-bit) floating-point elements in a using values from b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ scalef_ round_ ps Experimental avx512f
- Scale the packed single-precision (32-bit) floating-point elements in a using values from b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ set1_ epi32 Experimental avx512f
- Broadcast 32-bit integer a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ set1_ epi64 Experimental avx512f
- Broadcast 64-bit integer a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ shuffle_ epi32 Experimental avx512f
- Shuffle 32-bit integers in a within 128-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ shuffle_ f32x4 Experimental avx512f
- Shuffle 128-bits (composed of 4 single-precision (32-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ shuffle_ f64x2 Experimental avx512f
- Shuffle 128-bits (composed of 2 double-precision (64-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ shuffle_ i32x4 Experimental avx512f
- Shuffle 128-bits (composed of 4 32-bit integers) selected by imm8 from a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ shuffle_ i64x2 Experimental avx512f
- Shuffle 128-bits (composed of 2 64-bit integers) selected by imm8 from a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ shuffle_ pd Experimental avx512f
- Shuffle double-precision (64-bit) floating-point elements within 128-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ shuffle_ ps Experimental avx512f
- Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ sll_ epi32 Experimental avx512f
- Shift packed 32-bit integers in a left by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ sll_ epi64 Experimental avx512f
- Shift packed 64-bit integers in a left by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ slli_ epi32 Experimental avx512f
- Shift packed 32-bit integers in a left by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ slli_ epi64 Experimental avx512f
- Shift packed 64-bit integers in a left by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ sllv_ epi32 Experimental avx512f
- Shift packed 32-bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ sllv_ epi64 Experimental avx512f
- Shift packed 64-bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ sqrt_ pd Experimental avx512f
- Compute the square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ sqrt_ ps Experimental avx512f
- Compute the square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ sqrt_ round_ pd Experimental avx512f
- Compute the square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ sqrt_ round_ ps Experimental avx512f
- Compute the square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ sra_ epi32 Experimental avx512f
- Shift packed 32-bit integers in a right by count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ sra_ epi64 Experimental avx512f
- Shift packed 64-bit integers in a right by count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ srai_ epi32 Experimental avx512f
- Shift packed 32-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ srai_ epi64 Experimental avx512f
- Shift packed 64-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ srav_ epi32 Experimental avx512f
- Shift packed 32-bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ srav_ epi64 Experimental avx512f
- Shift packed 64-bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ srl_ epi32 Experimental avx512f
- Shift packed 32-bit integers in a right by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ srl_ epi64 Experimental avx512f
- Shift packed 64-bit integers in a right by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ srli_ epi32 Experimental avx512f
- Shift packed 32-bit integers in a right by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ srli_ epi64 Experimental avx512f
- Shift packed 64-bit integers in a right by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ srlv_ epi32 Experimental avx512f
- Shift packed 32-bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ srlv_ epi64 Experimental avx512f
- Shift packed 64-bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
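The difference between "shifting in sign bits" (the `sra*` family) and "shifting in zeros" (the `srl*` family) is easiest to see on a single lane. The sketch below is a scalar illustration with hypothetical names; it assumes that a count of 32 or more yields all zeros for the logical shift and a sign-filled value for the arithmetic shift, and it guards the count explicitly because Rust's `>>` does not allow shifting by the full width.

```rust
/// Scalar model (hypothetical helper) of one logical right shift lane:
/// zeros are shifted in from the left.
pub fn srlv_epi32_lane(a: u32, count: u32) -> u32 {
    if count < 32 { a >> count } else { 0 }
}

/// Scalar model of one arithmetic right shift lane:
/// copies of the sign bit are shifted in from the left.
pub fn srav_epi32_lane(a: i32, count: u32) -> i32 {
    if count < 32 {
        a >> count // `>>` on a signed integer is arithmetic in Rust
    } else if a < 0 {
        -1 // all bits set: the sign bit fills the whole lane
    } else {
        0
    }
}
```

Applied to the bit pattern `0x8000_0000` with a count of 4, the logical shift gives `0x0800_0000` while the arithmetic shift gives `0xF800_0000`.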
- _mm512_
maskz_ sub_ epi32 Experimental avx512f
- Subtract packed 32-bit integers in b from packed 32-bit integers in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ sub_ epi64 Experimental avx512f
- Subtract packed 64-bit integers in b from packed 64-bit integers in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ sub_ pd Experimental avx512f
- Subtract packed double-precision (64-bit) floating-point elements in b from packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ sub_ ps Experimental avx512f
- Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ sub_ round_ pd Experimental avx512f
- Subtract packed double-precision (64-bit) floating-point elements in b from packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ sub_ round_ ps Experimental avx512f
- Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ ternarylogic_ epi32 Experimental avx512f
- Bitwise ternary logic that provides the capability to implement any three-operand binary function; the specific binary function is specified by value in imm8. For each bit in each packed 32-bit integer, the corresponding bit from a, b, and c are used to form a 3 bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst using zeromask k at 32-bit granularity (32-bit elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ ternarylogic_ epi64 Experimental avx512f
- Bitwise ternary logic that provides the capability to implement any three-operand binary function; the specific binary function is specified by value in imm8. For each bit in each packed 64-bit integer, the corresponding bit from a, b, and c are used to form a 3 bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst using zeromask k at 64-bit granularity (64-bit elements are zeroed out when the corresponding mask bit is not set).
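The ternary-logic description amounts to using imm8 as an 8-entry truth table. The sketch below models one 32-bit lane in plain Rust; `ternarylogic_lane` is a hypothetical name, and it assumes the bit from a is the most significant bit of the 3-bit index and the bit from c the least significant.

```rust
/// Scalar model (hypothetical helper) of ternary logic on one 32-bit lane.
/// For every bit position, the bits of a, b and c form an index 0..=7
/// (a is the high bit, c the low bit, by assumption), and the result bit
/// is the bit of `imm8` at that index.
pub fn ternarylogic_lane(a: u32, b: u32, c: u32, imm8: u8) -> u32 {
    let mut dst = 0u32;
    for bit in 0..32u32 {
        let idx = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1);
        dst |= (((imm8 >> idx) & 1) as u32) << bit;
    }
    dst
}
```

Under that index convention, `0x96` is a three-way XOR, `0xE8` is the bitwise majority function, and `0xCA` is a bitwise select (`a ? b : c`).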
- _mm512_
maskz_ unpackhi_ epi32 Experimental avx512f
- Unpack and interleave 32-bit integers from the high half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ unpackhi_ epi64 Experimental avx512f
- Unpack and interleave 64-bit integers from the high half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ unpackhi_ pd Experimental avx512f
- Unpack and interleave double-precision (64-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ unpackhi_ ps Experimental avx512f
- Unpack and interleave single-precision (32-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ unpacklo_ epi32 Experimental avx512f
- Unpack and interleave 32-bit integers from the low half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ unpacklo_ epi64 Experimental avx512f
- Unpack and interleave 64-bit integers from the low half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ unpacklo_ pd Experimental avx512f
- Unpack and interleave double-precision (64-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ unpacklo_ ps Experimental avx512f
- Unpack and interleave single-precision (32-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
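"Unpack and interleave from the low/high half of each 128-bit lane" can be pictured as follows for 32-bit elements. This is a scalar sketch without masking; the `*_model` names are hypothetical, and a 512-bit vector is represented as sixteen `i32` values grouped into four lanes of four.

```rust
/// Scalar model (hypothetical helper): interleave the two low elements
/// of every 128-bit lane of `a` and `b`.
pub fn unpacklo_epi32_model(a: [i32; 16], b: [i32; 16]) -> [i32; 16] {
    let mut dst = [0i32; 16];
    for lane in 0..4 {
        let base = lane * 4; // first element of this 128-bit lane
        dst[base] = a[base];
        dst[base + 1] = b[base];
        dst[base + 2] = a[base + 1];
        dst[base + 3] = b[base + 1];
    }
    dst
}

/// Same idea for the two high elements of every lane.
pub fn unpackhi_epi32_model(a: [i32; 16], b: [i32; 16]) -> [i32; 16] {
    let mut dst = [0i32; 16];
    for lane in 0..4 {
        let base = lane * 4;
        dst[base] = a[base + 2];
        dst[base + 1] = b[base + 2];
        dst[base + 2] = a[base + 3];
        dst[base + 3] = b[base + 3];
    }
    dst
}
```

Note that the interleave never crosses a lane boundary: the second lane of the result is built only from the second lanes of a and b.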
- _mm512_
maskz_ xor_ epi32 Experimental avx512f
- Compute the bitwise XOR of packed 32-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
maskz_ xor_ epi64 Experimental avx512f
- Compute the bitwise XOR of packed 64-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm512_
max_ epi32 Experimental avx512f
- Compare packed signed 32-bit integers in a and b, and store packed maximum values in dst.
- _mm512_
max_ epi64 Experimental avx512f
- Compare packed signed 64-bit integers in a and b, and store packed maximum values in dst.
- _mm512_
max_ epu32 Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b, and store packed maximum values in dst.
- _mm512_
max_ epu64 Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b, and store packed maximum values in dst.
- _mm512_
max_ pd Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed maximum values in dst.
- _mm512_
max_ ps Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed maximum values in dst.
- _mm512_
max_ round_ pd Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed maximum values in dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
max_ round_ ps Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed maximum values in dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
min_ epi32 Experimental avx512f
- Compare packed signed 32-bit integers in a and b, and store packed minimum values in dst.
- _mm512_
min_ epi64 Experimental avx512f
- Compare packed signed 64-bit integers in a and b, and store packed minimum values in dst.
- _mm512_
min_ epu32 Experimental avx512f
- Compare packed unsigned 32-bit integers in a and b, and store packed minimum values in dst.
- _mm512_
min_ epu64 Experimental avx512f
- Compare packed unsigned 64-bit integers in a and b, and store packed minimum values in dst.
- _mm512_
min_ pd Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed minimum values in dst.
- _mm512_
min_ ps Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed minimum values in dst.
- _mm512_
min_ round_ pd Experimental avx512f
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed minimum values in dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
min_ round_ ps Experimental avx512f
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed minimum values in dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
movedup_ pd Experimental avx512f
- Duplicate even-indexed double-precision (64-bit) floating-point elements from a, and store the results in dst.
- _mm512_
movehdup_ ps Experimental avx512f
- Duplicate odd-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst.
- _mm512_
moveldup_ ps Experimental avx512f
- Duplicate even-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst.
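The "duplicate even-indexed" and "duplicate odd-indexed" operations each copy one element of every adjacent pair into both slots of that pair. A scalar sketch, with hypothetical `*_model` names and a 512-bit vector represented as sixteen `f32` values:

```rust
/// Scalar model (hypothetical helper): every pair (2j, 2j+1) receives
/// the even-indexed element a[2j] in both positions.
pub fn moveldup_ps_model(a: [f32; 16]) -> [f32; 16] {
    core::array::from_fn(|i| a[i & !1]) // clear bit 0 -> even index
}

/// Every pair (2j, 2j+1) receives the odd-indexed element a[2j+1].
pub fn movehdup_ps_model(a: [f32; 16]) -> [f32; 16] {
    core::array::from_fn(|i| a[i | 1]) // set bit 0 -> odd index
}
```

With the input `0.0, 1.0, 2.0, 3.0, ...`, the first model starts `0.0, 0.0, 2.0, 2.0` and the second starts `1.0, 1.0, 3.0, 3.0`.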
- _mm512_
mul_ epi32 Experimental avx512f
- Multiply the low signed 32-bit integers from each packed 64-bit element in a and b, and store the signed 64-bit results in dst.
- _mm512_
mul_ epu32 Experimental avx512f
- Multiply the low unsigned 32-bit integers from each packed 64-bit element in a and b, and store the unsigned 64-bit results in dst.
- _mm512_
mul_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.
- _mm512_
mul_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.
- _mm512_
mul_ round_ pd Experimental avx512f
- Multiply packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.
- _mm512_
mul_ round_ ps Experimental avx512f
- Multiply packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.
- _mm512_
mullo_ epi32 Experimental avx512f
- Multiply the packed 32-bit integers in a and b, producing intermediate 64-bit integers, and store the low 32 bits of the intermediate integers in dst.
- _mm512_
mullox_ epi64 Experimental avx512f
- Multiplies elements in packed 64-bit integer vectors a and b together, storing the lower 64 bits of the result in dst.
- _mm512_
or_ epi32 Experimental avx512f
- Compute the bitwise OR of packed 32-bit integers in a and b, and store the results in dst.
- _mm512_
or_ epi64 Experimental avx512f
- Compute the bitwise OR of packed 64-bit integers in a and b, and store the results in dst.
- _mm512_
or_ si512 Experimental avx512f
- Compute the bitwise OR of 512 bits (representing integer data) in a and b, and store the result in dst.
- _mm512_
permute_ pd Experimental avx512f
- Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst.
- _mm512_
permute_ ps Experimental avx512f
- Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst.
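For the imm8-controlled, within-lane shuffle of 32-bit elements, a common reading is that each pair of imm8 bits selects the source element for one position, and the same control is reused in every 128-bit lane. The sketch below models that reading; `permute_ps_model` is a hypothetical name and the function is a scalar illustration, not the intrinsic.

```rust
/// Scalar model (hypothetical helper) of an imm8-controlled shuffle
/// within 128-bit lanes of sixteen f32 values.
/// Bits [2j+1 : 2j] of `imm8` pick the source element for position j
/// of every lane (assumed control layout).
pub fn permute_ps_model(a: [f32; 16], imm8: u8) -> [f32; 16] {
    core::array::from_fn(|i| {
        let lane_base = i & !3; // first element of this 128-bit lane
        let j = i & 3; // position within the lane
        let sel = ((imm8 >> (2 * j)) & 3) as usize;
        a[lane_base + sel]
    })
}
```

With this layout, `imm8 = 0x1B` reverses the four elements inside each lane, and `imm8 = 0x00` broadcasts the first element of each lane across that lane.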
- _mm512_
permutevar_ epi32 Experimental avx512f
- Shuffle 32-bit integers in a across lanes using the corresponding index in idx, and store the results in dst. Note that this intrinsic shuffles across 128-bit lanes, unlike past intrinsics that use the permutevar name. This intrinsic is identical to _mm512_permutexvar_epi32, and it is recommended that you use that intrinsic name.
- _mm512_
permutevar_ pd Experimental avx512f
- Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst.
- _mm512_
permutevar_ ps Experimental avx512f
- Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst.
- _mm512_
permutex2var_ epi32 Experimental avx512f
- Shuffle 32-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst.
- _mm512_
permutex2var_ epi64 Experimental avx512f
- Shuffle 64-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst.
- _mm512_
permutex2var_ pd Experimental avx512f
- Shuffle double-precision (64-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst.
- _mm512_
permutex2var_ ps Experimental avx512f
- Shuffle single-precision (32-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst.
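The "selector and index in idx" wording of the two-source permutes can be unpacked with a scalar model. The sketch below handles the 32-bit integer case and assumes that bits [3:0] of each idx element give the element index and bit 4 selects the source (0 for a, 1 for b); `permutex2var_epi32_model` is a hypothetical name.

```rust
/// Scalar model (hypothetical helper) of a two-source, cross-lane permute
/// on sixteen 32-bit integers.
pub fn permutex2var_epi32_model(a: [i32; 16], idx: [i32; 16], b: [i32; 16]) -> [i32; 16] {
    let mut dst = [0i32; 16];
    for i in 0..16 {
        let sel = (idx[i] & 0xF) as usize; // bits [3:0]: element index
        let from_b = (idx[i] >> 4) & 1 == 1; // bit 4: source selector (assumed)
        dst[i] = if from_b { b[sel] } else { a[sel] };
    }
    dst
}
```

An idx of `0, 16, 1, 17, ...` therefore interleaves the low elements of a and b, something the within-lane shuffles cannot do in one step because the sources may come from any lane.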
- _mm512_
permutex_ epi64 Experimental avx512f
- Shuffle 64-bit integers in a within 256-bit lanes using the control in imm8, and store the results in dst.
- _mm512_
permutex_ pd Experimental avx512f
- Shuffle double-precision (64-bit) floating-point elements in a within 256-bit lanes using the control in imm8, and store the results in dst.
- _mm512_
permutexvar_ epi32 Experimental avx512f
- Shuffle 32-bit integers in a across lanes using the corresponding index in idx, and store the results in dst.
- _mm512_
permutexvar_ epi64 Experimental avx512f
- Shuffle 64-bit integers in a across lanes using the corresponding index in idx, and store the results in dst.
- _mm512_
permutexvar_ pd Experimental avx512f
- Shuffle double-precision (64-bit) floating-point elements in a across lanes using the corresponding index in idx, and store the results in dst.
- _mm512_permutexvar_ps Experimental avx512f
- Shuffle single-precision (32-bit) floating-point elements in a across lanes using the corresponding index in idx, and store the results in dst.
- _mm512_rcp14_pd Experimental avx512f
- Compute the approximate reciprocal of packed double-precision (64-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14.
- _mm512_rcp14_ps Experimental avx512f
- Compute the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14.
- _mm512_reduce_add_epi32 Experimental avx512f
- Reduce the packed 32-bit integers in a by addition. Returns the sum of all elements in a.
- _mm512_reduce_add_epi64 Experimental avx512f
- Reduce the packed 64-bit integers in a by addition. Returns the sum of all elements in a.
- _mm512_reduce_add_pd Experimental avx512f
- Reduce the packed double-precision (64-bit) floating-point elements in a by addition. Returns the sum of all elements in a.
- _mm512_reduce_add_ps Experimental avx512f
- Reduce the packed single-precision (32-bit) floating-point elements in a by addition. Returns the sum of all elements in a.
- _mm512_reduce_and_epi32 Experimental avx512f
- Reduce the packed 32-bit integers in a by bitwise AND. Returns the bitwise AND of all elements in a.
- _mm512_reduce_and_epi64 Experimental avx512f
- Reduce the packed 64-bit integers in a by bitwise AND. Returns the bitwise AND of all elements in a.
- _mm512_reduce_max_epi32 Experimental avx512f
- Reduce the packed signed 32-bit integers in a by maximum. Returns the maximum of all elements in a.
- _mm512_reduce_max_epi64 Experimental avx512f
- Reduce the packed signed 64-bit integers in a by maximum. Returns the maximum of all elements in a.
- _mm512_reduce_max_epu32 Experimental avx512f
- Reduce the packed unsigned 32-bit integers in a by maximum. Returns the maximum of all elements in a.
- _mm512_reduce_max_epu64 Experimental avx512f
- Reduce the packed unsigned 64-bit integers in a by maximum. Returns the maximum of all elements in a.
- _mm512_reduce_max_pd Experimental avx512f
- Reduce the packed double-precision (64-bit) floating-point elements in a by maximum. Returns the maximum of all elements in a.
- _mm512_reduce_max_ps Experimental avx512f
- Reduce the packed single-precision (32-bit) floating-point elements in a by maximum. Returns the maximum of all elements in a.
- _mm512_reduce_min_epi32 Experimental avx512f
- Reduce the packed signed 32-bit integers in a by minimum. Returns the minimum of all elements in a.
- _mm512_reduce_min_epi64 Experimental avx512f
- Reduce the packed signed 64-bit integers in a by minimum. Returns the minimum of all elements in a.
- _mm512_reduce_min_epu32 Experimental avx512f
- Reduce the packed unsigned 32-bit integers in a by minimum. Returns the minimum of all elements in a.
- _mm512_reduce_min_epu64 Experimental avx512f
- Reduce the packed unsigned 64-bit integers in a by minimum. Returns the minimum of all elements in a.
- _mm512_reduce_min_pd Experimental avx512f
- Reduce the packed double-precision (64-bit) floating-point elements in a by minimum. Returns the minimum of all elements in a.
- _mm512_reduce_min_ps Experimental avx512f
- Reduce the packed single-precision (32-bit) floating-point elements in a by minimum. Returns the minimum of all elements in a.
- _mm512_reduce_mul_epi32 Experimental avx512f
- Reduce the packed 32-bit integers in a by multiplication. Returns the product of all elements in a.
- _mm512_reduce_mul_epi64 Experimental avx512f
- Reduce the packed 64-bit integers in a by multiplication. Returns the product of all elements in a.
- _mm512_reduce_mul_pd Experimental avx512f
- Reduce the packed double-precision (64-bit) floating-point elements in a by multiplication. Returns the product of all elements in a.
- _mm512_reduce_mul_ps Experimental avx512f
- Reduce the packed single-precision (32-bit) floating-point elements in a by multiplication. Returns the product of all elements in a.
- _mm512_reduce_or_epi32 Experimental avx512f
- Reduce the packed 32-bit integers in a by bitwise OR. Returns the bitwise OR of all elements in a.
- _mm512_reduce_or_epi64 Experimental avx512f
- Reduce the packed 64-bit integers in a by bitwise OR. Returns the bitwise OR of all elements in a.
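The reductions above collapse a whole vector into one scalar. A minimal scalar sketch of three of them follows; the `*_model` names are hypothetical, and the wrapping behaviour of the integer sum is an assumption based on how lane-wise SIMD addition behaves. The signed and unsigned maxima differ only in how the same 32 bits are interpreted.

```rust
/// Hypothetical scalar model: sum of all 16 lanes, assumed to wrap on overflow.
fn reduce_add_epi32_model(a: [i32; 16]) -> i32 {
    a.iter().fold(0i32, |acc, &x| acc.wrapping_add(x))
}

/// Hypothetical scalar model: maximum with lanes read as signed integers.
fn reduce_max_epi32_model(a: [i32; 16]) -> i32 {
    a.iter().fold(i32::MIN, |acc, &x| acc.max(x))
}

/// Hypothetical scalar model: maximum with the same bits read as unsigned integers.
fn reduce_max_epu32_model(a: [i32; 16]) -> u32 {
    a.iter().fold(0u32, |acc, &x| acc.max(x as u32))
}
```

A lane holding -1 is the smallest candidate for the signed maximum but, read as 0xFFFF_FFFF, the largest for the unsigned one.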
- _mm512_rol_epi32 Experimental avx512f
- Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst.
- _mm512_rol_epi64 Experimental avx512f
- Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst.
- _mm512_rolv_epi32 Experimental avx512f
- Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst.
- _mm512_rolv_epi64 Experimental avx512f
- Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst.
- _mm512_ror_epi32 Experimental avx512f
- Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst.
- _mm512_ror_epi64 Experimental avx512f
- Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst.
- _mm512_rorv_epi32 Experimental avx512f
- Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst.
- _mm512_rorv_epi64 Experimental avx512f
- Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst.
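The per-lane rotates above can be sketched in scalar Rust. The helpers below are hypothetical models rather than the intrinsics, and they assume the rotate count is taken modulo the element width (32 here), so a count of 33 behaves like a count of 1.

```rust
/// Hypothetical scalar model of a per-lane variable left rotate (32-bit lanes).
/// Assumption: the count in each lane of `b` is reduced modulo 32.
fn rolv_epi32_model(a: [u32; 16], b: [u32; 16]) -> [u32; 16] {
    core::array::from_fn(|i| a[i].rotate_left(b[i] % 32))
}

/// Hypothetical scalar model of the matching right rotate.
fn rorv_epi32_model(a: [u32; 16], b: [u32; 16]) -> [u32; 16] {
    core::array::from_fn(|i| a[i].rotate_right(b[i] % 32))
}
```

Rotating left and then right by the same counts returns the original vector, since no bits are discarded.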
- _mm512_roundscale_pd Experimental avx512f
- Round packed double-precision (64-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst.
Rounding is done according to the rounding mode selected by imm8[2:0].
- _mm512_roundscale_ps Experimental avx512f
- Round packed single-precision (32-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst.
Rounding is done according to the rounding mode selected by imm8[2:0].
- _mm512_roundscale_round_pd Experimental avx512f
- Round packed double-precision (64-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst.
Rounding is done according to the rounding mode selected by imm8[2:0].
- _mm512_roundscale_round_ps Experimental avx512f
- Round packed single-precision (32-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst.
Rounding is done according to the rounding mode selected by imm8[2:0].
- _mm512_rsqrt14_pd Experimental avx512f
- Compute the approximate reciprocal square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14.
- _mm512_rsqrt14_ps Experimental avx512f
- Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14.
- _mm512_scalef_pd Experimental avx512f
- Scale the packed double-precision (64-bit) floating-point elements in a using values from b, and store the results in dst.
- _mm512_scalef_ps Experimental avx512f
- Scale the packed single-precision (32-bit) floating-point elements in a using values from b, and store the results in dst.
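"Scale ... using values from b" is terse. As a rough scalar sketch, assuming the usual definition of the operation as a * 2^floor(b) per lane, and ignoring NaN, infinity, denormal and very large exponent handling, which this hypothetical helper does not model:

```rust
/// Hypothetical scalar model of scalef on 8 double-precision lanes.
/// Assumption: each lane computes a * 2^floor(b); special values and
/// exponents outside the i32 range are not modelled.
fn scalef_pd_model(a: [f64; 8], b: [f64; 8]) -> [f64; 8] {
    core::array::from_fn(|i| a[i] * 2f64.powi(b[i].floor() as i32))
}
```

For example, a lane with a = 3.0 and b = 2.5 would give 3.0 * 2^2 = 12.0 under this model.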
- _mm512_scalef_round_pd Experimental avx512f
- Scale the packed double-precision (64-bit) floating-point elements in a using values from b, and store the results in dst.
- _mm512_scalef_round_ps Experimental avx512f
- Scale the packed single-precision (32-bit) floating-point elements in a using values from b, and store the results in dst.
- _mm512_set1_epi8 Experimental avx512f
- Broadcast 8-bit integer a to all elements of dst.
- _mm512_set1_epi16 Experimental avx512f
- Broadcast the low packed 16-bit integer from a to all elements of dst.
- _mm512_set1_epi32 Experimental avx512f
- Broadcast 32-bit integer a to all elements of dst.
- _mm512_set1_epi64 Experimental avx512f
- Broadcast 64-bit integer a to all elements of dst.
- _mm512_set1_pd Experimental avx512f
- Broadcast 64-bit float a to all elements of dst.
- _mm512_set1_ps Experimental avx512f
- Broadcast 32-bit float a to all elements of dst.
- _mm512_set4_epi32 Experimental avx512f
- Set packed 32-bit integers in dst with the repeated 4 element sequence.
- _mm512_set4_epi64 Experimental avx512f
- Set packed 64-bit integers in dst with the repeated 4 element sequence.
- _mm512_set4_pd Experimental avx512f
- Set packed double-precision (64-bit) floating-point elements in dst with the repeated 4 element sequence.
- _mm512_set4_ps Experimental avx512f
- Set packed single-precision (32-bit) floating-point elements in dst with the repeated 4 element sequence.
- _mm512_set_epi8 Experimental avx512f
- Set packed 8-bit integers in dst with the supplied values.
- _mm512_set_epi16 Experimental avx512f
- Set packed 16-bit integers in dst with the supplied values.
- _mm512_set_epi32 Experimental avx512f
- Sets packed 32-bit integers in dst with the supplied values.
- _mm512_set_epi64 Experimental avx512f
- Set packed 64-bit integers in dst with the supplied values.
- _mm512_set_pd Experimental avx512f
- Set packed double-precision (64-bit) floating-point elements in dst with the supplied values.
- _mm512_set_ps Experimental avx512f
- Set packed single-precision (32-bit) floating-point elements in dst with the supplied values.
- _mm512_setr4_epi32 Experimental avx512f
- Set packed 32-bit integers in dst with the repeated 4 element sequence in reverse order.
- _mm512_setr4_epi64 Experimental avx512f
- Set packed 64-bit integers in dst with the repeated 4 element sequence in reverse order.
- _mm512_setr4_pd Experimental avx512f
- Set packed double-precision (64-bit) floating-point elements in dst with the repeated 4 element sequence in reverse order.
- _mm512_setr4_ps Experimental avx512f
- Set packed single-precision (32-bit) floating-point elements in dst with the repeated 4 element sequence in reverse order.
- _mm512_setr_epi32 Experimental avx512f
- Sets packed 32-bit integers in dst with the supplied values in reverse order.
- _mm512_setr_epi64 Experimental avx512f
- Set packed 64-bit integers in dst with the supplied values in reverse order.
- _mm512_setr_pd Experimental avx512f
- Set packed double-precision (64-bit) floating-point elements in dst with the supplied values in reverse order.
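The difference between the `set` and `setr` families is only argument order. A minimal sketch, with hypothetical helper names and arrays standing in for vectors (index 0 is the lowest lane): it assumes `set` places its first argument in the highest lane, while `setr` ("reverse") places its first argument in the lowest lane.

```rust
/// Hypothetical model of `set`: arguments are listed from the highest lane down,
/// so the last argument lands in lane 0.
fn set_epi64_model(args: [i64; 8]) -> [i64; 8] {
    let mut lanes = args;
    lanes.reverse();
    lanes
}

/// Hypothetical model of `setr`: arguments are listed from lane 0 upward.
fn setr_epi64_model(args: [i64; 8]) -> [i64; 8] {
    args
}
```

Under this model, `set_epi64_model([7, 6, 5, 4, 3, 2, 1, 0])` and `setr_epi64_model([0, 1, 2, 3, 4, 5, 6, 7])` describe the same vector.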
- _mm512_setr_ps Experimental avx512f
- Set packed single-precision (32-bit) floating-point elements in dst with the supplied values in reverse order.
- _mm512_setzero Experimental avx512f
- Return vector of type __m512 with all elements set to zero.
- _mm512_setzero_epi32 Experimental avx512f
- Return vector of type __m512i with all elements set to zero.
- _mm512_setzero_pd Experimental avx512f
- Returns vector of type __m512d with all elements set to zero.
- _mm512_setzero_ps Experimental avx512f
- Returns vector of type __m512 with all elements set to zero.
- _mm512_setzero_si512 Experimental avx512f
- Returns vector of type __m512i with all elements set to zero.
- _mm512_shuffle_epi32 Experimental avx512f
- Shuffle 32-bit integers in a within 128-bit lanes using the control in imm8, and store the results in dst.
- _mm512_shuffle_f32x4 Experimental avx512f
- Shuffle 128-bits (composed of 4 single-precision (32-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst.
- _mm512_shuffle_f64x2 Experimental avx512f
- Shuffle 128-bits (composed of 2 double-precision (64-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst.
- _mm512_shuffle_i32x4 Experimental avx512f
- Shuffle 128-bits (composed of 4 32-bit integers) selected by imm8 from a and b, and store the results in dst.
- _mm512_shuffle_i64x2 Experimental avx512f
- Shuffle 128-bits (composed of 2 64-bit integers) selected by imm8 from a and b, and store the results in dst.
- _mm512_shuffle_pd Experimental avx512f
- Shuffle double-precision (64-bit) floating-point elements within 128-bit lanes using the control in imm8, and store the results in dst.
- _mm512_shuffle_ps Experimental avx512f
- Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst.
- _mm512_sll_epi32 Experimental avx512f
- Shift packed 32-bit integers in a left by count while shifting in zeros, and store the results in dst.
- _mm512_sll_epi64 Experimental avx512f
- Shift packed 64-bit integers in a left by count while shifting in zeros, and store the results in dst.
- _mm512_slli_epi32 Experimental avx512f
- Shift packed 32-bit integers in a left by imm8 while shifting in zeros, and store the results in dst.
- _mm512_slli_epi64 Experimental avx512f
- Shift packed 64-bit integers in a left by imm8 while shifting in zeros, and store the results in dst.
- _mm512_sllv_epi32 Experimental avx512f
- Shift packed 32-bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst.
- _mm512_sllv_epi64 Experimental avx512f
- Shift packed 64-bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst.
- _mm512_sqrt_pd Experimental avx512f
- Compute the square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst.
- _mm512_sqrt_ps Experimental avx512f
- Compute the square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst.
- _mm512_sqrt_round_pd Experimental avx512f
- Compute the square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst.
- _mm512_sqrt_round_ps Experimental avx512f
- Compute the square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst.
- _mm512_sra_epi32 Experimental avx512f
- Shift packed 32-bit integers in a right by count while shifting in sign bits, and store the results in dst.
- _mm512_sra_epi64 Experimental avx512f
- Shift packed 64-bit integers in a right by count while shifting in sign bits, and store the results in dst.
- _mm512_srai_epi32 Experimental avx512f
- Shift packed 32-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst.
- _mm512_srai_epi64 Experimental avx512f
- Shift packed 64-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst.
- _mm512_srav_epi32 Experimental avx512f
- Shift packed 32-bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst.
- _mm512_srav_epi64 Experimental avx512f
- Shift packed 64-bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst.
- _mm512_srl_epi32 Experimental avx512f
- Shift packed 32-bit integers in a right by count while shifting in zeros, and store the results in dst.
- _mm512_srl_epi64 Experimental avx512f
- Shift packed 64-bit integers in a right by count while shifting in zeros, and store the results in dst.
- _mm512_srli_epi32 Experimental avx512f
- Shift packed 32-bit integers in a right by imm8 while shifting in zeros, and store the results in dst.
- _mm512_srli_epi64 Experimental avx512f
- Shift packed 64-bit integers in a right by imm8 while shifting in zeros, and store the results in dst.
- _mm512_srlv_epi32 Experimental avx512f
- Shift packed 32-bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst.
- _mm512_srlv_epi64 Experimental avx512f
- Shift packed 64-bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst.
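The shift families differ from Rust's built-in shift operators in how they treat oversized counts. The per-lane helpers below are hypothetical scalar models for 32-bit elements; they assume that a count of 32 or more yields zero for the logical shifts and a full sign fill for the arithmetic shift, rather than panicking or wrapping.

```rust
/// Hypothetical per-lane model of a logical left shift that shifts in zeros.
fn sllv_lane_model(a: u32, count: u32) -> u32 {
    a.checked_shl(count).unwrap_or(0) // counts >= 32 clear the lane
}

/// Hypothetical per-lane model of a logical right shift that shifts in zeros.
fn srlv_lane_model(a: u32, count: u32) -> u32 {
    a.checked_shr(count).unwrap_or(0)
}

/// Hypothetical per-lane model of an arithmetic right shift that shifts in sign bits.
fn srav_lane_model(a: i32, count: u32) -> i32 {
    if count < 32 { a >> count } else { a >> 31 } // oversized counts leave only sign bits
}
```

Applying one of these helpers to every lane, with the count taken from the matching lane of the count vector, gives the variable-shift behaviour; the `sll`/`srl`/`sra` and immediate forms would use one count for all lanes.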
- _mm512_store_epi32 Experimental avx512f
- Store 512-bits (composed of 16 packed 32-bit integers) from a into memory. mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_store_epi64 Experimental avx512f
- Store 512-bits (composed of 8 packed 64-bit integers) from a into memory. mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_store_pd Experimental avx512f
- Store 512-bits (composed of 8 packed double-precision (64-bit) floating-point elements) from a into memory. mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_store_ps Experimental avx512f
- Store 512-bits (composed of 16 packed single-precision (32-bit) floating-point elements) from a into memory. mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_store_si512 Experimental avx512f
- Store 512-bits of integer data from a into memory. mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_storeu_epi32 Experimental avx512f
- Store 512-bits (composed of 16 packed 32-bit integers) from a into memory. mem_addr does not need to be aligned on any particular boundary.
- _mm512_storeu_epi64 Experimental avx512f
- Store 512-bits (composed of 8 packed 64-bit integers) from a into memory. mem_addr does not need to be aligned on any particular boundary.
- _mm512_storeu_pd Experimental avx512f
- Stores 512-bits (composed of 8 packed double-precision (64-bit) floating-point elements) from a into memory. mem_addr does not need to be aligned on any particular boundary.
- _mm512_storeu_ps Experimental avx512f
- Stores 512-bits (composed of 16 packed single-precision (32-bit) floating-point elements) from a into memory. mem_addr does not need to be aligned on any particular boundary.
- _mm512_storeu_si512 Experimental avx512f
- Store 512-bits of integer data from a into memory. mem_addr does not need to be aligned on any particular boundary.
- _mm512_stream_load_si512 Experimental avx512f
- Load 512-bits of integer data from memory into dst using a non-temporal memory hint. mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated. To minimize caching, the data is flagged as non-temporal (unlikely to be used again soon).
- _mm512_stream_pd Experimental avx512f
- Store 512-bits (composed of 8 packed double-precision (64-bit) floating-point elements) from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_stream_ps Experimental avx512f
- Store 512-bits (composed of 16 packed single-precision (32-bit) floating-point elements) from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_stream_si512 Experimental avx512f
- Store 512-bits of integer data from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 64-byte boundary or a general-protection exception may be generated.
- _mm512_sub_epi32 Experimental avx512f
- Subtract packed 32-bit integers in b from packed 32-bit integers in a, and store the results in dst.
- _mm512_sub_epi64 Experimental avx512f
- Subtract packed 64-bit integers in b from packed 64-bit integers in a, and store the results in dst.
- _mm512_sub_pd Experimental avx512f
- Subtract packed double-precision (64-bit) floating-point elements in b from packed double-precision (64-bit) floating-point elements in a, and store the results in dst.
- _mm512_sub_ps Experimental avx512f
- Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a, and store the results in dst.
- _mm512_sub_round_pd Experimental avx512f
- Subtract packed double-precision (64-bit) floating-point elements in b from packed double-precision (64-bit) floating-point elements in a, and store the results in dst.
- _mm512_sub_round_ps Experimental avx512f
- Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a, and store the results in dst.
- _mm512_ternarylogic_epi32 Experimental avx512f
- Bitwise ternary logic that provides the capability to implement any three-operand binary function; the specific binary function is specified by value in imm8. For each bit in each packed 32-bit integer, the corresponding bit from a, b, and c are used to form a 3 bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst.
- _mm512_ternarylogic_epi64 Experimental avx512f
- Bitwise ternary logic that provides the capability to implement any three-operand binary function; the specific binary function is specified by value in imm8. For each bit in each packed 64-bit integer, the corresponding bit from a, b, and c are used to form a 3 bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst.
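The imm8 lookup described above is easiest to see on a single element. The helper below is a hypothetical scalar model, not the intrinsic: for each bit position it forms the 3-bit index from a (most significant), b, and c (least significant) and reads that bit of imm8 as an 8-entry truth table.

```rust
/// Hypothetical scalar model of ternary logic on one 32-bit element.
/// `imm8` acts as a truth table indexed by the bits (a, b, c), with a as the high bit.
fn ternarylogic_epi32_model(a: u32, b: u32, c: u32, imm8: u8) -> u32 {
    let mut dst = 0u32;
    for bit in 0..32u32 {
        let idx = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1);
        dst |= (((imm8 >> idx) & 1) as u32) << bit;
    }
    dst
}
```

Under this model an imm8 of 0x96 yields a three-way XOR and 0xCA yields a bitwise select (bits of b where a is 1, bits of c where a is 0).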
- _mm512_test_epi32_mask Experimental avx512f
- Compute the bitwise AND of packed 32-bit integers in a and b, producing intermediate 32-bit values, and set the corresponding bit in result mask k if the intermediate value is non-zero.
- _mm512_test_epi64_mask Experimental avx512f
- Compute the bitwise AND of packed 64-bit integers in a and b, producing intermediate 64-bit values, and set the corresponding bit in result mask k if the intermediate value is non-zero.
- _mm512_testn_epi32_mask Experimental avx512f
- Compute the bitwise AND of packed 32-bit integers in a and b, producing intermediate 32-bit values, and set the corresponding bit in result mask k if the intermediate value is zero.
- _mm512_testn_epi64_mask Experimental avx512f
- Compute the bitwise AND of packed 64-bit integers in a and b, producing intermediate 64-bit values, and set the corresponding bit in result mask k if the intermediate value is zero.
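The test and testn mask operations are complements of each other. A minimal scalar sketch with hypothetical helper names, assuming that `test` sets mask bit i when the AND of lane i is non-zero and `testn` sets it when that same AND is zero (the behaviour of the underlying vptestnm instruction):

```rust
/// Hypothetical scalar model: mask bit i is set when (a[i] & b[i]) != 0.
fn test_epi32_mask_model(a: [u32; 16], b: [u32; 16]) -> u16 {
    let mut k = 0u16;
    for i in 0..16 {
        if a[i] & b[i] != 0 {
            k |= 1 << i;
        }
    }
    k
}

/// Hypothetical scalar model: mask bit i is set when (a[i] & b[i]) == 0.
fn testn_epi32_mask_model(a: [u32; 16], b: [u32; 16]) -> u16 {
    !test_epi32_mask_model(a, b)
}
```

Testing the lane values 0 through 15 against a constant 1 would flag the odd lanes under `test` and the even lanes under `testn`.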
- _mm512_undefined Experimental avx512f
- Return vector of type __m512 with indeterminate elements. Despite using the word "undefined" (following Intel's naming scheme), this non-deterministically picks some valid value and is not equivalent to mem::MaybeUninit. In practice, this is typically equivalent to mem::zeroed.
- _mm512_undefined_epi32 Experimental avx512f
- Return vector of type __m512i with indeterminate elements. Despite using the word "undefined" (following Intel's naming scheme), this non-deterministically picks some valid value and is not equivalent to mem::MaybeUninit. In practice, this is typically equivalent to mem::zeroed.
- _mm512_undefined_pd Experimental avx512f
- Returns vector of type __m512d with indeterminate elements. Despite using the word "undefined" (following Intel's naming scheme), this non-deterministically picks some valid value and is not equivalent to mem::MaybeUninit. In practice, this is typically equivalent to mem::zeroed.
- _mm512_undefined_ps Experimental avx512f
- Returns vector of type __m512 with indeterminate elements. Despite using the word "undefined" (following Intel's naming scheme), this non-deterministically picks some valid value and is not equivalent to mem::MaybeUninit. In practice, this is typically equivalent to mem::zeroed.
- _mm512_unpackhi_epi32 Experimental avx512f
- Unpack and interleave 32-bit integers from the high half of each 128-bit lane in a and b, and store the results in dst.
- _mm512_unpackhi_epi64 Experimental avx512f
- Unpack and interleave 64-bit integers from the high half of each 128-bit lane in a and b, and store the results in dst.
- _mm512_unpackhi_pd Experimental avx512f
- Unpack and interleave double-precision (64-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst.
- _mm512_unpackhi_ps Experimental avx512f
- Unpack and interleave single-precision (32-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst.
- _mm512_unpacklo_epi32 Experimental avx512f
- Unpack and interleave 32-bit integers from the low half of each 128-bit lane in a and b, and store the results in dst.
- _mm512_unpacklo_epi64 Experimental avx512f
- Unpack and interleave 64-bit integers from the low half of each 128-bit lane in a and b, and store the results in dst.
- _mm512_unpacklo_pd Experimental avx512f
- Unpack and interleave double-precision (64-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst.
- _mm512_unpacklo_ps Experimental avx512f
- Unpack and interleave single-precision (32-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst.
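"Unpack and interleave ... from the low/high half of each 128-bit lane" can be sketched for 32-bit elements as follows. The helpers are hypothetical scalar models; each 128-bit lane is treated as a group of four consecutive array elements, and the interleave order (a first, then b) is an assumption based on the usual behaviour of the unpack instructions.

```rust
/// Hypothetical scalar model: interleave the two low elements of each 4-element lane.
fn unpacklo_epi32_model(a: [i32; 16], b: [i32; 16]) -> [i32; 16] {
    let mut dst = [0i32; 16];
    for lane in 0..4 {
        let base = lane * 4;
        dst[base] = a[base];
        dst[base + 1] = b[base];
        dst[base + 2] = a[base + 1];
        dst[base + 3] = b[base + 1];
    }
    dst
}

/// Hypothetical scalar model: interleave the two high elements of each 4-element lane.
fn unpackhi_epi32_model(a: [i32; 16], b: [i32; 16]) -> [i32; 16] {
    let mut dst = [0i32; 16];
    for lane in 0..4 {
        let base = lane * 4;
        dst[base] = a[base + 2];
        dst[base + 1] = b[base + 2];
        dst[base + 2] = a[base + 3];
        dst[base + 3] = b[base + 3];
    }
    dst
}
```

Note that nothing crosses a lane boundary: elements 0 to 3 of the result depend only on elements 0 to 3 of a and b.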
- _mm512_xor_epi32 Experimental avx512f
- Compute the bitwise XOR of packed 32-bit integers in a and b, and store the results in dst.
- _mm512_xor_epi64 Experimental avx512f
- Compute the bitwise XOR of packed 64-bit integers in a and b, and store the results in dst.
- _mm512_xor_si512 Experimental avx512f
- Compute the bitwise XOR of 512 bits (representing integer data) in a and b, and store the result in dst.
- _mm512_zextpd128_pd512 Experimental avx512f
- Cast vector of type __m128d to type __m512d; the upper 384 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_zextpd256_pd512 Experimental avx512f
- Cast vector of type __m256d to type __m512d; the upper 256 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_zextps128_ps512 Experimental avx512f
- Cast vector of type __m128 to type __m512; the upper 384 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_zextps256_ps512 Experimental avx512f
- Cast vector of type __m256 to type __m512; the upper 256 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_zextsi128_si512 Experimental avx512f
- Cast vector of type __m128i to type __m512i; the upper 384 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm512_zextsi256_si512 Experimental avx512f
- Cast vector of type __m256i to type __m512i; the upper 256 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
- _mm_abs_epi64 Experimental avx512f,avx512vl
- Compute the absolute value of packed signed 64-bit integers in a, and store the unsigned results in dst.
- _mm_add_round_sd Experimental avx512f
- Add the lower double-precision (64-bit) floating-point element in a and b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.
- _mm_add_round_ss Experimental avx512f
- Add the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_alignr_epi32 Experimental avx512f,avx512vl
- Concatenate a and b into a 32-byte intermediate result, shift the result right by imm8 32-bit elements, and store the low 16 bytes (4 elements) in dst.
- _mm_alignr_epi64 Experimental avx512f,avx512vl
- Concatenate a and b into a 32-byte intermediate result, shift the result right by imm8 64-bit elements, and store the low 16 bytes (2 elements) in dst.
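The concatenate-and-shift behaviour of the alignr operations can be sketched on arrays. The helper below is a hypothetical scalar model of the 32-bit, 128-bit-wide form; it assumes that a occupies the upper half of the concatenation and b the lower half, and that only the low two bits of imm8 are used as the element shift.

```rust
/// Hypothetical scalar model of a 4-lane element-wise alignr on 32-bit integers.
/// Assumptions: the concatenation is a:b with b in the low half, and the shift
/// amount is imm8 & 0b11 elements.
fn alignr_epi32_model(a: [i32; 4], b: [i32; 4], imm8: u8) -> [i32; 4] {
    let shift = (imm8 & 0b11) as usize;
    let concat = [b[0], b[1], b[2], b[3], a[0], a[1], a[2], a[3]];
    core::array::from_fn(|i| concat[i + shift])
}
```

With a shift of 1, the result would be the upper three elements of b followed by the lowest element of a.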
- _mm_cmp_epi32_mask Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k.
- _mm_cmp_epi64_mask Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k.
- _mm_cmp_epu32_mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k.
- _mm_cmp_epu64_mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k.
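The integer comparisons above take their predicate from imm8, using the `_MM_CMPINT_*` constants listed at the top of this page. A scalar sketch for the signed 32-bit, 4-lane form follows; the helper name is hypothetical and the numeric encoding (EQ = 0, LT = 1, LE = 2, FALSE = 3, NE = 4, NLT = 5, NLE = 6, TRUE = 7) is stated here as an assumption.

```rust
/// Hypothetical scalar model of a 4-lane signed compare-into-mask.
/// Assumed predicate encoding in the low 3 bits of `imm8`:
/// 0 EQ, 1 LT, 2 LE, 3 FALSE, 4 NE, 5 NLT (>=), 6 NLE (>), 7 TRUE.
fn cmp_epi32_mask_model(a: [i32; 4], b: [i32; 4], imm8: u8) -> u8 {
    let mut k = 0u8;
    for i in 0..4 {
        let hit = match imm8 & 0b111 {
            0 => a[i] == b[i],
            1 => a[i] < b[i],
            2 => a[i] <= b[i],
            3 => false,
            4 => a[i] != b[i],
            5 => a[i] >= b[i],
            6 => a[i] > b[i],
            _ => true,
        };
        if hit {
            k |= 1 << i; // bit i of the mask corresponds to lane i
        }
    }
    k
}
```

The unsigned variants would follow the same table with the lanes compared as unsigned integers.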
- _mm_cmp_pd_mask Experimental avx512f,avx512vl
- Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k.
- _mm_cmp_ps_mask Experimental avx512f,avx512vl
- Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k.
- _mm_cmp_round_sd_mask Experimental avx512f
- Compare the lower double-precision (64-bit) floating-point element in a and b based on the comparison operand specified by imm8, and store the result in mask vector k.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_cmp_round_ss_mask Experimental avx512f
- Compare the lower single-precision (32-bit) floating-point element in a and b based on the comparison operand specified by imm8, and store the result in mask vector k.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_cmp_sd_mask Experimental avx512f
- Compare the lower double-precision (64-bit) floating-point element in a and b based on the comparison operand specified by imm8, and store the result in mask vector k.
- _mm_cmp_ss_mask Experimental avx512f
- Compare the lower single-precision (32-bit) floating-point element in a and b based on the comparison operand specified by imm8, and store the result in mask vector k.
- _mm_cmpeq_epi32_mask Experimental avx512f,avx512vl
- Compare packed 32-bit integers in a and b for equality, and store the results in mask vector k.
- _mm_cmpeq_epi64_mask Experimental avx512f,avx512vl
- Compare packed 64-bit integers in a and b for equality, and store the results in mask vector k.
- _mm_cmpeq_epu32_mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for equality, and store the results in mask vector k.
- _mm_cmpeq_epu64_mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for equality, and store the results in mask vector k.
- _mm_cmpge_epi32_mask Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k.
- _mm_cmpge_epi64_mask Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k.
- _mm_cmpge_epu32_mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k.
- _mm_cmpge_epu64_mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k.
- _mm_cmpgt_epi32_mask Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b for greater-than, and store the results in mask vector k.
- _mm_cmpgt_epi64_mask Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for greater-than, and store the results in mask vector k.
- _mm_cmpgt_epu32_mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for greater-than, and store the results in mask vector k.
- _mm_cmpgt_epu64_mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for greater-than, and store the results in mask vector k.
- _mm_cmple_epi32_mask Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b for less-than-or-equal, and store the results in mask vector k.
- _mm_cmple_epi64_mask Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for less-than-or-equal, and store the results in mask vector k.
- _mm_cmple_epu32_mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for less-than-or-equal, and store the results in mask vector k.
- _mm_cmple_epu64_mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for less-than-or-equal, and store the results in mask vector k.
- _mm_cmplt_epi32_mask Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b for less-than, and store the results in mask vector k.
- _mm_cmplt_epi64_mask Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for less-than, and store the results in mask vector k.
- _mm_cmplt_epu32_mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for less-than, and store the results in mask vector k.
- _mm_cmplt_epu64_mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for less-than, and store the results in mask vector k.
- _mm_cmpneq_epi32_mask Experimental avx512f,avx512vl
- Compare packed 32-bit integers in a and b for not-equal, and store the results in mask vector k.
- _mm_cmpneq_epi64_mask Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for not-equal, and store the results in mask vector k.
- _mm_cmpneq_epu32_mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for not-equal, and store the results in mask vector k.
- _mm_cmpneq_epu64_mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for not-equal, and store the results in mask vector k.
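The integer compare entries above all produce a bitmask with one bit per lane, and the epi/epu variants differ only in whether the lanes are read as signed or unsigned. The following is a minimal scalar sketch of that behaviour for four 32-bit lanes. It is not the intrinsic itself: the names `cmp_mask`, `cmpgt_epi32_mask_model` and `cmpgt_epu32_mask_model` are hypothetical helpers introduced for illustration.

```rust
/// Scalar model of a packed 4-lane compare that yields a bitmask.
/// Bit i of the result is set when `pred(a[i], b[i])` holds.
fn cmp_mask<T: Copy>(a: [T; 4], b: [T; 4], pred: impl Fn(T, T) -> bool) -> u8 {
    let mut k = 0u8;
    for i in 0..4 {
        if pred(a[i], b[i]) {
            k |= 1 << i;
        }
    }
    k
}

/// Signed greater-than, in the spirit of the cmpgt_epi32_mask entry.
fn cmpgt_epi32_mask_model(a: [i32; 4], b: [i32; 4]) -> u8 {
    cmp_mask(a, b, |x, y| x > y)
}

/// Unsigned greater-than: the same bit patterns reinterpreted as u32,
/// in the spirit of the cmpgt_epu32_mask entry.
fn cmpgt_epu32_mask_model(a: [i32; 4], b: [i32; 4]) -> u8 {
    cmp_mask(a, b, |x, y| (x as u32) > (y as u32))
}
```

With a = [-1, 5, 0, 7] and b = [1, 5, -3, 2], the signed model sets bits 2 and 3, while the unsigned model sets bits 0 and 3, because -1 and -3 become very large values when read as unsigned.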
- _mm_comi_round_sd Experimental avx512f
- Compare the lower double-precision (64-bit) floating-point element in a and b based on the comparison operand specified by imm8, and return the boolean result (0 or 1).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_comi_round_ss Experimental avx512f
- Compare the lower single-precision (32-bit) floating-point element in a and b based on the comparison operand specified by imm8, and return the boolean result (0 or 1).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_cvt_roundi32_ss Experimental avx512f
- Convert the signed 32-bit integer b to a single-precision (32-bit) floating-point element, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_cvt_roundsd_i32 Experimental avx512f
- Convert the lower double-precision (64-bit) floating-point element in a to a 32-bit integer, and store the result in dst.
Rounding is done according to the rounding[3:0] parameter.
- _mm_cvt_roundsd_si32 Experimental avx512f
- Convert the lower double-precision (64-bit) floating-point element in a to a 32-bit integer, and store the result in dst.
Rounding is done according to the rounding[3:0] parameter.
- _mm_cvt_roundsd_ss Experimental avx512f
- Convert the lower double-precision (64-bit) floating-point element in b to a single-precision (32-bit) floating-point element, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
Rounding is done according to the rounding[3:0] parameter.
- _mm_cvt_roundsd_u32 Experimental avx512f
- Convert the lower double-precision (64-bit) floating-point element in a to an unsigned 32-bit integer, and store the result in dst.
Rounding is done according to the rounding[3:0] parameter.
- _mm_cvt_roundsi32_ss Experimental avx512f
- Convert the signed 32-bit integer b to a single-precision (32-bit) floating-point element, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_cvt_roundss_i32 Experimental avx512f
- Convert the lower single-precision (32-bit) floating-point element in a to a 32-bit integer, and store the result in dst.
Rounding is done according to the rounding[3:0] parameter.
- _mm_cvt_roundss_sd Experimental avx512f
- Convert the lower single-precision (32-bit) floating-point element in b to a double-precision (64-bit) floating-point element, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_cvt_roundss_si32 Experimental avx512f
- Convert the lower single-precision (32-bit) floating-point element in a to a 32-bit integer, and store the result in dst.
Rounding is done according to the rounding[3:0] parameter.
- _mm_cvt_roundss_u32 Experimental avx512f
- Convert the lower single-precision (32-bit) floating-point element in a to an unsigned 32-bit integer, and store the result in dst.
Rounding is done according to the rounding[3:0] parameter.
- _mm_cvt_roundu32_ss Experimental avx512f
- Convert the unsigned 32-bit integer b to a single-precision (32-bit) floating-point element, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
Rounding is done according to the rounding[3:0] parameter.
- _mm_cvtepi32_epi8 Experimental avx512f,avx512vl
- Convert packed 32-bit integers in a to packed 8-bit integers with truncation, and store the results in dst.
- _mm_cvtepi32_epi16 Experimental avx512f,avx512vl
- Convert packed 32-bit integers in a to packed 16-bit integers with truncation, and store the results in dst.
- _mm_cvtepi64_epi8 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 8-bit integers with truncation, and store the results in dst.
- _mm_cvtepi64_epi16 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 16-bit integers with truncation, and store the results in dst.
- _mm_cvtepi64_epi32 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 32-bit integers with truncation, and store the results in dst.
- _mm_cvtepu32_pd Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.
- _mm_cvti32_sd Experimental avx512f
- Convert the signed 32-bit integer b to a double-precision (64-bit) floating-point element, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.
- _mm_cvti32_ss Experimental avx512f
- Convert the signed 32-bit integer b to a single-precision (32-bit) floating-point element, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_cvtpd_epu32 Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst.
- _mm_cvtps_epu32 Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst.
- _mm_cvtsd_i32 Experimental avx512f
- Convert the lower double-precision (64-bit) floating-point element in a to a 32-bit integer, and store the result in dst.
- _mm_cvtsd_u32 Experimental avx512f
- Convert the lower double-precision (64-bit) floating-point element in a to an unsigned 32-bit integer, and store the result in dst.
- _mm_cvtsepi32_epi8 Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst.
- _mm_cvtsepi32_epi16 Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst.
- _mm_cvtsepi64_epi8 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst.
- _mm_cvtsepi64_epi16 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst.
- _mm_cvtsepi64_epi32 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 32-bit integers with signed saturation, and store the results in dst.
- _mm_cvtss_i32 Experimental avx512f
- Convert the lower single-precision (32-bit) floating-point element in a to a 32-bit integer, and store the result in dst.
- _mm_cvtss_u32 Experimental avx512f
- Convert the lower single-precision (32-bit) floating-point element in a to an unsigned 32-bit integer, and store the result in dst.
- _mm_cvtt_roundsd_i32 Experimental avx512f
- Convert the lower double-precision (64-bit) floating-point element in a to a 32-bit integer with truncation, and store the result in dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_cvtt_roundsd_si32 Experimental avx512f
- Convert the lower double-precision (64-bit) floating-point element in a to a 32-bit integer with truncation, and store the result in dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_cvtt_roundsd_u32 Experimental avx512f
- Convert the lower double-precision (64-bit) floating-point element in a to an unsigned 32-bit integer with truncation, and store the result in dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_cvtt_roundss_i32 Experimental avx512f
- Convert the lower single-precision (32-bit) floating-point element in a to a 32-bit integer with truncation, and store the result in dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_cvtt_roundss_si32 Experimental avx512f
- Convert the lower single-precision (32-bit) floating-point element in a to a 32-bit integer with truncation, and store the result in dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_cvtt_roundss_u32 Experimental avx512f
- Convert the lower single-precision (32-bit) floating-point element in a to an unsigned 32-bit integer with truncation, and store the result in dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_cvttpd_epu32 Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst.
- _mm_cvttps_epu32 Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst.
- _mm_cvttsd_i32 Experimental avx512f
- Convert the lower double-precision (64-bit) floating-point element in a to a 32-bit integer with truncation, and store the result in dst.
- _mm_cvttsd_u32 Experimental avx512f
- Convert the lower double-precision (64-bit) floating-point element in a to an unsigned 32-bit integer with truncation, and store the result in dst.
- _mm_cvttss_i32 Experimental avx512f
- Convert the lower single-precision (32-bit) floating-point element in a to a 32-bit integer with truncation, and store the result in dst.
- _mm_cvttss_u32 Experimental avx512f
- Convert the lower single-precision (32-bit) floating-point element in a to an unsigned 32-bit integer with truncation, and store the result in dst.
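The cvtt entries above convert "with truncation", while the plain cvt entries round according to the current rounding mode. The sketch below contrasts the two on scalars, assuming the default mode of round to nearest with ties to even. The function names are hypothetical, and out-of-range or NaN inputs are not modelled: for those, Rust's `as` cast saturates, which is not claimed to match the hardware result.

```rust
/// Truncating conversion (the cvtt family): drop the fractional part.
fn cvtt_model(x: f64) -> i32 {
    x.trunc() as i32
}

/// Round to nearest, ties to even (assumed default rounding mode).
fn cvt_nearest_even_model(x: f64) -> i32 {
    let r = x.round(); // f64::round rounds ties away from zero
    let is_tie = (x - x.trunc()).abs() == 0.5;
    if is_tie && r % 2.0 != 0.0 {
        // A tie was rounded to an odd value: step back to the even neighbour.
        (r - x.signum()) as i32
    } else {
        r as i32
    }
}
```

For example, 2.7 truncates to 2 but rounds to 3, and the ties 2.5 and 3.5 round to 2 and 4 respectively.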
- _mm_cvtu32_sd Experimental avx512f
- Convert the unsigned 32-bit integer b to a double-precision (64-bit) floating-point element, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.
- _mm_cvtu32_ss Experimental avx512f
- Convert the unsigned 32-bit integer b to a single-precision (32-bit) floating-point element, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_cvtusepi32_epi8 Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst.
- _mm_cvtusepi32_epi16 Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst.
- _mm_cvtusepi64_epi8 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst.
- _mm_cvtusepi64_epi16 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst.
- _mm_cvtusepi64_epi32 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed unsigned 32-bit integers with unsigned saturation, and store the results in dst.
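The narrowing integer conversions in this listing come in three flavours: truncation (cvtepi), signed saturation (cvtsepi) and unsigned saturation (cvtusepi). A per-element scalar sketch of the 32-bit to 8-bit case follows. The helper names are hypothetical and the code models one lane only, not the packed intrinsic.

```rust
/// Truncation (cvtepi32_epi8 style): keep only the low 8 bits.
fn narrow_truncate(x: i32) -> i8 {
    x as i8
}

/// Signed saturation (cvtsepi32_epi8 style): clamp into the i8 range.
fn narrow_saturate_signed(x: i32) -> i8 {
    x.clamp(i8::MIN as i32, i8::MAX as i32) as i8
}

/// Unsigned saturation (cvtusepi32_epi8 style): the input is read as
/// unsigned and clamped into the u8 range.
fn narrow_saturate_unsigned(x: u32) -> u8 {
    x.min(u8::MAX as u32) as u8
}
```

The input 300 shows the difference: truncation yields 44 (300 mod 256), signed saturation yields 127, and unsigned saturation yields 255.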
- _mm_div_round_sd Experimental avx512f
- Divide the lower double-precision (64-bit) floating-point element in a by the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.
- _mm_div_round_ss Experimental avx512f
- Divide the lower single-precision (32-bit) floating-point element in a by the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_fixupimm_pd Experimental avx512f,avx512vl
- Fix up packed double-precision (64-bit) floating-point elements in a and b using packed 64-bit integers in c, and store the results in dst. imm8 is used to set the required flags reporting.
- _mm_fixupimm_ps Experimental avx512f,avx512vl
- Fix up packed single-precision (32-bit) floating-point elements in a and b using packed 32-bit integers in c, and store the results in dst. imm8 is used to set the required flags reporting.
- _mm_fixupimm_round_sd Experimental avx512f
- Fix up the lower double-precision (64-bit) floating-point elements in a and b using the lower 64-bit integer in c, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. imm8 is used to set the required flags reporting.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_fixupimm_round_ss Experimental avx512f
- Fix up the lower single-precision (32-bit) floating-point elements in a and b using the lower 32-bit integer in c, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. imm8 is used to set the required flags reporting.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_fixupimm_sd Experimental avx512f
- Fix up the lower double-precision (64-bit) floating-point elements in a and b using the lower 64-bit integer in c, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. imm8 is used to set the required flags reporting.
- _mm_fixupimm_ss Experimental avx512f
- Fix up the lower single-precision (32-bit) floating-point elements in a and b using the lower 32-bit integer in c, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. imm8 is used to set the required flags reporting.
- _mm_fmadd_round_sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.
- _mm_fmadd_round_ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_fmsub_round_sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.
- _mm_fmsub_round_ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_fnmadd_round_sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.
- _mm_fnmadd_round_ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_fnmsub_round_sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.
- _mm_fnmsub_round_ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, subtract the lower element in c from the negated intermediate result, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
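The four scalar fused multiply-add variants above differ only in the signs applied to the product and to c. The sketch below spells them out with `f64::mul_add` and shows the "lower lane computed, upper lane copied from a" pattern. All function names are hypothetical, and the explicit rounding-mode parameter of the round variants is not modelled.

```rust
/// a*b + c
fn fmadd(a: f64, b: f64, c: f64) -> f64 {
    a.mul_add(b, c)
}

/// a*b - c
fn fmsub(a: f64, b: f64, c: f64) -> f64 {
    a.mul_add(b, -c)
}

/// -(a*b) + c
fn fnmadd(a: f64, b: f64, c: f64) -> f64 {
    (-a).mul_add(b, c)
}

/// -(a*b) - c
fn fnmsub(a: f64, b: f64, c: f64) -> f64 {
    (-a).mul_add(b, -c)
}

/// Two-lane model of the sd form: lane 0 holds the fused result,
/// lane 1 is copied unchanged from a.
fn fmadd_sd_model(a: [f64; 2], b: [f64; 2], c: [f64; 2]) -> [f64; 2] {
    [fmadd(a[0], b[0], c[0]), a[1]]
}
```

With a = 2, b = 3 and c = 4 the four variants give 10, 2, -2 and -10.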
- _mm_getexp_pd Experimental avx512f,avx512vl
- Convert the exponent of each packed double-precision (64-bit) floating-point element in a to a double-precision (64-bit) floating-point number representing the integer exponent, and store the results in dst. This intrinsic essentially calculates floor(log2(x)) for each element.
- _mm_getexp_ps Experimental avx512f,avx512vl
- Convert the exponent of each packed single-precision (32-bit) floating-point element in a to a single-precision (32-bit) floating-point number representing the integer exponent, and store the results in dst. This intrinsic essentially calculates floor(log2(x)) for each element.
- _mm_getexp_round_sd Experimental avx512f
- Convert the exponent of the lower double-precision (64-bit) floating-point element in b to a double-precision (64-bit) floating-point number representing the integer exponent, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_getexp_round_ss Experimental avx512f
- Convert the exponent of the lower single-precision (32-bit) floating-point element in b to a single-precision (32-bit) floating-point number representing the integer exponent, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_getexp_sd Experimental avx512f
- Convert the exponent of the lower double-precision (64-bit) floating-point element in b to a double-precision (64-bit) floating-point number representing the integer exponent, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element.
- _mm_getexp_ss Experimental avx512f
- Convert the exponent of the lower single-precision (32-bit) floating-point element in b to a single-precision (32-bit) floating-point number representing the integer exponent, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element.
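The getexp entries describe the result as floor(log2(x)) returned as a floating-point number. For a normal, finite, non-zero double this is just the unbiased exponent field, as the scalar sketch below shows. `getexp_model` is a hypothetical helper, and zero, denormal, infinite and NaN inputs are deliberately not handled here.

```rust
/// floor(log2(|x|)) for a normal, finite, non-zero f64, read directly
/// from the 11-bit exponent field (bias 1023). Special values are not
/// modelled.
fn getexp_model(x: f64) -> f64 {
    let biased = ((x.to_bits() >> 52) & 0x7ff) as i64;
    (biased - 1023) as f64
}
```

For example, 10.0 = 1.25 * 2^3 gives 3.0, and 0.75 = 1.5 * 2^-1 gives -1.0. The sign of the input does not affect the result.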
- _mm_getmant_pd Experimental avx512f,avx512vl
- Normalize the mantissas of packed double-precision (64-bit) floating-point elements in a, and store the results in dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_p5_2 // interval [0.5, 2)
_MM_MANT_NORM_p5_1 // interval [0.5, 1)
_MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_src // sign = sign(src)
_MM_MANT_SIGN_zero // sign = 0
_MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1
- _mm_getmant_ps Experimental avx512f,avx512vl
- Normalize the mantissas of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_p5_2 // interval [0.5, 2)
_MM_MANT_NORM_p5_1 // interval [0.5, 1)
_MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_src // sign = sign(src)
_MM_MANT_SIGN_zero // sign = 0
_MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1
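As a rough illustration of what the getmant entries compute, the sketch below normalizes a scalar double into [1, 2) and [0.5, 1) while keeping the source sign (the sign = sign(src) case). The helper names are hypothetical. Only normal, finite, non-zero inputs are modelled, and the other interval and sign options are left out.

```rust
/// Significand of a normal, finite, non-zero f64 scaled into [1, 2),
/// with the source sign kept. Works by overwriting the exponent field
/// with the bias (1023), i.e. an exponent of zero.
fn getmant_1_2(x: f64) -> f64 {
    let sign_and_fraction = x.to_bits() & 0x800f_ffff_ffff_ffff;
    f64::from_bits(sign_and_fraction | (1023u64 << 52))
}

/// Interval [0.5, 1): the same significand, one power of two lower.
fn getmant_p5_1(x: f64) -> f64 {
    getmant_1_2(x) * 0.5
}
```

For 10.0 the two functions return 1.25 and 0.625. Multiplying the [1, 2) mantissa by 2 raised to floor(log2(|x|)) recovers the input: 1.25 * 8 = 10.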
- _mm_getmant_round_sd Experimental avx512f
- Normalize the mantissas of the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_p5_2 // interval [0.5, 2)
_MM_MANT_NORM_p5_1 // interval [0.5, 1)
_MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_src // sign = sign(src)
_MM_MANT_SIGN_zero // sign = 0
_MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_getmant_round_ss Experimental avx512f
- Normalize the mantissas of the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_p5_2 // interval [0.5, 2)
_MM_MANT_NORM_p5_1 // interval [0.5, 1)
_MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_src // sign = sign(src)
_MM_MANT_SIGN_zero // sign = 0
_MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_getmant_sd Experimental avx512f
- Normalize the mantissas of the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_p5_2 // interval [0.5, 2)
_MM_MANT_NORM_p5_1 // interval [0.5, 1)
_MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_src // sign = sign(src)
_MM_MANT_SIGN_zero // sign = 0
_MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1
- _mm_getmant_ss Experimental avx512f
- Normalize the mantissas of the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_p5_2 // interval [0.5, 2)
_MM_MANT_NORM_p5_1 // interval [0.5, 1)
_MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_src // sign = sign(src)
_MM_MANT_SIGN_zero // sign = 0
_MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1
- _mm_i32scatter_epi32 Experimental avx512f,avx512vl
- Stores 4 32-bit integer elements from a to memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale
- _mm_i32scatter_epi64 Experimental avx512f,avx512vl
- Stores 2 64-bit integer elements from a to memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale
- _mm_i32scatter_pd Experimental avx512f,avx512vl
- Stores 2 double-precision (64-bit) floating-point elements from a to memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale
- _mm_i32scatter_ps Experimental avx512f,avx512vl
- Stores 4 single-precision (32-bit) floating-point elements from a to memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale
- _mm_i64scatter_epi32 Experimental avx512f,avx512vl
- Stores 2 32-bit integer elements from a to memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale
- _mm_i64scatter_epi64 Experimental avx512f,avx512vl
- Stores 2 64-bit integer elements from a to memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale
- _mm_i64scatter_pd Experimental avx512f,avx512vl
- Stores 2 double-precision (64-bit) floating-point elements from a to memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale
- _mm_i64scatter_ps Experimental avx512f,avx512vl
- Stores 2 single-precision (32-bit) floating-point elements from a to memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale
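The scatter entries above write each element of a to its own address, computed as base_addr plus the lane's index times scale. A safe scalar sketch of the four-lane 32-bit case follows. It uses a byte slice in place of raw memory and assumes non-negative indices and a scale given in bytes. `i32scatter_epi32_model` and `read_i32` are hypothetical helpers, not part of this module.

```rust
/// Scalar model of a 4-lane i32 scatter: lane i is written at byte
/// offset vindex[i] * scale from the start of `mem`.
/// Assumes every index is non-negative and every write is in bounds.
fn i32scatter_epi32_model(mem: &mut [u8], vindex: [i32; 4], a: [i32; 4], scale: usize) {
    for i in 0..4 {
        let offset = vindex[i] as usize * scale;
        mem[offset..offset + 4].copy_from_slice(&a[i].to_le_bytes());
    }
}

/// Read back the little-endian i32 stored at a byte offset.
fn read_i32(mem: &[u8], offset: usize) -> i32 {
    i32::from_le_bytes([mem[offset], mem[offset + 1], mem[offset + 2], mem[offset + 3]])
}
```

With indices [0, 3, 1, 6] and a scale of 4, the values land in 32-bit slots 0, 3, 1 and 6 of the buffer, and the untouched slots keep their previous contents.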
- _mm_load_epi32 Experimental avx512f,avx512vl
- Load 128-bits (composed of 4 packed 32-bit integers) from memory into dst. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_load_epi64 Experimental avx512f,avx512vl
- Load 128-bits (composed of 2 packed 64-bit integers) from memory into dst. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_loadu_epi32 Experimental avx512f,avx512vl
- Load 128-bits (composed of 4 packed 32-bit integers) from memory into dst. mem_addr does not need to be aligned on any particular boundary.
- _mm_loadu_epi64 Experimental avx512f,avx512vl
- Load 128-bits (composed of 2 packed 64-bit integers) from memory into dst. mem_addr does not need to be aligned on any particular boundary.
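The only difference between the load and loadu entries is the 16-byte alignment requirement on mem_addr. One way to reason about it is sketched below: a wrapper type with `#[repr(align(16))]` guarantees a suitably aligned buffer, and a pointer check decides which form would be appropriate. `Aligned16` and `is_16_byte_aligned` are hypothetical names introduced for illustration.

```rust
/// A buffer whose start address is guaranteed to be 16-byte aligned.
#[repr(align(16))]
struct Aligned16([i32; 8]);

/// True when the address satisfies the alignment the aligned loads require.
fn is_16_byte_aligned<T>(p: *const T) -> bool {
    (p as usize) % 16 == 0
}
```

The first element of an `Aligned16` passes the check, while a pointer to its second element (4 bytes further on) does not, so only the unaligned load form would be safe for it.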
- _mm_mask2_permutex2var_epi32 Experimental avx512f,avx512vl
- Shuffle 32-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set).
- _mm_mask2_permutex2var_epi64 Experimental avx512f,avx512vl
- Shuffle 64-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set).
- _mm_mask2_permutex2var_pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set)
- _mm_mask2_permutex2var_ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set).
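The mask2 permutex2var entries combine a two-source shuffle with a writemask whose fallback is idx itself. The scalar sketch below models the four-lane 32-bit integer case, assuming the usual encoding in which the low two bits of each idx lane pick the element and the next bit picks the source (0 for a, 1 for b). The function name is hypothetical.

```rust
/// Scalar model of a masked two-source shuffle over four 32-bit lanes.
/// For each lane with its mask bit set: bits 1:0 of idx select the
/// element, bit 2 selects the source (0 = a, 1 = b).
/// Lanes with a clear mask bit keep the value from idx.
fn mask2_permutex2var_epi32_model(a: [i32; 4], idx: [i32; 4], k: u8, b: [i32; 4]) -> [i32; 4] {
    let mut dst = idx;
    for i in 0..4 {
        if ((k >> i) & 1) == 1 {
            let lane = (idx[i] & 0b011) as usize;
            let from_b = (idx[i] & 0b100) != 0;
            dst[i] = if from_b { b[lane] } else { a[lane] };
        }
    }
    dst
}
```

With a = [10, 11, 12, 13], b = [20, 21, 22, 23], idx = [0, 5, 3, 6] and k = 0b1011, lanes 0, 1 and 3 are shuffled to 10, 21 and 22, and lane 2 keeps its idx value 3.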
- _mm_mask3_fmadd_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm_mask3_fmadd_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm_mask3_fmadd_round_sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper element from c to the upper element of dst.
- _mm_mask3_fmadd_round_ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper 3 packed elements from c to the upper elements of dst.
- _mm_
mask3_ fmadd_ sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper element from c to the upper element of dst.
- _mm_
mask3_ fmadd_ ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper 3 packed elements from c to the upper elements of dst.
- _mm_
mask3_ fmaddsub_ pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm_
mask3_ fmaddsub_ ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm_
mask3_ fmsub_ pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm_
mask3_ fmsub_ ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm_
mask3_ fmsub_ round_ sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper element from c to the upper element of dst.
- _mm_
mask3_ fmsub_ round_ ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper 3 packed elements from c to the upper elements of dst.
- _mm_
mask3_ fmsub_ sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper element from c to the upper element of dst.
- _mm_
mask3_ fmsub_ ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper 3 packed elements from c to the upper elements of dst.
- _mm_
mask3_ fmsubadd_ pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm_
mask3_ fmsubadd_ ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm_
mask3_ fnmadd_ pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm_
mask3_ fnmadd_ ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm_
mask3_ fnmadd_ round_ sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper element from c to the upper element of dst.
- _mm_
mask3_ fnmadd_ round_ ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper 3 packed elements from c to the upper elements of dst.
- _mm_
mask3_ fnmadd_ sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper element from c to the upper element of dst.
- _mm_
mask3_ fnmadd_ ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper 3 packed elements from c to the upper elements of dst.
- _mm_
mask3_ fnmsub_ pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm_
mask3_ fnmsub_ ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set).
- _mm_
mask3_ fnmsub_ round_ sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper element from c to the upper element of dst.
- _mm_
mask3_ fnmsub_ round_ ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper 3 packed elements from c to the upper elements of dst.
- _mm_
mask3_ fnmsub_ sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper element from c to the upper element of dst.
- _mm_
mask3_ fnmsub_ ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper 3 packed elements from c to the upper elements of dst.
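The `mask3` entries above share one rule: a lane whose mask bit is clear keeps the value from c. The sketch below is a plain scalar model of that rule for two double-precision lanes, not the intrinsic itself; the function names are hypothetical and exist only for illustration.

```rust
/// Scalar model (hypothetical helper) of a 2-lane `mask3` fused multiply-add:
/// lane i becomes a[i] * b[i] + c[i] when bit i of k is set, otherwise c[i].
fn mask3_fmadd_pd_model(a: [f64; 2], b: [f64; 2], c: [f64; 2], k: u8) -> [f64; 2] {
    let mut dst = c; // unselected lanes keep the value from c
    for i in 0..2 {
        if (k >> i) & 1 == 1 {
            // mul_add rounds once, like a fused multiply-add
            dst[i] = a[i].mul_add(b[i], c[i]);
        }
    }
    dst
}

/// Scalar model (hypothetical helper) of the `_sd` form: only mask bit 0
/// matters, and the upper element always comes from c.
fn mask3_fmadd_sd_model(a: [f64; 2], b: [f64; 2], c: [f64; 2], k: u8) -> [f64; 2] {
    let lower = if k & 1 == 1 { a[0].mul_add(b[0], c[0]) } else { c[0] };
    [lower, c[1]]
}
```

For example, with a = [2, 3], b = [4, 5], c = [1, 10] and k = 0b01, the packed model yields [9, 10]: lane 0 is computed and lane 1 is passed through from c.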
- _mm_
mask_ abs_ epi32 Experimental avx512f,avx512vl
- Compute the absolute value of packed signed 32-bit integers in a, and store the unsigned results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ abs_ epi64 Experimental avx512f,avx512vl
- Compute the absolute value of packed signed 64-bit integers in a, and store the unsigned results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ add_ epi32 Experimental avx512f,avx512vl
- Add packed 32-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ add_ epi64 Experimental avx512f,avx512vl
- Add packed 64-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ add_ pd Experimental avx512f,avx512vl
- Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ add_ ps Experimental avx512f,avx512vl
- Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ add_ round_ sd Experimental avx512f
- Add the lower double-precision (64-bit) floating-point element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ add_ round_ ss Experimental avx512f
- Add the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
mask_ add_ sd Experimental avx512f
- Add the lower double-precision (64-bit) floating-point element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ add_ ss Experimental avx512f
- Add the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
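The writemask wording in the add entries above can be read as a per-lane select between the computed sum and src. The following is a minimal scalar sketch of that reading, not the intrinsic; the function names are hypothetical, and wrapping integer addition is assumed because packed integer adds do not trap on overflow.

```rust
/// Scalar model (hypothetical helper) of a 4-lane writemasked 32-bit add:
/// lane i is a[i] + b[i] when bit i of k is set, otherwise src[i].
fn mask_add_epi32_model(src: [i32; 4], k: u8, a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    let mut dst = src; // lanes with a clear mask bit keep src
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i].wrapping_add(b[i]); // assumption: overflow wraps
        }
    }
    dst
}

/// Scalar model (hypothetical helper) of the `_sd` form: the lower element
/// is a[0] + b[0] or src[0] depending on mask bit 0; the upper comes from a.
fn mask_add_sd_model(src: [f64; 2], k: u8, a: [f64; 2], b: [f64; 2]) -> [f64; 2] {
    let lower = if k & 1 == 1 { a[0] + b[0] } else { src[0] };
    [lower, a[1]]
}
```

Note the difference between the two forms: the packed version falls back to src in every unselected lane, while the scalar version takes its upper element from a regardless of the mask.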
- _mm_
mask_ alignr_ epi32 Experimental avx512f,avx512vl
- Concatenate a and b into a 32-byte intermediate result, shift the result right by imm8 32-bit elements, and store the low 16 bytes (4 elements) in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ alignr_ epi64 Experimental avx512f,avx512vl
- Concatenate a and b into a 32-byte intermediate result, shift the result right by imm8 64-bit elements, and store the low 16 bytes (2 elements) in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
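The concatenate-and-shift step of the alignr entries is easier to follow with concrete lanes. The sketch below is a scalar model for the 32-bit form, under two stated assumptions: a occupies the high half of the concatenation and b the low half, and only the low two bits of imm8 are used for the 128-bit width. The function name is hypothetical.

```rust
/// Scalar model (hypothetical helper) of a writemasked 4-lane alignr on
/// 32-bit elements. Assumes b is the low half and a the high half of the
/// concatenation, and that only imm8's low two bits select the shift.
fn mask_alignr_epi32_model(
    src: [i32; 4],
    k: u8,
    a: [i32; 4],
    b: [i32; 4],
    imm8: u32,
) -> [i32; 4] {
    // Build the 8-element intermediate: b in lanes 0..4, a in lanes 4..8.
    let mut tmp = [0i32; 8];
    tmp[..4].copy_from_slice(&b);
    tmp[4..].copy_from_slice(&a);

    let shift = (imm8 & 3) as usize; // element shift, 0..=3
    let mut dst = src;
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[i] = tmp[i + shift]; // "shift right" moves higher lanes down
        }
    }
    dst
}
```

With a = [10, 11, 12, 13], b = [0, 1, 2, 3], imm8 = 1 and a full mask, the model returns [1, 2, 3, 10]: three elements of b followed by the lowest element of a.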
- _mm_
mask_ and_ epi32 Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 32-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ and_ epi64 Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 64-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ andnot_ epi32 Experimental avx512f,avx512vl
- Compute the bitwise NOT of packed 32-bit integers in a and then AND with b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ andnot_ epi64 Experimental avx512f,avx512vl
- Compute the bitwise NOT of packed 64-bit integers in a and then AND with b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ blend_ epi32 Experimental avx512f,avx512vl
- Blend packed 32-bit integers from a and b using control mask k, and store the results in dst.
- _mm_
mask_ blend_ epi64 Experimental avx512f,avx512vl
- Blend packed 64-bit integers from a and b using control mask k, and store the results in dst.
- _mm_
mask_ blend_ pd Experimental avx512f,avx512vl
- Blend packed double-precision (64-bit) floating-point elements from a and b using control mask k, and store the results in dst.
- _mm_
mask_ blend_ ps Experimental avx512f,avx512vl
- Blend packed single-precision (32-bit) floating-point elements from a and b using control mask k, and store the results in dst.
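The blend entries do not say which operand a set mask bit selects. The scalar sketch below assumes the usual convention that a set bit picks the lane from b and a clear bit picks it from a; the function name is hypothetical and the code is a model, not the intrinsic.

```rust
/// Scalar model (hypothetical helper) of a 4-lane mask blend.
/// Assumption: a set bit in k selects b[i], a clear bit selects a[i].
fn mask_blend_epi32_model(k: u8, a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    let mut dst = a;
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[i] = b[i];
        }
    }
    dst
}
```

Unlike the writemasked operations in this listing, a blend has no separate src operand: every output lane comes from either a or b.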
- _mm_
mask_ broadcastd_ epi32 Experimental avx512f,avx512vl
- Broadcast the low packed 32-bit integer from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ broadcastq_ epi64 Experimental avx512f,avx512vl
- Broadcast the low packed 64-bit integer from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ broadcastss_ ps Experimental avx512f,avx512vl
- Broadcast the low single-precision (32-bit) floating-point element from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cmp_ epi32_ mask Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmp_ epi64_ mask Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmp_ epu32_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmp_ epu64_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmp_ pd_ mask Experimental avx512f,avx512vl
- Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmp_ ps_ mask Experimental avx512f,avx512vl
- Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmp_ round_ sd_ mask Experimental avx512f
- Compare the lower double-precision (64-bit) floating-point element in a and b based on the comparison operand specified by imm8, and store the result in mask vector k using zeromask k1 (the element is zeroed out when mask bit 0 is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
mask_ cmp_ round_ ss_ mask Experimental avx512f
- Compare the lower single-precision (32-bit) floating-point element in a and b based on the comparison operand specified by imm8, and store the result in mask vector k using zeromask k1 (the element is zeroed out when mask bit 0 is not set).
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
mask_ cmp_ sd_ mask Experimental avx512f
- Compare the lower double-precision (64-bit) floating-point element in a and b based on the comparison operand specified by imm8, and store the result in mask vector k using zeromask k1 (the element is zeroed out when mask bit 0 is not set).
- _mm_
mask_ cmp_ ss_ mask Experimental avx512f
- Compare the lower single-precision (32-bit) floating-point element in a and b based on the comparison operand specified by imm8, and store the result in mask vector k using zeromask k1 (the element is zeroed out when mask bit 0 is not set).
- _mm_
mask_ cmpeq_ epi32_ mask Experimental avx512f,avx512vl
- Compare packed 32-bit integers in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmpeq_ epi64_ mask Experimental avx512f,avx512vl
- Compare packed 64-bit integers in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmpeq_ epu32_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmpeq_ epu64_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmpge_ epi32_ mask Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmpge_ epi64_ mask Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmpge_ epu32_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmpge_ epu64_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmpgt_ epi32_ mask Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b for greater-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmpgt_ epi64_ mask Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for greater-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmpgt_ epu32_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for greater-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmpgt_ epu64_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for greater-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmple_ epi32_ mask Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b for less-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmple_ epi64_ mask Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for less-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmple_ epu32_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for less-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmple_ epu64_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for less-than-or-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmplt_ epi32_ mask Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b for less-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmplt_ epi64_ mask Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for less-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmplt_ epu32_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for less-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmplt_ epu64_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for less-than, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmpneq_ epi32_ mask Experimental avx512f,avx512vl
- Compare packed 32-bit integers in a and b for not-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmpneq_ epi64_ mask Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b for not-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmpneq_ epu32_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b for not-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
mask_ cmpneq_ epu64_ mask Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b for not-equal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set).
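The comparison entries above all produce a bitmask rather than a vector, and the zeromask k1 clears any result bit whose k1 bit is not set. A scalar sketch of the less-than case follows, in signed and unsigned form to show why the `epi32` and `epu32` variants can disagree on the same bits. The function names are hypothetical.

```rust
/// Scalar model (hypothetical helper) of a zeromasked signed less-than
/// compare on 4 lanes: result bit i is set only if k1 bit i is set and
/// a[i] < b[i].
fn mask_cmplt_epi32_mask_model(k1: u8, a: [i32; 4], b: [i32; 4]) -> u8 {
    let mut k = 0u8;
    for i in 0..4 {
        if (k1 >> i) & 1 == 1 && a[i] < b[i] {
            k |= 1 << i;
        }
    }
    k
}

/// Same model, but lanes are reinterpreted as unsigned before comparing.
fn mask_cmplt_epu32_mask_model(k1: u8, a: [i32; 4], b: [i32; 4]) -> u8 {
    let mut k = 0u8;
    for i in 0..4 {
        if (k1 >> i) & 1 == 1 && (a[i] as u32) < (b[i] as u32) {
            k |= 1 << i;
        }
    }
    k
}
```

With a = [-1, 5, 0, 7] and b = [0, 5, 1, 3], the signed model reports lanes 0 and 2, while the unsigned model reports only lane 2, because -1 reinterpreted as unsigned is the largest 32-bit value.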
- _mm_
mask_ compress_ epi32 Experimental avx512f,avx512vl
- Contiguously store the active 32-bit integers in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src.
- _mm_
mask_ compress_ epi64 Experimental avx512f,avx512vl
- Contiguously store the active 64-bit integers in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src.
- _mm_
mask_ compress_ pd Experimental avx512f,avx512vl
- Contiguously store the active double-precision (64-bit) floating-point elements in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src.
- _mm_
mask_ compress_ ps Experimental avx512f,avx512vl
- Contiguously store the active single-precision (32-bit) floating-point elements in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src.
- _mm_
mask_ compressstoreu_ epi32 Experimental avx512f,avx512vl
- Contiguously store the active 32-bit integers in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm_
mask_ compressstoreu_ epi64 Experimental avx512f,avx512vl
- Contiguously store the active 64-bit integers in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm_
mask_ compressstoreu_ pd Experimental avx512f,avx512vl
- Contiguously store the active double-precision (64-bit) floating-point elements in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm_
mask_ compressstoreu_ ps Experimental avx512f,avx512vl
- Contiguously store the active single-precision (32-bit) floating-point elements in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
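"Contiguously store the active elements" means the selected lanes are packed toward the low end, in order. The scalar sketch below models both forms described above: the register form, which fills the remaining upper lanes from src, and the store form, which writes only the packed elements to memory. The function names are hypothetical, and the returned element count in the store model is an addition for illustration (the entries above do not describe a return value).

```rust
/// Scalar model (hypothetical helper) of a 4-lane compress: active lanes of
/// `a` are packed into the low lanes of the result; the rest come from src.
fn mask_compress_epi32_model(src: [i32; 4], k: u8, a: [i32; 4]) -> [i32; 4] {
    let mut dst = src;
    let mut n = 0; // next free low lane
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[n] = a[i];
            n += 1;
        }
    }
    dst
}

/// Scalar model (hypothetical helper) of the store form: packed active lanes
/// are written to the start of `out`; later slots are left untouched.
/// Returns how many elements were written (illustrative only).
fn mask_compressstoreu_epi32_model(out: &mut [i32], k: u8, a: [i32; 4]) -> usize {
    let mut n = 0;
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            out[n] = a[i];
            n += 1;
        }
    }
    n
}
```

For a = [1, 2, 3, 4] and k = 0b1010, the register model with src = [100, 200, 300, 400] gives [2, 4, 300, 400]: the two active elements move down and the upper two lanes pass through from src.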
- _mm_
mask_ cvt_ roundps_ ph Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed half-precision (16-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm_
mask_ cvt_ roundsd_ ss Experimental avx512f
- Convert the lower double-precision (64-bit) floating-point element in b to a single-precision (32-bit) floating-point element, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
Rounding is done according to the rounding[3:0] parameter.
- _mm_
mask_ cvt_ roundss_ sd Experimental avx512f
- Convert the lower single-precision (32-bit) floating-point element in b to a double-precision (64-bit) floating-point element, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
mask_ cvtepi8_ epi32 Experimental avx512f,avx512vl
- Sign extend packed 8-bit integers in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtepi8_ epi64 Experimental avx512f,avx512vl
- Sign extend packed 8-bit integers in the low 2 bytes of a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtepi16_ epi32 Experimental avx512f,avx512vl
- Sign extend packed 16-bit integers in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtepi16_ epi64 Experimental avx512f,avx512vl
- Sign extend packed 16-bit integers in a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtepi32_ epi8 Experimental avx512f,avx512vl
- Convert packed 32-bit integers in a to packed 8-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtepi32_ epi16 Experimental avx512f,avx512vl
- Convert packed 32-bit integers in a to packed 16-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtepi32_ epi64 Experimental avx512f,avx512vl
- Sign extend packed 32-bit integers in a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtepi32_ pd Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtepi32_ ps Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtepi32_ storeu_ epi8 Experimental avx512f,avx512vl
- Convert packed 32-bit integers in a to packed 8-bit integers with truncation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm_
mask_ cvtepi32_ storeu_ epi16 Experimental avx512f,avx512vl
- Convert packed 32-bit integers in a to packed 16-bit integers with truncation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm_
mask_ cvtepi64_ epi8 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 8-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtepi64_ epi16 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 16-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtepi64_ epi32 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtepi64_ storeu_ epi8 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 8-bit integers with truncation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm_
mask_ cvtepi64_ storeu_ epi16 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 16-bit integers with truncation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm_
mask_ cvtepi64_ storeu_ epi32 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 32-bit integers with truncation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
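"With truncation" in the narrowing conversions above means the high bits of each element are discarded, with no saturation. The sketch below is a scalar model of the register form for two 64-bit lanes narrowed to 8 bits; it models only the two converted lanes, not the rest of the destination register, and the function name is hypothetical.

```rust
/// Scalar model (hypothetical helper) of a writemasked 64-bit to 8-bit
/// truncating conversion on 2 lanes. Only the converted lanes are modeled.
fn mask_cvtepi64_epi8_model(src: [i8; 2], k: u8, a: [i64; 2]) -> [i8; 2] {
    let mut dst = src; // lanes with a clear mask bit keep src
    for i in 0..2 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i] as i8; // `as` keeps the low 8 bits, no saturation
        }
    }
    dst
}
```

For example, 0x1234 truncates to 0x34 (52) and 300 (0x12C) truncates to 0x2C (44); a saturating conversion would instead clamp both to 127.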
- _mm_
mask_ cvtepu8_ epi32 Experimental avx512f,avx512vl
- Zero extend packed unsigned 8-bit integers in the low 4 bytes of a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtepu8_ epi64 Experimental avx512f,avx512vl
- Zero extend packed unsigned 8-bit integers in the low 2 bytes of a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtepu16_ epi32 Experimental avx512f,avx512vl
- Zero extend packed unsigned 16-bit integers in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtepu16_ epi64 Experimental avx512f,avx512vl
- Zero extend packed unsigned 16-bit integers in the low 4 bytes of a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtepu32_ epi64 Experimental avx512f,avx512vl
- Zero extend packed unsigned 32-bit integers in a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtepu32_ pd Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtpd_ epi32 Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtpd_ epu32 Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtpd_ ps Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtph_ ps Experimental avx512f,avx512vl
- Convert packed half-precision (16-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtps_ epi32 Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtps_ epu32 Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtps_ ph Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed half-precision (16-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm_
mask_ cvtsd_ ss Experimental avx512f
- Convert the lower double-precision (64-bit) floating-point element in b to a single-precision (32-bit) floating-point element, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
mask_ cvtsepi32_ epi8 Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtsepi32_ epi16 Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtsepi32_ storeu_ epi8 Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed 8-bit integers with signed saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm_
mask_ cvtsepi32_ storeu_ epi16 Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed 16-bit integers with signed saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm_
mask_ cvtsepi64_ epi8 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtsepi64_ epi16 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtsepi64_ epi32 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 32-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
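Signed saturation differs from truncation in that out-of-range inputs clamp to the nearest representable value instead of wrapping. A scalar sketch of that behaviour, using a hypothetical helper name and arrays in place of vector registers:

```rust
/// Scalar model of a write-masked narrowing with signed saturation
/// (i64 lanes to i32 lanes).
/// Hypothetical helper for illustration, not part of `core::arch`.
fn mask_saturate_i64_to_i32(src: [i32; 2], k: u8, a: [i64; 2]) -> [i32; 2] {
    let mut dst = src;
    for i in 0..2 {
        if (k >> i) & 1 == 1 {
            // Clamp into the i32 range first, then narrow without loss.
            dst[i] = a[i].clamp(i32::MIN as i64, i32::MAX as i64) as i32;
        }
    }
    dst
}
```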
- _mm_
mask_ cvtsepi64_ storeu_ epi8 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 8-bit integers with signed saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm_
mask_ cvtsepi64_ storeu_ epi16 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 16-bit integers with signed saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm_
mask_ cvtsepi64_ storeu_ epi32 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 32-bit integers with signed saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm_
mask_ cvtss_ sd Experimental avx512f
- Convert the lower single-precision (32-bit) floating-point element in b to a double-precision (64-bit) floating-point element, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ cvttpd_ epi32 Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvttpd_ epu32 Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvttps_ epi32 Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvttps_ epu32 Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtusepi32_ epi8 Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtusepi32_ epi16 Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtusepi32_ storeu_ epi8 Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed 8-bit integers with unsigned saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm_
mask_ cvtusepi32_ storeu_ epi16 Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm_
mask_ cvtusepi64_ epi8 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtusepi64_ epi16 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ cvtusepi64_ epi32 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed unsigned 32-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
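Unsigned saturation clamps only at the upper bound, since the inputs cannot be negative. A scalar sketch with a hypothetical helper name:

```rust
/// Scalar model of a write-masked narrowing with unsigned saturation
/// (u64 lanes to u32 lanes).
/// Hypothetical helper for illustration, not part of `core::arch`.
fn mask_saturate_u64_to_u32(src: [u32; 2], k: u8, a: [u64; 2]) -> [u32; 2] {
    let mut dst = src;
    for i in 0..2 {
        if (k >> i) & 1 == 1 {
            // Values above u32::MAX become u32::MAX instead of wrapping.
            dst[i] = u32::try_from(a[i]).unwrap_or(u32::MAX);
        }
    }
    dst
}
```

For the input `0x1_0000_0000`, this model yields `u32::MAX`, whereas a truncating conversion would yield 0.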
- _mm_
mask_ cvtusepi64_ storeu_ epi8 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed 8-bit integers with unsigned saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm_
mask_ cvtusepi64_ storeu_ epi16 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed 16-bit integers with unsigned saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm_
mask_ cvtusepi64_ storeu_ epi32 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed 32-bit integers with unsigned saturation, and store the active results (those with their respective bit set in writemask k) to unaligned memory at base_addr.
- _mm_
mask_ div_ pd Experimental avx512f,avx512vl
- Divide packed double-precision (64-bit) floating-point elements in a by packed elements in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ div_ ps Experimental avx512f,avx512vl
- Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ div_ round_ sd Experimental avx512f
- Divide the lower double-precision (64-bit) floating-point element in a by the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ div_ round_ ss Experimental avx512f
- Divide the lower single-precision (32-bit) floating-point element in a by the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
mask_ div_ sd Experimental avx512f
- Divide the lower double-precision (64-bit) floating-point element in a by the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ div_ ss Experimental avx512f
- Divide the lower single-precision (32-bit) floating-point element in a by the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
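The scalar (`_ss` / `_sd`) forms above follow a common pattern: only the lowest lane is computed and only mask bit 0 matters, while the upper lanes always come from `a`. A scalar sketch of the single-precision divide, with a hypothetical helper name and arrays standing in for vector registers:

```rust
/// Scalar model of a masked lower-lane divide on 4 x f32.
/// Lane 0: `a[0] / b[0]` if mask bit 0 is set, otherwise `src[0]`.
/// Lanes 1..=3: always copied from `a`.
/// Hypothetical helper for illustration, not part of `core::arch`.
fn mask_div_ss(src: [f32; 4], k: u8, a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let low = if k & 1 == 1 { a[0] / b[0] } else { src[0] };
    [low, a[1], a[2], a[3]]
}
```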
- _mm_
mask_ expand_ epi32 Experimental avx512f,avx512vl
- Load contiguous active 32-bit integers from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ expand_ epi64 Experimental avx512f,avx512vl
- Load contiguous active 64-bit integers from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ expand_ pd Experimental avx512f,avx512vl
- Load contiguous active double-precision (64-bit) floating-point elements from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ expand_ ps Experimental avx512f,avx512vl
- Load contiguous active single-precision (32-bit) floating-point elements from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
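The expand operations read elements contiguously from the low end of `a` and scatter them into the lanes whose mask bit is set. A scalar sketch of that data movement for four `f32` lanes, with a hypothetical helper name:

```rust
/// Scalar model of a write-masked expand on 4 x f32.
/// Hypothetical helper for illustration, not part of `core::arch`.
fn mask_expand_ps(src: [f32; 4], k: u8, a: [f32; 4]) -> [f32; 4] {
    let mut dst = src; // inactive lanes keep the value from `src`
    let mut next = 0; // index of the next unread low element of `a`
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[next];
            next += 1;
        }
    }
    dst
}
```

With `k = 0b1010`, the two lowest elements of `a` land in lanes 1 and 3, and lanes 0 and 2 are taken from `src`.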
- _mm_
mask_ expandloadu_ epi32 Experimental avx512f,avx512vl
- Load contiguous active 32-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ expandloadu_ epi64 Experimental avx512f,avx512vl
- Load contiguous active 64-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ expandloadu_ pd Experimental avx512f,avx512vl
- Load contiguous active double-precision (64-bit) floating-point elements from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ expandloadu_ ps Experimental avx512f,avx512vl
- Load contiguous active single-precision (32-bit) floating-point elements from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ fixupimm_ pd Experimental avx512f,avx512vl
- Fix up packed double-precision (64-bit) floating-point elements in a and b using packed 64-bit integers in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
- _mm_
mask_ fixupimm_ ps Experimental avx512f,avx512vl
- Fix up packed single-precision (32-bit) floating-point elements in a and b using packed 32-bit integers in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
- _mm_
mask_ fixupimm_ round_ sd Experimental avx512f
- Fix up the lower double-precision (64-bit) floating-point elements in a and b using the lower 64-bit integer in c, store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. imm8 is used to set the required flags reporting.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
mask_ fixupimm_ round_ ss Experimental avx512f
- Fix up the lower single-precision (32-bit) floating-point elements in a and b using the lower 32-bit integer in c, store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. imm8 is used to set the required flags reporting.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
mask_ fixupimm_ sd Experimental avx512f
- Fix up the lower double-precision (64-bit) floating-point elements in a and b using the lower 64-bit integer in c, store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. imm8 is used to set the required flags reporting.
- _mm_
mask_ fixupimm_ ss Experimental avx512f
- Fix up the lower single-precision (32-bit) floating-point elements in a and b using the lower 32-bit integer in c, store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. imm8 is used to set the required flags reporting.
- _mm_
mask_ fmadd_ pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm_
mask_ fmadd_ ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
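Note that the masked fused multiply-add forms have no separate `src` operand: inactive lanes are copied from `a`. A scalar sketch using `f32::mul_add`, which computes the product and sum with a single rounding; the helper name and array representation are assumptions for illustration:

```rust
/// Scalar model of a write-masked fused multiply-add on 4 x f32.
/// Active lanes: `a[i] * b[i] + c[i]` (fused). Inactive lanes: `a[i]`.
/// Hypothetical helper for illustration, not part of `core::arch`.
fn mask_fmadd_ps(a: [f32; 4], k: u8, b: [f32; 4], c: [f32; 4]) -> [f32; 4] {
    let mut dst = a; // the fallback for inactive lanes is `a`, not a `src`
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i].mul_add(b[i], c[i]);
        }
    }
    dst
}
```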
- _mm_
mask_ fmadd_ round_ sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ fmadd_ round_ ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
mask_ fmadd_ sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ fmadd_ ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
mask_ fmaddsub_ pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm_
mask_ fmaddsub_ ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
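The "alternatively add and subtract" wording can be made concrete with a scalar model. The sketch below assumes the usual lane convention for this family: even-indexed lanes subtract `c`, odd-indexed lanes add `c`. The helper name and array representation are hypothetical.

```rust
/// Scalar model of a write-masked fmaddsub on 4 x f32.
/// Assumption: even lanes compute `a*b - c`, odd lanes compute `a*b + c`.
/// Inactive lanes are copied from `a`.
/// Hypothetical helper for illustration, not part of `core::arch`.
fn mask_fmaddsub_ps(a: [f32; 4], k: u8, b: [f32; 4], c: [f32; 4]) -> [f32; 4] {
    let mut dst = a;
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            let addend = if i % 2 == 0 { -c[i] } else { c[i] };
            dst[i] = a[i].mul_add(b[i], addend);
        }
    }
    dst
}
```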
- _mm_
mask_ fmsub_ pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm_
mask_ fmsub_ ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm_
mask_ fmsub_ round_ sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ fmsub_ round_ ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
mask_ fmsub_ sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ fmsub_ ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
mask_ fmsubadd_ pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm_
mask_ fmsubadd_ ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm_
mask_ fnmadd_ pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm_
mask_ fnmadd_ ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm_
mask_ fnmadd_ round_ sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ fnmadd_ round_ ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
mask_ fnmadd_ sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ fnmadd_ ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
mask_ fnmsub_ pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm_
mask_ fnmsub_ ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
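The fnmadd and fnmsub entries differ only in the sign applied to `c`: the former computes `-(a*b) + c`, the latter `-(a*b) - c`. A scalar sketch of both, sharing one masking helper. All three function names are hypothetical, and arrays stand in for vector registers.

```rust
/// Apply `op` to every lane whose mask bit is set; other lanes come from `a`.
/// Hypothetical helper for illustration, not part of `core::arch`.
fn mask_lanes(a: [f32; 4], k: u8, op: impl Fn(usize) -> f32) -> [f32; 4] {
    let mut dst = a;
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[i] = op(i);
        }
    }
    dst
}

/// Scalar model of masked fnmadd: active lanes compute `-(a*b) + c`.
fn mask_fnmadd_ps(a: [f32; 4], k: u8, b: [f32; 4], c: [f32; 4]) -> [f32; 4] {
    mask_lanes(a, k, |i| (-a[i]).mul_add(b[i], c[i]))
}

/// Scalar model of masked fnmsub: active lanes compute `-(a*b) - c`.
fn mask_fnmsub_ps(a: [f32; 4], k: u8, b: [f32; 4], c: [f32; 4]) -> [f32; 4] {
    mask_lanes(a, k, |i| (-a[i]).mul_add(b[i], -c[i]))
}
```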
- _mm_
mask_ fnmsub_ round_ sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ fnmsub_ round_ ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
mask_ fnmsub_ sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ fnmsub_ ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
mask_ getexp_ pd Experimental avx512f,avx512vl
- Convert the exponent of each packed double-precision (64-bit) floating-point element in a to a double-precision (64-bit) floating-point number representing the integer exponent, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
- _mm_
mask_ getexp_ ps Experimental avx512f,avx512vl
- Convert the exponent of each packed single-precision (32-bit) floating-point element in a to a single-precision (32-bit) floating-point number representing the integer exponent, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
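The `floor(log2(x))` remark can be illustrated by reading the exponent field of the IEEE 754 encoding directly. The sketch below is a scalar model that is only meant for normal, finite, nonzero inputs; zero, denormal, infinite, and NaN inputs are not modeled here. Both function names are hypothetical.

```rust
/// Unbiased exponent of a normal, finite, nonzero f32, returned as f32.
/// The sign bit is ignored, so this matches floor(log2(|x|)) for such inputs.
/// Hypothetical helper for illustration, not part of `core::arch`.
fn getexp_normal(x: f32) -> f32 {
    // Bits 30..=23 hold the exponent with a bias of 127.
    let biased = ((x.to_bits() >> 23) & 0xFF) as i32;
    (biased - 127) as f32
}

/// Scalar model of a write-masked getexp on 4 x f32 (normal inputs only).
fn mask_getexp_ps(src: [f32; 4], k: u8, a: [f32; 4]) -> [f32; 4] {
    let mut dst = src;
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[i] = getexp_normal(a[i]);
        }
    }
    dst
}
```

For example, 0.75 is 1.5 × 2^-1, so its result is -1.0, and -10.0 is -1.25 × 2^3, so its result is 3.0.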
- _mm_
mask_ getexp_ round_ sd Experimental avx512f
- Convert the exponent of the lower double-precision (64-bit) floating-point element in b to a double-precision (64-bit) floating-point number representing the integer exponent, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
mask_ getexp_ round_ ss Experimental avx512f
- Convert the exponent of the lower single-precision (32-bit) floating-point element in b to a single-precision (32-bit) floating-point number representing the integer exponent, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
mask_ getexp_ sd Experimental avx512f
- Convert the exponent of the lower double-precision (64-bit) floating-point element in b to a double-precision (64-bit) floating-point number representing the integer exponent, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element.
- _mm_
mask_ getexp_ ss Experimental avx512f
- Convert the exponent of the lower single-precision (32-bit) floating-point element in b to a single-precision (32-bit) floating-point number representing the integer exponent, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element.
- _mm_
mask_ getmant_ pd Experimental avx512f,avx512vl
- Normalize the mantissas of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc, which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
- _mm_
mask_ getmant_ ps Experimental avx512f,avx512vl
- Normalize the mantissas of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc, which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
- _mm_
mask_ getmant_ round_ sd Experimental avx512f
- Normalize the mantissa of the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_p5_2 // interval [0.5, 2)
_MM_MANT_NORM_p5_1 // interval [0.5, 1)
_MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_src // sign = sign(src)
_MM_MANT_SIGN_zero // sign = 0
_MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
mask_ getmant_ round_ ss Experimental avx512f
- Normalize the mantissa of the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_p5_2 // interval [0.5, 2)
_MM_MANT_NORM_p5_1 // interval [0.5, 1)
_MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_src // sign = sign(src)
_MM_MANT_SIGN_zero // sign = 0
_MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
mask_ getmant_ sd Experimental avx512f
- Normalize the mantissa of the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_p5_2 // interval [0.5, 2)
_MM_MANT_NORM_p5_1 // interval [0.5, 1)
_MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_src // sign = sign(src)
_MM_MANT_SIGN_zero // sign = 0
_MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
mask_ getmant_ ss Experimental avx512f
- Normalize the mantissa of the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_p5_2 // interval [0.5, 2)
_MM_MANT_NORM_p5_1 // interval [0.5, 1)
_MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5)
The sign is determined by sc which can take the following values:
_MM_MANT_SIGN_src // sign = sign(src)
_MM_MANT_SIGN_zero // sign = 0
_MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
mask_ i32scatter_ epi32 Experimental avx512f,avx512vl
- Stores 4 32-bit integer elements from a to memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale using writemask k (elements whose corresponding mask bit is not set are not written to memory).
- _mm_
mask_ i32scatter_ epi64 Experimental avx512f,avx512vl
- Stores 2 64-bit integer elements from a to memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale using writemask k (elements whose corresponding mask bit is not set are not written to memory).
- _mm_
mask_ i32scatter_ pd Experimental avx512f,avx512vl
- Stores 2 double-precision (64-bit) floating-point elements from a to memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale using writemask k (elements whose corresponding mask bit is not set are not written to memory).
- _mm_
mask_ i32scatter_ ps Experimental avx512f,avx512vl
- Stores 4 single-precision (32-bit) floating-point elements from a to memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale using writemask k (elements whose corresponding mask bit is not set are not written to memory).
- _mm_
mask_ i64scatter_ epi32 Experimental avx512f,avx512vl
- Stores 2 32-bit integer elements from a to memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale using writemask k (elements whose corresponding mask bit is not set are not written to memory).
- _mm_
mask_ i64scatter_ epi64 Experimental avx512f,avx512vl
- Stores 2 64-bit integer elements from a to memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale using writemask k (elements whose corresponding mask bit is not set are not written to memory).
- _mm_
mask_ i64scatter_ pd Experimental avx512f,avx512vl
- Stores 2 double-precision (64-bit) floating-point elements from a to memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale using writemask k (elements whose corresponding mask bit is not set are not written to memory).
- _mm_
mask_ i64scatter_ ps Experimental avx512f,avx512vl
- Stores 2 single-precision (32-bit) floating-point elements from a to memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale using writemask k (elements whose corresponding mask bit is not set are not written to memory).
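The masked scatter entries above share one pattern: each lane whose mask bit is set writes its element to the location selected by its own index, and lanes whose bit is clear leave memory untouched. A minimal scalar sketch of that pattern follows. The name `mask_scatter_model` is hypothetical, the code does not call the intrinsics, and it assumes that `vindex` holds element indices into a slice (which corresponds to a scale of 4 bytes for 32-bit elements) rather than byte offsets from a raw pointer.

```rust
/// Hypothetical scalar model of a masked scatter of four 32-bit integers.
/// Assumption: `vindex` holds element indices into `base`, so the byte
/// scaling performed by the real instruction is implicit here.
fn mask_scatter_model(base: &mut [i32], k: u8, vindex: [i32; 4], a: [i32; 4]) {
    for i in 0..4 {
        // Only lanes whose mask bit is set are written to memory.
        if (k >> i) & 1 == 1 {
            base[vindex[i] as usize] = a[i];
        }
    }
}
```

With mask `0b0101`, only lanes 0 and 2 are stored, so the indices in lanes 1 and 3 are never used to touch memory.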
- _mm_
mask_ load_ epi32 Experimental avx512f,avx512vl
- Load packed 32-bit integers from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_
mask_ load_ epi64 Experimental avx512f,avx512vl
- Load packed 64-bit integers from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_
mask_ load_ pd Experimental avx512f,avx512vl
- Load packed double-precision (64-bit) floating-point elements from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_
mask_ load_ ps Experimental avx512f,avx512vl
- Load packed single-precision (32-bit) floating-point elements from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_
mask_ load_ sd Experimental avx512f
- Load a double-precision (64-bit) floating-point element from memory into the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and set the upper element of dst to zero. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_
mask_ load_ ss Experimental avx512f
- Load a single-precision (32-bit) floating-point element from memory into the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and set the upper 3 packed elements of dst to zero. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_
mask_ loadu_ epi32 Experimental avx512f,avx512vl
- Load packed 32-bit integers from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm_
mask_ loadu_ epi64 Experimental avx512f,avx512vl
- Load packed 64-bit integers from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm_
mask_ loadu_ pd Experimental avx512f,avx512vl
- Load packed double-precision (64-bit) floating-point elements from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm_
mask_ loadu_ ps Experimental avx512f,avx512vl
- Load packed single-precision (32-bit) floating-point elements from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
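The phrase "using writemask k (elements are copied from src when the corresponding mask bit is not set)" recurs throughout this list. A scalar sketch of what it means for the masked loads above is given here. The function name `mask_loadu_model` is hypothetical and the code works on plain arrays instead of calling the intrinsics, so alignment and faulting behavior are not modeled.

```rust
/// Hypothetical scalar model of a merge-masked load of four 32-bit integers.
/// Bit i of `k` controls lane i: set means "take the value from memory",
/// clear means "keep the value already in `src`".
fn mask_loadu_model(src: [i32; 4], k: u8, mem: &[i32; 4]) -> [i32; 4] {
    let mut dst = src;
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[i] = mem[i];
        }
    }
    dst
}
```

Bit 0 of the mask corresponds to the lowest element, so a mask of `0b0110` replaces only the two middle lanes.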
- _mm_
mask_ max_ epi32 Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ max_ epi64 Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ max_ epu32 Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ max_ epu64 Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ max_ pd Experimental avx512f,avx512vl
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ max_ ps Experimental avx512f,avx512vl
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ max_ round_ sd Experimental avx512f
- Compare the lower double-precision (64-bit) floating-point elements in a and b, store the maximum value in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
mask_ max_ round_ ss Experimental avx512f
- Compare the lower single-precision (32-bit) floating-point elements in a and b, store the maximum value in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
mask_ max_ sd Experimental avx512f
- Compare the lower double-precision (64-bit) floating-point elements in a and b, store the maximum value in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ max_ ss Experimental avx512f
- Compare the lower single-precision (32-bit) floating-point elements in a and b, store the maximum value in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
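The integer max entries above come in signed (`epi`) and unsigned (`epu`) forms, and the same bit pattern can win or lose depending on which one is used. The sketch below models the masked packed maximum generically. The name `mask_max_model` is hypothetical, it does not call the intrinsics, and it covers the integer forms only (floating-point NaN and signed-zero handling are not modeled).

```rust
/// Hypothetical scalar model of a merge-masked packed integer maximum.
/// Using i32 for T models the signed (epi32) form, u32 the unsigned (epu32) form.
fn mask_max_model<T: Ord + Copy>(src: [T; 4], k: u8, a: [T; 4], b: [T; 4]) -> [T; 4] {
    let mut dst = src;
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i].max(b[i]);
        }
    }
    dst
}
```

The all-ones bit pattern is -1 as a signed value, so it loses against 1, but it is the largest possible value when treated as unsigned.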
- _mm_
mask_ min_ epi32 Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ min_ epi64 Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ min_ epu32 Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ min_ epu64 Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ min_ pd Experimental avx512f,avx512vl
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ min_ ps Experimental avx512f,avx512vl
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ min_ round_ sd Experimental avx512f
- Compare the lower double-precision (64-bit) floating-point elements in a and b, store the minimum value in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
mask_ min_ round_ ss Experimental avx512f
- Compare the lower single-precision (32-bit) floating-point elements in a and b, store the minimum value in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
mask_ min_ sd Experimental avx512f
- Compare the lower double-precision (64-bit) floating-point elements in a and b, store the minimum value in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ min_ ss Experimental avx512f
- Compare the lower single-precision (32-bit) floating-point elements in a and b, store the minimum value in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
mask_ mov_ epi32 Experimental avx512f,avx512vl
- Move packed 32-bit integers from a to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ mov_ epi64 Experimental avx512f,avx512vl
- Move packed 64-bit integers from a to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ mov_ pd Experimental avx512f,avx512vl
- Move packed double-precision (64-bit) floating-point elements from a to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ mov_ ps Experimental avx512f,avx512vl
- Move packed single-precision (32-bit) floating-point elements from a to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ move_ sd Experimental avx512f
- Move the lower double-precision (64-bit) floating-point element from b to the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ move_ ss Experimental avx512f
- Move the lower single-precision (32-bit) floating-point element from b to the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
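The scalar (`_sd` and `_ss`) entries in this list all follow the convention spelled out for the move entries above: only mask bit 0 matters, the lower element comes from the operation or from src, and the upper elements are copied from a. A scalar sketch of that convention is shown here. The name `mask_move_ss_model` is hypothetical and the code does not call the intrinsic.

```rust
/// Hypothetical scalar model of the masked scalar move:
/// lane 0 comes from `b` if mask bit 0 is set, otherwise from `src`;
/// lanes 1..=3 always come from `a`.
fn mask_move_ss_model(src: [f32; 4], k: u8, a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let lower = if k & 1 == 1 { b[0] } else { src[0] };
    [lower, a[1], a[2], a[3]]
}
```

Higher mask bits are ignored in this model, so a mask of `0b10` behaves the same as a mask of 0.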
- _mm_
mask_ movedup_ pd Experimental avx512f,avx512vl
- Duplicate even-indexed double-precision (64-bit) floating-point elements from a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ movehdup_ ps Experimental avx512f,avx512vl
- Duplicate odd-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ moveldup_ ps Experimental avx512f,avx512vl
- Duplicate even-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
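To make "odd-indexed" and "even-indexed" in the duplicate entries above concrete, the following scalar sketch models the two single-precision forms on four-element arrays. The names `merge`, `mask_movehdup_model` and `mask_moveldup_model` are hypothetical, and the code does not call the intrinsics.

```rust
/// Hypothetical helper: merge-masking, taking `result[i]` where bit i of `k`
/// is set and `src[i]` otherwise.
fn merge(src: [f32; 4], k: u8, result: [f32; 4]) -> [f32; 4] {
    std::array::from_fn(|i| if (k >> i) & 1 == 1 { result[i] } else { src[i] })
}

/// Duplicate the odd-indexed elements (1 and 3) into each pair of lanes.
fn mask_movehdup_model(src: [f32; 4], k: u8, a: [f32; 4]) -> [f32; 4] {
    merge(src, k, [a[1], a[1], a[3], a[3]])
}

/// Duplicate the even-indexed elements (0 and 2) into each pair of lanes.
fn mask_moveldup_model(src: [f32; 4], k: u8, a: [f32; 4]) -> [f32; 4] {
    merge(src, k, [a[0], a[0], a[2], a[2]])
}
```

The duplication is computed for every lane first, and the mask then decides lane by lane whether the duplicated value or the src value is kept.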
- _mm_
mask_ mul_ epi32 Experimental avx512f,avx512vl
- Multiply the low signed 32-bit integers from each packed 64-bit element in a and b, and store the signed 64-bit results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ mul_ epu32 Experimental avx512f,avx512vl
- Multiply the low unsigned 32-bit integers from each packed 64-bit element in a and b, and store the unsigned 64-bit results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ mul_ pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ mul_ ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ mul_ round_ sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ mul_ round_ ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
mask_ mul_ sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ mul_ ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
mask_ mullo_ epi32 Experimental avx512f,avx512vl
- Multiply the packed 32-bit integers in a and b, producing intermediate 64-bit integers, and store the low 32 bits of the intermediate integers in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
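The mul_epi32 and mullo_epi32 entries above are easy to confuse: the first widens (two 64-bit results from the low halves of two 64-bit lanes), the second truncates (four 32-bit results). The unmasked scalar sketch below contrasts them. The names `mul_epi32_model` and `mullo_epi32_model` are hypothetical and the code does not call the intrinsics.

```rust
/// Hypothetical model of the widening multiply: for each 64-bit lane, take
/// the low 32 bits as a signed value and produce the full 64-bit product.
fn mul_epi32_model(a: [i64; 2], b: [i64; 2]) -> [i64; 2] {
    std::array::from_fn(|i| (a[i] as i32 as i64) * (b[i] as i32 as i64))
}

/// Hypothetical model of the truncating multiply: for each 32-bit lane,
/// keep only the low 32 bits of the product.
fn mullo_epi32_model(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    std::array::from_fn(|i| a[i].wrapping_mul(b[i]))
}
```

In the widening form the upper 32 bits of each input lane are ignored, while in the truncating form it is the upper 32 bits of each product that are discarded.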
- _mm_
mask_ or_ epi32 Experimental avx512f,avx512vl
- Compute the bitwise OR of packed 32-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ or_ epi64 Experimental avx512f,avx512vl
- Compute the bitwise OR of packed 64-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ permute_ pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ permute_ ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ permutevar_ pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ permutevar_ ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ permutex2var_ epi32 Experimental avx512f,avx512vl
- Shuffle 32-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm_
mask_ permutex2var_ epi64 Experimental avx512f,avx512vl
- Shuffle 64-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm_
mask_ permutex2var_ pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
- _mm_
mask_ permutex2var_ ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
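The "selector and index in idx" wording of the permutex2var entries above can be read as follows for four 32-bit elements: the low two bits of each idx element pick a position, and the next bit picks the source (a or b). The sketch below models that reading. The name `mask_permutex2var_model` is hypothetical, the code does not call the intrinsics, and the bit layout is stated as an assumption based on the instruction's documented behavior for 128-bit vectors of 32-bit elements.

```rust
/// Hypothetical scalar model of the masked two-source permute on four
/// 32-bit elements. Assumed idx layout per element: bits [1:0] select the
/// position, bit 2 selects the source (0 = a, 1 = b).
fn mask_permutex2var_model(a: [i32; 4], k: u8, idx: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    let mut dst = a; // lanes with a clear mask bit keep the value from `a`
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            let pos = (idx[i] & 0b011) as usize;
            dst[i] = if idx[i] & 0b100 != 0 { b[pos] } else { a[pos] };
        }
    }
    dst
}
```

Note that in this mask form the fallback for unselected lanes is a itself rather than a separate src operand, as the descriptions above state.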
- _mm_
mask_ rcp14_ pd Experimental avx512f,avx512vl
- Compute the approximate reciprocal of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm_
mask_ rcp14_ ps Experimental avx512f,avx512vl
- Compute the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm_
mask_ rcp14_ sd Experimental avx512f
- Compute the approximate reciprocal of the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. The maximum relative error for this approximation is less than 2^-14.
- _mm_
mask_ rcp14_ ss Experimental avx512f
- Compute the approximate reciprocal of the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. The maximum relative error for this approximation is less than 2^-14.
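To put the 2^-14 relative error bound of the rcp14 entries above in perspective, the sketch below shows how one Newton-Raphson step roughly squares the relative error of a reciprocal estimate. This is a general numerical technique, not something this listing prescribes. The names `refine_reciprocal` and `rel_err` are hypothetical, and the starting estimate is simulated in scalar code rather than produced by the intrinsic.

```rust
/// One Newton-Raphson step for 1/a: if x0 = (1/a)(1 + e), the result is
/// (1/a)(1 - e^2), so the relative error goes from |e| to about e^2.
fn refine_reciprocal(a: f64, x0: f64) -> f64 {
    x0 * (2.0 - a * x0)
}

/// Relative error of `x` as an estimate of 1/a.
fn rel_err(a: f64, x: f64) -> f64 {
    (a * x - 1.0).abs()
}
```

Starting from an estimate whose relative error is below 2^-14, a single step therefore leaves an error on the order of 2^-28, subject to the rounding of the floating-point type in use.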
- _mm_
mask_ rol_ epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ rol_ epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ rolv_ epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ rolv_ epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ ror_ epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ ror_ epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ rorv_ epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ rorv_ epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
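The rotate entries above differ only in direction and in where the count comes from: a single immediate (`rol`, `ror`) or one count per element (`rolv`, `rorv`). A scalar sketch of the left-rotating 32-bit forms is given below, built on Rust's `u32::rotate_left`. The names `mask_rol_model` and `mask_rolv_model` are hypothetical, the code does not call the intrinsics, and reducing the count modulo 32 is stated here as an assumption about the instruction's behavior.

```rust
/// Hypothetical model of the masked rotate-left with one shared count.
/// Assumption: the count is taken modulo the element width (32).
fn mask_rol_model(src: [u32; 4], k: u8, a: [u32; 4], imm8: u32) -> [u32; 4] {
    mask_rolv_model(src, k, a, [imm8; 4])
}

/// Hypothetical model of the masked rotate-left with per-element counts in `b`.
fn mask_rolv_model(src: [u32; 4], k: u8, a: [u32; 4], b: [u32; 4]) -> [u32; 4] {
    let mut dst = src;
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i].rotate_left(b[i] % 32);
        }
    }
    dst
}
```

Unlike a shift, a rotate loses no bits: the top bit of `0x8000_0001` reappears at the bottom, giving 3 after a left rotate by 1.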
- _mm_
mask_ roundscale_ pd Experimental avx512f,avx512vl
- Round packed double-precision (64-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm_
mask_ roundscale_ ps Experimental avx512f,avx512vl
- Round packed single-precision (32-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm_
mask_ roundscale_ round_ sd Experimental avx512f
- Round the lower double-precision (64-bit) floating-point element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
Rounding is done according to the imm8[2:0] parameter.
- _mm_
mask_ roundscale_ round_ ss Experimental avx512f
- Round the lower single-precision (32-bit) floating-point element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
Rounding is done according to the imm8[2:0] parameter.
- _mm_
mask_ roundscale_ sd Experimental avx512f
- Round the lower double-precision (64-bit) floating-point element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
Rounding is done according to the imm8[2:0] parameter.
- _mm_
mask_ roundscale_ ss Experimental avx512f
- Round the lower single-precision (32-bit) floating-point element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
Rounding is done according to the imm8[2:0] parameter.
- _mm_
mask_ rsqrt14_ pd Experimental avx512f,avx512vl
- Compute the approximate reciprocal square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm_
mask_ rsqrt14_ ps Experimental avx512f,avx512vl
- Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm_
mask_ rsqrt14_ sd Experimental avx512f
- Compute the approximate reciprocal square root of the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. The maximum relative error for this approximation is less than 2^-14.
- _mm_
mask_ rsqrt14_ ss Experimental avx512f
- Compute the approximate reciprocal square root of the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. The maximum relative error for this approximation is less than 2^-14.
- _mm_
mask_ scalef_ pd Experimental avx512f,avx512vl
- Scale the packed double-precision (64-bit) floating-point elements in a using values from b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ scalef_ ps Experimental avx512f,avx512vl
- Scale the packed single-precision (32-bit) floating-point elements in a using values from b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ scalef_ round_ sd Experimental avx512f
- Scale the lower double-precision (64-bit) floating-point element in a using the lower element of b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ scalef_ round_ ss Experimental avx512f
- Scale the lower single-precision (32-bit) floating-point element in a using the lower element of b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
mask_ scalef_ sd Experimental avx512f
- Scale the lower double-precision (64-bit) floating-point element in a using the lower element of b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ scalef_ ss Experimental avx512f
- Scale the lower single-precision (32-bit) floating-point element in a using the lower element of b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
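The scalef descriptions above do not say what "scale" means. As a rough scalar sketch, assuming the usual AVX-512 meaning of the operation (multiply `a` by 2 raised to the floor of `b`), the masked scalar form behaves roughly as follows. The function names are hypothetical models, not the intrinsics themselves, and special cases such as NaN, infinities, and denormals are ignored.

```rust
/// Scalar model of the "scale" step: a * 2^floor(b).
/// Assumption: special values (NaN, infinities, denormals, overflow) are ignored.
fn scalef_model(a: f64, b: f64) -> f64 {
    a * 2.0f64.powi(b.floor() as i32)
}

/// Hypothetical model of the write-masked scalar form on two f64 lanes:
/// lane 0 is computed when mask bit 0 is set and taken from `src` otherwise;
/// lane 1 is always copied from `a`.
fn mask_scalef_sd_model(src: [f64; 2], k: u8, a: [f64; 2], b: [f64; 2]) -> [f64; 2] {
    let low = if k & 1 == 1 { scalef_model(a[0], b[0]) } else { src[0] };
    [low, a[1]]
}
```

With `a[0] = 3.0` and `b[0] = 2.7`, the model scales by 2^2, so lane 0 becomes 12.0 when mask bit 0 is set.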
- _mm_
mask_ set1_ epi32 Experimental avx512f,avx512vl
- Broadcast 32-bit integer a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ set1_ epi64 Experimental avx512f,avx512vl
- Broadcast 64-bit integer a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ shuffle_ epi32 Experimental avx512f,avx512vl
- Shuffle 32-bit integers in a within 128-bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ shuffle_ pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements within 128-bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ shuffle_ ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ sll_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a left by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ sll_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a left by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ slli_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a left by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ slli_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a left by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ sllv_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ sllv_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
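The per-element left shifts above can be sketched with a scalar model. This is only an illustration on four 32-bit lanes with a hypothetical function name. It assumes that a count of 32 or more yields 0 instead of wrapping, which is how the variable-shift instructions are generally documented.

```rust
/// Hypothetical scalar model of a write-masked per-element left shift on four
/// 32-bit lanes. Lanes whose mask bit is clear keep the value from `src`.
/// Assumption: counts of 32 or more produce 0 rather than wrapping.
fn mask_sllv_epi32_model(src: [u32; 4], k: u8, a: [u32; 4], count: [u32; 4]) -> [u32; 4] {
    let mut dst = src;
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            // checked_shl returns None once the count reaches the lane width.
            dst[i] = a[i].checked_shl(count[i]).unwrap_or(0);
        }
    }
    dst
}
```

Note that Rust's plain `<<` would panic in debug builds for an oversized count, which is why the sketch uses `checked_shl`.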
- _mm_
mask_ sqrt_ pd Experimental avx512f,avx512vl
- Compute the square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ sqrt_ ps Experimental avx512f,avx512vl
- Compute the square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ sqrt_ round_ sd Experimental avx512f
- Compute the square root of the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ sqrt_ round_ ss Experimental avx512f
- Compute the square root of the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
mask_ sqrt_ sd Experimental avx512f
- Compute the square root of the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ sqrt_ ss Experimental avx512f
- Compute the square root of the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
mask_ sra_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ sra_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ srai_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ srai_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ srav_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ srav_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ srl_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ srl_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ srli_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ srli_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ srlv_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ srlv_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
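The difference between "shifting in sign bits" (sra) and "shifting in zeros" (srl) can be sketched per lane. This is a minimal scalar model with hypothetical helper names. The handling of oversized counts is an assumption based on the usual instruction behaviour: arithmetic shifts saturate to all sign bits and logical shifts give 0.

```rust
/// Arithmetic right shift of one 32-bit lane: vacated bits take the sign bit.
/// Assumption: counts above 31 are clamped, so the result is all sign bits.
fn sra_lane(a: i32, count: u32) -> i32 {
    a >> count.min(31)
}

/// Logical right shift of one 32-bit lane: vacated bits are zero.
/// Assumption: counts of 32 or more give 0.
fn srl_lane(a: u32, count: u32) -> u32 {
    a.checked_shr(count).unwrap_or(0)
}
```

For a negative input such as -16, `sra_lane` keeps the sign (-4 after shifting by 2), while `srl_lane` on the same bit pattern produces a large positive value.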
- _mm_
mask_ store_ epi32 Experimental avx512f,avx512vl
- Store packed 32-bit integers from a into memory using writemask k. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_
mask_ store_ epi64 Experimental avx512f,avx512vl
- Store packed 64-bit integers from a into memory using writemask k. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_
mask_ store_ pd Experimental avx512f,avx512vl
- Store packed double-precision (64-bit) floating-point elements from a into memory using writemask k. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_
mask_ store_ ps Experimental avx512f,avx512vl
- Store packed single-precision (32-bit) floating-point elements from a into memory using writemask k. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_
mask_ store_ sd Experimental avx512f
- Store a double-precision (64-bit) floating-point element from a into memory using writemask k. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_
mask_ store_ ss Experimental avx512f
- Store a single-precision (32-bit) floating-point element from a into memory using writemask k. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_
mask_ storeu_ epi32 Experimental avx512f,avx512vl
- Store packed 32-bit integers from a into memory using writemask k. mem_addr does not need to be aligned on any particular boundary.
- _mm_
mask_ storeu_ epi64 Experimental avx512f,avx512vl
- Store packed 64-bit integers from a into memory using writemask k. mem_addr does not need to be aligned on any particular boundary.
- _mm_
mask_ storeu_ pd Experimental avx512f,avx512vl
- Store packed double-precision (64-bit) floating-point elements from a into memory using writemask k. mem_addr does not need to be aligned on any particular boundary.
- _mm_
mask_ storeu_ ps Experimental avx512f,avx512vl
- Store packed single-precision (32-bit) floating-point elements from a into memory using writemask k. mem_addr does not need to be aligned on any particular boundary.
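For the masked stores above, "using writemask k" means that only the selected lanes are written and the remaining memory is left untouched. A minimal scalar sketch follows. The function name is hypothetical, and an array stands in for the destination memory.

```rust
/// Hypothetical scalar model of a masked store of four 32-bit lanes:
/// only lanes whose mask bit is set are written; other memory is left as-is.
fn mask_storeu_epi32_model(mem: &mut [i32; 4], k: u8, a: [i32; 4]) {
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            mem[i] = a[i];
        }
    }
}
```

This differs from the register forms, where an unselected lane is filled from src (writemask) or zeroed (zeromask).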
- _mm_
mask_ sub_ epi32 Experimental avx512f,avx512vl
- Subtract packed 32-bit integers in b from packed 32-bit integers in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ sub_ epi64 Experimental avx512f,avx512vl
- Subtract packed 64-bit integers in b from packed 64-bit integers in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ sub_ pd Experimental avx512f,avx512vl
- Subtract packed double-precision (64-bit) floating-point elements in b from packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ sub_ ps Experimental avx512f,avx512vl
- Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ sub_ round_ sd Experimental avx512f
- Subtract the lower double-precision (64-bit) floating-point element in b from the lower double-precision (64-bit) floating-point element in a, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ sub_ round_ ss Experimental avx512f
- Subtract the lower single-precision (32-bit) floating-point element in b from the lower single-precision (32-bit) floating-point element in a, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
mask_ sub_ sd Experimental avx512f
- Subtract the lower double-precision (64-bit) floating-point element in b from the lower double-precision (64-bit) floating-point element in a, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
mask_ sub_ ss Experimental avx512f
- Subtract the lower single-precision (32-bit) floating-point element in b from the lower single-precision (32-bit) floating-point element in a, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
mask_ ternarylogic_ epi32 Experimental avx512f,avx512vl
- Bitwise ternary logic that provides the capability to implement any three-operand binary function; the specific binary function is specified by value in imm8. For each bit in each packed 32-bit integer, the corresponding bit from src, a, and b are used to form a 3 bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst using writemask k at 32-bit granularity (32-bit elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ ternarylogic_ epi64 Experimental avx512f,avx512vl
- Bitwise ternary logic that provides the capability to implement any three-operand binary function; the specific binary function is specified by value in imm8. For each bit in each packed 64-bit integer, the corresponding bit from src, a, and b are used to form a 3 bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst using writemask k at 64-bit granularity (64-bit elements are copied from src when the corresponding mask bit is not set).
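The imm8 lookup described above can be sketched as a scalar model. The function names are hypothetical. The sketch assumes the index is formed with the src bit as the most significant bit, then a, then b, which matches the operand order in the description.

```rust
/// Scalar model of the ternary-logic lookup for one 32-bit lane.
/// For every bit position, the three input bits form a 3-bit index
/// (x is the most significant bit) that selects one bit of `imm8`.
fn ternarylogic_lane(x: u32, y: u32, z: u32, imm8: u8) -> u32 {
    let mut out = 0u32;
    for bit in 0..32 {
        let idx = (((x >> bit) & 1) << 2) | (((y >> bit) & 1) << 1) | ((z >> bit) & 1);
        out |= (((imm8 as u32) >> idx) & 1) << bit;
    }
    out
}

/// Hypothetical model of the write-masked form on four 32-bit lanes:
/// `src` is both the first logic operand and the fallback for unselected lanes.
fn mask_ternarylogic_epi32_model(src: [u32; 4], k: u8, a: [u32; 4], b: [u32; 4], imm8: u8) -> [u32; 4] {
    let mut dst = src;
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[i] = ternarylogic_lane(src[i], a[i], b[i], imm8);
        }
    }
    dst
}
```

Under this model, `imm8 = 0x96` is a three-input XOR and `imm8 = 0xE8` is a bitwise majority vote, since those bit patterns are exactly the truth tables of the two functions.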
- _mm_
mask_ test_ epi32_ mask Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 32-bit integers in a and b, producing intermediate 32-bit values, and set the corresponding bit in the result mask (subject to writemask k) if the intermediate value is non-zero.
- _mm_
mask_ test_ epi64_ mask Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 64-bit integers in a and b, producing intermediate 64-bit values, and set the corresponding bit in the result mask (subject to writemask k) if the intermediate value is non-zero.
- _mm_
mask_ testn_ epi32_ mask Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 32-bit integers in a and b, producing intermediate 32-bit values, and set the corresponding bit in the result mask (subject to writemask k) if the intermediate value is zero.
- _mm_
mask_ testn_ epi64_ mask Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 64-bit integers in a and b, producing intermediate 64-bit values, and set the corresponding bit in the result mask (subject to writemask k) if the intermediate value is zero.
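A scalar sketch of the two mask-producing tests may help. In this model both variants AND the lanes together; test sets the result bit when the AND is non-zero, and testn sets it when the AND is zero. The function names are hypothetical, and `k1` stands for the incoming writemask.

```rust
/// Hypothetical model of the masked "test" on four 32-bit lanes:
/// result bit i is set when writemask bit i is set and (a[i] & b[i]) != 0.
fn mask_test_epi32_mask_model(k1: u8, a: [u32; 4], b: [u32; 4]) -> u8 {
    let mut k = 0u8;
    for i in 0..4 {
        if (k1 >> i) & 1 == 1 && (a[i] & b[i]) != 0 {
            k |= 1 << i;
        }
    }
    k
}

/// Hypothetical model of the masked "testn" on four 32-bit lanes:
/// result bit i is set when writemask bit i is set and (a[i] & b[i]) == 0.
fn mask_testn_epi32_mask_model(k1: u8, a: [u32; 4], b: [u32; 4]) -> u8 {
    let mut k = 0u8;
    for i in 0..4 {
        if (k1 >> i) & 1 == 1 && (a[i] & b[i]) == 0 {
            k |= 1 << i;
        }
    }
    k
}
```

For any writemask, the two results are disjoint and their union equals the writemask itself.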
- _mm_
mask_ unpackhi_ epi32 Experimental avx512f,avx512vl
- Unpack and interleave 32-bit integers from the high half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ unpackhi_ epi64 Experimental avx512f,avx512vl
- Unpack and interleave 64-bit integers from the high half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ unpackhi_ pd Experimental avx512f,avx512vl
- Unpack and interleave double-precision (64-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ unpackhi_ ps Experimental avx512f,avx512vl
- Unpack and interleave single-precision (32-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ unpacklo_ epi32 Experimental avx512f,avx512vl
- Unpack and interleave 32-bit integers from the low half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ unpacklo_ epi64 Experimental avx512f,avx512vl
- Unpack and interleave 64-bit integers from the low half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ unpacklo_ pd Experimental avx512f,avx512vl
- Unpack and interleave double-precision (64-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ unpacklo_ ps Experimental avx512f,avx512vl
- Unpack and interleave single-precision (32-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ xor_ epi32 Experimental avx512f,avx512vl
- Compute the bitwise XOR of packed 32-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mask_ xor_ epi64 Experimental avx512f,avx512vl
- Compute the bitwise XOR of packed 64-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
maskz_ abs_ epi32 Experimental avx512f,avx512vl
- Compute the absolute value of packed signed 32-bit integers in a, and store the unsigned results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ abs_ epi64 Experimental avx512f,avx512vl
- Compute the absolute value of packed signed 64-bit integers in a, and store the unsigned results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ add_ epi32 Experimental avx512f,avx512vl
- Add packed 32-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
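The mask_ and maskz_ prefixes differ only in what happens to unselected lanes: a writemask copies them from src, a zeromask sets them to zero. The following scalar sketch contrasts the two for a 32-bit add. The function names are hypothetical models, and the add wraps on overflow as packed integer adds do.

```rust
/// Hypothetical model of a write-masked add: unselected lanes come from `src`.
fn mask_add_epi32_model(src: [i32; 4], k: u8, a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    let mut dst = src;
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i].wrapping_add(b[i]);
        }
    }
    dst
}

/// Hypothetical model of a zero-masked add: unselected lanes are zeroed.
fn maskz_add_epi32_model(k: u8, a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    // A zeromask behaves like a writemask whose source is all zeros.
    mask_add_epi32_model([0; 4], k, a, b)
}
```

Expressing the zero-masked form through the write-masked one is a design choice of this sketch. It makes the relationship between the two prefixes explicit.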
- _mm_
maskz_ add_ epi64 Experimental avx512f,avx512vl
- Add packed 64-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ add_ pd Experimental avx512f,avx512vl
- Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ add_ ps Experimental avx512f,avx512vl
- Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ add_ round_ sd Experimental avx512f
- Add the lower double-precision (64-bit) floating-point element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
maskz_ add_ round_ ss Experimental avx512f
- Add the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
maskz_ add_ sd Experimental avx512f
- Add the lower double-precision (64-bit) floating-point element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
maskz_ add_ ss Experimental avx512f
- Add the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
maskz_ alignr_ epi32 Experimental avx512f,avx512vl
- Concatenate a and b into a 32-byte intermediate result, shift the result right by imm8 32-bit elements, and store the low 16 bytes (4 elements) in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ alignr_ epi64 Experimental avx512f,avx512vl
- Concatenate a and b into a 32-byte intermediate result, shift the result right by imm8 64-bit elements, and store the low 16 bytes (2 elements) in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
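The concatenate-and-shift step can be sketched as a scalar model on 32-bit lanes. The function name is hypothetical. Two points are assumptions of this sketch and not stated in the text above: b supplies the low 16 bytes and a the high 16 bytes, and only the low two bits of imm8 are used for the 128-bit form.

```rust
/// Hypothetical scalar model of a zero-masked alignr on four 32-bit lanes.
fn maskz_alignr_epi32_model(k: u8, a: [u32; 4], b: [u32; 4], imm8: u8) -> [u32; 4] {
    // Assumption: b is the low half of the 32-byte value, a the high half.
    let concat = [b[0], b[1], b[2], b[3], a[0], a[1], a[2], a[3]];
    // Assumption: only the low two bits of imm8 are used for the 128-bit form.
    let shift = (imm8 & 0b11) as usize;
    let mut dst = [0u32; 4];
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            // Shifting right by `shift` elements moves element i + shift into lane i.
            dst[i] = concat[i + shift];
        }
    }
    dst
}
```

With a shift of 1, the result takes the upper three lanes of b followed by the lowest lane of a.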
- _mm_
maskz_ and_ epi32 Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 32-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ and_ epi64 Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 64-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ andnot_ epi32 Experimental avx512f,avx512vl
- Compute the bitwise NOT of packed 32-bit integers in a and then AND with b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ andnot_ epi64 Experimental avx512f,avx512vl
- Compute the bitwise NOT of packed 64-bit integers in a and then AND with b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ broadcastd_ epi32 Experimental avx512f,avx512vl
- Broadcast the low packed 32-bit integer from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ broadcastq_ epi64 Experimental avx512f,avx512vl
- Broadcast the low packed 64-bit integer from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ broadcastss_ ps Experimental avx512f,avx512vl
- Broadcast the low single-precision (32-bit) floating-point element from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ compress_ epi32 Experimental avx512f,avx512vl
- Contiguously store the active 32-bit integers in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero.
- _mm_
maskz_ compress_ epi64 Experimental avx512f,avx512vl
- Contiguously store the active 64-bit integers in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero.
- _mm_
maskz_ compress_ pd Experimental avx512f,avx512vl
- Contiguously store the active double-precision (64-bit) floating-point elements in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero.
- _mm_
maskz_ compress_ ps Experimental avx512f,avx512vl
- Contiguously store the active single-precision (32-bit) floating-point elements in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero.
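Unlike the other zero-masked operations, compress moves data between lanes: active elements are packed toward lane 0 in their original order and the tail is zeroed. A minimal scalar sketch on four 32-bit lanes, using a hypothetical function name:

```rust
/// Hypothetical scalar model of a zero-masked compress on four 32-bit lanes:
/// lanes whose mask bit is set are packed contiguously from lane 0 upward,
/// and the remaining lanes are zero.
fn maskz_compress_epi32_model(k: u8, a: [i32; 4]) -> [i32; 4] {
    let mut dst = [0i32; 4];
    let mut next = 0; // next free position in dst
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[next] = a[i];
            next += 1;
        }
    }
    dst
}
```

With mask `0b1010` and input `[10, 20, 30, 40]`, lanes 1 and 3 are active, so the model returns `[20, 40, 0, 0]`.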
- _mm_
maskz_ cvt_ roundps_ ph Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed half-precision (16-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm_
maskz_ cvt_ roundsd_ ss Experimental avx512f
- Convert the lower double-precision (64-bit) floating-point element in b to a single-precision (32-bit) floating-point element, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
Rounding is done according to the rounding[3:0] parameter.
- _mm_
maskz_ cvt_ roundss_ sd Experimental avx512f
- Convert the lower single-precision (32-bit) floating-point element in b to a double-precision (64-bit) floating-point element, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
maskz_ cvtepi8_ epi32 Experimental avx512f,avx512vl
- Sign extend packed 8-bit integers in a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtepi8_ epi64 Experimental avx512f,avx512vl
- Sign extend packed 8-bit integers in the low 2 bytes of a to packed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtepi16_ epi32 Experimental avx512f,avx512vl
- Sign extend packed 16-bit integers in a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtepi16_ epi64 Experimental avx512f,avx512vl
- Sign extend packed 16-bit integers in a to packed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtepi32_ epi8 Experimental avx512f,avx512vl
- Convert packed 32-bit integers in a to packed 8-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtepi32_ epi16 Experimental avx512f,avx512vl
- Convert packed 32-bit integers in a to packed 16-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtepi32_ epi64 Experimental avx512f,avx512vl
- Sign extend packed 32-bit integers in a to packed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtepi32_ pd Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtepi32_ ps Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtepi64_ epi8 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 8-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtepi64_ epi16 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 16-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtepi64_ epi32 Experimental avx512f,avx512vl
- Convert packed 64-bit integers in a to packed 32-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtepu8_ epi32 Experimental avx512f,avx512vl
- Zero extend packed unsigned 8-bit integers in the low 4 bytes of a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtepu8_ epi64 Experimental avx512f,avx512vl
- Zero extend packed unsigned 8-bit integers in the low 2 bytes of a to packed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtepu16_ epi32 Experimental avx512f,avx512vl
- Zero extend packed unsigned 16-bit integers in a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtepu16_ epi64 Experimental avx512f,avx512vl
- Zero extend packed unsigned 16-bit integers in the low 4 bytes of a to packed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtepu32_ epi64 Experimental avx512f,avx512vl
- Zero extend packed unsigned 32-bit integers in a to packed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtepu32_ pd Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtpd_ epi32 Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtpd_ epu32 Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ cvtpd_ ps Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_cvtph_ps Experimental avx512f,avx512vl
- Convert packed half-precision (16-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_cvtps_epi32 Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_cvtps_epu32 Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_cvtps_ph Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed half-precision (16-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm_maskz_cvtsd_ss Experimental avx512f
- Convert the lower double-precision (64-bit) floating-point element in b to a single-precision (32-bit) floating-point element, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_maskz_cvtsepi32_epi8 Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_cvtsepi32_epi16 Experimental avx512f,avx512vl
- Convert packed signed 32-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_cvtsepi64_epi8 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_cvtsepi64_epi16 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_cvtsepi64_epi32 Experimental avx512f,avx512vl
- Convert packed signed 64-bit integers in a to packed 32-bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
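Signed saturation, as used by the `cvtsepi` entries above, clamps each value to the range of the narrower type instead of discarding high bits. The following scalar sketch illustrates this for the 64-bit to 32-bit case; the names `saturate_i64_to_i32` and `maskz_cvtsepi64_epi32_model` are hypothetical, and the result is modeled as four `i32` lanes with the unused upper lanes left at zero.

```rust
/// Clamp a signed 64-bit value into the i32 range (signed saturation).
fn saturate_i64_to_i32(x: i64) -> i32 {
    x.clamp(i64::from(i32::MIN), i64::from(i32::MAX)) as i32
}

/// Hypothetical scalar model of `_mm_maskz_cvtsepi64_epi32`.
/// The two results occupy the low lanes; the upper lanes stay zero.
fn maskz_cvtsepi64_epi32_model(k: u8, a: [i64; 2]) -> [i32; 4] {
    let mut dst = [0i32; 4];
    for i in 0..2 {
        // Masked-off lanes are left at zero.
        if (k >> i) & 1 == 1 {
            dst[i] = saturate_i64_to_i32(a[i]);
        }
    }
    dst
}

fn main() {
    // i64::MAX does not fit in i32, so it saturates to i32::MAX; -5 fits as-is.
    println!("{:?}", maskz_cvtsepi64_epi32_model(0b11, [i64::MAX, -5]));
}
```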
- _mm_maskz_cvtss_sd Experimental avx512f
- Convert the lower single-precision (32-bit) floating-point element in b to a double-precision (64-bit) floating-point element, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_cvttpd_epi32 Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_cvttpd_epu32 Experimental avx512f,avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_cvttps_epi32 Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_cvttps_epu32 Experimental avx512f,avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
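"With truncation" in the `cvtt` entries above means the fractional part is dropped, i.e. rounding toward zero. A minimal scalar sketch for the signed 32-bit case follows, assuming in-range inputs only; the name `maskz_cvttps_epi32_model` is hypothetical, and out-of-range or NaN inputs are deliberately not modeled because Rust's `as` cast may treat them differently from the hardware.

```rust
/// Hypothetical scalar model of `_mm_maskz_cvttps_epi32` for in-range inputs.
fn maskz_cvttps_epi32_model(k: u8, a: [f32; 4]) -> [i32; 4] {
    let mut dst = [0i32; 4];
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            // A float-to-int `as` cast truncates toward zero for in-range values.
            dst[i] = a[i] as i32;
        }
    }
    dst
}

fn main() {
    // Lane 3 is masked off and therefore zero in the result.
    println!("{:?}", maskz_cvttps_epi32_model(0b0111, [1.9, -1.9, 2.5, 100.0]));
}
```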
- _mm_maskz_cvtusepi32_epi8 Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_cvtusepi32_epi16 Experimental avx512f,avx512vl
- Convert packed unsigned 32-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_cvtusepi64_epi8 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_cvtusepi64_epi16 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_cvtusepi64_epi32 Experimental avx512f,avx512vl
- Convert packed unsigned 64-bit integers in a to packed unsigned 32-bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_div_pd Experimental avx512f,avx512vl
- Divide packed double-precision (64-bit) floating-point elements in a by packed elements in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_div_ps Experimental avx512f,avx512vl
- Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_div_round_sd Experimental avx512f
- Divide the lower double-precision (64-bit) floating-point element in a by the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_div_round_ss Experimental avx512f
- Divide the lower single-precision (32-bit) floating-point element in a by the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_maskz_div_sd Experimental avx512f
- Divide the lower double-precision (64-bit) floating-point element in a by the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_div_ss Experimental avx512f
- Divide the lower single-precision (32-bit) floating-point element in a by the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
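The scalar (`_sd` / `_ss`) entries only consult mask bit 0 and pass the upper lanes of a through unchanged. A small sketch of that pattern for the single-precision divide, using plain arrays as a stand-in for the vector type (the name `maskz_div_ss_model` is hypothetical):

```rust
/// Hypothetical scalar model of `_mm_maskz_div_ss`.
fn maskz_div_ss_model(k: u8, a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    // Start from `a` so the upper three lanes are copied through unchanged.
    let mut dst = a;
    // Only mask bit 0 matters: compute the quotient or zero the lower lane.
    dst[0] = if k & 1 == 1 { a[0] / b[0] } else { 0.0 };
    dst
}

fn main() {
    let a = [6.0, 1.0, 2.0, 3.0];
    let b = [2.0, 9.0, 9.0, 9.0];
    println!("{:?}", maskz_div_ss_model(1, a, b)); // lower lane holds 6.0 / 2.0
    println!("{:?}", maskz_div_ss_model(0, a, b)); // lower lane zeroed
}
```

Note that the upper lanes of b never influence the result in this model.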
- _mm_maskz_expand_epi32 Experimental avx512f,avx512vl
- Load contiguous active 32-bit integers from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_expand_epi64 Experimental avx512f,avx512vl
- Load contiguous active 64-bit integers from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_expand_pd Experimental avx512f,avx512vl
- Load contiguous active double-precision (64-bit) floating-point elements from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_expand_ps Experimental avx512f,avx512vl
- Load contiguous active single-precision (32-bit) floating-point elements from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
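The expand operation described above reads source elements contiguously from the low end of a and scatters them to the lanes whose mask bit is set. A scalar sketch for the 32-bit integer case, under the assumption that lanes can be modeled as a plain array (the name `maskz_expand_epi32_model` is hypothetical):

```rust
/// Hypothetical scalar model of `_mm_maskz_expand_epi32`.
fn maskz_expand_epi32_model(k: u8, a: [i32; 4]) -> [i32; 4] {
    let mut dst = [0i32; 4];
    let mut next = 0; // index of the next unread source element
    for j in 0..4 {
        if (k >> j) & 1 == 1 {
            // Active lane: take the next contiguous element from `a`.
            dst[j] = a[next];
            next += 1;
        }
        // Inactive lanes stay zero.
    }
    dst
}

fn main() {
    // Mask 0b1010 activates lanes 1 and 3, which receive a[0] and a[1].
    println!("{:?}", maskz_expand_epi32_model(0b1010, [10, 20, 30, 40]));
}
```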
- _mm_maskz_expandloadu_epi32 Experimental avx512f,avx512vl
- Load contiguous active 32-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_expandloadu_epi64 Experimental avx512f,avx512vl
- Load contiguous active 64-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_expandloadu_pd Experimental avx512f,avx512vl
- Load contiguous active double-precision (64-bit) floating-point elements from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_expandloadu_ps Experimental avx512f,avx512vl
- Load contiguous active single-precision (32-bit) floating-point elements from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_fixupimm_pd Experimental avx512f,avx512vl
- Fix up packed double-precision (64-bit) floating-point elements in a and b using packed 64-bit integers in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
- _mm_maskz_fixupimm_ps Experimental avx512f,avx512vl
- Fix up packed single-precision (32-bit) floating-point elements in a and b using packed 32-bit integers in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
- _mm_maskz_fixupimm_round_sd Experimental avx512f
- Fix up the lower double-precision (64-bit) floating-point elements in a and b using the lower 64-bit integer in c, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. imm8 is used to set the required flags reporting.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_maskz_fixupimm_round_ss Experimental avx512f
- Fix up the lower single-precision (32-bit) floating-point elements in a and b using the lower 32-bit integer in c, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. imm8 is used to set the required flags reporting.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_maskz_fixupimm_sd Experimental avx512f
- Fix up the lower double-precision (64-bit) floating-point elements in a and b using the lower 64-bit integer in c, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. imm8 is used to set the required flags reporting.
- _mm_maskz_fixupimm_ss Experimental avx512f
- Fix up the lower single-precision (32-bit) floating-point elements in a and b using the lower 32-bit integer in c, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. imm8 is used to set the required flags reporting.
- _mm_maskz_fmadd_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_fmadd_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_fmadd_round_sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_fmadd_round_ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_maskz_fmadd_sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_fmadd_ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_maskz_fmaddsub_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_fmaddsub_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_fmsub_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_fmsub_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_fmsub_round_sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_fmsub_round_ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_maskz_fmsub_sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_fmsub_ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_maskz_fmsubadd_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_fmsubadd_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
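The difference between the `fmaddsub` and `fmsubadd` families is only which lanes add and which subtract. The sketch below models the double-precision variants on two-lane arrays. The lane assignment (even lanes subtract for `fmaddsub`, even lanes add for `fmsubadd`) follows my reading of Intel's pseudocode and should be treated as an assumption; the function names are hypothetical.

```rust
/// Hypothetical scalar model of `_mm_maskz_fmaddsub_pd`:
/// even lanes compute a*b - c, odd lanes compute a*b + c.
fn maskz_fmaddsub_pd_model(k: u8, a: [f64; 2], b: [f64; 2], c: [f64; 2]) -> [f64; 2] {
    let mut dst = [0.0f64; 2];
    for i in 0..2 {
        if (k >> i) & 1 == 1 {
            let signed_c = if i % 2 == 0 { -c[i] } else { c[i] };
            // mul_add performs the multiply and add with a single rounding.
            dst[i] = a[i].mul_add(b[i], signed_c);
        }
    }
    dst
}

/// Hypothetical scalar model of `_mm_maskz_fmsubadd_pd`:
/// even lanes compute a*b + c, odd lanes compute a*b - c.
fn maskz_fmsubadd_pd_model(k: u8, a: [f64; 2], b: [f64; 2], c: [f64; 2]) -> [f64; 2] {
    let mut dst = [0.0f64; 2];
    for i in 0..2 {
        if (k >> i) & 1 == 1 {
            let signed_c = if i % 2 == 0 { c[i] } else { -c[i] };
            dst[i] = a[i].mul_add(b[i], signed_c);
        }
    }
    dst
}

fn main() {
    let (a, b, c) = ([2.0, 3.0], [4.0, 5.0], [1.0, 1.0]);
    println!("{:?}", maskz_fmaddsub_pd_model(0b11, a, b, c));
    println!("{:?}", maskz_fmsubadd_pd_model(0b11, a, b, c));
}
```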
- _mm_maskz_fnmadd_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_fnmadd_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_fnmadd_round_sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_fnmadd_round_ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_maskz_fnmadd_sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_fnmadd_ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_maskz_fnmsub_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_fnmsub_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_fnmsub_round_sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_fnmsub_round_ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_maskz_fnmsub_sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_fnmsub_ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
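The four fused multiply-add families above differ only in the signs applied to the product and to c. A compact scalar sketch of those sign conventions, plus a model of the lower-element `fnmsub` variant, is given below; all function names are hypothetical and `f64::mul_add` stands in for the fused hardware operation.

```rust
fn fmadd(a: f64, b: f64, c: f64) -> f64 { a.mul_add(b, c) }        //  (a*b) + c
fn fmsub(a: f64, b: f64, c: f64) -> f64 { a.mul_add(b, -c) }       //  (a*b) - c
fn fnmadd(a: f64, b: f64, c: f64) -> f64 { (-a).mul_add(b, c) }    // -(a*b) + c
fn fnmsub(a: f64, b: f64, c: f64) -> f64 { (-a).mul_add(b, -c) }   // -(a*b) - c

/// Hypothetical scalar model of `_mm_maskz_fnmsub_sd`:
/// the lower lane is computed or zeroed, the upper lane is copied from `a`.
fn maskz_fnmsub_sd_model(k: u8, a: [f64; 2], b: [f64; 2], c: [f64; 2]) -> [f64; 2] {
    let low = if k & 1 == 1 { fnmsub(a[0], b[0], c[0]) } else { 0.0 };
    [low, a[1]]
}

fn main() {
    // With a = 2, b = 3, c = 10 the product is 6.
    println!("fmadd  = {}", fmadd(2.0, 3.0, 10.0));
    println!("fmsub  = {}", fmsub(2.0, 3.0, 10.0));
    println!("fnmadd = {}", fnmadd(2.0, 3.0, 10.0));
    println!("fnmsub = {}", fnmsub(2.0, 3.0, 10.0));
    println!("{:?}", maskz_fnmsub_sd_model(1, [2.0, 7.0], [3.0, 0.0], [10.0, 0.0]));
}
```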
- _mm_maskz_getexp_pd Experimental avx512f,avx512vl
- Convert the exponent of each packed double-precision (64-bit) floating-point element in a to a double-precision (64-bit) floating-point number representing the integer exponent, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
- _mm_maskz_getexp_ps Experimental avx512f,avx512vl
- Convert the exponent of each packed single-precision (32-bit) floating-point element in a to a single-precision (32-bit) floating-point number representing the integer exponent, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
- _mm_maskz_getexp_round_sd Experimental avx512f
- Convert the exponent of the lower double-precision (64-bit) floating-point element in b to a double-precision (64-bit) floating-point number representing the integer exponent, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_maskz_getexp_round_ss Experimental avx512f
- Convert the exponent of the lower single-precision (32-bit) floating-point element in b to a single-precision (32-bit) floating-point number representing the integer exponent, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_maskz_getexp_sd Experimental avx512f
- Convert the exponent of the lower double-precision (64-bit) floating-point element in b to a double-precision (64-bit) floating-point number representing the integer exponent, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element.
- _mm_maskz_getexp_ss Experimental avx512f
- Convert the exponent of the lower single-precision (32-bit) floating-point element in b to a single-precision (32-bit) floating-point number representing the integer exponent, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element.
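The `getexp` entries above return floor(log2(|x|)) as a floating-point value. For normal, finite, non-zero inputs this equals the unbiased exponent field, which the following scalar sketch extracts directly from the bit pattern. It is a simplified model under that assumption: zeros, denormals, infinities and NaNs are not handled, and the names `getexp_normal` and `maskz_getexp_pd_model` are hypothetical.

```rust
/// Unbiased exponent of a normal, finite, non-zero f64, returned as f64.
fn getexp_normal(x: f64) -> f64 {
    // Bits 52..=62 hold the biased exponent; the sign bit is masked away.
    let biased = ((x.to_bits() >> 52) & 0x7ff) as i64;
    (biased - 1023) as f64
}

/// Hypothetical scalar model of `_mm_maskz_getexp_pd` for normal inputs.
fn maskz_getexp_pd_model(k: u8, a: [f64; 2]) -> [f64; 2] {
    let mut dst = [0.0f64; 2];
    for i in 0..2 {
        if (k >> i) & 1 == 1 {
            dst[i] = getexp_normal(a[i]);
        }
    }
    dst
}

fn main() {
    // 10.0 lies in [8, 16), so its exponent is 3; 0.75 lies in [0.5, 1), so -1.
    println!("{:?}", maskz_getexp_pd_model(0b11, [10.0, 0.75]));
}
```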
- _mm_maskz_getmant_pd Experimental avx512f,avx512vl
- Normalize the mantissas of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc, which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
- _mm_maskz_getmant_ps Experimental avx512f,avx512vl
- Normalize the mantissas of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc, which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
- _mm_maskz_getmant_round_sd Experimental avx512f
- Normalize the mantissas of the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc, which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_maskz_getmant_round_ss Experimental avx512f
- Normalize the mantissas of the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc, which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_maskz_getmant_sd Experimental avx512f
- Normalize the mantissas of the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc, which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
- _mm_maskz_getmant_ss Experimental avx512f
- Normalize the mantissas of the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
The mantissa is normalized to the interval specified by interv, which can take the following values:
_MM_MANT_NORM_1_2 // interval [1, 2)
_MM_MANT_NORM_P5_2 // interval [0.5, 2)
_MM_MANT_NORM_P5_1 // interval [0.5, 1)
_MM_MANT_NORM_P75_1P5 // interval [0.75, 1.5)
The sign is determined by sc, which can take the following values:
_MM_MANT_SIGN_SRC // sign = sign(src)
_MM_MANT_SIGN_ZERO // sign = 0
_MM_MANT_SIGN_NAN // dst = NaN if sign(src) = 1
- _mm_maskz_load_epi32 Experimental avx512f,avx512vl
- Load packed 32-bit integers from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_maskz_load_epi64 Experimental avx512f,avx512vl
- Load packed 64-bit integers from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_maskz_load_pd Experimental avx512f,avx512vl
- Load packed double-precision (64-bit) floating-point elements from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_maskz_load_ps Experimental avx512f,avx512vl
- Load packed single-precision (32-bit) floating-point elements from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_maskz_load_sd Experimental avx512f
- Load a double-precision (64-bit) floating-point element from memory into the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and set the upper element of dst to zero. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_maskz_load_ss Experimental avx512f
- Load a single-precision (32-bit) floating-point element from memory into the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and set the upper 3 packed elements of dst to zero. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_maskz_loadu_epi32 Experimental avx512f,avx512vl
- Load packed 32-bit integers from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm_maskz_loadu_epi64 Experimental avx512f,avx512vl
- Load packed 64-bit integers from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm_maskz_loadu_pd Experimental avx512f,avx512vl
- Load packed double-precision (64-bit) floating-point elements from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
- _mm_maskz_loadu_ps Experimental avx512f,avx512vl
- Load packed single-precision (32-bit) floating-point elements from memory into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
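The zeromask behavior shared by the load entries above can be modeled in plain scalar Rust. This is a minimal sketch, not the intrinsic itself: `maskz_load_epi32_model` is a hypothetical helper that treats a 128-bit vector as four `i32` lanes and assumes bit `i` of `k` controls lane `i`. It does not model alignment faults.

```rust
/// Scalar model of a zero-masked load of four 32-bit lanes.
/// Hypothetical helper for illustration; it is not part of `core::arch`.
fn maskz_load_epi32_model(k: u8, mem: &[i32; 4]) -> [i32; 4] {
    let mut dst = [0i32; 4];
    for i in 0..4 {
        // Lane i is loaded only when mask bit i is set; otherwise it stays zero.
        if (k >> i) & 1 == 1 {
            dst[i] = mem[i];
        }
    }
    dst
}
```

With `k = 0b0101`, only lanes 0 and 2 are taken from memory and the other two lanes are zero.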
- _mm_maskz_max_epi32 Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_max_epi64 Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_max_epu32 Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_max_epu64 Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_max_pd Experimental avx512f,avx512vl
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_max_ps Experimental avx512f,avx512vl
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
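The difference between the signed (`epi32`) and unsigned (`epu32`) integer maximum entries above is only in how the same 32 bits are interpreted. The following scalar sketch illustrates that under the assumption of four 32-bit lanes; both function names are hypothetical and are not part of `core::arch`.

```rust
/// Zero-masked lane-wise maximum, interpreting lanes as signed 32-bit integers.
/// Hypothetical scalar model for illustration only.
fn maskz_max_epi32_model(k: u8, a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    let mut dst = [0i32; 4];
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i].max(b[i]);
        }
    }
    dst
}

/// Same operation, but the lanes are compared as unsigned 32-bit integers.
fn maskz_max_epu32_model(k: u8, a: [u32; 4], b: [u32; 4]) -> [u32; 4] {
    let mut dst = [0u32; 4];
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i].max(b[i]);
        }
    }
    dst
}
```

The bit pattern `0xFFFF_FFFF` is `-1` in the signed model and loses against `0`, but it is `u32::MAX` in the unsigned model and wins.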
- _mm_maskz_max_round_sd Experimental avx512f
- Compare the lower double-precision (64-bit) floating-point elements in a and b, store the maximum value in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_maskz_max_round_ss Experimental avx512f
- Compare the lower single-precision (32-bit) floating-point elements in a and b, store the maximum value in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_maskz_max_sd Experimental avx512f
- Compare the lower double-precision (64-bit) floating-point elements in a and b, store the maximum value in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_max_ss Experimental avx512f
- Compare the lower single-precision (32-bit) floating-point elements in a and b, store the maximum value in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_maskz_min_epi32 Experimental avx512f,avx512vl
- Compare packed signed 32-bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_min_epi64 Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_min_epu32 Experimental avx512f,avx512vl
- Compare packed unsigned 32-bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_min_epu64 Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_min_pd Experimental avx512f,avx512vl
- Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_min_ps Experimental avx512f,avx512vl
- Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_min_round_sd Experimental avx512f
- Compare the lower double-precision (64-bit) floating-point elements in a and b, store the minimum value in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_maskz_min_round_ss Experimental avx512f
- Compare the lower single-precision (32-bit) floating-point elements in a and b, store the minimum value in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_maskz_min_sd Experimental avx512f
- Compare the lower double-precision (64-bit) floating-point elements in a and b, store the minimum value in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_min_ss Experimental avx512f
- Compare the lower single-precision (32-bit) floating-point elements in a and b, store the minimum value in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_maskz_mov_epi32 Experimental avx512f,avx512vl
- Move packed 32-bit integers from a into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_mov_epi64 Experimental avx512f,avx512vl
- Move packed 64-bit integers from a into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_mov_pd Experimental avx512f,avx512vl
- Move packed double-precision (64-bit) floating-point elements from a into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_mov_ps Experimental avx512f,avx512vl
- Move packed single-precision (32-bit) floating-point elements from a into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_move_sd Experimental avx512f
- Move the lower double-precision (64-bit) floating-point element from b to the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_move_ss Experimental avx512f
- Move the lower single-precision (32-bit) floating-point element from b to the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
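The scalar (`_sd` / `_ss`) entries all share one shape: only mask bit 0 matters, the lower lane is computed or zeroed, and the upper lanes are copied from a. A minimal scalar sketch of the `move_ss` case described above, using a hypothetical helper name that is not part of `core::arch`:

```rust
/// Scalar model of a zero-masked move of the lower f32 lane.
/// Hypothetical helper for illustration only.
fn maskz_move_ss_model(k: u8, a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    // Lower lane: b[0] when mask bit 0 is set, otherwise zero.
    let lower = if k & 1 == 1 { b[0] } else { 0.0 };
    // The upper three lanes always come from a, regardless of the mask.
    [lower, a[1], a[2], a[3]]
}
```

Note that the higher bits of `k` are ignored in this model, matching the "mask bit 0" wording in the descriptions.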
- _mm_maskz_movedup_pd Experimental avx512f,avx512vl
- Duplicate even-indexed double-precision (64-bit) floating-point elements from a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_movehdup_ps Experimental avx512f,avx512vl
- Duplicate odd-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_moveldup_ps Experimental avx512f,avx512vl
- Duplicate even-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
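The even-indexed and odd-indexed duplication patterns above are easiest to see on a concrete lane layout. The sketch below assumes four `f32` lanes and applies the zeromask as a separate step; all three function names are hypothetical and only model the described behavior.

```rust
/// Duplicate even-indexed lanes: [a0, a0, a2, a2]. Hypothetical scalar model.
fn moveldup_ps_model(a: [f32; 4]) -> [f32; 4] {
    [a[0], a[0], a[2], a[2]]
}

/// Duplicate odd-indexed lanes: [a1, a1, a3, a3]. Hypothetical scalar model.
fn movehdup_ps_model(a: [f32; 4]) -> [f32; 4] {
    [a[1], a[1], a[3], a[3]]
}

/// Zero every lane whose mask bit is not set.
fn apply_zeromask(k: u8, v: [f32; 4]) -> [f32; 4] {
    let mut dst = [0.0f32; 4];
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[i] = v[i];
        }
    }
    dst
}
```

Composing `apply_zeromask(k, movehdup_ps_model(a))` gives the masked form: the duplication happens first, and the mask then selects which result lanes survive.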
- _mm_maskz_mul_epi32 Experimental avx512f,avx512vl
- Multiply the low signed 32-bit integers from each packed 64-bit element in a and b, and store the signed 64-bit results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_mul_epu32 Experimental avx512f,avx512vl
- Multiply the low unsigned 32-bit integers from each packed 64-bit element in a and b, and store the unsigned 64-bit results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
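The two widening multiplies above read only the low 32 bits of each 64-bit element and produce a full 64-bit product. A scalar sketch, assuming a 128-bit vector viewed as two 64-bit elements (the function names are hypothetical and not part of `core::arch`):

```rust
/// Signed widening multiply of the low 32 bits of each 64-bit element.
/// Hypothetical scalar model for illustration only.
fn maskz_mul_epi32_model(k: u8, a: [u64; 2], b: [u64; 2]) -> [i64; 2] {
    let mut dst = [0i64; 2];
    for i in 0..2 {
        if (k >> i) & 1 == 1 {
            // Truncate to the low 32 bits, reinterpret as signed, then widen.
            let x = a[i] as u32 as i32 as i64;
            let y = b[i] as u32 as i32 as i64;
            dst[i] = x * y;
        }
    }
    dst
}

/// Unsigned counterpart: the low 32 bits are zero-extended before multiplying.
fn maskz_mul_epu32_model(k: u8, a: [u64; 2], b: [u64; 2]) -> [u64; 2] {
    let mut dst = [0u64; 2];
    for i in 0..2 {
        if (k >> i) & 1 == 1 {
            let x = a[i] as u32 as u64;
            let y = b[i] as u32 as u64;
            dst[i] = x * y;
        }
    }
    dst
}
```

The high 32 bits of each input element are ignored in both models, and neither product can overflow 64 bits.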
- _mm_maskz_mul_pd Experimental avx512f,avx512vl
- Multiply packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_mul_ps Experimental avx512f,avx512vl
- Multiply packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_mul_round_sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_mul_round_ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_maskz_mul_sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_mul_ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_maskz_mullo_epi32 Experimental avx512f,avx512vl
- Multiply the packed 32-bit integers in a and b, producing intermediate 64-bit integers, and store the low 32 bits of the intermediate integers in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_or_epi32 Experimental avx512f,avx512vl
- Compute the bitwise OR of packed 32-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_or_epi64 Experimental avx512f,avx512vl
- Compute the bitwise OR of packed 64-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_permute_pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_permute_ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_permutevar_pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_permutevar_ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_permutex2var_epi32 Experimental avx512f,avx512vl
- Shuffle 32-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_permutex2var_epi64 Experimental avx512f,avx512vl
- Shuffle 64-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_permutex2var_pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_permutex2var_ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
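The "selector and index in idx" wording of the two-source shuffles above can be made concrete for the 128-bit, 32-bit-lane case. The sketch below assumes that bits [1:0] of each idx element pick a lane and bit 2 picks the source (a when clear, b when set); that bit layout is my reading of the instruction and should be checked against the vendor documentation. The function name is hypothetical.

```rust
/// Scalar model of a zero-masked two-source shuffle over four 32-bit lanes.
/// Hypothetical helper; the idx bit layout is an assumption stated above.
fn maskz_permutex2var_epi32_model(k: u8, a: [i32; 4], idx: [u32; 4], b: [i32; 4]) -> [i32; 4] {
    let mut dst = [0i32; 4];
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            // Low two bits: lane index within the chosen source.
            let lane = (idx[i] & 0b11) as usize;
            // Bit 2: selector between the two sources.
            let from_b = (idx[i] >> 2) & 1 == 1;
            dst[i] = if from_b { b[lane] } else { a[lane] };
        }
    }
    dst
}
```

Because every result lane can come from any lane of either source, this form covers interleaves, blends, and arbitrary permutations with a single index vector.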
- _mm_maskz_rcp14_pd Experimental avx512f,avx512vl
- Compute the approximate reciprocal of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm_maskz_rcp14_ps Experimental avx512f,avx512vl
- Compute the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm_maskz_rcp14_sd Experimental avx512f
- Compute the approximate reciprocal of the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. The maximum relative error for this approximation is less than 2^-14.
- _mm_maskz_rcp14_ss Experimental avx512f
- Compute the approximate reciprocal of the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. The maximum relative error for this approximation is less than 2^-14.
- _mm_maskz_rol_epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_rol_epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_rolv_epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_rolv_epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_ror_epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_ror_epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_rorv_epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_rorv_epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
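The rotate entries above come in two flavors: one shared count (`rol` / `ror`, from imm8) and one count per lane (`rolv` / `rorv`, from b). A scalar sketch for 32-bit lanes follows. It assumes the count is reduced modulo the lane width, which is my understanding of the instruction but is not stated in the descriptions above; the function names are hypothetical.

```rust
/// Zero-masked rotate-left of four 32-bit lanes by one shared count.
/// Hypothetical scalar model; the modulo-32 reduction is an assumption.
fn maskz_rol_epi32_model(k: u8, a: [u32; 4], count: u32) -> [u32; 4] {
    let mut dst = [0u32; 4];
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i].rotate_left(count % 32);
        }
    }
    dst
}

/// Zero-masked rotate-right where each lane takes its own count from b.
fn maskz_rorv_epi32_model(k: u8, a: [u32; 4], b: [u32; 4]) -> [u32; 4] {
    let mut dst = [0u32; 4];
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i].rotate_right(b[i] % 32);
        }
    }
    dst
}
```

Unlike a shift, a rotate never discards bits: the bits that leave one end of the lane re-enter at the other.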
- _mm_maskz_roundscale_pd Experimental avx512f,avx512vl
- Round packed double-precision (64-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm_maskz_roundscale_ps Experimental avx512f,avx512vl
- Round packed single-precision (32-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
Rounding is done according to the imm8[2:0] parameter.
- _mm_maskz_roundscale_round_sd Experimental avx512f
- Round the lower double-precision (64-bit) floating-point element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
Rounding is done according to the imm8[2:0] parameter.
- _mm_maskz_roundscale_round_ss Experimental avx512f
- Round the lower single-precision (32-bit) floating-point element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
Rounding is done according to the imm8[2:0] parameter.
- _mm_maskz_roundscale_sd Experimental avx512f
- Round the lower double-precision (64-bit) floating-point element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
Rounding is done according to the imm8[2:0] parameter.
- _mm_maskz_roundscale_ss Experimental avx512f
- Round the lower single-precision (32-bit) floating-point element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
Rounding is done according to the imm8[2:0] parameter.
- _mm_maskz_rsqrt14_pd Experimental avx512f,avx512vl
- Compute the approximate reciprocal square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm_maskz_rsqrt14_ps Experimental avx512f,avx512vl
- Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14.
- _mm_maskz_rsqrt14_sd Experimental avx512f
- Compute the approximate reciprocal square root of the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. The maximum relative error for this approximation is less than 2^-14.
- _mm_maskz_rsqrt14_ss Experimental avx512f
- Compute the approximate reciprocal square root of the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. The maximum relative error for this approximation is less than 2^-14.
- _mm_maskz_scalef_pd Experimental avx512f,avx512vl
- Scale the packed double-precision (64-bit) floating-point elements in a using values from b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_scalef_ps Experimental avx512f,avx512vl
- Scale the packed single-precision (32-bit) floating-point elements in a using values from b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_scalef_round_sd Experimental avx512f
- Scale the lower double-precision (64-bit) floating-point element in a using the value from b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_scalef_round_ss Experimental avx512f
- Scale the lower single-precision (32-bit) floating-point element in a using the value from b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_maskz_scalef_sd Experimental avx512f
- Scale the lower double-precision (64-bit) floating-point element in a using the value from b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_scalef_ss Experimental avx512f
- Scale the lower single-precision (32-bit) floating-point element in a using the value from b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
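The scalef descriptions above do not spell out what "scale" means. As far as I know, the operation computes a * 2^floor(b). The sketch below models the scalar `_sd` form under that assumption for ordinary finite inputs only; it does not model NaN, infinity, or denormal handling, and the function name is hypothetical.

```rust
/// Scalar model of a zero-masked scalef on the lower f64 lane.
/// Hypothetical helper; assumes dst = a * 2^floor(b) for finite inputs.
fn maskz_scalef_sd_model(k: u8, a: [f64; 2], b: [f64; 2]) -> [f64; 2] {
    let lower = if k & 1 == 1 {
        // floor(b) gives the integer power of two used as the scale factor.
        a[0] * 2f64.powi(b[0].floor() as i32)
    } else {
        0.0
    };
    // The upper lane is copied from a and is never scaled.
    [lower, a[1]]
}
```

Because of the floor, a negative fractional exponent such as -1.5 scales by 2^-2, not 2^-1.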
- _mm_maskz_set1_epi32 Experimental avx512f,avx512vl
- Broadcast 32-bit integer a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_set1_epi64 Experimental avx512f,avx512vl
- Broadcast 64-bit integer a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_shuffle_epi32 Experimental avx512f,avx512vl
- Shuffle 32-bit integers in a within 128-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_shuffle_pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements within 128-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_shuffle_ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
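For the single-source integer shuffle (`shuffle_epi32`) listed above, the "control in imm8" is four 2-bit fields, one per destination lane. A scalar sketch of that decoding, assuming field `i` occupies bits [2i+1:2i] of imm8 (the function name is hypothetical and not part of `core::arch`):

```rust
/// Scalar model of a zero-masked 32-bit shuffle within one 128-bit lane.
/// Hypothetical helper for illustration only.
fn maskz_shuffle_epi32_model(k: u8, a: [i32; 4], imm8: u8) -> [i32; 4] {
    let mut dst = [0i32; 4];
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            // Each destination lane reads its 2-bit source index from imm8.
            let src = ((imm8 >> (2 * i)) & 0b11) as usize;
            dst[i] = a[src];
        }
    }
    dst
}
```

For example, `imm8 = 0b00_01_10_11` reverses the four lanes, and the mask then decides which of the reversed lanes are kept.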
- _mm_maskz_sll_epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a left by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_sll_epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a left by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_slli_epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a left by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_slli_epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a left by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_sllv_epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_sllv_epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_sqrt_pd Experimental avx512f,avx512vl
- Compute the square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_sqrt_ps Experimental avx512f,avx512vl
- Compute the square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_sqrt_round_sd Experimental avx512f
- Compute the square root of the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_sqrt_round_ss Experimental avx512f
- Compute the square root of the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_maskz_sqrt_sd Experimental avx512f
- Compute the square root of the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_maskz_sqrt_ss Experimental avx512f
- Compute the square root of the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_maskz_sra_epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_sra_epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_srai_epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_srai_epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_srav_epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_srav_epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_maskz_srl_epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ srl_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ srli_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ srli_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ srlv_ epi32 Experimental avx512f,avx512vl
- Shift packed 32-bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ srlv_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ sub_ epi32 Experimental avx512f,avx512vl
- Subtract packed 32-bit integers in b from packed 32-bit integers in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ sub_ epi64 Experimental avx512f,avx512vl
- Subtract packed 64-bit integers in b from packed 64-bit integers in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
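The zeromask behaviour shared by the `maskz_` entries can be illustrated with a scalar model. This is only a sketch: `maskz_sub_epi32_model` is a hypothetical helper operating on plain arrays, not the intrinsic itself, and it assumes that element 0 corresponds to mask bit 0 and that integer subtraction wraps on overflow.

```rust
/// Scalar model of a zeromasked subtraction over four 32-bit lanes.
/// Hypothetical helper for illustration; it does not call any intrinsic.
fn maskz_sub_epi32_model(k: u8, a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    // Every lane starts zeroed; only lanes whose mask bit is set get a result.
    let mut dst = [0i32; 4];
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            // Packed integer subtraction wraps around on overflow.
            dst[i] = a[i].wrapping_sub(b[i]);
        }
    }
    dst
}
```

With `k = 0b0101`, only lanes 0 and 2 carry a difference; lanes 1 and 3 are zero regardless of their inputs.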
- _mm_
maskz_ sub_ pd Experimental avx512f,avx512vl
- Subtract packed double-precision (64-bit) floating-point elements in b from packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ sub_ ps Experimental avx512f,avx512vl
- Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ sub_ round_ sd Experimental avx512f
- Subtract the lower double-precision (64-bit) floating-point element in b from the lower double-precision (64-bit) floating-point element in a, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
maskz_ sub_ round_ ss Experimental avx512f
- Subtract the lower single-precision (32-bit) floating-point element in b from the lower single-precision (32-bit) floating-point element in a, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
maskz_ sub_ sd Experimental avx512f
- Subtract the lower double-precision (64-bit) floating-point element in b from the lower double-precision (64-bit) floating-point element in a, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst.
- _mm_
maskz_ sub_ ss Experimental avx512f
- Subtract the lower single-precision (32-bit) floating-point element in b from the lower single-precision (32-bit) floating-point element in a, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
maskz_ ternarylogic_ epi32 Experimental avx512f,avx512vl
- Bitwise ternary logic that provides the capability to implement any three-operand binary function; the specific binary function is specified by the value in imm8. For each bit in each packed 32-bit integer, the corresponding bits from a, b, and c are used to form a 3-bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst using zeromask k at 32-bit granularity (32-bit elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ ternarylogic_ epi64 Experimental avx512f,avx512vl
- Bitwise ternary logic that provides the capability to implement any three-operand binary function; the specific binary function is specified by the value in imm8. For each bit in each packed 64-bit integer, the corresponding bits from a, b, and c are used to form a 3-bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst using zeromask k at 64-bit granularity (64-bit elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ unpackhi_ epi32 Experimental avx512f,avx512vl
- Unpack and interleave 32-bit integers from the high half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ unpackhi_ epi64 Experimental avx512f,avx512vl
- Unpack and interleave 64-bit integers from the high half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ unpackhi_ pd Experimental avx512f,avx512vl
- Unpack and interleave double-precision (64-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ unpackhi_ ps Experimental avx512f,avx512vl
- Unpack and interleave single-precision (32-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ unpacklo_ epi32 Experimental avx512f,avx512vl
- Unpack and interleave 32-bit integers from the low half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ unpacklo_ epi64 Experimental avx512f,avx512vl
- Unpack and interleave 64-bit integers from the low half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ unpacklo_ pd Experimental avx512f,avx512vl
- Unpack and interleave double-precision (64-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ unpacklo_ ps Experimental avx512f,avx512vl
- Unpack and interleave single-precision (32-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
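For a 128-bit vector there is a single 128-bit lane, so the unpack entries above reduce to interleaving two elements from each input. The sketch below models the 32-bit integer case on plain arrays. The function names are hypothetical, and element 0 is assumed to be the lowest element of the vector.

```rust
/// Interleave the low halves of `a` and `b` (hypothetical scalar model).
fn unpacklo_epi32_model(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    [a[0], b[0], a[1], b[1]]
}

/// Interleave the high halves of `a` and `b` (hypothetical scalar model).
fn unpackhi_epi32_model(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    [a[2], b[2], a[3], b[3]]
}

/// Apply a zeromask: lanes whose mask bit is clear become zero.
fn zeromask_epi32_model(k: u8, v: [i32; 4]) -> [i32; 4] {
    let mut dst = [0i32; 4];
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            dst[i] = v[i];
        }
    }
    dst
}
```

A `maskz_` unpack then corresponds to `zeromask_epi32_model(k, unpacklo_epi32_model(a, b))` in this model.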
- _mm_
maskz_ xor_ epi32 Experimental avx512f,avx512vl
- Compute the bitwise XOR of packed 32-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
maskz_ xor_ epi64 Experimental avx512f,avx512vl
- Compute the bitwise XOR of packed 64-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
- _mm_
max_ epi64 Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b, and store packed maximum values in dst.
- _mm_
max_ epu64 Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b, and store packed maximum values in dst.
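The signed and unsigned variants differ only in how the same 64 bits are interpreted. A minimal per-lane sketch, using hypothetical helper names and an `i64` to carry the raw lane bits:

```rust
/// Signed maximum of two 64-bit lanes (hypothetical scalar model).
fn max_epi64_lane(a: i64, b: i64) -> i64 {
    a.max(b)
}

/// Unsigned maximum of two 64-bit lanes; the bits are reinterpreted as u64
/// for the comparison and returned as i64 (hypothetical scalar model).
fn max_epu64_lane(a: i64, b: i64) -> i64 {
    (a as u64).max(b as u64) as i64
}
```

For the inputs `-1` and `1`, the signed maximum is `1`, while the unsigned maximum is `-1`, because its bit pattern is the largest `u64`.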
- _mm_
max_ round_ sd Experimental avx512f
- Compare the lower double-precision (64-bit) floating-point elements in a and b, store the maximum value in the lower element of dst, and copy the upper element from a to the upper element of dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
max_ round_ ss Experimental avx512f
- Compare the lower single-precision (32-bit) floating-point elements in a and b, store the maximum value in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
min_ epi64 Experimental avx512f,avx512vl
- Compare packed signed 64-bit integers in a and b, and store packed minimum values in dst.
- _mm_
min_ epu64 Experimental avx512f,avx512vl
- Compare packed unsigned 64-bit integers in a and b, and store packed minimum values in dst.
- _mm_
min_ round_ sd Experimental avx512f
- Compare the lower double-precision (64-bit) floating-point elements in a and b, store the minimum value in the lower element of dst, and copy the upper element from a to the upper element of dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
min_ round_ ss Experimental avx512f
- Compare the lower single-precision (32-bit) floating-point elements in a and b, store the minimum value in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
mmask_ i32gather_ epi32 Experimental avx512f,avx512vl
- Loads 4 32-bit integer elements from memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mmask_ i32gather_ epi64 Experimental avx512f,avx512vl
- Loads 2 64-bit integer elements from memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mmask_ i32gather_ pd Experimental avx512f,avx512vl
- Loads 2 double-precision (64-bit) floating-point elements from memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mmask_ i32gather_ ps Experimental avx512f,avx512vl
- Loads 4 single-precision (32-bit) floating-point elements from memory starting at location base_addr at packed 32-bit integer indices stored in vindex scaled by scale using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mmask_ i64gather_ epi32 Experimental avx512f,avx512vl
- Loads 2 32-bit integer elements from memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mmask_ i64gather_ epi64 Experimental avx512f,avx512vl
- Loads 2 64-bit integer elements from memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mmask_ i64gather_ pd Experimental avx512f,avx512vl
- Loads 2 double-precision (64-bit) floating-point elements from memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale using writemask k (elements are copied from src when the corresponding mask bit is not set).
- _mm_
mmask_ i64gather_ ps Experimental avx512f,avx512vl
- Loads 2 single-precision (32-bit) floating-point elements from memory starting at location base_addr at packed 64-bit integer indices stored in vindex scaled by scale using writemask k (elements are copied from src when the corresponding mask bit is not set).
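The masked gathers above can be pictured with a scalar model. The sketch below covers the four-lane 32-bit integer case. It makes two simplifying assumptions that are not part of the listing: memory is represented as a slice of `i32`, and the scale equals the element size (4 bytes), so each index addresses a whole element. The function name is hypothetical.

```rust
/// Scalar model of a writemasked gather of four i32 values.
/// Assumes `scale` is 4 bytes, so `vindex` holds element indices into `base`.
fn mmask_i32gather_epi32_model(
    src: [i32; 4],
    k: u8,
    vindex: [i32; 4],
    base: &[i32],
) -> [i32; 4] {
    // Lanes whose mask bit is clear keep the value from `src`.
    let mut dst = src;
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            // Only selected lanes touch memory.
            dst[i] = base[vindex[i] as usize];
        }
    }
    dst
}
```

Note that a masked-off lane never reads memory in this model, so its index does not need to be in bounds.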
- _mm_
mul_ round_ sd Experimental avx512f
- Multiply the lower double-precision (64-bit) floating-point elements in a and b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.
- _mm_
mul_ round_ ss Experimental avx512f
- Multiply the lower single-precision (32-bit) floating-point elements in a and b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
or_ epi32 Experimental avx512f,avx512vl
- Compute the bitwise OR of packed 32-bit integers in a and b, and store the results in dst.
- _mm_
or_ epi64 Experimental avx512f,avx512vl
- Compute the bitwise OR of packed 64-bit integers in a and b, and store the results in dst.
- _mm_
permutex2var_ epi32 Experimental avx512f,avx512vl
- Shuffle 32-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst.
- _mm_
permutex2var_ epi64 Experimental avx512f,avx512vl
- Shuffle 64-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst.
- _mm_
permutex2var_ pd Experimental avx512f,avx512vl
- Shuffle double-precision (64-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst.
- _mm_
permutex2var_ ps Experimental avx512f,avx512vl
- Shuffle single-precision (32-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst.
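The "selector and index" wording in the two-source shuffles can be made concrete with a scalar model of the four-lane 32-bit case. This sketch assumes the usual encoding for a 128-bit vector of 32-bit elements: the low two bits of each `idx` element pick the element position, and bit 2 selects between `a` (0) and `b` (1). The function name is hypothetical.

```rust
/// Scalar model of a two-source shuffle over four 32-bit lanes.
/// idx[i] bits [1:0] choose the element, bit 2 chooses the source vector.
fn permutex2var_epi32_model(a: [i32; 4], idx: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    let mut dst = [0i32; 4];
    for i in 0..4 {
        let sel = idx[i] as u32;
        let offset = (sel & 0b011) as usize;
        // Higher bits of the index are ignored in this model.
        dst[i] = if sel & 0b100 != 0 { b[offset] } else { a[offset] };
    }
    dst
}
```

Each output lane is chosen independently, so the same source element may appear in several output lanes.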
- _mm_
rcp14_ pd Experimental avx512f,avx512vl
- Compute the approximate reciprocal of packed double-precision (64-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14.
- _mm_
rcp14_ ps Experimental avx512f,avx512vl
- Compute the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14.
- _mm_
rcp14_ sd Experimental avx512f
- Compute the approximate reciprocal of the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. The maximum relative error for this approximation is less than 2^-14.
- _mm_
rcp14_ ss Experimental avx512f
- Compute the approximate reciprocal of the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. The maximum relative error for this approximation is less than 2^-14.
- _mm_
rol_ epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst.
- _mm_
rol_ epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst.
- _mm_
rolv_ epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst.
- _mm_
rolv_ epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst.
- _mm_
ror_ epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst.
- _mm_
ror_ epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst.
- _mm_
rorv_ epi32 Experimental avx512f,avx512vl
- Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst.
- _mm_
rorv_ epi64 Experimental avx512f,avx512vl
- Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst.
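A rotate differs from a shift in that bits leaving one end re-enter at the other. The per-lane behaviour of the rotate entries above can be sketched with Rust's built-in rotate methods. The helper names are hypothetical, and the explicit `% 32` reflects the assumption that only the count modulo the element width matters.

```rust
/// Rotate one 32-bit lane left by `n` bits (hypothetical scalar model).
fn rolv_epi32_lane(x: u32, n: u32) -> u32 {
    x.rotate_left(n % 32)
}

/// Rotate one 32-bit lane right by `n` bits (hypothetical scalar model).
fn rorv_epi32_lane(x: u32, n: u32) -> u32 {
    x.rotate_right(n % 32)
}
```

Rotating left and then right by the same count returns the original value, which makes these operations convenient for hash and cipher rounds.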
- _mm_
roundscale_ pd Experimental avx512f,avx512vl
- Round packed double-precision (64-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst.
Rounding is done according to the imm8[2:0] parameter.
- _mm_
roundscale_ ps Experimental avx512f,avx512vl
- Round packed single-precision (32-bit) floating-point elements in a to the number of fraction bits specified by imm8, and store the results in dst.
Rounding is done according to the imm8[2:0] parameter.
- _mm_
roundscale_ round_ sd Experimental avx512f
- Round the lower double-precision (64-bit) floating-point element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.
Rounding is done according to the imm8[2:0] parameter.
- _mm_
roundscale_ round_ ss Experimental avx512f
- Round the lower single-precision (32-bit) floating-point element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
Rounding is done according to the imm8[2:0] parameter.
- _mm_
roundscale_ sd Experimental avx512f
- Round the lower double-precision (64-bit) floating-point element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.
Rounding is done according to the imm8[2:0] parameter.
- _mm_
roundscale_ ss Experimental avx512f
- Round the lower single-precision (32-bit) floating-point element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
Rounding is done according to the imm8[2:0] parameter.
- _mm_
rsqrt14_ pd Experimental avx512f,avx512vl
- Compute the approximate reciprocal square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14.
- _mm_
rsqrt14_ ps Experimental avx512f,avx512vl
- Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14.
- _mm_
rsqrt14_ sd Experimental avx512f
- Compute the approximate reciprocal square root of the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. The maximum relative error for this approximation is less than 2^-14.
- _mm_
rsqrt14_ ss Experimental avx512f
- Compute the approximate reciprocal square root of the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. The maximum relative error for this approximation is less than 2^-14.
- _mm_
scalef_ pd Experimental avx512f,avx512vl
- Scale the packed double-precision (64-bit) floating-point elements in a using values from b, and store the results in dst.
- _mm_
scalef_ ps Experimental avx512f,avx512vl
- Scale the packed single-precision (32-bit) floating-point elements in a using values from b, and store the results in dst.
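"Scale ... using values from b" is terse. As far as the underlying instruction is documented, each result is `a * 2^floor(b)`. The sketch below models that for one double-precision lane with ordinary finite inputs. It deliberately ignores the special-case handling of NaN, infinities and denormals, and the helper name is hypothetical.

```rust
/// Scalar model of one scalef lane for finite, moderately sized inputs:
/// multiplies `a` by two raised to the floor of `b`.
/// Special values (NaN, infinity, denormals) are not modelled here.
fn scalef_lane(a: f64, b: f64) -> f64 {
    a * 2f64.powi(b.floor() as i32)
}
```

For example, scaling `3.0` by `2.9` uses an exponent of 2, and scaling by `-1.5` uses an exponent of -2.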
- _mm_
scalef_ round_ sd Experimental avx512f
- Scale the lower double-precision (64-bit) floating-point element in a using the lower element of b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.
- _mm_
scalef_ round_ ss Experimental avx512f
- Scale the lower single-precision (32-bit) floating-point element in a using the lower element of b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
scalef_ sd Experimental avx512f
- Scale the lower double-precision (64-bit) floating-point element in a using the lower element of b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.
- _mm_
scalef_ ss Experimental avx512f
- Scale the lower single-precision (32-bit) floating-point element in a using the lower element of b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
sqrt_ round_ sd Experimental avx512f
- Compute the square root of the lower double-precision (64-bit) floating-point element in b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.
- _mm_
sqrt_ round_ ss Experimental avx512f
- Compute the square root of the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
sra_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by count while shifting in sign bits, and store the results in dst.
- _mm_
srai_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst.
- _mm_
srav_ epi64 Experimental avx512f,avx512vl
- Shift packed 64-bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst.
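The difference between "shifting in sign bits" and "shifting in zeros" is easiest to see on a single lane. The sketch below models one 64-bit lane with hypothetical helpers. It assumes the documented hardware behaviour for oversized counts: a count above 63 fills an arithmetic shift with the sign bit and zeroes a logical shift.

```rust
/// Arithmetic right shift of one 64-bit lane: vacated bits copy the sign bit.
fn sra_epi64_lane(x: i64, count: u64) -> i64 {
    if count > 63 {
        // Every bit becomes a copy of the sign bit.
        if x < 0 { -1 } else { 0 }
    } else {
        x >> count
    }
}

/// Logical right shift of one 64-bit lane: vacated bits are zero.
fn srl_epi64_lane(x: i64, count: u64) -> i64 {
    if count > 63 {
        0
    } else {
        ((x as u64) >> count) as i64
    }
}
```

The explicit range checks matter in Rust: shifting an `i64` by 64 or more with `>>` panics in debug builds, whereas the vector instructions define a result for those counts.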
- _mm_
store_ epi32 Experimental avx512f,avx512vl
- Store 128 bits (composed of 4 packed 32-bit integers) from a into memory. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_
store_ epi64 Experimental avx512f,avx512vl
- Store 128 bits (composed of 2 packed 64-bit integers) from a into memory. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
- _mm_
storeu_ epi32 Experimental avx512f,avx512vl
- Store 128 bits (composed of 4 packed 32-bit integers) from a into memory. mem_addr does not need to be aligned on any particular boundary.
- _mm_
storeu_ epi64 Experimental avx512f,avx512vl
- Store 128 bits (composed of 2 packed 64-bit integers) from a into memory. mem_addr does not need to be aligned on any particular boundary.
- _mm_
sub_ round_ sd Experimental avx512f
- Subtract the lower double-precision (64-bit) floating-point element in b from the lower double-precision (64-bit) floating-point element in a, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.
- _mm_
sub_ round_ ss Experimental avx512f
- Subtract the lower single-precision (32-bit) floating-point element in b from the lower single-precision (32-bit) floating-point element in a, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
- _mm_
ternarylogic_ epi32 Experimental avx512f,avx512vl
- Bitwise ternary logic that provides the capability to implement any three-operand binary function; the specific binary function is specified by the value in imm8. For each bit in each packed 32-bit integer, the corresponding bits from a, b, and c are used to form a 3-bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst.
- _mm_
ternarylogic_ epi64 Experimental avx512f,avx512vl
- Bitwise ternary logic that provides the capability to implement any three-operand binary function; the specific binary function is specified by the value in imm8. For each bit in each packed 64-bit integer, the corresponding bits from a, b, and c are used to form a 3-bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst.
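The imm8 lookup described above is effectively an 8-entry truth table. The following scalar sketch models one 32-bit lane. The helper name is hypothetical, and it assumes the bit from `a` is the most significant bit of the 3-bit index and the bit from `c` the least significant.

```rust
/// Scalar model of ternary logic on one 32-bit lane.
/// For each bit position, (a, b, c) form a 3-bit index into the truth table `imm8`.
fn ternarylogic_epi32_lane(a: u32, b: u32, c: u32, imm8: u8) -> u32 {
    let mut dst = 0u32;
    for bit in 0..32 {
        let idx = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1);
        let out = ((imm8 >> idx) & 1) as u32;
        dst |= out << bit;
    }
    dst
}
```

Under this indexing, `0x96` encodes a three-way XOR, `0xE8` a bitwise majority, and `0xCA` a bitwise select (`a ? b : c`). A common way to derive a table is to evaluate the desired expression on the constants `0xF0`, `0xCC` and `0xAA` in place of `a`, `b` and `c`.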
- _mm_
test_ epi32_ mask Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 32-bit integers in a and b, producing intermediate 32-bit values, and set the corresponding bit in result mask k if the intermediate value is non-zero.
- _mm_
test_ epi64_ mask Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 64-bit integers in a and b, producing intermediate 64-bit values, and set the corresponding bit in result mask k if the intermediate value is non-zero.
- _mm_
testn_ epi32_ mask Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 32-bit integers in a and b, producing intermediate 32-bit values, and set the corresponding bit in result mask k if the intermediate value is zero.
- _mm_
testn_ epi64_ mask Experimental avx512f,avx512vl
- Compute the bitwise AND of packed 64-bit integers in a and b, producing intermediate 64-bit values, and set the corresponding bit in result mask k if the intermediate value is zero.
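The two mask-producing tests are complements of each other: both AND the inputs lane by lane, and they differ only in whether a non-zero or a zero result sets the mask bit. A scalar sketch of the four-lane 32-bit case, with hypothetical helper names and lane 0 mapped to mask bit 0:

```rust
/// Mask bit i is set when (a[i] & b[i]) is non-zero (hypothetical scalar model).
fn test_epi32_mask_model(a: [u32; 4], b: [u32; 4]) -> u8 {
    let mut k = 0u8;
    for i in 0..4 {
        if a[i] & b[i] != 0 {
            k |= 1 << i;
        }
    }
    k
}

/// Mask bit i is set when (a[i] & b[i]) is zero (hypothetical scalar model).
fn testn_epi32_mask_model(a: [u32; 4], b: [u32; 4]) -> u8 {
    // Only the low four bits are meaningful for a four-lane vector.
    !test_epi32_mask_model(a, b) & 0b1111
}
```

Passing the same vector as both arguments turns these into "lane is non-zero" and "lane is zero" tests.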
- _mm_
xor_ epi32 Experimental avx512f,avx512vl
- Compute the bitwise XOR of packed 32-bit integers in a and b, and store the results in dst.
- _mm_
xor_ epi64 Experimental avx512f,avx512vl
- Compute the bitwise XOR of packed 64-bit integers in a and b, and store the results in dst.
- _store_
mask16 Experimental avx512f
- Store 16-bit mask to memory