Does anyone know of a fix for an MSVC compiler bug/annoyance where SIMD Extension settings get "stuck" on AVX?

The context of this question is coding up SIMD CPU dispatchers, closely following Agner's well-known dispatch_example2.cpp project. I've been going back and forth in three different MSVC projects and have dead-ended with this issue in two of them, after which one of those two "fixed itself" somehow.

The question is pretty simple: to compile the dispatchers I need to compile the vector code four times, once with each of these option sets:

/arch:AVX512 /DINSTRSET=10
/arch:AVX2 /DINSTRSET=8
/arch:AVX /DINSTRSET=7
/arch:SSE2 /D__SSE4_2__

While I'm doing this I'm watching the value of INSTRSET and this code:

#if defined ( __AVX512VL__ ) && defined ( __AVX512BW__ ) && defined ( __AVX512DQ__ )
#define AVX512_FLAG 1
#else
#define AVX512_FLAG 2
#endif

#if defined ( __AVX2__ )
#define AVX2_FLAG 1
#else
#define AVX2_FLAG 2
#endif

#if defined ( __AVX__ )
#define AVX_FLAG 1
#else
#define AVX_FLAG 2
#endif

The behavior is like this: For the three AVX compiles everything is exactly as expected. When the problem is not happening, the SSE2 compile shows as expected (AVX512_FLAG, AVX2_FLAG, AVX_FLAG == 2) and the final code runs fine.

When the problem is happening, the /arch:SSE2 /D__SSE4_2__ compile shows AVX512_FLAG == 2 but AVX2_FLAG == AVX_FLAG == 1 and INSTRSET == 8, so the compiler treats AVX2 instructions as enabled. The project compiles, but crashes on an SSE4.2 machine.

If I try /arch:SSE2 /DINSTRSET=6 then I get INSTRSET == 6 for the compile, but the code above still shows AVX2_FLAG == 1 and AVX_FLAG == 1, and the final project still crashes on an SSE4.2 machine.
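One way to watch these values per translation unit is to have the compiler write them into the build log. The following is only a minimal sketch, not part of Agner's example: DISPATCH_STR, DISPATCH_ARCH_NAME and compiled_arch() are hypothetical names, and it assumes a compiler that supports #pragma message (MSVC, GCC and Clang do).

```cpp
#include <string>

// Hypothetical helpers: turn a macro's value into a string literal.
#define DISPATCH_STR2(x) #x
#define DISPATCH_STR(x) DISPATCH_STR2(x)

// Pick a label from the same predefined macros the question tests.
#if defined(__AVX512VL__) && defined(__AVX512BW__) && defined(__AVX512DQ__)
#define DISPATCH_ARCH_NAME "AVX512"
#elif defined(__AVX2__)
#define DISPATCH_ARCH_NAME "AVX2"
#elif defined(__AVX__)
#define DISPATCH_ARCH_NAME "AVX"
#else
#define DISPATCH_ARCH_NAME "no AVX"
#endif

// Report the effective settings of this file in the build output.
#pragma message(__FILE__ " compiled for: " DISPATCH_ARCH_NAME)
#ifdef INSTRSET
#pragma message("INSTRSET = " DISPATCH_STR(INSTRSET))
#else
#pragma message("INSTRSET is not defined")
#endif

// The same label, available at run time for logging.
std::string compiled_arch() {
    return DISPATCH_ARCH_NAME;
}
```

If the /arch:SSE2 compile of a file reports "AVX2" here, the setting that actually reached that file is not the one chosen at project level.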

The crashes happen even if I don't run any vector code - anything that calls into the dispatcher crashes immediately even if all vector code is short circuited.

FYI, trying /DINSTRSET=6 is just an act of desperation - I've never gotten anything to work with SSE4.2 without using /D__SSE4_2__

Does anyone know how to fix this problem that is completely halting my progress? Tried "Clean Solution" already.

There are 2 best solutions below

dts On BEST ANSWER

I figured this out (it's simple and boring). For the incremental object files I'm compiling 3 .obj files from the same .cpp (the .cpp with the vector code). When the MSVC SIMD settings are changed in the project-level Properties, the change may or may not be inherited by the Properties of the individual .cpp file. This is where the project gets "stuck" on AVX (sometimes, not always). The fix is to check the .cpp file's own properties and make sure they are correct.
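A guard in the vector .cpp could turn this kind of mismatch into a build error instead of a crash. This is a sketch under assumptions: it only helps when INSTRSET is passed explicitly with /D (the question's INSTRSET == 8 result suggests it is otherwise derived from the same /arch macros), it covers only the levels 6, 7 and 8 used above, and avx_macros_match() is a hypothetical helper that restates the rule so it can be checked.

```cpp
// Refuse to build when the requested INSTRSET and the /arch macros disagree.
#if defined(INSTRSET)
#if (INSTRSET < 7) && defined(__AVX__)
#error "INSTRSET is below 7 but AVX is enabled: check this .cpp file's own properties"
#endif
#if (INSTRSET < 8) && defined(__AVX2__)
#error "INSTRSET is below 8 but AVX2 is enabled: check this .cpp file's own properties"
#endif
#endif

// Hypothetical helper expressing the same rule as a function:
// AVX is expected from INSTRSET 7 upward, AVX2 from INSTRSET 8 upward.
constexpr bool avx_macros_match(int instrset, bool avx, bool avx2) {
    return avx == (instrset >= 7) && avx2 == (instrset >= 8);
}

// The failing case from the question: INSTRSET 6 with AVX and AVX2 both on.
static_assert(!avx_macros_match(6, true, true), "stuck AVX2 must be rejected");
```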

BTW I'm using VS 2019, /std:c++17 and the context above is the 32-bit build.

Soonts On

If you want a single binary which works on SSE-only computers but can leverage AVX when available, you need to do the following.

  1. At the project level, set “Enable enhanced instruction set: Not set” if you’re building for Win64, or “SSE2” if you’re building for Win32.

  2. Set “Enable enhanced instruction set: AVX” or AVX2 only on the *.cpp files which contain the AVX versions of your functions.

  3. Make sure to never call these AVX functions unless both the CPU and the OS (see the GetEnabledXStateFeatures WinAPI) actually support AVX.
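The check in step 3 might look roughly like the sketch below. The names avx_usable(), sum_scalar(), sum_avx_standin() and pick_sum() are hypothetical; the AVX function is a plain stand-in here, since the real one would live in a separate .cpp compiled with /arch:AVX. The non-Windows branch is an assumption added only so the sketch builds elsewhere.

```cpp
#include <cstddef>
#if defined(_WIN32)
#include <windows.h>
#include <intrin.h>
#endif

// True only when both the CPU and the OS support AVX.
bool avx_usable() {
#if defined(_WIN32) && (defined(_M_IX86) || defined(_M_X64))
    int regs[4];
    __cpuid(regs, 1);
    const bool cpu_avx = (regs[2] & (1 << 28)) != 0;  // CPUID leaf 1, ECX bit 28
    const bool os_avx = (GetEnabledXStateFeatures() & XSTATE_MASK_AVX) != 0;
    return cpu_avx && os_avx;
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx") != 0;
#else
    return false;  // unknown platform: stay on the baseline path
#endif
}

// Baseline version, built with the project-level (SSE2 / not set) options.
float sum_scalar(const float* p, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) total += p[i];
    return total;
}

// Stand-in for the version that would be built with /arch:AVX in its own file.
float sum_avx_standin(const float* p, std::size_t n) {
    return sum_scalar(p, n);
}

using SumFn = float (*)(const float*, std::size_t);

// Choose once; callers never touch the AVX version directly.
SumFn pick_sum(bool avx_ok) {
    return avx_ok ? &sum_avx_standin : &sum_scalar;
}
```

A caller would typically store `pick_sum(avx_usable())` in a static function pointer at startup and call through that pointer afterwards.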

Practically speaking, instead of compiling the same source file multiple times with different settings, compile 4 different source files. They can contain the same code, since C++ has the #include preprocessor directive. If you have a single implementation dispatched with these macros, move that implementation into an *.inl or *.hpp file, and include that file in 4 different *.cpp files, one per CPU level.
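A single listing cannot span several files, so the sketch below uses a macro as a stand-in for the shared *.inl file. All names (DEFINE_KERNEL, kernel_sse2, kernel_avx2, sum) are hypothetical. In a real project the body would sit in the *.inl, and each *.cpp would define a namespace name and then #include it, with the /arch option set on that *.cpp only. Giving each copy its own namespace is an assumption of this sketch; it keeps the four copies from colliding at link time.

```cpp
#include <cstddef>

// Stand-in for the shared kernel file. Real layout, for comparison:
//   kernel_avx2.cpp:   #define KERNEL_NS kernel_avx2
//                      #include "kernel.inl"
//   kernel.inl:        namespace KERNEL_NS { float sum(...) { ... } }
#define DEFINE_KERNEL(ns)                                      \
    namespace ns {                                             \
    float sum(const float* p, std::size_t n) {                 \
        float total = 0.0f;                                    \
        for (std::size_t i = 0; i < n; ++i) total += p[i];     \
        return total;                                          \
    }                                                          \
    }

// One instantiation per CPU level; each would be its own .cpp in practice.
DEFINE_KERNEL(kernel_sse2)
DEFINE_KERNEL(kernel_avx2)
```

The dispatcher then refers to `kernel_sse2::sum` and `kernel_avx2::sum` by name and picks one at run time.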