I know this question has been asked here multiple times already, but sadly none of the suggested solutions seems to work for me.
I tried to get numpy (and scipy) to stop multithreading. The most commonly posted answer is to set some environment variables to "1" before importing numpy, i.e.
import os
N_THREADS = '1'
os.environ['OMP_NUM_THREADS'] = N_THREADS
os.environ['OPENBLAS_NUM_THREADS'] = N_THREADS
os.environ['MKL_NUM_THREADS'] = N_THREADS
os.environ['VECLIB_MAXIMUM_THREADS'] = N_THREADS
os.environ['NUMEXPR_NUM_THREADS'] = N_THREADS
import numpy as np
while True:
    np.linalg.inv(np.identity(100))
However, the process still uses 70 or so threads. :/
I also checked whether the environment variables are set correctly with
for name, value in os.environ.items():
    print("{0}: {1}".format(name, value))
And indeed the variables show up:
OMP_NUM_THREADS: 1
OPENBLAS_NUM_THREADS: 1
MKL_NUM_THREADS: 1
VECLIB_MAXIMUM_THREADS: 1
NUMEXPR_NUM_THREADS: 1
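These variables are read when the BLAS library is loaded, so one way to rule out any ordering problem inside the script is to set them for the whole process at launch. A minimal sketch using only the standard library; `single_thread_env` and `THREAD_VARS` are hypothetical names chosen for illustration, and the child process here only echoes one variable back rather than running the numpy loop:

```python
import os
import subprocess
import sys

# Same variable names as in the snippet above.
THREAD_VARS = (
    "OMP_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "MKL_NUM_THREADS",
    "VECLIB_MAXIMUM_THREADS",
    "NUMEXPR_NUM_THREADS",
)


def single_thread_env(base_env=None):
    """Return a copy of the environment with every thread variable set to '1'."""
    env = dict(os.environ if base_env is None else base_env)
    for name in THREAD_VARS:
        env[name] = "1"
    return env


# The child interpreter starts with the variables already in place,
# so they are set before any library can be imported.
child_code = "import os; print(os.environ['OPENBLAS_NUM_THREADS'])"
result = subprocess.run(
    [sys.executable, "-c", child_code],
    env=single_thread_env(),
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```

In a shell, the equivalent is to export the variables before starting Python.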
An alternative I saw was to call a function, i.e.
import mkl
mkl.set_num_threads(1)
However, I still get the same issue.
Next I checked which library numpy is using with
np.__config__.show()
The results are
openblas64__info:
    libraries = ['openblas64_', 'openblas64_']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None), ('BLAS_SYMBOL_SUFFIX', '64_'), ('HAVE_BLAS_ILP64', None)]
    runtime_library_dirs = ['/usr/local/lib']
blas_ilp64_opt_info:
    libraries = ['openblas64_', 'openblas64_']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None), ('BLAS_SYMBOL_SUFFIX', '64_'), ('HAVE_BLAS_ILP64', None)]
    runtime_library_dirs = ['/usr/local/lib']
openblas64__lapack_info:
    libraries = ['openblas64_', 'openblas64_']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None), ('BLAS_SYMBOL_SUFFIX', '64_'), ('HAVE_BLAS_ILP64', None), ('HAVE_LAPACKE', None)]
    runtime_library_dirs = ['/usr/local/lib']
lapack_ilp64_opt_info:
    libraries = ['openblas64_', 'openblas64_']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None), ('BLAS_SYMBOL_SUFFIX', '64_'), ('HAVE_BLAS_ILP64', None), ('HAVE_LAPACKE', None)]
    runtime_library_dirs = ['/usr/local/lib']
Supported SIMD extensions in this NumPy install:
    baseline = SSE,SSE2,SSE3
    found = SSSE3,SSE41,POPCNT,SSE42,AVX,F16C,FMA3,AVX2
    not found = AVX512F,AVX512CD,AVX512_KNL,AVX512_KNM,AVX512_SKX,AVX512_CLX,AVX512_CNL,AVX512_ICL
So my numpy seems to be using OpenBLAS and not MKL. Next I tried this code, which I found online:
def disable_openblas_threading():
    """
    A convenience function for turning off openblas threading to avoid costly overhead.
    Just setting the `OPENBLAS_NUM_THREADS` environment variable to `1` would be much simpler, but
    that only works if the user hasn't already imported `numpy`. This function attempts to use
    `ctypes` to load the OpenBLAS library and access the `openblas_set_num_threads` function, which
    will work even if the user already imported numpy or scipy.
    """
    import numpy as np
    import ctypes
    from ctypes.util import find_library
    try:
        np_lib_dir = np.__config__.__dict__['openblas_info']['library_dirs'][0]
    except KeyError:
        np_lib_dir = None
    try_paths = ['{}/libopenblas.so'.format(np_lib_dir),
                 '{}/libopenblas.dylib'.format(np_lib_dir),
                 '/opt/OpenBLAS/lib/libopenblas.so',
                 '/lib/libopenblas.so',
                 '/usr/lib/libopenblas.so.0',
                 find_library('openblas')]
    openblas_lib = None
    for path in try_paths:
        try:
            openblas_lib = ctypes.cdll.LoadLibrary(path)
        except OSError:
            continue
    try:
        openblas_lib.openblas_set_num_threads(1)
    except AttributeError:
        raise EnvironmentError('Could not locate an OpenBLAS shared library', 2)
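One possible reason this helper has no effect here: it looks up the key `openblas_info` and the file name `libopenblas.so`, while the config output above lists `openblas64__info` and the library name `openblas64_`. The sketch below illustrates that mismatch on a plain dict filled with values copied from that output. The dict layout and the `candidate_paths` helper are assumptions made for illustration, not NumPy's internal representation. With `BLAS_SYMBOL_SUFFIX` set to `64_`, the exported function may also carry the suffix (for example `openblas_set_num_threads64_`), which is likewise an assumption that the output does not confirm.

```python
# Values copied from the np.__config__.show() output above; the dict layout
# is a stand-in for illustration only.
config = {
    "openblas64__info": {
        "libraries": ["openblas64_", "openblas64_"],
        "library_dirs": ["/usr/local/lib"],
    },
}


def candidate_paths(config, keys=("openblas_info", "openblas64__info")):
    """Build candidate shared-library paths from whichever config keys exist."""
    paths = []
    for key in keys:
        info = config.get(key)
        if info is None:
            continue  # 'openblas_info' is absent in this ILP64 build
        for lib_dir in info["library_dirs"]:
            # dict.fromkeys drops the duplicate library name, keeping order
            for lib in dict.fromkeys(info["libraries"]):
                paths.append("{}/lib{}.so".format(lib_dir, lib))
    return paths


print(candidate_paths(config))
```

Checking only `openblas_info`, as the helper does, yields no path from the config at all, so it falls back to the hard-coded system locations.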
It still doesn't work :/.
Now I'm pretty much out of ideas. Do you have any suggestions?
One fact that may be relevant is that I'm working on a PC cluster, but I asked the people responsible for managing the cluster and they said it shouldn't affect this.
I never get any error messages; my process simply uses too many threads.
I'm using numpy 1.22.1.
Edit: I count the threads by looking at htop before and after starting the script.
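The same count can also be taken from inside the script, which makes it easier to compare before and after importing numpy. A minimal sketch, assuming a Linux system where `/proc` is available; `native_thread_count` is a hypothetical helper name, and it returns `None` where `/proc` cannot be read:

```python
import os
import threading


def native_thread_count(pid="self"):
    """Count OS-level threads of a process via /proc (Linux only).

    Returns None if the /proc entry cannot be read.
    """
    task_dir = "/proc/{}/task".format(pid)
    try:
        # Each entry in the task directory corresponds to one native thread.
        return len(os.listdir(task_dir))
    except OSError:
        return None


print("OS threads:", native_thread_count())
# threading.active_count() only sees threads started through Python,
# so it does not include threads created by a BLAS library.
print("Python threads:", threading.active_count())
```

Calling `native_thread_count()` once before `import numpy` and once after shows whether the extra threads appear at import time.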
Update: Using threadpoolctl solved the problem :), i.e.
import numpy as np
from threadpoolctl import threadpool_limits
with threadpool_limits(limits=1, user_api='blas'):
    while True:
        np.linalg.inv(np.identity(100))
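To see what the limit actually changes, threadpoolctl can also report the native libraries it detects in the process. A minimal sketch, assuming numpy and threadpoolctl are installed; `blas_thread_counts` is a hypothetical helper name, and the single matrix inversion replaces the endless loop so the script terminates:

```python
import numpy as np
from threadpoolctl import threadpool_info, threadpool_limits


def blas_thread_counts():
    """Map each detected BLAS library file to its configured thread count."""
    return {
        lib["filepath"]: lib["num_threads"]
        for lib in threadpool_info()
        if lib["user_api"] == "blas"
    }


before = blas_thread_counts()
with threadpool_limits(limits=1, user_api="blas"):
    inside = blas_thread_counts()
    np.linalg.inv(np.identity(100))

print("before:", before)
print("inside:", inside)
```

The reported file path also shows which BLAS build numpy really loaded. `threadpool_limits` can also be called as a plain function, without `with`, in which case the limit stays in place instead of being restored at the end of a block.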
Thanks.