BUG: np.dot is not thread-safe with OpenBLAS #11046
Comments
Recurrence of #4813; there the issue was solved by upgrading to OpenBLAS 0.2.9.
With multiprocessing everything is fine; this is about multi-threading. I think it's a different issue.
Could anyone reproduce this issue?
Yes, reproducible on Fedora 28 + OpenBLAS 0.2.20; it didn't seem to occur with ATLAS.
There is some experimenting going on with the OpenBLAS versions. It would be good to have a test for this, probably in
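For illustration, a sketch of what such a test could look like (hypothetical, not an actual test from the numpy test suite; the array sizes, thread count and iteration count are arbitrary):

```python
# Hypothetical sketch of a regression test for np.dot thread-safety; not an
# actual test from the numpy test suite, and the function name is made up.
import threading
import numpy as np

def test_dot_is_thread_safe():
    rng = np.random.RandomState(42)
    a = rng.rand(200, 200)
    b = rng.rand(200, 200)
    expected = np.dot(a, b)  # reference result computed serially
    failures = []

    def worker():
        for _ in range(50):
            if not np.allclose(np.dot(a, b), expected):
                failures.append(True)

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    assert not failures, "np.dot returned wrong results under threading"
```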
@artemru, did you report this bug to the OpenBLAS developers? If so, what is the URL of the report?
@ogrisel, indeed it looks like a pure OpenBLAS issue. I did not report it to the OpenBLAS developers (lack of time, and I'm not fluent in C++). Yet it's not clear to me whether OpenBLAS guarantees thread-safety (I've just looked at https://github.com/xianyi/OpenBLAS/wiki/faq).
This issue is troubling me, but I am not quite sure how it can be solved. Possibly we can push OpenBLAS to fix the big bugs? Even then it would be annoying if other BLAS implementations are also not thread-safe (by default). I don't like hacks, but for this I don't mind whether the solution is seriously ugly or not; I would just prefer that there is one at all.
I think we should work with upstream OpenBLAS to make it thread-safe.
It would also be interesting to try to build OpenBLAS with OpenMP instead of its internal libpthread backend and check if the race condition reported by @artemru still happens in this case. OpenMP runtimes are thread-safe by design (I believe), so it's likely that this would fix the issue reported by @artemru. In the past @matthew-brett decided to build the OpenBLAS included in the numpy & scipy wheels with the libpthread backend instead of OpenMP so as to avoid the fork-safety issues of the GCC implementation of the OpenMP runtime, GOMP. @njsmith submitted a patch to the GOMP developers to make it fork-safe, but the review stalled: https://gcc.gnu.org/ml/gcc-patches/2014-02/msg00813.html, and C libraries that use OpenMP can still deadlock or crash Python programs that use multiprocessing with the fork start method. Nowadays I suspect that OpenBLAS could be built with OpenMP using clang so as to avoid running into the GOMP fork-safety limitations. clang / llvm use the implementation of the OpenMP runtime open-sourced by Intel, and as far as I know it is fork-safe. Edit: the thread-safety issue in OpenBLAS is apparently unrelated to its threading backend (pthread vs OpenMP), as it also occurs when OpenBLAS is compiled with the single-thread-mode flag (OpenMathLib/OpenBLAS#1844).
Note that current NumPy wheels are linked with OpenBLAS 0.3.0.
It's also reproducible with OpenBLAS 0.3.0.
Just to note, I have opened an issue at OpenBLAS (OpenMathLib/OpenBLAS#1844) to hopefully discuss things there. Since I do not know the technical details here, any continuation of the discussion there would be very welcome. For all I know right now, this seems like a high-priority issue to me (it also happens by default on Linux systems when OpenBLAS is used), and if we can provide some help to OpenBLAS that would be good. As far as I can see, downstream users have no reason to suspect such issues, and it seems like this could randomly, once in a while, create incorrect results (frankly, I might suspect such issues, but half the people who work in a similar environment to mine are probably not even aware that OpenBLAS uses threading).
There might be a need to hold the GIL for some LAPACK/BLAS implementations if they cannot promise thread safety. Unfortunately we do not have a way to query, at runtime, which implementation we are using; see issue #11826.
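(For what it's worth, the third-party threadpoolctl package can inspect which BLAS is actually loaded at runtime; a rough sketch, assuming that package is installed, and not something numpy itself provides:)

```python
# Sketch: inspect which BLAS is loaded at runtime using the third-party
# threadpoolctl package (not part of numpy; shown only as an illustration).
from threadpoolctl import threadpool_info

for lib in threadpool_info():
    # Each entry describes one thread-pool-enabled library found in the
    # process, e.g. internal_api "openblas" or "mkl", plus version and
    # current thread count.
    print(lib["internal_api"], lib.get("version"), lib["num_threads"])
```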
I think it's better to work with upstream to ensure that they are all thread-safe. MKL is thread-safe, and OpenBLAS can probably be fixed. I don't know about BLIS, but I would believe so.
Well, maybe we can add code such as the one you linked to in order to change the number of threads. If numpy knows the BLAS implementation, it should release the GIL, but it could refuse to release the GIL if it sees one it does not recognize. For OpenBLAS and the typical implementations it should of course rather be fixed.
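(As a rough illustration of changing the number of BLAS threads from Python, here using the third-party threadpoolctl package rather than the code linked above:)

```python
# Sketch: temporarily force the loaded BLAS to a single thread around a call,
# using the third-party threadpoolctl package (not the code linked above).
import numpy as np
from threadpoolctl import threadpool_limits

a = np.random.rand(500, 500)
with threadpool_limits(limits=1, user_api="blas"):
    # Inside this block OpenBLAS (or MKL) runs np.dot with a single thread.
    c = np.dot(a, a)
```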
My impression is that, for reasonable behavior with multithreading, the thread server (
Despite not affecting correctness, those problems lead to worse performance than one can reasonably expect --- than, say, if each main thread spawned its own n-1 worker threads, or if they shared the same n-1 worker threads in a reasonable way. And then there are some one-off things that become outright bugs in a multithreaded setting, like this global buffer, and some other bug I don't yet understand that happens with this code snippet. This last one has been frustrating me for quite a while.
Would you consider opening an OpenBLAS issue on GitHub, to give a home for the discussion?
There is already an OpenBLAS issue (OpenMathLib/OpenBLAS#1844), and I have been trying to discuss it there for a while. I decided to escape here for my mental health.
OpenBLAS fixed OpenMathLib/OpenBLAS#1844.
I guess we can close this, since OpenBLAS is fixed and we are making sure to link a newer version (we even point it out in the release notes).
I'm using numpy (1.14.1) linked against OpenBLAS 0.2.18, and it looks like np.dot (which uses the dgemm routine from OpenBLAS) is not thread-safe. I don't know if this kind of behavior is expected; is it a numpy bug or rather an OpenBLAS bug?

Note: (export OPENBLAS_NUM_THREADS=1)

Some extra info if needed:
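The original snippet is not reproduced above; what follows is a minimal sketch of the kind of reproducer described in the report (the array shapes, thread count and iteration count are illustrative, not the original code):

```python
# Minimal sketch of the kind of reproducer described in the report; shapes,
# thread count and iteration count are illustrative, not the original snippet.
from concurrent.futures import ThreadPoolExecutor
import numpy as np

a = np.random.rand(1000, 100)
b = np.random.rand(100, 100)
expected = a.dot(b)  # reference result computed from a single thread

def check(_):
    # Each task recomputes the product concurrently and compares it to the
    # serial reference; with an affected OpenBLAS build this can sometimes fail.
    return np.allclose(a.dot(b), expected)

with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(check, range(200)))

print(all(results))  # expected to be True with a thread-safe BLAS
```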