-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
gh-75880: Deadlocks in concurrent.futures.ProcessPoolExecutor
with unpickling error
#4256
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
f72d0fd
to
51b411d
Compare
The windows failures are unrelated to this PR ( |
51b411d
to
4cf2a47
Compare
4cf2a47
to
a70485b
Compare
a70485b
to
b7422fe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't what the best solution is but this PR should not be merged while the following regression is not addressed.
work_id, exception=_ExceptionWithTraceback(e, e.__traceback__) | ||
)) | ||
result_queue._put_bytes(work_id.to_bytes(WORK_ID_SIZE, WORK_ID_ENC) + | ||
serialize_res) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This bytes concatenation triggers a potentially very large memory copy which is a regression compared to master.
There are two possible solutions to this problem:
- Add a
send_byte_chunks
API down to the Connection that would make it possible to send a topple of bytes without materializing the full buffer. We don't know if it is possible on windows. - Implement a new Queue sub class to send and receive 2 messages at a time. This would introduce a small call overhead, potentially critical for small messages.
concurrent.futures.ProcessPoolExecutor
with unpickling errorconcurrent.futures.ProcessPoolExecutor
with unpickling error
https://bugs.python.org/issue31699 is closed. What is the status of this PR? |
This PR is stale because it has been open for 30 days with no activity. |
concurrent.futures.ProcessPoolExecutor
with unpickling errorconcurrent.futures.ProcessPoolExecutor
with unpickling error
This PR is stale because it has been open for 30 days with no activity. |
When using
concurrent.futures.ProcessPoolExecutor
with objects that are not picklable or unpicklable, several situations results in a deadlock, with the interpreter frozen.This is the case for different scenarios, for instance, https://gist.github.com/tomMoral/cc27a938d669edcf0286c57516942369. This PR is a follow up of #3895 and specifically fixes the unpickling behavior. With this PR, the unpickling failures do not result in broken executors. This is done by ensuring that the
_callItem
and the_ResultItem
are sent in queues along the work_id with a specific communication protocol, which sends thework_id
as bytes, followed by the object serialized bypickle
.To this end, we introduce private API in
multiprocessing.Queue
to allow modified serialization protocols.Overall, the goal is to make
concurrent.futures.ProcessPoolExecutor
more robust to faulty user code.This work was done as part of the tommoral/loky#48 with the intent to re-use the executor in multiple independent parts of the program, in collaboration with @ogrisel. See #1013 for more the details.
https://bugs.python.org/issue31699
concurrent.futures.ProcessPoolExecutor
with pickling error #75880