Skip to content

bpo-33056 FIX leaking fd in concurrent.futures.ProcessPoolExecutor #6084

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 12, 2018

Conversation

tomMoral
Copy link
Contributor

@tomMoral tomMoral commented Mar 12, 2018

The recent changes introduced by #3895 leaks some file descriptors (the Pipe open in _ThreadWakeup, see here).
They should be properly closed at shutdown.

https://bugs.python.org/issue33056

@tomMoral tomMoral changed the title FIX leaking fd in concurrent.futures.ProcessPoolExecutor bpo-33056 FIX leaking fd in concurrent.futures.ProcessPoolExecutor Mar 12, 2018
@asvetlov
Copy link
Contributor

Please add a NEWS entry

@asvetlov asvetlov merged commit 095ee41 into python:master Mar 12, 2018
@bedevere-bot
Copy link

@asvetlov: Please replace # with GH- in the commit message next time. Thanks!

@tomMoral
Copy link
Contributor Author

Thank you for the fast review! I will try to use GH- next time.

@tomMoral tomMoral deleted the FIX_leaking_threadwakeup_files branch March 12, 2018 18:05
@asvetlov
Copy link
Contributor

@tomMoral sure.
It's my bad, committer should edit commit message always.
Unfortunately it cannot be done by robot, thus we have too many messages like this from bot.
Very common problem for core devs

@miss-islington
Copy link
Contributor

Thanks @tomMoral for the PR, and @asvetlov for merging it 🌮🎉.. I'm working now to backport this PR to: 3.7.
🐍🍒⛏🤖

@bedevere-bot
Copy link

GH-6092 is a backport of this pull request to the 3.7 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Mar 12, 2018
pitrou pushed a commit that referenced this pull request Mar 13, 2018
@rad-pat
Copy link

rad-pat commented Dec 11, 2019

@tomMoral, I believe that either this change and/or #3895 causes OSError: handle is closed when shutting down ProcessPoolExecutor with wait=False due to self._queue_management_thread_wakeup being closed, but still referenced in _threads_wakeups on exit of Python, e.g. with script:

from concurrent.futures import ProcessPoolExecutor


class ObjectWithPickleError():
    """Triggers a RuntimeError when sending job to the workers"""

    def __reduce__(self):
        raise RuntimeError()


if __name__ == "__main__":
    e = ProcessPoolExecutor()
    f = e.submit(id, ObjectWithPickleError())
    e.shutdown(wait=False)
    # f.result()  # Deadlock on get

I'm not sure what the right solution is here, obviously this change is to prevent leaking the file descriptors which are taken care of in the close(), but perhaps the _python_exit method needs a review. Hopefully you'll have some insight please?

@tomMoral
Copy link
Contributor Author

@rad-pat Thanks a lot for the reproducing script!

there is 2 issues here:

  • The first one is the one I described and tried to fix in bpo-39104: Fix hanging ProcessPoolExecutor on shutdown nowait with pickling failure #17670. The queue_management_thread should handle all the closing and the wakeup method should be robust to closed fd as there might be concurent events that make the executor shutdown. This explains the hang when using the f.result
  • The second one is due to the fact that on_queue_feeder_error is ignored when the interpreter is closing. This means that the queue_management_thread will never be notified of the fact that the pending task it is waiting already failed. For that, I don't really know what is the right solution.

rad-pat pushed a commit to rad-pat/cpython that referenced this pull request Feb 14, 2020
miss-islington pushed a commit that referenced this pull request Feb 16, 2020
…ckling failure (GH-17670)

As reported initially by @rad-pat in #6084, the following script causes a deadlock.

```
from concurrent.futures import ProcessPoolExecutor


class ObjectWithPickleError():
    """Triggers a RuntimeError when sending job to the workers"""

    def __reduce__(self):
        raise RuntimeError()


if __name__ == "__main__":
    e = ProcessPoolExecutor()
    f = e.submit(id, ObjectWithPickleError())
    e.shutdown(wait=False)
    f.result()  # Deadlock on get
```

This is caused by the fact that the main process is closing communication channels that might be necessary to the `queue_management_thread` later. To avoid this, this PR let the `queue_management_thread` manage all the closing.



https://bugs.python.org/issue39104



Automerge-Triggered-By: @pitrou
colesbury referenced this pull request in colesbury/nogil Oct 6, 2021
…ckling failure (GH-17670)

As reported initially by @rad-pat in #6084, the following script causes a deadlock.

```
from concurrent.futures import ProcessPoolExecutor


class ObjectWithPickleError():
    """Triggers a RuntimeError when sending job to the workers"""

    def __reduce__(self):
        raise RuntimeError()


if __name__ == "__main__":
    e = ProcessPoolExecutor()
    f = e.submit(id, ObjectWithPickleError())
    e.shutdown(wait=False)
    f.result()  # Deadlock on get
```

This is caused by the fact that the main process is closing communication channels that might be necessary to the `queue_management_thread` later. To avoid this, this PR let the `queue_management_thread` manage all the closing.



https://bugs.python.org/issue39104



Automerge-Triggered-By: @pitrou
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants