3.5 Concurrency Models
The various concurrency models in Python have strengths and weaknesses, some of which overlap, and there is some overlap in their use cases as well. Even so, there's a meaningful gap that subinterpreters can fill.
Threads
Multiprocessing
IPC / Sockets
"async"
Subinterpreters
Threads

Threads have been a part of Python for most of its existence. They're also a widely used feature in the software industry as a whole.
Strengths:
- familiar to most software developers
- well-studied in computer science
- relatively simple to use (at least at a small scale)
- efficient with memory
- new threads start relatively quickly
Weaknesses:
- greatly impacted by the GIL
- signal-handling (e.g. Ctrl-C)
- easy to blindly share state, exposing yourself or others to later suffering
- easy to clobber data in other threads
- hard to correlate execution in one thread to side-effects in another (i.e. cause & effect, "spooky action at a distance")
- race conditions (see the sketch after this list)
- many threading-specific problems are hard to debug
- tracebacks are decoupled from where the thread was created/started
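The race-condition and data-clobbering weaknesses above are easy to demonstrate. A minimal, self-contained sketch (how many updates get lost varies by platform and CPython version):

```python
import threading

counter = 0

def work(n):
    global counter
    for _ in range(n):
        counter += 1  # read-modify-write: not atomic

threads = [threading.Thread(target=work, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Expected 400000, but a thread switch between the load and the store
# clobbers another thread's update, so the total can come up short.
print(counter)
```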
Use Cases:
- ...
Prior Art:
- ...
Other observations:
- coming from other languages, folks reach for threads
- in data science (e.g. Dask), threshold where MPI (distributed) makes more sense than threads is ~3?
Multiprocessing

In general, the concept of code working across multiple processes is fairly well established. Here, however, we're looking at the "multiprocessing" module specifically. The module supports the following start methods (a sketch of selecting one explicitly follows the list):
- fork
- uses os.fork()
- supports COW ("Copy-on-Write")
- platform:
- not supported on Windows
- the default on *nix
- resources:
- startup: relatively fast (mostly just worker initialization)
- FDs/handles: all inherited
- initial state (modules, stack, closures, etc.): copied
- forkserver
- uses a (single-threaded) daemon process which is forked for each requested process
- new in 3.4
- platform:
- not supported on Windows
- must support passing FDs over a Unix pipe
- resources:
- startup: first time relatively slow, otherwise relatively fast (mostly just worker initialization)
- FDs/handles: none inherited
- initial state (modules, stack, closures, etc.): not copied
- spawn
- creates a new Python process for each requested process
- new (on *nix) in 3.4
- platform:
- supported on Windows and *nix
- default on Windows
- (more-or-less) the pre-3.4 behavior on Windows
- resources:
- startup: relatively slow (full interpreter overhead + worker initialization)
- FDs/handles: none inherited
- initial state (modules, stack, closures, etc.): not copied
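For illustration, a minimal sketch of selecting each start method explicitly via multiprocessing.get_context() (the worker function and its output are just placeholders):

```python
import multiprocessing as mp

def worker(method):
    print(f"started via {method!r}")

if __name__ == "__main__":  # required under spawn/forkserver
    for method in ("fork", "forkserver", "spawn"):
        try:
            ctx = mp.get_context(method)  # ValueError if unavailable here
        except ValueError:
            print(f"{method!r} is not supported on this platform")
            continue
        p = ctx.Process(target=worker, args=(method,))
        p.start()
        p.join()
```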
Strengths:
- processes are strongly isolated by the operating system
- can leverage OS-level concurrency tools & techniques
- startup is relatively fast on *nix (if using fork)
- can utilize multiple cores
- extension modules are completely compatible (and isolated between processes)
Weaknesses:
- relatively steep learning curve
- many caveats
- relatively complex API
- somewhat platform-dependent
- as Nick Coghlan put it: "Basically, obtaining good performance out of a full multiprocessing model is a cross-platform compatibility nightmare that requires spectacularly deep knowledge of platform internals when anything goes wrong, so folks that don't see anything wrong with the status quo tend to be those that either aren't pushing multiprocessing near any of its multitude of edge cases, or else have the privilege of personally only needing to worry about platforms where they already have a strong understanding of the underlying primitives."
- large data -> expensive serialization (PEP 574?)
- fork start -> much undefined behavior
- fork + threads/FDs/etc. don't mix well
- fork: COW benefits are mostly eliminated due to refcounting (see gc.freeze() in 3.7, and the sketch after this list)
- hard for debuggers to deal with
- communication, sharing, and synchronization between processes is relatively inefficient
- pickle
- ...
- implementation has a heavy maintenance burden
- multiple processes have a lot more system resource overhead than 1 process with multiple threads
- ...particularly with the extra resources needed by the multiprocessing module
- managing multiple processes has a higher devops cost than 1 process with multiple threads
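On the COW point above: since 3.7, gc.freeze() can preserve some page sharing by moving existing objects into a permanent generation that the collector never rewrites (refcount updates still dirty pages, which is why the benefit is only partial). A rough sketch of the documented usage pattern, assuming a POSIX fork-based worker:

```python
import gc
import os

# Build any large, effectively read-only state before forking.
big_table = {i: str(i) for i in range(1_000_000)}

gc.disable()  # avoid a collection between here and fork()
gc.freeze()   # move all tracked objects to the permanent generation

pid = os.fork()
if pid == 0:
    # Child: the collector won't rewrite GC headers on frozen objects,
    # so their pages stay shared until refcounts or mutation touch them.
    gc.enable()
    _ = big_table[123]
    os._exit(0)
else:
    os.waitpid(pid, 0)
    gc.unfreeze()
    gc.enable()
```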
Use Cases:
- ...
Prior Art:
- ...
Other observations:
- to folks coming from other languages, multiprocessing seems like overkill (or simply a bad fit)
- in data science, multiprocessing doesn't really factor in (they either use threads or MPI)
- the multiprocessing module should be able to support subinterpreters without much trouble
- if there had been a stdlib module for subinterpreters, then the multiprocessing module might not exist
IPC / Sockets

This really breaks down into 2 models, though most solutions can handle both:
- interaction on the same host
- interaction between two hosts

For the former, think of .... For the latter, think of MPI or web servers. (A minimal same-host sketch follows.)
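As a same-host illustration (an assumption for this writeup, not from the original notes): a connected pair of Unix-domain sockets shared across fork(); the same send/recv calls would work between hosts with an AF_INET socket instead:

```python
import os
import socket

parent_sock, child_sock = socket.socketpair()  # connected AF_UNIX pair

if os.fork() == 0:
    # Child process: each side uses its end of the pair.
    parent_sock.close()
    child_sock.sendall(b"hello over IPC")
    child_sock.close()
    os._exit(0)

child_sock.close()
print(parent_sock.recv(1024))  # b'hello over IPC'
parent_sock.close()
```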
Strengths:
- ...
Weaknesses:
- ...
Use Cases:
- ...
Prior Art:
- ...
Other observations:
- ...
"async"

- asyncio + async/await syntax
- coroutines (see the sketch below)
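A minimal sketch of the model: many coroutines interleave cooperatively on a single thread, yielding control at each await (the sleeps stand in for real I/O):

```python
import asyncio

async def fetch(name, delay):
    await asyncio.sleep(delay)  # stand-in for network I/O
    return f"{name}: done after {delay}s"

async def main():
    # Both coroutines make progress concurrently on one thread.
    results = await asyncio.gather(fetch("a", 0.2), fetch("b", 0.1))
    print(results)

asyncio.run(main())
```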
Strengths:
- ...
Weaknesses:
- ...
Use Cases:
- ...
Prior Art:
- ...
Other observations:
- ...
Subinterpreters

- CSP (message passing); a sketch follows below
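For flavor, here's the basic example from the stdlib module proposed in PEP 554 (still a proposal, so the final names and signatures may differ):

```python
# Based on the API proposed in PEP 554; not yet in the stdlib,
# so treat the names here as provisional.
import interpreters

interp = interpreters.create()  # a new, isolated subinterpreter
print('before')
interp.run('print("during")')   # runs in the subinterpreter
print('after')
```

PEP 554 also proposes channels for sending data between interpreters, which is what would make the CSP-style message passing above possible.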
Strengths:
- ...
Weaknesses:
- ...
Use Cases:
- ...
Prior Art:
- ...
Other observations:
- ...