3.5 Concurrency Models
The various concurrency models in Python have strengths and weaknesses, some of which overlap, and there is some overlap in their use cases as well. Even so, there's a meaningful gap that subinterpreters can fill.
Threads
Multiprocessing
IPC / Sockets
"async"
Subinterpreters
Threads

Threads have been a part of Python for most of its existence. They're also a widely used feature in the software industry as a whole.
Strengths:
- familiar to most software developers
- well-studied in computer science
- relatively simple to use (at least at a small scale)
- efficient with memory
- new threads start relatively quickly
Weaknesses:
- greatly impacted by the GIL
- signal-handling (e.g. Ctrl-C)
- easy to blindly share state, exposing yourself or others to later suffering
- easy to clobber data in other threads
- hard to correlate execution in one thread to side-effects in another (i.e. cause & effect, "spooky action at a distance")
- race conditions (see the sketch after this list)
- many threading-specific problems are hard to debug
- tracebacks are decoupled from where the thread was created/started
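The race-condition and data-clobbering weaknesses above are easy to demonstrate. A minimal, self-contained sketch (how many updates get lost varies by platform and CPython version):

```python
import threading

counter = 0

def work(n):
    global counter
    for _ in range(n):
        counter += 1  # read-modify-write: not atomic

threads = [threading.Thread(target=work, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Expected 400000, but a thread switch between the load and the store
# clobbers another thread's update, so the total can come up short.
print(counter)
```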
Use Cases:
- ...
Prior Art:
- ...
Other observations:
- coming from other languages, folks reach for threads
- in data science (e.g. Dask), threshold where MPI (distributed) makes more sense than threads is ~3?
Multiprocessing

In general, the concept of code working across multiple processes is fairly well established. Here, however, we're looking at the "multiprocessing" module specifically. The module supports the following start methods (a sketch of selecting one explicitly follows the list):
- fork
- uses os.fork()
- supports COW ("Copy-on-Write")
- platform:
- not supported on Windows
- the default on *nix
- resources:
- startup: relatively fast (mostly just worker initialization)
- FDs/handles: all inherited
- initial state (modules, stack, closures, etc.): copied
- forkserver
- uses a (single-threaded) daemon process which is forked for each requested process
- new in 3.4
- platform:
- not supported on Windows
- must support passing FDs over a Unix pipe
- resources:
- startup: first time relatively slow, otherwise relatively fast (mostly just worker initialization)
- FDs/handles: none inherited
- initial state (modules, stack, closures, etc.): not copied
- spawn
- creates a new Python process for each requested process
- new (on *nix) in 3.4
- platform:
- supported on Windows and *nix
- default on Windows
- (more-or-less) the pre-3.4 behavior on Windows
- resources:
- startup: relatively slow (full interpreter overhead + worker initialization)
- FDs/handles: none inherited
- initial state (modules, stack, closures, etc.): not copied
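For illustration, a minimal sketch of selecting each start method explicitly via multiprocessing.get_context() (the worker function and its output are just placeholders):

```python
import multiprocessing as mp

def worker(method):
    print(f"started via {method!r}")

if __name__ == "__main__":  # required under spawn/forkserver
    for method in ("fork", "forkserver", "spawn"):
        try:
            ctx = mp.get_context(method)  # ValueError if unavailable here
        except ValueError:
            print(f"{method!r} is not supported on this platform")
            continue
        p = ctx.Process(target=worker, args=(method,))
        p.start()
        p.join()
```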
Strengths:
- processes are strongly isolated by the operating system
- can leverage OS-level concurrency tools & techniques
- startup is relatively fast on *nix (if using fork)
- can utilize multiple cores
- extension modules are completely compatible (and isolated between processes)
Weaknesses:
- relatively steep learning curve
- many caveats
- relatively complex API
- somewhat platform-dependent
- as Nick Coghlan put it: "Basically, obtaining good performance out of a full multiprocessing model is a cross-platform compatibility nightmare that requires spectacularly deep knowledge of platform internals when anything goes wrong, so folks that don't see anything wrong with the status quo tend to be those that either aren't pushing multiprocessing near any of its multitude of edge cases, or else have the privilege of personally only needing to worry about platforms where they already have a strong understanding of the underlying primitives."
- large data -> expensive serialization (PEP 574?)
- fork start -> much undefined behavior
- fork + threads/FDs/etc. don't mix well
- fork: COW benefits are mostly eliminated due to refcounting (see gc.freeze() in 3.7, and the sketch after this list)
- hard for debuggers to deal with
- communication, sharing, and synchronization between processes is relatively inefficient
- pickle
- ...
- implementation has a heavy maintenance burden
- multiple processes have a lot more system resource overhead than 1 process with multiple threads
- ...particularly with the extra resources needed by the multiprocessing module
- managing multiple processes has a higher devops cost than 1 process with multiple threads
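On the COW point above: since 3.7, gc.freeze() can preserve some page sharing by moving existing objects into a permanent generation that the collector never rewrites (refcount updates still dirty pages, which is why the benefit is only partial). A rough sketch of the documented usage pattern, assuming a POSIX fork-based worker:

```python
import gc
import os

# Build any large, effectively read-only state before forking.
big_table = {i: str(i) for i in range(1_000_000)}

gc.disable()  # avoid a collection between here and fork()
gc.freeze()   # move all tracked objects to the permanent generation

pid = os.fork()
if pid == 0:
    # Child: the collector won't rewrite GC headers on frozen objects,
    # so their pages stay shared until refcounts or mutation touch them.
    gc.enable()
    _ = big_table[123]
    os._exit(0)
else:
    os.waitpid(pid, 0)
    gc.unfreeze()
    gc.enable()
```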
Use Cases:
- ...
Prior Art:
- ...
Other observations:
- to folks coming from other languages, multiprocessing seems like overkill (or simply a bad fit)
- in data science, multiprocessing doesn't really factor in (they either use threads or MPI)
- the multiprocessing module should be able to support subinterpreters without much trouble
- if there had been a stdlib module for subinterpreters, then the multiprocessing module might not exist
IPC / Sockets

This really breaks down into 2 models, though most solutions can handle both:
- interaction on the same host
- interaction between two hosts

For the former, think of .... For the latter, think of MPI or web servers. (A minimal same-host sketch follows.)
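As a same-host illustration (an assumption for this writeup, not from the original notes): a connected pair of Unix-domain sockets shared across fork(); the same send/recv calls would work between hosts with an AF_INET socket instead:

```python
import os
import socket

parent_sock, child_sock = socket.socketpair()  # connected AF_UNIX pair

if os.fork() == 0:
    # Child process: each side uses its end of the pair.
    parent_sock.close()
    child_sock.sendall(b"hello over IPC")
    child_sock.close()
    os._exit(0)

child_sock.close()
print(parent_sock.recv(1024))  # b'hello over IPC'
parent_sock.close()
```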
Strengths:
- ...
Weaknesses:
- ...
Use Cases:
- ...
Prior Art:
- ...
Other observations:
- ...
"async"

- asyncio + async/await syntax
- coroutines (see the sketch below)
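A minimal sketch of the model: many coroutines interleave cooperatively on a single thread, yielding control at each await (the sleeps stand in for real I/O):

```python
import asyncio

async def fetch(name, delay):
    await asyncio.sleep(delay)  # stand-in for network I/O
    return f"{name}: done after {delay}s"

async def main():
    # Both coroutines make progress concurrently on one thread.
    results = await asyncio.gather(fetch("a", 0.2), fetch("b", 0.1))
    print(results)

asyncio.run(main())
```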
Strengths:
- ...
Weaknesses:
- ...
Use Cases:
- ...
Prior Art:
- ...
Other observations:
- ...
Subinterpreters

- CSP (message passing); a sketch follows below
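For flavor, here's the basic example from the stdlib module proposed in PEP 554 (still a proposal, so the final names and signatures may differ):

```python
# Based on the API proposed in PEP 554; not yet in the stdlib,
# so treat the names here as provisional.
import interpreters

interp = interpreters.create()  # a new, isolated subinterpreter
print('before')
interp.run('print("during")')   # runs in the subinterpreter
print('after')
```

PEP 554 also proposes channels for sending data between interpreters, which is what would make the CSP-style message passing above possible.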
Strengths:
- ...
Weaknesses:
- ...
Use Cases:
- ...
Prior Art:
- ...
Other observations:
- ...