Skip to content

[2.7] bpo-34110: Fix module load problem of cPickle in case of multithreading #8276

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

sangongs
Copy link

@sangongs sangongs commented Jul 13, 2018

In Python 2.7, sys.modules may contain a partially loaded module. This is caused by the following logic:

cpython/Python/import.c

Lines 721 to 727 in a45fa39

PyObject *
PyImport_ExecCodeModuleEx(char *name, PyObject *co, char *pathname)
{
PyObject *modules = PyImport_GetModuleDict();
PyObject *m, *d, *v;
m = PyImport_AddModule(name);

When executing the code of a module, an empty module is added into the module dict. Thus if any code in the module releases the GIL, another thread may get a partially loaded module from sys.modules. Unfortunately, cPickle checks sys.modules before trying to import and may get an incomplete module, leading to an AttributeError when trying to obtain an object of the module.

This fix acquires import lock before checking sys.modules, make sure there is no partially loaded module.

To reproduce, first create a foo.py:

import time
time.sleep(0.1)
class foo():
    pass

Then create and run main.py:

import threading
import cPickle 

threads = [threading.Thread(target=cPickle.loads, args=('cfoo\nfoo\np0\n.',)) for _ in range(2)]
[thread.start() for thread in threads]
[thread.join() for thread in threads]

The following error should be raised:

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
AttributeError: 'module' object has no attribute 'foo'

https://bugs.python.org/issue34110

@the-knights-who-say-ni
Copy link

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately our records indicate you have not signed the CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

When your account is ready, please add a comment in this pull request
and a Python core developer will remove the CLA not signed label
to make the bot check again.

Thanks again for your contribution, we look forward to reviewing it!

@sangongs sangongs force-pushed the fix_cpickle branch 2 times, most recently from f4b6047 to 53a5943 Compare July 13, 2018 19:13
methane added a commit to methane/cpython that referenced this pull request Jul 14, 2018
pythonGH-8276 fixed regression in 3.7.
While the regression is not in 3.6, it's worth to backport
test cases to 3.6 branch too.
@sangongs
Copy link
Author

I think my account is ready now. I see the star after my username.

@csabella
Copy link
Contributor

csabella commented Apr 9, 2019

Hi @sangongs, based on @pablogsal's comment on the bug tracker, would you be able to rebase this over the master branch instead of just on the 2.7 branch? Thanks!

@csabella
Copy link
Contributor

csabella commented Apr 9, 2019

@pitrou, as you made the last change to this code, would you be able to review this change? Thanks!

@sangongs
Copy link
Author

sangongs commented Apr 9, 2019

@pitrou Sure. I will try to rebase it. It may take some time since I am not familiar with the code base of Python 3.

BTW: The scripts to reproduce the problem in Python 3:

foo.py

import time
time.sleep(0.1)
class foo():
    pass

main.py

import threading
import _pickle as cPickle

threads = [threading.Thread(target=cPickle.loads, args=(b'cfoo\nfoo\np0\n.',)) for _ in range(10)]
[thread.start() for thread in threads]
[thread.join() for thread in threads]

@pitrou
Copy link
Member

pitrou commented Apr 9, 2019

This looks to be a duplicate of https://bugs.python.org/issue34572, which was already fixed in Python 3 (but not backported to Python 2.7). See PRs there.

@sangongs
Copy link
Author

sangongs commented Apr 9, 2019

Hmm. Yes, I tested with 3.7.3 and it is fixed.

@csabella
Copy link
Contributor

csabella commented Apr 9, 2019

Thanks @pitrou! Apologies for not seeing that other issue. Since PyImport_GetModule isn't in the 2.7 version, it's not possible to backport that fix as-is into 2.7. Do you think that the fix here would be to remove the PyDict_GetItem and just use PyImport_Import?

@pitrou
Copy link
Member

pitrou commented Apr 9, 2019

The PyDict_GetItem is an optimization. Removing it would probably make things slower - though whether or not it would be noticeable is an open question.

As for what a correct solution may look like, perhaps one of the import experts may want to answer - cc @brettcannon @ncoghlan .

@ncoghlan
Copy link
Contributor

I'm not sure what a correct Python 2 fix looks like, but I'll note that any Python 2.7 approach based on acquiring the import lock creates the risk of introducing deadlocks where none existed before: https://docs.python.org/2/library/threading.html#importing-in-threaded-code

We didn't have that problem in the Python 3.x fix, as all the versions where the fix was implemented already included the change to switch the import system over to using per-module locks.

@pitrou
Copy link
Member

pitrou commented Apr 10, 2019

Hmm, you're right, the risk of introducing a regression here is too important. And I don't know of a solution that would not require a lock (since we're talking about multi-threading synchronization).

@brettcannon
Copy link
Member

@pitrou @ncoghlan should we just close this then so as to not risk the deadlock?

@pitrou
Copy link
Member

pitrou commented Apr 11, 2019

There may be another possible approach, but I don't know if it fixes @sangongs 's issue:

    module = PySys_GetObject("modules");
    if (module == NULL)
        return NULL;

    module = PyDict_GetItem(module, py_module_name);  /* borrowed ref */
    if (module != NULL) {
        global = PyObject_GetAttr(module, py_global_name);
        if (global != NULL || !PyErr_ExceptionMatches(PyExc_AttributeError)) {
            return global;
        }
        PyErr_Clear();
    }
    /* Either module not found in sys.modules, or attribute lookup
     * in module failed.  In both cases, we retry after actually
     * importing the module
     */
    module = PyImport_Import(py_module_name);  /* new ref */
    if (!module)
        return NULL;
    global = PyObject_GetAttr(module, py_global_name);
    Py_DECREF(module);
    return global;

@ncoghlan
Copy link
Contributor

@pitrou Doesn't that still implicitly acquire the import lock inside the full import call, thus potentially deadlocking some cases that are currently getting by with a reference to a partially initialized module?

@pitrou
Copy link
Member

pitrou commented Apr 15, 2019

@ncoghlan Indeed, those cases currently raise an AttributeError but they might now deadlock.

@ncoghlan
Copy link
Contributor

@pitrou I was more concerned about the cases where the module is initialised well enough that all the attributes that the unpickling operation needs are already there, even though the module level code hasn't finished executing.

Any such code would be spectacularly fragile in general, but it would still be problematic to break it in a maintenance release.

@pitrou
Copy link
Member

pitrou commented Apr 15, 2019

If the attributes are already there then the first PyObject_GetAttr call above would succeed?

@ncoghlan
Copy link
Contributor

ncoghlan commented Apr 19, 2019

I now think you're right, so posting the argument where I was attempting to convince you I was right, but ended up convincing myself that you're right instead, and I had just misunderstood your proposal.

So +1 from me.


The import system adds the sys.modules entry before starting the module execution, but doesn't add the attribute on the parent package until the module execution finishes:

$ mkdir -p _example
$ echo "import _example; print(dir(_example))" > _example/parent_keys.py
$ python3 -c "import _example.parent_keys; print(dir(_example))"
['__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__']
['__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'parent_keys']

In current Python 2.7 code, the pickle module can access _example.parent_keys while it is being imported. If the keys it needs are already there, that's a good thing. If they're not there, it's a problem.

Illustrative example where I believe being unable to access a partially imported module could end up being problematic:

parent_pkg.module_a.py:

import pickle
import module_b
class ExampleClass(object):
    pass
# This works because module_a.ExampleClass already exists
module_b.send_data(pickle.dumps(ExampleClass()))

module_b.py:

import pickle
_received_objects = []

def send_data(data):
    _received_objects.append(pickle.loads(data))

As written, that code would be fine regardless of how the locking works, since the unpickle call happens with the import lock for module_a held no matter what.

In Python 2.7, it also works even if the unpickling happens in another thread, since pickle doesn't try to acquire the import lock, and the required module attributes are available before the potential unpickle is invoked.

In Python 3.7, there's technically a chance to deadlock with another thread, but only if module_a specifically is waiting for that thread to finish (which seems spectacularly unlikely).

With a global import lock, though, there would be many more ways to end up in a situation where the pickle.loads call ran with module_a was still being imported (so it fell through to the "try to acquire the import lock" code).

However, the key to your proposal is that the cases like my hypothetical one above won't fall through to the import lock, because the attributes they need will already be there. It's only the motivating cases that currently raise AttributeError that will risk deadlocking (but probably won't deadlock in practice, since they probably won't be implicitly waiting for any other threads during the import).

@pitrou
Copy link
Member

pitrou commented Apr 19, 2019

That said, there's also the slight risk that someone somewhere is fine getting an AttributeError but wouldn't like Python to deadlock. I'm not sure it's a good idea to make such changes at the very end of the 2.7 maintenance cycle. @benjaminp What is your opinion on this?

@benjaminp
Copy link
Contributor

I didn't read the whole in-depth discussion, but, a priori, AttributeError is always better than a deadlock. If you're confident, this is a strict improvement, it seems okay.

Clearly, we've been living with this cPickle problem for quite a long while.

@brettcannon brettcannon changed the title bpo-34110: Fix module load problem of cPickle in case of multithreading [2.7] bpo-34110: Fix module load problem of cPickle in case of multithreading Apr 22, 2019
@csabella csabella closed this Jan 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants