Skip to content

gh-102381: don't call watcher callback with dead object #102382

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Mar 8, 2023

Conversation

carljm
Copy link
Member

@carljm carljm commented Mar 2, 2023

Fixes #102381, #102351 and #102385.

Ensure that func_dealloc, code_dealloc, and dict_dealloc temporarily incref the function/code object/dict before calling the watcher callback, so we don't pass around dead objects.

A watcher callback could take a reference to the object; check for this and allow it to be resurrected in that case, rather than going ahead and freeing it (making the taken reference a pointer to freed memory.)

Also, don't pass the object itself to PyErr_WriteUnraisable if the callback raises, allowing an unraisablehook to take a reference to the object and resurrect it and, in the worst case, causing an infinite loop. Safer to construct an informative debug string and pass that as context to PyErr_WriteUnraisable. This actually provides more debug information than before, since the new message clarifies that the error occurred in a watcher callback for the mentioned object.

@carljm carljm changed the title gh-102381: don't call dict watcher callback with dead dict gh-102381: don't call watcher callback with dead object Mar 3, 2023
sobolevn added a commit to sobolevn/cpython that referenced this pull request Mar 3, 2023
@carljm
Copy link
Member Author

carljm commented Mar 3, 2023

Skipping news since this is just fixing a bug in an unreleased new feature.

@carljm carljm added the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Mar 3, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @carljm for commit 6d0b789 🤖

If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Mar 3, 2023
@carljm carljm marked this pull request as ready for review March 3, 2023 23:40
@carljm carljm requested a review from Yhg1s March 3, 2023 23:40
Copy link
Member

@JelleZijlstra JelleZijlstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks for fixing this! I'd like other reviewers to also have a look though.

@carljm carljm added the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Mar 4, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @carljm for commit 9b938c7 🤖

If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Mar 4, 2023
carljm and others added 3 commits March 3, 2023 22:40
* main:
  pythongh-102021 : Allow multiple input files for interpreter loop generator (python#102022)
  Add import of `unittest.mock.Mock` in documentation (python#102346)
  pythongh-102383: [docs] Arguments of `PyObject_CopyData` are `PyObject *` (python#102390)
  pythongh-101754: Document that Windows converts keys in `os.environ` to uppercase (pythonGH-101840)
  pythongh-102324: Improve tests of `typing.override` (python#102325)
@carljm
Copy link
Member Author

carljm commented Mar 4, 2023

Thanks @Yhg1s for the review! I've updated per all comments.

carljm added 2 commits March 6, 2023 14:16
* main: (21 commits)
  pythongh-102192: Replace PyErr_Fetch/Restore etc by more efficient alternatives in sub interpreters module (python#102472)
  pythongh-95672: Fix versionadded indentation of get_pagesize in test.rst (pythongh-102455)
  pythongh-102416: Do not memoize incorrectly loop rules in the parser (python#102467)
  pythonGH-101362: Optimise PurePath(PurePath(...)) (pythonGH-101667)
  pythonGH-101362: Check pathlib.Path flavour compatibility at import time (pythonGH-101664)
  pythonGH-101362: Call join() only when >1 argument supplied to pathlib.PurePath() (python#101665)
  pythongh-102444: Fix minor bugs in `test_typing` highlighted by pyflakes (python#102445)
  pythonGH-102341: Improve the test function for pow (python#102342)
  Fix unused classes in a typing test (pythonGH-102437)
  pythongh-101979: argparse: fix a bug where parentheses in metavar argument of add_argument() were dropped (python#102318)
  pythongh-102356: Add thrashcan macros to filter object dealloc (python#102426)
  Move around example in to_bytes() to avoid confusion (python#101595)
  pythonGH-97546: fix flaky asyncio `test_wait_for_race_condition` test (python#102421)
  pythongh-96821: Add config option `--with-strict-overflow` (python#96823)
  pythongh-101992: update pstlib module documentation (python#102133)
  pythongh-63301: Set exit code when tabnanny CLI exits on error (python#7699)
  pythongh-101863: Fix wrong comments in EUC-KR codec (pythongh-102417)
  pythongh-102302 Micro-optimize `inspect.Parameter.__hash__` (python#102303)
  pythongh-102179: Fix `os.dup2` error reporting for negative fds (python#102180)
  pythongh-101892: Fix `SystemError` when a callable iterator call exhausts the iterator (python#101896)
  ...
@carljm
Copy link
Member Author

carljm commented Mar 6, 2023

@Yhg1s I pushed the explicit exhaustiveness check, and one possible way of addressing your comments about the x-macros. I really don't feel strongly about how to handle the event type logging, so if you have a clearly preferred approach, just let me know and I'll do that.

Copy link
Member

@JelleZijlstra JelleZijlstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, noticed one more thing. I think this can affect users who include the header in C code that independently defines a DEF_EVENT macro.

* main:
  pythongh-102493: fix normalization in PyErr_SetObject (python#102502)
  pythongh-87092: compiler's CFG construction moved to after codegen stage (python#102320)
  pythongh-95913: Consolidate build requirements changes in 3.11 WhatsNew (pythonGH-98781)
  Remove redundant `_ensure_future` in favor of `ensure_future` in `asyncio` (python#102398)
  pythongh-95913: Edit Faster CPython section in 3.11 WhatsNew (pythonGH-98429)
  pythongh-90110: Fix the c-analyzer Tool (python#102483)
  pythongh-101759: Update macOS installer SQLite 3.40.1 checksum (pythongh-102485)
  Remove unused import of `warnings` from `unittest.loader` (python#102479)
  Add gettext support to tools/extensions/c_annotations.py (python#101989)
@JelleZijlstra
Copy link
Member

And as discussed offline, let's run the buildbots again to be safe; after that feel free to merge.

@carljm carljm added the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Mar 7, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @carljm for commit 05e2ce0 🤖

If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Mar 7, 2023
@carljm carljm merged commit 1e703a4 into python:main Mar 8, 2023
@carljm carljm deleted the nodeaddict branch March 8, 2023 00:11
Py_SET_REFCNT(op, Py_REFCNT(op) - 1);
return;
}
Py_SET_REFCNT(op, 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this Py_SET_REFCNT necessary? IIUC this is redundant here otherwise it isn't obvious to me and a comment would be nice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the normal case, assuming the watcher has not taken any new reference to the object, the refcount will be 1 at this point (because we set it to 1 above.) We should set it to 0 before continuing with deallocation, because objects being deallocated should have a refcount of zero.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The watcher must take a reference to any object it is watching, otherwise the refcount is incorrect and we crash and burn.

The GC, or an optimizer, can legitimately assume that any object with a reference count of zero and no weakrefs is to be freed and re-use the memory. That's not going to end well if the watcher still thinks that memory is still a dictionary.

Copy link
Member Author

@carljm carljm Mar 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markshannon I think your comment is more relevant to the discussion below (about why to have dealloc watchers at all) than it is to this thread, so I'll reply below.

The answer to @kumaraditya303's question here is that this Py_SET_REFCNT(op, 0) is the decref that has to be paired with the incref Py_SET_REFCNT(op, 1); that occurs on line 799. We use explicit Py_SET_REFCNT instead of Py_INCREF and Py_DECREF because Py_DECREF would cause recursive deallocation, and if we use Py_SET_REFCNT for the decref we have to use it for the incref also, otherwise the total refcount tracking will be off.

@markshannon
Copy link
Member

I can't help feeling that we would be better off without these watchers.
They appear to be bug magnets. Handling the "zombie" objects is tricky and error prone.
How is the watcher interacting with the freed object if there are no references to it?

This PR adds quite a lot of code to dict_dealloc which is called for every dictionary created, which has performance implications.

I think we should just remove these watchers. What are they for?
I've been going through possible optimizations for CPython and these watchers aren't needed for any of them.
I don't see why Cinder would need them either.

(As an aside: we won't need the creation events either, but I can see why Cinder wants those).

@carljm
Copy link
Member Author

carljm commented Mar 8, 2023

This PR adds quite a lot of code to dict_dealloc which is called for every dictionary created, which has performance implications.

I was also initially concerned about adding an incref and decref in this path, but on a pyperformance run (with --enable-optimizations build, on a bare-metal AWS host) this PR was 1.00x faster than the base commit in main, so it doesn't seem to have a detectable impact?

I don't see why Cinder would need them either.

In general, in Cinder they are used for de-registering objects from JIT registries consisting of borrowed references, to avoid having dangling pointers in these registries. E.g. for each watched dict, we remember what key(s) we care about in that dict, and for each key we may also watch a different builtins (and maybe also globals) dict that that key might actually have been found in (or might still need to be found in if the key goes away in the main dict). If the main dict is deallocated we unwatch the additional globals/builtins dicts, and erase the whole entry from the registry. Similarly, we have a registry of created functions/code-objects that are pending compilation (in some modes we don't compile eagerly but do a multi-threaded compilation of everything pending compilation, right before fork), so if a function or code object is deallocated we remove it from that registry.

In every case the deallocated handler doesn't do anything with the zombie PyObject itself, it just needs the pointer value to erase any registry entry for that pointer value.

@carljm
Copy link
Member Author

carljm commented Mar 8, 2023

The watcher must take a reference to any object it is watching, otherwise the refcount is incorrect and we crash and burn.

Cinder JIT currently doesn't take references to objects it is watching, so as not to keep them alive if they would otherwise die. There is no "crash and burn" precisely because of the dealloc events; if the object is deallocated we know about it and can erase any pointer to it we were keeping. Effectively it's like we are keeping a weakref, but without the allocation and indirection overhead of an actual weakref object. Perhaps we should be keeping an actual weakref instead; it would take some experimentation to check the performance impact of that.

carljm added a commit to carljm/cpython that referenced this pull request Mar 8, 2023
* main:
  pythongh-102381: don't call watcher callback with dead object (python#102382)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

raising an exception in a dict/code/function watcher callback will segfault on a DEALLOCATED event
6 participants