-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
gh-129185: Fix PyTraceMalloc_Untrack() at Python exit #129191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Support calling PyTraceMalloc_Track() and PyTraceMalloc_Untrack() during late Python finalization.
@@ -1255,6 +1255,11 @@ int | |||
PyTraceMalloc_Track(unsigned int domain, uintptr_t ptr, | |||
size_t size) | |||
{ | |||
// gh-129185: Pre-check to support calls after _PyTraceMalloc_Fini() | |||
if (!tracemalloc_config.tracing) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to hold the lock or GIL, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of this "pre-check" is to avoid the GIL and TABLES_LOCK().
The flag is tested again below with TABLES_LOCK().
PyTraceMalloc_Untrack() doesn't lock the GIL, only TABLES_LOCK(), when tracing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change restores the check that we had before recent refactoring. For example, Python 3.14 alpha3 uses:
int
PyTraceMalloc_Track(unsigned int domain, uintptr_t ptr,
size_t size)
{
int res;
PyGILState_STATE gil_state;
if (!tracemalloc_config.tracing) {
/* tracemalloc is not tracing: do nothing */
return -2;
}
gil_state = PyGILState_Ensure();
TABLES_LOCK();
res = tracemalloc_add_trace(domain, ptr, size);
TABLES_UNLOCK();
PyGILState_Release(gil_state);
return res;
}
The check is done without holding the GIL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But then we lose the thread safety: if one thread that holds the GIL were to write to tracing
, then a thread that calls one of these without the GIL would race.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the easiest solution would be to grab the tables lock for the read, and then unlock it before calling PyGILState_Ensure
to prevent lock-ordering deadlocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The race is right here. Thread A might check tracing at the exact same time as thread B, which is a data race.
What is the problem of "continue execution" of thread A, since thread A checks again tracing
with the tables lock?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it might not reach the second tracing
read at all--if the first one races, then we're probably going to get a crash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, you're talking about thread B calling _PyTraceMalloc_Fini()
, not tracemalloc.stop()
. In this case, thread A can call TABLES_LOCK()
after thread B deleted the lock, and yes, we can get a crash.
Sadly, PyTraceMalloc_Track() and PyTraceMalloc_Untrack() API doesn't require the caller to hold the GIL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That can happen too, but I meant that we'll get a crash from the data race on tracing
, not a use-after-free on the lock. I think we just hold the GIL (via PyGILState_Ensure
) and call without the tables lock held, that should be thread safe enough for 3.12.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I modified the Untrack() function to get the GIL. It should address your last concern.
@ZeroIntensity: Please review the updated PR, it should reduce the risk of race. #129185 is not about threads running in parallel. It's just a single thread and PyTraceMalloc_Track/Untrack() called by Python objects destroyed after _PyTraceMalloc_Fini(). I propose to merge this incomplete fix and backport it, and then consider switching to PyMutex in the main branch to avoid the last race. |
Yeah, I'm talking about the other issue that you recently fixed. The previous approach would make it a problem again.
I agree. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.12, 3.13. |
Sorry, @vstinner, I could not cleanly backport this to
|
Sorry, @vstinner, I could not cleanly backport this to
|
…29191) Support calling PyTraceMalloc_Track() and PyTraceMalloc_Untrack() during late Python finalization. * Call _PyTraceMalloc_Fini() later in Python finalization. * Test also PyTraceMalloc_Untrack() without the GIL * PyTraceMalloc_Untrack() now gets the GIL. * Test also PyTraceMalloc_Untrack() in test_tracemalloc_track_race(). (cherry picked from commit 46c7e13)
GH-129217 is a backport of this pull request to the 3.13 branch. |
#129217) gh-129185: Fix PyTraceMalloc_Untrack() at Python exit (#129191) Support calling PyTraceMalloc_Track() and PyTraceMalloc_Untrack() during late Python finalization. * Call _PyTraceMalloc_Fini() later in Python finalization. * Test also PyTraceMalloc_Untrack() without the GIL * PyTraceMalloc_Untrack() now gets the GIL. * Test also PyTraceMalloc_Untrack() in test_tracemalloc_track_race(). (cherry picked from commit 46c7e13)
#129217) (#129221) [3.13] gh-129185: Fix PyTraceMalloc_Untrack() at Python exit (#129191) (#129217) gh-129185: Fix PyTraceMalloc_Untrack() at Python exit (#129191) Support calling PyTraceMalloc_Track() and PyTraceMalloc_Untrack() during late Python finalization. * Call _PyTraceMalloc_Fini() later in Python finalization. * Test also PyTraceMalloc_Untrack() without the GIL * PyTraceMalloc_Untrack() now gets the GIL. * Test also PyTraceMalloc_Untrack() in test_tracemalloc_track_race(). (cherry picked from commit 46c7e13) (cherry picked from commit e3b3e01)
Support calling PyTraceMalloc_Track() and PyTraceMalloc_Untrack() during late Python finalization.