Skip to content

gh-130695: add count create/destroy refs to tracemalloc #130696

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

tom-pytel
Copy link
Contributor

@tom-pytel tom-pytel commented Feb 28, 2025

Add a way to track the number of allocations and specifically DEALLOCATIONS in tracemalloc. For a concrete example of why this is needed see:

#130382

and

#130689

for the rationale for this PR.


📚 Documentation preview 📚: https://cpython-previews--130696.org.readthedocs.build/

@tom-pytel
Copy link
Contributor Author

Ping @tomasr8 , @vstinner ?

@colesbury
Copy link
Contributor

Oh, I just realized we already have a test in _testcapimodule.c. I think it doesn't catch the bug because it only creates and destroys objects through the C API and doesn't execute any Python code through the interpreter:

static int _simpletracer(PyObject *obj, PyRefTracerEvent event, void* data) {
struct simpletracer_data* the_data = (struct simpletracer_data*)data;
assert(the_data->create_count + the_data->destroy_count < (int)Py_ARRAY_LENGTH(the_data->addresses));
the_data->addresses[the_data->create_count + the_data->destroy_count] = obj;
if (event == PyRefTracer_CREATE) {
the_data->create_count++;
} else {
the_data->destroy_count++;
}
return 0;
}
static PyObject *
test_reftracer(PyObject *ob, PyObject *Py_UNUSED(ignored))
{
// Save the current tracer and data to restore it later
void* current_data;
PyRefTracer current_tracer = PyRefTracer_GetTracer(&current_data);
struct simpletracer_data tracer_data = {0};
void* the_data = &tracer_data;
// Install a simple tracer function
if (PyRefTracer_SetTracer(_simpletracer, the_data) != 0) {
goto failed;
}
// Check that the tracer was correctly installed
void* data;
if (PyRefTracer_GetTracer(&data) != _simpletracer || data != the_data) {
PyErr_SetString(PyExc_AssertionError, "The reftracer not correctly installed");
(void)PyRefTracer_SetTracer(NULL, NULL);
goto failed;
}
// Create a bunch of objects
PyObject* obj = PyList_New(0);
if (obj == NULL) {
goto failed;
}
PyObject* obj2 = PyDict_New();
if (obj2 == NULL) {
Py_DECREF(obj);
goto failed;
}
// Kill all objects
Py_DECREF(obj);
Py_DECREF(obj2);
// Remove the tracer
(void)PyRefTracer_SetTracer(NULL, NULL);
// Check that the tracer was removed
if (PyRefTracer_GetTracer(&data) != NULL || data != NULL) {
PyErr_SetString(PyExc_ValueError, "The reftracer was not correctly removed");
goto failed;
}
if (tracer_data.create_count != 2 ||
tracer_data.addresses[0] != obj ||
tracer_data.addresses[1] != obj2) {
PyErr_SetString(PyExc_ValueError, "The object creation was not correctly traced");
goto failed;
}
if (tracer_data.destroy_count != 2 ||
tracer_data.addresses[2] != obj ||
tracer_data.addresses[3] != obj2) {
PyErr_SetString(PyExc_ValueError, "The object destruction was not correctly traced");
goto failed;
}
PyRefTracer_SetTracer(current_tracer, current_data);
Py_RETURN_NONE;
failed:
PyRefTracer_SetTracer(current_tracer, current_data);
return NULL;
}

Given that we already have a test, it might make sense to improve it instead of adding additional functionality to tracemalloc.

@tom-pytel
Copy link
Contributor Author

tom-pytel commented Feb 28, 2025

Oh, I just realized we already have a test in _testcapimodule.c.

I saw that but thought its too simple for general testing. Your call... (I mean its just testing the C API like the box says)...

@@ -94,6 +94,12 @@ struct _tracemalloc_runtime_state {
/* domain (unsigned int) => traces (_Py_hashtable_t).
Protected by TABLES_LOCK(). */
_Py_hashtable_t *domains;
/* Number of allocations.
Protected by TABLES_LOCK(). */
Py_ssize_t allocations;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to rename it to "ref_created" and "ref_destroyed".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started with that but as per colesbury above: """The terminology here is a bit confusing, but I don't think we should be calling these "references" or "refs". It's tracking object allocation and deallocation, not references or reference counts.""". Technically its tracking the first reference on an object (not all references), so excluding some unallocated "static" objects, its essentially allocations, but its called RefTracer, so that's unfortunate. But maybe following the established naming convention, although imperfect, is better? Anyways, putting it back to refs_created/_destroyed, but of course you have the final say so let me know which you prefer.

.. function:: get_traced_refs()

Get the current count of created and destroyed traced references.
:mod:`tracemalloc` module as a tuple: ``(created: int, destroyed: int)``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this sentence: "tracemalloc module as a tuple"?

f()
refs = tracemalloc.get_traced_refs()
if refs == (1, 0):
warnings.warn("ceval Py_DECREF doesn't emit PyRefTracer_DESTROY in this build")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect a test failure, not a warning. In which case we would fall into this case? And why ignoring this failure?


Get the current count of created and destroyed traced references.
:mod:`tracemalloc` module as a tuple: ``(created: int, destroyed: int)``.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should explain what are "references" here. They are not the classical reference count. Maybe explain the difference with the reference count?

Comment on lines +302 to +303
Clear traces of memory blocks allocated and counts of created and destroyed
memory blocks by Python.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Clear traces of memory blocks allocated and counts of created and destroyed
memory blocks by Python.
Clear traces of memory blocks allocated by Python, and reset counts
of created and destroyed references.

@vstinner
Copy link
Member

vstinner commented Mar 4, 2025

So far, I'm not convinced that this feature really makes sense to end users to help them tracking memory leaks. Examples given seem to be more about checking if the Python implementation is correct.

@tom-pytel
Copy link
Contributor Author

So far, I'm not convinced that this feature really makes sense to end users to help them tracking memory leaks. Examples given seem to be more about checking if the Python implementation is correct.

Yeah, I actually agree, and especially adding another lock on memfree wrt perfomance is probably not worth it. Will see about suggestion above to reuse _testcapimodule.c for the test instead of this.

@tom-pytel tom-pytel closed this Mar 4, 2025
@tom-pytel tom-pytel deleted the fix-issue-130695 branch March 25, 2025 23:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants