Skip to content

Valgrind Memory Leak Checking #8954

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

wiredfool
Copy link
Member

@wiredfool wiredfool commented May 13, 2025

Changes proposed in this pull request:

  • Add support in the Makefile to use valgrind to check for definite leaks.
  • Add some suppressions of python object allocations for leak checks. Python is a bit noisy for leak checking.
  • Fix a leak in the Arrow schema where child schemas weren't released.
  • Fix a leak in webp encode on error
  • Fix a leak in UnsharpMask on error
  • Fix a leak in TiffEncode on error
  • Fix a leak in Font getmask on error
  • Fix a leak in JpegEncode on error

To Do:

  • Consider how to run this periodically in CI. This is significantly slower than plain valgrind -- locally I had a 4 hour run at one point. That's too slow for a merges or prs, but we need to see the results.
  • Consider setting the leak detection to something other than definite.
  • Review the python suppressions. It's definitely possible that some of them are leaks -- they'd be pointing to issues with our PyObject handling at the C level, either erroring from a function without decrefing or similar.

Note -- this is built on the Arrow memory leak check PR #8953.

@wiredfool wiredfool added Bug Any unexpected behavior, until confirmed feature. Memory labels May 13, 2025
@aclark4life
Copy link
Member

@wiredfool We can trigger a workflow manually or run daily independent of PRs.

@@ -99,6 +99,13 @@ valgrind:
--log-file=/tmp/valgrind-output \
python3 -m pytest --no-memcheck -vv --valgrind --valgrind-log=/tmp/valgrind-output

.PHONY: valgrind-leak
valgrind-leak:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
valgrind-leak:
valgrind-leak:
python3 -c "import pytest_valgrind" > /dev/null 2>&1 || python3 -m pip install pytest-valgrind

Install pytest-valgrind if it is missing, like the 'valgrind' target does

@@ -37,6 +37,10 @@ ReleaseExportedSchema(struct ArrowSchema *array) {
child->release = NULL;
}
// UNDONE -- should I be releasing the children?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this UNDONE comment resolved now?

@wiredfool
Copy link
Member Author

We can trigger a workflow manually or run daily independent of PRs.

The question then is how to we prevent this from being run as a scheduled action and subsequently ignored? The tension is that this is by far the most valuable when a PR is in flight, because then it's obvious when you've gone from current (xfail) to fail, but the runtime is likely 2-3x worse than our longest other test run, which is already too long. (Even running single tests with valgrind take ~ 1 min, because all the pytest setup infra runs under valgrind as well)

@aclark4life
Copy link
Member

In that case, long PR it is! We typically have PRs queued for days, or longer, before merging.

@wiredfool
Copy link
Member Author

On My Machine:
4711 passed, 345 skipped, 11 xfailed, 8 warnings in 15074.00s (4:11:14)

That's vs 87 minutes for ordinary valgrind on this pr in the GHA testing.

Unfortunately, we're single threaded for valgrind tests.

@wiredfool
Copy link
Member Author

Ok, We'll see how long this takes: https://github.com/python-pillow/Pillow/actions/runs/15054197221/job/42316085246?pr=8954

I'm only running this on the pull request, not the push. I'm expecting 5 hours.

This is injecting the test script in the Pillow repo, instead of building a new copy of the valgrind image. At some point we can move that over, but I'd like to be tuning the supressions without having to build a new valgrind image, so it's probably better to leave it here for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Any unexpected behavior, until confirmed feature. Memory
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants