Valgrind Memory Leak Checking #8954

wiredfool · 2025-05-13T22:01:01Z

Changes proposed in this pull request:

Add support in the Makefile to use valgrind to check for definite leaks.
Add some suppressions of python object allocations for leak checks. Python is a bit noisy for leak checking.
Fix a leak in the Arrow schema where child schemas weren't released.
Fix a leak in webp encode on error
Fix a leak in UnsharpMask on error
Fix a leak in TiffEncode on error
Fix a leak in Font getmask on error
Fix a leak in JpegEncode on error
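
For the Arrow schema item, here is a minimal sketch of what releasing child schemas can look like. It is not the code in `src/libImaging/Arrow.c`. The struct layout follows the Arrow C data interface, but `release_schema`, `init_schema`, and `released_count` are hypothetical names, and the sketch assumes the producer heap-allocated both the children array and each child struct.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Struct layout as defined by the Arrow C data interface. */
struct ArrowSchema {
    const char *format;
    const char *name;
    const char *metadata;
    int64_t flags;
    int64_t n_children;
    struct ArrowSchema **children;
    struct ArrowSchema *dictionary;
    void (*release)(struct ArrowSchema *);
    void *private_data;
};

/* Illustration only: counts how many schemas have been released. */
static int released_count = 0;

/* Hypothetical release callback. Assumes the producer heap-allocated both
 * the children array and every child struct. */
static void
release_schema(struct ArrowSchema *schema) {
    assert(schema != NULL);
    if (schema->release == NULL) {
        return; /* already released */
    }
    for (int64_t i = 0; i < schema->n_children; i++) {
        struct ArrowSchema *child = schema->children[i];
        if (child == NULL) {
            continue; /* slot was never filled */
        }
        if (child->release != NULL) {
            child->release(child); /* release the child's own resources */
        }
        free(child);
    }
    free(schema->children);
    schema->children = NULL;
    schema->n_children = 0;
    schema->release = NULL; /* mark the schema as released */
    released_count++;
}

/* Hypothetical producer: fills `schema` and allocates `n_children` leaves.
 * Returns 0 on success, -1 on allocation failure. */
static int
init_schema(struct ArrowSchema *schema, const char *format, int64_t n_children) {
    schema->format = format;
    schema->name = "";
    schema->metadata = NULL;
    schema->flags = 0;
    schema->n_children = 0;
    schema->children = NULL;
    schema->dictionary = NULL;
    schema->private_data = NULL;
    schema->release = release_schema;
    if (n_children <= 0) {
        return 0;
    }
    schema->children = calloc((size_t)n_children, sizeof(*schema->children));
    if (schema->children == NULL) {
        schema->release = NULL;
        return -1;
    }
    schema->n_children = n_children;
    for (int64_t i = 0; i < n_children; i++) {
        struct ArrowSchema *child = calloc(1, sizeof(*child));
        if (child == NULL) {
            release_schema(schema); /* frees the children created so far */
            return -1;
        }
        init_schema(child, "C", 0); /* leaf schema, cannot fail */
        schema->children[i] = child;
    }
    return 0;
}
```

If the loop over `children` were skipped, each child struct would show up as a definite leak once the parent's `children` pointer is freed.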

To Do:

Consider how to run this periodically in CI. This is significantly slower than plain valgrind -- locally I had a 4 hour run at one point. That's too slow for merges or PRs, but we need to see the results.
Consider setting the leak detection to something other than definite.
Review the python suppressions. It's definitely possible that some of them are leaks -- they'd be pointing to issues with our PyObject handling at the C level, either erroring from a function without decrefing or similar.

Note -- this is built on the Arrow memory leak check PR #8953.

* Free the output buffer on webp encode error

* If setimage errors out, the tiff client state was not freed.

* Return after setting the error for advanced features without libraqm. Not returning here leads to an alloc that's never freed.
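
The error-path fixes above share two shapes: free what was allocated before an early return, and return as soon as the error is set so that no later allocation happens. The following is a rough sketch of both, not the Pillow code. Every name in it (`tracked_malloc`, `tracked_free`, `fill_output`, `encode`, `layout_text`, `live_allocations`, `last_error`) is hypothetical, and the allocation counter exists only to make the behaviour observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Illustration only: number of tracked blocks not yet freed. */
static int live_allocations = 0;
static const char *last_error = NULL;

static void *
tracked_malloc(size_t size) {
    void *p = malloc(size);
    if (p != NULL) {
        live_allocations++;
    }
    return p;
}

static void
tracked_free(void *p) {
    if (p != NULL) {
        live_allocations--;
        free(p);
    }
}

/* Hypothetical encoder step that fails on request. */
static int
fill_output(unsigned char *buf, size_t size, int should_fail) {
    if (should_fail) {
        last_error = "encode failed";
        return -1;
    }
    memset(buf, 0, size);
    return 0;
}

/* Shape 1: free the output buffer before returning on error.
 * On success the caller owns the buffer and must free it. */
static unsigned char *
encode(size_t size, int should_fail) {
    assert(size > 0);
    unsigned char *output = tracked_malloc(size);
    if (output == NULL) {
        last_error = "out of memory";
        return NULL;
    }
    if (fill_output(output, size, should_fail) != 0) {
        tracked_free(output); /* without this, the buffer leaks */
        return NULL;
    }
    return output;
}

/* Shape 2: return right after setting the error. Falling through
 * would reach the allocation below, and nothing would free it. */
static void *
layout_text(int feature_available) {
    if (!feature_available) {
        last_error = "feature not available";
        return NULL;
    }
    return tracked_malloc(64);
}
```

In both functions the failing call leaves `live_allocations` unchanged, which is the property the leak check is looking for.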

aclark4life · 2025-05-13T23:23:52Z

@wiredfool We can trigger a workflow manually or run daily independent of PRs.

radarhere · 2025-05-14T07:31:55Z

Makefile

@@ -99,6 +99,13 @@ valgrind:
            --log-file=/tmp/valgrind-output \
            python3 -m pytest --no-memcheck -vv --valgrind --valgrind-log=/tmp/valgrind-output

+.PHONY: valgrind-leak
+valgrind-leak:


Suggested change

-valgrind-leak:
+valgrind-leak:
+	python3 -c "import pytest_valgrind" > /dev/null 2>&1 || python3 -m pip install pytest-valgrind

Install pytest-valgrind if it is missing, like the 'valgrind' target does.

radarhere · 2025-05-14T07:33:03Z

src/libImaging/Arrow.c

@@ -37,6 +37,10 @@ ReleaseExportedSchema(struct ArrowSchema *array) {
            child->release = NULL;
        }
        // UNDONE -- should I be releasing the children?


Is this UNDONE comment resolved now?

wiredfool · 2025-05-14T08:11:17Z

> We can trigger a workflow manually or run daily independent of PRs.

The question then is how do we prevent this from being run as a scheduled action and subsequently ignored? The tension is that this is by far the most valuable when a PR is in flight, because then it's obvious when you've gone from current (xfail) to fail, but the runtime is likely 2-3x worse than our longest other test run, which is already too long. (Even running a single test with valgrind takes ~1 min, because all the pytest setup infra runs under valgrind as well.)

aclark4life · 2025-05-14T11:26:38Z

In that case, long PR it is! We typically have PRs queued for days, or longer, before merging.

wiredfool · 2025-05-14T12:15:26Z

On My Machine:
4711 passed, 345 skipped, 11 xfailed, 8 warnings in 15074.00s (4:11:14)

That's versus 87 minutes for ordinary valgrind on this PR in the GHA testing.

Unfortunately, we're single threaded for valgrind tests.

wiredfool · 2025-05-15T20:23:13Z

OK, we'll see how long this takes: https://github.com/python-pillow/Pillow/actions/runs/15054197221/job/42316085246?pr=8954

I'm only running this on the pull request, not the push. I'm expecting 5 hours.

This injects the test script from the Pillow repo instead of building a new copy of the valgrind image. At some point we can move that over, but I'd like to tune the suppressions without having to build a new valgrind image, so it's probably better to leave it here for now.

wiredfool added 10 commits May 12, 2025 00:27

Fix memory leak in arrow export using array structure

74ab5ac

valgrind memory leak check

4984c45

fix memory leak in arrow schema

fdfba98

Suppress all python level leaks for now

84b88a9

Fix leak in webp_encode

eaab435

* Free the output buffer on webp encode error

Fix leak of destination image in ImagingUnsharpMask when an error occurs

a9bcd7d

Fix memory leak in TiffEncode

e2e40c5

* If setimage errors out, the tiff client state was not freed.

Fix memory leak

f792e0b

* Return after setting the error for advanced features without libraqm. Not returning here leads to an alloc that's never freed.

Fix memory leak when JpegEncode returns an error.

789631c

Wrap Makefile

7aa6a61

wiredfool added the Bug and Memory labels May 13, 2025

radarhere reviewed May 14, 2025

wiredfool added 5 commits May 15, 2025 21:10

Adding pytest-valgrind install

fb126af

Guess so.

d5449d5

Add github workflow/test-script

218f055

executable

a6b8b3a

correct target

2d506f6
