Memory leak when calling PIL.Image.Image.__arrow_c_array__() #8950

Open
Joshix-1 opened this issue May 10, 2025 · 1 comment · May be fixed by #8953

What did you do?

I'm calling the PIL.Image.Image.__arrow_c_array__ method on images to access their data.

What did you expect to happen?

Max RSS should stay constant after a while.

What actually happened?

Memory is leaked: max RSS keeps growing with every round of `__arrow_c_array__` calls.

The output of the reproduction script below:

RSS:      88656
RSS:      88836
RSS:      88908
=== accessing __arrow_c_array__ ===
RSS:     395212
RSS:     713164
=== with arro3.core.Array ===
RSS:    1031376
RSS:    1349072
=== with pyarrow.array ===
RSS:    1667588
RSS:    1986052
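
For scale, the growth per call can be estimated from the figures above. This is only a rough calculation: it assumes `ru_maxrss` is reported in kilobytes (as it is on Linux) and that the difference between two consecutive lines comes from the 400 calls made in that round. The variable names are illustrative.

```python
# Max RSS values copied from the output above: the last baseline line,
# followed by the six lines printed after each Arrow round.
rss_kib = [88908, 395212, 713164, 1031376, 1349072, 1667588, 1986052]

COUNT = 400  # calls per round, as in the reproduction script

# Growth between consecutive measurements, then averaged per call.
deltas = [after - before for before, after in zip(rss_kib, rss_kib[1:])]
per_call_kib = [delta / COUNT for delta in deltas]

for delta, per_call in zip(deltas, per_call_kib):
    print(f"+{delta} KiB per round, about {per_call:.0f} KiB per call")
```

With the numbers reported here, every round adds a little over 300,000 KiB, which works out to roughly 0.75 to 0.78 MiB retained per call.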

What are your OS, Python and Pillow versions?

  • OS: Arch Linux
  • Python: Python 3.13.3
  • Pillow: 11.2.1
$ python3 -m PIL --report
--------------------------------------------------------------------
Pillow 11.2.1
Python 3.13.3 (main, Apr  9 2025, 07:44:25) [GCC 14.2.1 20250207]
--------------------------------------------------------------------
Python executable is /home/josh/code/an-website/venv/bin/python3
Environment Python files loaded from /home/josh/code/an-website/venv
System Python files loaded from /usr
--------------------------------------------------------------------
Python Pillow modules loaded from /home/josh/code/an-website/venv/lib/python3.13/site-packages/PIL
Binary Pillow modules loaded from /home/josh/code/an-website/venv/lib/python3.13/site-packages/PIL
--------------------------------------------------------------------
--- PIL CORE support ok, compiled for 11.2.1
--- TKINTER support ok, loaded 8.6
--- FREETYPE2 support ok, loaded 2.13.3
--- LITTLECMS2 support ok, loaded 2.17
--- WEBP support ok, loaded 1.5.0
*** AVIF support not installed
--- JPEG support ok, compiled for libjpeg-turbo 3.1.0
--- OPENJPEG (JPEG2000) support ok, loaded 2.5.3
--- ZLIB (PNG/ZIP) support ok, loaded 1.3.1, compiled for zlib-ng 2.2.4
--- LIBTIFF support ok, loaded 4.7.0
--- RAQM (Bidirectional Text) support ok, loaded 0.10.1, fribidi 1.0.16, harfbuzz 11.0.1
*** LIBIMAGEQUANT (Quantization method) support not installed
--- XCB (X protocol) support ok
--------------------------------------------------------------------

Reproduction script

Doesn't need to be run with uv. If run with plain CPython, make sure the dependencies are installed.

#!/usr/bin/env -S uv run --script
# /// script
# dependencies = [
#   "arro3-core",
#   "numpy",
#   "pillow",
#   "pyarrow",
# ]
# ///

import gc, io, urllib.request, resource

import numpy, pyarrow
from arro3.core import Array
from PIL import Image

COUNT = 400

# the exact image doesn't matter
with urllib.request.urlopen("https://testimages.org/img/testimages_screenshot.jpg") as file:
    data = file.read()
    del file

# 3 times without arrow
for _ in range(3):
    for _ in range(COUNT):
        image = Image.open(io.BytesIO(data))
        image.load()

        numpy.asarray(image)
        tuple(image.getdata())

        image.close()
        del image
    gc.collect()
    print(f"RSS: {resource.getrusage(resource.RUSAGE_SELF).ru_maxrss: 10}")

print("=== accessing __arrow_c_array__ ===")
for _ in range(2):
    for _ in range(COUNT):
        image = Image.open(io.BytesIO(data))
        image.load()

        a, b = image.__arrow_c_array__()

        image.close()
        del image, a, b
    gc.collect()
    print(f"RSS: {resource.getrusage(resource.RUSAGE_SELF).ru_maxrss: 10}")

print("=== with arro3.core.Array ===")
for _ in range(2):
    for _ in range(COUNT):
        image = Image.open(io.BytesIO(data))
        image.load()

        Array(image)

        image.close()
        del image
    gc.collect()
    print(f"RSS: {resource.getrusage(resource.RUSAGE_SELF).ru_maxrss: 10}")

print("=== with pyarrow.array ===")
for _ in range(2):
    for _ in range(COUNT):
        image = Image.open(io.BytesIO(data))
        image.load()

        pyarrow.array(image)

        image.close()
        del image
    gc.collect()
    print(f"RSS: {resource.getrusage(resource.RUSAGE_SELF).ru_maxrss: 10}")
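
The four measurement loops above share the same shape, so they could be folded into one helper. The following is a minimal sketch of that idea, not part of the original script; `peak_rss_after_rounds` and its parameters are hypothetical names. Note that `ru_maxrss` is a peak value, so it can only stay flat or grow.

```python
import gc
import resource


def peak_rss_after_rounds(work, rounds=3, count=400):
    """Call work() `count` times per round and record max RSS after each round."""
    peaks = []
    for _ in range(rounds):
        for _ in range(count):
            work()
        gc.collect()  # rule out garbage that is merely awaiting collection
        peaks.append(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
    return peaks


if __name__ == "__main__":
    # A trivial workload that should not leak: allocate and drop a small buffer.
    for peak in peak_rss_after_rounds(lambda: bytearray(1024)):
        print(f"RSS: {peak: 10}")
```

To reproduce one of the sections above, `work` would be a function that opens the image, calls `image.__arrow_c_array__()` (or `Array(image)`, or `pyarrow.array(image)`), and closes it again.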

Related to: #8329 #8330

wiredfool (Member) commented May 11, 2025

OK, looking at this in Valgrind clearly shows that the image memory is retained. This is the first Arrow section, run with fewer iterations:


    MB
71.09^                                                                    #   
     |                                                                   @#:::
     |                                                                @@@@#:::
     |                                                               @@@@@#:::
     |                                                             @@@@@@@#:::
     |                                                           @@@@@@@@@#:::
     |                                                         @@@@@@@@@@@#:::
     |                                                        @@@@@@@@@@@@#:::
     |                                                     :::@@@@@@@@@@@@#:::
     |                                                  : @:: @@@@@@@@@@@@#:::
     |                                                 :::@:: @@@@@@@@@@@@#:::
     |                                               :::::@:: @@@@@@@@@@@@#:::
     |                                              :: :::@:: @@@@@@@@@@@@#:::
     |                                           ::@:: :::@:: @@@@@@@@@@@@#:::
     |                                         ::::@:: :::@:: @@@@@@@@@@@@#:::
     |                                       ::::::@:: :::@:: @@@@@@@@@@@@#:::
     |                                     ::::::::@:: :::@:: @@@@@@@@@@@@#:::
     |                                   ::::::::::@:: :::@:: @@@@@@@@@@@@#:::
     |                                 :@::::::::::@:: :::@:: @@@@@@@@@@@@#:::
     |                      ::@:::::@:::@::::::::::@:: :::@:: @@@@@@@@@@@@#:::
   0 +----------------------------------------------------------------------->Gi
     0                                                                   4.157

It looks like an off-by-one error in the refcount implementation: the Image is not released when the PyCapsule uses child arrays. With a fix, I'm seeing flat memory usage. This is for all three Arrow sections.

    MB
9.763^                                                             :          
     |                  #::::::@::::@@:::::::::@::::::::@:::::::::@:    @:::: 
     |                  #: ::: @: ::@ ::::: :::@::::::::@: ::::: :@:    @:::: 
     |                  #: ::: @: ::@ ::::: :::@::::::::@: ::::: :@:::::@:::: 
     |                  #: ::: @: ::@ ::::: :::@::::::::@: ::::: :@:::::@:::: 
     |                  #: ::: @: ::@ ::::: :::@::::::::@: ::::: :@:::::@:::::
     |                 @#: ::: @: ::@ ::::: :::@::::::::@: ::::: :@:::::@:::::
     |                 @#: ::: @: ::@ ::::: :::@::::::::@: ::::: :@:::::@:::::
     |                :@#: ::: @: ::@ ::::: :::@::::::::@: ::::: :@:::::@:::::
     |              :::@#: ::: @: ::@ ::::: :::@::::::::@: ::::: :@:::::@:::::
     |              : :@#: ::: @: ::@ ::::: :::@::::::::@: ::::: :@:::::@:::::
     |             :: :@#: ::: @: ::@ ::::: :::@::::::::@: ::::: :@:::::@:::::
     |             :: :@#: ::: @: ::@ ::::: :::@::::::::@: ::::: :@:::::@:::::
     |            ::: :@#: ::: @: ::@ ::::: :::@::::::::@: ::::: :@:::::@:::::
     |   @@@@@::::::: :@#: ::: @: ::@ ::::: :::@::::::::@: ::::: :@:::::@:::::
     |  :@    :: :::: :@#: ::: @: ::@ ::::: :::@::::::::@: ::::: :@:::::@:::::
     |  :@    :: :::: :@#: ::: @: ::@ ::::: :::@::::::::@: ::::: :@:::::@:::::
     | ::@    :: :::: :@#: ::: @: ::@ ::::: :::@::::::::@: ::::: :@:::::@:::::
     | ::@    :: :::: :@#: ::: @: ::@ ::::: :::@::::::::@: ::::: :@:::::@:::::
     | ::@    :: :::: :@#: ::: @: ::@ ::::: :::@::::::::@: ::::: :@:::::@:::::
   0 +----------------------------------------------------------------------->Gi
     0                                                                   8.291
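
To illustrate the kind of bug described above, here is a toy Python model of an export that takes one reference for the parent array and one per child array. It is not Pillow's C implementation: `ImageBuffer`, `export_array`, and `simulate` are hypothetical names, and where exactly the miscount happens in Pillow is an assumption made for the sake of the example.

```python
class ImageBuffer:
    """Toy stand-in for reference-counted image memory (hypothetical)."""

    def __init__(self):
        self.refcount = 1  # reference held by the owning image
        self.freed = False

    def incref(self):
        self.refcount += 1

    def decref(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True  # the memory would be returned here


def export_array(buf, n_children, release_children):
    """Take one reference for the parent and one per child array.

    Returns the parent's release callback. With release_children=False,
    the callback forgets the children's references (the modeled bug).
    """
    buf.incref()
    for _ in range(n_children):
        buf.incref()

    def release():
        if release_children:
            for _ in range(n_children):
                buf.decref()
        buf.decref()

    return release


def simulate(release_children):
    buf = ImageBuffer()
    release = export_array(buf, n_children=1, release_children=release_children)
    release()     # the consumer is done with the exported array
    buf.decref()  # the owning image is closed
    return buf


if __name__ == "__main__":
    print("buggy:", simulate(release_children=False).freed)
    print("fixed:", simulate(release_children=True).freed)
```

In the buggy variant the count stops at 1 instead of 0, so the buffer is never freed, and each exported image stays in memory. That matches the steadily climbing first graph, while the balanced variant corresponds to the flat one.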
