-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
Adding an internal C-API for dynamic arrays #125543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Note for others: the general idea is to have a structure like a Python list that does not affect the reference counter state. |
How would this API address the leak in #124776? |
From my understanding, (part of) the leak in that PR was caused by incorrectly setting up a dynamic array--that could have been prevented with a proper internal API for it. |
I'm not convinced that we need such API. Do you have more examples of code which would benefit of such API? |
There are a number of array-like structures throughout the interpreter, namely on the interpreter and thread state off the top of my head. It would be nice if we didn't have to reinvent the wheel every time we wanted to do that :) The idea here is to get a nice unified interface in place, and then migrate some (but not all!) of the internal dynamic arrays to |
Do you have concrete examples? Can you point me to the code? |
Some that I found: My issue with this approach of creating a new linked list every time we need a new array internally is that a). linked lists have lower performance when used in this way (e.g. indexing is |
This is still not the level of explanation and concrete motivating examples that I'd like to see before adding a new abstraction. It's not clear to me that any of these fields are better suited to a generic dynamic array API than their currently implementations.
Most of these examples do not require indexing arbitrary elements. They also do not require an allocation for every addition. The linked list pointers are embedded in the structures. I don't think a generic |
Yeah, the ones that I found certainly shouldn't all be dynamic arrays--I was just making a point that we have a lot of array-like structures internally :) Agreeing with most of what you said--I don't really want to target this toward the interpreter core (though, I'm sure there are cases somewhere). The main beneficiaries here would be stdlib extensions. For example, asyncio has a linked list for futures, which causes complicated changes like this to exist for improvements. |
How would a dynamic array help here? asyncio uses linked list and there are good reasons for it, it avoids allocations, has fast addition and deletions and doesn't need traverse the list while adding duplicate entries, just check the pointers. |
So does a dynamic array :) The main upside in regards to that PR is that the change would be much simpler, really. |
How? doesn't it need to resize the array? also how can you check for duplicate entries without traversing the array? |
I think Victor and I are still looking for concrete examples where using the proposed API would be an improvement. Victor's original question was "Do you have more examples of code which would benefit of such API?" (and I am also looking for an answer to that same question). By concrete examples, I mean:
At this point, if you have a prototype of the dynamic array API, you should actually prototype the code changes for one or two of the fields to show how the usage would change. One or two good examples are much better than six vague examples. |
It does, but not often. The more it needs to get resized, the less it will need to. I'm not exactly certain how asyncio is checking duplicate entires in the linked list without traversing, but I imagine if it's just checking
I don't know. My impression from when I originally proposed the idea was that people had their own places where it would benefit, but I don't know where those places are. There's enough contention here that I don't think the change is worth pursuing, so I'm going to close this. |
I don't think that replacing a linked list with an array just to reuse code is a good rationale. I'm quite sure that if a linked list is used, there is a reason for that :-) |
In some cases, yeah, but C arrays are hard--I'm almost certain that linked lists have been retrofitted in some places simply because they're easier. Let's keep this idea in the back of our minds in case more uses come up :) |
Feature or enhancement
Proposal:
Per the leak introduced in #124776, it would be nice if we had an internal structure for dynamically allocated arrays, similar to how we have
_Py_hashtable_t
.cc @Eclips4 for bikeshedding the name and API (
_Py_dynarray_t
right now)Has this already been discussed elsewhere?
No response given
Links to previous discussion of this feature:
Discussed on discord
Linked PRs
The text was updated successfully, but these errors were encountered: