-
-
Notifications
You must be signed in to change notification settings - Fork 266
x86: Shadow stacks compatibility for interceptor #791
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I have tests failing locally and I'm taking a look. It seems that |
666e10e
to
8147f85
Compare
The current patch works on my machine (which has CET shadow stacks support) and still lets me intercept functions from CETCOMPAT modules in processes that use the user shadow stack mitigation on Windows. It is now ready for review. |
31b7474
to
ba13090
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yay, this is awesome! 🤩 Thanks a lot for doing this.
Since it took me so long to review due to a rabbit-hole that kept me busy for way too long, I went ahead and made the final tweaks needed to merge this (only a few minor tweaks). While doing that I noticed the CI acting up, so I'm going to get to the bottom of that and remove those debug commits before merging.
This makes Interceptor code compatible with Intel CET shadow stacks. - Detect CPUs that have support for CET shadow stacks. - Make the on_leave_trampoline a legit address to return to from a CET-enabled intercepted function's point of view. - Ensure a proper CALL/RET discipline. On Windows, this means that it will now be possible to intercept functions that live in CETCOMPAT modules when the user shadow stack mitigation is active (though not in strict mode).
fe86f1f
to
aceb3ae
Compare
Hey, you're welcome! Thanks for reviewing and integrating the patch. Let me know if some points are still unclear. Actually there are two points I would like to emphasize here. First, I took a somewhat arbitrary design decision. The question is essentially, do we want all users to run the same code always, although that includes instructions that are useless if the mitigation is not active? There are essentially three possibilities here: (a) ensure proper call/ret discipline for all x86 users always ; I would say (a) is the best long-term solution from a maintenance and stability point of view because it guarantees that the code that users run takes exactly the same code paths that were run during tests no matter what. (c) would be the exact opposite, only running the extra instructions when necessary, at the cost that you may discover subtle bugs at some point because the tests ran without shadow stack support and the user runs Frida in a process that has them, or vice versa. (c) also requires support from the operating system to know if the mitigation is currently active, so it is a bit more work on the portability and maintenance side. The current patch implements (b) which is some kind of a compromise based on the idea that if your CPU is recent enough to support shadow stacks, it's probably also fast enough to run the extra instructions without any significant performance cost and as time passes and people change their CPUs, (b) gets closer to (a) anyway. I would probably have gone for (a) if I owned the library, because I don't think these extra instructions will actually lead to a significant performance regression for anyone, but I went for (b) because I can't guarantee that. If you think (a) or (c) matches better with the goals for Frida, we could adapt the patch. Second, we're replacing a |
🙌
I totally agree that a) is the ideal from a maintenance and stability point of view. But I think we should strive to reduce our overhead over time, so we can minimize our impact on timings, to be able to observe a system without changing its behavior more than necessary. So I think I would lean towards c). Given that Interceptor is used for a lot of things and these trampolines are common to all use-cases, it seems worth it to trade some simplicity for speed.
Good point! I think this is fine for now. We currently ensure that we don't clobber any registers or CPU flags on x86, and I think it is important to retain this guarantee as Interceptor does indeed support instruction-level probes as well as targets with custom calling conventions. |
In that case, let me expand a bit on (c). Another issue with (c) is the following: if a process is currently not using the mitigation, could the mitigation become active later in that same process? Because if that is possible, then (c) is dangerous: we could be generating code now that would not be correct anymore at a later point in the process. My testing so far suggests that on Windows at least, either a process has the mitigation from the start, or it will never have it. That can be specified at process creation using |
Good point! It seems unlikely, but I don't know either.
Cool! I suppose we could do that for Windows only initially, and expand on it later as we learn more about the other platforms' internals. |
This patch makes interceptor code compatible with Intel CET shadow stacks.
On Windows, this means that it will now be possible to intercept functions that live in CETCOMPAT modules when the user shadow stack mitigation is active (though not in strict mode).