From a technical perspective, are there any options available to projects & maintainers to detect or avoid situations like the recent xz compromise? #143
-
Discourage the use of binary test cases -- instead, the repo should have scripts that generate those test case binaries when they're needed. Opaque binaries are very difficult to reason about. This doesn't cover all cases -- e.g., images, static data files, etc.
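A minimal sketch of this idea, assuming a hypothetical repo layout (`tests/fixtures/`) and a fixture consisting of a deliberately truncated `.xz` stream; both are illustrative, not taken from any particular project:

```python
# Regenerate a binary test fixture from code instead of committing the blob.
# The fixture here is a deliberately truncated .xz file: a valid stream
# starts with the 6-byte magic FD 37 7A 58 5A 00, so writing only the magic
# yields a deterministic "truncated input" test case anyone can audit.
import os

def make_truncated_xz(path: str) -> None:
    with open(path, "wb") as f:
        f.write(b"\xfd7zXZ\x00")  # .xz stream magic, then nothing

if __name__ == "__main__":
    os.makedirs("tests/fixtures", exist_ok=True)
    make_truncated_xz("tests/fixtures/truncated.xz")
    print("regenerated tests/fixtures/truncated.xz")
```

Because the fixture is produced by a few reviewable lines, a reviewer can reason about exactly which bytes end up in the test file.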
-
A heuristic of "high risk change" vs. "low risk change" might be helpful, based on the content of the PR and perhaps the committer's historical contributions -- e.g., "I see you only make typo fixes in JavaScript, but now you've tweaked the memory allocator in this low-level driver code". Relatedly, when a high-risk change in a critical project is identified -- either manually, perhaps by tagging the issue, or through a PR scanner -- that signal could be sent to a paid team of security professionals for review.
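One way such a heuristic could be sketched; the path patterns, weights, and `ChangedFile` shape below are invented for illustration, and a real scanner would pull diffs and contributor history from the forge's API:

```python
# Hypothetical "high risk vs. low risk change" scorer for a PR.
from dataclasses import dataclass

# Paths that plausibly deserve extra scrutiny (illustrative, not exhaustive).
RISKY_PATH_HINTS = ("alloc", "crypto", "auth", "m4/", "configure", "cmake")

@dataclass
class ChangedFile:
    path: str
    lines_changed: int

def risk_score(files: list[ChangedFile], prior_commits_to_area: int) -> float:
    score = 0.0
    for f in files:
        if any(hint in f.path.lower() for hint in RISKY_PATH_HINTS):
            score += 2.0                            # touches sensitive code
        score += min(f.lines_changed / 100.0, 1.0)  # large diffs add a little
    if prior_commits_to_area == 0:
        score *= 2.0  # contributor has never touched this area before
    return score

if __name__ == "__main__":
    pr = [ChangedFile("src/liblzma/check/crc64_fast.c", 40),
          ChangedFile("m4/build-to-host.m4", 20)]
    print(f"risk score: {risk_score(pr, prior_commits_to_area=0):.1f}")
```

High-scoring PRs could then be routed to the proposed team of security professionals without slowing down ordinary contributions.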
-
How about tracing the build/compilation process to see which files/resources get read? Unexpected reads -- e.g., a build step reading files that are supposedly test data, as happened with xz -- would stand out.
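A minimal sketch of this, assuming a Linux host with strace installed; the build command and the "reads under tests/ are suspicious" rule are illustrative assumptions:

```python
# Trace which files a build opens, then flag reads that a build step has no
# obvious business doing (in the xz case, build machinery read "test data").
import re
import subprocess

def trace_build(cmd: list[str], log: str = "build-trace.log") -> set[str]:
    # strace: -f follows child processes, -e trace=openat logs file opens.
    subprocess.run(["strace", "-f", "-e", "trace=openat", "-o", log] + cmd,
                   check=True)
    opened = set()
    with open(log) as f:
        for line in f:
            m = re.search(r'openat\([^"]*"([^"]+)"', line)
            if m:
                opened.add(m.group(1))
    return opened

if __name__ == "__main__":
    for path in sorted(trace_build(["make"])):
        if "/tests/" in path or path.startswith("tests/"):
            print("suspicious read during build:", path)
```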
-
Here are some xz backdoor (CVE-2024-3094) notes. There are MANY sources of info about the backdoor in versions 5.6.0 and 5.6.1 of the xz compression utility and library. It targeted OpenSSH. Andres Freund discovered the backdoor by noticing that failed SSH logins were taking a lot of CPU time while doing some micro-benchmarking, and tracked the problem down from there. It was introduced by xz co-maintainer "Jia Tan".

Here I'll quickly brainstorm some approaches that MIGHT have helped, with some possible pros and cons. I'm not necessarily advocating for them, since some have serious negatives, but maybe they'll spawn good ideas.

**Don't GENERATE tarballs or their contents (e.g., configure)**

The attack hid in a file that was putatively generated by autoconf, but wasn't. Autoconf is from a time when people generated tarballs with extra material in them; back then, you couldn't be sure the recipient would have all the necessary build tools. That's not usually a problem today. Instead, you could distribute source tarballs that are strictly copies of a version from a version control system, and since the latter is what people review, that would make sense. This is a big change from how autoconf was originally intended to be used, but it's not that hard to make. It would mean that to build, you'd need autotools installed, and the first step when building the package would be "autoreconf" (which calls the right tools in the right order). You could then verify that what you received is what was stored in the source repo, which is an easy check once nothing is generated. Bonus points if we could make archives reproducible: currently an archive (tarball, zip, etc.) can have its bits change without a change in its contents.

**Verify or ignore generated files in tarballs**

You could also forcibly regenerate (e.g., with autoreconf) files that were supposedly generated.

**Least privilege during prep, build, and test**

It'd be possible to build systems that greatly reduce privileges during each of these phases -- e.g., running the build and test steps with no network access.
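A minimal sketch of the no-network variant, assuming a Linux host where unprivileged user namespaces are enabled (so util-linux `unshare` works without root); the `make check` command is a placeholder:

```python
# Run a build or test step inside a fresh, empty network namespace so that
# nothing it executes can phone home or fetch unexpected inputs.
import subprocess
import sys

def run_without_network(cmd: list[str]) -> int:
    # unshare -r maps the current user to root *inside* the namespace only;
    # unshare -n creates a new network namespace with no usable interfaces.
    return subprocess.run(["unshare", "-r", "-n"] + cmd).returncode

if __name__ == "__main__":
    sys.exit(run_without_network(["make", "check"]))
```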
**Forbid/warn on embedded binary files**

Scorecard already ranks these as risks. That said, there are many cases where tests legitimately need binary files, so this is harder to pull off broadly. (A small scanner sketch follows at the end of this comment.)

**Least privilege on link**

It's really unusual for one program to need to modify the link-time processing of another. Can we disable that, at least in many cases, and require a separate permission to grant it -- one that an attacker can't manipulate?

**Developers knowing developers**

Nobody knows who the attacker is (likely a team). I don't think people need to give up privacy just to contribute code. However, I think it's important to either (1) have the other co-maintainers know and have met the maintainer, or (2) treat an anonymous maintainer with greater suspicion, or ideally (3) both.

**Multi-person review**

Part of the problem here was that an overworked maintainer was bullied into accepting another maintainer. If every change were reviewed by multiple people before acceptance, and it was KNOWN that they were different people, this attack would not have worked. Sure, you can have multiple malicious parties, but that is harder.

**Other ideas?**

I'm sure there are other ideas. It'd be wise to discuss how to make these kinds of attacks harder to perform.
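Regarding the "forbid/warn on embedded binary files" idea above, here's a minimal scanner sketch; the NUL-byte heuristic is a crude stand-in for whatever a real check (like Scorecard's) would do:

```python
# Walk a repository and flag files that look binary (contain NUL bytes in the
# first few KB). Crude, but enough to surface opaque blobs for review.
import os
import sys

def looks_binary(path: str, probe: int = 8192) -> bool:
    try:
        with open(path, "rb") as f:
            return b"\x00" in f.read(probe)
    except OSError:
        return False  # unreadable file or dangling symlink; skip it

def scan(repo_root: str) -> list[str]:
    hits = []
    for dirpath, dirnames, filenames in os.walk(repo_root):
        dirnames[:] = [d for d in dirnames if d != ".git"]  # skip VCS metadata
        for name in filenames:
            path = os.path.join(dirpath, name)
            if looks_binary(path):
                hits.append(path)
    return hits

if __name__ == "__main__":
    for path in scan(sys.argv[1] if len(sys.argv) > 1 else "."):
        print("binary file:", path)
```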
-
It did get caught due to the second-order effect of a performance regression. I know I'll be looking more closely at our regression test suites around Go and OpenJDK, and asking the author "I see a difference of X here; let's look at the likely root cause of that".
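A minimal sketch of mechanizing that comparison, assuming a hypothetical JSON result format of `{"benchmark_name": seconds}` and an arbitrary 10% threshold:

```python
# Compare two benchmark runs and flag anything meaningfully slower, so a
# reviewer can ask "I see a difference of X here, what's the root cause?"
import json
import sys

THRESHOLD = 0.10  # flag results more than 10% slower than baseline (assumed)

def load(path: str) -> dict[str, float]:
    with open(path) as f:
        return json.load(f)

def regressions(baseline: dict[str, float], current: dict[str, float]):
    for name, base in sorted(baseline.items()):
        new = current.get(name)
        if new is not None and base > 0 and (new - base) / base > THRESHOLD:
            yield name, base, new

if __name__ == "__main__":
    base, cur = load(sys.argv[1]), load(sys.argv[2])
    for name, old, new in regressions(base, cur):
        print(f"{name}: {old:.3f}s -> {new:.3f}s (+{(new - old) / old:.0%})")
```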
-
**Identify the top OSS projects (especially with solo maintainers) and fund added work/review of changes**

It's obviously not possible to seriously fund "all OSS projects", but focusing on the "most important ones" is plausible. That's the premise of the OpenSSF Securing Critical Projects Working Group. In the case of xz, we did have specific warning: a report I wrote in 2015. I was the lead author of the 2015 report "Open Source Software Projects Needing Security Investments", which was funded by the Linux Foundation. The goal was a relatively quick analysis to determine what the "most important OSS" was; we agreed to focus on packages in Linux distributions (specifically Debian). I just double-checked: xz-utils was specifically listed as one of the riskiest, because it is a very widely used compression algorithm. Quick note: it's colloquially often referred to as "Census I", but that title never occurs in the paper; the name was applied after the fact due to the later "Census II" work by Harvard on language-level packages. You can get the 2015 paper here: https://www.ida.org/-/media/feature/publications/o/op/open-source-software-projects-needing-security-investments/d-5459.ashx or https://github.com/coreinfrastructure/census/blob/master/OSS-2015-06-19.pdf (on the latter, click "download").

**Load at run-time only what you need (e.g., dlopen)**

A useful countermeasure is to load software only when you need to use it. Depending on details, this can significantly reduce the number of systems affected by a specific attack. The systemd folks have been exploring this: "A while back we started to turn many of the library dependencies of systemd from regular ELF dependencies (which you can explore with tools like lddtree or readelf -d … | grep NEEDED) into dlopen() deps, in order to minimize the dep footprint of systemd." This potentially reduces the number of systems that are vulnerable: if the software isn't normally loaded and the attacker can't trigger its loading, the system won't be exploitable even though the vulnerability technically exists on disk. In theory this could also reduce memory/disk use, though since most systems today page memory in and out, that effect is less significant than it would have been on old systems. One disadvantage of switching from ELF dependencies to dlopen() dependencies is that it suddenly makes many dependencies nearly invisible to other tools. The systemd folks have a nice solution for this: they propose creating a new ELF section that records this information, so tools can get the dependency data easily. See: systemd/systemd#32234. Bonus points if dlopen() is rigged to only allow opening a file when a corresponding ELF section records it as a dependency; that would mean you have to create an approvelist of libraries that can be dlopen'ed.
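To make the dlopen() idea concrete, here's a minimal Python sketch using ctypes (which wraps dlopen() on Linux); `lzma_version_string()` is a real liblzma function, but the lazy-loading structure is just an illustration of the pattern:

```python
# Load liblzma only when a caller actually needs it, so merely starting the
# program never maps the library into the process.
import ctypes
import ctypes.util

_lzma = None

def _load_liblzma():
    global _lzma
    if _lzma is None:
        name = ctypes.util.find_library("lzma")  # e.g. "liblzma.so.5"
        if name is None:
            raise OSError("liblzma not found")
        _lzma = ctypes.CDLL(name)  # the dlopen() happens here, on first use
    return _lzma

def lzma_version() -> str:
    # lzma_version_string() returns a const char* with the library version.
    lib = _load_liblzma()
    lib.lzma_version_string.restype = ctypes.c_char_p
    return lib.lzma_version_string().decode()

if __name__ == "__main__":
    print("liblzma version (loaded lazily):", lzma_version())
```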
-
Create linked provenance documents with in-toto and check them against a layout. https://github.com/in-toto/witness is a great tool for this. Full disclosure: I am a maintainer, and on the in-toto steering committee.
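A minimal sketch of that flow, driving the in-toto reference implementation's CLI tools from Python; the step name, key files, layout file, and build command are all placeholders, and the flags shown follow the in-toto demo, so treat them as assumptions to check against your installed version:

```python
# Record a signed link for the build step, then verify all collected link
# metadata against a signed layout describing who may perform which steps.
import subprocess

def record_build_step():
    subprocess.run([
        "in-toto-run",
        "--step-name", "build",
        "--key", "builder-key",           # placeholder functionary key
        "--materials", "src/",            # inputs hashed before the step
        "--products", "dist/app.tar.gz",  # outputs hashed after the step
        "--", "make", "release",
    ], check=True)

def verify_supply_chain():
    subprocess.run([
        "in-toto-verify",
        "--layout", "root.layout",         # placeholder signed layout
        "--layout-key", "owner-key.pub",   # placeholder project-owner key
    ], check=True)

if __name__ == "__main__":
    record_build_step()
    verify_supply_chain()
```

witness, the tool named above, provides a similar record-and-verify flow built around attestations; its commands differ from the reference CLI, so see its README for the equivalent steps.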
-
This goes hand-in-hand with an issue the WG is creating to identify, and advocate for, means developers can take to protect themselves from the social engineering and bullying exemplified in this attack pattern (https://boehs.org/node/everything-i-know-about-the-xz-backdoor) -- see Vuln WG issue #142.
What tools, technologies, or processes are available (or could be written) that would help maintainers and people involved in the downstream supply chain identify and prevent this style of attack?