Commit 0680e2b

merge #4221 into opencontainers/runc:main

Aleksa Sarai (5):
  VERSION: back to development
  VERSION: release v1.2.0-rc.1
  changelog: update to include all new changes since 1.1.0
  changelog: sync changelog entries up to runc 1.1.12
  changelog: mention key breaking changes for mount options

LGTMs: lifubang AkihiroSuda kolyshkin cyphar
2 parents 4641f17 + 5194bd8 commit 0680e2b

2 files changed

+217
-7
lines changed

CHANGELOG.md

+216-6
@@ -6,10 +6,99 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
66

77
## [Unreleased]
88

9+
## [1.2.0-rc.1] - 2024-04-03
10+
11+
> There's a frood who really knows where his towel is.
12+
13+
`runc` now requires a minimum of Go 1.20 to compile.
14+
915
> **NOTE**: runc currently will not work properly when compiled with Go 1.22 or
1016
> newer. This is due to some unfortunate glibc behaviour that Go 1.22
1117
> exacerbates in a way that results in containers not being able to start on
12-
> some systems. [See this issue for more information.][runc-4233].
18+
> some systems. [See this issue for more information.][runc-4233]
19+
20+
[runc-4233]: https://github.com/opencontainers/runc/issues/4233
21+
22+
### Breaking
23+
24+
* Several aspects of how mount options work have been adjusted in a way that
25+
could theoretically break users that have very strange mount option strings.
26+
This was necessary to fix glaring issues in how mount options were being
27+
treated. The key changes are:
28+
29+
- Mount options on bind-mounts that clear a mount flag are now always
30+
applied. Previously, if a user requested a bind-mount with only clearing
31+
options (such as `rw,exec,dev`) the options would be ignored and the
32+
original bind-mount options would be set. Unfortunately this also means
33+
that container configurations which specified only clearing mount options
34+
will now actually get what they asked for, which could break existing
35+
containers (though it seems unlikely that a user who requested a specific
36+
mount option would consider it "broken" to get the mount options they
37+
asked for). This also allows us to
39+
silently add locked mount flags the user *did not explicitly request to be
40+
cleared* in rootless mode, allowing for easier use of bind-mounts for
41+
rootless containers. (#3967)
42+
43+
- Container configurations using bind-mounts with superblock mount flags
44+
(i.e. filesystem-specific mount flags, referred to as "data" in
45+
`mount(2)`, as opposed to VFS generic mount flags like `MS_NODEV`) will
46+
now return an error. This is because superblock mount flags will also
47+
affect the host mount (as the superblock is shared when bind-mounting),
48+
which is obviously not acceptable. Previously, these flags were silently
49+
ignored so this change simply tells users that runc cannot fulfil their
50+
request rather than just ignoring it. (#3990)
51+
52+
If any of these changes cause problems in real-world workloads, please [open
53+
an issue](https://github.com/opencontainers/runc/issues/new/choose) so we
54+
can adjust the behaviour to avoid compatibility issues.
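
To make the two mount-option changes above more concrete, here is a minimal Go sketch of how an option string could be split into flags to set, flags to clear, and leftover filesystem-specific ("data") options. The `known` table, `classify` and `checkBind` are hypothetical names invented for illustration and are not runc's implementation; only four flag pairs are modelled, and the `MS_*` values are hard-coded from `<linux/mount.h>` to keep the example portable.

```go
package main

import (
	"fmt"
	"strings"
)

// Values of a few generic VFS mount flags from <linux/mount.h>.
const (
	msRdonly = 0x1
	msNosuid = 0x2
	msNodev  = 0x4
	msNoexec = 0x8
)

// optFlag describes one option: which flag it touches and whether it
// clears that flag (e.g. "rw" clears MS_RDONLY) or sets it.
type optFlag struct {
	clears bool
	flag   uintptr
}

// known is a deliberately tiny, illustrative option table (assumption).
var known = map[string]optFlag{
	"ro": {false, msRdonly}, "rw": {true, msRdonly},
	"nosuid": {false, msNosuid}, "suid": {true, msNosuid},
	"nodev": {false, msNodev}, "dev": {true, msNodev},
	"noexec": {false, msNoexec}, "exec": {true, msNoexec},
}

// classify splits a comma-separated option string into flags to set,
// flags to clear, and unrecognised options that would be passed as "data".
// Later options override earlier ones for the same flag.
func classify(opts string) (setFlags, clearFlags uintptr, data []string) {
	for _, o := range strings.Split(opts, ",") {
		if o == "" {
			continue
		}
		f, ok := known[o]
		switch {
		case !ok:
			data = append(data, o)
		case f.clears:
			clearFlags |= f.flag
			setFlags &^= f.flag
		default:
			setFlags |= f.flag
			clearFlags &^= f.flag
		}
	}
	return setFlags, clearFlags, data
}

// checkBind mirrors the second change: a bind-mount that carries
// filesystem-specific options is rejected instead of silently ignored.
func checkBind(opts string) error {
	if _, _, data := classify(opts); len(data) > 0 {
		return fmt.Errorf("bind-mount cannot apply filesystem-specific options %q", data)
	}
	return nil
}

func main() {
	for _, opts := range []string{"rw,exec,dev", "ro,nodev", "ro,size=64m"} {
		s, c, d := classify(opts)
		fmt.Printf("%-12s set=%#x clear=%#x data=%v bindErr=%v\n", opts, s, c, d, checkBind(opts))
	}
}
```

In this sketch a string such as `rw,exec,dev` yields only clearing flags, which is exactly the case that was previously ignored for bind-mounts.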
55+
56+
### Added
57+
58+
* runc has been updated to OCI runtime-spec 1.2.0, and supports all Linux
59+
features with a few minor exceptions. See
60+
[`docs/spec-conformance.md`](https://github.com/opencontainers/runc/blob/v1.2.0-rc.1/docs/spec-conformance.md)
61+
for more details.
62+
* runc now supports id-mapped mounts for bind-mounts (with no restrictions on
63+
the mapping used for each mount). Other mount types are not currently
64+
supported. This feature requires `MOUNT_ATTR_IDMAP` kernel support (Linux
65+
5.12 or newer) as well as kernel support for the underlying filesystem used
66+
for the bind-mount. See [`mount_setattr(2)`][mount_setattr.2] for a list of
67+
supported filesystems and other restrictions. (#3717, #3985, #3993)
68+
* Two new mechanisms for reducing the memory usage of our protections against
69+
[CVE-2019-5736][cve-2019-5736] have been introduced:
70+
- `runc-dmz` is a minimal binary (~8K) which acts as an additional execve
71+
stage, allowing us to only need to protect the smaller binary. It should
72+
be noted that there have been several compatibility issues reported with
73+
the usage of `runc-dmz` (namely related to capabilities and SELinux). As
74+
such, this mechanism is **opt-in** and can be enabled by running `runc`
75+
with the environment variable `RUNC_DMZ=true` (setting this environment
76+
variable in `config.json` will have no effect). This feature can be
77+
disabled at build time using the `runc_nodmz` build tag. (#3983, #3987)
78+
- `contrib/memfd-bind` is a helper daemon which will bind-mount a memfd copy
79+
of `/usr/bin/runc` on top of `/usr/bin/runc`. This entirely eliminates
80+
per-container copies of the binary, but requires care to ensure that
81+
upgrades to runc are handled properly, and requires a long-running daemon
82+
(unfortunately memfds cannot be bind-mounted directly and thus require a
83+
daemon to keep them alive). (#3987)
84+
* runc will now use `cgroup.kill` if available to kill all processes in a
85+
container (such as when doing `runc kill`). (#3135, #3825)
86+
* Add support for setting the umask for `runc exec`. (#3661)
87+
* libct/cg: support `SCHED_IDLE` for runc cgroupfs. (#3377)
88+
* checkpoint/restore: implement `--manage-cgroups-mode=ignore`. (#3546)
89+
* seccomp: refactor flags support; add flags to features, set `SPEC_ALLOW` by
90+
default. (#3588)
91+
* libct/cg/sd: use systemd v240+ new `MAJOR:*` syntax. (#3843)
92+
* Support CFS bandwidth burst for CPU. (#3749, #3145)
93+
* Support time namespaces. (#3876)
94+
* Reduce the `runc` binary size by ~11% by updating
95+
`github.com/checkpoint-restore/go-criu`. (#3652)
96+
* Add `--pidfd-socket` to `runc run` and `runc exec` to allow for management
97+
processes to receive a pidfd for the new process, allowing them to avoid pid
98+
reuse attacks. (#4045)
99+
100+
[mount_setattr.2]: https://man7.org/linux/man-pages/man2/mount_setattr.2.html
101+
[cve-2019-5736]: https://github.com/advisories/GHSA-gxmr-w5mj-v8hh
13102

14103
### Deprecated
15104

@@ -21,12 +110,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
21110
to kill a container (with SIGKILL) which does not have its own private PID
22111
namespace (so that runc would send SIGKILL to all processes). Now, this is
23112
done automatically. (#3864, #3825)
113+
* `github.com/opencontainers/runc/libcontainer/user` is now deprecated, please
114+
use `github.com/moby/sys/user` instead. It will be removed in a future
115+
release. (#4017)
24116

25117
### Changed
26118

27119
* When Intel RDT feature is not available, its initialization is skipped,
28120
resulting in slightly faster `runc exec` and `runc run`. (#3306)
29-
* Enforce absolute paths for mounts. (#3020, #3717)
121+
* `runc features` is no longer experimental. (#3861)
30122
* libcontainer users that create and kill containers from a daemon process
31123
(so that the container init is a child of that process) must now implement
32124
a proper child reaper in case a container does not have its own private PID
@@ -40,6 +132,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
40132
For cgroupv1, `Usage` and `Failcnt` are set by subtracting memory usage
41133
from memory+swap usage. For cgroupv2, `Usage`, `Limit`, and `MaxUsage`
42134
are set. (#4010)
135+
* libcontainer users that create and kill containers from a daemon process
136+
(so that the container init is a child of that process) must now implement
137+
a proper child reaper in case a container does not have its own private PID
138+
namespace, as documented in `container.Signal`. (#3825)
139+
* libcontainer: `container.Signal` no longer takes an `all` argument. Whether
140+
or not it is necessary to kill all processes in the container individually
141+
is now determined automatically. (#3825, #3885)
142+
* seccomp: enable seccomp binary tree optimization. (#3405)
143+
* `runc run`/`runc exec`: ignore SIGURG. (#3368)
144+
* Remove tun/tap from the default device allowlist. (#3468)
145+
* `runc --root non-existent-dir list` now reports an error for non-existent
146+
root directory. (#3374)
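
For libcontainer users affected by the child reaper requirement above, one possible shape of a reaper is sketched below in Go. This is an assumption-laden illustration rather than the approach documented in `container.Signal`: `reapChildren` is a hypothetical helper, it is Linux-specific, and calling `Wait4` with pid `-1` will also collect children started through `os/exec`, so a real daemon has to coordinate the two.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// reapChildren collects every child process that has already exited,
// without blocking, and returns the PIDs it reaped.
func reapChildren() []int {
	var reaped []int
	for {
		var status syscall.WaitStatus
		pid, err := syscall.Wait4(-1, &status, syscall.WNOHANG, nil)
		if err == syscall.EINTR {
			continue
		}
		// err is ECHILD when there are no children at all; pid is 0
		// when children exist but none has exited yet.
		if err != nil || pid <= 0 {
			return reaped
		}
		reaped = append(reaped, pid)
	}
}

func main() {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGCHLD)
	defer signal.Stop(sigs)

	// A daemon would loop on sigs for its whole lifetime and call
	// reapChildren after each SIGCHLD. This demo does one non-blocking
	// pass so that it terminates.
	select {
	case <-sigs:
	default:
	}
	fmt.Println("reaped:", reapChildren())
}
```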
43147

44148
### Fixed
45149

@@ -50,9 +154,108 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
50154
support would return `-EPERM` despite the existence of the `-ENOSYS` stub
51155
code (this was due to how s390x does syscall multiplexing). (#3474)
52156
* Remove tun/tap from the default device rules. (#3468)
53-
* specconv: avoid mapping "acl" to MS_POSIXACL. (#3739)
157+
* specconv: avoid mapping "acl" to `MS_POSIXACL`. (#3739)
158+
* libcontainer: fix private PID namespace detection when killing the
159+
container. (#3866, #3825)
160+
* systemd socket notification: fix race where runc exited before systemd
161+
properly handled the `READY` notification. (#3291, #3293)
162+
* The `-ENOSYS` seccomp stub is now always generated for the native
163+
architecture that `runc` is running on. This is needed to work around some
164+
arguably specification-incompliant behaviour from Docker on architectures
165+
such as ppc64le, where the allowed architecture list is set to `null`. This
166+
ensures that we always generate at least one `-ENOSYS` stub for the native
167+
architecture even with these weird configs. (#4219)
54168

55-
[runc-4233]: https://github.com/opencontainers/runc/issues/4233
169+
### Removed
170+
171+
* In order to fix performance issues in the "lightweight" bindfd protection
172+
against [CVE-2019-5736][cve-2019-5736], the temporary `ro` bind-mount of
173+
`/proc/self/exe` has been removed. runc now creates a binary copy in all
174+
cases. See the above notes about `memfd-bind` and `runc-dmz` as well as
175+
`contrib/cmd/memfd-bind/README.md` for more information about how this
176+
(minor) change in memory usage can be further reduced. (#3987, #3599, #2532,
177+
#3931)
178+
* libct/cg: Remove `EnterPid` (a function with no users). (#3797)
179+
* libcontainer: Remove `{Pre,Post}MountCmds` which were never used and are
180+
obsoleted by more generic container hooks. (#3350)
181+
182+
[cve-2019-5736]: https://github.com/advisories/GHSA-gxmr-w5mj-v8hh
183+
184+
## [1.1.12] - 2024-01-31
185+
186+
> Now you're thinking with Portals™!
187+
188+
### Security
189+
190+
* Fix [CVE-2024-21626][cve-2024-21626], a container breakout attack that took
191+
advantage of a file descriptor that was leaked internally within runc (but
192+
never leaked to the container process). In addition to fixing the leak,
193+
several strict hardening measures were added to ensure that future internal
194+
leaks could not be used to break out in this manner again. Based on our
195+
research, while no other container runtime had a similar leak, none had any
196+
of the hardening steps we've introduced (and some runtimes would not check
197+
for any file descriptors that a calling process may have leaked to them,
198+
allowing for container breakouts due to basic user error).
199+
200+
[cve-2024-21626]: https://github.com/opencontainers/runc/security/advisories/GHSA-xr7r-f8xq-vfvv
201+
202+
## [1.1.11] - 2024-01-01
203+
204+
> Happy New Year!
205+
206+
### Fixed
207+
208+
* Fix several issues with userns path handling. (#4122, #4124, #4134, #4144)
209+
210+
### Changed
211+
212+
* Support memory.peak and memory.swap.peak in cgroups v2.
213+
Add `swapOnlyUsage` in `MemoryStats`. This field reports swap-only usage.
214+
For cgroupv1, `Usage` and `Failcnt` are set by subtracting memory usage
215+
from memory+swap usage. For cgroupv2, `Usage`, `Limit`, and `MaxUsage`
216+
are set. (#4000, #4010, #4131)
217+
* build(deps): bump github.com/cyphar/filepath-securejoin. (#4140)
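
The cgroupv1 subtraction described in the `swapOnlyUsage` entry above can be sketched as follows. `swapOnlyUsage` here is a hypothetical free function covering only the `Usage` value, and clamping to zero when the memory+swap counter is smaller than the memory counter is an assumption of this sketch, not something the changelog states.

```go
package main

import "fmt"

// swapOnlyUsage derives swap-only usage for cgroup v1 by subtracting
// memory usage from combined memory+swap usage. Both inputs are bytes.
// The result is clamped at zero (assumption) because the two counters
// are read at slightly different times and unsigned subtraction would
// otherwise wrap around.
func swapOnlyUsage(memUsage, memSwapUsage uint64) uint64 {
	if memSwapUsage < memUsage {
		return 0
	}
	return memSwapUsage - memUsage
}

func main() {
	const mib = 1 << 20
	fmt.Println(swapOnlyUsage(100*mib, 300*mib) / mib)
}
```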
218+
219+
## [1.1.10] - 2023-10-31
220+
221+
> Śruba, przykręcona we śnie, nie zmieni sytuacji, jaka panuje na jawie.
222+
223+
### Added
224+
225+
* Support for `hugetlb.<pagesize>.rsvd` limiting and accounting. Fixes the
226+
issue of postgres failing when hugepage limits are set. (#3859, #4077)
227+
228+
### Fixed
229+
230+
* Fixed permissions of newly created directories to not depend on the value
231+
of umask in the tmpcopyup feature implementation. (#3991, #4060)
232+
* libcontainer: cgroup v1 GetStats now ignores missing `kmem.limit_in_bytes`
233+
(fixes the compatibility with Linux kernel 6.1+). (#4028)
234+
* Fix a semi-arbitrary cgroup write bug when given a malicious hugetlb
235+
configuration. This issue is not a security issue because it requires a
236+
malicious `config.json`, which is outside of our threat model. (#4103)
237+
* Various CI fixes. (#4081, #4055)
238+
239+
## [1.1.9] - 2023-08-10
240+
241+
> There is a crack in everything. That's how the light gets in.
242+
243+
### Added
244+
245+
* Added Go 1.21 to the CI matrix; other CI updates. (#3976, #3958)
246+
247+
### Fixed
248+
249+
* Fixed losing sticky bit on tmpfs (a regression in 1.1.8). (#3952, #3961)
250+
* intelrdt: fixed ignoring ClosID on some systems. (#3550, #3978)
251+
252+
### Changed
253+
254+
* Sum `anon` and `file` from `memory.stat` for cgroupv2 root usage,
255+
as the root does not have `memory.current` for cgroupv2.
256+
This aligns cgroupv2 root usage more closely with cgroupv1 reporting.
257+
Additionally, report root swap usage as sum of swap and memory usage,
258+
aligned with v1 and existing non-root v2 reporting. (#3933)
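
The root usage calculation described above (summing `anon` and `file` from `memory.stat`) could look roughly like this in Go. `rootUsage` is a hypothetical helper that takes the file contents as a string; it is a sketch of the arithmetic, not runc's parser, and the sample numbers in `main` are made up.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// rootUsage sums the "anon" and "file" counters from the contents of a
// cgroup v2 memory.stat file, where each line is "<key> <value>".
// It returns an error if either counter is missing or malformed.
func rootUsage(memoryStat string) (uint64, error) {
	var total uint64
	seen := map[string]bool{}
	sc := bufio.NewScanner(strings.NewReader(memoryStat))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 || (fields[0] != "anon" && fields[0] != "file") {
			continue // other counters are not needed for this sum
		}
		v, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			return 0, fmt.Errorf("parsing %q: %w", fields[0], err)
		}
		total += v
		seen[fields[0]] = true
	}
	if err := sc.Err(); err != nil {
		return 0, err
	}
	if !seen["anon"] || !seen["file"] {
		return 0, fmt.Errorf("memory.stat is missing anon or file")
	}
	return total, nil
}

func main() {
	sample := "anon 4096\nfile 8192\nkernel_stack 16384\n"
	usage, err := rootUsage(sample)
	fmt.Println(usage, err)
}
```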
56259

57260
## [1.1.8] - 2023-07-20
58261

@@ -472,7 +675,7 @@ implementation (libcontainer) is *not* covered by this policy.
472675
cgroups at all during `runc update`). (#2994)
473676

474677
<!-- minor releases -->
475-
[Unreleased]: https://github.com/opencontainers/runc/compare/v1.1.0...HEAD
678+
[Unreleased]: https://github.com/opencontainers/runc/compare/v1.2.0-rc.1...HEAD
476679
[1.1.0]: https://github.com/opencontainers/runc/compare/v1.1.0-rc.1...v1.1.0
477680
[1.0.0]: https://github.com/opencontainers/runc/releases/tag/v1.0.0
478681

@@ -483,7 +686,11 @@ implementation (libcontainer) is *not* covered by this policy.
483686
[1.0.1]: https://github.com/opencontainers/runc/compare/v1.0.0...v1.0.1
484687

485688
<!-- 1.1.z patch releases -->
486-
[Unreleased 1.1.z]: https://github.com/opencontainers/runc/compare/v1.1.8...release-1.1
689+
[Unreleased 1.1.z]: https://github.com/opencontainers/runc/compare/v1.1.12...release-1.1
690+
[1.1.12]: https://github.com/opencontainers/runc/compare/v1.1.11...v1.1.12
691+
[1.1.11]: https://github.com/opencontainers/runc/compare/v1.1.10...v1.1.11
692+
[1.1.10]: https://github.com/opencontainers/runc/compare/v1.1.9...v1.1.10
693+
[1.1.9]: https://github.com/opencontainers/runc/compare/v1.1.8...v1.1.9
487694
[1.1.8]: https://github.com/opencontainers/runc/compare/v1.1.7...v1.1.8
488695
[1.1.7]: https://github.com/opencontainers/runc/compare/v1.1.6...v1.1.7
489696
[1.1.6]: https://github.com/opencontainers/runc/compare/v1.1.5...v1.1.6
@@ -493,3 +700,6 @@ implementation (libcontainer) is *not* covered by this policy.
493700
[1.1.2]: https://github.com/opencontainers/runc/compare/v1.1.1...v1.1.2
494701
[1.1.1]: https://github.com/opencontainers/runc/compare/v1.1.0...v1.1.1
495702
[1.1.0-rc.1]: https://github.com/opencontainers/runc/compare/v1.0.0...v1.1.0-rc.1
703+
704+
<!-- 1.2.z patch releases -->
705+
[1.2.0-rc.1]: https://github.com/opencontainers/runc/compare/v1.1.0...v1.2.0-rc.1

VERSION

+1-1
@@ -1 +1 @@
1-
1.1.0+dev
1+
1.2.0-rc.1+dev
