- Start Date: (fill me in with today's date, 2014-09-08)
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary

This RFC proposes to remove the *runtime system* that is currently part of the
standard library, which allows the standard library to support both native and
green threading. In particular:

* The `libgreen` crate and associated support will be moved out of tree, into a
  separate Cargo package.

* The `librustrt` (the runtime) crate will be removed entirely.

* The `std::io` implementation will be directly welded to native threads and
  system calls.

* The `std::io` module will remain completely cross-platform, though *separate*
  platform-specific modules may be added at a later time.

# Motivation

## Background: thread/task models and I/O

Many languages/libraries offer some notion of "task" as a unit of concurrent
execution, possibly distinct from native OS threads. The characteristics of
tasks vary along several important dimensions:

* *1:1 vs M:N*. The most fundamental question is whether a "task" always
  corresponds to an OS-level thread (the 1:1 model), or whether there is some
  userspace scheduler that maps tasks onto worker threads (the M:N model). Some
  kernels -- notably, Windows -- support a 1:1 model where the scheduling is
  performed in userspace, which combines some of the advantages of the two
  models.

  In the M:N model, there are various choices about whether and when blocked
  tasks can migrate between worker threads. One basic downside of the model,
  however, is that if a task takes a page fault, the entire worker thread is
  essentially blocked until the fault is serviced. Choosing the optimal number
  of worker threads is difficult, and some frameworks attempt to do so
  dynamically, which has costs of its own.

* *Stack management*. In the 1:1 model, tasks are threads and therefore must be
  equipped with their own stacks. In M:N models, tasks may or may not need their
  own stack, but there are important tradeoffs:

  * Techniques like *segmented stacks* allow stack size to grow over time,
    meaning that tasks can be equipped with their own stack but still be
    lightweight. Unfortunately, segmented stacks come with
    [a significant performance and complexity cost](https://mail.mozilla.org/pipermail/rust-dev/2013-November/006314.html).

  * On the other hand, if tasks are not equipped with their own stack, they
    either cannot be migrated between underlying worker threads (the case for
    frameworks like Java's
    [fork/join](http://gee.cs.oswego.edu/dl/papers/fj.pdf)), or else must be
    implemented using *continuation-passing style (CPS)*, where each blocking
    operation takes a closure representing the work left to do. (CPS essentially
    moves the needed parts of the stack into the continuation closure.) The
    upside is that such tasks can be extremely lightweight -- essentially just
    the size of a closure. (See the sketch after this list.)

* *Blocking and I/O support*. In the 1:1 model, a task can block freely without
  any risk to other tasks, since each task is an OS thread. In the M:N model,
  however, blocking in the OS sense means blocking the worker thread. (The same
  applies to long-running loops or page faults.)

  M:N models can deal with blocking in a couple of ways. The approach taken in
  Java's [fork/join](http://gee.cs.oswego.edu/dl/papers/fj.pdf) framework, for
  example, is to dynamically spin up/down worker threads. Alternatively, special
  task-aware blocking operations (including I/O) can be provided, which are
  mapped under the hood to nonblocking operations, allowing the worker thread to
  continue. Unfortunately, this latter approach helps only with explicit
  blocking; it does nothing for loops, page faults and the like.
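
To make the CPS point concrete, below is a minimal sketch (in present-day Rust
syntax) of a toy scheduler whose tasks are just chains of closures: a "blocking"
operation hands the rest of the task to the scheduler as a continuation, so a
task never needs its own stack across blocking points. The names (`Scheduler`,
`read_then`) are illustrative only, not an API from any of the frameworks
mentioned above.

```rust
use std::collections::VecDeque;

// One runnable step of some task.
type Continuation = Box<dyn FnOnce(&mut Scheduler)>;

struct Scheduler {
    ready: VecDeque<Continuation>,
}

impl Scheduler {
    fn new() -> Scheduler {
        Scheduler { ready: VecDeque::new() }
    }

    // A "blocking" operation takes the rest of the task as a closure. A real
    // implementation would issue nonblocking I/O and enqueue the continuation
    // when the I/O completes; here the read is faked and enqueued immediately.
    fn read_then<F>(&mut self, source: &str, cont: F)
    where
        F: FnOnce(&mut Scheduler, String) + 'static,
    {
        let data = format!("<contents of {}>", source);
        self.ready
            .push_back(Box::new(move |sched: &mut Scheduler| cont(sched, data)));
    }

    fn run(&mut self) {
        while let Some(step) = self.ready.pop_front() {
            step(&mut *self);
        }
    }
}

fn main() {
    let mut sched = Scheduler::new();
    // Everything this "task" still has to do lives in the nested closures, so
    // its total footprint is essentially the size of those closures.
    sched.read_then("config.toml", |sched, config| {
        println!("step 1 read: {}", config);
        sched.read_then("data.bin", |_sched, data| {
            println!("step 2 read: {}", data);
        });
    });
    sched.run();
}
```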

### Where Rust is now

Rust has gradually migrated from a "green" threading model toward a native
threading model:

* In Rust's green threading, tasks are scheduled M:N and are equipped with their
  own stack. Initially, Rust used segmented stacks to allow growth over time,
  but that
  [was removed](https://mail.mozilla.org/pipermail/rust-dev/2013-November/006314.html)
  in favor of pre-allocated stacks, which means Rust's green threads are not
  "lightweight". The treatment of blocking is described below.

* In Rust's native threading model, tasks are 1:1 with OS threads.

Initially, Rust supported only the green threading model. Later, native
threading was added and ultimately became the default.

In today's Rust, there is a single I/O API -- `std::io` -- that provides
blocking operations only and works with both threading models.
Rust is somewhat unusual in allowing programs to mix native and green threading,
and furthermore allowing *some* degree of interoperation between the two. This
feat is achieved through the runtime system -- `librustrt` -- which exposes:

* The `Runtime` trait, which abstracts over the scheduler (via methods like
  `deschedule` and `spawn_sibling`) as well as the entire I/O API (via
  `local_io`).

* The `rtio` module, which provides a number of traits that define the standard
  I/O abstraction.

* The `Task` struct, which includes a `Runtime` trait object as the dynamic
  entry point into the runtime.

In this setup, `libstd` works directly against the runtime interface. When
invoking an I/O or scheduling operation, it first finds the current `Task`, and
then extracts the `Runtime` trait object to actually perform the operation.

On native tasks, blocking operations simply block. On green tasks, blocking
operations are routed through the green scheduler and/or underlying event loop
and nonblocking I/O.
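
The sketch below condenses this dispatch pattern (in present-day Rust syntax).
The trait and method shapes are simplified stand-ins for `librustrt`'s actual
interfaces -- the real `Runtime` and `rtio` traits carry many more methods and
different signatures -- but the double indirection is the point being
illustrated.

```rust
trait EventLoopIo {
    fn open_file(&mut self, path: &str) -> Result<Vec<u8>, String>;
}

trait Runtime {
    // Scheduling hooks (librustrt exposes e.g. `deschedule`, `spawn_sibling`).
    fn yield_now(&mut self);
    // Entry point to the I/O implementation (`local_io` in librustrt).
    fn local_io(&mut self) -> &mut dyn EventLoopIo;
}

// The current task carries its runtime as a trait object, chosen at runtime to
// be either the native or the green implementation.
struct Task {
    runtime: Box<dyn Runtime>,
}

// What a `std::io` call boils down to today: find the current task, then
// dynamically dispatch into whichever runtime it happens to carry.
fn read_file(task: &mut Task, path: &str) -> Result<Vec<u8>, String> {
    task.runtime.local_io().open_file(path)
}

// A minimal "native" implementation so the sketch runs end to end.
struct NativeIo;

impl EventLoopIo for NativeIo {
    fn open_file(&mut self, path: &str) -> Result<Vec<u8>, String> {
        std::fs::read(path).map_err(|e| e.to_string())
    }
}

struct NativeRuntime {
    io: NativeIo,
}

impl Runtime for NativeRuntime {
    fn yield_now(&mut self) {
        std::thread::yield_now();
    }
    fn local_io(&mut self) -> &mut dyn EventLoopIo {
        &mut self.io
    }
}

fn main() {
    let mut task = Task {
        runtime: Box::new(NativeRuntime { io: NativeIo }),
    };
    // Even on a purely native task, every I/O call pays for the trait-object
    // indirection before any system call happens.
    let result = read_file(&mut task, "/etc/hostname");
    println!("{:?}", result.map(|bytes| bytes.len()));
}
```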

The actual scheduler and I/O implementations -- `libgreen` and `libnative` --
then live as crates "above" `libstd`.

## The problems

While the situation described above may sound good in principle, there are
several problems in practice.

**Forced co-evolution.** With today's design, the green and native
  threading models must provide the same I/O API at all times. But
  there is functionality that is only appropriate or efficient in one
  of the threading models.

  For example, the lightest-weight M:N task models are essentially just
  collections of closures, and do not provide any special I/O support. This
  style of lightweight tasks is used in Servo, but also shows up in
  [java.util.concurrent's executors](http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Executors.html)
  and [Haskell's par monad](https://hackage.haskell.org/package/monad-par),
  among many others.

  On the other hand, green threading systems designed explicitly to support I/O
  may also want to provide low-level access to the underlying event loop -- an
  API surface that doesn't make sense for the native threading model.

  Under the native model we want to provide direct non-blocking and/or
  asynchronous I/O support -- as a systems language, Rust should be able to work
  directly with what the OS provides without imposing global abstraction
  costs. These APIs may involve some platform-specific abstractions (`epoll`,
  `kqueue`, IOCP) for maximal performance. But integrating them cleanly with a
  green threading model may be difficult or impossible -- and at the very least,
  makes it difficult to add them quickly and seamlessly to the current I/O
  system.

  In short, the current design couples threading and I/O models together, and
  thus forces the green and native models to supply a common I/O interface --
  despite the fact that they are pulling in different directions.

**Overhead.** The current Rust model allows runtime mixtures of the green and
  native models. The implementation achieves this flexibility by using trait
  objects to model the entire I/O API. Unfortunately, this flexibility has
  several downsides:

- *Binary sizes*. A significant overhead caused by the trait object design is
  that the entire I/O system is included in any binary that statically links to
  `libstd`. See
  [this comment](https://github.com/rust-lang/rust/issues/10740#issuecomment-31475987)
  for more details.

- *Task-local storage*. The current implementation of task-local storage is
  designed to work seamlessly across native and green threads, and its
  performance suffers substantially as a result. While it is feasible to provide
  a more efficient form of "hybrid" TLS that works across models, doing so is
  *far* more difficult than simply using native thread-local storage.

- *Allocation and dynamic dispatch*. With the current design, any invocation of
  I/O involves at least dynamic dispatch, and in many cases allocation, due to
  the use of trait objects. However, in most cases these costs are trivial when
  compared to the cost of actually doing the I/O (or even simply making a
  syscall), so they are not strong arguments against the current design.

**Problematic I/O interactions.** As the
  [documentation for libgreen](http://doc.rust-lang.org/green/#considerations-when-using-libgreen)
  explains, only some I/O and synchronization methods work seamlessly across
  native and green tasks. For example, any invocation of native code that calls
  blocking I/O has the potential to block the worker thread running the green
  scheduler. In particular, `std::io` objects created on a native task cannot
  safely be used within a green task. Thus, even though `std::io` presents a
  unified I/O API for green and native tasks, it is not fully interoperable.

**Embedding Rust.** When embedding Rust code into other contexts -- whether
  calling from C code or embedding in high-level languages -- there is a fair
  amount of setup needed to provide the "runtime" infrastructure that `libstd`
  relies on. If `libstd` were instead bound to the native threading and I/O
  system, the embedding setup would be much simpler.

**Maintenance burden.** Finally, `libstd` is made somewhat more complex by
  providing such a flexible threading model. As this RFC will explain, moving to
  a strictly native threading model will allow substantial simplification and
  reorganization of the structure of Rust's libraries.

# Detailed design

To mitigate the above problems, this RFC proposes to tie `std::io` directly to
the native threading model, while moving `libgreen` and its supporting
infrastructure into an external Cargo package with its own I/O API.

## The near-term plan

### `std::io` and native threading

The plan is to entirely remove `librustrt`, including all of the traits
described above. The abstraction layers will then become:

- Highest level: `libstd`, providing cross-platform, high-level I/O and
  scheduling abstractions. The crate will depend on `libnative` (the opposite
  of today's situation).

- Mid-level: `libnative`, providing a cross-platform Rust interface for I/O and
  scheduling. The API will be relatively low-level, compared to `libstd`. The
  crate will depend on `libsys`.

- Low-level: `libsys` (renamed from `liblibc`), providing platform-specific Rust
  bindings to system C APIs.

In this scheme, the actual API of `libstd` will not change significantly. But
its implementation will invoke functions in `libnative` directly, rather than
going through a trait object.
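
As a rough illustration of the proposed layering, the sketch below traces a
file-open from a `libstd`-style wrapper through a `libnative`-style function
down to a raw C binding. The module and function names are hypothetical (and
Unix-flavored), not the APIs this RFC commits to; the point is that every call
is a direct, statically dispatched call into the layer beneath it.

```rust
// Low level ("libsys", today's liblibc): thin bindings to the platform C API.
mod sys {
    use std::ffi::CString;
    use std::os::raw::{c_char, c_int};

    extern "C" {
        fn open(path: *const c_char, flags: c_int, ...) -> c_int;
    }

    pub fn open_readonly(path: &str) -> Result<i32, String> {
        let c_path = CString::new(path).map_err(|e| e.to_string())?;
        // 0 == O_RDONLY on common Unix platforms.
        let fd = unsafe { open(c_path.as_ptr(), 0) };
        if fd >= 0 { Ok(fd) } else { Err(format!("open({}) failed", path)) }
    }
}

// Mid level ("libnative"): cross-platform but deliberately low-level.
mod native {
    pub struct FileDesc(pub i32);

    pub fn open(path: &str) -> Result<FileDesc, String> {
        super::sys::open_readonly(path).map(FileDesc)
    }
}

// High level ("libstd"): the familiar cross-platform API, now calling straight
// down -- no `Task` lookup, no trait object.
struct File {
    fd: native::FileDesc,
}

impl File {
    fn open(path: &str) -> Result<File, String> {
        native::open(path).map(|fd| File { fd })
    }
}

fn main() {
    match File::open("/etc/hostname") {
        Ok(file) => println!("opened file descriptor {}", file.fd.0),
        Err(err) => println!("{}", err),
    }
}
```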

A goal of this work is to minimize the complexity of embedding Rust code in
other contexts. It is not yet clear what the final embedding API will look like.

### Green threading

Despite tying `libstd` to native threading, `libgreen` will still be
supported -- at least initially. The infrastructure in `libgreen` and friends
will move into its own Cargo package.

Initially, the green threading package will support essentially the same
interface it does today; there are no immediate plans to change its API, since
the focus will be on first improving the native threading API. Note, however,
that the I/O API will be exposed separately within `libgreen`, as opposed to the
current exposure through `std::io`.

## The long-term plan

Ultimately, a large motivation for the proposed refactoring is to allow the APIs
for native I/O to grow.

In particular, over time we should expose more of the underlying system
capabilities under the native threading model. Whenever possible, these
capabilities should be provided at the `libstd` level -- the highest level of
cross-platform abstraction. However, an important goal is also to provide
nonblocking and/or asynchronous I/O, for which system APIs differ greatly. It
may be necessary to provide additional, platform-specific crates to expose this
functionality. Ideally, these crates would interoperate smoothly with `libstd`,
so that, for example, a `libposix` crate would allow using a `poll` operation
directly against a `std::io::fs::File` value.

We also wish to expose "lowering" operations in `libstd` -- APIs that allow
you to get at the file descriptor underlying a `std::io::fs::File`, for example.
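
The sketch below shows one hypothetical shape such a lowering API could take,
and how a platform-specific crate could build on it: a trait for extracting the
raw file descriptor, plus a thin wrapper over `poll(2)` of the kind a
`libposix` crate might provide. The trait name, constants, and FFI declaration
are assumptions for illustration (and Unix-specific), not a committed design.

```rust
use std::os::raw::{c_int, c_short, c_ulong};

// Hypothetical lowering trait that `libstd` could expose on Unix platforms.
trait AsRawFd {
    fn as_raw_fd(&self) -> c_int;
}

// Stand-in for a `std::io` file that owns a file descriptor.
struct File {
    fd: c_int,
}

impl AsRawFd for File {
    fn as_raw_fd(&self) -> c_int {
        self.fd
    }
}

// What a hypothetical `libposix` crate could then offer: a blocking readiness
// check that works against anything exposing a raw descriptor.
#[repr(C)]
struct PollFd {
    fd: c_int,
    events: c_short,
    revents: c_short,
}

const POLLIN: c_short = 0x001;

extern "C" {
    fn poll(fds: *mut PollFd, nfds: c_ulong, timeout: c_int) -> c_int;
}

fn wait_readable(file: &impl AsRawFd, timeout_ms: c_int) -> bool {
    let mut pfd = PollFd {
        fd: file.as_raw_fd(),
        events: POLLIN,
        revents: 0,
    };
    let ready = unsafe { poll(&mut pfd, 1, timeout_ms) };
    ready == 1 && (pfd.revents & POLLIN) != 0
}

fn main() {
    // Descriptor 0 (stdin) stands in for a lowered `std::io` file.
    let file = File { fd: 0 };
    println!("readable within 100ms: {}", wait_readable(&file, 100));
}
```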

On the other hand, we very much want to explore and support truly lightweight
M:N task models (that do not require per-task stacks) -- supporting efficient
data parallelism with work stealing for CPU-bound computations. These
lightweight models will not provide any special support for I/O. But they may
benefit from a notion of "task-local storage" and interfacing with the task
scheduler when explicitly synchronizing between tasks (via channels, for
example).

All of the above long-term plans will require substantial new design and
implementation work, and the specifics are out of scope for this RFC. The main
point, though, is that the refactoring proposed by this RFC will make it much
more plausible to carry out such work.

Finally, a guiding principle for the above work is *uncompromising support* for
native system APIs, in terms of both functionality and performance. For example,
it must be possible to use thread-local storage without significant overhead,
which is very much not the case today. Any abstractions to support M:N threading
models -- including the now-external `libgreen` package -- must respect this
constraint.

# Drawbacks

The main drawback of this proposal is that green I/O will be provided by a
forked interface of `std::io`. This change makes green threading
"second class", and means there's more to learn when using both models
together.

This setup also somewhat increases the risk of invoking native blocking I/O on a
green thread -- though of course that risk is very much present today. One way
of mitigating this risk in general is the Java executor approach, where the
native "worker" threads that are executing the green thread scheduler are
monitored for blocking, and new worker threads are spun up as needed.
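
A toy sketch of that monitoring idea follows, under the simplifying assumption
of one shared queue of closure-based jobs: a monitor watches a completion
counter, and if nothing finishes within a window while work is still queued, it
assumes a worker is stuck in a blocking call and spins up a replacement. Real
executors are considerably more careful than this.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

type Job = Box<dyn FnOnce() + Send>;

struct Pool {
    queue: Mutex<VecDeque<Job>>,
    // Bumped every time any worker finishes a job.
    completed: AtomicUsize,
}

fn worker_loop(pool: Arc<Pool>) {
    loop {
        let job = pool.queue.lock().unwrap().pop_front();
        match job {
            Some(job) => {
                job(); // may secretly block in native code
                pool.completed.fetch_add(1, Ordering::SeqCst);
            }
            None => return, // queue drained; this worker retires
        }
    }
}

fn main() {
    let pool = Arc::new(Pool {
        queue: Mutex::new(VecDeque::new()),
        completed: AtomicUsize::new(0),
    });

    for i in 0..4 {
        let delay = if i == 0 { 500 } else { 10 }; // job 0 "blocks" its worker
        pool.queue.lock().unwrap().push_back(Box::new(move || {
            thread::sleep(Duration::from_millis(delay));
            println!("job {} done", i);
        }));
    }

    // Start with a single worker thread.
    let first = {
        let pool = Arc::clone(&pool);
        thread::spawn(move || worker_loop(pool))
    };

    // Monitor: if nothing completed during the window but work is still
    // queued, assume the worker is blocked and add another one.
    let mut extra_workers = Vec::new();
    for _ in 0..3 {
        let before = pool.completed.load(Ordering::SeqCst);
        thread::sleep(Duration::from_millis(100));
        let stalled = pool.completed.load(Ordering::SeqCst) == before;
        if stalled && !pool.queue.lock().unwrap().is_empty() {
            let pool = Arc::clone(&pool);
            extra_workers.push(thread::spawn(move || worker_loop(pool)));
        }
    }

    first.join().unwrap();
    for worker in extra_workers {
        worker.join().unwrap();
    }
}
```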

# Unresolved questions

There are many unresolved questions about the exact details of the refactoring,
but these are considered implementation details since the `libstd` interface
itself will not substantially change as part of this RFC.