| 1 | +- Start Date: (fill me in with today's date, 2014-08-12)
| 2 | +- RFC PR #: (leave this empty)
| 3 | +- Rust Issue #: (leave this empty)
| 4 | +
| 5 | +# Summary
| 6 | +
| 7 | +This RFC adds *overloaded slice notation*:
| 8 | +
| 9 | +- `foo[]` for `foo.as_slice()` |
| 10 | +- `foo[n..m]` for `foo.slice(n, m)` |
| 11 | +- `foo[n..]` for `foo.slice_from(n)` |
| 12 | +- `foo[..m]` for `foo.slice_to(m)` |
| 13 | +- `mut` variants of all the above |
| 14 | + |
| 15 | +via two new traits, `Slice` and `SliceMut`. |
| 16 | + |
| 17 | +It also changes the notation for range `match` patterns to `...`, to |
| 18 | +signify that they are inclusive, whereas `..` in slices is exclusive.
| 19 | + |
| 20 | +# Motivation |
| 21 | + |
| 22 | +There are two primary motivations for introducing this feature. |
| 23 | + |
| 24 | +### Ergonomics |
| 25 | + |
| 26 | +Slicing operations, especially `as_slice`, are a very common and basic thing to |
| 27 | +do with vectors, and potentially many other kinds of containers. We already |
| 28 | +have notation for indexing via the `Index` trait, and this RFC is essentially a |
| 29 | +continuation of that effort. |
| 30 | + |
| 31 | +The `as_slice` operator is particularly important. Since we've moved away from |
| 32 | +auto-slicing in coercions, explicit `as_slice` calls have become extremely |
| 33 | +common, and are one of the |
| 34 | +[leading ergonomic/first impression](https://github.com/rust-lang/rust/issues/14983) |
| 35 | +problems with the language. There are a few other approaches to address this |
| 36 | +particular problem, but these alternatives have downsides that are discussed |
| 37 | +below (see "Alternatives"). |
| 38 | + |
| 39 | +### Error handling conventions |
| 40 | + |
| 41 | +We are gradually moving toward a Python-like world where notation like `foo[n]` |
| 42 | +calls `fail!` when `n` is out of bounds, while corresponding methods like `get` |
| 43 | +return `Option` values rather than failing. By providing similar notation for |
| 44 | +slicing, we open the door to following the same convention throughout |
| 45 | +vector-like APIs. |
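The convention described above can be sketched in present-day Rust, where `fail!` has since been renamed `panic!`. The helper names `by_index` and `by_get` are made up for this illustration; the sketch only shows the split between failing notation and `Option`-returning methods, not the proposed slice syntax.

```rust
use std::panic;

// Notation: `v[n]` fails (panics) when `n` is out of bounds.
fn by_index(v: &[i32], n: usize) -> i32 {
    v[n]
}

// Method: `get` reports an out-of-bounds index as `None` instead of failing.
fn by_get(v: &[i32], n: usize) -> Option<i32> {
    v.get(n).copied()
}

fn main() {
    let v = vec![10, 20, 30];

    // In-bounds access behaves the same either way.
    assert_eq!(by_index(&v, 1), 20);
    assert_eq!(by_get(&v, 1), Some(20));

    // Out of bounds: the method returns `None`...
    assert_eq!(by_get(&v, 9), None);

    // ...while the notation unwinds. The panic is caught here only so the
    // failure can be observed without aborting the example.
    let outcome = panic::catch_unwind(|| by_index(&v, 9));
    assert!(outcome.is_err());
}
```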
| 46 | + |
| 47 | +# Detailed design |
| 48 | + |
| 49 | +The design is a straightforward continuation of the `Index` trait design. We |
| 50 | +introduce two new traits, for immutable and mutable slicing: |
| 51 | + |
| 52 | +```rust |
| 53 | +trait Slice<Idx, S> {
| 54 | +    fn as_slice<'a>(&'a self) -> &'a S;
| 55 | +    fn slice_from<'a>(&'a self, from: Idx) -> &'a S;
| 56 | +    fn slice_to<'a>(&'a self, to: Idx) -> &'a S;
| 57 | +    fn slice<'a>(&'a self, from: Idx, to: Idx) -> &'a S;
| 58 | +}
| 59 | +
| 60 | +trait SliceMut<Idx, S> {
| 61 | +    fn as_mut_slice<'a>(&'a mut self) -> &'a mut S;
| 62 | +    fn slice_from_mut<'a>(&'a mut self, from: Idx) -> &'a mut S;
| 63 | +    fn slice_to_mut<'a>(&'a mut self, to: Idx) -> &'a mut S;
| 64 | +    fn slice_mut<'a>(&'a mut self, from: Idx, to: Idx) -> &'a mut S;
| 65 | +}
| 66 | +``` |
| 67 | + |
| 68 | +(Note: the mutable method names here reflect likely changes to naming conventions
| 69 | +that will be described in a separate RFC.)
| 70 | + |
| 71 | +These traits will be used when interpreting the following notation: |
| 72 | + |
| 73 | +*Immutable slicing* |
| 74 | + |
| 75 | +- `foo[]` for `foo.as_slice()` |
| 76 | +- `foo[n..m]` for `foo.slice(n, m)` |
| 77 | +- `foo[n..]` for `foo.slice_from(n)` |
| 78 | +- `foo[..m]` for `foo.slice_to(m)` |
| 79 | + |
| 80 | +*Mutable slicing* |
| 81 | + |
| 82 | +- `foo[mut]` for `foo.as_mut_slice()` |
| 83 | +- `foo[mut n..m]` for `foo.slice_mut(n, m)` |
| 84 | +- `foo[mut n..]` for `foo.slice_from_mut(n)` |
| 85 | +- `foo[mut ..m]` for `foo.slice_to_mut(m)` |
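As a rough sketch of the mapping listed above, the two traits can be written in present-day Rust and implemented for a made-up container. The `Buffer` type, the choice of `usize` for `Idx`, and the `S: ?Sized` bound are assumptions of this sketch rather than part of the RFC text; each call is annotated with the notation it would stand for under the proposal.

```rust
// Present-day rendering of the proposed traits (lifetimes elided).
trait Slice<Idx, S: ?Sized> {
    fn as_slice(&self) -> &S;
    fn slice_from(&self, from: Idx) -> &S;
    fn slice_to(&self, to: Idx) -> &S;
    fn slice(&self, from: Idx, to: Idx) -> &S;
}

trait SliceMut<Idx, S: ?Sized> {
    fn as_mut_slice(&mut self) -> &mut S;
    fn slice_from_mut(&mut self, from: Idx) -> &mut S;
    fn slice_to_mut(&mut self, to: Idx) -> &mut S;
    fn slice_mut(&mut self, from: Idx, to: Idx) -> &mut S;
}

// Hypothetical container used only for illustration.
struct Buffer {
    data: Vec<u8>,
}

impl Slice<usize, [u8]> for Buffer {
    fn as_slice(&self) -> &[u8] { &self.data[..] }
    fn slice_from(&self, from: usize) -> &[u8] { &self.data[from..] }
    fn slice_to(&self, to: usize) -> &[u8] { &self.data[..to] }
    fn slice(&self, from: usize, to: usize) -> &[u8] { &self.data[from..to] }
}

impl SliceMut<usize, [u8]> for Buffer {
    fn as_mut_slice(&mut self) -> &mut [u8] { &mut self.data[..] }
    fn slice_from_mut(&mut self, from: usize) -> &mut [u8] { &mut self.data[from..] }
    fn slice_to_mut(&mut self, to: usize) -> &mut [u8] { &mut self.data[..to] }
    fn slice_mut(&mut self, from: usize, to: usize) -> &mut [u8] { &mut self.data[from..to] }
}

fn main() {
    let mut buf = Buffer { data: vec![10, 20, 30, 40] };

    assert_eq!(buf.as_slice().len(), 4);                // proposed: buf[]
    assert_eq!(buf.slice(1, 3), &[20u8, 30][..]);       // proposed: buf[1..3]
    assert_eq!(buf.slice_from(2), &[30u8, 40][..]);     // proposed: buf[2..]
    assert_eq!(buf.slice_to(1), &[10u8][..]);           // proposed: buf[..1]

    buf.slice_mut(0, 2)[1] = 99;                        // proposed: buf[mut 0..2]
    buf.as_mut_slice()[3] = 0;                          // proposed: buf[mut]
    assert_eq!(buf.as_slice(), &[10u8, 99, 30, 0][..]);
}
```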
| 86 | + |
| 87 | +Like `Index`, uses of this notation will auto-deref just as if they were method |
| 88 | +invocations. So if `T` implements `Slice<uint, [U]>`, and `s: Smaht<T>`, then |
| 89 | +`s[]` compiles and has type `&[U]`. |
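The auto-deref behaviour can be illustrated with ordinary method calls in present-day Rust. This is a reduced sketch: the trait below keeps only `as_slice` and drops the `Idx` parameter, and `Buffer` and the `Deref` implementation for `Smaht` are assumptions made for the example.

```rust
use std::ops::Deref;

// Reduced, hypothetical version of the proposed `Slice` trait.
trait Slice<S: ?Sized> {
    fn as_slice(&self) -> &S;
}

// Hypothetical container that supports slicing.
struct Buffer {
    data: Vec<u8>,
}

impl Slice<[u8]> for Buffer {
    fn as_slice(&self) -> &[u8] {
        &self.data
    }
}

// `Smaht<T>` stands in for any smart pointer; it does not implement `Slice`.
struct Smaht<T>(T);

impl<T> Deref for Smaht<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.0
    }
}

fn main() {
    let s = Smaht(Buffer { data: vec![1, 2, 3] });

    // Method lookup auto-derefs from `Smaht<Buffer>` to `Buffer`, which is
    // where `as_slice` is found. Under the proposal, `s[]` would resolve the
    // same way.
    let view: &[u8] = s.as_slice();
    assert_eq!(view, &[1u8, 2, 3][..]);
}
```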
| 90 | + |
| 91 | +Note that slicing is "exclusive" (so `[n..m]` is the interval `n <= x |
| 92 | +< m`), while `..` in `match` patterns is "inclusive". To avoid |
| 93 | +confusion, we propose to change the `match` notation to `...` to |
| 94 | +reflect the distinction. The reason to change the notation, rather |
| 95 | +than the interpretation, is that the exclusive (respectively |
| 96 | +inclusive) interpretation is the right default for slicing |
| 97 | +(respectively matching). |
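The difference between the two interpretations can be seen in a small present-day Rust sketch. Note the assumption: current Rust spells the inclusive pattern `..=` rather than the `...` proposed here, so the example illustrates the exclusive/inclusive distinction, not the exact notation of this RFC. The function `classify` is made up for the illustration.

```rust
// Inclusive range pattern: both endpoints (1 and 3) match.
fn classify(x: u32) -> &'static str {
    match x {
        1..=3 => "low",
        _ => "other",
    }
}

fn main() {
    let v = vec![0, 1, 2, 3, 4];

    // Exclusive slicing: indices 1 and 2 are included, index 3 is not.
    assert_eq!(&v[1..3], &[1, 2][..]);
    assert_eq!(v[1..3].len(), 3 - 1);

    // Inclusive matching: the upper endpoint is part of the range.
    assert_eq!(classify(3), "low");
    assert_eq!(classify(4), "other");
}
```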
| 98 | + |
| 99 | +## Rationale for the notation |
| 100 | + |
| 101 | +The choice of square brackets for slicing is straightforward: it matches our |
| 102 | +indexing notation, and slicing and indexing are closely related. |
| 103 | + |
| 104 | +Some other languages (like Python and Go -- and Fortran) use `:` rather than |
| 105 | +`..` in slice notation. The choice of `..` here is influenced by its use |
| 106 | +elsewhere in Rust, for example for fixed-length array types `[T, ..n]`. The `..` |
| 107 | +for slicing has precedent in Perl and D. |
| 108 | + |
| 109 | +See [Wikipedia](http://en.wikipedia.org/wiki/Array_slicing) for more on the |
| 110 | +history of slice notation in programming languages. |
| 111 | + |
| 112 | +### The `mut` qualifier |
| 113 | + |
| 114 | +It may be surprising that `mut` is used as a qualifier in the proposed |
| 115 | +slice notation, but not for the indexing notation. The reason is that |
| 116 | +indexing includes an implicit dereference. If `v: Vec<Foo>` then |
| 117 | +`v[n]` has type `Foo`, not `&Foo` or `&mut Foo`. So if you want to get |
| 118 | +a mutable reference via indexing, you write `&mut v[n]`. More |
| 119 | +generally, this allows us to do resolution/typechecking prior to |
| 120 | +resolving the mutability. |
| 121 | + |
| 122 | +This treatment of `Index` matches the C tradition, and allows us to |
| 123 | +write things like `v[0] = foo` instead of `*v[0] = foo`. |
| 124 | + |
| 125 | +On the other hand, this approach is problematic for slicing, since in |
| 126 | +general it would yield an unsized type (under DST) -- and of course, |
| 127 | +slicing is meant to give you a fat pointer indicating the size of the |
| 128 | +slice, which we don't want to immediately deref. But the consequence |
| 129 | +is that we need to know the mutability of the slice up front, when we |
| 130 | +take it, since it determines the type of the expression. |
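The contrast between indexing and slicing can be sketched in present-day Rust. The free functions `slice` and `slice_mut` below are hypothetical stand-ins for the proposed trait methods, written only to show that the mutability of a slice is fixed by its return type at the point where it is taken.

```rust
// Hypothetical stand-ins for the proposed `slice` / `slice_mut` methods.
fn slice(v: &[i32], from: usize, to: usize) -> &[i32] {
    &v[from..to]
}

fn slice_mut(v: &mut [i32], from: usize, to: usize) -> &mut [i32] {
    &mut v[from..to]
}

fn demo() -> Vec<i32> {
    let mut v = vec![1, 2, 3, 4];

    // Indexing includes an implicit dereference: `v[0]` is a place of type
    // `i32`, so assignment needs no explicit `*`.
    v[0] = 10;

    // A mutable reference to an element is requested explicitly, after the
    // indexing expression itself has been typechecked.
    let elem: &mut i32 = &mut v[1];
    *elem = 20;

    // Slicing returns a fat pointer, so the caller must pick the mutable or
    // immutable flavour up front: the two have different types.
    let tail: &mut [i32] = slice_mut(&mut v, 2, 4);
    tail[0] = 30;

    let head: &[i32] = slice(&v, 0, 2);
    assert_eq!(head.iter().sum::<i32>(), 30);

    v
}

fn main() {
    assert_eq!(demo(), vec![10, 20, 30, 4]);
}
```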
| 131 | + |
| 132 | +# Drawbacks |
| 133 | + |
| 134 | +The main drawback is the increase in complexity of the language syntax. This |
| 135 | +seems minor, especially since the notation here is essentially "finishing" what |
| 136 | +was started with the `Index` trait. |
| 137 | + |
| 138 | +## Limitations in the design |
| 139 | + |
| 140 | +Like the `Index` trait, this forces the result to be a reference via |
| 141 | +`&`, which may rule out some generalizations of slicing. |
| 142 | + |
| 143 | +One way of solving this problem is for the slice methods to take |
| 144 | +`self` (by value) rather than `&self`, and in turn to implement the |
| 145 | +trait on `&T` rather than `T`. Whether this approach is viable in the |
| 146 | +long run will depend on the final rules for method resolution and |
| 147 | +auto-ref. |
| 148 | + |
| 149 | +In general, the trait system works best when traits can be applied to |
| 150 | +types `T` rather than borrowed types `&T`. Ultimately, if Rust gains |
| 151 | +higher-kinded types (HKT), we could change the slice type `S` in the |
| 152 | +trait to be higher-kinded, so that it is a *family* of types indexed |
| 153 | +by lifetime. Then we could replace the `&'a S` in the return value |
| 154 | +with `S<'a>`. It should be possible to transition from the current |
| 155 | +`Index` and `Slice` trait designs to an HKT version in the future |
| 156 | +without breaking backwards compatibility by using blanket |
| 157 | +implementations of the new traits (say, `IndexHKT`) for types that |
| 158 | +implement the old ones. |
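One way to picture a "family of types indexed by lifetime" is with generic associated types, a feature of present-day Rust that postdates this RFC. The sketch below is an illustration under that assumption, not the design the RFC commits to; `SliceFamily` and `Naturals` are hypothetical names. It shows a container whose slices are ordinary references alongside one whose slices are returned by value, which the `&'a S` signature cannot express.

```rust
use std::ops::Range;

// Hypothetical trait: the slice type is a family `Output<'a>` indexed by the
// lifetime of the borrow, in place of the fixed `&'a S`.
trait SliceFamily<Idx> {
    type Output<'a>
    where
        Self: 'a;

    fn slice<'a>(&'a self, from: Idx, to: Idx) -> Self::Output<'a>;
}

// For `Vec<T>`, the family is the usual borrowed slice.
impl<T> SliceFamily<usize> for Vec<T> {
    type Output<'a> = &'a [T] where Self: 'a;

    fn slice<'a>(&'a self, from: usize, to: usize) -> &'a [T] {
        &self[from..to]
    }
}

// Hypothetical container with no backing storage: the sequence 0, 1, 2, ...
// There is nothing to borrow from, so its slices are produced by value.
struct Naturals;

impl SliceFamily<usize> for Naturals {
    type Output<'a> = Range<usize> where Self: 'a;

    fn slice<'a>(&'a self, from: usize, to: usize) -> Range<usize> {
        from..to
    }
}

fn main() {
    let v = vec![1, 2, 3, 4];
    assert_eq!(v.slice(1, 3), &[2, 3][..]);

    let window: Vec<usize> = Naturals.slice(2, 5).collect();
    assert_eq!(window, vec![2, 3, 4]);
}
```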
| 159 | + |
| 160 | +# Alternatives |
| 161 | + |
| 162 | +For improving the ergonomics of `as_slice`, there are two main alternatives. |
| 163 | + |
| 164 | +## Coercions: auto-slicing |
| 165 | + |
| 166 | +One possibility would be re-introducing some kind of coercion that automatically |
| 167 | +slices. |
| 168 | +We used to have a coercion from (in today's terms) `Vec<T>` to |
| 169 | +`&[T]`. Since we no longer coerce owned to borrowed values, we'd probably want a |
| 170 | +coercion `&Vec<T>` to `&[T]` now: |
| 171 | + |
| 172 | +```rust |
| 173 | +fn use_slice(t: &[u8]) { ... } |
| 174 | + |
| 175 | +let v = vec!(0u8, 1, 2); |
| 176 | +use_slice(&v); // automatically coerce here
| 177 | +use_slice(v.as_slice()); // equivalent
| 178 | +``` |
| 179 | + |
| 180 | +Unfortunately, adding such a coercion requires choosing between the following: |
| 181 | + |
| 182 | +* Tie the coercion to `Vec` and `String`. This would reintroduce special |
| 183 | + treatment of these otherwise purely library types, and would mean that other |
| 184 | + library types that support slicing would not benefit (defeating some of the |
| 185 | + purpose of DST). |
| 186 | + |
| 187 | +* Make the coercion extensible, via a trait. This is opening Pandora's box,
| 188 | + however: the mechanism could likely be (ab)used to run arbitrary code during |
| 189 | + coercion, so that any invocation `foo(a, b, c)` might involve running code to |
| 190 | + pre-process each of the arguments. While we may eventually want such |
| 191 | + user-extensible coercions, it is a *big* step to take with a lot of potential |
| 192 | + downside when reasoning about code, so we should pursue more conservative |
| 193 | + solutions first. |
| 194 | + |
| 195 | +## Deref |
| 196 | + |
| 197 | +Another possibility would be to make `String` implement `Deref<str>` and |
| 198 | +`Vec<T>` implement `Deref<[T]>`, once DST lands. Doing so would allow explicit |
| 199 | +coercions like: |
| 200 | + |
| 201 | +```rust |
| 202 | +fn use_slice(t: &[u8]) { ... } |
| 203 | + |
| 204 | +let v = vec!(0u8, 1, 2); |
| 205 | +use_slice(&*v); // take advantage of deref
| 206 | +use_slice(v.as_slice()); // equivalent
| 207 | +``` |
| 208 | + |
| 209 | +There are at least two downsides to doing so, however: |
| 210 | + |
| 211 | +* It is not clear how the method resolution rules will ultimately interact with |
| 212 | + `Deref`. In particular, a leading proposal is that for a smart pointer `s: Smaht<T>`,
| 213 | + when you invoke `s.m(...)`, only *inherent* methods `m` are considered for
| 214 | + `Smaht<T>`; *trait* methods are only considered for the maximally-derefed |
| 215 | + value `*s`. |
| 216 | + |
| 217 | + With such a resolution strategy, implementing `Deref` for `Vec` would make it |
| 218 | + impossible to use trait methods on the `Vec` type except through UFCS, |
| 219 | + severely limiting the ability of programmers to usefully implement new traits |
| 220 | + for `Vec`. |
| 221 | + |
| 222 | +* The idea of `Vec` as a smart pointer around a slice, and the use of `&*v` as |
| 223 | + above, is somewhat counterintuitive, especially for such a basic type. |
| 224 | + |
| 225 | +Ultimately, notation for slicing seems desirable on its own merits anyway, and
| 226 | +if it can eliminate the need to implement `Deref` for `Vec` and `String`, all |
| 227 | +the better. |