#[derive(Clone, Debug, Eq, PartialEq)]
struct Utf8CharDecoder(...);

impl Utf8CharDecoder {
    fn new() -> Self { todo!() }
    fn feed(&mut self, byte: u8) -> Option<Result<char, SomeError>> { todo!() }
    fn finish(self) -> Result<(), SomeError> { todo!() }
    fn reset(&mut self) { todo!() }
    // Something for decomposing into the inner partial bytes
    // Something for testing whether there are currently partial bytes stored
    // Something for querying how many continuation bytes are needed to complete the current character
}
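To make the proposed API concrete, here is a minimal sketch of one way it could be filled in. Everything beyond the names in the stub above is an assumption: the field layout, the placeholder error type `InvalidUtf8` (standing in for `SomeError`), and the accessor names `partial_bytes`, `has_partial`, and `bytes_needed` for the three "Something for ..." items. For brevity it delegates validation of a completed sequence to `std::str::from_utf8`, so invalid start bytes such as 0xC0 are only reported once the sequence is complete, and a byte that interrupts a partial sequence is discarded rather than reprocessed.

```rust
/// Placeholder for the issue's `SomeError` (hypothetical; carries no detail).
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
struct InvalidUtf8;

#[derive(Clone, Debug, Eq, PartialEq)]
struct Utf8CharDecoder {
    buf: [u8; 4], // bytes of the character currently being assembled
    len: u8,      // number of bytes of `buf` that are filled
    total: u8,    // expected length of the current sequence (0 when idle)
}

impl Utf8CharDecoder {
    fn new() -> Self {
        Self { buf: [0; 4], len: 0, total: 0 }
    }

    /// Returns `None` while a character is still incomplete.
    fn feed(&mut self, byte: u8) -> Option<Result<char, InvalidUtf8>> {
        let is_continuation = byte & 0b1100_0000 == 0b1000_0000;
        if self.len == 0 {
            // Idle: the start byte determines the sequence length.
            self.total = match byte {
                0x00..=0x7F => return Some(Ok(byte as char)),
                0xC0..=0xDF => 2,
                0xE0..=0xEF => 3,
                0xF0..=0xF7 => 4,
                // Stray continuation byte, or 0xF8..=0xFF.
                _ => return Some(Err(InvalidUtf8)),
            };
        } else if !is_continuation {
            // Partial sequence interrupted; the interrupting byte is dropped.
            self.reset();
            return Some(Err(InvalidUtf8));
        }
        self.buf[self.len as usize] = byte;
        self.len += 1;
        if self.len < self.total {
            return None;
        }
        // Sequence complete: let the standard library reject overlong forms,
        // surrogates, and code points above 0x10FFFF.
        let decoded = std::str::from_utf8(&self.buf[..self.len as usize])
            .ok()
            .and_then(|s| s.chars().next())
            .ok_or(InvalidUtf8);
        self.reset();
        Some(decoded)
    }

    /// Fails if the input ended in the middle of a character.
    fn finish(self) -> Result<(), InvalidUtf8> {
        if self.len == 0 { Ok(()) } else { Err(InvalidUtf8) }
    }

    fn reset(&mut self) {
        // Clearing `buf` too keeps the derived `PartialEq` meaningful.
        *self = Self::new();
    }

    /// The bytes of the partially received character, if any.
    fn partial_bytes(&self) -> &[u8] {
        &self.buf[..self.len as usize]
    }

    fn has_partial(&self) -> bool {
        self.len != 0
    }

    /// How many continuation bytes are still needed.
    fn bytes_needed(&self) -> usize {
        (self.total - self.len) as usize
    }
}
```

In this sketch the decoder resets itself after every error, which is one possible answer to the open question about error recovery. A lossy variant could wrap `feed` and map each `Err` to `char::REPLACEMENT_CHARACTER`.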
- Also add a lossy variant.
- Error conditions that the error type must cover:
  - Code point is greater than 0x10FFFF
  - Code point is a surrogate (U+D800 to U+DFFF)
  - Non-canonical (overlong) encoding of a code point (e.g., `0b1100_0000 0b1000_0000` for the NUL character)
  - Partial UTF-8 sequence followed by a non-continuation byte (i.e., an ASCII char or a new start byte)
  - Continuation byte encountered without a preceding matching start byte
- On an error, reset the decoder to the initial state?
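As a rough illustration of how the error type might enumerate these conditions, the sketch below defines one variant per listed case, plus two small helpers that classify them. The names `Utf8DecodeError`, `check_scalar`, and `check_byte`, and all variant names, are hypothetical and not part of the proposal. An incomplete sequence at `finish` is not in the list above and is left out here.

```rust
/// Hypothetical error type with one variant per condition listed above.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
enum Utf8DecodeError {
    /// Code point is greater than 0x10FFFF.
    OutOfRange,
    /// Code point is a surrogate (U+D800 to U+DFFF).
    Surrogate,
    /// Code point was encoded with more bytes than necessary.
    Overlong,
    /// Partial sequence followed by an ASCII byte or a new start byte.
    InterruptedSequence,
    /// Continuation byte without a preceding start byte.
    UnexpectedContinuation,
}

/// Validates a fully assembled code point that arrived in `encoded_len` bytes.
fn check_scalar(code_point: u32, encoded_len: usize) -> Result<char, Utf8DecodeError> {
    // Smallest code point that legitimately needs this many bytes.
    let min = match encoded_len {
        1 => 0x0,
        2 => 0x80,
        3 => 0x800,
        _ => 0x1_0000,
    };
    if code_point < min {
        return Err(Utf8DecodeError::Overlong);
    }
    // `char::from_u32` returns `None` exactly for surrogates and for values
    // above 0x10FFFF, so the remaining two cases can be told apart by range.
    match char::from_u32(code_point) {
        Some(c) => Ok(c),
        None if code_point > 0x10_FFFF => Err(Utf8DecodeError::OutOfRange),
        None => Err(Utf8DecodeError::Surrogate),
    }
}

/// Classifies a single incoming byte against the decoder state:
/// `pending` is true when a partial sequence is currently stored.
fn check_byte(pending: bool, byte: u8) -> Result<(), Utf8DecodeError> {
    let is_continuation = byte & 0b1100_0000 == 0b1000_0000;
    match (pending, is_continuation) {
        (true, false) => Err(Utf8DecodeError::InterruptedSequence),
        (false, true) => Err(Utf8DecodeError::UnexpectedContinuation),
        _ => Ok(()),
    }
}
```

For instance, the two-byte sequence `0b1100_0000 0b1000_0000` assembles to code point 0 with an encoded length of 2, which `check_scalar` reports as `Overlong`.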