I've stripped down my example to the following grammar, expressed in English: the source text can contain multiple items separated by a newline or a comment (double slash, `//`); each item is an identifier followed by whitespace-separated numbers.
Here are three versions of the grammar:
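For reference, the grammar described above can be sketched as a hand-written check in plain Rust (no combine), which returns the byte offset of the first character that cannot be consumed. This is only an illustration of the position I would expect an error to point at, not how combine works internally. The function name `first_error_offset` is made up for this sketch, and it assumes ASCII letters and digits, spaces or tabs as whitespace, and a comment treated as a bare `//` marker.

```rust
/// Checks `src` against the grammar: items separated by a newline or `//`,
/// each item being an identifier followed by whitespace-separated numbers.
/// Returns Err(byte offset) of the first character that cannot be consumed.
/// Hypothetical helper for illustration; not part of combine.
fn first_error_offset(src: &str) -> Result<(), usize> {
    let b = src.as_bytes();
    let mut i = 0;
    while i < b.len() {
        // Identifier: one or more ASCII letters (assumption).
        let start = i;
        while i < b.len() && b[i].is_ascii_alphabetic() {
            i += 1;
        }
        if i == start {
            return Err(i);
        }
        // Zero or more (whitespace, number) pairs.
        loop {
            let mut j = i;
            while j < b.len() && (b[j] == b' ' || b[j] == b'\t') {
                j += 1;
            }
            if j == i {
                break; // no whitespace, so no further number
            }
            let num_start = j;
            while j < b.len() && b[j].is_ascii_digit() {
                j += 1;
            }
            if j == num_start {
                return Err(j); // whitespace not followed by a digit
            }
            i = j;
        }
        // Separator: end of input, newline, or the `//` comment marker.
        if i == b.len() {
            break;
        } else if b[i] == b'\n' {
            i += 1;
        } else if b[i..].starts_with(b"//") {
            i += 2;
        } else {
            return Err(i);
        }
    }
    Ok(())
}
```

With this sketch, `first_error_offset("a 123/2")` yields `Err(5)`, the offset of the lone `/`, while an input such as `"a 1 2\nb 3//c"` is accepted.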
use combine::parser::char::{digit, space, letter};
use combine::parser::repeat::{repeat_until};
use combine::{Stream, Parser, EasyParser};
use combine::{eof, token, many1, sep_by, value};
use combine::{many, skip_many1, attempt};

fn id<I: Stream<Token=char>>() -> impl Parser<I, Output=String> {
    many(letter())
}

fn ws<I: Stream<Token=char>>() -> impl Parser<I, Output=()> {
    skip_many1(space())
}

fn num<I: Stream<Token=char>>() -> impl Parser<I, Output=String> {
    many1(digit())
}

fn comment<I: Stream<Token=char>>() -> impl Parser<I, Output=()> {
    attempt((token('/'), token('/')).silent()).with(value(()))
}

fn newline<I: Stream<Token=char>>() -> impl Parser<I, Output=()> {
    token('\n').with(value(())).expected("newline")
}

fn main() {
    let mut parser1 = many::<Vec<_>, _, _>(
        id()
            .and(many::<Vec<_>, _, _>(ws().with(num())))
            .and(comment().or(newline())),
    );
    let mut parser2 = many::<Vec<_>, _, _>(
        id()
            .and(repeat_until::<Vec<_>, _, _, _>(
                ws().with(num()),
                comment().or(newline()),
            ))
            .and(comment().or(newline())),
    );
    let mut parser3 = many::<Vec<_>, _, _>(
        id()
            .skip(ws())
            .and(sep_by::<Vec<_>, _, _, _>(num(), ws()))
            .and(comment().or(newline()))
    );

    let s = r#"a 123/2"#;
    let err1 = parser1.easy_parse(s)
        .map_err(|e| e.map_position(|p| p.translate_position(s)))
        .unwrap_err();
    let err2 = parser2.easy_parse(s)
        .map_err(|e| e.map_position(|p| p.translate_position(s)))
        .unwrap_err();
    let err3 = parser3.easy_parse(s)
        .map_err(|e| e.map_position(|p| p.translate_position(s)))
        .unwrap_err();
    println!("{}\n{}\n{}", err1, err2, err3);
}

The output is:
Parse error at 6
Unexpected `2`
Unexpected `/`
Expected `whitespace`, `digit` or `newline`
Parse error at 5
Unexpected ` `
Expected `letter`
Parse error at 6
Unexpected `2`
Unexpected `/`
Expected `whitespace` or `newline`
Note in variant 1:
- Two unexpected's: `/` is at the wrong position, and `2` is not the erroneous character. Looks like a bug?
- Position is the position of the character after the erroneous one
- Expected `digit` is wrong. There needs to be whitespace in between (or a newline, or a comment, which is silenced)
Note in variant 2:
- Unexpected space is at a different position.
- Erroneous position is (surprisingly) right
- `letter` can't be here. Note that even if I remove the outermost `many` (i.e. only support a single item, so there are no letters possible after the initial whitespace), this parser also reports `letter`.
Note in variant 3:
- Same issues as with "variant 1" for position and "unexpected"s
- "expected" set is fine
Are there any bugs, or am I misunderstanding parsers somehow? Also, why is there such a difference between `sep_by`, `repeat_until` and `many`?