Parser generates invalid URLs #379

Closed
@LEW21

https://url.spec.whatwg.org/commit-snapshots/a1b789c6b6c36fcdb16311da5abd177e84151fca/#url-parsing

For each byte in buffer:

If byte is less than 0x21 (!), greater than 0x7E (~), or is 0x22 ("), 0x23 (#), 0x3C (<), or 0x3E (>), append byte, percent encoded, to url’s query.

Otherwise, append a code point whose value is byte to url’s query.
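The quoted step can be sketched in a few lines of Python. This is only an illustration of the rule as written in the snapshot above, not the spec's or any library's implementation; the function name `encode_query_bytes` is made up for this example.

```python
def encode_query_bytes(buffer: bytes) -> str:
    """Apply the quoted query-state rule to each byte of `buffer`.

    Hypothetical helper: the name and signature are assumptions made for
    illustration, they do not come from the URL Standard.
    """
    out = []
    for byte in buffer:
        # Bytes below 0x21, above 0x7E, or one of " # < > are percent-encoded.
        if byte < 0x21 or byte > 0x7E or byte in (0x22, 0x23, 0x3C, 0x3E):
            out.append(f"%{byte:02X}")
        else:
            # Every other byte is appended as the code point with that value.
            out.append(chr(byte))
    return "".join(out)
```

For example, `encode_query_bytes(b"a b")` gives `a%20b`, while `encode_query_bytes(b"[x]|")` returns the input unchanged.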

This rule leads to the creation of invalid URLs: ones that contain [, \, ], ^, `, {, |, }, which are neither URL code points nor '%' and therefore trigger validation errors:

Otherwise:

If c is not a URL code point and not U+0025 (%), validation error.

I think that either:

  • the list of valid query characters should be expanded to include more characters, or
  • these characters should be percent-encoded too.
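The mismatch can be checked mechanically. The sketch below assumes the ASCII portion of the URL code point set is the ASCII alphanumerics plus `!$&'()*+,-./:;=?@_~`, and it ignores the non-ASCII range; the names `URL_CODE_POINTS_ASCII`, `QUERY_ESCAPED`, and `unescaped_but_invalid` are invented for this example.

```python
import string

# Assumed ASCII subset of the URL code points (non-ASCII range ignored here).
URL_CODE_POINTS_ASCII = set(
    string.ascii_letters + string.digits + "!$&'()*+,-./:;=?@_~"
)

# Bytes in 0x21..0x7E that the quoted query-state step still percent-encodes.
QUERY_ESCAPED = {0x22, 0x23, 0x3C, 0x3E}


def unescaped_but_invalid() -> list[str]:
    """List ASCII characters emitted as-is that are neither URL code points nor '%'."""
    result = []
    for byte in range(0x80):
        emitted_raw = 0x21 <= byte <= 0x7E and byte not in QUERY_ESCAPED
        char = chr(byte)
        if emitted_raw and char not in URL_CODE_POINTS_ASCII and char != "%":
            result.append(char)
    return result
```

Under those assumptions, `unescaped_but_invalid()` yields exactly the eight characters listed in this issue: `[`, `\`, `]`, `^`, `` ` ``, `{`, `|`, `}`.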

Related issues: #378, #17

Labels: topic: parser, topic: validation