andychu edited this page Dec 10, 2016 · 16 revisions

What algorithms and tools do production-quality languages use for parsing?

  • lexing
  • parsing, including operator precedence parsing
  • AST representation

TODO: Add links

POSIX Shells

  • bash: multiple parsers.
    • yacc is used for the command language (judged to be a mistake in retrospect).
    • recursive descent (Wirth-style) is used for arithmetic, i.e. $(( ))
    • recursive descent for boolean language (in parse.y)
    • ad hoc parser for variable/word language
  • dash: hand-written lexer, recursive descent parser, generated AST nodes
  • mksh: stateful lexer
  • zsh:
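
To make "recursive descent (Wirth-style)" concrete, here is a minimal sketch of that technique for a small arithmetic language. It is not bash's code: the names `tokenize` and `Parser`, the grammar, and the choice to evaluate directly instead of building an AST are all assumptions made for illustration.

```python
import re

# Hypothetical tokenizer: integers and single-character operators only.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|(.))")


def tokenize(text):
    """Yield ('num', int) and ('op', char) tokens, then a final ('eof', None)."""
    for number, op in TOKEN_RE.findall(text):
        if number:
            yield ("num", int(number))
        elif op.strip():  # skip stray whitespace matched by '.'
            yield ("op", op)
    yield ("eof", None)


class Parser:
    """One method per grammar rule; each precedence level is its own rule."""

    def __init__(self, text):
        self.tokens = list(tokenize(text))
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def accept(self, *ops):
        """Consume and return the next token's text if it is one of ops."""
        kind, value = self.peek()
        if kind == "op" and value in ops:
            self.pos += 1
            return value
        return None

    def parse(self):
        value = self.expr()
        if self.peek()[0] != "eof":
            raise SyntaxError(f"unexpected token {self.peek()[1]!r}")
        return value

    # expr := term (('+' | '-') term)*
    def expr(self):
        value = self.term()
        while (op := self.accept("+", "-")) is not None:
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    # term := factor (('*' | '/') factor)*
    def term(self):
        value = self.factor()
        while (op := self.accept("*", "/")) is not None:
            rhs = self.factor()
            # Note: '//' floors, unlike C-style truncation toward zero.
            value = value * rhs if op == "*" else value // rhs
        return value

    # factor := NUMBER | '(' expr ')' | '-' factor
    def factor(self):
        if self.accept("("):
            value = self.expr()
            if not self.accept(")"):
                raise SyntaxError("expected ')'")
            return value
        if self.accept("-"):
            return -self.factor()
        kind, value = self.peek()
        if kind != "num":
            raise SyntaxError(f"expected a number, got {value!r}")
        self.pos += 1
        return value


print(Parser("(1 + 2) * 3").parse())
```

The loops in `expr` and `term` make the binary operators left-associative, and nesting `term` inside `expr` is what gives `*` and `/` higher precedence than `+` and `-`.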

Alternative Shells

See ExternalResources

PowerShell?

Build Tools

  • Ninja: re2c-generated lexer (chosen for speed; it used to be hand-coded), recursive descent parser.

Other Language Implementations

  • Python
    • hand-coded lexer with indentation stack
    • generated parser
    • generated AST, with Zephyr ASDL
  • Ruby
  • Perl
  • Julia: lexer and parser are hand-written in femtolisp! Enables Julia macros.
  • v8
  • Clang: hand-written parser, enormous hand-written AST with C++ classes
  • Go: hand-written C that was automatically converted to Go
  • CoffeeScript: hand-coded lexer with regexes, token "fixups", JISON bottom-up parser
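
The "lexer with indentation stack" noted for Python above can be sketched roughly as follows. This is not CPython's tokenizer: the function name `indent_tokens` and the `LINE` token kind are hypothetical, and the sketch assumes space-only indentation, with no handling of tabs, comments, or line continuations.

```python
def indent_tokens(source):
    """Yield (kind, text) pairs: INDENT, DEDENT, and LINE for each non-blank line."""
    stack = [0]  # indentation widths of the currently open blocks
    for raw in source.splitlines():
        if not raw.strip():
            continue  # blank lines do not open or close blocks
        stripped = raw.lstrip(" ")
        width = len(raw) - len(stripped)
        if width > stack[-1]:
            # Deeper than the current block: open a new one.
            stack.append(width)
            yield ("INDENT", "")
        while width < stack[-1]:
            # Shallower: close blocks until a matching width is found.
            stack.pop()
            yield ("DEDENT", "")
        if width != stack[-1]:
            raise IndentationError(f"dedent matches no outer level: {raw!r}")
        yield ("LINE", stripped)
    # Close any blocks still open at end of input.
    while len(stack) > 1:
        stack.pop()
        yield ("DEDENT", "")


for token in indent_tokens("if x:\n    a\n    if y:\n        b\nc\n"):
    print(token)
```

Turning indentation into explicit `INDENT` and `DEDENT` tokens lets the parser treat blocks like bracketed constructs, so the grammar itself can stay context-free.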