Skip to content

Programming Language Theory: Parsing#

Parsing/ASTs#

  • Different AST types
  • Homogenous: one type of node class ASTNode
  • Normalized Heterogenous: class Expr : public ASTNode { vector<ASTNode*> children; };
  • Irregular Heterogenous: named nodes class BinaryExpr : public Expr { Expr* left, right; };
  • Tree walkers
  • Tree Grammar: use grammar to generate walkers
    • Forced visitation order
    • useful for when every subtree must be touched (like codegen)
  • Tree Pattern Matcher: specifies patterns we care about
    • different phases of application care about different parts of tree
    • decouples order we apply tree patterns from tree patterns themselves
    • useful for term re-writing (change multiply to simd multiply)
  • Traversal orders
  • Preorder
  • In-order
  • Postorder

Insights#

  • split Tokenizing from Parsing
  • LR/LLR/etc is overcomplication. Just write parsers like normal code

TreeSitter#