Architecture
erick-alcachofa edited this page 2025-12-30 02:54:02 +00:00

Compiler Architecture

Components

  • Tokenizer: Coroutine-driven lexer that emits Token values lazily, which supports lookahead and precise diagnostics for keywords, operators, literals, and comments. A trie-based keyword map handles keyword lookup, while demangling and string utilities keep error messages readable.
  • Hybrid Parser: Combines a handwritten recursive-descent parser for high-level structure (imports, modules, aliases, declarations, statements) with a Pratt (precedence-climbing) engine for expressions. Recent merges added optional slice bounds ([:end], [start:]), type-initiated expressions ([]Type { ... }), turbofish disambiguation in generics, and precedence capping so -> works for both pointer member access and match/switch cases.
  • AST: Hierarchical node definitions under lib/include/artichoke/Parser/AST represent compilation units, declarations, statements, expressions, and types. Visitors such as toString (Markdown) and toDot (Graphviz) support visualization and debugging.
  • Frontend CLI: frontend/src/main.cpp normalizes file paths, invokes the parser, and prints either the structured AST or descriptive diagnostics.
  • Support Utilities: Shared helpers (Expected, trie map, string helpers, coroutine scaffolding, demangling) handle error propagation and common low-level tasks throughout the compiler.

Workflow

  1. The tokenizer lazily produces tokens via coroutines, supporting lookahead and rich diagnostics.
  2. Recursive-descent routines process declarations and statements, delegating to the Pratt engine for expressions. The parser constructs ASTs aligned with the formal grammar.
  3. The frontend prints the AST (ast::toString) on success, or clear error messages when parsing fails.
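
The expression side of step 2 comes down to a loop driven by binding powers. The sketch below is illustrative only: PrattSketch, infixPower, the binding-power values, and the S-expression output are assumptions made for the example, and it scans characters directly instead of consuming the project's Token stream. Error handling for malformed input is omitted.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical names and precedence values; not taken from the artichoke sources.
struct BindingPower {
    int left;
    int right;
};

// left < right makes an operator left-associative.
inline std::optional<BindingPower> infixPower(char op) {
    switch (op) {
        case '+': case '-': return BindingPower{1, 2};
        case '*': case '/': return BindingPower{3, 4};
        default: return std::nullopt;
    }
}

constexpr int kPrefixPower = 5;  // unary minus binds tighter than any binary operator

class PrattSketch {
public:
    explicit PrattSketch(std::string_view source) : src_(source) {}

    // Parses an expression, consuming only operators whose left power is >= minPower.
    // The result is rendered as an S-expression so the grouping is visible.
    std::string parseExpression(int minPower = 0) {
        assert(minPower >= 0);
        std::string lhs = parsePrefix();
        for (;;) {
            skipSpaces();
            if (pos_ >= src_.size()) break;
            const char op = src_[pos_];
            const auto power = infixPower(op);
            if (!power || power->left < minPower) break;
            ++pos_;
            std::string rhs = parseExpression(power->right);
            lhs = "(" + std::string(1, op) + " " + lhs + " " + rhs + ")";
        }
        return lhs;
    }

private:
    // Handles operands, unary minus, and parenthesized sub-expressions.
    std::string parsePrefix() {
        skipSpaces();
        if (pos_ < src_.size() && src_[pos_] == '-') {
            ++pos_;
            return "(neg " + parseExpression(kPrefixPower) + ")";
        }
        if (pos_ < src_.size() && src_[pos_] == '(') {
            ++pos_;
            std::string inner = parseExpression(0);
            skipSpaces();
            if (pos_ < src_.size() && src_[pos_] == ')') ++pos_;
            return inner;
        }
        const std::size_t start = pos_;
        while (pos_ < src_.size() && std::isalnum(static_cast<unsigned char>(src_[pos_]))) ++pos_;
        return std::string(src_.substr(start, pos_ - start));
    }

    void skipSpaces() {
        while (pos_ < src_.size() && std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
    }

    std::string_view src_;
    std::size_t pos_ = 0;
};
```

With these assumed powers, PrattSketch("1 + 2 * 3").parseExpression() yields (+ 1 (* 2 3)) and PrattSketch("1 - 2 - 3").parseExpression() yields (- (- 1 2) 3). A recursive-descent statement routine would call parseExpression wherever the grammar expects an expression and resume its own work once the call returns.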

Future Work

  • Semantic analysis (type checking, symbol resolution) building on the expanded expression and type features already integrated.
  • Intermediate representation and code generation backend.
  • Tooling support: formatter, language server, extended automated tests.