Architecture
erick-alcachofa edited this page 2025-12-30 02:54:02 +00:00
Compiler Architecture
Components
- Tokenizer: Coroutine-driven lexer that emits `Token` values lazily, enabling lookahead and precise diagnostics for keywords, operators, literals, and comments. A trie-based keyword map plus demangling and string utilities keep error messages readable.
- Hybrid Parser: Combines a handwritten recursive-descent parser for high-level structure (imports, modules, aliases, declarations, statements) with a Pratt (precedence-climbing) engine for expressions. Recent merges added optional slice bounds (`[:end]`, `[start:]`), type-initiated expressions (`[]Type { ... }`), turbofish disambiguation in generics, and precedence capping so `->` works for both pointer member access and `match`/`switch` cases.
- AST: Hierarchical node definitions under `lib/include/artichoke/Parser/AST` model compilation units, declarations, statements, expressions, and types. Visitors such as `toString` (Markdown) and `toDot` (Graphviz) support visualization and debugging.
- Frontend CLI: `frontend/src/main.cpp` normalizes file paths, invokes the parser, and prints either the structured AST or descriptive diagnostics.
- Support Utilities: Shared helpers (`Expected`, trie map, string helpers, coroutine scaffolding, demangling) provide robust error propagation and ergonomics throughout the compiler.
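
To make the Pratt (precedence-climbing) idea concrete, here is a minimal, self-contained sketch. It is not the project's parser: `PrattSketch`, `BindingPower`, `infixPower`, and the operator table are hypothetical names and values chosen for illustration, and the "AST" is just a fully parenthesized string.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical binding powers; the real operator table is not shown on this page.
struct BindingPower { int left; int right; };

inline bool infixPower(char op, BindingPower& out) {
    switch (op) {
        case '+': case '-': out = {1, 2}; return true;  // left-associative
        case '*': case '/': out = {3, 4}; return true;  // binds tighter than + and -
        case '^':           out = {6, 5}; return true;  // right-associative
        default:            return false;               // not an infix operator
    }
}

class PrattSketch {
public:
    explicit PrattSketch(std::string_view src) : src_(src) {}

    // Parses an expression, consuming only operators whose left binding power
    // is at least minPower. Returns a prefix-notation string standing in for an AST.
    std::string parseExpression(int minPower = 0) {
        std::string lhs = parsePrimary();
        for (;;) {
            skipSpaces();
            BindingPower bp{};
            if (pos_ >= src_.size() || !infixPower(src_[pos_], bp) || bp.left < minPower)
                break;
            char op = src_[pos_++];
            // The right binding power decides how much the right operand absorbs.
            std::string rhs = parseExpression(bp.right);
            lhs = "(" + std::string(1, op) + " " + lhs + " " + rhs + ")";
        }
        return lhs;
    }

private:
    // Primary expressions: a parenthesized expression or an alphanumeric operand.
    std::string parsePrimary() {
        skipSpaces();
        if (pos_ < src_.size() && src_[pos_] == '(') {
            ++pos_;
            std::string inner = parseExpression(0);
            skipSpaces();
            assert(pos_ < src_.size() && src_[pos_] == ')');
            ++pos_;
            return inner;
        }
        std::size_t start = pos_;
        while (pos_ < src_.size() &&
               std::isalnum(static_cast<unsigned char>(src_[pos_])))
            ++pos_;
        assert(pos_ > start && "expected an operand");
        return std::string(src_.substr(start, pos_ - start));
    }

    void skipSpaces() {
        while (pos_ < src_.size() && src_[pos_] == ' ') ++pos_;
    }

    std::string_view src_;
    std::size_t pos_ = 0;
};
```

With this table, `PrattSketch("1 + 2 * 3").parseExpression()` yields `(+ 1 (* 2 3))` and `PrattSketch("a - b - c").parseExpression()` yields `(- (- a b) c)`. A real implementation would report malformed input through diagnostics rather than `assert`, and would consume tokens from the lexer instead of raw characters.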
Workflow
- Tokenizer lazily produces tokens via coroutines, supporting lookahead and rich diagnostics.
- Recursive-descent routines process declarations and statements, delegating to the Pratt engine for expressions. The parser constructs ASTs aligned with the formal grammar.
- Frontend emits ASTs (`ast::toString`) or clear error messages when parsing fails.
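
The three steps above can be sketched end to end in a few dozen lines. This is an illustration under stated assumptions, not the project's code: a pull-based class with one token of lookahead stands in for the coroutine lexer, `std::variant` stands in for the `Expected` helper, and `Token`, `LazyTokens`, `Diagnostic`, `Result`, `parseLet`, `render`, and the toy `let` statement are all hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <variant>

// Hypothetical token shape; the project's real Token type is not reproduced here.
struct Token {
    enum class Kind { Identifier, Number, Symbol, End } kind;
    std::string text;
};

// Pull-based stand-in for the coroutine lexer: a token is produced only when
// the parser asks for one, and peek() buffers a single token of lookahead.
class LazyTokens {
public:
    explicit LazyTokens(std::string_view src) : src_(src) {}

    const Token& peek() {
        if (!buffered_) buffered_ = lex();
        return *buffered_;
    }

    Token next() {
        Token t = peek();
        buffered_.reset();
        return t;
    }

private:
    static bool isA(int (*pred)(int), char c) {
        return pred(static_cast<unsigned char>(c)) != 0;
    }

    Token lex() {
        while (pos_ < src_.size() && isA(std::isspace, src_[pos_])) ++pos_;
        if (pos_ >= src_.size()) return {Token::Kind::End, "<end>"};
        std::size_t start = pos_;
        if (isA(std::isalpha, src_[pos_])) {
            while (pos_ < src_.size() && isA(std::isalnum, src_[pos_])) ++pos_;
            return {Token::Kind::Identifier, std::string(src_.substr(start, pos_ - start))};
        }
        if (isA(std::isdigit, src_[pos_])) {
            while (pos_ < src_.size() && isA(std::isdigit, src_[pos_])) ++pos_;
            return {Token::Kind::Number, std::string(src_.substr(start, pos_ - start))};
        }
        ++pos_;  // any other character becomes a one-character symbol
        return {Token::Kind::Symbol, std::string(1, src_[start])};
    }

    std::string_view src_;
    std::size_t pos_ = 0;
    std::optional<Token> buffered_;
};

// Minimal Expected-like result: either a value or a diagnostic.
struct Diagnostic { std::string message; };
template <typename T>
using Result = std::variant<T, Diagnostic>;

// Recursive-descent routine for the toy statement `let <identifier> = <number>`.
// Returns a one-line AST dump, or a diagnostic naming the offending token.
Result<std::string> parseLet(std::string_view src) {
    LazyTokens tokens(src);
    Token kw = tokens.next();
    if (kw.kind != Token::Kind::Identifier || kw.text != "let")
        return Diagnostic{"expected 'let', found '" + kw.text + "'"};
    Token name = tokens.next();
    if (name.kind != Token::Kind::Identifier)
        return Diagnostic{"expected identifier, found '" + name.text + "'"};
    if (tokens.peek().text != "=")  // lookahead: inspect without consuming
        return Diagnostic{"expected '=', found '" + tokens.peek().text + "'"};
    tokens.next();
    Token value = tokens.next();
    if (value.kind != Token::Kind::Number)
        return Diagnostic{"expected number, found '" + value.text + "'"};
    return "Let(" + name.text + " = " + value.text + ")";
}

// Frontend step: render the AST dump on success or the diagnostic on failure.
std::string render(const Result<std::string>& result) {
    if (const auto* ast = std::get_if<std::string>(&result)) return *ast;
    return "error: " + std::get<Diagnostic>(result).message;
}
```

Here `render(parseLet("let x = 42"))` produces `Let(x = 42)`, while `render(parseLet("let x 42"))` produces `error: expected '=', found '42'`. Returning the error as a value keeps every parsing routine free of exceptions and lets the frontend decide how to present the failure.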
Future Work
- Semantic analysis (type checking, symbol resolution) building on the expanded expression and type features already integrated.
- Intermediate representation and code generation backend.
- Tooling support: formatter, language server, extended automated tests.