Compare commits

No commits in common. "main" and "parser-dev" have entirely different histories.

13 changed files with 282 additions and 1300 deletions

@ -3,8 +3,6 @@
> very unstable.
> * There will likely be breaking changes and periods where no work is done on
> the project.
> * Expect breaking changes as the compiler progresses through semantic analysis
> and code generation.
# The `artichoke` Programming Language
@ -17,8 +15,8 @@ The goal of `artichoke` is to provide a language that is simple, safe, and
productive for programming, eliminating common pitfalls without sacrificing
performance or control.
For a detailed guide to the language, grammar specification, and syntax
features, please see the [project wiki](https://git.artichoke.dev/me/artichoke-lang/wiki)
For a detailed guide to the language, please see the
[project wiki](https://git.artichoke.dev/me/artichoke-lang/wiki).
## Core Philosophy & Features
@ -28,8 +26,6 @@ productive programming experience:
* **Explicitness:** Type conversions and error handling are explicit.
* **Safety:** Non-nullable pointers, a robust type system, and deterministic
resource management are prioritized.
* **Unambiguous Design:** A grammar designed for fast, single-pass parsing and
clear error reporting.
* **Modern Ergonomics:** Features like generics, defer, and a clean module
system reduce boilerplate and improve readability.
@ -39,19 +35,9 @@ true **module system**, and **compile-time reflection**.
## Project Status
`artichoke` is currently in the **early implementation phase**. The front-end
infrastructure is not yet fully defined, but it already contains a simple program
for printing and visualizing the generated AST. Development has now shifted
toward semantic validation.
- [x] **Lexical Analysis:** Full tokenizer implementation.
- [x] **Syntactic Analysis:** Handwritten Recursive Descent + Pratt Expression
Parser.
- [x] **AST Infrastructure:** Complete Abstract Syntax Tree with Graphviz
and String-Graph based visualization support.
- [ ] **Semantic Analysis (In Progress):** Multi-pass symbol table generation
and type checking.
- [ ] **Backend:** Code generation and optimization.
`artichoke` is currently in the ***design and grammar-specification phase***. The
grammar is stable, and the next step is the implementation of a compiler
(parser, semantic analyzer, and code generator).
## Building from Source
@ -66,13 +52,13 @@ cmake -DCMAKE_BUILD_TYPE=Release -S . -B build
# Build the project
cmake --build build
# Run the compiler frontend binary
./build/frontend/artichoke-c <input_file>
# Run the binary
./build/frontend/artichoke-c
# Run the tests if enabled
ctest --test-dir build/tests --output-on-failure
# Install library and frontend binary if wanted
# Install if wanted
cmake --install build --prefix=/usr/local
# Run the installed binary

@ -1,40 +0,0 @@
# Compiler Architecture
## Components
- **Tokenizer:** Coroutine-driven lexer that emits `Token` values lazily,
enabling lookahead and precise diagnostics for keywords, operators, literals,
and comments. A trie-based keyword map plus demangling and string utilities
keep error messages readable.
- **Hybrid Parser:** Combines a handwritten recursive-descent parser for
high-level structure (imports, modules, aliases, declarations, statements) with
a Pratt (precedence-climbing) engine for expressions. Recent merges added
optional slice bounds (`[:end]`, `[start:]`), type-initiated expressions
(`[]Type { ... }`), turbofish disambiguation in generics, and precedence capping
so `->` works for both pointer member access and `match`/`switch` cases.
- **AST:** Hierarchical node definitions under `lib/include/artichoke/Parser/AST`
model compilation units, declarations, statements, expressions, and types.
Visitors such as `toString` (Markdown) and `toDot` (Graphviz) support
visualization and debugging.
- **Frontend CLI:** `frontend/src/main.cpp` normalizes file paths, invokes the
parser, and prints either the structured AST or descriptive diagnostics.
- **Support Utilities:** Shared helpers (`Expected`, trie map, string helpers,
coroutine scaffolding, demangling) provide robust error propagation and
ergonomics throughout the compiler.
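The trie-based keyword map mentioned in the Tokenizer entry can be pictured with a minimal C++ sketch. This is an illustration only: `KeywordTrie`, `TokenKind`, and `makeKeywordTrie` are hypothetical names chosen for this example, not the project's actual API, and the keyword set is limited to a few keywords that appear in this documentation.

```cpp
#include <array>
#include <memory>
#include <optional>
#include <string_view>

// Hypothetical token kinds; the real tokenizer's set is not shown in this document.
enum class TokenKind { KwFn, KwLet, KwDef, KwDefer };

// Minimal trie over ASCII characters that maps keyword spellings to token kinds.
class KeywordTrie {
    struct Node {
        std::array<std::unique_ptr<Node>, 128> next{};
        std::optional<TokenKind> kind;  // set only where a keyword ends
    };
    Node root_;

public:
    void insert(std::string_view word, TokenKind kind) {
        Node* node = &root_;
        for (unsigned char c : word) {
            if (c >= 128) return;  // sketch: ignore non-ASCII spellings
            if (!node->next[c]) node->next[c] = std::make_unique<Node>();
            node = node->next[c].get();
        }
        node->kind = kind;
    }

    // Returns a kind only on an exact match; a bare prefix yields std::nullopt.
    std::optional<TokenKind> lookup(std::string_view word) const {
        const Node* node = &root_;
        for (unsigned char c : word) {
            if (c >= 128 || !node->next[c]) return std::nullopt;
            node = node->next[c].get();
        }
        return node->kind;
    }
};

// Builds a trie holding a few keywords that appear in this documentation.
inline KeywordTrie makeKeywordTrie() {
    KeywordTrie trie;
    trie.insert("fn", TokenKind::KwFn);
    trie.insert("let", TokenKind::KwLet);
    trie.insert("def", TokenKind::KwDef);
    trie.insert("defer", TokenKind::KwDefer);
    return trie;
}
```

With this layout, `def` and `defer` share the path `d`, `e`, `f`, so a lexer can classify an identifier-like token in one walk over its characters and fall back to a plain identifier when `lookup` returns `std::nullopt`.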
## Workflow
1. Tokenizer lazily produces tokens via coroutines, supporting lookahead and
rich diagnostics.
2. Recursive-descent routines process declarations and statements, delegating to
the Pratt engine for expressions. The parser constructs ASTs aligned with the
formal grammar.
3. Frontend emits ASTs (`ast::toString`) or clear error messages when parsing
fails.
## Future Work
- Semantic analysis (type checking, symbol resolution) building on the expanded
expression and type features already integrated.
- Intermediate representation and code generation backend.
- Tooling support: formatter, language server, extended automated tests.

@ -1,19 +0,0 @@
# Sample Program Overview
This section highlights the language features exercised by the canonical
overview program distributed with the project.
- Imports: module wildcards, specific symbols, and module aliases.
- Type aliases with `using` for types and functions.
- Generics: struct definitions with `<typename T>`, turbofish instantiations (e.g., `Point::<i32>`).
- Functions: regular functions and methods (`this` parameter syntax), return types via `->`.
- Enums: tagged unions with `Result::<T, E>` usage and variant initialization (`Err{ -1 }`, `Ok{}`).
- Variables: `let`/`def` with type inference, complex pointer/optional qualifiers (`*$?` combinations).
- Slices: literals `[]Type { ... }`, slicing syntax `[start:end]`, specialized suffixes (`.*`, `.#`, `.[len]`).
- Control flow: if/else with unwrapping, while loops (condition-based and iterator-style), do/while, C-style for loops, range for loops, labeled loops.
- Pattern matching: `match` with bindings, `_` default; `switch` for value cases.
- Resource management: `defer`, `errdefer` for cleanup semantics.
- Reflection: `.@` operator to fetch metadata (`.@`, `.@alignment`, `.@size`).
These features can be explored by running the parser CLI against any `artichoke`
source file to inspect the resulting AST or diagnostics.

@ -1,55 +0,0 @@
# Getting Started
Build and run the `artichoke` parser frontend to experiment with the language
features described in this documentation.
## Prerequisites
- C++23 compiler (tested with Clang 17/GCC 13).
- CMake 3.26+.
- Ninja or Make.
- Optional: `ctest` for tokenizer tests.
## Build the Toolchain
```bash
cmake -S . -B build -DCMAKE_BUILD_TYPE=Debug
cmake --build build
```
The executable `build/frontend/artichoke-c` reads a source file, parses it, and
prints a Markdown AST or diagnostics.
## Run the Parser
```bash
./build/frontend/artichoke-c path/to/program.arti
```
The CLI prints either a Markdown AST or descriptive diagnostics.
## Run Tests
Tokenizer tests live under `tests/Tokenizer/`.
```bash
cmake --build build --target tests
ctest --test-dir build/tests --output-on-failure
```
Enable testing during configuration with `-DENABLE_TESTING=ON`.
## Repository Layout
- `frontend/` CLI entry point.
- `lib/` Tokenizer, parser, AST, and utilities.
- `tests/Tokenizer/` Tokenization coverage.
- `docs/` Reference programs and supporting materials.
## Next Steps
- Review the [Sample Programs](Examples-SamplePrograms.md) and overview guides
to understand the language.
- Dive into [Language Overview](Language-Overview.md) and
[Control Flow](Language-ControlFlow.md) for targeted explanations.
- Use [Architecture](Architecture.md) if you plan to extend the compiler.

@ -1,81 +1,286 @@
# `artichoke` Language Wiki
# The `artichoke` Programming Language: A Technical Overview
`artichoke` is a modern, statically-typed programming language designed to
satisfy my personal preferences and requirements for programming. It combines
low-level control with powerful modern features such as a robust type system,
generics, integrated error handling, and a clean, ergonomic syntax.
## 1. Introduction
The goal of `artichoke` is to provide a language that is simple, safe, and
productive for programming, eliminating common pitfalls without sacrificing
performance or control.
`artichoke` is a statically-typed, general-purpose programming language designed
with an emphasis on performance, safety, and expressive syntax. It combines
low-level control over memory with modern, high-level features like generics,
algebraic data types, and integrated error handling. This document provides an
overview of the language's features as defined by its core grammar.
## Using This Wiki
It is heavily inspired by C, C++, Rust, and most of all Zig.
Start with [Getting Started](GettingStarted.md) to build and run the parser.
Continue with the language guide and control-flow chapters for deeper dives into
syntax and semantics. The reference section contains the formal grammar and
token catalog, while the sample programs illustrate how features fit together.
Report any gaps or inconsistencies via issues or patches.
## 2. Basic Syntax & Structure
## Quick Links
### Modules, Imports, and Aliases
- **Getting Started:** [Getting Started](GettingStarted.md)
- **Language Guide:** [Language Overview](Language-Overview.md)
- **Control Flow:** [Control Flow](Language-ControlFlow.md)
- **Expressions & Operators:** [Expressions & Operators](Language-Expressions.md)
- **Pattern Unwrapping:** [Patterns](Language-Patterns.md)
- **Grammar Reference:** [Grammar Reference](Reference-Grammar.md)
- **Token Reference:** [Token Reference](Reference-Tokens.md)
- **Architecture Overview:** [Architecture](Architecture.md)
- **Sample Programs:** [Sample Programs](Examples-SamplePrograms.md)
`artichoke` code is organized into modules. The `import` statement is used to bring
symbols from other modules into the current scope.
## Core Philosophy & Features
* **Importing a specific element:** `import my_module::some_function;`
* **Importing all direct elements of a module:** `import std::*;`
* **Importing an entire submodule:** `import std::memory;`
`artichoke` is built around a few core principles to create a safer, more
productive programming experience:
The `using` keyword creates a local, more convenient alias for a type, function,
or module name.
* **Explicitness:** Type conversions and error handling are explicit.
* **Safety:** Non-nullable pointers, a robust type system, and deterministic
resource management are prioritized.
* **Unambiguous Design:** A grammar designed for fast, single-pass parsing and
clear error reporting.
* **Modern Ergonomics:** Features like generics, defer, and a clean module
system reduce boilerplate and improve readability.
```
using mem = std::memory;
using FileHandle = std::fs::File;
```
The language includes a powerful **generic type system**, first-class **error
handling**, a full suite of **control flow** statements (including match), a
true **module system**, and **compile-time reflection**.
### Comments
## Project Status
The language uses C-style block comments.
`artichoke` is currently in the **early implementation phase**. The front-end
infrastructure is not yet fully defined, but it already contains a simple program
for printing and visualizing the generated AST. Development has now shifted
toward semantic validation.
```
/* This is a multi-line
comment. */
```
- [x] **Lexical Analysis:** Full tokenizer implementation.
- [x] **Syntactic Analysis:** Handwritten Recursive Descent + Pratt Expression
Parser.
- [x] **AST Infrastructure:** Complete Abstract Syntax Tree with Graphviz and
String-Graph based visualization support.
- [ ] **Semantic Analysis (In Progress):** Multi-pass symbol table generation
and type checking.
- [ ] **Backend:** Code generation and optimization.
## 3. The Type System
## Contributing
`artichoke`'s type system is strong and static, with a rich set of features for
defining complex data structures.
The `artichoke` project is hosted on a personal, self-hosted Gitea instance. If
you are interested in contributing, you have two options:
### Type Qualifiers
1. **Request an Account:** Please contact support@artichoke.dev to request an
account on the Gitea instance.
2. **Submit Patches:** Alternatively, you can send patches or diffs directly to
the same email address.
Qualifiers modify the type to their immediate right, allowing for precise and
complex type definitions.
In all cases, proper attribution will be given for your contributions in the
source files and/or the project wiki.
* **`*` (Pointer):** Creates a pointer to a type. Pointers cannot be `null`.
* **`$` (Mutable):** Marks a type as mutable. This is used for function
parameters, local variables, and struct fields to allow modification.
* **`?` (Optional):** Marks a type as nullable. An optional type can hold either a
value of its underlying type or `null`.
* **`[]` (Slice):** A "fat pointer" representing a view into a contiguous
sequence of elements. It contains both a pointer to the data and a length.
## License
These qualifiers can be combined. For example, `*$?int` defines a **pointer to a
mutable optional integer**.
This project is licensed under the **GNU Affero General Public License v3.0**.
The full license text can be found in the LICENSE file in this repository.
### Generics
Generics allow for writing flexible, reusable code that can operate on multiple
types. They are defined using `<typename T>`.
```
/* A generic struct */
struct Point<typename T> {
x: T,
y: T
}
/* A generic function */
fn scale<typename T>(lhs: *Point<T>, rhs: T) -> Point<T> {
/* ... */
}
```
## 4. Declarations
### Variables
Variables are declared using the `let` (mutable) and `def` (immutable/constant)
keywords.
* **Type inference** is supported when the type can be determined from the initializer.
* A declaration must provide a type annotation, an initializer, or both.
```
/* Mutable variable with explicit type */
let x: i32 = 10;
/* Immutable variable with type inference */
def do_you_get_it = meaning_of_life();
```
### Structs
Structs are composite data types that group together variables under one name.
They support generics.
```
struct Rectangle {
top: Point<i32>,
bot: Point<i32>
}
```
**Initialization:** Structs can be initialized using positional or named fields,
but not a mix of both.
```
/* Positional initialization */
def top_left = Point<i32>{ 0, 10 };
/* Named-field initialization */
def top_right = Point<i32>{ x: 10, y: 10 };
```
### Enums (Tagged Unions)
Enums define a type that can be one of several different variants. Variants can
optionally hold data.
```
enum AssetType {
Texture,
Model,
Sound,
}
enum Result<typename T, typename E> {
Ok(T),
Err(E)
}
```
**Initialization:** Enum variants are accessed using scope resolution (`::`).
```
def my_asset = AssetType::Texture;
def success = Result<i32, string>::Ok(100);
```
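For readers coming from C++, a tagged union with payloads behaves much like a `std::variant` over one wrapper type per variant. The sketch below is only an analogy under that assumption; `Ok`, `Err`, `Result`, and `describe` are names chosen for this example and say nothing about how the `artichoke` compiler represents enums internally.

```cpp
#include <string>
#include <variant>

// Each variant gets its own wrapper type, so Ok<T> and Err<E> stay distinct
// even when T and E happen to be the same type.
template <typename T> struct Ok  { T value; };
template <typename E> struct Err { E error; };

// Rough C++ analogue of `enum Result<typename T, typename E> { Ok(T), Err(E) }`.
template <typename T, typename E>
using Result = std::variant<Ok<T>, Err<E>>;

// Inspects which variant is active and reads its payload.
inline std::string describe(const Result<int, std::string>& r) {
    if (const auto* ok = std::get_if<Ok<int>>(&r)) {
        return "ok: " + std::to_string(ok->value);
    }
    return "err: " + std::get<Err<std::string>>(r).error;
}
```

Here `Result<int, std::string>{Ok<int>{100}}` plays the role of `Result<i32, string>::Ok(100)`: the variant stores a tag alongside the payload, and code must check the tag before reading the value.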
### Functions
Functions are defined with the `fn` keyword. The return type is specified after
the parameter list with `->`.
```
fn meaning_of_life() -> i32 {
return 42;
}
```
#### Member Functions (`this` parameter)
If the first parameter of a function is declared with the `this` keyword, it can
be called using "member function" syntax.
```
/* Definition */
fn add<typename T>(this *$Point<T>, other: *Point<T>) {
this->x += other->x;
this->y += other->y;
}
/* Can be called in two ways: */
/* Member function syntax */
my_point.add(&other_point);
/* Normal function syntax */
add(&my_point, &other_point);
```
## 5. Control Flow
### `if`/`else` Statements
`artichoke` supports C-style `if`/`else` and `else if` chains. It also integrates a
powerful unwrapping feature for handling `Result` and optional (`?`) types.
```
/* Standard if/else */
if (argc < 2) {
return Result::Err(-1);
}
/* Unwrapping a Result */
if (foo()) |ok| {
/* `ok` holds the success value */
}
else |err| {
/* `err` holds the error value */
}
```
### Loops
The language provides a comprehensive set of looping constructs.
* **C-Style `for`:** `for (let i = 0; i < 10; i += 1) { ... }`
* **Range-based `for`:** `for (let e := arrSlice) { ... }`
* **`while` Loop:** Can optionally have an `else` block that executes when the loop
condition is no longer met.
* **Iterator `while`:** Supports unwrapping `Result`/optional types, executing as
long as the value is valid.
* **`do-while` Loop:** Guarantees the body executes at least once.
* **Infinite `loop`:** `loop { ... }`
#### Loop Labels and Control
Loops can be labeled. The `break` and `continue` statements can optionally specify a
label to control nested loops.
```
outer_loop := while (condition) {
inner_loop := for (...) {
break outer_loop;
}
}
```
## 6. Expressions and Operators
### Pointer and Member Access
* **`&` (Address-of):** Gets a pointer to a variable.
* **`*` (Dereference):** Accesses the value a pointer points to.
* **`.` (Member Access):** Accesses a member of a struct value.
* **`->` (Pointer Member Access):** Dereferences a pointer and accesses a member
(`p->x` is shorthand for `(*p).x`).
### Slice Operators
Slices have a dedicated set of operators for manipulation.
* **`[start:end]` (Slicing):** Creates a new slice from an existing one.
* **`.*` (Pointer Access):** Gets the underlying raw pointer of the slice.
* **`.#` (Length Access):** Gets the number of elements in the slice.
* **`.[length]` (Slice from Pointer):** Creates a slice from a raw pointer and a length.
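The slice operators above map naturally onto a pointer-plus-length pair. The following C++ sketch models that idea; `Slice`, `subslice`, and `sliceFromPointer` are hypothetical names for this example, and the half-open `[start, end)` interpretation of `[start:end]` is an assumption, not something this overview specifies.

```cpp
#include <cassert>
#include <cstddef>

// A non-owning view: a pointer to the first element plus an element count.
template <typename T>
struct Slice {
    T* data;             // analogue of `slice.*`
    std::size_t length;  // analogue of `slice.#`

    // Analogue of `slice[start:end]`, assumed half-open; shares the same storage.
    Slice subslice(std::size_t start, std::size_t end) const {
        assert(start <= end && end <= length);
        return Slice{data + start, end - start};
    }

    T& operator[](std::size_t i) const { return data[i]; }
};

// Analogue of `ptr.[len]`: builds a view from a raw pointer and a length.
template <typename T>
Slice<T> sliceFromPointer(T* ptr, std::size_t len) {
    return Slice<T>{ptr, len};
}
```

Because a sub-slice only adjusts the pointer and the count, slicing copies no elements, which is what makes the "fat pointer" representation cheap to pass around.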
### Assignment
The language supports simple (`=`) and compound assignment (`+=`, `*=`, etc.)
operators.
## 7. Advanced Features
### Resource Management (`defer` and `errdefer`)
`artichoke` uses `defer` for deterministic resource management.
* **`defer`:** Schedules an expression or code block to be executed when the
current scope is exited. Deferred calls are executed in Last-In, First-Out
(LIFO) order.
* **`errdefer`:** Similar to `defer`, but the code is only executed if the scope is
exited due to a function returning an error (an `Err` variant of a `Result`).
```
defer call_cleanup();
errdefer {
log("An error occurred!");
}
```
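The LIFO ordering of `defer` resembles how C++ destroys local objects in reverse order of declaration. The sketch below uses that analogy to show the ordering; `ScopeGuard` and `runWithDefers` are hypothetical names for this example, it is not how `artichoke` implements `defer`, and `errdefer` is not modelled.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Runs the stored action when the guard leaves scope, similar in spirit to `defer`.
class ScopeGuard {
    std::function<void()> action_;

public:
    explicit ScopeGuard(std::function<void()> action) : action_(std::move(action)) {}
    ~ScopeGuard() { action_(); }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;
};

// Records the order in which the body and two deferred actions run.
inline std::vector<std::string> runWithDefers() {
    std::vector<std::string> log;
    {
        ScopeGuard first([&log] { log.push_back("first defer"); });
        ScopeGuard second([&log] { log.push_back("second defer"); });
        log.push_back("body");
    }  // guards are destroyed here, in reverse order of declaration
    return log;
}
```

Calling `runWithDefers()` yields the entries `body`, `second defer`, `first defer`, in that order: the action registered last runs first.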
### Reflection (`.@`)
The language provides a compile-time reflection mechanism via the `.@` operator.
It can be applied to values, types, and static members to query metadata.
* **On values:** `my_variable.@type`
* **On types:** `Point<u32>.@size`, `Point<u32>.@alignment`
* **On static members:** `Point<u32>::x.@offset`
```
/* Gets size in bytes */
def size_bytes = Point<u32>.@size;
/* Gets string representation of the type */
def point_name = Point<u32>.@typename;
```
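As a point of comparison, C++ exposes similar layout metadata through `sizeof`, `alignof`, and `offsetof`. The sketch below is an analogy only; `PointU32` and the `k...` constants are names invented for this example, and the concrete values depend on the target platform.

```cpp
#include <cstddef>
#include <cstdint>

template <typename T>
struct Point {
    T x;
    T y;
};

using PointU32 = Point<std::uint32_t>;

// Compile-time layout queries, loosely comparable to `.@size`, `.@alignment`,
// and `.@offset` in the examples above.
constexpr std::size_t kPointSize      = sizeof(PointU32);
constexpr std::size_t kPointAlignment = alignof(PointU32);
constexpr std::size_t kOffsetOfX      = offsetof(PointU32, x);
constexpr std::size_t kOffsetOfY      = offsetof(PointU32, y);
```

All four constants are evaluated by the compiler, so they can be used in `static_assert` checks or as array bounds without any runtime cost.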

@ -1,123 +0,0 @@
# Control Flow
This section outlines the control-flow constructs currently supported by the
`artichoke` parser, including variable declarations, loops, pattern unwrapping,
and resource management.
## Variable Declarations
```arti
let x: i32 = 10;
let answer = meaning_of_life();
def PI: f64 = 3.14159265358979;
def ptr: *i32 = &answer;
let mutable_pointer: *$i32 = &x;
let complex_pointer: *$*$i32 = &mutable_pointer;
let null_int: ?i32 = null;
```
- `let` declares mutable bindings; `def` declares immutable ones.
- Pointer/mutability/optional qualifiers (`*`, `$`, `?`) attach immediately to
the type on their right.
## `if` / `else`
```arti
if (foo()) |ok_val| {
/* success path */
} else |err_val| {
/* error path */
}
if (condition) {
/* then */
} else if (other_condition) {
/* else-if */
} else {
/* final branch */
}
```
- Unwrap clauses (`|name|`) bind `Result` or optional values for the block.
- Parentheses around conditions are required.
## `match`
```arti
match (foo()) {
Result::<i32, []u8>::Ok |v| -> {
std::io::print("Success!");
}
_ -> {
/* default */
}
}
```
- Patterns accept type expressions and optional bindings.
- `_` handles unmatched cases.
## `switch`
```arti
switch (value) {
0 -> { /* ... */ }
(1 + 2) -> { /* ... */ }
_ -> { /* ... */ }
}
```
- Value-based branching for expressions.
## Loops
```arti
while (foo()) |ok_val| {
/* loop while Ok */
} else |err_val| {
/* handles Err */
}
while (foo.next()) |item| {
/* iterator-style loop */
}
do {
/* body */
} while (true);
for (let i = 0; i < 10; i += 1) {
/* C-style loop */
}
for (let element := returns_range_function()) {
/* range loop */
}
outer_loop := while (condition) {
inner_loop := for (let i = 0; i < 10; i += 1) {
if (i == 5) { break outer_loop; }
}
}
```
- Range loops require `:=` and bind the element name using `let` or `def`.
- Labels (`outer_loop :=`) allow `break`/`continue` to target outer loops.
## Defer & errdefer
```arti
defer cleanup();
errdefer { log_failure(); }
```
- `defer` runs at scope exit in reverse order.
- `errdefer` runs only if the function returns an error variant.
## Return and Expressions
- `return expr;` returns a value; a bare `return;` is used when there is no value to return (void-like functions).
- Any expression followed by `;` forms a statement.
See `docs/example.arti` for the full program showcasing these constructs.

@ -1,107 +0,0 @@
# Expressions & Operators
`artichoke` uses a Pratt-style expression parser supporting rich infix, prefix,
and postfix syntax. This section summarizes the key behaviors currently
implemented.
## Literals
- Numeric literals: `42`, `3.14159`, `10`.
- Character/boolean/null: `'a'`, `true`, `false`, `null`.
- Strings follow double-quoted C-style syntax with escapes.
All literal tokens map to dedicated AST nodes (`CharLiteral`, `NullLiteral`,
`StringLiteral`, `FloatLiteral`, `IntegerLiteral`, `BooleanLiteral`).
## Identifiers and Module Access
- Simple identifiers refer to variables or functions: `x`, `meaning_of_life`.
- Namespaced access uses `::`: `Result::<void, i32>::Err`, `std::memory`.
## Function Calls and Methods
```arti
meaning_of_life();
scale(&point, 2);
block.initialize(2048);
```
- Turbofish syntax applies at call sites when generics are involved.
- Methods (declared with `this`) can be invoked as member calls (`expr.method`)
or as regular functions (`method(expr, ...)`).
## Operators and Precedence
`artichoke` uses Pratt parsing with the following precedence (lowest to highest):
1. Assignment: `=`, `+=`, `-=`, `*=`, `/=`, `%=`
2. Boolean OR: `or`, `||`
3. Boolean AND: `and`, `&&`
4. Comparisons: `==`, `!=`, `<`, `>`, `<=`, `>=`
5. Bitwise OR/XOR/AND: `|`, `^`, `&`
6. Shifts: `<<`, `>>`
7. Addition/Subtraction: `+`, `-`
8. Multiplication/Division/Modulo: `*`, `/`, `%`
9. Prefix: `!`, `-`, `~`, `&`, `*`
10. Postfix and suffix operators
The sample program demonstrates complex precedence:
```arti
let calculation = ~5 + 10 * 2 / (length - 1) % 4 << 2 >> 1;
let logic_check = (calculation >= 100 or !true) and (length != 0);
x = y = length += 10;
```
- Assignment chains associate right-to-left.
- Parentheses override precedence as expected.
- Boolean aliases (`or`, `and`) behave like `||`, `&&`.
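To make the precedence list concrete, here is a small precedence-climbing (Pratt-style) evaluator in C++ for an integer-only subset: shifts, additive and multiplicative operators, the prefix operators `-` and `~`, and parentheses. It is a sketch under stated assumptions, not the project's parser: `PrattEvaluator` and `evaluate` are hypothetical names, the binding powers simply reuse the level numbers from the list, it evaluates directly instead of building an AST, and right-associative assignment is not modelled.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string_view>

class PrattEvaluator {
    std::string_view src_;
    std::size_t pos_ = 0;

    void skipSpaces() {
        while (pos_ < src_.size() && std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
    }

    // Returns the infix operator at the cursor (empty if none) without consuming it.
    std::string_view peekOperator() {
        skipSpaces();
        std::string_view rest = src_.substr(pos_);
        if (rest.substr(0, 2) == "<<" || rest.substr(0, 2) == ">>") return rest.substr(0, 2);
        if (!rest.empty() && std::string_view("+-*/%").find(rest[0]) != std::string_view::npos)
            return rest.substr(0, 1);
        return {};
    }

    // Higher numbers bind tighter: shifts < additive < multiplicative.
    static int bindingPower(std::string_view op) {
        if (op == "<<" || op == ">>") return 6;
        if (op == "+" || op == "-") return 7;
        return 8;  // "*", "/", "%"
    }

    static long apply(std::string_view op, long lhs, long rhs) {
        if (op == "+") return lhs + rhs;
        if (op == "-") return lhs - rhs;
        if (op == "*") return lhs * rhs;
        if (op == "<<") return lhs << rhs;
        if (op == ">>") return lhs >> rhs;
        if (rhs == 0) throw std::runtime_error("division by zero");
        return op == "/" ? lhs / rhs : lhs % rhs;
    }

    // Prefix operators bind tighter than every infix operator.
    long parsePrefix() {
        skipSpaces();
        if (pos_ >= src_.size()) throw std::runtime_error("unexpected end of input");
        char c = src_[pos_];
        if (c == '-') { ++pos_; return -parsePrefix(); }
        if (c == '~') { ++pos_; return ~parsePrefix(); }
        if (c == '(') {
            ++pos_;
            long value = parseExpression(0);
            skipSpaces();
            if (pos_ >= src_.size() || src_[pos_] != ')') throw std::runtime_error("expected ')'");
            ++pos_;
            return value;
        }
        if (!std::isdigit(static_cast<unsigned char>(c))) throw std::runtime_error("expected operand");
        long value = 0;
        while (pos_ < src_.size() && std::isdigit(static_cast<unsigned char>(src_[pos_])))
            value = value * 10 + (src_[pos_++] - '0');
        return value;
    }

public:
    explicit PrattEvaluator(std::string_view src) : src_(src) {}

    // Core loop: fold operators while they bind tighter than minPower.
    long parseExpression(int minPower) {
        long lhs = parsePrefix();
        for (;;) {
            std::string_view op = peekOperator();
            if (op.empty() || bindingPower(op) <= minPower) return lhs;
            pos_ += op.size();
            long rhs = parseExpression(bindingPower(op));  // same level stops: left-associative
            lhs = apply(op, lhs, rhs);
        }
    }
};

inline long evaluate(std::string_view src) { return PrattEvaluator(src).parseExpression(0); }
```

For example, `evaluate("1 << 2 + 1")` groups as `1 << (2 + 1)` because addition sits at a higher level than shifts, and `evaluate("10 - 4 - 3")` groups as `(10 - 4) - 3` because operators on the same level associate to the left. A right-associative operator such as assignment would instead recurse with a binding power one lower than its own.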
## Postfix Operators
- `slice[index]` and `slice[start:end]` for indexing and slicing.
- `slice.*` to retrieve the raw pointer.
- `slice.#` to obtain length.
- `ptr.[len]` to form a slice from pointer + length.
- `value.member`, `value->member` for object and pointer member access.
- `value.@`, `Type::member.@attribute` for reflection.
- `Type::<T>{ ... }` for object literals (named initializers).
These suffixes can be chained, e.g.,
`optional_ptr->slice[other.# - 1].member_func(list.*, 2).data[0]`.
## Object Literals
```arti
Point::<T> {
.x = lhs->x * rhs,
.y = lhs->y * rhs
}
```
- Named initializer syntax `.field = expr` is used consistently to emphasize
readability and order independence.
## Reflection
```arti
foo.@;
Point::<u32>::x.@alignment;
Point::<u32>.@size;
```
- Reflection works on values, types, and struct members, returning metadata used
by introspection tools.
## Error Handling Expressions
- `Result` values are constructed with variant initializers (`Result::<void, i32>::Err{ -1 }`).
- Unwrapping happens in control-flow statements.
## AST Rendering
- `ast::toString` produces the Markdown AST dumps emitted by the CLI; these
align with the structures implied by the example program.
These behaviors are reflected in the AST output produced by the parser.

@ -1,142 +0,0 @@
# Language Overview
Summarizes the core syntax and semantics supported in the current
parser-focused phase of the language.
## Imports and Aliases
```arti
import std::memory;
import std::*;
import my_module::some_function;
import my_module::some_typename;
using mem = std::memory;
using malloc = mem::mem_alloc;
using my_type = my_module::some_typename;
using my_func = my_module::some_function;
```
- `import module::symbol;` brings a specific symbol into scope.
- `import module::*;` imports all direct children of `module` (not recursive).
- `using` introduces aliases for modules, types, or functions.
## Structs and Generics
```arti
struct Point<typename T> {
x: T,
y: T
}
struct Rectangle {
top: Point::<i32>,
bot: Point::<i32>
}
```
- Generic definitions use `<typename T>`.
- Instantiations require `::<>` (turbofish) to disambiguate from comparisons.
- Fields use `name: Type` syntax.
## Functions and Methods
```arti
fn meaning_of_life() -> i32 {
return 42;
}
fn scale<typename T>(lhs: *Point::<T>, rhs: T) -> Point::<T> {
return Point::<T> {
.x = lhs->x * rhs,
.y = lhs->y * rhs
};
}
fn add<typename T>(this *Point::<T>, other: *Point::<T>) {
this->x += other->x;
this->y += other->y;
}
```
- Return types follow the parameter list via `->`.
- Methods use `this <type>` as the first parameter, enabling both member and
free-function call styles.
## Enums and Variants
```arti
enum Result<typename T, typename E> {
Ok(T),
Err(E)
}
return Result::<void, i32>::Err{ -1 };
return Result::<void, i32>::Ok{};
```
- `Result` demonstrates tagged unions with data payloads.
- Variants initialize with braces, optionally containing payloads.
## Variables, Pointers, Qualifiers
```arti
let x: i32 = 10;
let answer = meaning_of_life();
def PI: f64 = 3.14159265358979;
def ptr: *i32 = &answer;
let mutable_pointer: *$i32 = &x;
let complex_pointer: *$*$i32 = &mutable_pointer;
let null_int: ?i32 = null;
```
- `let` for mutable, `def` for immutable bindings.
- Qualifiers `*`, `$`, `?` apply to the immediate type to the right and can be
combined to express rich pointer semantics.
## Slices and Literals
```arti
let arrSlice: ?[]i32 = []i32 { 2, 4, 6, 8, 10 };
let full = arrSlice[:];
let range = arrSlice[1:3];
let head = arrSlice[:2];
let tail = arrSlice[2:];
let memPtr = arrSlice.*;
let memLength = arrSlice.#;
let newSlice = memPtr.[memLength];
```
- `[]Type { ... }` constructs slice literals.
- Slicing syntax mirrors Python with optional start/end.
- Specialized suffixes:
- `expr.*` raw pointer;
- `expr.#` length;
- `ptr.[len]` create slice from pointer + length.
## Reflection
```arti
def refl_info = foo.@;
def xalign = Point::<u32>::x.@alignment;
def type_size = Point::<u32>.@size;
```
- `.@` yields metadata for values, types, or struct members.
- Attributes include `@alignment`, `@size`, `@typename`, `@offset`.
## Resource Management
```arti
defer cleanup();
errdefer { log_failure(); }
```
- `defer` schedules work at scope exit (LIFO order).
- `errdefer` runs only when the function returns an error variant.
These constructs appear throughout idiomatic `artichoke` code and are supported by
the current parser.

@ -1,80 +0,0 @@
# Pattern Unwrapping & Binding
`artichoke` supports unwrapping `Result` and optional values directly within
control-flow constructs. This section describes the available patterns.
## `if` / `else`
```arti
if (foo()) |ok_val| {
/* Ok branch */
} else |err_val| {
/* Err branch */
}
```
- Using `|name|` after the condition binds the success value (or error value in
the `else` branch) for the scope of that block.
- Works with any expression that evaluates to a `Result` or an optional (`?`) value.
## `while` Patterns
```arti
while (foo()) |ok_val| {
/* Loop continues while Ok */
} else |err_val| {
/* Executes on Err */
}
while (foo.next()) |item| {
/* Iterator-style loop until optional becomes empty */
}
```
- The first form keeps looping while the expression yields `Ok`.
- The iterator-style variant continues while the optional contains a value.
## `match` Cases
```arti
match (foo()) {
Result::<i32, []u8>::Ok |v| -> {
std::io::print("Success!");
}
_ -> { /* Default */ }
}
```
- Cases accept type expressions and optional bindings (`|v|`).
- `_` handles the default/remaining patterns.
## Range Loop Binding
```arti
for (let element := returns_range_function()) {
/* element is bound for each iteration */
}
```
- Range loops bind the element name chosen in the header.
## Labels
```arti
outer_loop := while (condition) {
inner_loop := for (let i = 0; i < 10; i += 1) {
if (i == 5) { break outer_loop; }
}
}
```
- Labels let you control nested loops using `break label;` or `continue label;`.
## Error Reporting
- When an unwrap clause is malformed (missing pipes, invalid identifier) the
parser emits diagnostics indicating the expected syntax, helping align code
with the documented patterns.
These patterns appear throughout typical `artichoke` code and are supported by the
current parser.

@ -1,368 +0,0 @@
# Grammar Reference
Formal grammar aligned with the current parser implementation.
```
/*
================================================================================
| |
| The Artichoke Programming Language |
| Official EBNF Grammar |
| |
================================================================================
*/
/* --- Program Structure --- */
/* A program is a sequence of top-level declarations and statements. */
<program> =
<declaration>*
<eof>
<declaration> =
"export" <exportable_declaration>
| <non_exportable_declaration>
<exportable_declaration> =
<module_statement>
| <struct_declaration>
| <enum_declaration>
| <function_declaration>
<non_exportable_declaration> =
<import_statement>
| <alias_statement>
| <module_statement>
| <struct_declaration>
| <enum_declaration>
| <function_declaration>
<module_statement> =
"module" <namespaced_identifier> "{"
( <module_statement>
| <alias_statement>
| <struct_declaration>
| <enum_declaration>
| <function_declaration> )*
"}"
<import_statement> =
"import" <import_target> ";"
<import_target> =
<namespaced_identifier> ( "::" "*" )?
<alias_statement> =
"using" <identifier> "=" <type> ";"
/* --- Declarations --- */
/* Rules for defining functions, structs, enums, and their components. */
<function_declaration> =
"fn" <identifier> <generic_params>? "(" <fn_params_list>? ")" ( "->" <type> )? <code_block>
<fn_params_list> =
"this" <type> ("," <fn_param> ( "," <fn_param> )* )?
| <fn_param> ( "," <fn_param> )*
<fn_param> =
<identifier> ":" <type>
<struct_declaration> =
"struct" <identifier> <generic_params>? "{" <struct_members> "}"
<struct_members> =
<struct_member> ( "," <struct_member> )*
<struct_member> =
<identifier> ":" <type>
<enum_declaration> =
"enum" <identifier> <generic_params>? "{" <enum_members> "}"
<enum_members> =
<enum_member> ( "," <enum_member> )*
<enum_member> =
<identifier> ( "(" <type> ")" )?
<generic_params> =
"<" <generic_params_list> ">"
<generic_params_list> =
<generic_param> ( "," <generic_param> )*
<generic_param> =
"typename" <identifier>
/* --- Statements & Control Flow --- */
/* Rules for code blocks, variable declarations, and control structures. */
<code_block> =
"{" <statement>* "}"
<statement> =
<variable_declaration> ";"
| <if_statement>
| <defer_statement> ";"
| <errdefer_statement> ";"
| <return_statement> ";"
| <break_statement> ";"
| <continue_statement> ";"
| <match_statement>
| <switch_statement>
| <loop_statement>
| <expression> ";"
<variable_declaration> =
<variable_declarator> <identifier> <variable_declaration_tail>
<variable_declaration_tail> =
":" <type> ( "=" <expression> )?
| "=" <expression>
<variable_declarator> =
"let"
| "def"
<if_statement> =
"if" "(" <expression> ")" <variable_unwrapper>? <code_block>
<else_statement>?
<else_statement> =
"else" <else_statement_tail>
<else_statement_tail> =
<if_statement>
| <variable_unwrapper>? <code_block>
<variable_unwrapper> =
"|" <identifier> "|"
<loop_statement> =
(<identifier> ":=")? (
<c_for_statement>
| <range_for_statement>
| <while_statement>
| <do_while_statement>
| <inf_loop_statement>
)
<c_for_statement> =
"for" "(" ( <variable_declaration> | <expression> )? ";" <expression> ";" <expression>? ")"
<code_block>
<range_for_statement> =
"for" "(" <variable_declarator> <identifier> ":=" <expression> ")"
<code_block>
<while_statement> =
"while" "(" <expression> ")" <variable_unwrapper>? <code_block>
<else_statement>?
<do_while_statement> =
"do" <code_block> "while" "(" <expression> ")"
<inf_loop_statement> =
"loop" <code_block>
<match_statement> =
"match" "(" <expression> ")" "{" <match_case>* <default_case>? "}"
<switch_statement> =
"switch" "(" <expression> ")" "{" <switch_case>* <default_case>? "}"
<match_case> =
<type_name> ( "|" <identifier> "|" )? "->" <code_block>
<switch_case> =
<expression> "->" <code_block>
<default_case> =
"_" "->" <code_block>
<break_statement> =
"break" <identifier>?
<continue_statement> =
"continue" <identifier>?
<defer_statement> =
"defer" ( <expression> | <code_block> )
<errdefer_statement> =
"errdefer" ( <expression> | <code_block> )
<return_statement> =
"return" <expression>?
/* --- Expressions & Operator Precedence --- */
/* The full expression hierarchy, from lowest to highest precedence. */
<expression> =
<bool_or_expression> ( ( <assign_op> | <compound_assign_op> ) <expression> )?
<bool_or_expression> =
<bool_and_expression> ( ( "||" | "or" ) <bool_and_expression> )*
<bool_and_expression> =
<compare_expression> ( ( "&&" | "and" ) <compare_expression> )*
<compare_expression> =
<bitwise_expression> ( <compare_op> <bitwise_expression> )?
<bitwise_expression> =
<bitwise_shift_expression> ( <bitwise_op> <bitwise_shift_expression> )*
<bitwise_shift_expression> =
<addition_expression> ( <bitshift_op> <addition_expression> )*
<addition_expression> =
<multiply_expression> ( <addition_op> <multiply_expression> )*
<multiply_expression> =
<prefix_expression> ( <multiply_op> <prefix_expression> )*
<prefix_expression> =
<prefix_op>* <postfix_expression>
<postfix_expression> =
<primary_expression> ( <suffix_op> | <fn_call_arguments> )*
/* --- Primary Expressions & Literals --- */
/* The highest-precedence expressions, including literals and grouped expressions. */
<primary_expression> =
<grouped_expression>
| <literal>
| <type_initiated_literal>
| <access_expression> ( "{" <struct_literal_body> "}" )?
<access_expression> =
<identifier> ( "::" "<" <types_list> ">" )?
<type_initiated_literal> =
<type> "{" <struct_literal_body> "}"
<literal> =
<char_literal>
| <null_literal>
| <string_literal>
| <number_literal>
| <boolean_literal>
<grouped_expression> =
"(" <expression> ")"
<fn_call_arguments> =
"(" <expression_list> ")"
<expression_list> =
(<expression> ",")* <expression>?
<struct_literal_body> =
( <named_field_list> | <positional_field_list> )? ","?
<named_field_list> =
<named_field_init> ( "," <named_field_init> )*
<named_field_init> =
"." <identifier> "=" <expression>
<positional_field_list> =
<expression> ( "," <expression> )*
<null_literal> =
"null"
<boolean_literal> =
"true"
| "false"
<number_literal> = /* Assumed to be defined by the tokenizer */
<string_literal> = /* Assumed to be defined by the tokenizer */
<char_literal> = /* Assumed to be defined by the tokenizer */
/* --- Operators --- */
/* Definitions for all operator token sets. */
<assign_op> = "="
<compound_assign_op> = "+=" | "-=" | "*=" | "/=" | "%=" | "&=" | "|=" | "<<=" | ">>=" | "||=" | "&&="
<compare_op> = "==" | "!=" | ">" | "<" | ">=" | "<="
<bitwise_op> = "&" | "^" | "|"
<bitshift_op> = "<<" | ">>"
<addition_op> = "+" | "-"
<multiply_op> = "*" | "/" | "%"
<prefix_op> = "!" | "-" | "~" | "&" | "*"
<suffix_op> =
"[" <array_access_tail>
| "." <identifier>
| "::" <identifier> ( "::" "<" <types_list> ">" )?
| "->" <identifier>
| ".@" <identifier>?
| ".[" <expression> "]"
| ".#"
| ".*"
<array_access_tail> =
<expression>? <slice_or_index_tail>
| ":" <expression>? "]"
<slice_or_index_tail> =
"]"
| ":" <expression>? "]"
/* --- Type System --- */
/* Rules for defining types, type names, and type qualifiers. */
<type> =
<type_qualifier_chain>? <type_name>
<type_qualifier_chain> =
( "*" | "[]" ) <type_qualifier_chain>?
| "$" <type_qualifier_chain_after_mutable>?
| "?" <type_qualifier_chain_after_optional>?
<type_qualifier_chain_after_optional> =
( "*" | "[]" ) <type_qualifier_chain>?
| "$" <type_qualifier_chain_after_mutable>?
<type_qualifier_chain_after_mutable> =
( "*" | "[]" ) <type_qualifier_chain>?
| "?" <type_qualifier_chain_after_optional>?
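/* Illustrative derivations (an editorial reading of the three chain rules
   above, not part of the original rule set): the chains in `*$i32`, `?*i32`,
   `[]?$i32` and `$?*i32` are accepted, while `$$i32` and `??i32` are
   rejected, because "$" can never directly follow "$" and "?" can never
   directly follow "?". */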
<type_name> =
<access_expression> ( "::" <identifier> ( "::" "<" <types_list> ">" )? )*
<namespaced_identifier> =
<identifier> ( "::" <identifier> )*
<types_list> =
<type> ( "," <type> )*
/* --- Lexical Tokens & Base Definitions --- */
/* The lowest-level building blocks of the language. */
<identifier> =
<nondigit> <identifier_tail>
<identifier_tail> =
<empty>
| <nondigit> <identifier_tail>
| <digit> <identifier_tail>
<nondigit> = "_" | [a-z] | [A-Z]
<digit> = <zero> | <nonzero_digit>
<zero> = "0"
<nonzero_digit> = [1-9]
<empty> = E /* Represents an empty terminal string */
<eof> = /* End Of File */
```
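
The layered rules from `<bool_or_expression>` down to `<multiply_expression>` amount to a table of binary-operator precedence levels. The following is a minimal Python sketch of a table-driven recursive descent over those levels. It assumes the input is already tokenized into plain strings and treats every operand as an opaque atom standing in for `<prefix_expression>`. The names `LEVELS`, `NON_CHAINING`, and `parse`, and the tuple-based tree, are hypothetical and are not taken from the project's parser.

```python
# Binary-operator levels, lowest precedence first, mirroring the grammar
# from <bool_or_expression> down to <multiply_expression>.
LEVELS = [
    {"||", "or"},
    {"&&", "and"},
    {"==", "!=", ">", "<", ">=", "<="},
    {"&", "^", "|"},
    {"<<", ">>"},
    {"+", "-"},
    {"*", "/", "%"},
]
# <compare_expression> uses "?" instead of "*", so it must not chain.
NON_CHAINING = {2}


def parse(tokens):
    """Parse a flat token list into nested (op, lhs, rhs) tuples."""
    pos = 0

    def level(i):
        nonlocal pos
        if i == len(LEVELS):
            if pos >= len(tokens):
                raise SyntaxError("unexpected end of input")
            atom = tokens[pos]  # opaque stand-in for <prefix_expression>
            pos += 1
            return atom
        left = level(i + 1)
        while pos < len(tokens) and tokens[pos] in LEVELS[i]:
            op = tokens[pos]
            pos += 1
            left = (op, left, level(i + 1))  # left-associative fold
            if i in NON_CHAINING:
                break
        return left

    tree = level(0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree
```

Under these assumptions, `parse(["a", "+", "b", "*", "c"])` groups the multiplication first, and a chained comparison such as `a < b < c` is rejected because the comparison level accepts at most one operator.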

@ -1,31 +0,0 @@
# Token Reference
Token definitions used by the tokenizer.
## Literals
- `tkInteger` - decimal integers (`10`, `42`).
- `tkDecimal` - floating-point literals (`3.14159265358979`).
- `tkString` - double-quoted strings.
- `tkCharacter` - character literals.
- `tkIdentifier` - identifiers and names.
- `tkEOF` - end-of-file sentinel.
## Operators
Includes all operator tokens such as `::`, `->`, `.[`, `.#`, `.*`, `.@`, and the
compound assignments (`+=`, etc.).
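
A common way to tokenize overlapping operator spellings is longest-match ("maximal munch") scanning. The sketch below is a hypothetical Python illustration, not the project's tokenizer. The `OPERATORS` table is assembled from the grammar's operator rules and may not match the real token table, and `scan_operator` is an assumed helper name.

```python
# Operator spellings taken from the grammar's operator rules; the list is
# illustrative and may not match the real tokenizer's table exactly.
OPERATORS = [
    "<<=", ">>=", "||=", "&&=",
    "+=", "-=", "*=", "/=", "%=", "&=", "|=",
    "==", "!=", ">=", "<=", "<<", ">>", "||", "&&",
    "::", "->", ".[", ".#", ".*", ".@",
    "=", ">", "<", "+", "-", "*", "/", "%", "&", "^", "|", "!", "~", ".",
]
# Try longer spellings first so that ">>=" wins over ">>" and ">".
BY_LENGTH = sorted(OPERATORS, key=len, reverse=True)


def scan_operator(source, pos):
    """Return the longest operator starting at source[pos], or None."""
    for op in BY_LENGTH:
        if source.startswith(op, pos):
            return op
    return None
```

For example, scanning `a>>=b` at index 1 yields `>>=` rather than `>>` or `>`, and scanning `p.*` at index 1 yields `.*` rather than `.`.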
## Keywords
`import`, `using`, `struct`, `enum`, `fn`, `return`, `let`, `def`, `if`, `else`,
`for`, `while`, `do`, `loop`, `match`, `switch`, `defer`, `errdefer`, `true`,
`false`, `null`, `typename`, `this`, `_` (default pattern), and logical aliases
`or`, `and`.
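
One way to recognize the keywords listed above in a single pass is a trie keyed by character. The following Python sketch assumes the word has already been scanned as an identifier-shaped lexeme. `build_trie`, `classify`, and the `END` marker are hypothetical names, and the real tokenizer may be organized differently.

```python
KEYWORDS = [
    "import", "using", "struct", "enum", "fn", "return", "let", "def",
    "if", "else", "for", "while", "do", "loop", "match", "switch",
    "defer", "errdefer", "true", "false", "null", "typename", "this",
    "_", "or", "and",
]
END = object()  # marker key: a keyword ends at this node


def build_trie(words):
    """Build a nested-dict trie with one level per character."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = word
    return root


TRIE = build_trie(KEYWORDS)


def classify(lexeme):
    """Return 'keyword' or 'identifier' for an already-scanned word."""
    node = TRIE
    for ch in lexeme:
        node = node.get(ch)
        if node is None:
            return "identifier"
    return "keyword" if END in node else "identifier"
```

Because the walk consumes each character once, `def` and `defer` are both classified as keywords while `deferred` and `_x` fall through to identifiers.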
## Notes
- Keyword recognition uses a trie to ensure single-pass tokenization.
- Multi-character operators (`::`, `.[`, `.#`, `.*`, `.@`, `>>=`) rely on
lookahead logic.
- Generic parsing may split `>>` into `>` + `>` to disambiguate nested generic closers.
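
The `>>` splitting mentioned in the last note can be sketched as a pass over a flat token list. This Python example is an assumption-laden illustration: it treats a generic list as opening with the two-token sequence `::`, `<`, tracks nesting depth, and only splits `>>` when at least two lists are open. `split_closers` is a hypothetical helper, and the real parser may perform the split lazily during parsing instead.

```python
def split_closers(tokens):
    """Split '>>' into two '>' tokens while inside nested generic lists.

    `tokens` is a flat list of token strings. A generic list is assumed to
    open with the two-token sequence '::', '<'.
    """
    out = []
    depth = 0  # number of currently open generic lists
    for i, tok in enumerate(tokens):
        if tok == "<" and i > 0 and tokens[i - 1] == "::":
            depth += 1
            out.append(tok)
        elif tok == ">" and depth > 0:
            depth -= 1
            out.append(tok)
        elif tok == ">>" and depth >= 2:
            depth -= 2
            out.extend([">", ">"])
        else:
            out.append(tok)  # includes '>>' used as a shift operator
    return out
```

With this sketch, the tokens of `Result::<Point::<i32>>` end in two separate `>` closers, while `a >> b` outside any generic list keeps its shift operator.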

@ -1,240 +0,0 @@
/* ============================================================================
* `artichoke` - Language Overview & Technical Specification
* ============================================================================
* This file serves as an exhaustive showcase of the Artichoke syntax and
* grammar rules as of the 2025 parser-stabilization phase.
*/
/* --- Imports ---
* Imports allow the compiler to pull symbols from other modules into the
* current scope.
*/
import std::memory; /* Import the entire 'memory' module from 'std' */
import std::*; /* Wildcard: Import all direct child elements of 'std'.
Note: This does not recursively import submodules. */
import my_module::some_function; /* Import a specific function symbol */
import my_module::some_typename; /* Import a specific type symbol */
/* --- Type & Symbol Aliasing ---
* The `using` keyword creates an alias for an existing symbol. This is
* useful for shortening long namespaced paths or creating domain-specific
* names for primitive types.
*/
using mem = std::memory;
using malloc = mem::mem_alloc;
using my_type = my_module::some_typename;
using my_func = my_module::some_function;
/* --- Structs & Generics (Definition) ---
* Generics at the definition level use a "clean" `<typename T>` syntax.
* The Turbofish is NOT used here, only at the call site/instantiation.
*/
struct Point<typename T> {
x: T,
y: T
}
struct Rectangle {
/* Usage of a generic type REQUIRES the Turbofish `::<>`.
This disambiguates between the 'less-than' operator and a generic list
within the Pratt expression parser. */
top: Point::<i32>,
bot: Point::<i32>
}
/* --- Functions ---
* Return types follow the `->` operator.
*/
fn meaning_of_life() -> i32 {
return 42;
}
/* --- Named Initializers ---
* Structs/Objects are initialized using the designated initializer syntax:
* `.field = expression`. This makes the code self-documenting and order-independent.
*/
fn scale<typename T>(lhs: *Point::<T>, rhs: T) -> Point::<T> {
return Point::<T> {
.x = lhs->x * rhs,
.y = lhs->y * rhs
};
}
/* --- Member Functions (Methods) ---
* A function becomes a "member function" if the first parameter follows
* the specific syntax: `this` <type>.
* Note: No colon is required after the `this` keyword.
* These can be called as: `expr.fn_name(params...)`
* or as standard functions: `fn_name(expr, params...)`.
*/
fn add<typename T>(this *$Point::<T>, other: *Point::<T>) {
this->x += other->x;
this->y += other->y;
}
/* --- Tagged Enums (Sum Types) ---
* Enums can hold data. The standard library provides a `Result<T, E>`
* for robust, explicit error handling.
*/
enum Result<typename T, typename E> {
Ok(T),
Err(E)
}
/* --- The Main Entry Point ---
* Slices are represented as `[]type`. Argv here is a slice of slices of chars.
*/
fn main(argc: i32, argv: [][]char) -> Result::<void, i32> {
if (argc < 2) {
/* To initialize an enum with data, use a positional initializer. */
return Result::<void, i32>::Err{ -1 };
}
/* --- Variable Declarations ---
* `let` -> Mutable variable (can be reassigned).
* `def` -> Constant variable (immutable after initialization).
* Type inference is supported when the right-hand side is unambiguous.
*/
let x: i32 = 10;
let do_you_get_it = meaning_of_life();
def PI: f64 = 3.14159265358979;
/* --- Pointers, Mutability, and Optionals ---
* The following type qualifiers are used to "extend" a type:
* `*` -> Pointer (Non-nullable by default).
* `$` -> Mutability qualifier.
* `?` -> Nullable/Optional qualifier.
* Qualifiers apply to the immediate element to their right.
*/
def ptr: *i32 = &do_you_get_it; /* Const pointer to immutable i32 */
let mutable_pointer: *$i32 = &x; /* Mutable pointer to mutable i32 */
/* A mutable (`let`) pointer (*) to a mutable pointer ($*) to a mutable i32 ($i32) */
let complex_pointer: *$*$i32 = &mutable_pointer;
/* Nullable types allow the `null` literal.
Mixing qualifiers allows for complex types like "Nullable pointer to mutable i32". */
let null_int: ?i32 = null;
def nullable_float: ?f64 = PI;
/* --- Slices and Array Literals ---
* Slices are the primary way to handle contiguous memory.
* `<type> { ... }` is a type-initiated literal (anonymous construction).
* If the type is a slice type, it is a standard array literal.
*/
let arrSlice: ?[]i32 = []i32 { 2, 4, 6, 8, 10 };
/* --- Slicing Suffixes ---
* `artichoke` supports Python-like slicing with optional boundaries.
*/
let full = arrSlice[:]; /* Full range copy (can be omitted) */
let range = arrSlice[1:3]; /* From index 1 to 2 (exclusive of 3) */
let head = arrSlice[:2]; /* Everything before index 2 */
let tail = arrSlice[2:]; /* Everything from index 2 to the end */
/* --- Specialized Suffix Operators ---
* `.*` -> Unwraps a slice into its raw underlying pointer.
* `.#` -> Retrieves the length of the slice.
* `.[length]` -> Converts a raw pointer into a slice of the given length.
*/
def memPtr = arrSlice.*;
def memLength = arrSlice.#;
def newSlice = memPtr.[memLength];
/* --- Control Flow & Error Unwrapping ---
* If statements can "unwrap" Result types using the `|val|` syntax.
*/
if (foo()) |ok_val| {
/* If foo() returned Ok(T), ok_val is available here as type T */
} else |err_val| {
/* If foo() returned Err(E), err_val is available here as type E */
}
/* --- While patterns ---
* While statements can also "unwrap" Result types using the `|val|` syntax.
* The while loop keeps running until an Err result is returned, which is
* unwrapped into the `err_val` variable if the else branch names one.
*/
while (foo()) |ok_val| {
/* If foo() returned Ok(T), ok_val is available here as type T and the loop
* will continue */
} else |err_val| {
/* If foo() returned Err(E), err_val is available here as type E */
}
/* Normal condition-based while loops are also supported.
* The other powerful while pattern that `artichoke` provides is an
* "iterator"-based while loop: if the condition expression returns an
* optional (`?`), the loop can unwrap the value using the `|val|` syntax and
* continues until an empty optional is returned.
*/
while (foo.next()) |ok_val| {
/* If foo.next() returned a value, ok_val is available here as type T and the
* loop will continue */
}
/* --- Match & Switch ---
* `match` -> Used for complex Pattern Matching (Types, Enums).
* `switch` -> Used for Value-based branching.
* Both must be exhaustive or include a default case `_`.
*/
match (foo()) {
Result::<i32, []u8>::Ok |v| -> {
std::io::print("Success!");
}
_ -> { /* Default case */ }
}
/* --- Loops & Labels ---
* Loops can be named using the `name :=` syntax to allow breaking/continuing
* from nested contexts.
* The loop types that `artichoke` provides are:
* * `while` loops
* * `do-while` loops
* * `for` loops with a C-style syntax
* * range `for` loops
* * infinite `loop` loops
*/
outer_loop := while (condition) {
inner_loop := for (let i = 0; i < 10; i += 1) {
if (i == 5) { break outer_loop; }
}
}
/* `do-while` loop, no else branch is allowed here */
do {
/* Statements */
} while (true);
/* `for` loop with a C-style syntax */
for (let i = 0; i < 10; i += 1) {
/* Statements */
}
/* range `for` loop, type is deduced from expression */
for (let element := returns_range_function()) {
/* Statements */
}
/* --- Resource Management ---
* `defer` -> Executes at the end of the scope in reverse order.
* `errdefer` -> Executes only if the function returns an Err variant.
*/
defer cleanup();
errdefer { log_failure(); }
/* --- Compile-Time Reflection ---
* The `.@` operator provides access to metadata.
*/
/* Returns a struct with compile-time information */
def refl_info = foo.@;
/* Specific properties like alignment, size, offset, and name are allowed too */
def xalign = Point::<u32>::x.@alignment;
def type_size = Point::<u32>.@size;
return Result::<void, i32>::Ok{};
}

@ -24,7 +24,7 @@
| <enum_declaration>
| <function_declaration>
<non_exportable_declaration> =
non_exportable_declaration =
<import_statement>
| <alias_statement>
| <module_statement>
@ -234,14 +234,10 @@
<primary_expression> =
<grouped_expression>
| <literal>
| <type_initialized_literal>
| <access_expression> ( "{" <struct_literal_body> "}" )?
<access_expression> =
<identifier> ( "::" "<" <types_list> ">" )?
<type_initiated_literal> =
<type> "{" <struct_literal_body> "}"
<identifier> ( "<" <types_list> ">" )?
<literal> =
<char_literal>
@ -266,7 +262,7 @@
<named_field_init> ( "," <named_field_init> )*
<named_field_init> =
"." <identifier> "=" <expression>
<identifier> ":" <expression>
<positional_field_list> =
<expression> ( "," <expression> )*
@ -298,7 +294,7 @@
<suffix_op> =
"[" <array_access_tail>
| "." <identifier>
| "::" <identifier> ( "::" "<" <types_list> ">" )?
| "::" <identifier> ( "<" <types_list> ">" )?
| "->" <identifier>
| ".@" <identifier>?
| ".[" <expression> "]"
@ -306,7 +302,7 @@
| ".*"
<array_access_tail> =
<expression>? <slice_or_index_tail>
<expression> <slice_or_index_tail>
| ":" <expression>? "]"
<slice_or_index_tail> =
@ -333,10 +329,10 @@
| "?" <type_qualifier_chain_after_optional>?
<type_name> =
<access_expression> ( "::" <identifier> ( "::" "<" <types_list> ">" )? )*
<access_expression> ( "::" <identifier> ( "<" <types_list> ">" )? )*
<namespaced_identifier> =
<identifier> ( "::" <identifier> )*
<identifier> ( "::" identifier )*
<types_list> =
<type> ( "," <types_list> )*