ll-lang Compiler Developer Guide

For contributors working on the ll-lang compiler itself. This guide assumes you can read F# comfortably. The compiler is roughly 3.8k lines across the source tree; HM inference is handwritten, and the parser front end is built on FParsec.

Contents

Architecture overview — pipeline, project layout, F# compile order
Lexer — tokens, INDENT/DEDENT synthesis, position tracking
Parser — recursive descent, expression precedence, quirks
Elaborator — declared-type checking, E001-E005, exhaustiveness
H-M inference — Algorithm W, Subst, unify, generalize/instantiate, occurs check
Code generation — F# emission, [<EntryPoint>], temp-project dotnet run execution path
Testing — xUnit layout, helpers, corpus drivers
Adding an error code — end-to-end walkthrough
Self-hosting roadmap — historical Phase 7 record
MCP server — embedded compiler tooling for LLM clients
Multi-target codegen — backend contracts and target semantics
v2 language architecture — target design for pure ll-lang core
v2 implementation roadmap — tracked execution plan with done/not-done checklists
v2 canonical compiler boundaries — subsystem ownership map and migration targets
v2 pass contracts — explicit input/output contracts for canonical compiler phases
v2 project system execution — implementer-facing breakdown of the canonical manifest/resolver/lock/vendor lifecycle
v2 stdlib foundation execution — implementer-facing breakdown of the self-hosting foundation stdlib, including post-milestone clarification questions
v2 compiler boundaries execution — implementer-facing breakdown of canonical subsystem ownership and pass-boundary enforcement
v2 syntax ergonomics execution — implementer-facing breakdown of operator, precedence, and compiler-heavy syntax cleanup
v2 self-host transition execution — implementer-facing breakdown of promoting ll-lang to canonical compiler path
v2 llm operating system execution — implementer-facing breakdown of MCP, prompt packs, and machine-readable authoring workflows
v2 benchmarks and release gates execution — implementer-facing breakdown of evidence, benchmarks, and release-blocking gates

Repository layout

ll-lang/
├── spec/
│   ├── grammar.ebnf              formal grammar
│   ├── type-system.md            H-M rules, tag system, phantom types
│   ├── error-codes.md            E001..E008 catalog
│   └── examples/
│       ├── valid/                corpus of working programs
│       └── invalid/              programs with expected error codes
├── src/
│   ├── LLLangCompiler/           compiler library (F#)
│   │   ├── Token.fs              Tok type
│   │   ├── Lexer.fs              tokenizer with layout
│   │   ├── AST.fs                untyped surface AST
│   │   ├── Parser.fs             legacy recursive-descent parser (parity/fallback)
│   │   ├── FParsecParser.fs      strict parser (primary path)
│   │   ├── Elaborator.fs         name resolution, E001-E005, exhaustiveness
│   │   ├── Types.fs              TypeScheme, Subst, generalize, instantiate
│   │   ├── TypedAST.fs           typed AST after inference
│   │   ├── HMInfer.fs            Algorithm W, unify, trait dispatch
│   │   ├── Codegen.fs            F# source emitter
│   │   ├── Compiler.fs           pipeline entry point
│   │   └── LLLangCompiler.fsproj
│   └── LLLangTool/               lllc CLI (build/run commands)
│       ├── Program.fs
│       └── LLLangTool.fsproj
├── tests/
│   └── LLLangTests/              xUnit suite (see CI for current count)
│       ├── LexerTests.fs           RealLexerTests.fs
│       ├── ParserTests.fs          ArithmeticParserTests.fs
│       │                           TypeParserTests.fs   FnParserTests.fs
│       │                           ExprParserTests.fs   ModuleParserTests.fs
│       ├── ElaboratorTests.fs      ElaboratorRealTests.fs
│       ├── HMInferTests.fs         HMInferRealTests.fs
│       ├── CodegenTests.fs         CodegenRealTests.fs
│       ├── PipelineRealTests.fs
│       ├── StdlibTests.fs
│       └── BootstrapCompilerTests.fs   bootstrap compiler corpus
├── docs/                         user guide + compiler-dev guide (this tree)
└── README.md

Build and test

dotnet build                      # all three projects
dotnet test                       # run xUnit suite (see CI for current count)

The compiler library targets net10.0 with LangVersion=preview and Nullable=enable, and depends on FParsec for strict parsing. Tests depend on xunit 2.6.3 and Microsoft.NET.Test.Sdk 17.8.0.

Conventions

Parser stack: strict mode uses FParsecParser; the legacy recursive-descent parser is retained for parity checks and fallback diagnostics.
Operator defaults (self-host path): canonical baseline fixities are declared in stdlib/src/Operators.lll (Std.Operators) and consumed by the table-driven self-host parser/resolver flow.
No mutable global state. Inference uses a small InferState record passed through the tree walk.
Errors are collected, not raised. Compiler functions return Result<T, LLError list> and never throw on a type error.
Examples are the source of truth. Every feature must have a valid corpus entry in spec/examples/valid/ and each error code must have an invalid corpus entry in spec/examples/invalid/.
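The state-passing and error-collection conventions above can be sketched roughly as follows. This is an illustration only, not the compiler's actual code: the fields of LLError and InferState, the helpers freshVar, collectAll and checkName, and the "E-demo" code are all hypothetical names invented for this example. Only the Result<T, LLError list> return shape and the idea of an InferState record passed through the walk come from the conventions themselves.

```fsharp
// Hypothetical shapes for illustration; the real definitions live in the compiler sources.
type LLError = { Code: string; Message: string }

// Assumed stand-in for InferState: here it only tracks a fresh-variable counter.
type InferState = { NextVar: int }

// Return a fresh type-variable name together with the updated state (no mutation).
let freshVar (st: InferState) : string * InferState =
    (sprintf "t%d" st.NextVar, { st with NextVar = st.NextVar + 1 })

// Run a check over every item, collecting all errors instead of stopping at the first.
let collectAll (check: 'a -> Result<'b, LLError list>) (items: 'a list) : Result<'b list, LLError list> =
    let folder (oks, errs) item =
        match check item with
        | Ok v -> (v :: oks, errs)
        | Error es -> (oks, errs @ es)
    let oks, errs = List.fold folder ([], []) items
    if List.isEmpty errs then Ok (List.rev oks) else Error errs

// Toy check: a name is accepted only if it is in the bound set.
// "E-demo" is a placeholder, not a code from the real error catalog.
let checkName (bound: Set<string>) (name: string) : Result<string, LLError list> =
    if Set.contains name bound then Ok name
    else Error [ { Code = "E-demo"; Message = sprintf "unbound name '%s'" name } ]

// Usage: both failing names are reported in a single Error value.
let result = collectAll (checkName (set [ "x"; "y" ])) [ "x"; "a"; "b" ]

// Usage: the state is threaded explicitly, so each call sees the previous counter.
let v0, st1 = freshVar { NextVar = 0 }
let v1, _ = freshVar st1
```

Accumulating into a list rather than short-circuiting lets a single compiler run report every diagnostic it finds, and returning the updated InferState from each helper keeps the walk free of mutable globals.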