Elaborator¶
File: src/LLLangCompiler/Elaborator.fs (~630 lines).
The elaborator is a pre-inference pass that does:
- Module-level rewrites:
TyApp(T, TyVar t)for any declaredtag tis rewritten toTyTagged(T, UName t). - Name resolution (module-level).
- Declared-type checking (
E001,E002,E004,E005). - Exhaustiveness of pattern matches (
E003).
It runs after parsing and before H-M inference. Its signature is:
val elaborate : PosMap -> LLModule -> Result<LLModule * TypeEnv, LLError list>
On success it returns both the rewritten module and the enriched
TypeEnv : Map<string, TypeExpr>. The TypeEnv becomes the initial
environment for HMInfer.fromElaboratorEnv, and the PosMap is
threaded in so error emitters can look up real line:col.
Core types¶
type TypeEnv = Map<string, TypeExpr>
type ErrorCode = E001 | E002 | E003 | E004 | E005 | E006 | E008
type LLError = {
Code: ErrorCode
Line: int
Col: int
Message: string // compact format
}
LLError is shared with the H-M pass. E007 PlatformMismatch is
reserved in the design spec but has no runtime path.
Three passes¶
Pass 1: collectDecls¶
Walks every Decl and populates the TypeEnv. Starts with builtinEnv,
which covers all of the Phase 6 stdlib as well as the built-in operators:
let arithOps = [ "+"; "-"; "*"; "/" ]
let cmpOps = [ "=="; "!="; "<"; ">"; "<="; ">=" ]
// Each arith op: a -> a -> a
// Each cmp op: a -> a -> Bool
// printfn / print: Str -> Unit
// math: abs, absf, sqrt, min, max
// list: listLen, listMap, listFilter, listFold,
// listHead, listTail, listReverse, listAppend,
// listConcat, listIsEmpty, listAt
// maybe: maybeMap, maybeBind, maybeWithDefault
// result: resultMap, resultBind, resultMapErr
// str / char: strLen, strConcat, strTrim, strContains,
// strToInt, strChars, charToInt, intToChar,
// intToStr, strSlice, strIndexOf, strSplit,
// strFromChars, strReverse,
// charIsDigit, charIsAlpha, charIsSpace
// file IO: readFile, writeFile, fileExists
// process: exit
Arith and comparison ops are wildcard-polymorphic (TyVar "a"), which
lets the elaborator accept them without triggering E002 — they aren't
declared in source. The matching runtime bindings for the stdlib live
in Codegen.fsharpPreludeCore (plus the conditional fsharpPreludeMaybe
and fsharpPreludeResult blocks, emitted only when the user declares
type Maybe / type Result).
For each declaration:
DFn(sig, _body)— add the curried function type to the env.DLet(name, expr)— inspect the literal-only initializer to infer a shallow type (int/float/str/char, bool via the KwTrue/KwFalse surface, or their tagged variants). Anything more complex becomesTyVar "?"(a wildcard) and is refined by HMInfer.DLetPat(pat, _expr)— bind every name introduced by the pattern asTyVar "?"; HMInfer refines to concrete types.DType(name, params, TBSum ctors)— add one entry per constructor, with a curried function type from the constructor args to the fully applied type (e.g.type Maybe A = Some A | NonegivesSome : A -> Maybe[A]andNone : Maybe[A]).DImpl(trait, type, fns)— add each impl method under a mangled namemethodName_TypeName.DTag,DUnit,DTrait, record types, and wrapped types do not add entries at this stage.
Pass 2: checkDecls (via typeOf)¶
For each DFn, extend the env with the function's parameters (typed
from the signature, normalized via normalizeFnTy which converts
single-uppercase TyName A to TyVar A), then walk the body with
typeOf.
typeOf : Expr -> TypeEnv -> TypeExpr * LLError list walks the
expression and:
- Returns the literal type for each
ELit. - Looks up
EVar/EConand emitsE002 UnboundVaron miss. - At each
EApp(f, arg), extracts the parameter type fromf'sTyFn(paramType, returnType)and compares it againstargType. - On mismatch, calls
classifyMismatch.
The key equality check is tyEqual, which treats TyVar _ as a
wildcard — so fully polymorphic arithmetic ops (+, -, etc.) accept
anything. This is intentional: the elaborator only catches mismatches
involving fully known types.
classifyMismatch — triaging E001 / E004 / E005¶
Given a parameter type and an argument type, decide which error code applies:
match asTagged paramType, asTagged argType with
| Some(pBase, pTag), Some(aBase, aTag) when tyEqual pBase aBase && pTag <> aTag ->
// Same base, different unit/tag
match pBase with
| TyName "Float" | TyName "Int" -> e004 ... // UnitMismatch
| _ -> e001 ... // TypeMismatch (e.g. Str[UserId] vs Str[Email])
| Some(pBase, _), None when tyEqual argType pBase ->
// Arg has the right base but no tag → E005
e005 ...
| _ ->
e001 ...
Priorities:
- Numeric tag mismatch →
E004 UnitMismatch. - Non-numeric tag mismatch →
E001 TypeMismatch. - Untagged value where tagged expected →
E005 TagViolation. - Everything else →
E001.
Pass 3: exhaustivenessCheck¶
Builds a map typeToCtors : Map<string, string list> from every
DType(name, _, TBSum ctors). For each DFn whose first parameter is
a TyName T in that map, recursively collects every EMatch block in
the body via collectMatches and checks whether each block covers
every constructor of T.
for c in requiredCtors do
if not (List.contains c coveredCtors) then
yield e003 0 0 typeName c
collectMatches is a structural recursion that descends into EApp,
EIf, ELet, EPipe, EList, ETuple, ETagged, and other
EMatch bodies — but explicitly not into ELam (lambdas get their own
scope and the first-param heuristic doesn't apply).
Known limitations¶
- The first-param heuristic only catches exhaustiveness when the sum type is the first parameter of the function. Multi-scrutinee matches are not handled.
- Wildcard
_is not yet treated as a catch-all in exhaustiveness — if you write| _ -> default, the check still complains about missing constructors unless one of them is also named. This is a known bug tracked for fix. - Line/col info for most elaborator errors is now threaded through the
parser's
PosMapside-table (look forposOf pm nodeintypeOf). A few synthesized error sites still fall back to0:0when the AST node has no recorded position.
Bridge to H-M¶
The elaborator output is the input to HMInfer.fromElaboratorEnv,
which converts TypeEnv (raw TypeExpr) into H-M Env
(Map<Ident, TypeScheme>). Each type is inspected for rigid
(non-$-prefixed) TyVars, which are collected into the scheme's
quantifier list. Flex vars are not expected in elaborator output.
let fromElaboratorEnv (e3: LLLang.Elaborator.TypeEnv) : Env =
let collectRigid t = ...
Map.map (fun _ ty ->
let rigids = collectRigid ty |> Set.toList |> List.sort
{ Vars = rigids; Body = ty }
) e3
This is how type Maybe A = Some A | None ends up with
Some : ∀A. A -> Maybe[A] in the H-M environment.
Tests¶
tests/LLLangTests/ElaboratorTests.fs covers:
- Each error code has a positive test that expects it from a specific invalid snippet.
- Valid programs from
spec/examples/valid/are elaborated without errors. classifyMismatchtriage: numeric vs non-numeric tag mismatches, and untagged-where-tagged cases.
Run:
dotnet test --filter ElaboratorTests