Elaborator¶

File: src/LLLangCompiler/Elaborator.fs (~630 lines).

The elaborator is a pre-inference pass that does:

Module-level rewrites: TyApp(T, TyVar t) for any declared tag t is rewritten to TyTagged(T, UName t).
Name resolution (module-level).
Declared-type checking (E001, E002, E004, E005).
Exhaustiveness of pattern matches (E003).

It runs after parsing and before H-M inference. Its signature is:

val elaborate : PosMap -> LLModule -> Result<LLModule * TypeEnv, LLError list>

On success it returns both the rewritten module and the enriched TypeEnv : Map<string, TypeExpr>. The TypeEnv becomes the initial environment for HMInfer.fromElaboratorEnv, and the PosMap is threaded in so error emitters can look up real line:col.

Core types¶

type TypeEnv = Map<string, TypeExpr>

type ErrorCode = E001 | E002 | E003 | E004 | E005 | E006 | E008

type LLError = {
    Code: ErrorCode
    Line: int
    Col: int
    Message: string  // compact format
}

LLError is shared with the H-M pass. E007 PlatformMismatch is reserved in the design spec but has no runtime path.

Three passes¶

Pass 1: `collectDecls`¶

Walks every Decl and populates the TypeEnv. Starts with builtinEnv, which covers all of the Phase 6 stdlib as well as the built-in operators:

let arithOps = [ "+"; "-"; "*"; "/" ]
let cmpOps   = [ "=="; "!="; "<"; ">"; "<="; ">=" ]
// Each arith op: a -> a -> a
// Each cmp op:   a -> a -> Bool
// printfn / print:      Str -> Unit
// math:                 abs, absf, sqrt, min, max
// list:                 listLen, listMap, listFilter, listFold,
//                       listHead, listTail, listReverse, listAppend,
//                       listConcat, listIsEmpty, listAt
// maybe:                maybeMap, maybeBind, maybeWithDefault
// result:               resultMap, resultBind, resultMapErr
// str / char:           strLen, strConcat, strTrim, strContains,
//                       strToInt, strChars, charToInt, intToChar,
//                       intToStr, strSlice, strIndexOf, strSplit,
//                       strFromChars, strReverse,
//                       charIsDigit, charIsAlpha, charIsSpace
// file IO:              readFile, writeFile, fileExists
// process:              exit

Arith and comparison ops are wildcard-polymorphic (TyVar "a"), which lets the elaborator accept them without triggering E002 — they aren't declared in source. The matching runtime bindings for the stdlib live in Codegen.fsharpPreludeCore (plus the conditional fsharpPreludeMaybe and fsharpPreludeResult blocks, emitted only when the user declares type Maybe / type Result).

For each declaration:

DFn(sig, _body) — add the curried function type to the env.
DLet(name, expr) — inspect the literal-only initializer to infer a shallow type (int/float/str/char, bool via the KwTrue/KwFalse surface, or their tagged variants). Anything more complex becomes TyVar "?" (a wildcard) and is refined by HMInfer.
DLetPat(pat, _expr) — bind every name introduced by the pattern as TyVar "?"; HMInfer refines to concrete types.
DType(name, params, TBSum ctors) — add one entry per constructor, with a curried function type from the constructor args to the fully applied type (e.g. type Maybe A = Some A | None gives Some : A -> Maybe[A] and None : Maybe[A]).
DImpl(trait, type, fns) — add each impl method under a mangled name methodName_TypeName.
DTag, DUnit, DTrait, record types, and wrapped types do not add entries at this stage.

Pass 2: `checkDecls` (via `typeOf`)¶

For each DFn, extend the env with the function's parameters (typed from the signature, normalized via normalizeFnTy which converts single-uppercase TyName A to TyVar A), then walk the body with typeOf.

typeOf : Expr -> TypeEnv -> TypeExpr * LLError list walks the expression and:

Returns the literal type for each ELit.
Looks up EVar / ECon and emits E002 UnboundVar on miss.
At each EApp(f, arg), extracts the parameter type from f's TyFn(paramType, returnType) and compares it against argType.
On mismatch, calls classifyMismatch.

The key equality check is tyEqual, which treats TyVar _ as a wildcard — so fully polymorphic arithmetic ops (+, -, etc.) accept anything. This is intentional: the elaborator only catches mismatches involving fully known types.

`classifyMismatch` — triaging E001 / E004 / E005¶

Given a parameter type and an argument type, decide which error code applies:

match asTagged paramType, asTagged argType with
| Some(pBase, pTag), Some(aBase, aTag) when tyEqual pBase aBase && pTag <> aTag ->
    // Same base, different unit/tag
    match pBase with
    | TyName "Float" | TyName "Int" -> e004 ...   // UnitMismatch
    | _                              -> e001 ...  // TypeMismatch (e.g. Str[UserId] vs Str[Email])
| Some(pBase, _), None when tyEqual argType pBase ->
    // Arg has the right base but no tag → E005
    e005 ...
| _ ->
    e001 ...

Priorities:

Numeric tag mismatch → E004 UnitMismatch.
Non-numeric tag mismatch → E001 TypeMismatch.
Untagged value where tagged expected → E005 TagViolation.
Everything else → E001.

Pass 3: `exhaustivenessCheck`¶

Builds a map typeToCtors : Map<string, string list> from every DType(name, _, TBSum ctors). For each DFn whose first parameter is a TyName T in that map, recursively collects every EMatch block in the body via collectMatches and checks whether each block covers every constructor of T.

for c in requiredCtors do
    if not (List.contains c coveredCtors) then
        yield e003 0 0 typeName c

collectMatches is a structural recursion that descends into EApp, EIf, ELet, EPipe, EList, ETuple, ETagged, and other EMatch bodies — but explicitly not into ELam (lambdas get their own scope and the first-param heuristic doesn't apply).

Known limitations¶

The first-param heuristic only catches exhaustiveness when the sum type is the first parameter of the function. Multi-scrutinee matches are not handled.
Wildcard _ is not yet treated as a catch-all in exhaustiveness — if you write | _ -> default, the check still complains about missing constructors unless one of them is also named. This is a known bug tracked for fix.
Line/col info for most elaborator errors is now threaded through the parser's PosMap side-table (look for posOf pm node in typeOf). A few synthesized error sites still fall back to 0:0 when the AST node has no recorded position.

Bridge to H-M¶

The elaborator output is the input to HMInfer.fromElaboratorEnv, which converts TypeEnv (raw TypeExpr) into H-M Env (Map<Ident, TypeScheme>). Each type is inspected for rigid (non-$-prefixed) TyVars, which are collected into the scheme's quantifier list. Flex vars are not expected in elaborator output.

let fromElaboratorEnv (e3: LLLang.Elaborator.TypeEnv) : Env =
    let collectRigid t = ...
    Map.map (fun _ ty ->
        let rigids = collectRigid ty |> Set.toList |> List.sort
        { Vars = rigids; Body = ty }
    ) e3

This is how type Maybe A = Some A | None ends up with Some : ∀A. A -> Maybe[A] in the H-M environment.

Tests¶

tests/LLLangTests/ElaboratorTests.fs covers:

Each error code has a positive test that expects it from a specific invalid snippet.
Valid programs from spec/examples/valid/ are elaborated without errors.
classifyMismatch triage: numeric vs non-numeric tag mismatches, and untagged-where-tagged cases.

Run:

dotnet test --filter ElaboratorTests