Prompting an LLM to write ll-lang
ll-lang is designed to be generated by an LLM agent and checked by a fast compiler. The language was shaped around the constraints of token-by-token generators — small surface area, explicit types, stable error codes — so that a tight generate → compile → fix loop produces correct programs with minimal supervision.
Why ll-lang is LLM-ergonomic
The design decisions below are deliberate bets that a model with no prior knowledge of ll-lang can still produce valid code on the first or second attempt.
Small surface area
The entire keyword set fits on one line:
let tag unit trait impl import export module
external opaque infix infixl infixr if else true false match
That is 18 keywords. There is no fn, type, in, then, or with. A model
does not need to guess which of several equivalent syntaxes a user prefers:
there is only one canonical form.
Explicit, annotatable types
Every parameter carries its type inline: add(a Int)(b Int).
LLMs don't have to infer types across multi-file modules; each signature
declares its contract up front. Return types are optional but always
welcome. When in doubt, a model can annotate — the extra tokens cost
nothing and make the error surface smaller.
Minimal, predictable stdlib
The implicit prelude exposes ~50 functions grouped by naming prefix:
- list*: listLen, listMap, listFilter, listFold, listHead, ...
- maybe*: maybeMap, maybeBind, maybeDefault
- result*: resultMap, resultBind, resultIsOk
- str*: strLen, strConcat, strTrim, strSplit, strChars, ...
- char*: charToInt, intToChar, charIsDigit, charIsAlpha, ...
- IO: printfn, print, readFile, writeFile, exit
A model that has seen the naming prefix can guess the rest. listMap,
maybeMap, and resultMap are three separate names — there is no
type-class-driven map that dispatches on its argument. This removes the
"which overload did the compiler pick?" confusion class entirely.
Unique-dispatch, stable error codes
Every diagnostic uses stable fixed codes (E001–E008, E020, E024–E030).
Each code maps to exactly one checker stage:
| Code | Stage | Typical fix |
|---|---|---|
| E001 | Inference | Change literal or annotation so types match |
| E002 | Elaboration | Declare the missing name or check spelling |
| E003 | Exhaustiveness | Add a missing match arm |
| E004 | Unit algebra | Produce the value in the expected unit |
| E005 | Tag checker | Add [Tag] to the untagged value |
| E006 | Trait dispatch | Add the missing impl or use a different type |
| E008 | Occurs check | Rewrite the recursive self-application |
| E020 | Module path | Rename file or fix module X.Y declaration |
| E024 | Module cycle | Break the import cycle |
| E025 | No project | Add lll.toml or use only Std.* imports |
| E026 | External map | Map external name in Platform SDK |
| E027 | Fixity assoc | Fix malformed fixity declaration form |
| E028 | Fixity prec | Keep precedence in range 1..9 |
| E029 | Fixity dup | Keep one fixity declaration per operator |
| E030 | Fixity op | Keep operator symbolic, non-reserved, and safe |
Because the number → meaning map is fixed, an LLM system prompt can ship with a cheat sheet ("if you see E00N, do X") and the model will apply mechanical fixes without re-reasoning about the whole program.
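As a rough illustration, the cheat sheet can be generated from a small lookup table rather than written by hand. The sketch below copies the fix hints from the table above; the names ERROR_FIXES and render_cheat_sheet and the "If you see ..." wording are assumptions made for this example, not part of the ll-lang tooling.

```python
# Fix hints copied from the error-code table above.
# The dictionary name and the output wording are illustrative assumptions.
ERROR_FIXES = {
    "E001": "Change literal or annotation so types match",
    "E002": "Declare the missing name or check spelling",
    "E003": "Add a missing match arm",
    "E004": "Produce the value in the expected unit",
    "E005": "Add [Tag] to the untagged value",
    "E006": "Add the missing impl or use a different type",
    "E008": "Rewrite the recursive self-application",
    "E020": "Rename file or fix module X.Y declaration",
    "E024": "Break the import cycle",
    "E025": "Add lll.toml or use only Std.* imports",
    "E026": "Map external name in Platform SDK",
    "E027": "Fix malformed fixity declaration form",
    "E028": "Keep precedence in range 1..9",
    "E029": "Keep one fixity declaration per operator",
    "E030": "Keep operator symbolic, non-reserved, and safe",
}


def render_cheat_sheet(fixes=ERROR_FIXES):
    """Return one 'If you see CODE: fix.' line per code, sorted by code."""
    return "\n".join(
        f"If you see {code}: {fix}." for code, fix in sorted(fixes.items())
    )


if __name__ == "__main__":
    print(render_cheat_sheet())
```

The rendered text can be pasted into the system prompt as-is; keeping the table in one place means the prompt and any agent-side logic cannot drift apart.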
Case-based visual disambiguation
Lowercase starts values and variables; Uppercase starts types,
constructors, and module segments. There is no ambiguity at the token
level — a generator scanning its own output can tell maybe from
Maybe with zero semantic analysis.
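The case rule is simple enough to check with a single character test. The helper below is a hypothetical agent-side utility (the function name and the returned labels are invented for this example) that shows how an identifier can be classified without any semantic analysis.

```python
def classify_identifier(name: str) -> str:
    """Classify an identifier by the case of its first character.

    Returns "type-level" for an Uppercase start (types, constructors,
    module segments), "value-level" for a lowercase start (values and
    variables), and "other" for anything else, such as "_".
    """
    if not name:
        raise ValueError("empty identifier")
    first = name[0]
    if first.isupper():
        return "type-level"
    if first.islower():
        return "value-level"
    return "other"
```

For example, classify_identifier("maybe") yields "value-level" while classify_identifier("Maybe") yields "type-level".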
Layout by newlines, not braces
ll-lang has no { / }. Blocks are introduced by = followed by an
indented body. A model does not have to track brace nesting depth,
and it cannot produce a file with an unbalanced {.
No hidden side effects
The only IO builtins are printfn, print, readFile, writeFile,
fileExists, and exit. Anything else the model generates must be a
pure function of its arguments. This makes it safe to recompile
thousands of candidate programs in a loop without worrying about
network or disk state.
Give the model the shape of the language, not English prose
Most LLMs do not know ll-lang a priori. Include a short, syntactically-dense priming prompt — not a paragraph of explanation:
ll-lang v2 syntax reference:
module Path.To.Mod -- required, first non-comment line
-- line comments start with --
name = expr -- top-level value binding
name(a Int)(b Int) = expr -- curried function; each param (name Type)
\x. x * 2 -- lambda
if cond -- no 'then'; body is indented
  body
else alt
match expr -- no 'with'; arms indented below
  | Pat -> body
  | _ -> default
-- literals
42 3.14 "hi" true false 'c' '\n'
Maybe A = Some A | None -- type declaration (Uppercase, no 'type' kw)
Shape = Circle Float | Rect Float Float | Empty
add(a Int)(b Int) = a + b -- function (no 'fn' keyword)
area(s Shape) = -- match on last param: arms as body
  | Circle r -> 3.14 * r * r
  | Rect w h -> w * h
  | Empty -> 0.0
tag UserId -- zero-cost newtype
"u1"[UserId] -- tag application (postfix brackets)
-- local bindings chain via layout (no 'let-in')
example =
  x = 42
  y = x + 1
  y * 2
-- sequential effects
main =
  _ = printfn "step 1"
  _ = printfn "step 2"
  0
Errors: E001 TypeMismatch, E002 UnboundVar, E003 NonExhaustiveMatch,
E004 UnitMismatch, E005 TagViolation, E008 InfiniteType
The examples do double duty as grammar and as reminders of common idioms.
Use the compiler as the oracle
The compiler is deterministic and fast. The LLM's loop should be:
- Generate .lll code.
- lllc build file.lll (or lllc run if the goal includes execution).
- On error: parse the compact EXXX line:col Name details message, apply a targeted fix, retry.
- On success: ship.
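The parsing step in the loop above can be sketched with one regular expression. The exact diagnostic layout is not fully specified on this page, so the sketch assumes one diagnostic per line shaped like "E003 12:5 NonExhaustiveMatch Shape missing:Empty"; the Diagnostic type and parse_diagnostic function are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class Diagnostic(NamedTuple):
    code: str     # e.g. "E003"
    line: int     # 1-based line number reported by the compiler
    col: int      # column number reported by the compiler
    name: str     # e.g. "NonExhaustiveMatch"
    details: str  # free-form remainder, may be empty


# Assumed layout: CODE LINE:COL Name [details...]
_DIAG_RE = re.compile(r"^(E\d{3})\s+(\d+):(\d+)\s+(\w+)(?:\s+(.*))?$")


def parse_diagnostic(text: str) -> Optional[Diagnostic]:
    """Parse one diagnostic line; return None if it does not match."""
    m = _DIAG_RE.match(text.strip())
    if m is None:
        return None
    code, line, col, name, details = m.groups()
    return Diagnostic(code, int(line), int(col), name, details or "")
```

Returning None for unrecognised lines lets the agent fall back to passing raw stderr to the model instead of failing on output the pattern does not cover.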
You do not need natural-language tests for type, tag, or exhaustiveness errors — the compiler catches them. Only use tests for logic.
What LLMs get wrong most often
1. Using removed keywords
Wrong:
fn add(a Int)(b Int) Int = a + b
type Shape = Circle Float | Rect Float Float
Right:
add(a Int)(b Int) = a + b
Shape = Circle Float | Rect Float Float
There is no fn or type keyword. Functions start with a lowercase name
and parenthesised parameters. Types start with an uppercase name.
2. Using then after if
Wrong:
result = if x > 0 then "positive" else "non-positive"
Right:
result =
  if x > 0
    "positive"
  else "non-positive"
if must be followed by an indented body on the next line, then else.
3. Using with after match
Wrong:
match m with
  | Some n -> n
  | None -> 0
Right:
match m
  | Some n -> n
  | None -> 0
match expr is followed directly by pattern arms — no with.
4. Using let ... in for local scope
Wrong:
f(x Int) =
  let y = x * 2 in
  y + 1
Right:
f(x Int) =
  y = x * 2
  y + 1
Local bindings chain via layout. let is only needed at the top level for
value constants (without parameters): let pi = 3.14159.
5. Comma-separated parameters
Wrong:
add(a Int, b Int) = a + b
Right:
add(a Int)(b Int) = a + b
Each parameter has its own parentheses.
6. Tags as constructors
Wrong:
let uid = UserId "user-42"
Right:
uid = "user-42"[UserId]
Tags are postfix brackets applied to a value, not constructors.
7. Forgetting exhaustive patterns
Every match must cover all constructors. The compiler emits E003 for
missing cases. Add | _ -> ... if an explicit catch-all is intended.
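Several of the mistakes listed in this section are purely syntactic, so an agent can catch them before invoking the compiler. The sketch below is a heuristic pre-check, not a parser: the patterns are assumptions derived from the wrong/right pairs above, and it treats everything after "--" as a comment, so it will misjudge a string literal that contains "--".

```python
import re

# Heuristic patterns for the mistakes listed above. Messages are illustrative.
_CHECKS = [
    (re.compile(r"^\s*fn\b"), "no 'fn' keyword: write name(a T)(b U) = ..."),
    (re.compile(r"^\s*type\b"), "no 'type' keyword: write Name = Ctor | ..."),
    (re.compile(r"\bif\b.*\bthen\b"), "no 'then': put the body on an indented line"),
    (re.compile(r"\bmatch\b.*\bwith\s*$"), "no 'with' after match"),
    (re.compile(r"^\s*let\b.*\bin\s*$"), "no 'let ... in': use layout bindings"),
    (re.compile(r"\(\s*\w+\s+\w+\s*,"), "no comma-separated params: use (a T)(b U)"),
]


def lint_source(source: str):
    """Return (line_number, message) pairs for suspected removed-syntax uses."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        code = line.split("--", 1)[0]  # drop the line comment, if any
        for pattern, message in _CHECKS:
            if pattern.search(code):
                findings.append((lineno, message))
    return findings
```

A pre-check like this only saves a compiler round trip; the compiler remains the authority on whether the program is valid.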
Telling the model what error codes mean
Because error codes are fixed, add a short recovery recipe directly to the system prompt:
If compiler returns E003 NonExhaustiveMatch Type missing:Ctor,
add a branch `| Ctor ... -> default` to the match expression.
If compiler returns E002 UnboundVar name:foo,
check that foo is declared above the use site or imported.
If compiler returns E001 TypeMismatch expected:T got:U,
add an annotation to the binding or fix the literal.
The LLM then applies mechanical fixes without re-reasoning from scratch.
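On the agent side, the same recipes can be attached to each retry request together with the offending source line. The sketch below is one possible shape; RECIPES restates the three recipes above, while build_fix_prompt and the prompt wording are assumptions made for illustration.

```python
# Recipes restated from the system-prompt example above.
RECIPES = {
    "E001": "Add an annotation to the binding or fix the literal.",
    "E002": "Check that the name is declared above the use site or imported.",
    "E003": "Add a branch `| Ctor ... -> default` to the match expression.",
}


def build_fix_prompt(source: str, code: str, line: int, message: str) -> str:
    """Build a retry prompt from the source, an error code, a 1-based line, and the message."""
    lines = source.splitlines()
    offending = lines[line - 1] if 1 <= line <= len(lines) else "<line out of range>"
    recipe = RECIPES.get(code, "No stored recipe; reason from the message.")
    return (
        f"Compiler error {code} at line {line}: {message}\n"
        f"Offending line: {offending}\n"
        f"Suggested fix: {recipe}\n"
        "Return the full corrected file."
    )
```

Quoting only the offending line keeps the retry prompt short, which matters when the loop runs many times.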
A minimal loop script
Rough shape of an agent loop, in shell pseudocode:
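# Generate once, then alternate build and fix; the retry cap guarantees termination.
llm generate > prog.lll
for attempt in 1 2 3 4 5; do
  if lllc build prog.lll 2> errors.txt; then
    echo "shipped"
    break
  fi
  # Write to a temp file first: redirecting straight into prog.lll would truncate it before it is read.
  llm fix --errors errors.txt --in prog.lll > prog.lll.tmp && mv prog.lll.tmp prog.lll
done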
Because lllc build is pure (no side effects beyond writing the target
output), you can run it thousands of times safely.
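The same loop can be written in Python with a retry cap and an injectable compile step, which makes it testable without the compiler installed. This is a sketch under assumptions: generate and fix stand in for whatever model API the agent uses, lllc is assumed to be on PATH, and lllc build is assumed to exit non-zero on error, as the shell version implies.

```python
import subprocess
from pathlib import Path


def compile_with_lllc(path):
    """Run `lllc build PATH`; return (ok, stderr_text)."""
    proc = subprocess.run(
        ["lllc", "build", str(path)], capture_output=True, text=True
    )
    return proc.returncode == 0, proc.stderr


def agent_loop(generate, fix, path, compile_fn=compile_with_lllc, max_attempts=5):
    """Generate once, then build and fix until success or the retry cap.

    generate() -> str returns initial source; fix(source, errors) -> str
    returns a revised source. Returns the 1-based attempt number that
    compiled, or None if every attempt failed.
    """
    path = Path(path)
    source = generate()
    for attempt in range(1, max_attempts + 1):
        path.write_text(source)
        ok, errors = compile_fn(path)
        if ok:
            return attempt
        source = fix(source, errors)
    return None
```

Holding the source in memory and writing it out before each build avoids reading and overwriting the same file in one step.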
Using MCP for structured feedback
For LLM agents that support Model Context Protocol, lllc mcp exposes
structured tool calls that return machine-readable JSON instead of stderr:
| Task | Tool |
|---|---|
| Does this snippet type-check? | check_source { "source": "..." } |
| What does E003 mean with a repro? | lookup_error { "code": "E003" } |
| What list functions exist? | stdlib_search { "query": "list" } |
| What is the syntax for Pattern? | grammar_lookup { "rule": "Pattern" } |
| Compile and show F# output | compile_source { "source": "...", "target": "fs" } |
See 09-mcp.md for the full tool reference.
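For agents that speak MCP directly, a tool invocation is a JSON-RPC 2.0 request with method tools/call. The sketch below only builds the request body for a tool from the table above. How lllc mcp is connected (transport, the initialize handshake) and what the response looks like are not covered on this page, so they are left out; mcp_tool_call is a hypothetical helper name.

```python
import json


def mcp_tool_call(request_id, tool, arguments):
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        }
    )


if __name__ == "__main__":
    # Ask the server to type-check a snippet (tool name taken from the table above).
    print(mcp_tool_call(1, "check_source", {"source": "module Demo\nx = 42"}))
```

In practice an MCP client library usually handles this framing; the raw message is shown only to make the tool names and argument objects in the table concrete.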
Prefer small files, flat structure
Keep each .lll file self-contained unless your project has a lll.toml
manifest. In single-file mode, Std.* imports are resolved automatically
by lllc run. A flat top-level structure also minimises the tokens the
model needs to regenerate on each fix iteration.