Kenpali Code Specification
Tokens
Kenpali code is first split into tokens.
The following regular expressions are used for number and string literals:
NUMBER = -?(0|[1-9](\d*))(\.\d+)?([Ee][+-]?\d+)?
STRING = "(\\"|[^"])*"
RAW_STRING = `[^`]*`
The following regular expression is used for names:
NAME = [A-Za-z][A-Za-z0-9]*
The keywords null, false, and true are treated as literal tokens even though they match the name pattern.
The following character sequences are tokens:
( ) [ ] { } , ; : => = |. | @ . $ ** * _ /
All spaces, tabs, carriage returns, and linefeeds are considered whitespace and discarded.
Comments consist of the characters // and all following text until the end of the line. They are also discarded when parsing.
Any characters not matching the above token patterns cause parsing to fail.
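The token rules above can be sketched as a small regex-driven tokenizer. This is an illustrative Python sketch, not the reference implementation: the token kind labels (NUMBER, SYMBOL, LITERAL, and so on) and the tokenize function are names chosen for this example, and the decimal point in NUMBER is assumed to be a literal dot.

```python
import re

# Patterns transcribed from the token rules above. The decimal point in
# NUMBER is escaped here, on the assumption that a literal "." is intended.
TOKEN_SPEC = [
    ("WHITESPACE", r"[ \t\r\n]+"),
    ("COMMENT", r"//[^\n]*"),
    ("NUMBER", r"-?(0|[1-9](\d*))(\.\d+)?([Ee][+-]?\d+)?"),
    ("STRING", r'"(\\"|[^"])*"'),
    ("RAW_STRING", r"`[^`]*`"),
    ("NAME", r"[A-Za-z][A-Za-z0-9]*"),
    # Multi-character symbols are listed first so "=>" is not read as "=".
    ("SYMBOL", r"=>|\|\.|\*\*|[()\[\]{},;:=|@.$*_/]"),
]
MASTER = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))
KEYWORDS = {"null", "false", "true"}


def tokenize(code):
    """Split code into (kind, text) pairs, discarding whitespace and comments."""
    tokens = []
    pos = 0
    while pos < len(code):
        match = MASTER.match(code, pos)
        if match is None:
            # Anything that matches no token pattern makes parsing fail.
            raise ValueError(f"unexpected character {code[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind in ("WHITESPACE", "COMMENT"):
            continue
        if kind == "NAME" and text in KEYWORDS:
            # Keywords match the name pattern but are treated as literals.
            kind = "LITERAL"
        tokens.append((kind, text))
    return tokens
```

Whitespace and comments are tried before the symbol patterns so that // starts a comment rather than producing two / tokens.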
Literals
literal ::= "null" | "false" | "true" | NUMBER | STRING | RAW_STRING
A literal parses to a literal expression.
In addition to standard JSON string escapes, Kenpali supports 5- and 6-digit Unicode escape sequences using the syntax \u{<code>}.
Kenpali supports “raw string” syntax, delimited using backticks instead of quotes. Raw strings treat all backslashes as literal backslashes, rather than creating escape sequences, which can make backslash-heavy strings (e.g. regexes) easier to write and read. Raw strings parse to ordinary literal expressions.
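The difference between the two string syntaxes can be sketched as a decoding step applied to a STRING or RAW_STRING token. This Python sketch is illustrative only: decode_literal is a name chosen for this example, the code inside \u{...} is assumed to be hexadecimal, and surrogate pairs written as two \uXXXX escapes are not combined.

```python
import re

# The standard JSON single-character escapes.
SIMPLE_ESCAPES = {
    '"': '"', "\\": "\\", "/": "/",
    "b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t",
}

# One backslash followed by a braced code, a 4-digit code, or any single character.
ESCAPE = re.compile(r"\\(u\{([0-9A-Fa-f]+)\}|u([0-9A-Fa-f]{4})|(.))", re.DOTALL)


def decode_literal(token):
    """Return the string value of a STRING or RAW_STRING token."""
    if token.startswith("`"):
        # Raw strings: backslashes are literal, so only the delimiters are removed.
        return token[1:-1]

    def replace(match):
        braced, four_digit, simple = match.group(2), match.group(3), match.group(4)
        if braced is not None:
            # \u{<code>}, assumed here to hold hexadecimal digits.
            return chr(int(braced, 16))
        if four_digit is not None:
            # Standard JSON \uXXXX escape.
            return chr(int(four_digit, 16))
        if simple in SIMPLE_ESCAPES:
            return SIMPLE_ESCAPES[simple]
        raise ValueError(f"invalid escape sequence \\{simple}")

    return ESCAPE.sub(replace, token[1:-1])
```

Because escapes are consumed in a single left-to-right pass, an escaped backslash followed by u{41} stays as literal text rather than being read as a Unicode escape.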
Comments
A comment can appear on its own line or at the end of a line.
Names
name ::= [NAME "/"] NAME
A name normally parses to a name expression, though some other syntactic structures use names for other purposes.
The form with a slash indicates that the name is found in a module. The module name is added as the from property of the name expression.
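As a rough illustration of the two name forms, the Python sketch below turns name text into a name expression. Only the from property comes from this specification; the "type" and "name" keys, and the parse_name function itself, are assumptions made for the example. It also works on text without whitespace, whereas a real parser would work on tokens.

```python
import re

NAME = r"[A-Za-z][A-Za-z0-9]*"
# An optional module prefix followed by the name itself.
QUALIFIED_NAME = re.compile(rf"(?:({NAME})/)?({NAME})")


def parse_name(text):
    """Build a (hypothetically shaped) name expression from name text."""
    match = QUALIFIED_NAME.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid name: {text!r}")
    module, identifier = match.groups()
    node = {"type": "name", "name": identifier}  # key names are assumptions
    if module is not None:
        # The module name is recorded as the "from" property.
        node["from"] = module
    return node
```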
Kenpali uses camelCase for names by convention.
Arrays
array ::= "[" [array_element ("," array_element)* [","]] "]"
array_element ::= assignable | array_spread
array_spread ::= "*" assignable
An array parses to an array expression.
An array can optionally have a comma after the last element, though this is normally only done if the array spans multiple lines.
Any of the array’s elements can be spreads instead of normal expressions, indicated by a * prefix. These are parsed into spread nodes in the JSON representation.
Objects
object ::= "{" [object_element ("," object_element)* [","]] "}"
object_element ::= object_entry | object_key_name | object_spread
object_entry ::= assignable ":" assignable
object_key_name ::= NAME ":"
object_spread ::= "**" assignable
An object parses to an object expression.
If a key is a valid Kenpali name, the quotes can be omitted.
If the key is meant to actually reference a name from the surrounding scope, enclose it in parentheses.
If the value is omitted, it defaults to reading the property name from the surrounding scope: {foo:} is equivalent to {foo: foo}.
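The three key rules above amount to a small desugaring step. The Python sketch below shows one way to express it; the node shapes, the "kind" and "key_grouped" fields, and the desugar_entry function are all hypothetical and chosen only for illustration.

```python
def literal(value):
    # Hypothetical shape for a literal expression.
    return {"type": "literal", "value": value}


def name(identifier):
    # Hypothetical shape for a name expression.
    return {"type": "name", "name": identifier}


def desugar_entry(element):
    """Return the (key, value) pair of nodes for one surface-level object element."""
    if element["kind"] == "key_name":
        # {foo:} reads foo from the surrounding scope, the same as {foo: foo}.
        return literal(element["name"]), name(element["name"])
    key, value = element["key"], element["value"]
    if key["type"] == "name" and not element.get("key_grouped", False):
        # A bare name in key position is a string key with the quotes omitted.
        key = literal(key["name"])
    # A parenthesized key (key_grouped) is left as a reference to the name.
    return key, value
```

Under these assumptions, {foo: 1} and {"foo": 1} produce the same key, while {(foo): 1} keeps the key as a name to be looked up in the surrounding scope.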
Any of the object’s entries can be a spread instead of a normal entry, indicated by a ** prefix. These are parsed into entries with a {"type": "spread"} key in the JSON representation.
Groups
group ::= "(" expression ")"
Any expression can be enclosed in parentheses to force it to be parsed first, overriding precedence rules and special processing rules. A group parses to the same JSON as the expression it contains would if parsed on its own.
Scopes
expression ::= scope
scope ::= statement* assignable
statement ::= [name_pattern "="] assignable ";"
name_pattern ::= NAME | ignore | array_pattern | object_pattern
ignore ::= "_"
array_pattern ::= "[" [array_pattern_element ("," array_pattern_element)* [","]] "]"
array_pattern_element ::= name_pattern ["=" assignable] | array_rest
array_rest ::= "*" name_pattern
object_pattern ::= "{" [object_pattern_element ("," object_pattern_element)* [","]] "}"
object_pattern_element ::= object_pattern_simple ["=" assignable] | object_rest
object_pattern_simple ::= object_pattern_entry | object_pattern_key_name
object_pattern_entry ::= assignable ":" name_pattern
object_pattern_key_name ::= NAME ":"
object_rest ::= "**" name_pattern
A scope parses to a block expression.
In its simplest form, a scope is a list of assignments followed by an expression to evaluate, separated by semicolons.
The semicolon operator has the lowest precedence, so blocks typically need to be enclosed in parentheses when nested in other expressions.
The left-hand side of an assignment can be a pattern instead of a single name.
An array pattern has a similar syntax to an array, and parses to an array pattern node in the JSON representation.
In an array pattern, elements can be ignored by using an underscore instead of a name. The underscore is parsed as an ignore node in the JSON representation. Since the underscore is special syntax, rather than a valid name with a conventional meaning, it can be used repeatedly in the same scope.
An object pattern has a similar syntax to an object, and parses to an object pattern node in the JSON representation. The names to the left of the colon are the keys to look up in the object, with the name pattern to bind them to appearing on the right.
If the name pattern to the right of the colon is omitted, the value is assigned to the same name in the current scope: {foo:} is equivalent to {foo: foo}.
An expression can be used directly as a statement. This is equivalent to assigning to _; the resulting JSON has an ignore node as the assignment target, so the expression is evaluated and its result is discarded.
On the other hand, an assignment isn’t a valid expression, so it can’t be used as the final expression in a scope.
Similarly, an assignment can’t itself be assigned to another name.
Tight Pipelines
tight_pipeline_call ::= atomic [tight_pipeline]
atomic ::= group | array | object | literal | name
tight_pipeline ::= tight_pipeline_step*
tight_pipeline_step ::= argument_list | property_access
argument_list ::= "(" [argument ("," argument)* [","]] ")"
argument ::= object_element | array_element
property_access ::= "." NAME
Kenpali Code syntax relies heavily on pipelines—sequences of operations where the output from each operation is the input to the next. There are two kinds of pipelines, called tight and loose to reflect their different precedence levels. This section describes tight pipelines, which consist of function calls and property accesses.
Function Calls
Function call steps parse to call expressions.
Function arguments can contain any syntax that works in an array, including spreads.
Named arguments are passed using object-like syntax, and the same shorthand syntax available in objects is also available when passing named arguments.
A tight pipeline can contain several function calls in a row, with the result of each call itself being called with the next arguments.
Property Access
A single property can be extracted from an object by putting the property name after a dot. This parses to an index expression.
Property access can be chained, and associates left to right.
A tight pipeline can have any combination of function calls and property accesses.
Function Definitions
arrow_function ::= parameter_list "=>" assignable
parameter_list ::= "(" [parameter ("," parameter)* [","]] ")"
parameter ::= array_pattern_element | object_pattern_element
A function definition parses to a function expression.
Positional parameters support all the same syntax available in array patterns, including optional and rest elements.
Named parameters support all the same syntax available in object patterns, including optional elements, rest elements, and recursively binding to a name pattern.
The body of a function is often a scope, which must be enclosed in parentheses because the semicolon has a lower precedence than the arrow.
Loose Pipelines
assignable ::= arrow_function | loose_pipeline_call | point_free_pipeline | constant_function
loose_pipeline_call ::= tight_pipeline_call [loose_pipeline]
loose_pipeline ::= loose_pipeline_step+
loose_pipeline_step ::= pipe | pipe_dot | at
pipe ::= "|" tight_pipeline_call
pipe_dot ::= "|." name (tight_pipeline_step)*
at ::= "@" tight_pipeline_call
Loose pipelines have a lower precedence than tight pipelines, and consist of pipes, loose property accesses, and indexing steps.
Pipes
Pipe steps are transformed into ordinary function calls, producing call expressions.
If the target of the pipe is a tight pipeline ending in a function call, the input is injected as the first positional argument to that call.
Any named arguments in the call are retained alongside the injected positional argument.
Pipes can be chained together, with the output of each pipe becoming the input to the next.
Argument injection can be blocked by enclosing the target in parentheses.
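The pipe rules above can be sketched as a transformation on expression nodes. This Python sketch is illustrative only: the call node shape ("callee", "args", "namedArgs") and the grouped flag are assumptions, and the behavior for a parenthesized target (calling the result of the inner call) is inferred from the rule that parentheses block injection.

```python
def name(identifier):
    # Hypothetical shape for a name expression.
    return {"type": "name", "name": identifier}


def call(callee, args, named_args=None):
    # Hypothetical shape for a call expression.
    return {"type": "call", "callee": callee, "args": list(args), "namedArgs": dict(named_args or {})}


def pipe(input_node, target, grouped=False):
    """Transform `input | target` into a call expression.

    grouped is True when the target was enclosed in parentheses; the parser
    has to track this because a group leaves no trace in the parsed output.
    """
    if target["type"] == "call" and not grouped:
        # Inject the input as the first positional argument; named arguments stay.
        return {**target, "args": [input_node, *target["args"]]}
    # Otherwise the whole target is called with the input as its only argument.
    return call(target, [input_node])
```

Chaining falls out of nesting: the result of one pipe call is passed as the input of the next.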
The body of a function can be a loose pipeline without any parentheses—pipeline operators have higher precedence than the arrow.
This precedence also means that a function inside a loose pipeline must be enclosed in parentheses.
Loose Property Access
The loose property access operator |. does the same thing as the tight property access operator ., but its precedence is that of a loose pipeline. If tight property access is used alongside loose pipeline operators, the tight property access happens first.
Using loose property access makes the operations happen from left to right instead.
Further tight pipeline steps can be chained onto the end of a loose property access; the whole sequence of operations happens from left to right.
Compare this to the case where the loose pipeline uses only pipes.
Indexing
Indexing steps parse to index expressions.
Indexing steps can be used alongside pipes, with the operations happening from left to right.
Like other loose pipeline operators, indexing has a lower precedence than tight pipeline operators.
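The left-to-right ordering of loose pipeline steps can be sketched as a fold over the steps. In this Python sketch the node shapes and the apply_loose_steps function are hypothetical, pipes are simplified to plain calls without argument injection, and a property name is assumed to become a string literal index.

```python
def name(identifier):
    return {"type": "name", "name": identifier}


def literal(value):
    return {"type": "literal", "value": value}


def index(collection, key):
    # Hypothetical shape for an index expression.
    return {"type": "index", "collection": collection, "index": key}


def call(callee, args):
    # Hypothetical shape for a call expression.
    return {"type": "call", "callee": callee, "args": args}


def apply_loose_steps(start, steps):
    """Apply (kind, operand) steps to start, strictly from left to right."""
    node = start
    for kind, operand in steps:
        if kind == "pipe":
            node = call(operand, [node])  # simplified: no argument injection
        elif kind == "pipe_dot":
            node = index(node, literal(operand))  # loose property access
        elif kind == "at":
            node = index(node, operand)  # indexing step
        else:
            raise ValueError(f"unknown step kind: {kind}")
    return node


# a | f.b : the tight property access f.b is resolved first, then called with a.
tight = apply_loose_steps(name("a"), [("pipe", index(name("f"), literal("b")))])

# a | f |.b : f is called with a first, then b is read from the result.
loose = apply_loose_steps(name("a"), [("pipe", name("f")), ("pipe_dot", "b")])
```

The operand of each step is an already-parsed tight pipeline, which is why tight operators inside an operand bind before the loose step is applied.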
Function Definition Shorthand
Kenpali has two kinds of shorthand syntax for compactly defining common function types.
A constant function ignores any arguments passed to it. The shorthand syntax is a $ followed by an expression for the function’s return value.
constant_function ::= "$" assignable
As with ordinary functions, the body of a constant function can be a loose pipeline without the need for parentheses.
A point-free pipeline is written as a loose pipeline missing the initial value. It parses to a function with one positional parameter, which becomes the missing initial value.
point_free_pipeline ::= loose_pipeline
Point-free pipelines can start with any loose pipeline operator.