Ink Syntax Reference
This document is an attempt at formalising Ink syntax. It is still being elaborated, and details will surely evolve. However, I think I have managed to come up with a global "syntactic feel" for Ink, which borrows from Python, Io, Lisp, Haskell and a little bit of C++.
Blocks and expressions
Basically, a program can be considered a structured sequence of expressions: expressions can be grouped in blocks, and these blocks can be contained in other blocks. As with documents, the program source code can be represented syntactically as a tree, which becomes a graph once the tree is evaluated (nodes may be references to other nodes, etc.).
Ink expressions respect the following rules:
- They span a single line.
- Or, if they contain parentheses (or are enclosed in parentheses) that do not close on the same line, they can span as many lines as the parentheses that contain them.
Here are some examples:
a := 1
a := (1, 2,
  3, 4)
These examples are syntactically correct, but the following is not:
a :=
1
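The line-spanning rule can be sketched as a small Python helper that groups physical lines into expressions by counting open parentheses. This is only an illustration, not part of any Ink implementation: the function name `logical_expressions` is hypothetical, and the sketch assumes that parentheses never appear inside string literals.

```python
def logical_expressions(source):
    """Group physical lines into expressions using parenthesis depth."""
    expressions = []
    pending = []  # physical lines of the expression being built
    depth = 0     # number of parentheses currently open
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines never start or end an expression here
        pending.append(stripped)
        depth += stripped.count("(") - stripped.count(")")
        if depth < 0:
            raise SyntaxError("unmatched closing parenthesis")
        if depth == 0:
            # All parentheses are closed: the expression is complete.
            expressions.append(" ".join(pending))
            pending = []
    if depth != 0:
        raise SyntaxError("unclosed parenthesis")
    return expressions
```

With this sketch, an expression broken across two lines without parentheses is read as two separate (and incomplete) expressions, which is why it is not valid.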
Expressions can be grouped in blocks using parentheses, and expressions are separated by commas:
(a := 1, a = a + 1, print a)
which can also be written as:
(
  a := 1
  a = a + 1
  print a
)
Here, we immediately recognise that we wrote a block, and that we can use newlines instead of commas. Note that indentation is only there for presentation purposes, but the parentheses have to be preceded by a space and followed by either a space or a newline.
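To make the equivalence between commas and newlines concrete, here is a minimal Python sketch that splits a block into its expressions. The helper name `split_block` is hypothetical, and the sketch assumes that commas and parentheses do not occur inside string literals. Separators found inside nested parentheses are left alone, since they belong to an inner group.

```python
def split_block(block):
    """Split a parenthesised block into its top-level expressions."""
    body = block.strip()
    if not (body.startswith("(") and body.endswith(")")):
        raise ValueError("a block must be enclosed in parentheses")
    body = body[1:-1]
    parts, current, depth = [], [], 0
    for ch in body:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch in ",\n" and depth == 0:
            # A comma or a newline at the top level ends an expression.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return [part for part in parts if part]
```

Both ways of writing the block above yield the same three expressions.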
We now know how to write expressions and blocks, but we do not really know what makes up an expression. Let's investigate the details of an expression.
Detail of Ink expressions
Ink expressions follow a relatively simple and uniform model. They can be of different types:
- Computation: a computation expression is similar to a mathematical computation in the sense that operations are applied to values, and these operations yield a result or may produce side effects.
- Declaration: a declaration is an expression that declares a new symbol in the current computation context. This symbol may be bound to a value.
- Annotation: an annotation is an expression that "decorates" the following block or expression. This is like source code meta information, as it does not produce any computational effect.
The simplest computations involve primitive values and primitive operations, like the ones that come from algebra. We detail both in the following section.
Simple computation expressions
As we said, simple computation expressions are made of primitive values and primitive operations. Here is the syntax of primitive values in Ink:
| Integer | -?[0-9]+ | 0, 1, -10, 0112, 009232 |
| Real | -?[0-9]+\.[0-9]+ | 0.0, -0.1, 0.023, 00.230 |
| String | \"[^"]*\" | "", "Hello World!" |
We will see later how to write multi-line strings. Also, note that characters can be escaped in strings with the \ symbol as in many other languages.
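As a quick check of the three patterns in the table, the following Python sketch classifies a token by trying each pattern against the whole token. The names `PATTERNS` and `classify` are hypothetical. Note that the String pattern as written does not account for the escaped characters mentioned above, so a string containing an escaped quote is not recognised by this sketch.

```python
import re

# Patterns copied from the table of primitive values.
# Real is tried first so that "0.5" is not rejected after a failed Integer match.
PATTERNS = [
    ("Real", re.compile(r'-?[0-9]+\.[0-9]+')),
    ("Integer", re.compile(r'-?[0-9]+')),
    ("String", re.compile(r'"[^"]*"')),
]

def classify(token):
    """Return the name of the primitive type matching the whole token, or None."""
    for name, pattern in PATTERNS:
        if pattern.fullmatch(token):
            return name
    return None
```

For instance, `classify("0112")` gives `"Integer"`, `classify("00.230")` gives `"Real"`, and `classify("1.")` gives `None` because the Real pattern requires digits on both sides of the dot.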
These values can be used with primitive operations, which are the following:
| Addition | <v> + <v> | 1 + 2 |
| Subtraction | <v> - <v> | 1 - 2 |
| Division | <v> / <v> | 1 / 2 |
| Multiplication | <v> * <v> | 1 * 2 |
| Modulo | <v> % <v> | 1 % 2 |
| Times | <v> ^ <v> | 1 ^ 2 |
These primitive operations do not work with all values, but this is a matter of language semantics, not syntax. The only thing to notice here is that spaces are required before and after each operator. This may sound very strange and limiting, but it enforces a consistent presentation of expressions.
In addition to these primitive operations, there are also primitive predicates listed below:
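One practical consequence of mandatory spaces can be sketched in Python: a flat expression can be cut into tokens with a plain whitespace split, and a `-` that is glued to a number or sits inside a name is never mistaken for the subtraction operator. The names `OPERATORS` and `split_tokens` are hypothetical, and the sketch ignores string literals that contain spaces.

```python
# Operators from the table of primitive operations.
OPERATORS = {"+", "-", "/", "*", "%", "^"}

def split_tokens(expression):
    """Label each whitespace-separated token as an operator or an operand."""
    return [
        ("operator" if token in OPERATORS else "operand", token)
        for token in expression.split()
    ]
```

For example, `split_tokens("my-value - -10")` yields the operand `my-value`, the operator `-`, and the operand `-10`.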
| Equal | <v> == <v> | 1 == 2 |
| Greater | <v> > <v> | 1 > 2 |
| Greater or equal | <v> >= <v> | 1 >= 2 |
| Smaller | <v> < <v> | 1 < 2 |
| Smaller or equal | <v> <= <v> | 1 <= 2 |
We will see later that there are also primitive declarations, but this is specific to declaration expressions.
Computation expressions can contain sub-expressions, which are always computation expressions. These sub-expressions are written with parentheses, which must be preceded by a space and followed by either a space or a newline.
Here are examples of computation expressions with or without sub-expressions:
1 + 2 + 3 + 4
1 * (2 + 3) / ( (2 + 4) / (4 ^ 16) )
10 ^ ( 100 * 2 / (3 + 4) )
I hope that you will find these expressions clear and regular, which is partly thanks to forcing whitespace around operators and parentheses. As an anecdote, I realised that not having spaces around operators causes problems when the Io language considered adding a := operator for assignment while symbols could contain :, which led to some inconsistencies in the grammar that I disliked.
Before presenting more complex computation expressions, we will first introduce the other kinds of expressions.
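The nesting of sub-expressions can be illustrated with a short Python sketch that turns a computation expression into nested lists, one list per pair of parentheses. The names `TOKEN` and `parse` are hypothetical. The sketch deliberately assumes nothing about operator precedence, which this document does not define, and it does not enforce the whitespace rule around parentheses.

```python
import re

# A token is either a single parenthesis or a run of non-space,
# non-parenthesis characters.
TOKEN = re.compile(r'[()]|[^\s()]+')

def parse(expression):
    """Parse an expression into nested lists mirroring its parentheses."""
    tokens = TOKEN.findall(expression)
    position = 0

    def parse_group(closing_expected):
        nonlocal position
        items = []
        while position < len(tokens):
            token = tokens[position]
            position += 1
            if token == "(":
                items.append(parse_group(True))
            elif token == ")":
                if not closing_expected:
                    raise SyntaxError("unmatched closing parenthesis")
                return items
            else:
                items.append(token)
        if closing_expected:
            raise SyntaxError("unclosed parenthesis")
        return items

    return parse_group(False)
```

Applied to `1 * (2 + 3)`, this returns `["1", "*", ["2", "+", "3"]]`, which is the tree view of the source mentioned at the start of this document.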
Declaration expressions
Declaration expressions declare a symbol in the current computation context and assign it a value.
All declarations take one of the following forms:
<symbol> := <computation expression>
which stands for a declaration and assignment, and
<symbol> <-- <computation expression>
which stands for an assignment, and implies that the symbol was previously declared.
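The two forms can be told apart mechanically, as in this Python sketch. It uses the direct-symbol pattern `[0-9\w\-_]+` given in this section. The names `FORMS` and `classify_statement` are hypothetical, and the right-hand side is kept as raw text rather than parsed.

```python
import re

SYMBOL = r'[0-9\w\-_]+'  # direct-symbol pattern given in this section

FORMS = {
    "declaration": re.compile(rf'({SYMBOL}) := (.+)'),
    "assignment": re.compile(rf'({SYMBOL}) <-- (.+)'),
}

def classify_statement(line):
    """Return (kind, symbol, expression text), or None for other lines."""
    for kind, pattern in FORMS.items():
        match = pattern.fullmatch(line.strip())
        if match:
            return kind, match.group(1), match.group(2)
    return None
```

One thing this sketch makes visible is that the symbol pattern, taken literally, also accepts digit-only names such as `42`, so a real lexer would need an extra rule to separate symbols from integers.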
A symbol is basically a sequence of characters matching [0-9\w\-_]+, in which case it is a direct symbol.
A direct symbol is a symbol that directly belongs to the current computation context. In some cases, symbols can belong to a different computation context, in which case some operators specify how they can be accessed:
- / is the namespace operator. It finds a symbol in a namespace relative to the current one, or in a namespace reached from the root namespace. This works exactly like a Unix filesystem.
- . is the property operator, which references an internal part of a value.
Both operators are part of indirect symbols (which are not expressions, BTW): an indirect symbol matches (([.]+|[0-9\w\-_])/)*([0-9\w\-_]+)(\.[0-9\w\-_]+)*, where the first part determines the namespace in which the symbol should be resolved, the second part is the symbol itself, and the last part applies the property operator.
Here are valid examples:
myString.encoding
../thisOtherString.encoding
/Resources/Strings/FR/fr/HelloWorld
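The indirect-symbol pattern can be tried against these examples with Python's `re` module. In the sketch below, `VERBATIM` is the pattern exactly as written above, while `RELAXED` is a hypothetical variant that is not part of the original grammar: it assumes an optional leading `/` for the root namespace and namespace segments longer than one character.

```python
import re

# Pattern copied verbatim from the text above.
VERBATIM = re.compile(r'(([.]+|[0-9\w\-_])/)*([0-9\w\-_]+)(\.[0-9\w\-_]+)*')

# Hypothetical relaxation (an assumption, not the original grammar):
# optional leading "/" and multi-character namespace segments.
RELAXED = re.compile(r'/?(([.]+|[0-9\w\-_]+)/)*([0-9\w\-_]+)(\.[0-9\w\-_]+)*')

EXAMPLES = [
    "myString.encoding",
    "../thisOtherString.encoding",
    "/Resources/Strings/FR/fr/HelloWorld",
]

def accepted(pattern, examples=EXAMPLES):
    """Return the examples that the pattern matches in full."""
    return [text for text in examples if pattern.fullmatch(text)]
```

Under Python's regular expression semantics, the verbatim pattern accepts the first two examples but not the rooted path, while the relaxed variant accepts all three. This suggests the pattern in the text may need a small adjustment to cover root-relative names.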
Now that you know enough about declaration expressions, we can present annotation expressions.
Annotation expressions
Annotation expressions are a special kind of expression that expresses meta information about the following or preceding expression, depending on their type.
Annotation expressions are of the following forms:
<space> / <text>
<space> | <text>
The first form specifies an annotation for the following expression, and the second form specifies an annotation for the previous expression. Annotations contain text, which is written in a specific (very simple) text markup language that is not detailed here. Annotations are the equivalent of Javadoc, but are embedded into the program and can be accessed at runtime, pretty much like what Python docstrings offer.
Comments are also considered as Annotation expressions. Comments are all of the following form:
<space> # <text>
Annotations all fit on a single line, but when annotations are contiguous, they are interpreted as a single annotation composed of the various contiguous annotations.
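The merging of contiguous annotations can be sketched in Python as follows. Several points here are assumptions rather than stated rules: each annotation is taken to occupy a whole line starting with its marker followed by a space, only contiguous annotations with the same marker are merged, and their texts are joined with a single space. The names `MARKERS` and `merge_annotations` are hypothetical.

```python
# Markers of the three annotation forms: following, preceding, comment.
MARKERS = {"/", "|", "#"}

def merge_annotations(lines):
    """Merge contiguous same-marker annotation lines into single annotations.

    Returns (marker, text) pairs; non-annotation lines get marker None.
    """
    merged = []
    for line in lines:
        stripped = line.strip()
        marker, _, text = stripped.partition(" ")
        if marker in MARKERS:
            if merged and merged[-1][0] == marker:
                # Contiguous annotation: extend the previous one.
                merged[-1] = (marker, merged[-1][1] + " " + text)
            else:
                merged.append((marker, text))
        else:
            merged.append((None, stripped))
    return merged
```

Two annotation lines separated by an expression stay separate, because they are no longer contiguous.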
The cartesian product
Open Questions
Is this a language that's downloadable anywhere?
