Shake Lexer Token Specification
§ 1 Definition
§ 1.1 Tokens
A token is a sequence of characters that forms a meaningful unit in a program. It holds the following information:
- The type of the token
- The position of the token in the source code (start and end index)
- The value of the token (if applicable)
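
As a minimal sketch of the three fields listed above, a token could be modelled as a small immutable record. The class name `Token`, the field names, and the choice of an inclusive end index are assumptions made for illustration; the specification does not fix any of them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    """One lexical unit carrying type, position, and optional value."""
    type: str                    # token type name, e.g. "SEMICOLON" or "IDENTIFIER"
    start: int                   # index of the first character in the source
    end: int                     # index of the last character (inclusive, by assumption)
    value: Optional[str] = None  # only set when the token text can vary


source = "count;"
ident = Token("IDENTIFIER", 0, 4, "count")  # text differs per token, so it is stored
semi = Token("SEMICOLON", 5, 5)             # text is implied by the type, no value
```

With an inclusive end index, the original text of a token can be recovered as `source[token.start:token.end + 1]`.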
§ 1.2 Token Types
There are two kinds of token types. Some types always have the same value and therefore do not need to store it in the token (e.g. `SEMICOLON` or any keyword). Other types can have different values (e.g. `INTEGER` or `IDENTIFIER`) and store the value in the token. Token types are implemented as an enum in the `lexer` package.
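
The distinction between the two kinds of token types can be sketched with an enum whose members record their fixed text, if any. This is only an illustration in Python, not the enum from the `lexer` package; the attribute names `label`, `fixed_text`, and `has_value` are hypothetical.

```python
from enum import Enum


class TokenType(Enum):
    """Illustrative subset of token types: (description, fixed text or None)."""

    def __init__(self, label, fixed_text):
        self.label = label            # human-readable description
        self.fixed_text = fixed_text  # None when the text varies per token

    # Types with a fixed value: the type alone determines the text.
    SEMICOLON = ("Semicolon", ";")
    KEYWORD_FUN = ("Fun Keyword", "fun")
    # Types with a variable value: the token has to store its own text.
    INTEGER = ("Integer", None)
    IDENTIFIER = ("Identifier", None)

    @property
    def has_value(self):
        """True if tokens of this type need to store a value."""
        return self.fixed_text is None
```

Keeping the description in each member's tuple also keeps the member values distinct, so Python does not collapse `INTEGER` and `IDENTIFIER` into aliases of one another.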
§ 2 Token Types
§ 2.1 "fixed" token types:
Token Type | Description | Value |
---|---|---|
ADD | Addition | + |
ADD_ASSIGN | Addition Assignment | += |
ASSIGN | Assignment | = |
BIT_AND | Bitwise And | & |
BIT_AND_ASSIGN | Bitwise And Assignment | &= |
BIT_NAND | Bitwise Not And | ~& |
BIT_NOR | Bitwise Not Or | ~\| |
BIT_NOT | Bitwise Not | ~ |
BIT_OR | Bitwise Or | \| |
BIT_OR_ASSIGN | Bitwise Or Assignment | \|= |
BIT_XNOR | Bitwise Not Xor | ~^ |
BIT_XOR | Bitwise Xor | ^ |
BIT_XOR_ASSIGN | Bitwise Xor Assignment | ^= |
COLON | Colon | : |
COMMA | Comma | , |
DECR | Decrement | -- |
DIV | Division | / |
DIV_ASSIGN | Division Assignment | /= |
DOT | Dot | . |
EOF | End of File | |
EQ | Equals | == |
GT | Greater Than | > |
GTE | Greater Than or Equal | >= |
INCR | Increment | ++ |
KEYWORD_ABSTRACT | Abstract Keyword | abstract |
KEYWORD_AS | As Keyword | as |
KEYWORD_CLASS | Class Keyword | class |
KEYWORD_CONST | Const Keyword | const |
KEYWORD_CONSTRUCTOR | Constructor Keyword | constructor |
KEYWORD_DO | Do Keyword | do |
KEYWORD_ELSE | Else Keyword | else |
KEYWORD_ENUM | Enum Keyword | enum |
KEYWORD_FALSE | False Keyword | false |
KEYWORD_FINAL | Final Keyword | final |
KEYWORD_FOR | For Keyword | for |
KEYWORD_FUN | Fun Keyword | fun |
KEYWORD_IF | If Keyword | if |
KEYWORD_IMPORT | Import Keyword | import |
KEYWORD_IN | In Keyword | in |
KEYWORD_INLINE | Inline Keyword | inline |
KEYWORD_INSTANCEOF | Instanceof Keyword | instanceof |
KEYWORD_INTERFACE | Interface Keyword | interface |
KEYWORD_NATIVE | Native Keyword | native |
KEYWORD_NULL | Null Keyword | null |
KEYWORD_OBJECT | Object Keyword | object |
KEYWORD_OPERATOR | Operator Keyword | operator |
KEYWORD_OVERRIDE | Override Keyword | override |
KEYWORD_PACKAGE | Package Keyword | package |
KEYWORD_PRIVATE | Private Keyword | private |
KEYWORD_PROTECTED | Protected Keyword | protected |
KEYWORD_PUBLIC | Public Keyword | public |
KEYWORD_RETURN | Return Keyword | return |
KEYWORD_STATIC | Static Keyword | static |
KEYWORD_SUPER | Super Keyword | super |
KEYWORD_SYNCHRONIZED | Synchronized Keyword | synchronized |
KEYWORD_THIS | This Keyword | this |
KEYWORD_TRUE | True Keyword | true |
KEYWORD_VAL | Val Keyword | val |
KEYWORD_VAR | Var Keyword | var |
KEYWORD_WHILE | While Keyword | while |
LCURL | Left Curly Brace | { |
LINE_SEPARATOR | Line Separator | \n |
LOGICAL_AND | Logical And | && |
LOGICAL_NOT | Logical Not | ! |
LOGICAL_OR | Logical Or | \|\| |
LOGICAL_XOR | Logical Xor | ^ |
LPAREN | Left Parenthesis | ( |
LSQBR | Left Square Bracket | [ |
LT | Less Than | < |
LTE | Less Than or Equal | <= |
MOD | Modulo | % |
MOD_ASSIGN | Modulo Assignment | %= |
MUL | Multiplication | * |
MUL_ASSIGN | Multiplication Assignment | *= |
NEQ | Not Equals | != |
POW | Power | ** |
POW_ASSIGN | Power Assignment | **= |
RCURL | Right Curly Brace | } |
RPAREN | Right Parenthesis | ) |
RSQBR | Right Square Bracket | ] |
SEMICOLON | Semicolon | ; |
SUB | Subtraction | - |
SUB_ASSIGN | Subtraction Assignment | -= |
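
Several fixed tokens in the table share a prefix (for example `*`, `*=`, `**`, and `**=`), so a lexer has to decide which one applies at a given position. One common approach, assumed here and not confirmed by this specification, is to take the longest match. The sketch below uses a small subset of the table; the names `FIXED_TOKENS` and `match_fixed` are hypothetical.

```python
# A small subset of the fixed-token table (token text -> type name).
FIXED_TOKENS = {
    "*": "MUL",
    "*=": "MUL_ASSIGN",
    "**": "POW",
    "**=": "POW_ASSIGN",
    "+": "ADD",
    "++": "INCR",
    "+=": "ADD_ASSIGN",
}

# Length of the longest entry, so the search can start with the longest slice.
MAX_LEN = max(len(text) for text in FIXED_TOKENS)


def match_fixed(source, pos):
    """Return (type_name, inclusive_end_index) of the longest fixed token
    starting at pos, or None if no fixed token starts there."""
    for length in range(MAX_LEN, 0, -1):
        text = source[pos:pos + length]
        # Skip slices cut short by the end of the input.
        if len(text) == length and text in FIXED_TOKENS:
            return FIXED_TOKENS[text], pos + length - 1
    return None
```

Under this assumption, `match_fixed("a **= 2", 2)` yields `POW_ASSIGN` rather than `MUL` followed by further tokens.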
§ 2.2 "variable" token types:
Token Type | Description | Value |
---|---|---|
CHARACTER | Character | e.g. 'a' |
FLOAT | Float | e.g. 1.0 |
IDENTIFIER | Identifier | e.g. a |
INTEGER | Integer | e.g. 1 |
STRING | String | e.g. "a" |
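
Keywords and identifiers look alike in the source text, which makes them a convenient example of the difference between the two tables. The sketch below assumes that keywords are reserved, so that a scanned word becomes a `KEYWORD_*` token when it appears in the keyword rows and an `IDENTIFIER` otherwise. The function name `classify_word` and this reservation rule are assumptions, not part of the specification.

```python
# Subset of the keyword rows from the fixed-token table.
KEYWORDS = {"fun", "val", "var", "if", "else", "while", "return"}


def classify_word(word):
    """Return (type_name, value) for an already-scanned word."""
    if word in KEYWORDS:
        # Fixed type: the type name identifies the text, so no value is stored.
        return "KEYWORD_" + word.upper(), None
    # Variable type: the text differs per token, so it is kept as the value.
    return "IDENTIFIER", word
```

For example, `classify_word("fun")` gives a `KEYWORD_FUN` token without a value, while `classify_word("funky")` gives an `IDENTIFIER` token whose value is `"funky"`.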