Sentinel Language Specification
This is the specification for the Sentinel policy language.
The Sentinel language is designed with policy enforcement in mind. It is dynamically typed and garbage collected and has explicit support for rule construction representing boolean logic.
The language is designed to be easy to learn and use by non-programmers. It is expected to be embedded within applications.
Table of Contents
- Source code representation
- Declarations and Scope
- Blocks
- Lexical Elements
- Variables
- Undefined
- Expressions
- Operand
- Primary Expressions
- Null
- Booleans
- Boolean Literals
- List Literals
- Map Literals
- Function Literals
- Rule Expressions
- Index Expressions
- Selectors
- Slice Expressions
- Calls
- Operators
- Arithmetic Operators
- Comparison Operators
- Logical Operators
- Set Operators
- Matches Operator
- Else Operator
- Quantifier Expressions (any, all, filter, map)
- Statements
- Imports
- Parameters
- Built-in Functions
- Program Execution
Source code representation
Source code is Unicode text encoded in UTF-8. A "character" referenced in this document refers to a Unicode code point. Each code point is distinct: upper and lower case letters are different characters.
The underscore character _ (U+005F) is considered a "letter".
Declarations and Scope
A declaration binds an identifier to a value. An identifier is declared at the point it is first assigned. An assignment is considered a first assignment if the identifier has not previously been declared in the current scope or any parent scope.
The scope of an identifier is the extent of source text in which the identifier denotes the specified value. Sentinel is lexically scoped using blocks.
An identifier declared in a block may be redeclared in an inner block. While the identifier of the inner declaration is in scope, it denotes the value assigned in the inner declaration.
Blocks
A block is a possibly empty sequence of statements within matching brace brackets.
Blocks nest and affect scoping.
In addition to explicit blocks in the source code, there are implicit blocks:
- The universe block encompasses all source text.
- Each program has a program block containing all source text for that program.
- Each "any", "all", and "for" statement is considered to be in its own implicit block. "if" does not create an implicit block.
Lexical Elements
Comments
Comments are sections of source text used for documentation.
Three forms of comments are supported: two single-line forms and one multi-line form. A single-line comment begins with the token // or #. Everything between the starting token and the end of the line is ignored.
A multi-line comment begins with the token /* and ends with the token */. Everything between /* and */ is ignored.
A comment may not start inside a string literal or inside another comment.
A multi-line comment containing no newlines acts like a space.
Identifiers
Identifiers name program entities such as rules, variables, and functions. An identifier is a sequence of one or more letters and digits. The first character in an identifier must be a letter.
Pre-Declared Identifiers
The following identifiers are implicitly declared in the universe block:
Keywords
The following keywords are reserved and may not be used as identifiers:
Operators and Delimiters
The following character sequences represent operators, delimiters, and other special tokens:
As a special case, else is both a keyword and operator, depending on the context. See Else Operator for more details.
Integer Literals
An integer literal is a sequence of digits representing an integer constant. An optional prefix sets a non-decimal base: 0 for octal, 0x or 0X for hexadecimal. In hexadecimal literals, letters a-f and A-F represent values 10 through 15.
Integers are signed 64-bit values (-9223372036854775808 to 9223372036854775807).
Floating-point Literals
A floating-point literal is a decimal representation of a floating-point constant. It has an integer part, a decimal point, a fractional part, and an exponent part. The integer and fractional part comprise decimal digits; the exponent part is an e or E followed by an optionally signed decimal exponent. One of the integer part or the fractional part may be elided; one of the decimal point or the exponent may be elided.
Floating-point numbers are IEEE-754 64-bit floating-point numbers.
String Literals
String literals are character sequences between double quotes, as in "bar". Within the quotes, any character may appear except newline and unescaped double quote. The text between the quotes forms the value of the literal.
Multi-character sequences beginning with a backslash encode values in various formats.
Backslash escapes allow arbitrary values to be encoded as ASCII text. There are four ways to represent the integer value as a numeric constant: \x followed by exactly two hexadecimal digits; \u followed by exactly four hexadecimal digits; \U followed by exactly eight hexadecimal digits; and a plain backslash \ followed by exactly three octal digits. In each case the value of the literal is the value represented by the digits in the corresponding base.
After a backslash, certain single-character escapes represent special values:
The three-digit octal (\nnn) and two-digit hexadecimal (\xnn) escapes represent individual bytes of the resulting string; all other escapes represent the (possibly multi-byte) UTF-8 encoding of individual characters. Thus inside a string literal \377 and \xFF represent a single byte of value 0xFF=255, while ÿ, \u00FF, \U000000FF and \xc3\xbf represent the two bytes 0xc3 0xbf of the UTF-8 encoding of character U+00FF.
A string value is a (possibly empty) sequence of bytes. Strings are immutable: once created, it is impossible to change the contents of a string.
The length of a string s (its size in bytes) can be discovered using the built-in function length. An individual character (of type string) can be accessed by integer indices 0 through length(s)-1.
Implicit Line Joining
Expressions can be split over more than one line.
Implicitly continued lines can have trailing comments. Blank continued lines are allowed.
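As an illustrative sketch (the variable name is hypothetical), an expression can continue across lines when each continued line ends with a token that does not trigger semicolon insertion, such as an operator:

```sentinel
total = 1 +     // trailing comment on a continued line
    2 +

    3
// total is 6
```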
Whitespace
Whitespace is needed to separate tokens, but no distinction is made between the number and combination of whitespace characters. Whitespace characters can be space (U+0020), horizontal tabs (U+0009), carriage returns (U+000D), and newlines (U+000A).
Semicolons
The formal grammar uses semicolons ";" as terminators in a number of productions. It is idiomatic Sentinel source to omit semicolons in most cases.
A semicolon is automatically inserted into the token stream immediately after a line's final token if that token is:
- An identifier
- An integer, float, or string literal
- The keyword break, continue, or return
- The delimiter ), ], or }
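The insertion rules above can be sketched as follows. The variable names are hypothetical, and the comments describe where the rules would place a semicolon:

```sentinel
a = 1; b = 2     // explicit semicolons separate statements on one line
c = a + b        // a semicolon is inserted after the identifier b
d = [a,
     b]          // nothing is inserted after the comma; one is inserted after ]
```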
Variables
A variable is a storage location for holding a value.
A variable has a dynamic type, which is the concrete type of the value assigned to the variable at run time. The dynamic type may vary during execution and change as a result of further assignments.
A variable's value is retrieved by referring to the variable in an expression; it is the most recent value assigned to the variable. Referring to a variable that has not yet been declared (initially assigned) is an error.
Undefined
The value denoted by the keyword undefined represents undefined behavior or values. It can be created directly using the keyword undefined. It is also returned as a result of expressions in specified cases.
undefined is a valid operand for any operation. Only the expression undefined or true results in true. All other operations result in undefined. An exception is when undefined is not reached because an operation short-circuits.
If the result of the main rule is undefined, it is treated as false but is indicative of erroneous logic.
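A short sketch of these rules, with hypothetical variable names; the commented results follow from the text above rather than from a recorded run:

```sentinel
a = undefined or true    // true: the one operation on undefined that yields true
b = undefined and true   // undefined
c = false and undefined  // false: short-circuits before undefined is reached
d = undefined + 1        // undefined
```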
Expressions
Operand
Operands denote the elementary values in an expression. An operand may be a literal, an identifier denoting a variable, rule, or function, or a parenthesized expression.
Primary Expressions
Primary expressions are the operands for unary and binary expressions.
Null
The reserved word null denotes the singleton value null. Null represents the explicit absence of a value. The behavior of null within expressions is specified explicitly for each expression.
Booleans
Boolean Literals
The reserved words true and false denote objects that represent the boolean values true and false, respectively. These are boolean literals.
Boolean Expressions
Boolean expressions are expressions that must result in a boolean value. Any other value type becomes the undefined value.
List Literals
A list literal denotes a list, which is an integer indexed collection of values.
A list may contain zero or more values. The number of values in a list is its length.
A list has a set of indices. An empty list has an empty set of indices. A non-empty list has the index set {0...n - 1} where n is the length of the list. Attempting to access a list using an index that is not a member of its set of indices results in the undefined value.
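For illustration, a few list literals written in the bracketed, comma-separated form; all names are hypothetical:

```sentinel
no_items = []
ports = [80, 443, 8080]
mixed = [1, "two", 3.0, [4, 5]]   // elements need not share a type
n = length(ports)                 // 3
beyond = ports[7]                 // undefined: 7 is not in the index set {0...2}
```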
Map Literals
A map literal denotes a map, which is an unordered group of elements indexed by a set of unique keys.
Keys can only be a boolean, numeric, or string type.
The value of a non-existent key is the undefined value.
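A minimal sketch of map literals, using hypothetical names and keys of each permitted type:

```sentinel
no_entries = {}
limits = {"cpu": 4, "memory": 16.5}
labels = {true: "on", false: "off", 1: "one"}   // boolean and numeric keys
cpu = limits["cpu"]       // 4
disk = limits["disk"]     // undefined: the key does not exist
```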
Function Literals
A function literal represents a function.
Function literals are only allowed in the file scope. They may refer to variables defined in surrounding blocks.
A function must terminate with a return statement. If a function has no meaningful return value, it should return undefined.
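The following sketch shows two function literals assigned at file scope. The names scale, double, and log_value are hypothetical:

```sentinel
scale = 2

double = func(x) {
    return x * scale    // refers to a variable from the surrounding block
}

log_value = func(v) {
    print("value:", v)
    return undefined    // no meaningful return value
}

y = double(21)   // 42
```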
Rule Expressions
A rule is an expression that is evaluated lazily and the result is memoized.
If the optional "when" predicate is present, the rule is evaluated only when the "when" boolean expression results in true. If the predicate is false, the rule is not evaluated and returns true. The predicate is evaluated when the rule would be evaluated; it is also lazy and memoized in the same way.
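As a sketch under assumed inputs (is_weekday and count are hypothetical values), rules with and without a "when" predicate might look like this:

```sentinel
is_weekday = true
count = 3

within_limit = rule { count < 5 }

// Evaluated only when is_weekday is true; otherwise the rule yields true.
weekday_limit = rule when is_weekday { count < 2 }

// Neither rule above is evaluated until main refers to it.
main = rule { within_limit and weekday_limit }   // false, since count < 2 fails
```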
Index Expressions
A primary expression of the form a[x] denotes the element of a list or map indexed by x. The value x is called the index or key.
For a of type map:
- If a contains a key x, a[x] is the map value with key x.
- If a does not contain key x, a[x] is the undefined value.
For a of type list:
- x must be an integer
- x must be in the range [-1 * length(a), length(a)-1]
- If x is contained in the set of indices of a, a[x] is the list element at index x. If x is negative, a[x] is equivalent to a[length(a)+x].
- If x is not contained in the set of indices of a, a[x] is the undefined value.
For a of value null:
- a[x] is the undefined value.
Otherwise a[x] is an error.
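The cases above can be sketched as follows; the names are hypothetical, and the commented results are derived from the rules rather than from a recorded run:

```sentinel
items = [10, 20, 30]
first = items[0]      // 10
last = items[-1]      // 30, same as items[length(items)-1]
beyond = items[5]     // undefined

m = {"a": 1}
hit = m["a"]          // 1
miss = m["z"]         // undefined

n = null
v = n[0]              // undefined
```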
Selectors
For a primary expression x, the selector expression x.f denotes the field f of the value x. The identifier f is called the selector. The type of the selector expression is the type of the selector.
As a special case, selectors can be reserved words and keyword operators, but cannot be any other non-identifier element.
Selectors are used to access data from imports. The first primary expression x in x.f denotes the import name. The field f is the selector to access data from the import.
Selectors may also be used to access map data with static keys. They are syntactic sugar over index expressions. math.pi is equivalent to math["pi"] and exhibits the same limitations and behavior as an index expression. Selectors cannot, however, be used to assign values to map data.
Selectors on undefined result in undefined.
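A brief sketch of selectors on map data, using a hypothetical server map:

```sentinel
server = {"name": "web-1", "port": 8080}
a = server.port        // 8080, same as server["port"]
b = server.zone        // undefined: the key does not exist
c = b.region           // undefined: selector on undefined
server["port"] = 9090  // assignment must use an index expression
```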
Slice Expressions
Slice expressions construct a substring or list from a string or list.
The primary expression a[low : high] constructs a substring or list. The indices low and high select which elements of operand a appear in the result. The result has indices starting at 0 and length equal to high - low.
For convenience, any of the indices may be omitted. A missing low index defaults to zero; a missing high index defaults to the length of the sliced operand.
The indices are in range if 0 <= low <= high <= length(a), otherwise they are out of range. If the indices are out of range at run time, the result is the undefined value.
If a is the value null, the result is the undefined value.
If a is any other value type, it is an error.
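For illustration, slices of a string and a list; the names are hypothetical and the results follow from the range rules above:

```sentinel
s = "sentinel"
nums = [1, 2, 3, 4, 5]

a = s[0:4]       // "sent"
b = nums[1:3]    // [2, 3]
c = nums[:2]     // [1, 2]: low defaults to 0
d = nums[3:]     // [4, 5]: high defaults to length(nums)
e = nums[2:9]    // undefined: high exceeds length(nums)
```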
Calls
Given an expression f where f is a function value, the call f(a1, a2, ... an) calls f with arguments a1, a2, ... an. The type of the expression is the type of the result of f. The arguments are evaluated left to right. Arguments are passed by value.
Operators
Operator Precedence
Unary operators have the highest precedence.
Binary operators of the same precedence associate from left to right. For instance, x / y * z is the same as (x / y) * z.
Arithmetic Operators
Arithmetic operators apply to numeric values and yield a result of the same type when both operands are the same.
Arithmetic operations between integer and floating-point types result in a floating-point value with the integer operand treated as a floating-point value for purposes of calculation.
Arithmetic operations between numeric values (integer, floating-point) and non-numeric values (strings, lists) are not permitted.
All five supported arithmetic operators (+, -, *, /, %) apply to both integer and floating-point types; + also applies to strings.
Integer operators
For two integer values x and y, the integer quotient q = x / y and remainder r = x % y satisfy the following relationships: x = q*y + r and |r| < |y|, with x / y truncated towards zero.
As an exception to this rule, if the dividend x is the most negative value for the int type of x (-9223372036854775808), the quotient q = x / -1 is equal to x (and r = 0).
If the divisor is a constant, it must not be zero. If the divisor is zero at run time, an error occurs.
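A worked sketch of truncated division, with hypothetical variable names. Each pair satisfies x = q*y + r:

```sentinel
q1 = 7 / 2      // 3
r1 = 7 % 2      // 1     (7 = 3*2 + 1)
q2 = -7 / 2     // -3    (truncated toward zero, not -4)
r2 = -7 % 2     // -1    (-7 = -3*2 + -1)
q3 = 7 / -2     // -3
r3 = 7 % -2     // 1     (7 = -3*-2 + 1)
```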
Integer overflow
For signed integers, the operations +, -, and * may legally overflow and the resulting value exists and is deterministically defined by the signed integer representation, the operation, and its operands. No exception is raised as a result of overflow.
Floating-point operators
For floating-point numbers, +x is the same as x, while -x is the negation of x. The result of a floating-point division by zero is not specified beyond the IEEE-754 standard.
String Concatenation
Strings can be concatenated using the + operator or the += assignment operator.
String addition creates a new string by concatenating the operands.
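For example, with a hypothetical variable:

```sentinel
greeting = "hello, " + "world"   // "hello, world"
greeting += "!"                  // "hello, world!"
```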
List Concatenation
Lists can be concatenated using the + operator or the += assignment operator.
List addition creates a new list by concatenating the operands.
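For example, with a hypothetical variable:

```sentinel
a = [1, 2] + [3]    // [1, 2, 3]
a += [4, 5]         // [1, 2, 3, 4, 5]
```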
Comparison Operators
Comparison operators compare two operands and yield a boolean value.
In any comparison, the two operands must be of equivalent types. The only exception is integers and floats, which are both considered numeric and may be compared with each other. Any comparison between other non-matching types results in the undefined value.
The equality operators ==, !=, is, and is not apply to operands that are comparable. The ordering operators <, <=, >, and >= apply to operands that are ordered. The operator is behaves identically to ==, and is not behaves identically to !=. These terms and the result of the comparisons are defined as follows:
- Boolean values are comparable. Two boolean values are equal if they are either both true or both false.
- Integer values are comparable and ordered, in the usual way.
- Floating point values are comparable and ordered, as defined by the IEEE-754 standard.
- An integer compared with a floating point value is treated as the converted floating point value.
- String values are comparable and ordered, lexically byte-wise.
- Lists are comparable. Lists are equal if they are of equal length and their corresponding elements are comparable and equal.
- Maps are comparable. Maps are equal if they are of equal length and both their corresponding keys and values are comparable and equal.
If either operand is the undefined value, the result of the expression is the undefined value.
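The comparison rules can be sketched as follows; the names are hypothetical and the results are derived from the rules above:

```sentinel
a = 1 == 1.0           // true: the integer is compared as a float
b = "abc" < "abd"      // true: byte-wise lexical ordering
c = [1, 2] == [1, 2]   // true: equal length and equal elements
d = "a" is "a"         // true: is behaves like ==
e = 1 == "1"           // undefined: non-matching types
f = undefined == 1     // undefined
```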
Emptiness Comparisons
Checking the emptiness of an object can be achieved by using one of the emptiness expressions, is empty or is not empty. Although similar to Comparison Operators in that they yield a boolean value and read as though a comparison is taking place, both is empty and is not empty are evaluated as a multi-word expression.
An emptiness comparison can only be performed against collections, strings or undefined, with the same rules as the built-in length function.
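A small sketch, with hypothetical variable names:

```sentinel
a = [] is empty               // true
b = "" is empty               // true
c = {"k": 1} is not empty     // true
d = undefined is empty        // undefined, following the rules for length
```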
Logical Operators
Logical operators apply to boolean values and yield a boolean result.
Logical operators are evaluated left-to-right and perform short-circuit logic. The right-hand side is not guaranteed to be evaluated if a short-circuit can be performed.
Set Operators
The set operators contains and in test for set inclusion for lists and maps, and substring inclusion for strings.
Set operators may be negated by prefixing the operator with not: not contains and not in. This is equivalent to wrapping the binary expression in a unary not but results in a more readable form.
contains tests if the left-hand collection contains the right-hand value. For lists, this tests equality of any value. For maps, this tests equality of any key. For strings, this tests for substring existence.
in tests if the right-hand collection contains the left-hand value. The behavior is equivalent to contains.
The collection must be a list, map, or string. If it is the undefined value, the result is the undefined value. For any other value, the result is an error.
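For illustration, with hypothetical variable names:

```sentinel
a = [1, 2, 3] contains 2        // true: list value
b = 2 in [1, 2, 3]              // true: same test, operands swapped
c = {"a": 1} contains "a"       // true: maps are tested by key
d = "hello" contains "ell"      // true: substring
e = 4 not in [1, 2, 3]          // true
f = undefined contains 1        // undefined
```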
Matches Operator
The matches operator tests if a string matches a regular expression.
The matches operator may be negated by prefixing the operator with not: not matches. This is equivalent to wrapping the binary expression in a unary not but results in a more readable form.
The left-hand value is the string to test. The right-hand value is a string value representing a regular expression.
The syntax of the regular expression is the same general syntax used by Python, Ruby, and other languages. More precisely, it is the syntax accepted by RE2. The regular expression is not anchored by default; any anchoring must be explicitly specified.
If either operand is undefined, the result is the undefined value. For any other non-string value, the result is an error.
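A brief sketch showing that patterns are unanchored unless anchors are written explicitly; the variable names are hypothetical:

```sentinel
a = "foobar" matches "oba"        // true: matches anywhere in the string
b = "foobar" matches "^bar"       // false: ^ anchors to the start
c = "foobar" matches "^foo.*r$"   // true: anchored at both ends
d = "foobar" not matches "^bar"   // true
```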
Else Operator
Else operators can be used to provide a default value for an expression that may evaluate to the undefined value. Else operators return their left-hand value unless it is undefined, in which case they return their right-hand value.
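For example, supplying a default for a missing map key (the names are hypothetical):

```sentinel
limits = {"cpu": 4}
cpu = limits["cpu"] else 1     // 4: the left-hand value is defined
mem = limits["mem"] else 8     // 8: the left-hand value is undefined
```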
Quantifier Expressions (any, all, filter, map)
Quantifiers are expressions that apply a predicate to a collection (list or map). The purpose of the quantifier is to produce a result based on its particular type.
any and all expressions are existential and universal quantifiers, respectively. any expressions are equivalent to a chain of or and all expressions are equivalent to a chain of and. Both expressions implement short-circuiting equivalent to logical operators.
The map expression is an apply-to-all quantifier and returns a new list for all values in the input collection. The output list will be the same length as the input collection.
The filter expression is a subset quantifier and returns the subset of all values in the input collection for which the supplied predicate applies. This will be zero or more elements, up to the length of the input.
The body of a quantifier expression is a boolean expression.
any returns the boolean value true if any value in the collection expression results in the body expression evaluating to true. If the body expression evaluates to false for all values, the any expression returns false.
all returns the boolean true if all values in the collection expression result in the body expression evaluating to true. If any value in the collection expression results in the body expression evaluating to false, the all expression returns false.
For empty collections, any returns false and all returns true.
map always returns a list, regardless of whether the input collection is itself a list; maps (as in the data type) also return lists when processed through map. Each element is the result of the supplied expression body.
filter returns a list or map, the same type as its input collection. Only the elements for which the expression body evaluated to true will be returned. If an iteration of the expression body results in undefined, the entire result set is undefined.
When the collection is a list, the first identifier is assigned the index and the second identifier is assigned the value. If only one identifier is specified, it is assigned the value.
When the collection is a map, the first identifier is assigned the key and the second identifier is assigned the value. If only one identifier is specified, it is assigned the key.
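The four quantifiers can be sketched as follows. The "collection as identifiers { body }" form is the usual Sentinel syntax, which this text does not show, and all names are hypothetical:

```sentinel
nums = [1, 2, 3, 4]
ports = {"http": 80, "https": 443}
no_items = []

has_big = any nums as n { n > 3 }        // true
all_pos = all nums as n { n > 0 }        // true
doubled = map nums as n { n * 2 }        // [2, 4, 6, 8]
big     = filter nums as n { n > 2 }     // [3, 4]

// With two identifiers on a map, the first is the key and the second the value.
secure  = filter ports as name, port { port is 443 }   // {"https": 443}

any_none = any no_items as n { n > 0 }   // false for an empty collection
all_none = all no_items as n { n > 0 }   // true for an empty collection
```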
Statements
Expression Statements
Function call expressions may exist in the statement context.
Assignments
Assignments set the value of an expression to the value of another expression.
The assignment x op= y is equivalent to x = x op (y) for supported values of op.
Assignments to lists and maps can be carried out using index expressions, with the operands of the index expression being evaluated alongside the right hand side expression in a right-to-left (RHS, LHS) order.
In addition to the rules detailed in the section describing their use, the following rules apply when assigning to maps or lists using index expressions:
- The variable must exist and be a list or map. Attempts to use an index assignment on an unknown variable, or a variable that is not a list or map, result in a runtime error.
- Assigning to a list or map index that already exists overwrites that index's data with the evaluated right-hand side's value.
- Assigning to an unknown key in a map creates a key for that map with the evaluated right hand side's value assigned to that key.
- Attempting to assign a value to a list index that is out of range results in a runtime error.
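A short sketch of these assignment rules, using hypothetical names:

```sentinel
counts = {}
counts["a"] = 1      // creates key "a"
counts["a"] = 2      // overwrites the existing key

items = [1, 2]
items[0] = 10        // items is now [10, 2]
// items[5] = 1      // would be a runtime error: index out of range

n = 1
n += 2               // same as n = n + (2); n is now 3
```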
If Statements
If statements specify the conditional execution of two branches according to the value of a boolean expression. If the expression evaluates to true, the "if" branch is executed, otherwise, if present, the "else" branch is executed.
Case Statements
Case statements specify a selection control mechanism to conditionally execute a branch based on expression equality.
Each clause that wishes to provide an expression must use the "when" keyword. Multiple expressions can be provided, separated with a comma (",").
There can be one, optional, "else" clause within a case statement. The "else" clause is executed if all other clauses failed to run.
The clauses are evaluated in the order they are listed, with the first successful evaluation being executed.
A case statement can optionally provide an expression. If no expression is provided, it is the equivalent of writing case true {.
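A sketch of a case statement inside a hypothetical function. The "when ...:" and "else:" clause layout follows common Sentinel usage and is an assumption here, since this text does not show the grammar:

```sentinel
describe = func(code) {
    label = ""
    case code {
    when 200, 201:          // several expressions, separated by commas
        label = "ok"
    when 404:
        label = "not found"
    else:                   // runs only if no other clause did
        label = "other"
    }
    return label
}

a = describe(201)   // "ok"
b = describe(500)   // "other"
```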
For Statements
A for statement specifies repeated execution of a block.
The expression must evaluate to a list or a map.
When iterating over a list, the first identifier is assigned the index and the second identifier is assigned the value. If only one identifier is specified, it is assigned the value.
When iterating over a map, the first identifier is assigned the key and the second identifier is assigned the value. If only one identifier is specified, it is assigned the key.
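For illustration, loops over a list and a map with one and two identifiers. The names are hypothetical, and the "as" form is the usual Sentinel syntax, not shown in this text:

```sentinel
nums = [10, 20, 30]
ports = {"http": 80, "https": 443}

total = 0
for nums as v {          // one identifier on a list: the value
    total += v
}
// total is 60

index_sum = 0
for nums as i, v {       // two identifiers on a list: index, then value
    index_sum += i
}
// index_sum is 3

for ports as name {      // one identifier on a map: the key
    print(name)
}
```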
Break Statements
A break statement terminates execution of the innermost for statement within the same function.
Continue Statements
A continue statement begins the next iteration of the innermost for loop. The for loop must be within the same function.
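A minimal sketch combining break and continue to find the index of the first value of at least 10; the names are hypothetical:

```sentinel
nums = [4, 8, 15, 16]
first_big = -1

for nums as i, v {
    if v < 10 {
        continue      // skip to the next iteration
    }
    first_big = i
    break             // leave the loop at the first match
}
// first_big is 2
```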
Return Statements
A return statement in a function F terminates the execution of F and provides the result value.
A function must terminate with a return statement.
The return statement can be invoked at any time during a function to terminate execution.
Imports
An import binds an identifier in the global scope to external definitions.
The import declarations must only appear at the top of the source file, before any other statements. Comments may appear before imports.
Import names and identifiers must each be distinct. A name may not be repeated even if it is declared with a different identifier.
If an identifier is not specified, the identifier is the name of the import itself.
An import is not a valid value. The identifier representing the import may not be used as an expression, such as in assignment statements or call expressions.
Values in imports are not assignable directly via their selectors. Function calls may be used to alter the state of the external data represented by the import. Whether or not this is supported is specific to the import implementation. The exception to this assignment restriction is the memoized data returned by a call to an import (see below).
Data returned by imports can be memoized, meaning that data is returned by the import in the form of a map which is then used as much as possible without having to return to the import for more data. For example, t = time.now returns a map so that t.hour can be looked up without having to go back to the import for that data.
Data returned by imports may be callable, meaning that methods may be called on the memoized data returned by an import. For example, given t = time.now, t.after(some_previous_time) is valid, with the receiver data being set to the local memoized data stored in t. It is a valid case to accept modifications to the receiver data (example: assignment to the memoized data via index expressions), as long as all data in the receiver continues to be string-keyed. It is an invalid case to accept receiver data with non-string-typed keys. Methods may also alter the contents of receiver data so that it is different after the call is finished.
As with methods, imports may also have callable keys where the data may not be memoized. Example: in the event of v = foo.bar, if v.baz is not included in the memoized data, a call will be made to the import to fetch it. The same rules and restrictions to receiver data apply to callable keys as do to method calls.
The support for callable methods and non-memoized keys on data returned by imports is specific to the import implementation.
The handling of import loading is specific to the runtime implementation.
Implicit imports may be added by the runtime, for example for standard library definitions. An explicit import of the same name should override a matching implicit import. For example, if "time" is an implicit import and the program specifies import "time" as cal, then only cal is defined.
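A sketch of import declarations and memoized data. The "strings" import and the identifier str are hypothetical, and whether time.now is available depends on the runtime:

```sentinel
import "time"              // bound to the identifier time
import "strings" as str    // bound to the identifier str

t = time.now               // memoized map returned by the import
h = t.hour                 // looked up in the local data, not the import
```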
Parameters
A parameter is a way for a policy to describe variables that are expected to be supplied by the calling application, in addition to any default value.
Parameter declarations must appear at the top of the source file, following imports.
Like imports, comments may appear before parameters; this is mainly pertinent when the policy is not importing anything, which would make parameters the first declarations that appear in a policy.
A parameter declaration describes a valid variable that is added to the scope of a policy at runtime. This variable has the same properties and rules as other variables assigned during the course of policy evaluation as defined by the specification. This includes having a dynamic type and being able to be re-assigned during execution.
Parameters may only be strings, integers, floating point numbers, booleans, lists, or maps. More complex types such as rules or functions are not allowed.
A parameter can specify a default value by adding one in the declaration via use of the "default" keyword, and supplying a literal for the default value. The literal for a default is a very limited expression, consisting of:
- For strings, a simple string literal;
- For integers and floating-point numbers, either a literal denoting the value, or a unary expression with the literal, and the operators - or +;
- For booleans, the pre-declared identifiers true or false;
- For lists and maps, their respective literals composed of values and keys (for maps) of the above values only.
Any other type of expression or identifier is not allowed.
A parameter that does not have a default value defined is required by the policy. Evaluating a policy with a required parameter that has not been supplied at execution results in a runtime error.
Parameters must not conflict with imports or any other identifiers currently defined in the global scope, including any reserved or pre-declared identifiers. Attempting to assign a parameter to an existing value results in a runtime error.
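The declaration forms can be sketched as follows. The param keyword is the usual Sentinel spelling, which this text does not show, and the parameter names are hypothetical:

```sentinel
param environment                                   // required: no default
param max_instances default 5
param threshold default -0.5                        // unary - on a literal
param enabled default true
param regions default ["us-east-1", "us-west-2"]
param quotas default {"cpu": 4, "memory": 16}
```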
Built-in Functions
Built-in functions are predeclared. They are called like any other function.
Length
The built-in function length returns the length of a collection or string.
The length of a string is the number of bytes in the string. It is not necessarily the number of characters.
The length of a collection is the number of elements in that collection.
The length of undefined is undefined.
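For example, with hypothetical variable names:

```sentinel
a = length("abc")          // 3
b = length("ÿ")            // 2: one character, two UTF-8 bytes
c = length([1, 2, 3])      // 3
d = length({"k": "v"})     // 1
e = length(undefined)      // undefined
```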
Collections
List Append
The built-in function append appends a value to the end of a list in-place.
Appending to a non-list value results in an immediate fail.
The return value of append is always the undefined value.
Map Delete
The built-in function delete deletes elements from a map by key. The map is modified in-place.
Deleting a key that does not exist does nothing.
Calling delete for a non-map value results in an immediate fail.
The return value of delete is always the undefined value.
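A sketch of both in-place collection functions. The argument order (collection first) is assumed from common Sentinel usage, and the names are hypothetical:

```sentinel
items = [1, 2]
append(items, 3)          // items is now [1, 2, 3]

quotas = {"cpu": 4, "memory": 16}
delete(quotas, "cpu")     // quotas is now {"memory": 16}
delete(quotas, "disk")    // no such key: nothing happens
```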
Keys and Values
The built-in functions keys and values return the keys and values of a map, respectively. The return value is a list. Both functions return values in an unspecified order.
The keys or values of undefined is undefined.
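For example, with hypothetical names; the element order of both results is unspecified:

```sentinel
quotas = {"cpu": 4, "memory": 16}
k = keys(quotas)       // a list holding "cpu" and "memory"
v = values(quotas)     // a list holding 4 and 16
u = keys(undefined)    // undefined
```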
Range
The built-in function range returns a list of numbers in a range.
There are three ways to call this function: range(end), range(start, end), and range(start, end, step).
The start is inclusive, the end is exclusive.
If start is not provided, it defaults to 0. If step is not provided, it defaults to 1.
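A sketch of the three call forms, assuming the argument order start, end, step:

```sentinel
a = range(5)          // [0, 1, 2, 3, 4]
b = range(2, 5)       // [2, 3, 4]
c = range(0, 10, 3)   // [0, 3, 6, 9]
```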
Type Conversion
The built-in functions int, float, string, and bool convert a value to a value of that type according to the rules below.
For int:
- Integer values are unchanged
- String values are converted according to the syntax of integer literals
- Float values are rounded down to their nearest integer value
- Boolean values are converted to 1 for true, and 0 for false
For float:
- Float values are unchanged
- Integer values are converted to the nearest equivalent floating point value
- String values are converted according to the syntax of float literals
- Boolean values are converted to 1.0 for true, and 0.0 for false
For string:
- String values are unchanged
- Integer values are converted to the base 10 string representation
- Float values are converted to a string formatted xxx.xxx with a precision of 6. This is equivalent to %f for C's sprintf.
- Boolean values are converted to "true" for true, and "false" for false
For bool:
- The following string values convert to true: "1", "t", "T", "TRUE", "true", and "True"
- The following string values convert to false: "0", "f", "F", "FALSE", "false", and "False"
- Any non-zero integer or float value converts to true
- Any zero integer or float value converts to false
For any other unspecified type, the result is the undefined value.
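A few conversions sketched from the rules above; the variable names are hypothetical and the results are derived from the text rather than from a recorded run:

```sentinel
a = int("42")       // 42
b = int(3.9)        // 3: rounded down
c = int(true)       // 1
d = float(2)        // 2.0
e = string(42)      // "42"
f = string(1.5)     // "1.500000": precision of 6, as with %f
g = bool("T")       // true
h = bool(0)         // false
i = bool([1])       // undefined: lists are not a specified input type
```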
Printing
The built-in function print can be used to output formatted debug information. The runtime may omit print function calls depending on its execution mode. The runtime may choose where the output of a print function goes.
print is a variadic function, accepting one or more values to be printed; values are separated by a single whitespace character (" ").
The return value of print is always true to allow it to be used in boolean expressions.
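As a sketch with a hypothetical variable x, print can stand alone as a statement or sit inside a boolean expression:

```sentinel
x = 5
print("x is", x)     // outputs: x is 5

// print returns true, so it does not change the result of the rule.
main = rule { print("checking x") and x > 0 }
```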
Errors
The built-in function error is equivalent to print but immediately causes execution to halt and the main rule to evaluate to false. The program execution is considered to have failed.
Program Execution
Program execution begins by evaluating the source top to bottom. If evaluation is successful, the main rule is evaluated and its result is returned. Execution can be interrupted at any time by an error, which can be called explicitly or implicitly through illegal or undefined behavior.
The content of this page is licensed under the CC BY 3.0.
Based on The Go Programming Language Specification licensed under CC BY 3.0.