Riff is a dynamically-typed general-purpose programming language designed primarily for prototyping and command-line usage. Riff offers a syntax familiar to users of C-style languages, along with some extra conveniences, aiming to be a useful supplementary tool for programmers.
Riff is offered as a standalone interpreter, riff.
Synopsis/Options
riff [options] program.rf [argument ...]
By default, riff opens and runs the file program.rf.
-e 'program'
Interpret and execute the string program as a Riff program.
-h
Print usage information and exit.
-l
Produce a listing of the compiled bytecode and associated mnemonics.
-v
Print version information and exit.
--
Stop processing command-line options.
-
Stop processing command-line options and execute stdin.
Overview
Riff is dynamically-typed. Identifiers/variables do not contain explicit type information and the language has no syntactic constructs for specifying types. Values, however, are implicitly typed, carrying their own type information.
All Riff values are first-class, meaning values of any type can be stored in variables, passed around as function arguments or returned as results from function calls.
Internally, a Riff value can be any of the following types:
- null
- Integer
- Float
- String
- Regular expression
- Range
- Table
- File handle
- Riff function (user-defined)
- C function (built-in functions)
null is a special value in Riff, typically representing the absence of a value. null is different than 0, 0.0 or the empty string ("").
Numbers in Riff can be integers or floats. Integers in Riff are signed 64-bit by default (int64_t in C). Floats in Riff are identical to a C double by default. Integer to float conversion (and vice versa) is performed implicitly depending on the operation and is designed to be completely transparent to the user.
Strings in Riff are immutable sequences of 8-bit character literals.
Regular expressions in Riff define patterns which are used for performing various string-searching operations.
Ranges are a special “subtype” in Riff that allow the user to define a range of integral values with an optional specified interval. Ranges can be used in for loops to iterate through a sequence of numbers or in string subscripting to easily extract different types of substrings.
Tables are the single compound data structure available in Riff. Table elements can be any type of Riff value. Storing null as a table element effectively deletes that key/value pair.
Tables in Riff are associative arrays. Any type of Riff value (even null) is a valid key for a given table element.
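As a small illustration of the deletion rule, the sketch below assumes the table constructor and the prefix length operator (#) behave as described in the later sections of this document. The table name t is arbitrary.

```riff
t = { "a", "b", "c" }   // keys 0, 1 and 2
print(#t)               // 3
t[1] = null             // storing null removes the pair stored under key 1
print(#t)               // 2
```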
User-defined and built-in functions are treated just as any other value.
Language
Basic Concepts
A Riff program is a sequence of statements. Riff has no concept of statement terminators. The lexical analysis phase does not perform implicit semicolon insertion. A statement ends when the next lexical token in the token stream is not applicable to the current statement.
Variables are global by default. Riff allows local variable usage by explicitly declaring a variable with the local keyword. Riff also allows the use/access of uninitialized variables. When an uninitialized variable is used, Riff reserves the variable with global scope and initializes it to null. Depending on the context, the variable may also be initialized to 0 or an empty table. Riff does not allow uninitialized variables to be called as functions.
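The scoping rules above can be sketched as follows. This is a minimal example under the assumption that a block-level local behaves as described in the section on local statements; the names x and y are arbitrary.

```riff
x = 1                // global by default
if 1 {
    local y = 2      // visible only inside this block
    x += y           // the global `x` is still reachable here
}
print(x)             // 3
print(type(y))       // null: this `y` is a separate, uninitialized global
```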
Comments
Riff supports C++-style line comments with //, signaling to the interpreter to ignore everything starting from // to the end of the current line. Riff also supports C-style block comments in the form of /*...*/; Riff will ignore everything following /* until it reaches */.
// This is a comment
/* This is also
a comment
*/
Constants and Literals
Numerals
Any string of characters beginning with a digit (0..9) will be interpreted as a numeric constant. A string of characters will be interpreted as part of a single numeral until an invalid character is reached. Numerals can be integers or floating-point numbers in decimal or hexadecimal form. Numbers with the prefix 0x or 0X will be interpreted as hexadecimal. Valid hexadecimal characters can be any mix of lowercase and uppercase digits A through F.
23 // Decimal integer constant
6.7 // Decimal floating-point constant
.5 // Also a decimal floating-point constant (0.5)
9. // 9.0
0xf // Hexadecimal integer constant
0XaB // Valid hexadecimal integer (mixed lowercase and uppercase)
0x.8 // Hexadecimal floating-point constant
Riff supports numbers written in exponent notation. For decimal numbers, an optional decimal exponent part (marked by e or E) can follow an integer or the optional fractional part. For hexadecimal numbers, a binary exponent part can be indicated with p or P.
45e2 // 4500
0xffP3 // 2040
0.25e-4 // 0.000025
0X10p+2 // 64
Riff supports integers in binary form. Numeric literals with the prefix 0b or 0B will be interpreted as base-2. Riff does not support floating point numbers with the binary (0b) prefix.
0b1101 // 13 in binary
Additionally, Riff supports arbitrary underscores in numeric literals. Any number of underscores can appear between digits.
Some valid examples:
1_2
12_
1_2_
1__2_
300_000_000
0x__80
45_e2
0b1101_0011_1010_1111
Some invalid examples:
_12   // Will be parsed as an identifier
0_x80 // Underscore cannot be between `0` and `x`
Characters
Riff supports character literals enclosed in single quotation marks ('). Riff currently interprets character literals strictly as integer constants.
'A' // 65
'π' // 960
Multicharacter literals are also supported. The multicharacter sequence creates an integer where successive bytes are right-aligned and zero-padded in big-endian form.
'abcd' // 0x61626364
'abcdefgh' // 0x6162636465666768
'\1\2\3\4' // 0x01020304
In the event of overflow, only the lowest 64 bits will remain in the resulting integer.
'abcdefghi' // 0x6263646566676869 ('a' overflows)
Similar to strings, Riff supports the use of the backslash character (\) to denote C-style escape sequences.
Character | ASCII code (hex) | Description |
---|---|---|
a | 07 | Bell |
b | 08 | Backspace |
e | 1B | Escape |
f | 0C | Form feed |
n | 0A | Newline/Line feed |
r | 0D | Carriage return |
t | 09 | Horizontal tab |
v | 0B | Vertical tab |
' | 27 | Single quote |
\ | 5C | Backslash |
Riff also supports arbitrary escape sequences in decimal and hexadecimal forms.
Sequence | Description |
---|---|
\nnn | Octal escape sequence with up to three octal digits |
\xnn | Hexadecimal escape sequence with up to two hexadecimal digits |
Strings
String literals are denoted by matching enclosing double quotation marks ("). String literals spanning multiple lines will have the newline characters included. Alternatively, a single backslash (\) can be used in a string literal to indicate that the following newline be ignored.
"Hello, world!"
"String spanning
multiple
lines"
"String spanning \
multiple lines \
without newlines"
In addition to the escape sequences outlined in the characters section, Riff also supports escaped Unicode literals in the following forms.
Sequence | Description |
---|---|
\uXXXX | Unicode escape sequence with up to 4 hexadecimal digits |
\UXXXXXXXX | Unicode escape sequence with up to 8 hexadecimal digits |
"\u3c0" // "π"
"\U1d11e" // "𝄞"
Riff also supports interpolation of variables/expressions in string literals (aka string interpolation). Expressions can be delimited by either braces ({}) or parentheses (()). The full expression grammar is supported within an interpolated expression.
x = "world"
str = "Hello #x!" // "Hello world!"
sum = "#{1+2} == 3"
mul = "square root of 4 is #(sqrt(4))"
Regular Expressions
Regular expression (or “regex”) literals are denoted by enclosing forward slashes (/) followed immediately by any options.
/pattern/
Riff implements Perl Compatible Regular Expressions via the PCRE2 library. The pcre2syntax and pcre2pattern man pages outline the full syntax and semantics of regular expressions supported by PCRE2. Riff enables the PCRE2_DUPNAMES and PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL options when compiling regular expressions, which allows duplicated names in capture groups and ignores invalid or malformed escape sequences, treating them as literal single characters.
Regular expression literals in Riff also support the same Unicode escape sequences as string literals (\uXXXX or \UXXXXXXXX).
Compile-time options are specified as flags immediately following the closing forward slash. Riff will consume flags until it reaches a non-flag character. Available options are outlined below.
Flag | Description |
---|---|
A | Force pattern to become anchored to the start of the search, or the end of the most recent match |
D | $ matches only the end of the subject string; ignored if m is enabled |
J | Allow names in named capture groups to be duplicated within the same pattern (enabled by default) |
U | Invert the greediness of quantifiers. Quantifiers become ungreedy by default, unless followed by a ? |
i | Case-insensitive matching |
m | ^ and $ match newlines in the subject string |
n | Disable numbered capturing in parenthesized subpatterns (named ones still available) |
s | Treat the entire subject string as a single line; . matches newlines |
u | Enable Unicode properties for character classes |
x | Ignore unescaped whitespace in patterns, except inside character classes, and allow line comments starting with # |
xx | Same as x, but ignore unescaped whitespace inside character classes |
/PaTtErN/i // Caseless matching
// Extended forms - whitespace and line comments ignored
// Equivalent to /abc/
/abc # match "abc"/x
// Equivalent to /add|sub|mul|div/
/ add # Addition
| sub # Subtraction
| mul # Multiplication
| div # Division
/x
Keywords
The following keywords are reserved for syntactic constructs and not re-definable by the user.
and fn not
break for null
continue if or
do in return
elif loop while
else local
Variables
A variable represents a place to store a value in a Riff program. Variables can be global or local in scope.
A valid identifier is a string of characters beginning with a lowercase letter (a..z), uppercase letter (A..Z) or underscore (_). Numeric characters (0..9) are valid in identifiers, but not as a starting character.
Statements
break
break is a control-flow construct which will immediately exit the current loop when reached. break is invalid outside of a loop structure; riff will throw an error when trying to compile a break statement outside of a loop.
while 1 {
print("This will print")
break
print("This will not print")
}
// program control transfers here
continue
A continue statement causes the program to skip the remaining portion of the current loop, jumping to the end of the loop body. Like break, continue is invalid outside of a loop structure; riff will throw an error when trying to compile a continue statement outside of a loop.
do {
  // ...
  continue
  // ...
  // `continue` jumps here
} while 1
for x in y {
// ...
continue
// ...
// `continue` jumps here
}
while 1 {
// ...
continue
// ...
// `continue` jumps here
}
do
do_stmt = 'do' stmt 'while' expr
| 'do' '{' stmt_list '}' 'while' expr
A do statement declares a do-while loop structure, which repeatedly executes the statement or brace-enclosed list of statements until the expression following the while keyword evaluates to 0.
Like all loop structures in Riff, the statement(s) inside a loop body establish their own local scope. Any locals declared inside the loop body are not accessible outside of the loop body. The while expression in a do-while loop is considered to be outside the loop body.
A do statement declared without a while condition is invalid and will cause an error to be thrown upon compilation.
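A minimal sketch of a do-while loop, built from the grammar above. The counter i is global, so the while expression (which sits outside the loop body) can see it.

```riff
i = 0
do {
    print(i)    // prints 0, then 1, then 2
    i++
} while i < 3
```

The body always runs at least once, because the condition is only checked after each pass.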
elif
Syntactic sugar for else if. See if statements.
else
See if statements.
fn
fn_stmt = 'fn' id ['(' [id {',' id}] ')'] '{' stmt_list '}'
A function statement declares the definition of a named function. This is in contrast to an anonymous function, which is parsed as part of an expression statement.
fn f(x) {
return x ** 2
}
fn g() {
return 23.4
}
// Parentheses not required for functions without parameters
fn h {
return "Hello"
}
More information on user-defined functions in Riff can be found in the Functions section.
for
for_stmt = 'for' id [',' id] 'in' expr stmt
| 'for' id [',' id] 'in' expr '{' stmt_list '}'
A for statement declares a generic loop structure which iterates over the item(s) in the expr result value. There are two general forms of a for loop declaration:
for v in s {...}
for k,v in s {...}
In the first form, the value s is iterated over. Before each iteration, the variable v is populated with the value of the next item in the set.
In the second form, the value s is iterated over. Before each iteration, the variable k is populated with the key, while variable v is populated with the value of the next item in the set.
In both forms, the variables k and v are local to the inner loop body. Their values cannot be accessed once the loop terminates.
table = { "foo", "bar", "baz" }
// This iterates over each item in `table`, populating `k` with the current
// table index, and `v` with the corresponding table element
for k,v in table {
// First iteration: k = 0, v = "foo"
// Second iteration: k = 1, v = "bar"
// Third iteration: k = 2, v = "baz"
}
Note that the value to be iterated over is evaluated exactly once. A copy of the value is made upon initialization of a given iterator. This avoids an issue where a user continually adds items to a given set, effectively causing an infinite loop.
The order in which tables are iterated over is not guaranteed to be in-order for integer keys due to the nature of the table implementation. However, in most cases, tables will be traversed in order for integer keys \(0..n\) where \(n\) is the last element in a contiguous table. If a table is constructed using the constructor syntax, it is guaranteed to be traversed in-order, so long as no other keys were added. Even if keys were added, tables are typically traversed in-order. Note that negative indices will always come after integer keys \(\geqslant 0\).
The value to be iterated over can be any Riff value, except functions. For example, iterating over an integer n will populate the provided variable with the numbers \([0..n]\) (inclusive of n). n can be negative.
// Equivalent to `for (i = 0; i <= 10; ++i)`
for i in 10 {
// ...
}
// Equivalent to `for (i = 0; i >= -10; --i)`
for i in -10 {
// ...
}
Iterating over an integer n while using the k,v syntax will populate v with \([0..n]\), while leaving k as null.
Currently, floating-point numbers are truncated to integers when used as the expression to iterate over.
Iterating over a string is similar to iterating over a table.
for k,v in "Hello" {
// k = 0, v = "H"
// k = 1, v = "e"
// ...
// k = 4, v = "o"
}
if
if_stmt = 'if' expr stmt {'elif' expr ...} ['else' ...]
| 'if' expr '{' stmt_list '}' {'elif' expr ...} ['else' ...]
An if statement conditionally executes code based on the result of expr. If expr evaluates to a non-zero, non-null value, the succeeding statement or list of statements is executed. Otherwise, the code is skipped.
If an else statement is provided following an if statement, the code in the else block is only executed if the if condition evaluated to zero or null. An else statement always associates to the closest preceding if statement.
Any statement between an if and an elif or else statement is invalid; Riff will throw an error when compiling an else statement not attached to an if or elif.
elif is syntactic sugar for else if. Riff allows either syntax in a given if construct.
// `elif` and `else if` used in the same `if` construct
x = 2
if x == 1 {
  ...
} elif x == 2 {
  ...
} else if x == 3 {
  ...
} else {
  ...
}
local
local_stmt = 'local' expr {',' expr}
| 'local' fn_stmt
local declares a variable visible only to the current block and any descending code blocks. Multiple variables can be declared as local with a comma-delimited expression list, similar to expression lists in expression statements.
A local variable can reference a variable in an outer scope of the same name without altering the outer variable.
a = 25
if 1 {
  local a = a // Newly declared local `a` will be 25
  a += 5
  print(a) // Prints 30
}
print(a) // Prints 25
loop
loop_stmt = 'loop' stmt
| 'loop' '{' stmt_list '}'
A loop statement declares an unconditional loop structure, where statement(s) inside the body of the loop are executed repeatedly. This is in contrast to conditional loop structures in Riff, such as do, for or while, where some condition determines whether the loop continues.
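Since a loop statement has no condition of its own, the body has to leave it explicitly. The sketch below uses break for that; the counter n is an arbitrary global.

```riff
n = 0
loop {
    if ++n > 3 {
        break       // the only way out of this loop
    }
    print(n)        // prints 1, then 2, then 3
}
```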
return
ret_stmt = 'return' [expr]
A return statement is used for returning control from a function with an optional value.
The empty return statement highlights a pitfall with Riff’s grammar. Consider the following example.
if x == 1
  return
x++
At first glance, this code appears to return control with no value if x equals 1, or otherwise increment x and continue execution. However, when Riff parses the stream of tokens above, it will consume the expression x++ as part of the return statement.
This type of pitfall can be avoided by appending a semicolon (;) to return or enclosing the statement(s) following the if conditional in braces.
if x == 1
  return;
x++
if x == 1 {
  return
}
x++
while
while_stmt = 'while' expr stmt
| 'while' expr '{' stmt_list '}'
A while statement declares a simple loop structure where the statement(s) following the expression expr are repeatedly executed until expr evaluates to 0.
Like all loop structures in Riff, the statement(s) inside a loop body establish their own local scope. Any locals declared inside the loop body are not accessible outside of the loop body. The expression following while has no access to any locals declared inside the loop body.
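A short countdown illustrates the form. Because the while expression cannot see locals declared in the body, the counter i is kept global in this sketch.

```riff
i = 3
while i > 0 {
    print(i)    // prints 3, then 2, then 1
    i--
}
```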
Expression Statements
Any expression not part of another syntactic structure such as if or while is an expression statement. Expression statements in Riff are simply standalone expressions which will invoke some side-effect in the program.
Expression statements can also be a comma-delimited list of expressions.
Expressions
Operator(s) | Description | Associativity | Precedence |
---|---|---|---|
= | Assignment | Right | 1 |
?: | Ternary conditional | Right | 2 |
.. | Range constructor | Left | 3 |
\|\| or | Logical OR | Left | 4 |
&& and | Logical AND | Left | 5 |
== != | Relational equality, inequality | Left | 6 |
~ !~ | Match, negated match | Left | 6 |
< <= > >= | Relational comparison \(<\), \(\leqslant\), \(>\) and \(\geqslant\) | Left | 7 |
\| | Bitwise OR | Left | 8 |
^ | Bitwise XOR | Left | 9 |
& | Bitwise AND | Left | 10 |
<< >> | Bitwise left shift, right shift | Left | 11 |
# | Concatenation | Left | 11 |
+ - | Addition, subtraction | Left | 12 |
* / % | Multiplication, division, modulus | Left | 13 |
! not | Logical NOT | Right | 13 |
# | Length | Right | 13 |
+ - | Unary plus, minus | Right | 13 |
~ | Bitwise NOT | Right | 13 |
** | Exponentiation | Right | 15 |
++ -- | Prefix increment, decrement | Right | 15 |
() | Function call | Left | 16 |
[] | Subscripting | Left | 16 |
. | Member access | Left | 16 |
++ -- | Postfix increment, decrement | Left | 16 |
$ | Field table subscripting | Right | 17 |
Riff also supports the following compound assignment operators, with the same precedence and associativity as simple assignment (=):
+= |=
&= **=
#= <<=
/= >>=
%= -=
*= ^=
Arithmetic Operators
Operator | Type(s) | Description |
---|---|---|
+ | Prefix, Infix | Numeric coercion, Addition |
- | Prefix, Infix | Negation, Subtraction |
* | Infix | Multiplication |
/ | Infix | Division |
% | Infix | Modulus |
** | Infix | Exponentiation |
++ | Prefix, Postfix | Increment by 1 |
-- | Prefix, Postfix | Decrement by 1 |
Bitwise Operators
Operator | Type | Description |
---|---|---|
& | Infix | Bitwise AND |
\| | Infix | Bitwise OR |
^ | Infix | Bitwise XOR |
<< | Infix | Bitwise left shift |
>> | Infix | Bitwise right shift |
~ | Prefix | Bitwise NOT |
Logical Operators
Operator | Type | Description |
---|---|---|
! not | Prefix | Logical NOT |
&& and | Infix | Logical AND |
\|\| or | Infix | Logical OR |
The operators || and && are short-circuiting. For example, in the expression lhs && rhs, rhs is evaluated only if lhs is “truthy.” Likewise, in the expression lhs || rhs, rhs is evaluated only if lhs is not “truthy.”
Values which evaluate as “false” are null, 0 and the empty string ("").
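The short-circuit behavior can be observed with a function that has a visible side effect. In this sketch, noisy() is a hypothetical helper defined only for illustration.

```riff
fn noisy() {
    print("noisy() was called")
    return 1
}
a = 0 && noisy()    // lhs is not truthy, so noisy() is not called
b = 1 || noisy()    // lhs is truthy, so noisy() is not called
c = 1 && noisy()    // lhs is truthy, so noisy() is called once
```

Only the last assignment prints a message.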
Relational Operators
Operator | Type | Description |
---|---|---|
== | Infix | Equality |
!= | Infix | Inequality |
< | Infix | Less-than |
<= | Infix | Less-than or equal-to |
> | Infix | Greater-than |
>= | Infix | Greater-than or equal-to |
Assignment Operators
The following assignment operators are all binary infix operators.
Operator | Description |
---|---|
= | Simple assignment |
+= | Assignment by addition |
-= | Assignment by subtraction |
*= | Assignment by multiplication |
/= | Assignment by division |
%= | Assignment by modulus |
**= | Assignment by exponentiation |
&= | Assignment by bitwise AND |
\|= | Assignment by bitwise OR |
^= | Assignment by bitwise XOR |
<<= | Assignment by bitwise left shift |
>>= | Assignment by bitwise right shift |
#= | Assignment by concatenation |
Ternary Conditional Operator
The ?: operator behaves as it does in other C-style languages.
condition ? expr-if-true : expr-if-false
The expression in between ? and : in the ternary conditional operator is treated as if parenthesized. You can also omit the middle expression entirely.
x ?: y // Equivalent to x ? x : y
Note that if the middle expression is omitted, the leftmost expression is only evaluated once.
x = 1
a = x++ ?: y // a = 1; x = 2
Pattern Matching
Operator | Type | Description |
---|---|---|
~ | Infix | Match |
!~ | Infix | Negated match |
Pattern matching is performed using the infix matching operators. The left-hand side of the expression is the subject and always treated as a string. The right-hand side is the pattern and always treated as a regular expression.
The result of a standard match (~) is 1 if the subject matches the pattern and 0 if it doesn’t. The negated match (!~) returns the inverse.
"abcd" ~ /a/ // 1
"abcd" !~ /a/ // 0
See the section on regular expressions for more information on regular expression syntax.
Ranges
The .. operator defines an integral range, which is a subtype in Riff. Ranges can contain an optional interval, denoted by an expression following a colon (:). Operands can be left blank to denote the absence of a bound, which will be interpreted differently based on the operation. There are 8 total permutations of valid ranges in Riff.
Syntax | Range |
---|---|
x..y | \([x..y]\) |
x.. | \([x..\)INT_MAX\(]\) |
..y | \([0..y]\) |
.. | \([0..\)INT_MAX\(]\) |
x..y:z | \([x..y]\) on interval \(z\) |
x..:z | \([x..\)INT_MAX\(]\) on interval \(z\) |
..y:z | \([0..y]\) on interval \(z\) |
..:z | \([0..\)INT_MAX\(]\) on interval \(z\) |
All ranges are inclusive. For example, the range 1..7 will include both 1 and 7. Riff also infers the direction of the range if no z value is provided.
Ranges can be used in for loops to iterate over a range of numbers.
Ranges can also extract arbitrary substrings when used in a subscript expression with a string. When subscripting a string with a range such as x.., Riff will truncate the range to the end of the string to return the string’s suffix starting at index x.
hello = "Helloworld"
hello[5..] // "world"
hello[..4] // "Hello"
hello[..] // "Helloworld"
Specifying an interval \(n\) allows you to extract a substring consisting of every \(n\)th character.
abc = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
abc[..:2] // "ACEGIKMOQSUWY"
Reversed strings can be easily extracted with a downward range.
a = "forwardstring"
a[#a-1..0] // "gnirtsdrawrof"
As mentioned in the overview, a range is a type of Riff value. This means ranges can be stored in variables as well as passed as function parameters and returned from function calls.
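The following sketch shows ranges driving for loops, including a range stored in a variable first. It assumes only the inclusive bounds, interval syntax and direction inference described above; the variable names are arbitrary.

```riff
for i in 1..3 {
    print(i)        // 1, 2, 3
}
for i in 0..10:5 {
    print(i)        // 0, 5, 10
}
r = 3..1            // no interval given, so the direction is inferred
for i in r {
    print(i)        // 3, 2, 1
}
```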
Concatenation
The # (infix) operator concatenates two values together. The result of the operation is a string with the left-hand and right-hand expressions concatenated together.
"Hello" # "World" // "HelloWorld"
"str" # 123 // "str123"
Length Operator
When used as a prefix operator, # returns the length of a value. When performed on string values, the result of the expression is the length of the string in bytes. When performed on tables, the result of the expression is the number of non-null values in the table.
s = "string"
a = { 1, 2, 3, 4 }
#s; // 6
#a; // 4
The length operator can be used on numeric values as well, returning the length of the number in decimal form.
#123; // 3
#-230; // 4
#0.6345; // 6
#0x1f; // 2
Subscripting
The [] operator is used to subscript a Riff value. All Riff values can be subscripted except for functions. Subscripting any value with an out-of-bounds index will evaluate to null.
Subscripting a numeric value with expression \(i\) will retrieve the \(i\)th character of that number as if it were a string in its base-10 form (index starting at 0).
34[0] // "3"
0.12[1] // "."
(-45)[0] // "-"
Subscripting a string with expression \(i\) retrieves the character at index \(i\), as if the string were a contiguous table of characters.
"Hello"[1] // "e"
Note that any subscripting or indexing into string values treats the characters in the string as byte-sized. That is, you cannot arbitrarily subscript a string value with an integer value and extract a substring containing a Unicode character larger than one byte.
Naturally, subscripting a table with expression \(i\) will perform a table lookup with the key \(i\).
pedals = {"Fuzz", "Wah-Wah", "Uni-Vibe"}
pedals[0] // "Fuzz"
Member Access
The . operator can be used to access table elements for string keys. The member access syntax t.key is syntactic sugar for t["key"]. Unlike the subscripting syntax, member access syntax is only supported for tables.
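The equivalence between the two syntaxes can be sketched as below. The table pedal and its keys are hypothetical, and the example assumes member access may also appear on the left-hand side of an assignment, since it is described as sugar for the subscript form.

```riff
pedal = {}
pedal["name"] = "Fuzz"
print(pedal.name)       // Fuzz
pedal.gain = 7          // assumed equivalent to pedal["gain"] = 7
print(pedal["gain"])    // 7
```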
Field Table Operator
$ is a special prefix operator used for accessing the field table. $ has the highest precedence of all Riff operators and can be used to read from or write to the field table.
Function Calls
() is a postfix operator used to execute function calls. Arguments are passed as a comma-delimited list of expressions inside the parentheses.
fn max(x,y) {
  return x > y ? x : y
}
max(1+4, 3*2)
Functions
There are two basic “forms” of defining functions in Riff. The first is defining a “named” function, which populates either the global or local namespace with the function.
fn f(x) {
return x + 1
}
local fn g(x) {
return x - 1
}
The second is anonymous functions, which are parsed as part of an expression statement.
f = fn(x) {
  return x + 1
}
local g = fn(x) {
  return x - 1
}
A key difference between the two forms is that named functions can reference themselves recursively, whereas anonymous functions cannot.
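As an illustration of a named function calling itself, here is a minimal recursive factorial. The name fact is arbitrary and not part of Riff's library.

```riff
fn fact(n) {
    return n <= 1 ? 1 : n * fact(n - 1)
}
print(fact(5))  // 120
```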
Riff allows all functions to be called with fewer arguments, or more arguments than the specified arity of a given function. The virtual machine will compensate by passing null for any insufficient arguments, or by discarding extraneous arguments. Note that this is not true variadic function support.
// Arity of the function is 3
fn f(x, y, z) {
  ...
}
f(1,2,3)   // x = 1 y = 2 z = 3
f(1,2)     // x = 1 y = 2 z = null
f(1,2,3,4) // x = 1 y = 2 z = 3 (4 is discarded)
f()        // x = null y = null z = null
Additionally, many included library functions are designed to accept a varying number of arguments, such as atan() and fmt().
Scoping
Currently, functions only have access to global variables and their own parameters and local variables. Functions cannot access any local variables defined outside of their scope, even if a local is defined at a higher scope than the function.
Built-in Tables
arg Table
Whenever riff is invoked, it collects all the command-line arguments and stores them as string literals in a Riff table named arg. arg[1] will always be the first user-provided argument following the program text or program filename. For example, when invoking riff on the command-line like this:
$ riff -e 'arg[1] << arg[2]' 2 3
The arg table will be populated as follows:
arg[-2]: "riff"
arg[-1]: "-e"
arg[0]: "arg[1] << arg[2]"
arg[1]: "2"
arg[2]: "3"
Another example, this time with a Riff program stored in a file named prog.rf:
$ riff prog.rf 43 22
The arg table would be populated:
arg[-1]: "riff"
arg[0]: "prog.rf"
arg[1]: "43"
arg[2]: "22"
Field Table
The field table is used to access substrings resulting from pattern matches and captured subexpressions in regular expressions. When a match is found, it is stored as a string in $0. Each subsequent capture group is stored in $n, starting from 1.
// $1 = "fish"
if "one fish two fish" ~ /(fish)/
print("red", $1, "blue", $1)
// $1 = "foo"
// $2 = "bar"
gsub("foo bar", /(\w+) (\w+)/, "$2 $1") // "bar foo"
Currently, Riff does not purge the field table upon each regex operation. Old captures will only ever be overwritten by new ones.
Standard I/O Streams
Riff provides predefined variables corresponding to the standard I/O file descriptors.
Variable | I/O stream |
---|---|
stderr | Standard error |
stdin | Standard input |
stdout | Standard output |
Basic Functions
assert(e[,s])
Raises an error if the expression e evaluates as false, with the error message s if provided. This function does not return when the assertion fails.
error([s])
Unconditionally raises an error with the message s if provided. This function does not return.
eval(s)
Compiles and executes the string s as Riff code. Global state is inherited and can be altered by the code s.
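A brief sketch of eval() altering global state, based only on the description above. The variable x is arbitrary, and the example assumes the evaluated assignment targets the same global x that the surrounding program reads afterwards.

```riff
eval("x = 6 * 7")   // the evaluated code assigns the global `x`
print(x)            // 42
```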
num(s[,b])
Returns a number interpreted from the string s in base (or radix) b. If no base is provided, the default is 0. When the base is 0, num() will convert the string to a number using the same lexical conventions as the language itself. num() can return an integer or float depending on the string’s structure (see lexical conventions) or if the number is too large to be stored as a signed 64-bit integer.
Valid values for b are 0 or integers 2 through 36. Bases outside this range will default back to 0. Providing bases other than 0, 10 or 16 will force s to only be interpreted as an integer value (current implementation limitation).
num("76") // 76
num("0x54") // 84
num("54", 16) // 84
num("0b0110") // 6
num("0110", 2) // 6
num("abcxyz", 36) // 623741435
print(...)
Takes any number of arguments and prints the values separated by a space, followed by a newline. print() returns the number of arguments passed.
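For instance, the return value can be captured like any other. This is a minimal sketch with an arbitrary variable name n.

```riff
n = print("x", 1, 2)    // prints: x 1 2
print(n)                // 3
```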
type(x)
Returns the type of value x in the form of a string.
type(null) // "null"
type(0xF) // "int"
type(1.4) // "float"
type("str") // "string"
type(/re/) // "regex"
type(0..1) // "range"
type({1,2}) // "table"
type(stdin) // "file"
type(sin) // "function"
Arithmetic Functions
abs(x)
Returns the absolute value of x (i.e. \(|x|\)).
atan(y[,x])
When called with a single argument y, atan(y) returns \(\arctan(y)\) in radians. When called with two arguments y and x, atan(y,x) returns \(\arctan(\frac{y}{x})\) in radians. atan(y) is equivalent to atan(y,1).
ceil(x)
Returns the smallest integer not less than x (i.e. \(\lceil{x}\rceil\)).
ceil(2.5) // 3
ceil(2) // 2
cos(x)
Returns \(\cos(x)\), where x is in radians.
exp(x)
Returns \(e\) raised to the power x (i.e. \(e^x\)).
int(x)
Returns x truncated to an integer.
int(16.34) // 16
log(x[,b])
Returns \(\log_b(x)\). If b is not provided, log(x) returns the natural log of x (i.e. \(\ln(x)\) or \(\log_e(x)\)).
sin(x)
Returns \(\sin(x)\), where x is in radians.
sqrt(x)
Returns \(\sqrt{x}\).
tan(x)
Returns \(\tan(x)\), where x is in radians.
I/O Functions
close(f)
Closes the file f
.
flush(f)
Flushes or saves any written data to file f
.
getc([f])
Returns a single character as an integer from file f
(stdin
by default).
open(s[,m])
Opens the file indicated by the file name s
in the mode
specified by the string m
, returning the resulting file handle.
Flag | Mode |
---|---|
r | Read |
w | Write |
a | Append |
r+ | Read/write |
w+ | Read/write |
a+ | Read/write |
The flag b
can also be used to specify binary files on non-POSIX
systems.
printf(s, ...)
Builds a format string and prints it directly to
stdout
.
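A minimal sketch, assuming printf() accepts the same conversion specifications as fmt() and that string literals support the \n escape:

```riff
printf("%d files, %s\n", 3, "done")   // prints: 3 files, done
printf("%x\n", 255)                   // prints: ff
```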
putc(...)
Takes zero or more integers and prints a string composed of the character
codes of each respective argument in order. putc()
returns the
number of arguments passed.
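For example, using ASCII codes (104 is "h", 105 is "i", 10 is a newline):

```riff
putc(104, 105, 10)   // prints "hi" followed by a newline; returns 3
putc()               // prints nothing; returns 0
```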
read([a[,b]])
Reads data from a file stream, returning the data as a string if successful.
Syntax | Description |
---|---|
read([f]) | Read a line from file f |
read([f,]m) | Read input from file f according to the mode specified by m |
read([f,]n) | Read at most n bytes from file f |
read([f,]0) | Returns 0 if end-of-file has been reached in file f; 1 otherwise |
When a file f
is not provided, read()
will operate
on stdin
. The default behavior is to read a single line from
stdin
. Providing a mode string allows control over the read
operation. Providing a numeric value n
specifies that
read()
should read up to n
bytes from the file.
read([f,]0)
is a special case to check if the file still has data
left to be read.
Mode | Description |
---|---|
a / A | Read until EOF is reached |
l / L | Read a line |
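The call forms above might be used as follows when reading from stdin. The variable names are arbitrary and no results are shown, since they depend on the input:

```riff
line  = read()       // one line from stdin
chunk = read(16)     // at most 16 bytes from stdin
more  = read(0)      // 0 if stdin has reached end-of-file; 1 otherwise
rest  = read("a")    // everything left on stdin, up to EOF
```

The same calls accept a file handle as the first argument, e.g. read(f, "a").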
write(v[,f])
Writes the value v
to file handle f
(stdout
by default).
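A sketch of a write-then-read round trip using the I/O functions in this section. The file name example.txt is hypothetical, and the example assumes the working directory is writable and that string literals support the \n escape:

```riff
// Write two values to a new file
f = open("example.txt", "w")
write("hello\n", f)
write(42, f)
close(f)

// Reopen the file and read it back
f = open("example.txt", "r")
line = read(f)         // first line of the file
rest = read(f, "a")    // remainder of the file
close(f)
```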
Pseudo-Random Numbers
Riff implements the xoshiro256**
generator to
produce pseudo-random numbers. When the virtual machine registers the built-in
functions, the PRNG is initialized once with time(0)
. Riff provides
an srand()
function documented below to allow control over the
sequence of the generated pseudo-random numbers.
rand([m[,n]])
Syntax | Type | Range |
---|---|---|
rand() | Float | \([0,1)\) |
rand(0) | Integer | \([\)INT_MIN \(..\)INT_MAX \(]\) |
rand(n) | Integer | \([0 .. n]\) |
rand(m,n) | Integer | \([m .. n]\) |
rand(range) | Integer | See ranges |
When called without arguments, rand()
returns a pseudo-random
floating-point number in the range \([0,1)\).
When called with 0
, rand(0)
returns a pseudo-random
Riff integer (signed 64-bit). When called with an integer n
,
rand(n)
returns a pseudo-random Riff integer in the range \([0 .. n]\). n
can be negative. When
called with 2 arguments m
and n
,
rand(m,n)
returns a pseudo-random integer in the range \([m .. n]\). m
can be greater than
n
.
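The call forms side by side. Results are pseudo-random, so the comments describe only the possible range:

```riff
rand()         // float in [0,1)
rand(0)        // any signed 64-bit integer
rand(6)        // integer in [0..6]
rand(-6)       // integer between -6 and 0
rand(1, 6)     // integer in [1..6], e.g. a die roll
rand(6, 1)     // also accepted: m can be greater than n
rand(1..6)     // range argument; see ranges
```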
srand([x])
Initializes the PRNG with seed x
. If x
is not
given, time(0)
is used. When the PRNG is initialized with a seed
x
, rand()
will always produce the same sequence of
numbers.
The following is only an example and may not accurately reflect the expected output for any particular version of Riff.
srand(3) // Initialize PRNG with seed "3"
rand() // 0.783235
rand() // 0.863673
srand()
returns the value used to seed the PRNG.
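Reseeding with the same value restarts the sequence, which can be useful for reproducible tests. A minimal sketch (the seed 42 and the variable names are arbitrary):

```riff
seed = srand(42)   // seed is 42
a = rand(100)
b = rand(100)

srand(seed)        // restart the same sequence
c = rand(100)      // c equals a
d = rand(100)      // d equals b
```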
String Functions
byte(s[,i])
Returns the integer value of byte i
in string s
.
i
is 0
unless specified by the user.
s = "hello"
byte(s) // 104
byte(s,2) // 108
char(...)
Takes zero or more integers and returns a string composed of the character
codes of each argument in order. char()
accepts valid Unicode code
points as arguments.
char(104, 101, 108, 108, 111) // "hello"
fmt(f, ...)
Returns a formatted string of the arguments following the format string
f
. This function largely resembles the C function
sprintf()
without support for length modifiers such as
l
or ll
.
Each conversion specification in the format string begins with a
%
character and ends with a character which determines the
conversion of the argument. Each specification may also contain one or more
flags following the initial %
character.
Flag | Description |
---|---|
0 | For numeric conversions, leading zeros are used to pad the string instead of spaces, which is the default. |
+ | The sign is prepended to the resulting conversion. This only applies to signed conversions (d, f, g, i). |
space | If the result of a signed conversion is non-negative, a space is prepended to the conversion. This flag is ignored if + is specified. |
- | The resulting conversion is left-justified instead of right-justified, which is the default. |
A minimum field width can be specified following any flags (or
%
if no flags specified), provided as an integer value. The
resulting conversion is padded with spaces on the left by default, or on the
right if left-justified. A *
can also be specified in lieu of an
integer, where an argument will be consumed (as an integer) and used to specify
the minimum field width.
The precision of the conversion can be specified with .
and an integer value or *
, similar to the minimum field width
specifier. For numeric conversion, the precision specifies the minimum number of
digits for the resulting conversion. For strings, it specifies the maximum
number of characters in the conversion. Precision is ignored for character
conversions (%c
).
The table below outlines the available conversion specifiers.
Specifier | Description |
---|---|
% | A literal % |
a / A | A number in hexadecimal exponent notation (lowercase/uppercase). |
b | An unsigned binary integer. |
c | A single character. |
d / i | A signed decimal integer. |
e / E | A number in decimal exponent notation (lowercase e/uppercase E used). |
f / F | A decimal floating-point number. |
g / G | A decimal floating-point number, either in standard form (f/F) or exponent notation (e/E); whichever is shorter. |
m | A multi-character string (from an integer, similar to %c). |
o | An unsigned octal integer. |
s | A character string. |
x / X | An unsigned hexadecimal integer (lowercase/uppercase). |
Note that the %s
format specifier will try to handle arguments
of any type, falling back to %d
for integers and %g
for floats.
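The flags, field width, precision and specifiers described above combine as in the following sketch. The results in the comments assume fmt() follows C's sprintf() for these conversions, as the description suggests; they are not captured interpreter output:

```riff
fmt("%5d|", 42)          // "   42|"
fmt("%-5d|", 42)         // "42   |"
fmt("%05d", 42)          // "00042"
fmt("%+d", 42)           // "+42"
fmt("%*d", 5, 42)        // "   42"
fmt("%.3d", 7)           // "007"
fmt("%.2f", 3.14159)     // "3.14"
fmt("%.3s", "riffraff")  // "rif"
fmt("%x", 255)           // "ff"
fmt("%X", 255)           // "FF"
fmt("%o", 8)             // "10"
fmt("%b", 5)             // "101"
fmt("%c", 65)            // "A"
fmt("%s", 42)            // "42"
fmt("100%%")             // "100%"
```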
gsub(s,p[,r])
Returns a copy of s
, as a string, where all occurrences of
p
, treated as a regular expression, are replaced with
r
. If r
is not provided, all occurrences of
p
will be stripped from the return string.
If r
is a string, dollar signs ($
) are parsed as
escape characters which can specify the insertion of substrings from capture
groups in p
.
Format | Description |
---|---|
$$ | Insert a literal dollar sign character ($) |
$x ${x} | Insert a substring from capture group x (either name or number) |
$*MARK ${*MARK} | Insert a control verb name |
// Simple find/replace
gsub("foo bar", /bar/, "baz") // "foo baz"
// Strip whitespace from a string
gsub("a b c d", /\s/) // "abcd"
// Find/replace with captures
gsub("riff", /(\w+)/, "$1 $1") // "riff riff"
hex(x)
Returns a string with the hexadecimal representation of x
as an
integer. The string is prepended with “0x
.” Riff currently converts
all arguments to integers.
hex(123) // "0x7b"
hex(68.7) // "0x44"
hex("45") // "0x2d"
lower(s)
Returns a copy of string s
with all uppercase ASCII letters
converted to lowercase ASCII. All other characters in string s
(including non-ASCII characters) are copied over unchanged.
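For example:

```riff
lower("Hello, World!")   // "hello, world!"
lower("RIFF 2")          // "riff 2"
```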
split(s[,d])
Returns a table with elements being string s
split on delimiter
d
, treated as a regular expression. If no delimiter is provided,
the regular expression /\s+/
(whitespace) is used. If the delimiter
is the empty string (""
), the string is split into a table of
single-byte strings.
// Default behavior, split on whitespace
sentence = split("A quick brown fox")
// Print the words on separate lines in order
for word in sentence {
  print(word)
}
// Split string on regex delimiter
words = split("foo1bar2baz", /\d/)
words[0] // "foo"
words[1] // "bar"
words[2] // "baz"
// Split string into single-byte strings
chars = split("Thiswillbesplitintochars","")
chars[0] // "T"
chars[23] // "s"
sub(s,p[,r])
Exactly like gsub()
, except only the first
occurrence of p
is replaced in s
.
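The difference from gsub() shows up when the pattern matches more than once:

```riff
sub("foo bar bar", /bar/, "baz")    // "foo baz bar"
gsub("foo bar bar", /bar/, "baz")   // "foo baz baz"
// With no replacement, only the first match is stripped
sub("a b c", /\s/)                  // "ab c"
```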
upper(s)
Returns a copy of string s
with all lowercase ASCII letters
converted to uppercase ASCII. All other characters in string s
(including non-ASCII characters) are copied over unchanged.
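For example:

```riff
upper("Hello, World!")   // "HELLO, WORLD!"
upper("riff 2")          // "RIFF 2"
```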
System Functions
clock()
Returns the approximate CPU time (in seconds) since the program began
execution. This is implemented as the ISO C function clock()
divided by CLOCKS_PER_SEC
.
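One possible use is rough timing of a piece of code by taking the difference between two readings. In this sketch the split() call merely stands in for the work being measured, and the variable names are arbitrary:

```riff
start = clock()
words = split("A quick brown fox")   // work being timed
elapsed = clock() - start
print("CPU seconds:", elapsed)
```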
exit([s])
Terminates the program by calling the ISO C function exit()
returning status s
. If s
is not provided, the program
terminates with 0
.
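A small sketch of both forms. The end-of-file check is only a hypothetical reason to stop early:

```riff
// Stop with a non-zero status when stdin is already at end-of-file
if (read(0) == 0) {
  exit(1)
}
exit()   // terminates with status 0
```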