The Riff Programming Language

Last updated 2022/11/18

Riff is a dynamically-typed general-purpose programming language designed primarily for prototyping and command-line usage. Riff offers a syntax familiar from many C-style languages as well as some extra conveniences, aiming to be a useful supplementary tool for programmers.

Riff is offered as a standalone interpreter riff.

Synopsis/Options

riff [options] program.rf [argument ...]

By default, riff opens and runs the file program.rf.

  • -e 'program'
    Interpret and execute the string program as a Riff program.

  • -h
    Print usage information and exit.

  • -l
    Produce a listing of the compiled bytecode and associated mnemonics.

  • -v
    Print version information and exit.

  • --
    Stop processing command-line options.

  • -
    Stop processing command-line options and execute stdin.

Overview

Riff is dynamically-typed. Identifiers/variables do not contain explicit type information and the language has no syntactic constructs for specifying types. Values, however, are implicitly typed, carrying their own type information.

All Riff values are first-class, meaning values of any type can be stored in variables, passed around as function arguments or returned as results from function calls.

Internally, a Riff value can be any of the following types:

  • null
  • Integer
  • Float
  • String
  • Regular expression
  • Range
  • Table
  • File handle
  • Riff function (user-defined)
  • C function (built-in functions)

null is a special value in Riff, typically representing the absence of a value. null is distinct from 0, 0.0 and the empty string ("").

Numbers in Riff can be integers or floats. Integers in Riff are signed 64-bit by default (int64_t in C). Floats in Riff are identical to a C double by default. Integer to float conversion (and vice versa) is performed implicitly depending on the operation and is designed to be completely transparent to the user.

Strings in Riff are immutable sequences of 8-bit character literals.

Regular expressions in Riff define patterns which are used for performing various string-searching operations.

Ranges are a special “subtype” in Riff that allow the user to define a range of integral values with an optional specified interval. Ranges can be used in for loops to iterate through a sequence of numbers or in string subscripting to easily extract different types of substrings.

Tables are the single compound data structure available in Riff. Table elements can be any type of Riff value. Storing null as a table element effectively deletes that key/value pair.

Tables in Riff are associative arrays. Any type of Riff value (even null) is a valid key for a given table element.
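The table rules above (any value, including null, may be a key; storing null removes the pair) can be modeled with a short Python sketch. This is only an illustration of the described behavior, not Riff's implementation; `TableModel` is a hypothetical name and Python's `None` stands in for null.

```python
class TableModel:
    """Toy model of the documented table rules (not Riff's implementation)."""

    def __init__(self):
        self._items = {}

    def __setitem__(self, key, value):
        if value is None:
            # Storing null removes the key/value pair, if it exists.
            self._items.pop(key, None)
        else:
            self._items[key] = value

    def __getitem__(self, key):
        # In this model, a missing key reads back as null (None).
        return self._items.get(key)

    def __len__(self):
        # Only non-null values are ever stored, so this counts them.
        return len(self._items)


t = TableModel()
t["a"] = 1
t[None] = "null key"   # null is a valid key
t["a"] = None          # storing null deletes the "a" entry
```

After these statements the model holds a single entry, keyed by null.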

User-defined and built-in functions are treated just as any other value.

Language

Basic Concepts

A Riff program is a sequence of statements. Riff has no concept of statement terminators. The lexical analysis phase does not perform implicit semicolon insertion. A statement ends when the next lexical token in the token stream is not applicable to the current statement.

Variables are global by default. Riff allows local variable usage by explicitly declaring a variable with the local keyword. Riff also allows the use/access of uninitialized variables. When an uninitialized variable is used, Riff reserves the variable with global scope and initializes it to null. Depending on the context, the variable may also be initialized to 0 or an empty table. Riff does not allow uninitialized variables to be called as functions.

Comments

Riff supports C++-style line comments with //, signaling to the interpreter to ignore everything starting from // to the end of the current line. Riff also supports C-style block comments in the form of /*...*/; Riff will ignore everything following /* until it reaches */.

// This is a comment
/* This is also
   a comment
*/

Constants and Literals

Numerals

Any string of characters beginning with a digit (0..9) will be interpreted as a numeric constant. A string of characters will be interpreted as part of a single numeral until an invalid character is reached. Numerals can be integers or floating-point numbers in decimal or hexadecimal form. Numbers with the prefix 0x or 0X will be interpreted as hexadecimal. Valid hexadecimal digits are 0 through 9 and any mix of lowercase and uppercase letters A through F.

23      // Decimal integer constant
6.7     // Decimal floating-point constant
.5      // Also a decimal floating-point constant (0.5)
9.      // 9.0
0xf     // Hexadecimal integer constant
0XaB    // Valid hexadecimal integer (mixed lowercase and uppercase)
0x.8    // Hexadecimal floating-point constant

Riff supports numbers written in exponent notation. For decimal numbers, an optional decimal exponent part (marked by e or E) can follow an integer or the optional fractional part. For hexadecimal numbers, a binary exponent part can be indicated with p or P.

45e2    // 4500
0xffP3  // 2040
0.25e-4 // 0.000025
0X10p+2 // 64
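The arithmetic behind these exponent forms can be checked in Python, whose `float()` and `float.fromhex()` happen to accept the same notation for these particular literals. This is a cross-check of the values in the comments above, not a description of how Riff parses numerals.

```python
# Decimal exponent: mantissa * 10**exponent
decimal_values = {
    "45e2": float("45e2"),          # 45 * 10**2
    "0.25e-4": float("0.25e-4"),    # 0.25 * 10**-4
}

# Binary exponent on a hexadecimal mantissa: mantissa * 2**exponent
hex_values = {
    "0xffP3": float.fromhex("0xffP3"),      # 255 * 2**3
    "0X10p+2": float.fromhex("0X10p+2"),    # 16 * 2**2
}
```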

Riff supports integers in binary form. Numeric literals with the prefix 0b or 0B will be interpreted as base-2. Riff does not support floating point numbers with the binary (0b) prefix.

0b1101  // 13 in binary

Additionally, Riff supports arbitrary underscores in numeric literals. Any number of underscores can appear between digits.

Some valid examples:

1_2
12_
1_2_
1__2_
300_000_000
0x__80
45_e2
0b1101_0011_1010_1111

Some invalid examples:

_12     // Will be parsed as an identifier
0_x80   // Underscore cannot be between `0` and `x`

Characters

Riff supports character literals enclosed in single quotation marks ('). Riff currently interprets character literals strictly as integer constants.

'A'     // 65
'π'     // 960

Multicharacter literals are also supported. The multicharacter sequence creates an integer where successive bytes are right-aligned and zero-padded in big-endian form.

'abcd'      // 0x61626364
'abcdefgh'  // 0x6162636465666768
'\1\2\3\4'  // 0x01020304

In the event of overflow, only the lowest 64 bits will remain in the resulting integer.

'abcdefghi' // 0x6263646566676869 ('a' overflows)
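The packing rule for multicharacter literals can be sketched in Python as a shift-and-or over the bytes, keeping only the low 64 bits. The function name `multichar` is hypothetical, and the model returns an unsigned value, whereas Riff integers are signed.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def multichar(data: bytes) -> int:
    """Pack bytes big-endian into an integer, keeping the lowest 64 bits."""
    value = 0
    for byte in data:
        # Shift earlier bytes up and append the next one on the right.
        value = ((value << 8) | byte) & MASK64
    return value


packed_4 = multichar(b"abcd")
packed_8 = multichar(b"abcdefgh")
packed_esc = multichar(b"\x01\x02\x03\x04")
overflowed = multichar(b"abcdefghi")   # the leading 'a' is shifted out
```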

Similar to strings, Riff supports the use of the backslash character (\) to denote C-style escape sequences.

Character   ASCII code (hex)   Description
a           07                 Bell
b           08                 Backspace
e           1B                 Escape
f           0C                 Form feed
n           0A                 Newline/Line feed
r           0D                 Carriage return
t           09                 Horizontal tab
v           0B                 Vertical tab
'           27                 Single quote
\           5C                 Backslash

Riff also supports arbitrary escape sequences in decimal and hexadecimal forms.

Decimal/hexadecimal escape sequence formats

Sequence   Description
\nnn       Octal escape sequence with up to three octal digits
\xnn       Hexadecimal escape sequence with up to two hexadecimal digits

Strings

String literals are denoted by matching enclosing double quotation marks ("). String literals spanning multiple lines will have the newline characters included. Alternatively, a single backslash (\) can be used in a string literal to indicate that the following newline be ignored.

"Hello, world!"

"String spanning
multiple
lines"

"String spanning \
multiple lines \
without newlines"

In addition to the escape sequences outlined in the characters section, Riff also supports escaped Unicode literals in the following forms.

Unicode escape sequence formats

Sequence     Description
\uXXXX       Unicode escape sequence with up to 4 hexadecimal digits
\UXXXXXXXX   Unicode escape sequence with up to 8 hexadecimal digits

"\u3c0"     // "π"
"\U1d11e"   // "𝄞"

Riff also supports interpolation of variables/expressions in string literals (aka string interpolation). An interpolated variable or expression is introduced with #, and expressions can be delimited by either braces ({}) or parentheses (()). The full expression grammar is supported within an interpolated expression.

x = "world"
str = "Hello #x!"   // "Hello world!"

sum = "#{1+2} == 3"
mul = "square root of 4 is #(sqrt(4))"

Regular Expressions

Regular expression (or “regex”) literals are denoted by enclosing forward slashes (/) followed immediately by any options.

/pattern/

Riff implements Perl Compatible Regular Expressions via the PCRE2 library. The pcre2syntax and pcre2pattern man pages outline the full syntax and semantics of regular expressions supported by PCRE2. Riff enables the PCRE2_DUPNAMES and PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL options when compiling regular expressions, which allows duplicated names in capture groups and ignores invalid or malformed escape sequences, treating them as literal single characters.

Regular expression literals in Riff also support the same Unicode escape sequences as string literals (\uXXXX or \UXXXXXXXX).

Compile-time options are specified as flags immediately following the closing forward slash. Riff will consume flags until it reaches a non-flag character. Available options are outlined below.

Regular expression modifiers

Flag   Description
A      Force pattern to become anchored to the start of the search, or the end of the most recent match
D      $ matches only the end of the subject string; ignored if m is enabled
J      Allow names in named capture groups to be duplicated within the same pattern (enabled by default)
U      Invert the greediness of quantifiers. Quantifiers become ungreedy by default, unless followed by a ?
i      Case-insensitive matching
m      ^ and $ match newlines in the subject string
n      Disable numbered capturing in parenthesized subpatterns (named ones still available)
s      Treat the entire subject string as a single line; . matches newlines
u      Enable Unicode properties for character classes
x      Ignore unescaped whitespace in patterns, except inside character classes, and allow line comments starting with #
xx     Same as x, but ignore unescaped whitespace inside character classes

/PaTtErN/i      // Caseless matching

// Extended forms - whitespace and line comments ignored

// Equivalent to /abc/
/abc # match "abc"/x

// Equivalent to /add|sub|mul|div/
/ add   # Addition
| sub   # Subtraction
| mul   # Multiplication
| div   # Division
/x
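For readers more familiar with other regex engines, the effect of the i and x flags can be approximated with Python's `re` module, whose `re.IGNORECASE` and `re.VERBOSE` options behave similarly for these simple patterns. Python's engine is not PCRE2, so this is an analogy rather than a statement about Riff's matching.

```python
import re

# Rough counterpart of /PaTtErN/i
caseless = re.compile(r"PaTtErN", re.IGNORECASE)

# Rough counterpart of the extended-form alternation above: whitespace
# and '#' comments inside the pattern are ignored.
operators = re.compile(
    r"""
      add   # Addition
    | sub   # Subtraction
    | mul   # Multiplication
    | div   # Division
    """,
    re.VERBOSE,
)

matches_lowercase = caseless.fullmatch("pattern") is not None
matches_mul = operators.fullmatch("mul") is not None
matches_pow = operators.fullmatch("pow") is not None
```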

Keywords

The following keywords are reserved for syntactic constructs and cannot be redefined by the user.

and         fn        not
break       for       null
continue    if        or
do          in        return
elif        loop      while
else        local

Variables

A variable represents a place to store a value in a Riff program. Variables can be global or local in scope.

A valid identifier is a string of characters beginning with a lowercase letter (a..z), uppercase letter (A..Z) or underscore (_). Numeric characters (0..9) are valid in identifiers, but not as a starting character.

Statements

break

break is a control-flow construct which will immediately exit the current loop when reached. break is invalid outside of a loop structure; riff will throw an error when trying to compile a break statement outside of a loop.

while 1 {
  print("This will print")
  break
  print("This will not print")
}
// program control transfers here

continue

A continue statement causes the program to skip the remaining portion of the current loop, jumping to the end of the loop body. Like break, continue is invalid outside of a loop structure; riff will throw an error when trying to compile a continue statement outside of a loop.

do {
  // ...
  continue
  // ...
  // `continue` jumps here
} while 1

for x in y {
  // ...
  continue
  // ...
  // `continue` jumps here
}

while 1 {
  // ...
  continue
  // ...
  // `continue` jumps here
}

do

do_stmt = 'do' stmt 'while' expr
        | 'do' '{' stmt_list '}' 'while' expr

A do statement declares a do-while loop structure, which repeatedly executes the statement or brace-enclosed list of statements until the expression following the while keyword evaluates to 0.

Like all loop structures in Riff, the statement(s) inside a loop body establish their own local scope. Any locals declared inside the loop body are not accessible outside of the loop body. The while expression in a do-while loop is considered to be outside the loop body.

A do statement declared without a while condition is invalid and will cause an error to be thrown upon compilation.

elif

Syntactic sugar for else if. See if statements.

else

See if statements.

fn

fn_stmt = 'fn' id ['(' [id {',' id}] ')'] '{' stmt_list '}'

A function statement declares the definition of a named function. This is in contrast to an anonymous function, which is parsed as part of an expression statement.

fn f(x) {
  return x ** 2
}

fn g() {
  return 23.4
}

// Parentheses not required for functions without parameters
fn h {
  return "Hello"
}

More information on user-defined functions in Riff can be found in the Functions section.

for

for_stmt = 'for' id [',' id] 'in' expr stmt
         | 'for' id [',' id] 'in' expr '{' stmt_list '}'

A for statement declares a generic loop structure which iterates over the item(s) in the expr result value. There are two general forms of a for loop declaration:

  • for v in s {...}
  • for k,v in s {...}

In the first form, the value s is iterated over. Before each iteration, the variable v is populated with the value of the next item in the set.

In the second form, the value s is iterated over. Before each iteration, the variable k is populated with the key, while variable v is populated with the value of the next item in a set.

In both forms, the variables k and v are local to the inner loop body. Their values cannot be accessed once the loop terminates.

table = { "foo", "bar", "baz" }

// This iterates over each item in `table`, populating `k` with the current
// table index, and `v` with the corresponding table element
for k,v in table {
  // First iteration:  k = 0, v = "foo"
  // Second iteration: k = 1, v = "bar"
  // Third iteration:  k = 2, v = "baz"
}

Note that the value to be iterated over is evaluated exactly once. A copy of the value is made upon initialization of a given iterator. This avoids an issue where a user continually adds items to a given set, effectively causing an infinite loop.
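The effect of iterating over a copy can be illustrated with a Python analogy: looping over a snapshot of a list terminates even when the loop body keeps appending to the original. This sketches the idea only; how Riff copies the value internally is not specified here.

```python
items = ["foo", "bar", "baz"]
visited = []

# list(items) takes a snapshot once, before the first iteration.
for v in list(items):
    visited.append(v)
    items.append(v + "!")   # growing the original does not extend the loop
```

The loop visits exactly the three original elements, while `items` ends up with six.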

The order in which tables are iterated over is not guaranteed to be in-order for integer keys due to the nature of the table implementation. However, in most cases, tables will be traversed in order for integer keys \(0..n\) where \(n\) is the last element in a contiguous table. If a table is constructed using the constructor syntax, it is guaranteed to be traversed in-order, so long as no other keys were added. Even if keys were added, tables are typically traversed in-order. Note that negative indices will always come after integer keys \(\geqslant 0\).

The value to be iterated over can be any Riff value, except functions. For example, iterating over an integer n will populate the provided variable with the numbers \([0..n]\) (inclusive of n). n can be negative.

// Equivalent to `for (i = 0; i <= 10; ++i)`
for i in 10 {
  // ...
}

// Equivalent to `for (i = 0; i >= -10; --i)`
for i in -10 {
  // ...
}

Iterating over an integer n while using the k,v syntax will populate v with \([0..n]\), while leaving k as null.

Currently, floating-point numbers are truncated to integers when used as the expression to iterate over.
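The sequence produced when iterating over a number can be modeled in Python as follows. `iterate_number` is a hypothetical helper; it assumes that truncation is toward zero, which the text does not spell out.

```python
def iterate_number(n):
    """Return the values a `for i in n` loop would visit, per the rules above."""
    n = int(n)                    # assumption: floats truncate toward zero
    step = 1 if n >= 0 else -1    # negative n counts downward
    return list(range(0, n + step, step))   # inclusive of n


up = iterate_number(10)
down = iterate_number(-3)
truncated = iterate_number(2.9)
```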

Iterating over a string is similar to iterating over a table.

for k,v in "Hello" {
  // k = 0, v = "H"
  // k = 1, v = "e"
  // ...
  // k = 4, v = "o"
}

if

if_stmt = 'if' expr stmt {'elif' expr ...} ['else' ...]
        | 'if' expr '{' stmt_list '}' {'elif' expr ...} ['else' ...]

An if statement conditionally executes code based on the result of expr. If expr evaluates to a value that is neither zero nor null, the succeeding statement or list of statements is executed. Otherwise, the code is skipped.

If an else statement is provided following an if statement, the code in the else block is only executed if the if condition evaluated to zero or null. An else statement always associates to the closest preceding if statement.

Placing any statement between an if block and its elif or else statement is invalid; Riff will throw an error when compiling an else statement not attached to an if or elif.

elif is syntactic sugar for else if. Riff allows either syntax in a given if construct.

// `elif` and `else if` used in the same `if` construct
x = 2
if x == 1 {
  ...
} elif x == 2 {
  ...
} else if x == 3 {
  ...
} else {
  ...
}

local

local_stmt = 'local' expr {',' expr}
           | 'local' fn_stmt

local declares a variable visible only to the current block and any descending code blocks. Multiple variables can be declared as local with a comma-delimited expression list, similar to expression lists in expression statements.

A local variable can reference a variable in an outer scope of the same name without altering the outer variable.

a = 25
if 1 {
  local a = a     // Newly declared local `a` will be 25
  a += 5
  print(a)        // Prints 30
}
print(a)          // Prints 25

loop

loop_stmt = 'loop' stmt
          | 'loop' '{' stmt_list '}'

A loop statement declares an unconditional loop structure, where statement(s) inside the body of the loop are executed repeatedly. This is in contrast to conditional loop structures in Riff, such as do, for or while, where some condition is evaluated on each iteration of the loop.

return

ret_stmt = 'return' [expr]

A return statement is used for returning control from a function with an optional value.

The empty return statement highlights a pitfall with Riff’s grammar. Consider the following example.

if x == 1
  return
x++

At first glance, this code appears to return control with no value if x equals 1, and otherwise increment x and continue execution. However, when Riff parses the stream of tokens above, it will consume the expression x++ as part of the return statement. This type of pitfall can be avoided by appending a semicolon (;) to return or enclosing the statement(s) following the if conditional in braces.

if x == 1
  return;
x++

if x == 1 {
  return
}
x++

while

while_stmt = 'while' expr stmt
           | 'while' expr '{' stmt_list '}'

A while statement declares a simple loop structure where the statement(s) following the expression expr are repeatedly executed until expr evaluates to 0.

Like all loop structures in Riff, the statement(s) inside a loop body establish their own local scope. Any locals declared inside the loop body are not accessible outside of the loop body. The expression following while has no access to any locals declared inside the loop body.

Expression Statements

Any expression not part of another syntactic structure such as if or while is an expression statement. Expression statements in Riff are simply standalone expressions which will invoke some side-effect in the program.

Expression statements can also be a comma-delimited list of expressions.

Expressions

Operators (increasing in precedence)
Operator(s)   Associativity   Precedence   Description
=             Right           1            Assignment
?:            Right           2            Ternary conditional
..            Left            3            Range constructor
|| or         Left            4            Logical OR
&& and        Left            5            Logical AND
== !=         Left            6            Relational equality, inequality
~ !~          Left            6            Match, negated match
< <= > >=     Left            7            Relational comparison \(<\), \(\leqslant\), \(>\) and \(\geqslant\)
|             Left            8            Bitwise OR
^             Left            9            Bitwise XOR
&             Left            10           Bitwise AND
<< >>         Left            11           Bitwise left shift, right shift
#             Left            11           Concatenation
+ -           Left            12           Addition, subtraction
* / %         Left            13           Multiplication, division, modulus
! not         Right           13           Logical NOT
#             Right           13           Length
+ -           Right           13           Unary plus, minus
~             Right           13           Bitwise NOT
**            Right           15           Exponentiation
++ --         Right           15           Prefix increment, decrement
()            Left            16           Function call
[]            Left            16           Subscripting
.             Left            16           Member access
++ --         Left            16           Postfix increment, decrement
$             Right           17           Field table subscripting

Riff also supports the following compound assignment operators, with the same precedence and associativity as simple assignment (=):

+=      |=
&=      **=
#=      <<=
/=      >>=
%=      -=
*=      ^=

Arithmetic Operators

Operator   Type(s)           Description
+          Prefix, Infix     Numeric coercion, Addition
-          Prefix, Infix     Negation, Subtraction
*          Infix             Multiplication
/          Infix             Division
%          Infix             Modulus
**         Infix             Exponentiation
++         Prefix, Postfix   Increment by 1
--         Prefix, Postfix   Decrement by 1

Bitwise Operators

Operator Type Description
& Infix Bitwise AND
| Infix Bitwise OR
^ Infix Bitwise XOR
<< Infix Bitwise left shift
>> Infix Bitwise right shift
~ Prefix Bitwise NOT

Logical Operators

Operator   Type     Description
! not      Prefix   Logical NOT
&& and     Infix    Logical AND
|| or      Infix    Logical OR

The operators || and && are short-circuiting. For example, in the expression lhs && rhs, rhs is evaluated only if lhs is “truthy.” Likewise, in the expression lhs || rhs, rhs is evaluated only if lhs is not “truthy.”

Values which evaluate as “false” are null, 0 and the empty string ("").
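Short-circuit evaluation can be observed by recording which operands actually run. The sketch below uses Python's `and` and `or`, which short-circuit in the same way for the integer operands used here; Python's truthiness rules differ from Riff's for other types, so this is an analogy only. `probe` is a hypothetical helper.

```python
calls = []


def probe(name, value):
    """Record that this operand was evaluated, then return its value."""
    calls.append(name)
    return value


probe("lhs1", 0) and probe("rhs1", 1)   # rhs1 skipped: lhs1 is false
probe("lhs2", 1) or probe("rhs2", 1)    # rhs2 skipped: lhs2 is true
probe("lhs3", 1) and probe("rhs3", 1)   # both operands evaluated
```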

Relational Operators

Operator Type Description
== Infix Equality
!= Infix Inequality
< Infix Less-than
<= Infix Less-than or equal-to
> Infix Greater-than
>= Infix Greater-than or equal-to

Assignment Operators

The following assignment operators are all binary infix operators.

Operator Description
= Simple assignment
+= Assignment by addition
-= Assignment by subtraction
*= Assignment by multiplication
/= Assignment by division
%= Assignment by modulus
**= Assignment by exponentiation
&= Assignment by bitwise AND
|= Assignment by bitwise OR
^= Assignment by bitwise XOR
<<= Assignment by bitwise left shift
>>= Assignment by bitwise right shift
#= Assignment by concatenation

Ternary Conditional Operator

The ?: operator behaves as it does in other C-style languages.

condition ? expr-if-true : expr-if-false

The expression in between ? and : in the ternary conditional operator is treated as if parenthesized. You can also omit the middle expression entirely.

x ?: y  // Equivalent to x ? x : y

Note that if the middle expression is omitted, the leftmost expression is only evaluated once.

x = 1
a = x++ ?: y    // a = 1; x = 2

Pattern Matching

Operator Type Description
~ Infix Match
!~ Infix Negated match

Pattern matching is performed using the infix matching operators. The left-hand side of the expression is the subject and is always treated as a string. The right-hand side is the pattern and is always treated as a regular expression.

The result of a standard match (~) is 1 if the subject matches the pattern and 0 if it doesn’t. The negated match (!~) returns the inverse.

"abcd"  ~ /a/   // 1
"abcd" !~ /a/   // 0

See the section on regular expressions for more information on regular expression syntax.

Ranges

The .. operator defines an integral range, which is a subtype in Riff. Ranges can contain an optional interval, denoted by an expression following a colon (:). Operands can be left blank to denote the absence of a bound, which will be interpreted differently based on the operation. There are 8 total permutations of valid ranges in Riff.

Syntax   Range
x..y     \([x..y]\)
x..      \([x..\texttt{INT\_MAX}]\)
..y      \([0..y]\)
..       \([0..\texttt{INT\_MAX}]\)
x..y:z   \([x..y]\) on interval \(z\)
x..:z    \([x..\texttt{INT\_MAX}]\) on interval \(z\)
..y:z    \([0..y]\) on interval \(z\)
..:z     \([0..\texttt{INT\_MAX}]\) on interval \(z\)

All ranges are inclusive. For example, the range 1..7 will include both 1 and 7. Riff also infers the direction of the range if no z value is provided.

Ranges can be used in for loops to iterate over a range of numbers.

Ranges can also extract arbitrary substrings when used in a subscript expression with a string. When subscripting a string with a range such as x.., Riff will truncate the range to the end of the string to return the string’s suffix starting at index x.

hello = "Helloworld"
hello[5..]              // "world"
hello[..4]              // "Hello"
hello[..]               // "Helloworld"

Specifying an interval \(n\) allows you to extract a substring made up of every \(n\)th character.

abc = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
abc[..:2]               // "ACEGIKMOQSUWY"

Reversed strings can be easily extracted with a downward range.

a = "forwardstring"
a[#a-1..0]              // "gnirtsdrawrof"
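The substring examples in this section can be reproduced with a small Python model of inclusive ranges. The names `range_indices` and `substr` are hypothetical, and the model only covers cases where the starting index lies inside the string; it is a sketch of the documented behavior, not Riff's implementation.

```python
def range_indices(length, lo=None, hi=None, step=None):
    """Indices selected by lo..hi:step on a string of the given length."""
    lo = 0 if lo is None else lo
    # A missing or too-large upper bound is truncated to the last index.
    hi = length - 1 if hi is None else min(hi, length - 1)
    if step is None:
        step = 1 if lo <= hi else -1      # direction inferred from the bounds
    stop = hi + (1 if step > 0 else -1)   # make the end bound inclusive
    return range(lo, stop, step)


def substr(s, lo=None, hi=None, step=None):
    return "".join(s[i] for i in range_indices(len(s), lo, hi, step))


suffix = substr("Helloworld", 5)                        # hello[5..]
prefix = substr("Helloworld", None, 4)                  # hello[..4]
every_other = substr("ABCDEFGHIJKLMNOPQRSTUVWXYZ", step=2)   # abc[..:2]
backwards = substr("forwardstring", 12, 0)              # a[#a-1..0]
```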

As mentioned in the overview, a range is a type of Riff value. This means ranges can be stored in variables as well as passed as function parameters and returned from function calls.

Concatenation

The # (infix) operator concatenates two values together. The result of the operation is a string formed by concatenating the left-hand and right-hand expressions.

"Hello" # "World"   // "HelloWorld"
"str" # 123         // "str123"

Length Operator

When used as a prefix operator, # returns the length of a value. When performed on string values, the result of the expression is the length of the string in bytes. When performed on tables, the result of the expression is the number of non-null values in the table.

s = "string"
a = { 1, 2, 3, 4 }

#s; // 6
#a; // 4

The length operator can be used on numeric values as well, returning the length of the number in decimal form.

#123;       // 3
#-230;      // 4
#0.6345;    // 6
#0x1f;      // 2

Subscripting

The [] operator is used to subscript a Riff value. All Riff values can be subscripted except for functions. Subscripting any value with an out-of-bounds index will evaluate to null.

Subscripting a numeric value with expression \(i\) will retrieve the \(i\)th character of that number as if it were a string in its base-10 form (index starting at 0).

34[0]       // "3"
0.12[1]     // "."
(-45)[0]    // "-"
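Both the length operator and subscripting treat a number as its base-10 text. For the specific values used in this document, Python's `str()` produces the same text, so the results can be cross-checked as below. Python and Riff may format other floats differently, so this is a check of these examples only.

```python
# Length of a number: length of its decimal text
lengths = [len(str(n)) for n in (123, -230, 0.6345, 0x1F)]

# Subscripting a number: index into its decimal text
first_of_34 = str(34)[0]
second_of_0_12 = str(0.12)[1]
first_of_minus_45 = str(-45)[0]
```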

Subscripting a string with expression \(i\) retrieves the character at index \(i\), as if the string were a contiguous table of characters.

"Hello"[1]  // "e"

Note that any subscripting or indexing into string values treats the string as a sequence of byte-sized characters. That is, you cannot subscript a string value with a single integer and extract a Unicode character larger than one byte.
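The consequence of byte-wise indexing can be seen with Python's `bytes` type. The sketch assumes the string is stored as UTF-8, which is an assumption made for illustration; under that encoding "π" occupies two bytes, so a single byte index lands on only half of it.

```python
text = "aπb"
raw = text.encode("utf-8")    # assumption: UTF-8 storage

char_count = len(text)        # code points
byte_count = len(raw)         # bytes, which is what byte-wise indexing sees

half_of_pi = raw[1:2]                     # one byte of a two-byte character
whole_pi = raw[1:3].decode("utf-8")       # both bytes are needed to recover it
```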

Naturally, subscripting a table with expression \(i\) will perform a table lookup with the key \(i\).

pedals = {
  "Fuzz",
  "Wah-Wah",
  "Uni-Vibe" 
}

pedals[0]   // "Fuzz"

Member Access

The . operator can be used to access table elements for string keys. The member access syntax t.key is syntactic sugar for t["key"]. Unlike the subscripting syntax, member access syntax is only supported for tables.

Field Table Operator

$ is a special prefix operator used for accessing the field table. $ has the highest precedence of all Riff operators and can be used to read from or write to the field table.

Function Calls

() is a postfix operator used to execute function calls. Arguments are passed as a comma-delimited list of expressions inside the parentheses.

fn max(x,y) {
  return x > y ? x : y
}

max(1+4, 3*2)

Functions

There are two basic “forms” of defining functions in Riff. The first is defining a “named” function, which populates either the global or local namespace with the function.

fn f(x) {
  return x + 1
}

local fn g(x) {
  return x - 1
}

The second is anonymous functions, which are parsed as part of an expression statement.

f = fn(x) {
  return x + 1
}

local g = fn(x) {
  return x - 1
}

A key difference between the two forms is that named functions can reference themselves recursively, whereas anonymous functions cannot.

Riff allows all functions to be called with fewer arguments, or more arguments than the specified arity of a given function. The virtual machine will compensate by passing null for any insufficient arguments, or by discarding extraneous arguments. Note that this is not true variadic function support.

// Arity of the function is 3
fn f(x, y, z) {
  ...
}

f(1,2,3)    // x = 1        y = 2       z = 3
f(1,2)      // x = 1        y = 2       z = null
f(1,2,3,4)  // x = 1        y = 2       z = 3       (4 is discarded)
f()         // x = null     y = null    z = null
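The argument adjustment described above amounts to padding with null and truncating to the function's arity. A minimal Python model, with `None` standing in for null and `adjust_args` as a hypothetical name:

```python
def adjust_args(args, arity):
    """Pad missing arguments with None and drop any extras."""
    padded = list(args) + [None] * arity
    return padded[:arity]


exact = adjust_args([1, 2, 3], 3)
too_few = adjust_args([1, 2], 3)
too_many = adjust_args([1, 2, 3, 4], 3)
none_given = adjust_args([], 3)
```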

Additionally, many included library functions are designed to accept a varying number of arguments, such as atan() and fmt().

Scoping

Currently, functions only have access to global variables and their own parameters and local variables. Functions cannot access any local variables defined outside of their scope, even if a local is defined at a higher scope than the function.

Built-in Tables

arg Table

Whenever riff is invoked, it collects all the command-line arguments and stores them as string literals in a Riff table named arg. arg[1] will always be the first user-provided argument following the program text or program filename. For example, when invoking riff on the command-line like this:

$ riff -e 'arg[1] << arg[2]' 2 3

The arg table will be populated as follows:

arg[-2]: "riff"
arg[-1]: "-e"
arg[0]:  "arg[1] << arg[2]"
arg[1]:  "2"
arg[2]:  "3"

Another example, this time with a Riff program stored in a file named prog.rf:

$ riff prog.rf 43 22

The arg table would be populated:

arg[-1]: "riff"
arg[0]:  "prog.rf"
arg[1]:  "43"
arg[2]:  "22"
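Both layouts follow one rule: each argument's key is its position relative to the program text or filename, which sits at index 0. A Python sketch of that rule, where `build_arg` and `program_index` are hypothetical names and locating the program within the argument list is left to the caller:

```python
def build_arg(argv, program_index):
    """Key each argument by its offset from the program text or filename."""
    return {i - program_index: value for i, value in enumerate(argv)}


inline = build_arg(["riff", "-e", "arg[1] << arg[2]", "2", "3"], 2)
from_file = build_arg(["riff", "prog.rf", "43", "22"], 1)
```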

Field Table

The field table is used to access substrings resulting from pattern matches and captured subexpressions in regular expressions. When a match is found, it is stored as a string in $0. Each subsequent capture group is stored in $n, starting from 1.

// $1 = "fish"
if "one fish two fish" ~ /(fish)/
  print("red", $1, "blue", $1)

// $1 = "foo"
// $2 = "bar"
gsub("foo bar", /(\w+) (\w+)/, "$2 $1") // "bar foo"
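The same two operations can be written with Python's `re` module for comparison. Python exposes captures through a match object and uses `\1`, `\2` in replacement strings rather than `$1`, `$2`, so this is an analogy to the field table, not a description of it.

```python
import re

# Whole match and first capture group, comparable to $0 and $1
m = re.search(r"(fish)", "one fish two fish")
whole_match = m.group(0)
first_group = m.group(1)

# Swap two captured words, comparable to the gsub() call above
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "foo bar")
```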

Currently, Riff does not purge the field table upon each regex operation. Old captures will only ever be overwritten by new ones.

Standard I/O Streams

Riff provides predefined variables corresponding to the standard I/O file descriptors.

Variable I/O stream
stderr Standard error
stdin Standard input
stdout Standard output

Basic Functions

assert(e[,s])

Raises an error if the expression e evaluates as false, with the error message s if provided. This function does not return when the assertion fails.

error([s])

Unconditionally raises an error with the message s if provided. This function does not return.

eval(s)

Compiles and executes the string s as Riff code. Global state is inherited and can be altered by the code s.

num(s[,b])

Returns a number interpreted from the string s in base (or radix) b. If no base is provided, the default is 0. When the base is 0, num() will convert the string to a number using the same lexical conventions as the language itself. num() can return an integer or float depending on the string’s structure (see lexical conventions) or if the number is too large to be stored as a signed 64-bit integer.

Valid values for b are 0 or integers 2 through 36. Bases outside this range will default back to 0. Providing bases other than 0, 10 or 16 will force s to only be interpreted as an integer value (current implementation limitation).

num("76")           // 76
num("0x54")         // 84
num("54", 16)       // 84
num("0b0110")       // 6
num("0110", 2)      // 6
num("abcxyz", 36)   // 623741435
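The integer results above can be cross-checked with Python's built-in `int(s, base)`, which likewise treats base 0 as "infer the base from the literal's prefix". This only verifies the arithmetic for these integer examples; Python's `int()` does not model num()'s float handling or its fallback for invalid bases.

```python
results = {
    'num("76")': int("76", 0),
    'num("0x54")': int("0x54", 0),
    'num("54", 16)': int("54", 16),
    'num("0b0110")': int("0b0110", 0),
    'num("0110", 2)': int("0110", 2),
    'num("abcxyz", 36)': int("abcxyz", 36),
}
```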

print(...)

Takes any number of arguments and prints the values separated by a space, followed by a newline. print() returns the number of arguments passed.

type(x)

Returns the type of value x in the form of a string.

type(null)  // "null"
type(0xF)   // "int"
type(1.4)   // "float"
type("str") // "string"
type(/re/)  // "regex"
type(0..1)  // "range"
type({1,2}) // "table"
type(stdin) // "file"
type(sin)   // "function"

Arithmetic Functions

abs(x)

Returns the absolute value of x (i.e. \(|x|\)).

atan(y[,x])

When called with a single argument y, atan(y) returns \(\arctan(y)\) in radians. When called with two arguments y and x, atan(y,x) returns \(\arctan(\frac{y}{x})\) in radians. atan(y) is equivalent to atan(y,1).
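The equivalence atan(y) = atan(y,1) can be checked numerically with Python's `math` module. The sketch assumes the two-argument form behaves like C's `atan2`, which uses the signs of both arguments to pick the quadrant; that is an assumption, not something stated above.

```python
import math

one_arg = math.atan(0.5)
two_arg = math.atan2(0.5, 1)      # same angle when x is 1

# Under the atan2 assumption, the signs of y and x select the quadrant,
# so the result can differ from arctan(y / x) when x is negative.
quadrant_two = math.atan2(1, -1)
plain_ratio = math.atan(1 / -1)
```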

ceil(x)

Returns the smallest integer not less than x (i.e. \(\lceil{x}\rceil\)).

ceil(2.5)   // 3
ceil(2)     // 2

cos(x)

Returns \(\cos(x)\), where x is in radians.

exp(x)

Returns \(e\) raised to the power x (i.e. \(e^x\)).

int(x)

Returns x truncated to an integer.

int(16.34)  // 16

log(x[,b])

Returns \(\log_b(x)\). If b is not provided, log(x) returns the natural log of x (i.e. \(\ln(x)\) or \(\log_e(x)\)).
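A logarithm in an arbitrary base is related to the natural log by \(\log_b(x) = \ln(x) / \ln(b)\). The identity can be checked with Python's `math.log`, which takes an optional base in the same argument position; this illustrates the math only and says nothing about how Riff computes it.

```python
import math

base_two = math.log(8, 2)                     # log base 2 of 8
via_natural = math.log(8) / math.log(2)       # change-of-base identity
natural_of_e = math.log(math.e)               # natural log when no base is given
```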

sin(x)

Returns \(\sin(x)\), where x is in radians.

sqrt(x)

Returns \(\sqrt{x}\).

tan(x)

Returns \(\tan(x)\), where x is in radians.

I/O Functions

close(f)

Closes the file f.

flush(f)

Flushes or saves any written data to file f.

getc([f])

Returns a single character as an integer from file f (stdin by default).

open(s[,m])

Opens the file indicated by the file name s in the mode specified by the string m, returning the resulting file handle.

Flag Mode
r Read
w Write
a Append
r+ Read/write
w+ Read/write
a+ Read/write

The flag b can also be used to specify binary files on non-POSIX systems.

printf(s, ...)

Builds a format string and prints it directly to stdout.
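
A minimal sketch, assuming printf() accepts the same conversion specifications as fmt():

printf("%d items\n", 3)             // prints "3 items" followed by a newline
printf("%s: %x\n", "mask", 255)     // prints "mask: ff" followed by a newline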

putc(...)

Takes zero or more integers and prints a string composed of the character codes of each respective argument in order. putc() returns the number of arguments passed.
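
For example, using ASCII character codes (the variable n is only illustrative):

putc(104, 105, 10)              // prints "hi" followed by a newline
n = putc(82, 105, 102, 102)     // prints "Riff"; n is 4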

read([a[,b]])

Reads data from a file stream, returning the data as a string if successful.

Syntax Description
read([f]) Read a line from file f
read([f,]m) Read input from file f according to the mode specified by m
read([f,]n) Read at most n bytes from file f
read([f,]0) Returns 0 if end-of-file has been reached in file f; 1 otherwise

When a file f is not provided, read() will operate on stdin. The default behavior is to read a single line from stdin. Providing a mode string allows control over the read operation. Providing a numeric value n specifies that read() should read up to n bytes from the file. read([f,]0) is a special case to check if the file still has data left to be read.

read() modes
Mode Description
a / A Read until EOF is reached
l / L Read a line

write(v[,f])

Writes the value v to file handle f (stdout by default).
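
The I/O functions above can be combined into a simple write-then-read round trip. This is only a sketch: the file name example.txt is hypothetical, and whether read() keeps the trailing newline of a line is not specified here.

f = open("example.txt", "w")    // open for writing
write("first line\n", f)
write("second line\n", f)
close(f)

f = open("example.txt", "r")    // reopen for reading
line = read(f)                  // reads the first line
rest = read(f, "a")             // reads the remainder until EOF
close(f)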

Pseudo-Random Numbers

Riff implements the xoshiro256** generator to produce pseudo-random numbers. When the virtual machine registers the built-in functions, the PRNG is initialized once with time(0). Riff provides an srand() function documented below to allow control over the sequence of the generated pseudo-random numbers.

rand([m[,n]])

Syntax Type Range
rand() Float \([0,1)\)
rand(0) Integer \([\text{INT\_MIN} .. \text{INT\_MAX}]\)
rand(n) Integer \([0 .. n]\)
rand(m,n) Integer \([m .. n]\)
rand(range) Integer See ranges

When called without arguments, rand() returns a pseudo-random floating-point number in the range \([0,1)\). When called with 0, rand(0) returns a pseudo-random Riff integer (signed 64-bit). When called with an integer n, rand(n) returns a pseudo-random Riff integer in the range \([0 .. n]\). n can be negative. When called with 2 arguments m and n, rand(m,n) returns a pseudo-random integer in the range \([m .. n]\). m can be greater than n.
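
The calling forms can be summarized as follows. Results are pseudo-random, so only the ranges are shown:

rand()          // float in [0,1)
rand(0)         // any signed 64-bit integer
rand(6)         // integer in [0 .. 6]
rand(1, 6)      // integer in [1 .. 6], e.g. a die roll
rand(6, 1)      // also valid, since m can be greater than n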

srand([x])

Initializes the PRNG with seed x. If x is not given, time(0) is used. When the PRNG is initialized with a seed x, rand() will always produce the same sequence of numbers.

The following is only an example and may not accurately reflect the expected output for any particular version of Riff.

srand(3)    // Initialize PRNG with seed "3"
rand()      // 0.783235
rand()      // 0.863673

srand() returns the value used to seed the PRNG.
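
Because srand() returns its seed, a sequence can be replayed later. A minimal sketch (the seed 42 and the variable names are arbitrary):

seed = srand(42)    // seed is 42
a = rand(100)
b = rand(100)

srand(seed)         // restart the same sequence
c = rand(100)       // same value as a
d = rand(100)       // same value as b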

String Functions

byte(s[,i])

Returns the integer value of byte i in string s. i is 0 unless specified by the user.

s = "hello"
byte(s)     // 104
byte(s,2)   // 108

char(...)

Takes zero or more integers and returns a string composed of the character codes of each argument in order. char() accepts valid Unicode code points as arguments.

char(104, 101, 108, 108, 111)   // "hello"
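
For single-byte (ASCII) characters, byte() and char() act as inverses of each other:

byte("A")                   // 65
char(65)                    // "A"
char(byte("hello", 4))      // "o"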

fmt(f, ...)

Returns a formatted string of the arguments following the format string f. This function largely resembles the C function sprintf() without support for length modifiers such as l or ll.

Each conversion specification in the format string begins with a % character and ends with a character which determines the conversion of the argument. Each specification may also contain one or more flags following the initial % character.

Format modifiers
Flag Description
0 For numeric conversions, leading zeros are used to pad the string instead of spaces, which is the default.
+ The sign is prepended to the resulting conversion. This only applies to signed conversions (d, f, g, i).
space If the result of a signed conversion is non-negative, a space is prepended to the conversion. This flag is ignored if + is specified.
- The resulting conversion is left-justified instead of right-justified, which is the default.

A minimum field width can be specified following any flags (or % if no flags are specified), provided as an integer value. The resulting conversion is padded with spaces on the left by default, or on the right if left-justified. A * can also be specified in lieu of an integer, where an argument will be consumed (as an integer) and used to specify the minimum field width.

The precision of the conversion can be specified with . and an integer value or *, similar to the minimum field width specifier. For numeric conversions, the precision specifies the minimum number of digits for the resulting conversion. For strings, it specifies the maximum number of characters in the conversion. Precision is ignored for character conversions (%c).

The table below outlines the available conversion specifiers.

Format conversion specifiers
Specifier Description
% A literal %
a / A A number in hexadecimal exponent notation (lowercase/uppercase).
b An unsigned binary integer.
c A single character.
d / i A signed decimal integer.
e / E A number in decimal exponent notation (lowercase e/uppercase E used).
f / F A decimal floating-point number.
g / G A decimal floating-point number, either in standard form (f/F) or exponent notation (e/E); whichever is shorter.
m A multi-character string (from an integer, similar to %c).
o An unsigned octal integer.
s A character string.
x / X An unsigned hexadecimal integer (lowercase/uppercase).

Note that the %s format specifier will try to handle arguments of any type, falling back to %d for integers and %g for floats.
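
The flags, field width and precision described above combine as in the following sketch. The results shown assume fmt() matches C sprintf() behavior for these conversions:

fmt("%5d|", 42)         // "   42|"
fmt("%-5d|", 42)        // "42   |"
fmt("%05d", 42)         // "00042"
fmt("%+d", 42)          // "+42"
fmt("%*d", 5, 42)       // "   42"
fmt("%.3s", "riff!")    // "rif"
fmt("%x", 255)          // "ff"
fmt("%%")               // "%"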

gsub(s,p[,r])

Returns a copy of s, as a string, where all occurrences of p, treated as a regular expression, are replaced with r. If r is not provided, all occurrences of p will be stripped from the return string.

If r is a string, dollar signs ($) are parsed as escape characters which can specify the insertion of substrings from capture groups in p.

Format Description
$$ Insert a literal dollar sign character ($)
$x / ${x} Insert a substring from capture group x (either name or number)
$*MARK / ${*MARK} Insert a control verb name

// Simple find/replace
gsub("foo bar", /bar/, "baz")   // "foo baz"

// Strip whitespace from a string
gsub("a b c d", /\s/)           // "abcd"

// Find/replace with captures
gsub("riff", /(\w+)/, "$1 $1")  // "riff riff"

hex(x)

Returns a string with the hexadecimal representation of x as an integer. The string is prefixed with "0x". Riff currently converts all arguments to integers.

hex(123)    // "0x7b"
hex(68.7)   // "0x44"
hex("45")   // "0x2d"

lower(s)

Returns a copy of string s with all uppercase ASCII letters converted to lowercase ASCII. All other characters in string s (including non-ASCII characters) are copied over unchanged.
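
For example, with the non-ASCII letter in the second call left unchanged:

lower("Hello, World!")  // "hello, world!"
lower("ÉCOLE")          // "École"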

split(s[,d])

Returns a table with elements being string s split on delimiter d, treated as a regular expression. If no delimiter is provided, the regular expression /\s+/ (whitespace) is used. If the delimiter is the empty string (""), the string is split into a table of single-byte strings.

// Default behavior, split on whitespace
sentence = split("A quick brown fox")

// Print the words on separate lines in order
for word in sentence {
  print(word)
}

// Split string on regex delimiter
words = split("foo1bar2baz", /\d/)
words[0]        // "foo"
words[1]        // "bar"
words[2]        // "baz"

// Split string into single-byte strings
chars = split("Thiswillbesplitintochars","")
chars[0]        // "T"
chars[23]       // "s"

sub(s,p[,r])

Exactly like gsub(), except only the first occurrence of p is replaced in s.
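
The difference from gsub() is easiest to see side by side:

sub("foo bar bar", /bar/, "baz")    // "foo baz bar"
gsub("foo bar bar", /bar/, "baz")   // "foo baz baz"

// With r omitted, only the first match is stripped
sub("a b c", /\s/)                  // "ab c"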

upper(s)

Returns a copy of string s with all lowercase ASCII letters converted to uppercase ASCII. All other characters in string s (including non-ASCII characters) are copied over unchanged.
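
For example, with digits, punctuation and the non-ASCII letter left unchanged:

upper("riff v1.0")  // "RIFF V1.0"
upper("café")       // "CAFé"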

System Functions

clock()

Returns the approximate CPU time (in seconds) since the program began execution. This is implemented as the ISO C function clock() divided by CLOCKS_PER_SEC.
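
One possible use is rough timing of a section of code. The sketch below assumes a range can be iterated directly in a for loop; the loop bound and the variable names are arbitrary, and the measured time will vary between runs and machines.

start = clock()
for i in 1..1000000 {
  x = sqrt(i)
}
elapsed = clock() - start
printf("%f seconds\n", elapsed)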

exit([s])

Terminates the program by calling the ISO C function exit() with status s. If s is not provided, the program terminates with status 0.


  1. Unless someone else has a really good idea how to handle that