Built-in symbols

In most implementations of Lisp, both the built-in symbols and the user-defined symbols are stored in the Lisp workspace, which resides in RAM. However, with uLisp this clearly wasn't going to be possible, as I was targeting microcontrollers with very limited RAM; for example, my minimum target system was the Arduino Uno, with 32K bytes of flash program memory but only 2K bytes of RAM. The first released version of uLisp had 108 symbols with a total length of 594 characters, which would require a significant proportion of the RAM for a RAM-based lookup table. The latest release of AVR uLisp has 180 symbols, and still fits on the Arduino Uno.

The solution was to store the built-in symbols in a lookup table in program memory. This document describes the structure of that table, and the routines used to access symbols in the table.

The built-in symbols are defined by three components: a list of symbol names, a lookup table, and an enum list.

The list of symbol names

The built-in symbol names are defined by a list of string definitions; for example, here is a section of the list:

const char string89[] PROGMEM = "abs";
const char string90[] PROGMEM = "random";
const char string91[] PROGMEM = "max";
const char string92[] PROGMEM = "min";

The strings are defined as PROGMEM which on AVR platforms specifies that they should be stored in program memory.

The lookup table

The lookup table is a constant array defined in program memory called lookup_table[]:

 const tbl_entry_t lookup_table[] PROGMEM = {

Each entry in the lookup table is specified by the tbl_entry_t structure:

typedef struct {
  PGM_P string;
  fn_ptr_type fptr;
  uint8_t minmax;
} tbl_entry_t;

These fields are as follows:

  • string is the symbol name; for example "random".
  • fptr is a pointer to the function defining the symbol.
  • minmax is an 8-bit constant specifying the minimum and maximum number of arguments the function can have.

The string field references the appropriate string in the list of symbol names.

The fptr field is a pointer to a function that takes two Lisp objects, and returns a Lisp object. Its type is fn_ptr_type defined by:

typedef object *(*fn_ptr_type)(object *, object *);

The minmax field consists of two 4-bit nibbles. The first nibble defines the minimum number of arguments, and the second nibble defines the maximum number of arguments. If either nibble is 0xF the number of arguments is unlimited.

Here's a section of the lookup table:

  { string89, fn_abs, 0x11 },
  { string90, fn_random, 0x11 },
  { string91, fn_maxfn, 0x1F },
  { string92, fn_minfn, 0x1F },

For example, the entry for the Lisp function random references the string "random", the function pointer to the C routine fn_random(), and the minimum and maximum number of arguments is 1.

Here's the function definition:

object *fn_random (object *args, object *env) {
  (void) env;
  int arg = checkinteger(RANDOM, first(args));
  return number(random(arg));
}

The arguments of all functions are:

  • args - the list of arguments.
  • env - the environment of local variables in which the function is being evaluated.

In this case the function simply calls the Arduino function random() with the first argument, converts the result to a Lisp number object, and returns it.

The enum list

The structure defining the random function corresponds to the element lookup_table[90]. To make it easier to refer to each function, an enum associates the number of each element with a memorable symbol. For example, the C symbol RANDOM is equal to 90.

The appropriate part of the enum corresponding to the section of the lookup table shown above is:

enum function { 
...
ABS, RANDOM, MAXFN, MINFN,
...
}

Obviously, for all this to work correctly the list of symbol names, the lookup table, and the enum list must be in synchrony. This isn't as tricky as it sounds, as they are all generated automatically by the uLisp Builder, a Lisp program that compiles the sources to create a source file for a particular platform of uLisp.

Routines

The following routines are used to perform functions with the lookup table.

Most of these routines contain alternative versions of some statements, omitted from the following examples, to cater for the fact that on the older AVR processors, data in program memory has to be accessed through the special functions pgm_read_byte() and pgm_read_word().

builtin

The function builtin() looks up a string in the lookup table, and returns the index of its entry, or ENDFUNCTIONS if no match is found:

int builtin (char* n) {
  int entry = 0;
  while (entry < ENDFUNCTIONS) {
    if (strcasecmp(n, (char*)lookup_table[entry].string) == 0) return entry;
    entry++;
  }
  return ENDFUNCTIONS;
}

lookupfn

The function lookupfn() looks up an entry in the lookup table, and returns the function entry point:

intptr_t lookupfn (symbol_t name) {
  return (intptr_t)lookup_table[name].fptr;
}

getminmax

The function getminmax() looks up an entry in the lookup table, and returns the minmax byte:

uint8_t getminmax (symbol_t name) {
  uint8_t minmax = lookup_table[name].minmax;
  return minmax;
}

lookupbuiltin

The function lookupbuiltin() looks up an entry in the symbol table and returns the name of the function as a string:

char *lookupbuiltin (symbol_t name) {
  char *buffer = SymbolTop;
  strcpy(buffer, (char *)(lookup_table[name].string));
  return buffer;
}