RISC-V assembler overview
The RISC-V version of uLisp lets you write machine-code functions in RISC-V assembler and call them from Lisp. It currently supports only the Sipeed MAiX RISC-V boards.
The RISC-V uLisp assembler has the following features:
- You can create multiple named machine-code functions, limited only by the amount of code memory available.
- Machine-code functions are created with a defcode special form, which has a similar syntax to defun.
- You can include labels in your assembler listing simply by including them as symbols in the body of the defcode form. The defcode form creates these as local variables.
- The defcode form automatically does a two-pass assembly to resolve forward references, used in branches and memory references.
- The defcode form generates an assembler listing, showing the mnemonics and the machine-code generated from them.
- The machine-code functions are saved with save-image, and restored with load-image.
The assembler itself is written in Lisp to make it easy to extend it or add new instructions. For example, you could add support for RISC-V floating-point instructions.
Installing the assembler
Get the assembler here: RISC-V assembler in uLisp. To add it to uLisp use one of the following two methods:
Using the Serial Monitor
There is currently a problem with entering very long Lisp sources into the Maixduino using the Arduino Serial Monitor; the solution is to copy and paste it in two chunks:
- Select the first half of the file and do Copy.
- Paste it into the field at the top of the Arduino IDE Serial Monitor window.
- Press Return to enter it into uLisp.
- Repeat for the second half of the file.
Using an SD card
- Copy the source listing from your computer to an SD card.
- Give it a name of no more than 8 characters, and a three-character extension, such as "ASM.TXT".
- Put the SD card in your Maixduino board and run uLisp.
- Enter the following program:
(defun load (filename)
  (with-sd-card (str filename)
    (loop
      (let ((form (read str)))
        (unless form (return))
        (print (second form))
        (eval form)))))
- Give the load command followed by the filename; for example:
(load "ASM.TXT")
The command will print out the name of each function as it is loaded from the file.
Saving an image
Once you have used either of these methods to load the assembler, you can save the uLisp image to an SD card using save-image.
In future you can then simply reload it using load-image.
The defcode form
The assembler uses a special defcode form to generate machine-code functions.
defcode special form
(defcode name (parameters) form*)
The defcode form is similar in syntax to defun. It creates a named machine-code function from a series of 16-bit integers given in the body of the form. These are written into RAM, and can be executed by calling the function in the same way as a normal Lisp function.
(defcode mul13 (x) #x45b5 #x0533 #x02b5 #x8082)
creates a machine-code routine called mul13 with one parameter. It consists of three instructions that multiply its single integer argument by 13. For example:
> (mul13 10)
130
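The body above contains four 16-bit values for three instructions because the multiply is a 32-bit instruction, stored as two 16-bit halves with the low half first. The following sketch shows that split in plain Lisp; halfwords is a hypothetical helper name, and #x02b50533 is the standard 32-bit encoding of mul a0,a0,a1.

```lisp
; Hypothetical helper, not part of the assembler:
; split a 32-bit instruction word into its two 16-bit halves, low half first.
(defun halfwords (word)
  (list (logand word #xffff)
        (logand (ash word -16) #xffff)))

; mul a0,a0,a1 encodes as #x02b50533
(halfwords #x02b50533)
```

The two values in the returned list are #x0533 and #x02b5, which are the second and third items in the defcode body.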
Functions defined with defcode can take up to four parameters. These are passed to the machine-code routine in the registers a0 to a3 respectively. The symbols used for the four parameters can be used as synonyms for the corresponding registers a0 to a3 in the body of the defcode form.
The machine-code function should return the result back to uLisp in a0. This is returned as an integer.
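As a minimal sketch of this calling convention, the following routine takes two parameters and multiplies them. The name mul2 is hypothetical; the two instructions are the same ones used in the mul13 example in this section, and the code assumes the assembler has already been loaded.

```lisp
; Hypothetical two-parameter routine.
; The first argument arrives in a0 and the second in a1.
(defcode mul2 (x y)
  ($mul 'a0 'a0 'a1)   ; a0 = a0 * a1
  ($ret))              ; return to uLisp with the result in a0
```

Calling (mul2 6 7) passes 6 in a0 and 7 in a1, so the routine should hand their product back to uLisp in a0.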
Although you can supply machine-code instructions as hexadecimal op-codes, the assembler is more convenient as it allows you to write machine-code functions in RISC-V mnemonics. It is written in uLisp.
Where possible the syntax is very similar to RISC-V assembler syntax, with the following differences:
- The mnemonics are prefixed by '$' (because some mnemonics, such as and, or, and not, are already defined in Lisp).
- Registers are represented as symbols, prefixed with a quote. Constants are just numbers.
Assembler instructions are just Lisp functions, so you can see the code they generate:
> ($li 'a1 13)
17845
The assembler includes a function x16 to print a 16-bit value in hexadecimal, so you can see the result in hexadecimal by writing:
> (x16 ($li 'a1 13))
#x45b5
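The value #x45b5 matches the 16-bit compressed C.LI encoding from the RISC-V C extension. The following sketch rebuilds it by hand in plain Lisp to show where each field goes. The name encode-c-li is hypothetical and is not claimed to be how the assembler implements $li; the register is given by number, with a1 being x11.

```lisp
; Hypothetical illustration of the C.LI bit layout:
;   bits 15-13 = 010, bit 12 = imm[5], bits 11-7 = rd,
;   bits 6-2 = imm[4:0], bits 1-0 = 01
(defun encode-c-li (rd imm)
  (logior #x4001                          ; fixed bits: funct3 010, opcode 01
          (ash (logand (ash imm -5) 1) 12) ; imm[5] (sign bit)
          (ash rd 7)                       ; destination register number
          (ash (logand imm #x1f) 2)))      ; imm[4:0]

; li a1, 13 with a1 = x11
(encode-c-li 11 13)
```

This evaluates to 17845, the same value that ($li 'a1 13) returns above.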
The following table shows typical RISC-V assembler formats, and the equivalent in this Lisp assembler:
|Format|RISC-V assembler|uLisp assembler|
|---|---|---|
|Registers|mv a1, a2|($mv 'a1 'a2)|
|Immediate|li a0, 2|($li 'a0 2)|
|Load|ld a0, 8(sp)|($ld 'a0 8 '(sp))|
|Load in-line constant|ldr r0, label|($ldr 'r0 label)|
|Branch|ble a0, a1, label|($ble 'a0 'a1 label)|
|Jump to subroutine|jal label|($jal label)|
Here's a simple example consisting of three RISC-V instructions that multiplies its parameter by 13 and returns the result:
(defcode mul13 (x) ($li 'a1 13) ($mul 'a0 'a0 'a1) ($ret))
Evaluating this generates an assembler listing as follows:
0000 45b5 ($li 'a1 13)
0002 0533 ($mul 'a0 'a0 'a1)
0004 02b5
0006 8082 ($ret)
You can then call the function from Lisp:
> (mul13 11)
143
The result is the number returned in the a0 register.
Note that functions written using defcode can't be relied upon to have a fixed position in memory and so should be position independent, and use only relative branches and memory references within the machine-code function.
You can include symbols in the body of the defcode form to create labels. The defcode assembler automatically creates these as local variables, and then does a two-pass assembly to resolve forward references. The assembler can then access these variables to calculate the offsets in branches and pc-relative addressing.
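The idea behind two-pass label resolution can be sketched in plain Lisp as follows. This is an illustration only, not the assembler's own code: the function names label-addresses and branch-offsets, and the representation of each instruction as a list of its size in bytes and an optional target label, are assumptions made for the example.

```lisp
; Pass 1: walk the body, recording the address of each label.
; An item is either a label (a symbol) or a list (size target-or-nil).
(defun label-addresses (items)
  (let ((pc 0) (table nil))
    (dolist (item items)
      (if (symbolp item)
          (push (cons item pc) table)
          (setq pc (+ pc (first item)))))
    table))

; Pass 2: walk the body again, computing the pc-relative offset
; of every instruction that refers to a label.
(defun branch-offsets (items)
  (let ((table (label-addresses items)) (pc 0) (result nil))
    (dolist (item items)
      (unless (symbolp item)
        (when (second item)
          (push (- (cdr (assoc (second item) table)) pc) result))
        (setq pc (+ pc (first item)))))
    (reverse result)))

; A 4-byte forward branch to done, then a 2-byte backward branch to again.
(branch-offsets '((4 done) again (2 nil) (2 again) done (2 nil)))
```

The call returns (8 -2). The forward reference to done can be resolved because the table of label addresses is complete before any offset is calculated.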
Note also that because uLisp requires a comment starting with a semicolon to be terminated by an open parenthesis, you can't put such a comment immediately before a label. This limitation exists because the Arduino Serial Monitor removes all line-break characters. You can use bracketing comments instead:
#| This is a comment |#
For example, here's a simple routine to calculate the Greatest Common Divisor, which uses two labels:
; Greatest Common Divisor
(defcode gcd (a b)
  swap
  ($mv 'a2 'a1)
  ($mv 'a1 'a0)
  again
  ($mv 'a0 'a2)
  ($sub 'a2 'a2 'a1)
  ($bltz 'a2 swap)
  ($bnez 'a2 again)
  ($ret))
Evaluating this form generates the following assembler listing:
0000      swap
0000 862e ($mv 'a2 'a1)
0002 85aa ($mv 'a1 'a0)
0004      again
0004 8532 ($mv 'a0 'a2)
0006 8e0d ($sub 'a2 'a2 'a1)
0008 4ce3 ($bltz 'a2 swap)
000a fe06
000c fe65 ($bnez 'a2 again)
000e 8082 ($ret)
For example, to find the GCD of 3287 and 3460:
> (gcd 3287 3460)
173
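To follow the register-level logic, it can help to read a plain Lisp model of the same loop. In this sketch gcd-model is a hypothetical name, and the variables a0, a1, and a2 simply mirror the registers used by the machine-code routine; it is an aid to reading the assembler, not part of it.

```lisp
; Hypothetical Lisp model of the gcd machine-code routine.
(defun gcd-model (a0 a1)
  (let ((a2 0) (done nil))
    (loop
      (when done (return a0))
      ; swap:
      (setq a2 a1)
      (setq a1 a0)
      (loop
        ; again:
        (setq a0 a2)
        (setq a2 (- a2 a1))
        (when (< a2 0) (return))                  ; bltz a2 swap
        (when (= a2 0) (setq done t) (return))))))  ; falls through to ret

(gcd-model 3287 3460)
```

This returns 173, the same result as the machine-code version. The inner loop repeatedly subtracts the smaller value, and the outer loop swaps the two values whenever the subtraction goes negative.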
For a summary of the RISC-V assembler instructions see RISC-V assembler instructions.
For some more complex examples see RISC-V assembler examples.