Download rivasm.jar
Download rivasm.jar. Bug reports are welcome (Send comments to yamin@hosei.ac.jp).
Screen Shots
Below is the Rivasm main window. An Editor window is in the behand (there is also a Console window).
Editor window:
Console window:
RISC-V RV32IM instructions implemented in Rivasm:
RISC-V Assembly Program Examples
Click a link below and copy the contents to the Editor window.
- Selection sort RISC-V assembly program (selection_sort.s)
- Insertion sort RISC-V assembly program (insertion_sort.s)
- Bubble sort RISC-V assembly program (bubble_sort.s)
- Heap sort RISC-V assembly program (heap_sort.s)
- Merge sort RISC-V assembly program (merge_sort.s)
- Quicksort RISC-V assembly program (quick_sort.s)
- Random numbers RISC-V assembly program (random.s)
- Matrix multiplication RISC-V assembly program (matrix.s)
- Matrix multiplication by shift and addition RISC-V assembly program (matrix_mul_by_add_shift.s)
Then click Assemble button in the Editor window and click Run or Step button in the Main window.
RISC-V RV32IM CPU Implementation on FPGA
The image below shows a RISC-V RV32IM system that displays keyboard scancodes (make codes + break codes) on VGA monitor. The system consists of an RV32IM CPU, instruction memory, data memory, VRAM, a ps2 keyboard interface, and a VGA controller.
Below is the assembly code which will be translated into binary RV32IM machine code and executed on the RISC-V RV32IM CPU for displaying scancodes. The cyan caret was implemented with hardware. The instruction "csrw 0x800, a4" writes the caret background color (RGB = 0x0ff, cyan color) and current caret position (row and column) into a caret register.
By clicking Verilog button in the Rivasm main window, a Verilog HDL file for implementing instruction memory (ROM) is created, as shown as below.
Xilinx COE or Altera MIF can be also created by clicking Xilinx or Altera button.
riscv_rv32i_cpu.v
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 |
// // RISC-V RV32I CPU, By Li Yamin, yamin@ieee.org, Fri Jun 28 09:19:13 JST 2019 // module riscv_rv32i_cpu (clk,clrn,pc,inst,m_addr,d_f_mem,d_t_mem,write, io_rdn,io_wrn,rvram,wvram,read); input clk, clrn; // clock and reset input [31:0] inst; // instruction input [31:0] d_f_mem; // load data output [31:0] pc; // program counter output [31:0] m_addr; // mem or i/o addr output [31:0] d_t_mem; // store data output [3:0] write; // memory byte write enables output read; // memory read output [3:0] wvram; // vram byte write enables output rvram; // vram read output reg io_wrn; // i/o write output reg io_rdn; // i/o read // control signals reg wreg; // write regfile reg [3:0] wmem; // write memory byte enables reg rmem; // write/read memory reg [31:0] alu_out; // alu output reg [31:0] mem_out; // mem output reg [31:0] m_addr; // mem address reg [31:0] next_pc; // next pc reg [31:0] d_t_mem; wire [31:0] pc_plus_4 = pc + 4; // pc + 4 // instruction format wire [6:0] opcode = inst[6:0]; // wire [2:0] func3 = inst[14:12]; // wire [6:0] func7 = inst[31:25]; // wire [4:0] rd = inst[11:7]; // wire [4:0] rs = inst[19:15]; // = rs1 wire [4:0] rt = inst[24:20]; // = rs2 wire [4:0] shamt = inst[24:20]; // == rs2; wire sign = inst[31]; wire [11:0] imm = inst[31:20]; // branch offset 31:13 12 11 10:5 4:1 0 wire [31:0] broffset = {{19{sign}},inst[31],inst[7],inst[30:25],inst[11:8],1'b0}; // beq, bne, blt, bge, bltu, bgeu wire [31:0] simm = {{20{sign}},inst[31:20]}; // lw, addi, slti, sltiu, xori, ori, andi, jalr wire [31:0] stimm = {{20{sign}},inst[31:25],inst[11:7]}; // sw wire [31:0] uimm = {inst[31:12],12'h0}; // lui, auipc wire [31:0] jaloffset = {{11{sign}},inst[31],inst[19:12],inst[20],inst[30:21],1'b0}; // jal // jal target 31:21 20 19:12 11 10:1 0 // instruction decode wire i_auipc = (opcode == 7'b0010111); wire i_lui = (opcode == 7'b0110111); wire i_jal = (opcode == 7'b1101111); wire i_jalr = (opcode == 7'b1100111) & (func3 == 3'b000); wire i_beq = (opcode == 7'b1100011) & (func3 == 3'b000); wire i_bne = (opcode == 7'b1100011) & (func3 == 3'b001); wire i_blt = (opcode == 7'b1100011) & (func3 == 3'b100); wire i_bge = (opcode == 7'b1100011) & (func3 == 3'b101); wire i_bltu = (opcode == 7'b1100011) & (func3 == 3'b110); wire i_bgeu = (opcode == 7'b1100011) & (func3 == 3'b111); wire i_lb = (opcode == 7'b0000011) & (func3 == 3'b000); wire i_lh = (opcode == 7'b0000011) & (func3 == 3'b001); wire i_lw = (opcode == 7'b0000011) & (func3 == 3'b010); wire i_lbu = (opcode == 7'b0000011) & (func3 == 3'b100); wire i_lhu = (opcode == 7'b0000011) & (func3 == 3'b101); wire i_sb = (opcode == 7'b0100011) & (func3 == 3'b000); wire i_sh = (opcode == 7'b0100011) & (func3 == 3'b001); wire i_sw = (opcode == 7'b0100011) & (func3 == 3'b010); wire i_addi = (opcode == 7'b0010011) & (func3 == 3'b000); wire i_slti = (opcode == 7'b0010011) & (func3 == 3'b010); wire i_sltiu = (opcode == 7'b0010011) & (func3 == 3'b011); wire i_xori = (opcode == 7'b0010011) & (func3 == 3'b100); wire i_ori = (opcode == 7'b0010011) & (func3 == 3'b110); wire i_andi = (opcode == 7'b0010011) & (func3 == 3'b111); wire i_csrrw = (opcode == 7'b1110011) & (func3 == 3'b001); // not an rv32i instruction wire i_slli = (opcode == 7'b0010011) & (func3 == 3'b001) & (func7 == 7'b0000000); wire i_srli = (opcode == 7'b0010011) & (func3 == 3'b101) & (func7 == 7'b0000000); wire i_srai = (opcode == 7'b0010011) & (func3 == 3'b101) & (func7 == 7'b0100000); wire i_add = (opcode == 7'b0110011) & (func3 == 3'b000) & (func7 == 7'b0000000); wire i_sub = (opcode == 7'b0110011) & (func3 == 3'b000) & (func7 == 7'b0100000); wire i_sll = (opcode == 7'b0110011) & (func3 == 3'b001) & (func7 == 7'b0000000); wire i_slt = (opcode == 7'b0110011) & (func3 == 3'b010) & (func7 == 7'b0000000); wire i_sltu = (opcode == 7'b0110011) & (func3 == 3'b011) & (func7 == 7'b0000000); wire i_xor = (opcode == 7'b0110011) & (func3 == 3'b100) & (func7 == 7'b0000000); wire i_srl = (opcode == 7'b0110011) & (func3 == 3'b101) & (func7 == 7'b0000000); wire i_sra = (opcode == 7'b0110011) & (func3 == 3'b101) & (func7 == 7'b0100000); wire i_or = (opcode == 7'b0110011) & (func3 == 3'b110) & (func7 == 7'b0000000); wire i_and = (opcode == 7'b0110011) & (func3 == 3'b111) & (func7 == 7'b0000000); // pc reg [31:0] pc; always @ (posedge clk or negedge clrn) begin if (!clrn) pc <= 0; else pc <= next_pc; end // data written to register file wire i_load = i_lw | i_lb | i_lbu | i_lh | i_lhu | i_csrrw; wire [31:0] data_2_rf = i_load ? mem_out : alu_out; // register file reg [31:0] regfile [1:31]; // x1 - x31, should be [1:31] wire [31:0] a = (rs==0) ? 0 : regfile[rs]; // read port wire [31:0] b = (rt==0) ? 0 : regfile[rt]; // read port always @ (posedge clk) begin if (wreg && (rd != 0)) begin regfile[rd] <= data_2_rf; // write port end end // vram space wire vr_space = // vram space: alu_out[31] & // 1 alu_out[30] & // 1 // c0000000-dfffffff ~alu_out[29]; // 0 // output signals assign write = wmem & {4{~vr_space}}; // data memory write assign read = rmem & ~vr_space; // data memory read assign wvram = wmem & {4{vr_space}}; // video ram write assign rvram = rmem & vr_space; // video ram read // control signals, will be combinational circuit always @(*) begin // 38 instructions alu_out = 0; // alu output mem_out = 0; // mem output m_addr = 0; // memory address wreg = 0; // write regfile wmem = 4'b0000; // write memory (sw) rmem = 0; // read memory (lw) io_rdn = 1; io_wrn = 1; d_t_mem = b; next_pc = pc_plus_4; case (1'b1) i_add: begin // add alu_out = a + b; wreg = 1; end i_sub: begin // sub alu_out = a - b; wreg = 1; end i_and: begin // and alu_out = a & b; wreg = 1; end i_or: begin // or alu_out = a | b; wreg = 1; end i_xor: begin // xor alu_out = a ^ b; wreg = 1; end i_sll: begin // sll alu_out = a << b[4:0]; wreg = 1; end i_srl: begin // srl alu_out = a >> b[4:0]; wreg = 1; end i_sra: begin // sra alu_out = $signed(a) >>> b[4:0]; wreg = 1; end i_slli: begin // slli alu_out = a << shamt; wreg = 1; end i_srli: begin // srli alu_out = a >> shamt; wreg = 1; end i_srai: begin // srai alu_out = $signed(a) >>> shamt; wreg = 1; end i_slt: begin // slt if ($signed(a) < $signed(b)) alu_out = 1; end i_sltu: begin // sltu if ({1'b0,a} < {1'b0,b}) alu_out = 1; end i_addi: begin // addi alu_out = a + simm; wreg = 1; end i_andi: begin // andi alu_out = a & simm; wreg = 1; end i_ori: begin // ori alu_out = a | simm; wreg = 1; end i_xori: begin // xori alu_out = a ^ simm; wreg = 1; end i_slti: begin // slti if ($signed(a) < $signed(simm)) alu_out = 1; end i_sltiu: begin // sltiu if ({1'b0,a} < {1'b0,simm}) alu_out = 1; end i_lw: begin // lw alu_out = a + simm; m_addr = {alu_out[31:2],2'b00}; // alu_out[1:0] != 0, exception rmem = 1; mem_out = d_f_mem; wreg = 1; end i_lbu: begin // lbu alu_out = a + simm; m_addr = alu_out; rmem = 1; case(m_addr[1:0]) 2'b00: mem_out = {24'h0,d_f_mem[ 7: 0]}; 2'b01: mem_out = {24'h0,d_f_mem[15: 8]}; 2'b10: mem_out = {24'h0,d_f_mem[23:16]}; 2'b11: mem_out = {24'h0,d_f_mem[31:24]}; endcase wreg = 1; end i_lb: begin // lb alu_out = a + simm; m_addr = alu_out; rmem = 1; case(m_addr[1:0]) 2'b00: mem_out = {{24{d_f_mem[ 7]}},d_f_mem[ 7: 0]}; 2'b01: mem_out = {{24{d_f_mem[15]}},d_f_mem[15: 8]}; 2'b10: mem_out = {{24{d_f_mem[23]}},d_f_mem[23:16]}; 2'b11: mem_out = {{24{d_f_mem[31]}},d_f_mem[31:24]}; endcase wreg = 1; end i_lhu: begin // lhu alu_out = a + simm; m_addr = {alu_out[31:1],1'b0}; // alu_out[0] != 0, exception rmem = 1; case(m_addr[1]) 1'b0: mem_out = {16'h0,d_f_mem[15: 0]}; 1'b1: mem_out = {16'h0,d_f_mem[31:16]}; endcase wreg = 1; end i_lh: begin // lh alu_out = a + simm; m_addr = {alu_out[31:1],1'b0}; // alu_out[0] != 0, exception rmem = 1; case(m_addr[1]) 1'b0: mem_out = {{16{d_f_mem[15]}},d_f_mem[15: 0]}; 1'b1: mem_out = {{16{d_f_mem[31]}},d_f_mem[31:16]}; endcase wreg = 1; end i_sb: begin // sb alu_out = a + stimm; m_addr = alu_out; wmem = 4'b0001 << alu_out[1:0]; end i_sh: begin // sh alu_out = a + stimm; m_addr = {alu_out[31:1],1'b0}; // alu_out[0] != 0, exception wmem = 4'b0011 << {alu_out[1],1'b0}; end i_sw: begin // sw alu_out = a + stimm; m_addr = {alu_out[31:2],2'b00}; // alu_out[1:0] != 0, exception wmem = 4'b1111; end i_beq: begin // beq if (a == b) next_pc = pc + broffset; end i_bne: begin // bne if (a != b) next_pc = pc + broffset; end i_blt: begin // blt if ($signed(a) < $signed(b)) next_pc = pc + broffset; end i_bge: begin // bge if ($signed(a) >= $signed(b)) next_pc = pc + broffset; end i_bltu: begin // bltu if ({1'b0,a} < {1'b0,b}) next_pc = pc + broffset; end i_bgeu: begin // bgeu if ({1'b0,a} >= {1'b0,b}) next_pc = pc + broffset; end i_auipc: begin // auipc alu_out = pc + uimm; wreg = 1; end i_lui: begin // lui alu_out = uimm; wreg = 1; end i_jal: begin // jal alu_out = pc_plus_4; wreg = 1; next_pc = pc + jaloffset; end i_jalr: begin // jalr alu_out = pc_plus_4; wreg = 1; next_pc = (a + simm) & 32'hfffffffe; end i_csrrw: begin // csrrw m_addr = {20'h0,imm}; if (rd != 0) begin io_rdn = 0; // csr read mem_out = d_f_mem; wreg = 1; end if (rs != 0) begin io_wrn = 0; // csr write d_t_mem = a; end end default: ; endcase end endmodule |
RISC-V RV32IM Computer System Implemented on DE1-SoC
Download DE1-SoC hocisor_de1_soc.sof (US-Keyboard for Editor).
Display Japanese Characters (Implemented on DE0-CV)
Exercises
- Design an rv32m.v and add it to riscv_rv32i_cpu.v so that you can have a riscv_rv32im_cpu.v.
- mul rd, rs1, rs2
- mulh rd, rs1, rs2
- mulhsu rd, rs1, rs2
- mulhu rd, rs1, rs2
- div rd, rs1, rs2
- divu rd, rs1, rs2
- rem rd, rs1, rs2
- remu rd, rs1, rs2
- Design a five-stage pipelined RISC-V RV32IM CPU riscv_rv32im_pipelined_cpu.v.
- Design a pipelined RISC-V RV32IMF CPU/FPU with TLBs and caches riscv_rv32imf_pipelined_cpu_tlb_cache.v.