I build a 32-bit RISC-V CPU with Harvard architecture and I want to write programs for it in C. I have a RISC-V compiler set (https://xpack.github.io/riscv-none-embed-gcc/) that can do just that and works fine - for most things. The problem starts when I want to work with global variables, global arrays, etc, because these types get copied to RAM on boot/reset by the start script.
Here is a block diagram of my CPU: (This will be important later. Just note that the Instruction memory = FLASH and Data memory = RAM)
(If you are interested about my CPU, I made a video about it: https://www.youtube.com/watch?v=KzSaFFpBPDM)
Example:
A typical program will look something like this:
#include <stdint.h>
int static_var_1 = 2;
int static_var_2 = 4;
int main(void)
{
int var = static_var_1 + static_var_2;
}
And its objdump something like this:
/opt/xpack-riscv-none-embed-gcc-10.1.0-1.1/riscv-none-embed/bin/objdump build/APP.elf -D
build/APP.elf: file format elf32-littleriscv
Disassembly of section .text:
00000000 <_start>:
0: 00080137 lui sp,0x80
4: ffc10113 addi sp,sp,-4 # 7fffc <_estack>
8: 00c000ef jal ra,14 <main>
c: 0040006f j 10 <_exit>
00000010 <_exit>:
10: 0000006f j 10 <_exit>
00000014 <main>:
14: fe010113 addi sp,sp,-32
18: 00812e23 sw s0,28(sp)
1c: 02010413 addi s0,sp,32
20: 00002703 lw a4,0(zero) # 0 <_start>
24: 00402783 lw a5,4(zero) # 4 <static_var_2>
28: 00f707b3 add a5,a4,a5
2c: fef42623 sw a5,-20(s0)
30: 00000793 li a5,0
34: 00078513 mv a0,a5
38: 01c12403 lw s0,28(sp)
3c: 02010113 addi sp,sp,32
40: 00008067 ret
Disassembly of section .data:
00000000 <static_var_1>:
0: 0002 c.slli64 zero
...
00000004 <static_var_2>:
4: 0004 0x4
...
Disassembly of section ._user_heap_stack:
00000008 <._user_heap_stack>:
...
(the <_start> is a part of my start script that will initialize stack pointer)
The catch:
These are the two instructions that tries to load the global variables:
20: 00002703 lw a4,0(zero) # 0 <_start>
24: 00402783 lw a5,4(zero) # 4 <static_var_2>
But there is a problem - they were never put into RAM, so the CPU will most likely end up with some garbage data, which is unacceptable.
The solution?
Somebody suggested linker relaxation as part of my previous question (RISC-V: Global variables), again, that doesn't seem to be the case, but I can still be wrong though!
From my research, most of the "classic" CPUs use a start script, where the copying takes place, but as this is not a von-neuman architecture, I don't have the FLASH memory mapped to data memory and therefor cannot be read by the program (see the block diagram above). The output program must contain the variables already decoded as executable instructions, for example if we want value 0x4 in RAM at position 0x0, It can be decoded to:
addi t0, zero, 0x4
sw t0, 0(zero)
Re-building my CPU as von-neuman would require much more gates and ICs and this is a discrete build where every IC counts.
Doing it by hardware is for me the worst solution as I stated above, so if it can be done in software I'm all for it - and It can! Obviously, there is a solution, but by far the ugliest: Compile the code, extract the data (with python), generate a new startup script with these variables decoded by the python script and compile it again.
I really don't want to go that route, so if it can be done by modifying startup script, linker, etc, it would be really, really great.
AVR ICs are basically Harvard architecture (though modified) so do they something differently that we can learn from?