I am simulating bare-metal executables on Rocket Chip with Verilator. When I use a large array such as
float a[3][224][224]
the simulation does not run successfully.
Here is my main.cpp:
#include <stdio.h>

int main(int argc, char** argv)
{
    printf("CPU+cgra execute Resnet1!\n");
    long long unsigned start;
    long long unsigned end;
    // Local array: 1 * 3 * 224 * 224 floats = 602112 bytes (588 KiB)
    float a[1][3][224][224];
    int d0, d1, d2, d3;

    // Fill every element with its innermost index.
    for (d0 = 0; d0 < 1; d0++) {
        for (d1 = 0; d1 < 3; d1++) {
            for (d2 = 0; d2 < 224; d2++) {
                for (d3 = 0; d3 < 224; d3++) {
                    a[d0][d1][d2][d3] = d3;
                }
            }
        }
    }
    printf("value assign finished!\n");

    // Print every element, scaled by 10000 and truncated to an int.
    for (d0 = 0; d0 < 1; d0++) {
        for (d1 = 0; d1 < 3; d1++) {
            for (d2 = 0; d2 < 224; d2++) {
                for (d3 = 0; d3 < 224; d3++) {
                    int I = (int)(a[d0][d1][d2][d3] * 10000);
                    printf("a[%d][%d][%d][%d]:%d\n", d0, d1, d2, d3, I);
                }
            }
        }
    }
    return 0;
}
Note: I compile this file with riscv-unknown-elf-gcc in chipyard/.conda-env.
Not even the first printf produces any output.
Increasing the size of the L2 cache does not help either. How can I handle this problem? Thanks.
Update: increasing the L2 cache does seem to work for my problem, but the simulation is slow.
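For reference, the array holds 1 * 3 * 224 * 224 = 150528 floats, which is 602112 bytes (588 KiB) with 4-byte floats, and as a local variable in main it is placed on the stack. One possible cause, not confirmed by anything in this post, is that this exceeds the stack reserved by the bare-metal startup code and linker script. A minimal sketch of a variant to try under that assumption gives the array static storage instead. The names DIM0 to DIM3 and fill_input are hypothetical, introduced only for illustration:

```cpp
#include <cstddef>
#include <cstdio>

// Dimensions copied from main.cpp above.
constexpr std::size_t DIM0 = 1, DIM1 = 3, DIM2 = 224, DIM3 = 224;

// Static storage duration: the array is no longer part of main's stack frame.
// Assumption: the failure is stack-related, which has not been verified.
static float a[DIM0][DIM1][DIM2][DIM3];

// 150528 floats at 4 bytes each; fails to compile if float is not 4 bytes.
static_assert(sizeof(a) == 602112, "expected 4-byte float");

// Same initialization as the first loop nest in main.cpp.
void fill_input() {
    for (std::size_t d0 = 0; d0 < DIM0; d0++) {
        for (std::size_t d1 = 0; d1 < DIM1; d1++) {
            for (std::size_t d2 = 0; d2 < DIM2; d2++) {
                for (std::size_t d3 = 0; d3 < DIM3; d3++) {
                    a[d0][d1][d2][d3] = static_cast<float>(d3);
                }
            }
        }
    }
}
```

To try it, the local declaration of a in main would be removed and fill_input() called in place of the first loop nest; the printing loop can stay as it is.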