I tested the following simple code with NVIDIA's nvcc compiler. When N is less than or equal to 512, the program runs fine. But when I set N to a value greater than 512, it crashes with a segmentation fault. What's the reason for this?
#define N 1024 //changing value

int main(int argc, char *argv[]) {
    float hA[N][N], hB[N][N], hC[N][N];

    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            hA[i][j] = 1;
            hB[i][j] = 1;
        }
    }
}
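The most likely cause is a stack overflow: the three arrays are local variables, so they live on the stack. With N = 1024 they need 3 × 1024 × 1024 × 4 bytes = 12 MiB, which is more than the default stack size on most systems (commonly 8 MiB on Linux and 1 MiB on Windows), whereas with N = 512 they need only 3 MiB. The fix is to allocate the matrices on the heap instead. There are basically two ways to do that. The most common is to use a pointer to pointer to float, allocating the outer dimension first and then each inner dimension in a loop: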
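A minimal sketch of the pointer-to-pointer approach. The helper names `allocMatrix` and `freeMatrix` are hypothetical, introduced only for illustration, and the sketch assumes plain `new`/`delete` rather than any particular allocator:

```cpp
#include <cstddef>

// Hypothetical helper: allocate a rows x cols matrix as an array of row pointers.
float** allocMatrix(std::size_t rows, std::size_t cols) {
    float** m = new float*[rows];           // outer dimension: one pointer per row
    for (std::size_t i = 0; i < rows; i++) {
        m[i] = new float[cols];             // inner dimension: one heap block per row
    }
    return m;
}

// Hypothetical helper: release the matrix in the reverse order of allocation.
void freeMatrix(float** m, std::size_t rows) {
    for (std::size_t i = 0; i < rows; i++) {
        delete[] m[i];
    }
    delete[] m;
}
```

With this, `float** hA = allocMatrix(N, N);` keeps the familiar `hA[i][j]` indexing. Note that the rows are separate allocations, so the matrix is not one contiguous block of memory.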
The second way is to have a pointer to an array, and allocate that:
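A sketch of the pointer-to-array variant, assuming the dimension is a compile-time constant. Here `N` is written as a `constexpr` instead of the question's `#define`, and the alias `Row` and the helper `allocSquare` are hypothetical names:

```cpp
#include <cstddef>

constexpr std::size_t N = 1024;   // assumed compile-time dimension

using Row = float[N];             // one row of N floats

// Hypothetical helper: allocate all N x N floats in a single heap block.
// The trailing () value-initializes every element to zero.
Row* allocSquare() {
    return new float[N][N]();
}
```

Usage would look like `Row* hA = allocSquare();` followed by `hA[i][j] = 1;`, with a single `delete[] hA;` at the end. Unlike the pointer-to-pointer version, the whole matrix is one contiguous allocation.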
But all that is moot, since you might as well use std::vector instead:
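A sketch of how the std::vector version might look. The alias `Matrix` and the helper `makeMatrix` are hypothetical names, not part of the original post:

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// Hypothetical helper: build an n x n matrix with every element set to `value`.
// The storage lives on the heap and is released automatically.
Matrix makeMatrix(std::size_t n, float value) {
    return Matrix(n, std::vector<float>(n, value));
}
```

For the code in the question, `Matrix hA = makeMatrix(N, 1.0f);` would replace both the declaration and the initialization loop. If the data later has to be copied to the GPU in one call, a single flat `std::vector<float>` of `N * N` elements (indexed as `i * N + j`) is probably the better fit, because a vector of vectors does not store its rows contiguously.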