I'm currently learning about DNNs and am working on a lab assignment that requires implementing different optimization techniques for matrix multiplication.
I have written a loop-tiled version and am running some tests, during which I found that loop tiling can actually make the multiplication slower when the matrix is relatively small. Why would tiling have this negative effect? Is there anything wrong with my code?
The main part of my C++ code is shown below:
    constexpr int n = 256;
    int A[n][n];
    int B[n][n];
    int C[n][n];

    // define the size of tile
    int tile_i = 16;
    int tile_j = 16;
    int tile_k = 16;

    int min(int a, int b) {
        return a < b ? a : b;
    }

    // without loop tiling
    void matmul() {
        memset(C, 0, sizeof(C));
        for (int i = 0; i < n; i++) {
            for (int k = 0; k < n; k++) {
                for (int j = 0; j < n; j++) {
                    C[i][j] += A[i][k] * B[k][j];
                }
            }
        }
    }

    // with loop tiling optimization
    void matmul_loop_tiling() {
        memset(C, 0, sizeof(C));
        for (int i_t = 0; i_t < n; i_t += tile_i) {
            for (int k_t = 0; k_t < n; k_t += tile_k) {
                for (int j_t = 0; j_t < n; j_t += tile_j) {
                    for (int i = i_t; i < min(n, i_t + tile_i); i++) {
                        for (int k = k_t; k < min(n, k_t + tile_k); k++) {
                            for (int j = j_t; j < min(n, j_t + tile_j); j++) {
                                C[i][j] += A[i][k] * B[k][j];
                            }
                        }
                    }
                }
            }
        }
    }
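For context, here is a minimal sketch of how the two versions could be timed and checked against each other. It is an illustration under assumptions, not the exact test setup: the helper names (`Matrix`, `matmul_naive`, `matmul_tiled`, `best_time_ms`) are hypothetical, the matrices are stored row-major in `std::vector<int>` instead of the global arrays, a single tile size is used for all three dimensions, and the tile bounds are computed once per tile rather than in every loop condition.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <limits>
#include <vector>

// Hypothetical helpers for illustration; none of these names come from the code above.
using Matrix = std::vector<int>;  // n x n matrix stored row-major

// Untiled multiplication with the same i-k-j loop order as matmul().
void matmul_naive(const Matrix& A, const Matrix& B, Matrix& C, int n) {
    std::fill(C.begin(), C.end(), 0);
    for (int i = 0; i < n; i++) {
        for (int k = 0; k < n; k++) {
            for (int j = 0; j < n; j++) {
                C[i * n + j] += A[i * n + k] * B[k * n + j];
            }
        }
    }
}

// Tiled multiplication. Assumption: one tile size for i, k and j, and the
// upper bound of each tile is computed once instead of in each loop condition.
void matmul_tiled(const Matrix& A, const Matrix& B, Matrix& C, int n, int tile) {
    assert(tile > 0);
    std::fill(C.begin(), C.end(), 0);
    for (int i_t = 0; i_t < n; i_t += tile) {
        const int i_end = std::min(n, i_t + tile);
        for (int k_t = 0; k_t < n; k_t += tile) {
            const int k_end = std::min(n, k_t + tile);
            for (int j_t = 0; j_t < n; j_t += tile) {
                const int j_end = std::min(n, j_t + tile);
                for (int i = i_t; i < i_end; i++) {
                    for (int k = k_t; k < k_end; k++) {
                        for (int j = j_t; j < j_end; j++) {
                            C[i * n + j] += A[i * n + k] * B[k * n + j];
                        }
                    }
                }
            }
        }
    }
}

// Calls run() `repeats` times and returns the fastest wall-clock time in
// milliseconds (infinity if repeats <= 0). Taking the minimum reduces noise
// from cold caches and other processes.
template <typename F>
double best_time_ms(F&& run, int repeats) {
    double best = std::numeric_limits<double>::infinity();
    for (int r = 0; r < repeats; r++) {
        const auto start = std::chrono::steady_clock::now();
        run();
        const auto stop = std::chrono::steady_clock::now();
        const std::chrono::duration<double, std::milli> elapsed = stop - start;
        best = std::min(best, elapsed.count());
    }
    return best;
}
```

A call such as `best_time_ms([&] { matmul_tiled(A, B, C, n, 16); }, 10)` can then be compared with the same call for `matmul_naive`, ideally with the same compiler optimization level for both. Comparing the two result matrices afterwards confirms that both versions compute the same product.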
Thanks for your time~