Cache optimization of a matrix operation

As a precomputation for an integral function, I need to run the following computation on a large 3D matrix.

for (size_t x = 1; x < size().x(); ++x)
for (size_t y = 0; y < size().y(); ++y)
for (size_t z = 0; z < size().z(); ++z)
    field::at(x, y, z) += field::at(x - 1, y, z);

for (size_t x = 0; x < size().x(); ++x)
for (size_t y = 1; y < size().y(); ++y)
for (size_t z = 0; z < size().z(); ++z)
    field::at(x, y, z) += field::at(x, y - 1, z);

for (size_t x = 0; x < size().x(); ++x)
for (size_t y = 0; y < size().y(); ++y)
for (size_t z = 1; z < size().z(); ++z)
    field::at(x, y, z) += field::at(x, y, z - 1);

My field class inherits from std::vector<size_t>, with at overridden as follows:

T& at(size_t x, size_t y, size_t z)
{
    return container::at(x + y * size().x() + z * size().x() * size().y());
}
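
To make the memory layout explicit: with this index formula, x is the fastest-varying axis. Below is a minimal sketch of the strides it implies. The free function linear_index and its parameters sx and sy are hypothetical names for illustration, not part of the class above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper mirroring the index formula used by at() above.
// sx and sy stand in for size().x() and size().y().
std::size_t linear_index(std::size_t x, std::size_t y, std::size_t z,
                         std::size_t sx, std::size_t sy)
{
    assert(x < sx && y < sy);
    return x + y * sx + z * sx * sy;
}

// Distance, in elements, between two neighbours along each axis.
std::size_t stride_x(std::size_t sx, std::size_t sy)
{
    return linear_index(1, 0, 0, sx, sy) - linear_index(0, 0, 0, sx, sy);
}

std::size_t stride_y(std::size_t sx, std::size_t sy)
{
    return linear_index(0, 1, 0, sx, sy) - linear_index(0, 0, 0, sx, sy);
}

std::size_t stride_z(std::size_t sx, std::size_t sy)
{
    return linear_index(0, 0, 1, sx, sy) - linear_index(0, 0, 0, sx, sy);
}
```

For a 512x512x512 field, this gives strides of 1, 512 and 262144 elements for x, y and z. A loop with z innermost therefore jumps 262144 elements between consecutive accesses.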

Here are some execution times on my machine:

  • (128x128x128) ~ 250 ms
  • (256x256x256) ~ 3 sec
  • (512x512x512) ~ 53 sec

That looks very slow to me.

Questions

  • Is allocating a single std::vector of 512x512x512 elements (1 GB) a bad idea? Should I split it into multiple (512) sub-vectors of size 512x512 (2 MB each)?
  • Is there any other way to do the same simple computation that would be more cache efficient? (I'm guessing cache misses are a reason why this is so slow.)
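
One possible direction for the second question, as a sketch only: keep x as the innermost loop in all three passes, so that every pass walks memory contiguously under the x + y * sx + z * sx * sy layout. The free function prefix_sum_3d and its parameters are hypothetical names working on a plain std::vector, not the field class above, and no timings are claimed for it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical free function; data is laid out as x + y*sx + z*sx*sy.
// Computes the same three running sums as the loops in the question.
void prefix_sum_3d(std::vector<std::size_t>& data,
                   std::size_t sx, std::size_t sy, std::size_t sz)
{
    assert(data.size() == sx * sy * sz);
    const std::size_t row = sx;         // stride between consecutive y
    const std::size_t slice = sx * sy;  // stride between consecutive z

    // Pass along x: running sum inside each contiguous row.
    for (std::size_t z = 0; z < sz; ++z)
        for (std::size_t y = 0; y < sy; ++y) {
            std::size_t* p = data.data() + y * row + z * slice;
            for (std::size_t x = 1; x < sx; ++x)
                p[x] += p[x - 1];
        }

    // Pass along y: add the previous row to the current row.
    for (std::size_t z = 0; z < sz; ++z)
        for (std::size_t y = 1; y < sy; ++y) {
            std::size_t* cur = data.data() + y * row + z * slice;
            const std::size_t* prev = cur - row;
            for (std::size_t x = 0; x < sx; ++x)
                cur[x] += prev[x];
        }

    // Pass along z: add the previous slice to the current slice.
    for (std::size_t z = 1; z < sz; ++z) {
        std::size_t* cur = data.data() + z * slice;
        const std::size_t* prev = cur - slice;
        for (std::size_t i = 0; i < slice; ++i)
            cur[i] += prev[i];
    }
}
```

The sketch indexes through raw pointers instead of at(), which also avoids the bounds check that std::vector::at performs on every access. Whether either change accounts for the timings above would need to be measured with optimizations enabled.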