I am building an application based on distributed linear algebra using Trilinos. The main issue is that memory consumption is much higher than expected.
I have built a simple test case that builds an Epetra_VbrMatrix with 15 million doubles grouped as 5 million blocks of 3 doubles, which should take about 115MB.
After building the matrix on 2 processors, with half the data on each, I get a memory consumption of 500MB on each processor, which is about 7.5 times the data. This looks unreasonable to me: the matrix should only need some integer arrays for locating the nonzero blocks.
I asked on the trilinos-users mailing list, and they agree that the memory usage looks too high, but I hope to get some more help here.
I tested both on my laptop with Ubuntu + gcc 4.4.5 + Trilinos 10.0 and on a cluster with the PGI compiler and Trilinos 10.4.0; the result is about the same.
My test code is in a gist at https://gist.github.com/848310, where I also noted the memory consumption at different stages of my testing with 2 MPI processes on my laptop.
If anybody has any suggestions, that would be really helpful. Even just building and running the test and reporting the memory consumption would be great.
Answer by Alan Williams from the trilinos-users list. In short, Epetra_VbrMatrix is not suitable for such small blocks, as the storage overhead is bigger than the data itself: