I want to copy as little as possible. At the moment I'm using num_t* array = new num_t[..] and then copying each value of the multidimensional vector into array in a for-loop.
I'd like to find a better way to do this.
As you stated in the comments, the inner vectors of your vector<vector<T>> structure are all of the same size. So what you are actually trying to do is store an m x n matrix.
Usually such matrices are not stored in multi-dimensional structures but in linear memory. The position (row, column) of a given element is then derived from an indexing scheme, of which row-major and column-major order are the most common.
Since you already state that you will copy this data onto a GPU, that transfer then amounts to copying the linear vector as a whole. You then use the same indexing scheme on both the host and the GPU.
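Here is a minimal sketch of that flattening step, assuming num_t is your element type (the alias below is hypothetical) and all inner vectors have the same length n:

#include <cstddef>
#include <vector>

// Hypothetical element type; substitute your own num_t.
using num_t = float;

// Flatten an m x n vector<vector<num_t>> into one contiguous buffer
// using row-major order: element (row, col) lives at index row * n + col.
std::vector<num_t> flatten(const std::vector<std::vector<num_t>>& matrix)
{
    const std::size_t m = matrix.size();
    const std::size_t n = m ? matrix[0].size() : 0;

    std::vector<num_t> linear;
    linear.reserve(m * n);
    for (const auto& row : matrix)
        linear.insert(linear.end(), row.begin(), row.end());
    return linear;
}

// Row-major index helper, usable both on the host and in GPU code.
inline std::size_t index(std::size_t row, std::size_t col, std::size_t n)
{
    return row * n + col;
}

With the data in a single contiguous vector, the host-to-device copy is one call instead of one per inner vector.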
If you are using CUDA, have a look at Thrust. It provides thrust::host_vector<T> and thrust::device_vector<T> and simplifies copying even further:
thrust::host_vector<int> hostVec(100); // 10 x 10 matrix
thrust::device_vector<int> deviceVec = hostVec; // copies hostVec to GPU
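Building on that, a sketch of uploading the flattened matrix with Thrust (it reuses the hypothetical flatten helper and num_t alias from above; thrust::device_vector can be constructed directly from host iterators, which performs the host-to-device copy):

#include <thrust/device_vector.h>
#include <vector>

void upload(const std::vector<std::vector<num_t>>& matrix)
{
    std::vector<num_t> linear = flatten(matrix);

    // Range construction copies the host data to the GPU in one call.
    thrust::device_vector<num_t> deviceVec(linear.begin(), linear.end());

    // thrust::raw_pointer_cast(deviceVec.data()) yields a raw device pointer
    // you can pass to kernels, alongside m and n for row-major indexing.
}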
For arithmetic types you can use the function std::memcpy. For example, you can copy each inner vector's contiguous storage directly into the flat buffer, as sketched below.
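A minimal sketch, assuming the destination buffer was allocated with new num_t[m * n] as in the question and that num_t is trivially copyable (e.g. float, double, int):

#include <cstddef>
#include <cstring>   // std::memcpy
#include <vector>

// Copy each inner vector into a pre-allocated flat buffer, row by row.
// The rows of a vector<vector<num_t>> are not contiguous with each other,
// so one memcpy per row is needed.
void copy_rows(const std::vector<std::vector<num_t>>& matrix, num_t* array)
{
    const std::size_t n = matrix.empty() ? 0 : matrix[0].size();
    for (std::size_t row = 0; row < matrix.size(); ++row)
        std::memcpy(array + row * n,        // destination row in the flat buffer
                    matrix[row].data(),     // contiguous storage of the inner vector
                    n * sizeof(num_t));     // bytes per row
}

This avoids the element-by-element loop from the question while keeping the same row-major layout expected on the GPU side.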