I am currently using pycuda and scikits.cuda to solve the linear system A*x = b, where A is an upper or lower triangular matrix. However, the cublasStbsv routine requires a specific banded storage format. For example, if the lower triangular matrix is A = [[1, 0, 0], [2, 3, 0], [4, 5, 6]], then the input required by cublasStbsv is [[1, 3, 6], [2, 5, 0], [4, 0, 0]], where the rows are the main diagonal, the first subdiagonal, and the second subdiagonal, respectively. With numpy this is easy to do using stride_tricks.as_strided, but I don't know how to do something similar with pycuda.gpuarray. I found pycuda.compyte.array.as_strided, but it cannot be applied to a gpuarray. Any help would be appreciated, thanks.
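For reference, the numpy version mentioned above could look roughly like the sketch below. It runs on the host only. The helper name `lower_to_band` is made up for illustration, and it assumes a square, C-contiguous lower triangular input. The matrix is padded with n-1 zero rows so that the strided view never reads outside the buffer.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def lower_to_band(a):
    """Pack a square lower triangular matrix into band rows.

    Row k of the result holds the k-th subdiagonal, padded with zeros:
    band[k, j] == a[j + k, j] when j + k < n, else 0.
    (Hypothetical helper, host-side only.)
    """
    a = np.ascontiguousarray(a)
    n = a.shape[0]
    # Pad with n-1 zero rows so every strided read stays inside the buffer.
    padded = np.vstack([a, np.zeros((n - 1, n), dtype=a.dtype)])
    s = padded.itemsize
    # Stepping one row down moves n elements; stepping along a diagonal
    # moves n + 1 elements (one row down and one column right).
    view = as_strided(padded, shape=(n, n), strides=(n * s, (n + 1) * s))
    return view.copy()  # copy so the result owns its memory


A = np.array([[1, 0, 0], [2, 3, 0], [4, 5, 6]], dtype=np.float32)
band = lower_to_band(A)
```

Since cuBLAS reads matrices in column-major order, the result would probably need to be laid out in Fortran order (for example with `np.asfortranarray`) before being handed to the routine.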
How to convert an upper/lower gpuarray to the specific format required by cublasStbsv?
185 Views Asked by Ziqian Xie At
There are 1 best solutions below
I got it done using theano: first convert the gpuarray to a CudaNdarray, change its strides, and then copy the result back to a gpuarray. Just be careful about the change between Fortran and C order.

Update: I finally got it done using gpuarray.multi_take_put.
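The take/put approach comes down to computing two flat index arrays: where each element is read from in A, and where it lands in the band layout. Below is a minimal sketch of that index computation, checked on the CPU with plain numpy indexing. The helper name `band_indices` is made up for illustration, and it assumes a square lower triangular matrix in C order. On the GPU, the same index arrays could presumably be uploaded with `gpuarray.to_gpu` and passed to `gpuarray.multi_take_put` as the source and destination indices, but that call is not exercised here.

```python
import numpy as np


def band_indices(n):
    """Flat (source, destination) indices for packing an n x n lower
    triangular matrix into band rows, both in C order.
    (Hypothetical helper; only the index arithmetic is shown.)
    """
    k, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    valid = (j + k) < n              # entries that lie inside the matrix
    src = ((j + k) * n + j)[valid]   # flat position of A[j + k, j]
    dest = (k * n + j)[valid]        # flat position of band[k, j]
    return src.astype(np.uint32), dest.astype(np.uint32)


A = np.array([[1, 0, 0], [2, 3, 0], [4, 5, 6]], dtype=np.float32)
n = A.shape[0]
src_idx, dest_idx = band_indices(n)

# CPU stand-in for the take/put: read from src_idx, write to dest_idx.
band_flat = np.zeros(n * n, dtype=A.dtype)
band_flat[dest_idx] = A.ravel()[src_idx]
band = band_flat.reshape(n, n)
```

The index arrays depend only on n, so they can be built once and reused for every matrix of the same size.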