I am trying to implement a convolutional neural network and I don't understand why the im2col operation is more efficient. It basically stores the patches of the input that will be multiplied by the filter in separate columns. But why shouldn't loops be used directly to calculate the convolution instead of first performing im2col?
How is using im2col operation in convolutional nets more efficient?
10.1k Views · Asked by Ayush Chaurasia
Well, you are thinking in the right way. In AlexNet, almost 95% of the GPU time and 89% of the CPU time is spent in the convolutional and fully connected layers.
The convolutional and fully connected layers are implemented using GEMM, which stands for GEneral Matrix-to-Matrix Multiplication.
So basically, in the GEMM approach we convert the convolution operation into a matrix multiplication by using a function called im2col(), which arranges the data in such a way that the convolution output can be obtained by a matrix multiplication. Now you may ask: instead of directly doing element-wise convolution, why are we adding a step in between to rearrange the data and then using GEMM?
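As an illustration, here is a minimal NumPy sketch of the idea. The function names and the single-channel, stride-1, no-padding setup are simplifying assumptions for clarity, not how any particular library actually implements it:

```python
import numpy as np

def im2col(x, k):
    """Copy every k x k patch of a 2D input into its own column."""
    out_h, out_w = x.shape[0] - k + 1, x.shape[1] - k + 1
    cols = np.empty((k * k, out_h * out_w))
    for i in range(out_h):
        for j in range(out_w):
            cols[:, i * out_w + j] = x[i:i + k, j:j + k].ravel()
    return cols

def conv2d_gemm(x, w):
    """2D convolution (as cross-correlation) via im2col + one matrix multiply."""
    k = w.shape[0]
    out_h, out_w = x.shape[0] - k + 1, x.shape[1] - k + 1
    # The GEMM step: (1, k*k) @ (k*k, out_h*out_w), then reshape to the output grid.
    return (w.ravel() @ im2col(x, k)).reshape(out_h, out_w)
```

With multiple filters, the flattened filters stack into a (num_filters, k*k) matrix and the whole layer is still a single GEMM call.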
The answer is that scientific programmers have spent decades optimizing code that performs large matrix-to-matrix multiplications, and the benefits of their very regular patterns of memory access outweigh any other losses. There is an optimized CUDA GEMM API in the cuBLAS library, Intel MKL has an optimized CPU GEMM, and clBLAS's GEMM API can be used on devices supporting OpenCL.
Element-wise convolution performs badly because of the irregular memory accesses involved.
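For comparison, here is a sketch of the direct nested-loop version under the same simplifying assumptions (single channel, stride 1, no padding). Every output element reads a scattered 2D window of the input, which is the irregular access pattern in question:

```python
import numpy as np

def conv2d_loops(x, w):
    """Direct element-wise 2D convolution (cross-correlation) with loops."""
    k = w.shape[0]
    out_h, out_w = x.shape[0] - k + 1, x.shape[1] - k + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Each output element gathers a non-contiguous k x k window of x:
            # the rows of the window are W elements apart in memory.
            out[i, j] = (x[i:i + k, j:j + k] * w).sum()
    return out
```

Both versions compute the same numbers; the im2col + GEMM version simply pays a one-time copy so that the multiply-accumulate work runs over contiguous memory.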
im2col(), in turn, arranges the data so that the memory accesses during the matrix multiplication are regular. The im2col() function does add a lot of data redundancy, but the performance benefit of using GEMM outweighs it. This is the reason for using the im2col() operation in neural nets. This link explains how im2col() arranges the data for GEMM: https://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/
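The data redundancy mentioned above is easy to quantify: for a k x k filter at stride 1, each input element gets copied into up to k*k columns, so the im2col matrix is roughly k*k times larger than the input. The sizes below are illustrative, not taken from the question:

```python
# Illustrative sizes: a 32 x 32 single-channel input and a 3 x 3 filter.
H = W = 32
k = 3
out_positions = (H - k + 1) * (W - k + 1)   # 30 * 30 = 900 output locations
input_elems = H * W                          # 1024 input elements
col_elems = k * k * out_positions            # 8100 elements in the im2col matrix
print(col_elems / input_elems)               # roughly an 8x memory blow-up
```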