I have music data containing raw vocals of some Indian classical and devotional songs, and I have segmented each file into 20-second clips. I load each clip with `librosa.load` at a sample rate of 3000. The function returns an `ndarray`, which I convert to a PyTorch float tensor and reshape to `(1, 60001)`. Each audio file is labeled, and I want to classify the clips using an LSTM.

With a `batch_size` of 32, the input has shape `(32, 1, 60001)`. What are the input size and sequence length in this case? If I treat the audio as a single-channel input, are the input size 1 and the sequence length 60001?

Each epoch ran faster when the input size was 60001, and the model ran very slowly when the input size was 1. Which is the correct way to choose the input size and sequence length for the LSTM? And can an LSTM be trained on raw audio for a classification task?
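To make the two interpretations concrete, here is a minimal sketch of how `nn.LSTM` reads each shape when `batch_first=True`, where the expected layout is `(batch, seq_len, input_size)`. The batch of 4, the hidden size of 16, the frame length of 100 samples, and the 5 classes are illustrative assumptions, not values from my setup, and random noise stands in for the loaded audio. The framed variant is one possible middle ground, not a recommendation.

```python
import torch
import torch.nn as nn

# Illustrative values (assumptions): small batch, small hidden size.
batch_size, num_samples, hidden_size = 4, 60001, 16
x = torch.randn(batch_size, 1, num_samples)  # stand-in for the loaded clips

# Interpretation A: (batch, seq_len=1, input_size=60001).
# The LSTM takes a single time step, so there is no recurrence over time.
lstm_a = nn.LSTM(input_size=num_samples, hidden_size=hidden_size, batch_first=True)
out_a, _ = lstm_a(x)  # shape: (batch, 1, hidden_size)

# Interpretation B: (batch, seq_len=60001, input_size=1).
# One sample per time step; 60001 sequential steps is why this is slow.
x_b = x.transpose(1, 2)  # shape: (batch, 60001, 1); forward pass not run here

# Possible middle ground (assumption): cut the waveform into fixed-length
# frames so each time step sees frame_len samples.
frame_len = 100  # hypothetical frame length
usable = (num_samples // frame_len) * frame_len  # drop the trailing sample
frames = x[:, 0, :usable].reshape(batch_size, -1, frame_len)  # (batch, 600, 100)

lstm_c = nn.LSTM(input_size=frame_len, hidden_size=hidden_size, batch_first=True)
out_c, (h_n, _) = lstm_c(frames)

# Classify from the final hidden state of the last layer.
num_classes = 5  # hypothetical number of labels
classifier = nn.Linear(hidden_size, num_classes)
logits = classifier(h_n[-1])  # shape: (batch, num_classes)

print(out_a.shape, x_b.shape, frames.shape, logits.shape)
```

In interpretation A the whole clip is a single feature vector, which explains the fast epochs; in interpretation B the LSTM has to unroll over every sample.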
Input size and sequence length of an LSTM in PyTorch
Asked by Sai Abhishek Bhyri