Feature reduction methods such as PCA reduce the number of features, so it might seem that we could reduce first and then scale only the relevant variables. Is there a rule that we need to do normalization/scaling first and then the feature reduction?
Should we always first perform feature normalization and then the feature reduction?
Asked by Sharat Ainapur (756 views)
I would suggest first normalizing/scaling your feature data and then performing feature selection. This is because most feature selection techniques require a meaningful representation of your data. After normalization your features have the same order of magnitude and spread, which makes it easier to find which of them are more relevant.
For example, PCA uses the variance (the squared standard deviation, SD) of your features to find the relevant axes of a new projection of your data. If you do not normalize your data, features with a high SD carry more weight than features with a small SD, which distorts their relevance when computing the PCA.
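To make the PCA point concrete, here is a minimal sketch using scikit-learn. The data is synthetic and purely illustrative (two independent features whose scales differ by a factor of 1000, which is my assumption, not something from the question), and the variable names are made up for the example. It compares the explained variance ratio of PCA fitted on raw data with PCA fitted after `StandardScaler`.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, illustrative data: two independent features on very different scales.
rng = np.random.default_rng(0)
n_samples = 500
small_scale = rng.normal(0.0, 1.0, size=n_samples)     # SD around 1
large_scale = rng.normal(0.0, 1000.0, size=n_samples)  # SD around 1000
X = np.column_stack([small_scale, large_scale])

# PCA on the raw data: the large-SD feature dominates the first component.
pca_raw = PCA(n_components=2).fit(X)
raw_ratio = pca_raw.explained_variance_ratio_

# Scale first, then PCA: both features contribute on an equal footing.
pipeline = make_pipeline(StandardScaler(), PCA(n_components=2))
pipeline.fit(X)
scaled_ratio = pipeline.named_steps["pca"].explained_variance_ratio_

print("Explained variance ratio without scaling:", raw_ratio)
print("Explained variance ratio with scaling:   ", scaled_ratio)
```

Without scaling, the first component should capture nearly all of the variance simply because one feature has larger units. With scaling, the variance should be split roughly evenly, since the two features are independent. Putting the scaler and the PCA in one pipeline also means the scaling parameters are learned only from the data passed to `fit`, which is convenient when cross-validating.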