My question has three parts:
(1) Can a feedforward neural network handle mixed input features, where some are categorical (discrete-valued, e.g., Low, Med, High) and some are real-valued? There are about 80-90 input feature variables in total, and I wish to solve a (supervised) classification problem.
(2) If the answer to part (1) is yes: I have read about using binary codes (Low = 001, Med = 010, High = 100, etc.) to represent discrete-valued input features in other contexts. Will that work for NNs as well? I am also concerned about scaling/normalization of the whole input feature vector (which I suppose is recommended). How should the whole mixed feature vector be scaled or normalized, or is that not required?
(3) Someone suggested that I use Random Forest (RF). I am not that familiar with RFs. What are the pros and cons of using RF versus NNs in the given context?
Neural Nets Mixed Real-valued and Categorical Input Features
861 Views Asked by H W At
As far as point 2 goes: if you transform each of your categorical inputs into a k-vector (with k = the number of categories that input can take), you are just introducing k new inputs, each of which is either 0 or 1 and so already lies in the range [0, 1]. If your real-valued input features are themselves scaled to that range, you're pretty much okay.
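To make the k-vector idea concrete, here is a minimal sketch in plain Python. The level names, the toy rows, and the helper names `one_hot` and `min_max_scale` are made up for illustration, and min-max scaling is just one reasonable way to bring the real-valued features into [0, 1].

```python
# Hypothetical category levels and toy data, chosen only for illustration.
LEVELS = ["Low", "Med", "High"]


def one_hot(value, levels):
    """Return a k-vector with 1.0 at the position of `value` and 0.0 elsewhere."""
    if value not in levels:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == level else 0.0 for level in levels]


def min_max_scale(column):
    """Linearly rescale a list of real values into [0, 1].

    In practice the min and max would come from the training set only
    and then be reused for validation and test data.
    """
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Each toy row has one real-valued feature and one categorical feature.
rows = [(10.0, "Low"), (20.0, "High"), (15.0, "Med")]

scaled = min_max_scale([real for real, _ in rows])
encoded = [[s] + one_hot(cat, LEVELS) for s, (_, cat) in zip(scaled, rows)]

for vector in encoded:
    print(vector)
```

Each network input vector then has one slot per real-valued feature plus k slots per categorical feature, all within [0, 1].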
Note that if you are using a tanh activation function (whose outputs range from -1 to 1), you should transform your categorical input features accordingly. For example, with k = 3:
0 should become <1, -1, -1>
1 should become <-1, 1, -1>
2 should become <-1, -1, 1>
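The same mapping can be sketched in plain Python. The helper names `one_hot_pm1` and `scale_to_pm1` are hypothetical, and rescaling the real-valued features to [-1, 1] is my reading of "accordingly" rather than something spelled out above.

```python
def one_hot_pm1(index, k):
    """Encode category `index` (0-based) as a k-vector of +1.0 / -1.0."""
    if not 0 <= index < k:
        raise ValueError(f"index {index} out of range for k={k}")
    return [1.0 if i == index else -1.0 for i in range(k)]


def scale_to_pm1(column):
    """Linearly rescale a list of real values into [-1, 1].

    Assumption: real-valued inputs should share the range used for the
    categorical codes; the min and max would normally come from training data.
    """
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to 0.0.
        return [0.0 for _ in column]
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in column]


for category in range(3):
    print(category, one_hot_pm1(category, 3))

print(scale_to_pm1([0.0, 5.0, 10.0]))
```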
Hope I'm clear about that.