Use significant attributes only, or the full set of attributes, to build a J48 model after checking information gain?

Weka's J48 allows one to check the information gain of a full set of attributes. Should I use only the significant attributes to build my model, or should I use the full set of attributes?
Asked by Guanhua Lee
In data mining, there is a multi-way trade-off between the number of features you use, the accuracy you achieve, and the time it takes to generate a model. In theory, you would want to include every possible feature to maximize accuracy; in practice, however, going about data mining this way guarantees lengthy model-generation times. Further, classifiers that produce textual decision trees, such as J48, become much less useful once the tree grows to thousands of nodes.
Depending on how many features you start out with, you may well want to remove those that do not provide a large enough information gain. If you have only a small number of features to begin with (e.g. fewer than 20), it may make sense to keep all of them.
If you do wish to limit the number of features you use, it is best to keep those with the highest information gain. It is also worth looking into techniques such as Principal Component Analysis (PCA), which WEKA supports via its attribute-selection filters, to help select the most informative features.
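To make the ranking step concrete, here is a minimal sketch of computing information gain outside of WEKA. The formula is the same one J48 uses to pick splits: IG(class, feature) = H(class) − H(class | feature). The toy `outlook`/`play` data below is a hypothetical example, not from the question's dataset:

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy H(X) of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(feature_values, labels):
    """IG = H(class) - H(class | feature) for one nominal feature."""
    n = len(labels)
    conditional = 0.0
    for v in set(feature_values):
        subset = [lab for f, lab in zip(feature_values, labels) if f == v]
        conditional += (len(subset) / n) * entropy(subset)
    return entropy(labels) - conditional

# Hypothetical toy data: does "outlook" predict "play"?
outlook = ["sunny", "sunny", "overcast", "rain", "rain"]
play    = ["no",    "no",    "yes",      "yes",  "yes"]
print(round(info_gain(outlook, play), 3))  # -> 0.971 (feature fully separates the classes)
```

Ranking all features by this score and keeping the top few is exactly what WEKA's InfoGainAttributeEval + Ranker attribute-selection combination automates.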