I’m currently using WEKA with a housing dataset. I have a numeric variable “square foot” and a binary yes/no variable “demand”. I’m trying to find out which value, or range, of square footage is most likely to fall into the yes category of demand (so what size of property has the highest demand).
I tried to visualise it in a scatterplot in WEKA, with square footage on the y axis and demand on the x axis, but the axis is only labelled at 3 points, which is not specific enough to be useful.
Is there maybe a regression model that can be used here or a clearer way to visualise the plot? It has to be done in WEKA, otherwise I would just use matplotlib.
[This is not really a programming question...]
A regression model would only work if the class were numeric, but your class is nominal (yes/no).
You could try discretizing your input variable using the weka.filters.supervised.attribute.Discretize filter. This supervised version takes the class attribute into account when generating the bins, so the resulting square-footage ranges are chosen to separate yes from no as well as possible.
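Once the supervised Discretize filter has produced its bins, the remaining step is to compare the share of yes rows in each bin. The sketch below only illustrates that comparison in plain Java with no WEKA dependency. The class name `BinDemand`, the cut points, and the sample rows are all made up for illustration, and the right-closed bin convention is an assumption about how the bin labels should be read, so check it against the labels WEKA actually shows for your data.

```java
import java.util.Arrays;

// Hypothetical helper, not part of WEKA: tabulates the share of "yes"
// rows per square-footage bin, given cut points like those a
// discretization step would produce.
final class BinDemand {

    // Returns the index of the bin a value falls into.
    // Assumption: bins are right-closed, i.e. (lower, upper].
    static int binIndex(double value, double[] cutPoints) {
        int i = 0;
        while (i < cutPoints.length && value > cutPoints[i]) {
            i++;
        }
        return i;
    }

    // Returns the fraction of "yes" rows in each bin (NaN for an empty bin).
    static double[] yesRatePerBin(double[] sqft, boolean[] demand, double[] cutPoints) {
        int bins = cutPoints.length + 1;
        int[] total = new int[bins];
        int[] yes = new int[bins];
        for (int r = 0; r < sqft.length; r++) {
            int b = binIndex(sqft[r], cutPoints);
            total[b]++;
            if (demand[r]) {
                yes[b]++;
            }
        }
        double[] rate = new double[bins];
        for (int b = 0; b < bins; b++) {
            rate[b] = total[b] == 0 ? Double.NaN : (double) yes[b] / total[b];
        }
        return rate;
    }

    public static void main(String[] args) {
        // Made-up rows and cut points, for illustration only.
        double[] sqft = {600, 750, 900, 1100, 1300, 1500, 1800, 2200};
        boolean[] demand = {false, false, true, true, true, false, false, false};
        double[] cuts = {800, 1400};
        System.out.println(Arrays.toString(yesRatePerBin(sqft, demand, cuts)));
    }
}
```

The bin with the highest rate is the square-footage range most associated with demand = yes. In the WEKA Explorer, the same comparison can be read off the class-coloured histogram shown in the Preprocess tab when the discretized attribute is selected.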