I have an assignment, shown below. I have completed the first five tasks but am stuck on the last one: plotting the maps. Please explain how to do it. Thank you in advance.
*(I started learning SVM and ML only a few days ago, so please take that into account.)
**(I think the sequence of steps for plotting should be the same for all kernel types. If you can show it for even one of them, that would be great; I will try to adapt your code for the others.)
The procedure to follow:
Randomly take 100 samples from this map and load them into Python for SVC. This dataset includes Easting, Northing and Rock information.
Randomly split these 100 samples again into train and test datasets.
Try to run the SVC with the kernels of linear, polynomial, radial basis function, and tangent.
Find the best of each, for instance, if you are using a radial basis function, which "C" and "gamma" can be the optimum one based on the accuracy that you get from accuracy scores.
Once you have the fitted models and have calculated the accuracy scores (obtained from the test dataset), feed the whole dataset into the obtained FIT MODELS and predict the output for all 90,000 sample points in reference.csv.
Show me the obtained maps and also the accuracy scores that you get from each FIT MODEL.
The dataset looks like:
90000 points in the same style.
Here is the code:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
### Importing Info
df = pd.read_csv("C:/Users/Admin/Desktop/RA/step 1/reference.csv", header=0)
df_model = df.sample(n = 100)
df_model.shape
## X-y split
X = df_model.loc[:,df_model.columns!="Rock"]
y = df_model["Rock"]
y_initial = df["Rock"]
### for whole dataset
X_wd = df.loc[:, df.columns!="Rock"]
y_wd = df["Rock"]
## Test-train split
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
## Standardizing the Data
from sklearn.preprocessing import StandardScaler
sc = StandardScaler().fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)
## Linear
### Grid Search
from sklearn.model_selection import GridSearchCV
from sklearn import svm
from sklearn.metrics import accuracy_score, confusion_matrix
params_linear = {'C' : (0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10, 50, 100, 500,1000)}
clf_svm_l = svm.SVC(kernel = 'linear')
svm_grid_linear = GridSearchCV(clf_svm_l, params_linear, n_jobs = -1,
                               cv = 3, verbose = 1, scoring = 'accuracy')
svm_grid_linear.fit(X_train_std, y_train)
svm_grid_linear.best_params_
linsvm_clf = svm_grid_linear.best_estimator_
accuracy_score(y_test, linsvm_clf.predict(X_test_std))
### training svm
clf_svm_l = svm.SVC(kernel = 'linear', C = 0.1)
clf_svm_l.fit(X_train_std, y_train)
### predicting model
y_train_pred_linear = clf_svm_l.predict(X_train_std)
y_test_pred_linear = clf_svm_l.predict(X_test_std)
y_test_pred_linear
clf_svm_l.n_support_
### whole dataset
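# the model was fitted on standardized features, so scale the whole dataset with the same scaler
y_pred_linear_wd = clf_svm_l.predict(sc.transform(X_wd))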
### map
## Poly
### grid search for poly
params_poly = {'C' : (0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10, 50, 100, 500, 1000),
               'degree' : (1, 2, 3, 4, 5, 6)}
clf_svm_poly = svm.SVC(kernel = 'poly')
svm_grid_poly = GridSearchCV(clf_svm_poly, params_poly, n_jobs = -1,
                             cv = 3, verbose = 1, scoring = 'accuracy')
svm_grid_poly.fit(X_train_std, y_train)
svm_grid_poly.best_params_
polysvm_clf = svm_grid_poly.best_estimator_
accuracy_score(y_test, polysvm_clf.predict(X_test_std))
### training svm
clf_svm_poly = svm.SVC(kernel = 'poly', C = 50, degree = 2)
clf_svm_poly.fit(X_train_std, y_train)
### predicting model
y_train_pred_poly = clf_svm_poly.predict(X_train_std)
y_test_pred_poly = clf_svm_poly.predict(X_test_std)
clf_svm_poly.n_support_
### whole dataset
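# scale the whole dataset with the scaler fitted on the training data before predicting
y_pred_poly_wd = clf_svm_poly.predict(sc.transform(X_wd))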
### map
## RBF
### grid search rbf
params_rbf = {'C' : (0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10, 50, 100, 500, 1000),
              'gamma' : (0.001, 0.01, 0.1, 0.5, 1)}
clf_svm_r = svm.SVC(kernel = 'rbf')
svm_grid_r = GridSearchCV(clf_svm_r, params_rbf, n_jobs = -1,
                          cv = 10, verbose = 1, scoring = 'accuracy')
svm_grid_r.fit(X_train_std, y_train)
svm_grid_r.best_params_
rsvm_clf = svm_grid_r.best_estimator_
accuracy_score(y_test, rsvm_clf.predict(X_test_std))
### training svm
clf_svm_r = svm.SVC(kernel = 'rbf', C = 500, gamma = 0.5)
clf_svm_r.fit(X_train_std, y_train)
### predicting model
y_train_pred_r = clf_svm_r.predict(X_train_std)
y_test_pred_r = clf_svm_r.predict(X_test_std)
### whole dataset
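# scale the whole dataset with the scaler fitted on the training data before predicting
y_pred_r_wd = clf_svm_r.predict(sc.transform(X_wd))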
### map
## Tangent
### grid search
params_tangent = {'C' : (0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10, 50),
                  'gamma' : (0.001, 0.01, 0.1, 0.5, 1)}
clf_svm_tangent = svm.SVC(kernel = 'sigmoid')
svm_grid_tangent = GridSearchCV(clf_svm_tangent, params_tangent, n_jobs = -1,
                                cv = 10, verbose = 1, scoring = 'accuracy')
svm_grid_tangent.fit(X_train_std, y_train)
svm_grid_tangent.best_params_
tangentsvm_clf = svm_grid_tangent.best_estimator_
accuracy_score(y_test, tangentsvm_clf.predict(X_test_std))
### training svm
clf_svm_tangent = svm.SVC(kernel = 'sigmoid', C = 1, gamma = 0.1)
clf_svm_tangent.fit(X_train_std, y_train)
### predicting model
y_train_pred_tangent = clf_svm_tangent.predict(X_train_std)
y_test_pred_tangent = clf_svm_tangent.predict(X_test_std)
### whole dataset
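# scale the whole dataset with the scaler fitted on the training data before predicting
y_pred_tangent_wd = clf_svm_tangent.predict(sc.transform(X_wd))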
### map
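The four kernel sections above repeat the same steps, so they could be folded into one loop. The sketch below is only one possible arrangement, not part of the assignment: it wraps the scaler and the SVC in a scikit-learn `Pipeline`, so the same standardization is applied automatically whenever the fitted model predicts on the whole dataset. The small grid `df` is a made-up stand-in for reference.csv (the column names Easting, Northing and Rock are taken from the question; the values and the two rock classes are invented), and the parameter grids are shortened for speed.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Made-up stand-in for reference.csv: a 40 x 40 grid with two invented rock classes.
side = 40
east, north = np.meshgrid(np.arange(side), np.arange(side))
df = pd.DataFrame({"Easting": east.ravel(), "Northing": north.ravel()})
df["Rock"] = np.where(df["Easting"] + 0.5 * df["Northing"] > 30, 2, 1)

features = ["Easting", "Northing"]
df_model = df.sample(n=100, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    df_model[features], df_model["Rock"], test_size=0.2, random_state=0)

# Shortened grids (assumption); "svc__" routes each parameter to the SVC step.
C_values = [0.01, 0.1, 1, 10, 100]
gammas = [0.01, 0.1, 1]
param_grids = {
    "linear": {"svc__C": C_values},
    "poly": {"svc__C": C_values, "svc__degree": [2, 3]},
    "rbf": {"svc__C": C_values, "svc__gamma": gammas},
    "sigmoid": {"svc__C": C_values, "svc__gamma": gammas},
}

results = {}
for kernel, grid in param_grids.items():
    pipe = Pipeline([("scale", StandardScaler()), ("svc", SVC(kernel=kernel))])
    search = GridSearchCV(pipe, grid, cv=3, scoring="accuracy").fit(X_train, y_train)
    results[kernel] = {
        "best_params": search.best_params_,
        # accuracy of the refitted best model on the held-out test split
        "test_accuracy": search.score(X_test, y_test),
        # prediction for every point of the map; scaling happens inside the pipeline
        "map_prediction": search.predict(df[features]),
    }

for kernel, res in results.items():
    print(kernel, res["best_params"], round(res["test_accuracy"], 3))
```

Because `GridSearchCV` refits the best parameter combination on the full training split by default, there is no need to copy the best `C`, `degree` or `gamma` by hand into a second `SVC`.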
Here is the answer for plotting the linear-kernel map, for those who encounter the same problem as me. It should be easy to adapt this code to the other kernels.
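A minimal sketch of one way to draw such a map, not necessarily the code of the original answer: plot every point at its (Easting, Northing) position and color it by rock class, with the reference map and the predicted map side by side. The data frame here is a made-up stand-in for reference.csv (invented values and rock classes), `plot_rock_map` is a hypothetical helper, and `C = 0.1` is simply the value used for the linear kernel in the question.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.patches import Patch
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Made-up stand-in for reference.csv (the real file has 90,000 points).
side = 60
east, north = np.meshgrid(np.arange(side), np.arange(side))
df = pd.DataFrame({"Easting": east.ravel(), "Northing": north.ravel()})
df["Rock"] = np.where(df["Easting"] + 0.5 * df["Northing"] > 45, 2, 1)

features = ["Easting", "Northing"]
df_model = df.sample(n=100, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    df_model[features], df_model["Rock"], test_size=0.2, random_state=0)

sc = StandardScaler().fit(X_train)
clf = SVC(kernel="linear", C=0.1).fit(sc.transform(X_train), y_train)

# Predict every point of the map, scaled with the scaler fitted on the training data.
y_pred_wd = clf.predict(sc.transform(df[features]))
acc_wd = accuracy_score(df["Rock"], y_pred_wd)

classes = np.unique(df["Rock"])                  # sorted list of all rock classes
cmap = plt.get_cmap("viridis", len(classes))     # one discrete color per class

def plot_rock_map(ax, data, labels, title):
    """Hypothetical helper: scatter the points, colored by rock class."""
    codes = np.searchsorted(classes, labels)     # class label -> index 0..k-1
    ax.scatter(data["Easting"], data["Northing"], c=codes, cmap=cmap,
               vmin=-0.5, vmax=len(classes) - 0.5, s=4, marker="s")
    ax.set_title(title)
    ax.set_xlabel("Easting")
    ax.set_ylabel("Northing")
    ax.set_aspect("equal")
    ax.legend(handles=[Patch(color=cmap(i), label=str(c))
                       for i, c in enumerate(classes)], title="Rock")

fig, (ax_ref, ax_pred) = plt.subplots(1, 2, figsize=(12, 5))
plot_rock_map(ax_ref, df, df["Rock"], "Reference map")
plot_rock_map(ax_pred, df, y_pred_wd, f"Linear SVC (accuracy = {acc_wd:.3f})")
fig.tight_layout()
fig.savefig("svm_linear_map.png", dpi=150)
```

For another kernel, only the `SVC(...)` line and the panel title need to change. In an interactive session, `plt.show()` can be called instead of (or after) `fig.savefig(...)`.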