I generated a data set from two bivariate Gaussian distributions to simulate a signal and a background. Then I tried to separate the two components using svm.SVC with kernel 'rbf'. Now I'd like to count how many background points lie inside the acceptance region, how many signal points lie outside it, and so on, in order to study the efficiency of the separation method.
I should say that I took parts of this code from an online example, because this is my first time doing this type of analysis. I'm not sure I really understand how the contour line is created, so I don't know how to work with it or how to analyse its position with respect to the points.
Thanks
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm

mu_vec1 = np.array([0,0])
cov_mat1 = np.array([[0.3**2,0.5*0.3*0.3],[0.5*0.3*0.3,0.3**2]])
x1_samples = np.random.multivariate_normal(mu_vec1, cov_mat1, 1000)
#mu_vec1 = mu_vec1.reshape(1,2).T
mu_vec2 = np.array([2,1])
cov_mat2 = np.array([[1,0.4],[0.4,1]])
x2_samples = np.random.multivariate_normal(mu_vec2, cov_mat2, 1000)
#mu_vec2 = mu_vec2.reshape(1,2).T
X = np.concatenate((x1_samples,x2_samples), axis = 0)
Y = np.array([0]*1000 + [1]*1000)
plt.scatter(x1_samples[:,0],x1_samples[:,1], label='Signal')
plt.scatter(x2_samples[:,0],x2_samples[:,1],label='Background')
C = 1.0 # SVM regularization parameter
clf = svm.SVC(kernel = 'rbf', C=C, probability = True )
clf.fit(X, Y)
h = .02 # step size in the mesh
# create a mesh to plot in
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))
print(clf.score(X,Y))
# Plot the decision boundary. For that, we will assign a color to each
# point in the mesh [x_min, x_max]x[y_min, y_max].
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
# Put the result into a color plot
Z = Z.reshape(xx.shape)
plt.contour(xx, yy, Z, cmap=plt.cm.Paired)
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
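On how the contour line comes about: Z holds the predicted class (0 or 1) at every mesh point, so plt.contour simply traces where that prediction flips from one class to the other. Below is a minimal sketch of the same idea, assuming scikit-learn's SVC and its decision_function method. The sample sizes, the fixed seed and the coarser grid are made up here to keep it quick; they are not the values from the code above.

```python
import numpy as np
from sklearn import svm

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

# Two toy Gaussian clouds (smaller than in the question, to keep this fast)
signal = rng.multivariate_normal([0, 0], [[0.09, 0.045], [0.045, 0.09]], 200)
background = rng.multivariate_normal([2, 1], [[1, 0.4], [0.4, 1]], 200)
X = np.concatenate((signal, background), axis=0)
Y = np.array([0] * 200 + [1] * 200)

clf = svm.SVC(kernel="rbf", C=1.0).fit(X, Y)

# Coarse mesh covering the data
xx, yy = np.meshgrid(np.arange(-2, 6, 0.1), np.arange(-3, 5, 0.1))
grid = np.c_[xx.ravel(), yy.ravel()]

Z = clf.predict(grid).reshape(xx.shape)            # class label per mesh point
D = clf.decision_function(grid).reshape(xx.shape)  # signed score per mesh point

# For a two-class SVC the predicted label is 1 where the score is positive,
# so the boundary traced from Z is the curve where D crosses zero.
agreement = np.mean((Z == 1) == (D > 0))
print(agreement)
```

If only the boundary itself is wanted, something like plt.contour(xx, yy, D, levels=[0]) should draw that single curve instead of contouring the 0/1 labels.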
I actually solved it using the 'predict' method, this way:
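A sketch of what such a count could look like, assuming label 0 marks signal and label 1 background as in the code above, and taking the acceptance region to be wherever predict returns 0. The helper name acceptance_counts and its dictionary keys are made up for this illustration:

```python
import numpy as np

def acceptance_counts(clf, signal, background, signal_label=0):
    """Count points of each true class on each side of the decision boundary.

    `clf` is any fitted classifier with a predict() method. The acceptance
    region is taken to be where predict() returns `signal_label`.
    """
    sig_pred = np.asarray(clf.predict(signal))
    bkg_pred = np.asarray(clf.predict(background))
    sig_in = int(np.sum(sig_pred == signal_label))  # signal inside the region
    bkg_in = int(np.sum(bkg_pred == signal_label))  # background inside the region
    return {
        "signal_accepted": sig_in,
        "signal_rejected": len(sig_pred) - sig_in,
        "background_accepted": bkg_in,
        "background_rejected": len(bkg_pred) - bkg_in,
    }
```

With the variables from the question this would be called as acceptance_counts(clf, x1_samples, x2_samples); dividing "signal_accepted" by the number of signal points gives the signal efficiency, and "background_accepted" divided by the number of background points gives the contamination rate. No contour geometry is needed, because predict already says on which side of the boundary each point falls.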