Support Vector Machine Algorithm

On this page we will learn what the Support Vector Machine algorithm is in machine learning, the types of SVM, hyperplanes and support vectors in the SVM algorithm, how SVM works, and the Python implementation of a Support Vector Machine, including fitting the SVM classifier to the training set.


What is Support Vector Machine Algorithm?

The Support Vector Machine, or SVM, is a popular supervised learning algorithm that can be used to solve both classification and regression problems. In machine learning, however, it is mostly used for classification.
The goal of the SVM algorithm is to find the best line or decision boundary that divides n-dimensional space into classes, so that new data points can easily be placed in the correct category in the future. This optimal decision boundary is called a hyperplane.
SVM chooses the extreme points/vectors that help define the hyperplane. These extreme cases are called support vectors, which is why the algorithm is called a Support Vector Machine. Consider the diagram below, which shows two different categories classified by a decision boundary or hyperplane:

[Figure: two categories of data points separated by a decision boundary (hyperplane)]

Example: The example we used for the KNN classifier can help you understand SVM. Suppose we see an unusual cat that also has some dog-like features, and we want a model that can accurately identify whether it is a cat or a dog. We first train the model on a large number of pictures of cats and dogs so that it learns their different features, and then we test it on this unusual creature. Because the support vectors form a decision boundary between the two classes (cat and dog) and pick out the extreme cases, the model will consider the extreme cases of cats and dogs. Based on the support vectors, it will classify the animal as a cat.

[Figure: SVM classifying an unusual case as cat or dog using support vectors]

The SVM algorithm can be used for face detection, image classification, text categorization, and many other tasks.

Types of SVM

SVM can be of two types:

  • Linear SVM: A linear SVM is used for linearly separable data. If a dataset can be divided into two classes by a single straight line, the data is called linearly separable, and the classifier used is a linear SVM.
  • Non-linear SVM: A non-linear SVM is used for non-linearly separable data. If a dataset cannot be divided by a straight line, it is non-linear data, and the classifier used is a non-linear SVM (see the sketch after this list).
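As a quick, hedged sketch (using scikit-learn, the same library as the implementation later on this page), the two types differ only in the kernel argument passed to the classifier:

   #Choosing between a linear and a non-linear SVM in scikit-learn
   from sklearn.svm import SVC
   linear_clf = SVC(kernel='linear')    #Linear SVM: straight-line decision boundary
   nonlinear_clf = SVC(kernel='rbf')    #Non-linear SVM: the RBF kernel bends the boundary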

Hyperplane and Support Vectors in the SVM algorithm:

In n-dimensional space there can be many lines/decision boundaries that separate the classes, but we need to find the best decision boundary for classifying the data points. This best boundary is called the hyperplane of the SVM.
The dimension of the hyperplane depends on the number of features in the dataset: with two features (as in the image), the hyperplane is a straight line; with three features, it is a two-dimensional plane.
We always create the hyperplane with the maximum margin, i.e. the maximum distance between the hyperplane and the nearest data points of either class.

Support Vector:

Support vectors are the data points or vectors that lie closest to the hyperplane and affect its position. Because these vectors support the hyperplane, they are called support vectors.
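In scikit-learn, the support vectors of a fitted classifier can be inspected directly. The following is a minimal sketch with made-up toy data (not this page's dataset):

   #Inspecting the support vectors of a fitted SVC (toy data for illustration)
   import numpy as np
   from sklearn.svm import SVC
   X = np.array([[1, 2], [2, 3], [3, 3], [6, 5], [7, 8], [8, 6]])
   y = np.array([0, 0, 0, 1, 1, 1])
   clf = SVC(kernel='linear').fit(X, y)
   print(clf.support_vectors_)   #the data points closest to the hyperplane
   print(clf.n_support_)         #number of support vectors per class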

How does SVM work?

Linear SVM:

An example can be used to explain how the SVM algorithm works. Assume that we have a dataset with two tags (green and blue) and two features (x1 and x2). We're looking for a classifier that can categorize the pair of coordinates (x1, x2) as green or blue. Consider the following illustration:

[Figure: two classes (green and blue) plotted against the features x1 and x2]

Because this is a two-dimensional space, we can easily separate these two classes by drawing a straight line between them. However, numerous lines can be used to divide these classes. Consider the following illustration:

[Figure: several possible straight lines separating the two classes]

The SVM algorithm therefore helps find the best line or decision boundary; this best boundary is called a hyperplane. SVM finds the points from both classes that lie closest to the boundary. These points are called support vectors. The distance between these vectors and the hyperplane is called the margin, and the goal of SVM is to maximize this margin. The hyperplane with the maximum margin is the optimal hyperplane.
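For a linear SVM the margin can also be computed from the learned weights: the decision boundary is w·x + b = 0 and the margin width is 2/||w||. Continuing the toy example above (an illustration, not this page's dataset):

   #Margin width of a fitted linear SVC: 2 / ||w||
   w = clf.coef_[0]                  #weight vector of the linear decision boundary
   margin = 2 / np.linalg.norm(w)    #distance between the two margin lines
   print(margin)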

[Figure: the optimal hyperplane with maximum margin and its support vectors]

Non-Linear SVM:

We can separate linearly arranged data with a straight line, but for non-linear data we cannot draw a single straight line. Consider the following illustration:

[Figure: non-linearly separable data that no single straight line can divide]

So to separate these data points we need to add one more dimension. For linear data we used the two dimensions x and y, so for non-linear data we add a third dimension, z. It can be calculated as:

z = x² + y²
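To make this concrete, here is a small sketch with synthetic ring-shaped data (not this page's dataset) showing that points which cannot be split by a line in (x, y) become separable by a simple threshold once z = x² + y² is added:

   #Lifting 2-D points into 3-D with z = x^2 + y^2 (synthetic data)
   import numpy as np
   rng = np.random.default_rng(0)
   inner = rng.normal(scale=0.5, size=(50, 2))             #class 0: near the origin
   angles = rng.uniform(0, 2 * np.pi, size=50)
   outer = np.c_[3 * np.cos(angles), 3 * np.sin(angles)]   #class 1: on a ring of radius 3
   X = np.vstack([inner, outer])
   z = (X ** 2).sum(axis=1)   #the new third dimension
   #Inner points have small z while ring points have z = 9, so a plane
   #(a threshold on z) now separates the two classes:
   print(z[:50].max(), z[50:].min())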

The example space will look like this after adding the third dimension:

[Figure: the sample space after adding the third dimension z = x² + y²]

So now, SVM will divide the datasets into classes in the following way. Consider the below image:

[Figure: SVM separating the two classes in 3-D space]

Because we are in 3-D space, the boundary appears as a plane parallel to the x-axis. If we convert it back to 2-D space with z = 1, it becomes:

[Figure: the decision boundary mapped back to 2-D space as a circle]

In the case of non-linear data, we thus obtain a circle of radius 1.

Python Implementation of Support Vector Machine

Now we will implement the SVM algorithm in Python. We will use the same dataset, user_data.csv, that we used for KNN classification and logistic regression.

Data Pre-processing step

The code remains the same up to the data pre-processing step. The code is as follows:

 
   #Data Pre-processing Step  
   # importing libraries  
   import numpy as nm  
   import matplotlib.pyplot as mtp  
   import pandas as pd  

   #importing datasets  
   data_set = pd.read_csv('user_data.csv')  

   #Extracting Independent and dependent Variable  
   x = data_set.iloc[:, [2,3]].values  
   y = data_set.iloc[:, 4].values  

   #Splitting the dataset into training and test set.  
   from sklearn.model_selection import train_test_split  
   x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.25, random_state = 0)  
   #feature Scaling  
   from sklearn.preprocessing import StandardScaler    
   st_x = StandardScaler()    
   x_train = st_x.fit_transform(x_train)    
   x_test = st_x.transform(x_test)       

After executing the above code, the data will be pre-processed. The code will give the dataset as:

[Figure: the pre-processed dataset]

The scaled output for the test set will be:

[Figure: the scaled output for the test set]

Fitting the SVM classifier to the training set:

The SVM classifier will now be fitted to the training set. We'll use the SVC class from the sklearn.svm library to build the SVM classifier. The code for it is as follows:


    #"Support vector classifier" 
    from sklearn.svm import SVC 
    classifier = SVC(kernel='linear', random_state=0)  
    classifier.fit(x_train, y_train)  

In the above code we used kernel='linear' because we are building an SVM for linearly separable data; for non-linear data the kernel can be changed (see the sketch after the output below). We then fitted the classifier to the training data (x_train, y_train).

Output:

                
 
  Out[8]:
     SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
         decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
         kernel='linear', max_iter=-1, probability=False, random_state=0,
         shrinking=True, tol=0.001, verbose=False)
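For example, switching to an RBF kernel for non-linear data only requires changing one argument (a hedged illustration; this page's dataset is handled with the linear kernel above):

   #A non-linear SVM on the same data, for comparison
   nonlinear_classifier = SVC(kernel='rbf', gamma='scale', random_state=0)
   nonlinear_classifier.fit(x_train, y_train)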

The model's performance can be varied by changing the values of C (the regularization parameter), gamma, and the kernel.
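One common way to try several values of these parameters is a grid search with cross-validation; the values below are illustrative, not tuned:

   #Searching over C, gamma and kernel with cross-validation
   from sklearn.model_selection import GridSearchCV
   param_grid = {'C': [0.1, 1, 10],
                 'gamma': ['scale', 0.1, 1],
                 'kernel': ['linear', 'rbf']}
   search = GridSearchCV(SVC(random_state=0), param_grid, cv=5)
   search.fit(x_train, y_train)
   print(search.best_params_, search.best_score_)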

Predicting the test set result:

We'll now predict the output for the test set by creating a new vector, y_pred. The code for it is shown below:

 
   #Predicting the test set result  
   y_pred = classifier.predict(x_test)  

After obtaining the y_pred vector, we can compare y_pred with y_test to see how much the predicted values differ from the actual values.
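Rather than eyeballing the two vectors, the comparison can be summarized in a single number, for example with scikit-learn's accuracy_score:

   #Fraction of test points predicted correctly
   from sklearn.metrics import accuracy_score
   print(accuracy_score(y_test, y_pred))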

Output:

The following is the output for the test set prediction:

[Figure: the predicted test set results (y_pred)]

Creating the confusion matrix: Now we'll check the performance of the SVM classifier, i.e. how many incorrect predictions it makes compared with the logistic regression classifier. To build the confusion matrix we need to import the confusion_matrix function from the sklearn.metrics library. After importing it, we call the function and store the result in a new variable, cm. The function takes two parameters: y_true (the actual values) and y_pred (the values returned by the classifier). The code for it is as follows:

 
   #Creating the Confusion matrix  
   from sklearn.metrics import confusion_matrix  
   cm = confusion_matrix(y_test, y_pred)  

Output:

[Figure: the confusion matrix output]

As seen in the above output image, there are 66 + 24 = 90 correct predictions and 8 + 2 = 10 incorrect predictions. We can therefore say that our SVM model performed better than the logistic regression model.
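For reference, scikit-learn orders a binary confusion matrix as [[TN, FP], [FN, TP]], so the four cells can be unpacked directly:

   #Unpacking the confusion matrix cells
   tn, fp, fn, tp = cm.ravel()
   print(tn, fp, fn, tp)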

Visualizing the training set result:

The result of the training set will now be visualized; the code for this may be seen below:

 
  from matplotlib.colors import ListedColormap  
  x_set, y_set = x_train, y_train  
  x1, x2 = nm.meshgrid(nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step = 0.01),  
  nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01))  
  mtp.contourf(x1, x2, classifier.predict(nm.array([x1.ravel(), x2.ravel()]).T).reshape(x1.shape),  
  alpha = 0.75, cmap = ListedColormap(('red', 'green')))  
  mtp.xlim(x1.min(), x1.max())  
  mtp.ylim(x2.min(), x2.max())  
  for i, j in enumerate(nm.unique(y_set)):  
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],  
      c = ListedColormap(('red', 'green'))(i), label = j)  
  mtp.title('SVM classifier (Training set)')  
  mtp.xlabel('Age')  
  mtp.ylabel('Estimated Salary')  
  mtp.legend()  
  mtp.show()                                      

Output:

By executing the above code, we will get the output as:

[Figure: SVM classifier decision regions on the training set]

As we can see, the above output looks very similar to the logistic regression output. Because we used a linear kernel in the classifier, we obtained a straight line as the hyperplane. We showed earlier that in 2-D space the SVM hyperplane is a straight line.

Visualizing the test set result:

 
  #Visualizing the test set result  
  from matplotlib.colors import ListedColormap  
  x_set, y_set = x_test, y_test  
  x1, x2 = nm.meshgrid(nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step  = 0.01),  
  nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01))  
  mtp.contourf(x1, x2, classifier.predict(nm.array([x1.ravel(), x2.ravel()]).T).reshape(x1.shape),  
  alpha = 0.75, cmap = ListedColormap(('red','green' )))  
  mtp.xlim(x1.min(), x1.max())  
  mtp.ylim(x2.min(), x2.max())  
  for i, j in enumerate(nm.unique(y_set)):  
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],  
      c = ListedColormap(('red', 'green'))(i), label = j)  
  mtp.title('SVM classifier (Test set)')  
  mtp.xlabel('Age')  
  mtp.ylabel('Estimated Salary')  
  mtp.legend()  
  mtp.show()                                                                                       

Output:

By executing the above code, we will get the output as:

[Figure: SVM classifier decision regions on the test set]

As shown in the above output image, the SVM classifier has divided the users into two regions (purchased or not purchased). Users who bought the SUV appear as red scatter points in the red region, and users who did not buy the SUV appear as green scatter points in the green region. The hyperplane has thus separated the two classes, purchased and not purchased.