Support Vector Machine (SVM) is a supervised machine learning algorithm that is widely used in classification, regression, and outlier detection problems. SVM is based on the concept of finding the optimal hyperplane that separates different classes in the feature space.
In detail, the SVM algorithm works by mapping the input data into a high-dimensional feature space using a nonlinear mapping function. It then finds the optimal hyperplane that maximizes the margin between the two classes. The margin is the distance between the hyperplane and the nearest data points from each class. The hyperplane that has the maximum margin is the one that is chosen as the optimal hyperplane. SVM is capable of handling both linear and nonlinear classification problems by using different kernel functions.
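For the simplest case, where two classes can be separated perfectly, the margin idea can be written as an optimization problem. The notation here is an assumption introduced for illustration and does not come from the text above: labels y_i are +1 or -1, and the hyperplane is the set of points x where w·x + b = 0.

```latex
\min_{\mathbf{w},\, b} \ \frac{1}{2}\lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i \left( \mathbf{w}^{\top} \mathbf{x}_i + b \right) \ge 1,
\qquad i = 1, \dots, n
```

Under this scaling, the nearest data points lie at distance 1/‖w‖ from the hyperplane, so making ‖w‖ small is the same as making the margin large.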
How the algorithm works:
 First, the algorithm takes the input data and maps it into a higher-dimensional space. This mapping is done implicitly through a kernel function, which lets the algorithm work in a new space where it is easier to separate the classes using a hyperplane.
 Next, the algorithm finds the hyperplane that maximizes the margin between the two classes. The margin is the distance between the hyperplane and the closest data points from each class.
 The algorithm then predicts the class of new data points by determining which side of the hyperplane they fall on.
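The steps above can be sketched on a tiny example. This is a minimal illustration, assuming a linear kernel and a made-up, linearly separable 2-D dataset (the points, the variable names, and the large C value are all assumptions chosen for the demo, not part of the original example):

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable 2-D data (made up for illustration)
X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
              [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C approximates a hard margin on separable data
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

w = clf.coef_[0]        # normal vector of the separating hyperplane
b = clf.intercept_[0]   # offset of the hyperplane

# The sign of w.x + b tells us which side of the hyperplane a point is on;
# positive scores correspond to clf.classes_[1]
scores = X @ w + b
manual_pred = np.where(scores > 0, clf.classes_[1], clf.classes_[0])

# Distance from the hyperplane to the nearest training points
margin = 1.0 / np.linalg.norm(w)

# Predict unseen points by checking which side they fall on
new_points = np.array([[0.0, 0.0], [6.0, 6.0]])
new_pred = clf.predict(new_points)

print("Support vectors:\n", clf.support_vectors_)
print("Margin:", margin)
print("Predictions for new points:", new_pred)
```

Here manual_pred reproduces what svc.predict does internally for the linear, two-class case. With a nonlinear kernel there is no explicit w, so coef_ is not available and the side of the hyperplane is obtained from decision_function instead.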
Advantages of SVM include:

 SVM can handle both linear and nonlinear classification problems by using different kernel functions such as linear, polynomial, radial basis function (RBF), and sigmoid.
 SVM can handle high-dimensional data and can perform well even when the number of features is greater than the number of samples.
 SVM has a regularization parameter that helps to avoid overfitting and improve the generalization performance of the model.
 SVM can handle both binary and multiclass classification problems by using different strategies such as one-vs-one and one-vs-all (also called one-vs-rest).
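To make the kernel and multiclass points concrete, here is a rough sketch that tries the four kernels listed above and compares the two multiclass strategies. The digits dataset, the 5-fold cross-validation, and the variable names are assumptions picked for illustration; scores will differ on other data.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# 10-class handwritten digits dataset (chosen here only as a demo)
X, y = load_digits(return_X_y=True)

# Compare kernels with 5-fold cross-validated accuracy
kernel_scores = {}
for kernel in ["linear", "poly", "rbf", "sigmoid"]:
    scores = cross_val_score(SVC(kernel=kernel, C=1.0), X, y, cv=5)
    kernel_scores[kernel] = scores.mean()
    print("{}: {:.3f}".format(kernel, kernel_scores[kernel]))

# SVC trains one binary classifier per pair of classes (one-vs-one)
ovo = SVC(kernel="rbf", C=1.0, decision_function_shape="ovo").fit(X, y)
n_ovo = ovo.decision_function(X[:1]).shape[1]

# OneVsRestClassifier trains one binary classifier per class
ovr = OneVsRestClassifier(SVC(kernel="rbf", C=1.0)).fit(X, y)
n_ovr = len(ovr.estimators_)

print("one-vs-one classifiers:", n_ovo)
print("one-vs-rest classifiers:", n_ovr)
```

With 10 classes, one-vs-one needs 10 × 9 / 2 = 45 pairwise classifiers, while one-vs-rest needs only 10, one per class.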
Disadvantages of SVM include:

 SVM can be sensitive to the choice of kernel function and its parameters. Choosing the right kernel function and its parameters can be a challenging task.
 SVM can be computationally expensive, especially for datasets with a large number of samples, because the training time of kernel SVMs grows faster than linearly with the number of samples.
 SVM can be sensitive to outliers in the data and may result in a suboptimal solution.
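One common way to deal with the kernel and parameter sensitivity listed above is a cross-validated grid search. The sketch below is only an illustration: the parameter grid is an arbitrary assumption, and the StandardScaler step is an addition not discussed in this article (SVMs are often sensitive to feature scale, so scaling is frequently applied first).

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Scale features, then fit an SVM (scaling is an assumption added here)
pipe = make_pipeline(StandardScaler(), SVC())

# Candidate values are arbitrary and only for illustration
param_grid = {
    "svc__kernel": ["linear", "rbf"],
    "svc__C": [0.1, 1.0, 10.0],
    "svc__gamma": ["scale", 0.01, 0.1],
}

# 5-fold cross-validation on the training set for every combination
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)

print("Best parameters:", best_params)
print("Test accuracy: {:.2f}".format(test_accuracy))
```

The search only looks at the training data, so the test set still gives an honest estimate of how the chosen settings generalize. Note that gamma has no effect for the linear kernel, so some grid combinations are redundant.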
An example of building a simple SVM model using Python’s scikit-learn library:
1. First, let’s load the dataset and split it into training and testing sets:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
# Load data
cancer = load_breast_cancer()
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(cancer.data, cancer.target, test_size=0.3, random_state=42)
2. Next, let’s create an SVM model with a radial basis function (RBF) kernel and a regularization parameter of 1.0:
from sklearn.svm import SVC
# Create SVM model
svc = SVC(kernel='rbf', C=1.0)
3. We can train the model on the training data using the fit method:
# Train SVM model on training data
svc.fit(X_train, y_train)
4. We can then use the model to make predictions on the testing data using the predict method:
# Make predictions on testing data
y_pred = svc.predict(X_test)
5. Finally, we can evaluate the performance of the model using metrics such as accuracy, precision, recall, and F1 score:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
# Calculate evaluation metrics
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
# Print evaluation metrics
print('Accuracy: {:.2f}'.format(accuracy))
print('Precision: {:.2f}'.format(precision))
print('Recall: {:.2f}'.format(recall))
print('F1 score: {:.2f}'.format(f1))
This will output the evaluation metrics for the SVM model on the testing data. Because random_state=42 fixes the split, the values should be the same on every run; changing or removing random_state will produce a different split and slightly different values.
In this example, we first load the breast cancer dataset from scikit-learn’s built-in datasets. We split the data into training and testing sets using the train_test_split function. We create an SVM model with an RBF kernel and a regularization parameter of 1.0. We train the SVM model on the training data using the fit method. We then use the trained model to predict the classes of the testing data using the predict method. Finally, we calculate the accuracy, precision, recall, and F1 score of the model on the testing data and print the results.
Reference research paper: Hearst, M. A., Dumais, S. T., Osuna, E., Platt, J., & Scholkopf, B. (1998). Support vector machines. IEEE Intelligent Systems and their Applications, 13(4), 18-28.
That’s it!
You can also read the Random Forests method here: https://airesearchstudies.com/whatisrandomforests/
Happy reading!!!