Hyperparameters in SVM
Hyperparameters in SVM are parameters that are set prior to training the model and cannot be learned from the data during training. These parameters can significantly affect the performance of the model, and therefore, they need to be carefully selected and tuned.
In SVM, there are several hyperparameters that we can tune to optimize the performance of the model. Some of the most important hyperparameters are:
Kernel function: The kernel function determines how the input data is transformed into a higher-dimensional space. The choice of kernel function can have a significant impact on the performance of the model. Some common kernel functions are linear, polynomial, radial basis function (RBF), and sigmoid.
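As a quick illustration, here is a minimal sketch comparing kernel choices on a small built-in dataset (it assumes scikit-learn is installed; the dataset and the fixed C value are only illustrative):

```python
# Compare SVM kernels on the iris dataset, holding everything else fixed.
from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score

X, y = datasets.load_iris(return_X_y=True)

scores = {}
for kernel in ['linear', 'poly', 'rbf', 'sigmoid']:
    clf = svm.SVC(kernel=kernel, C=1.0)
    # 5-fold cross-validated accuracy; only the kernel changes.
    scores[kernel] = cross_val_score(clf, X, y, cv=5).mean()
    print(f'{kernel}: mean CV accuracy = {scores[kernel]:.3f}')
```

On this dataset the linear and RBF kernels perform well, while the sigmoid kernel typically does not, which shows why the kernel should be treated as a hyperparameter rather than fixed in advance.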
C parameter: The C parameter controls the trade-off between maximizing the margin and minimizing the classification error. A larger value of C penalizes misclassifications more heavily, producing a narrower margin that fits the training data more closely (and may overfit), while a smaller value of C produces a wider margin that tolerates more training errors (and may underfit).
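The effect of C can be seen directly in how many training points end up as support vectors. This is a hedged sketch (scikit-learn assumed; the dataset and C values are illustrative): a small C tolerates many margin violations, so many points become support vectors.

```python
# Observe how C affects the number of support vectors.
from sklearn import datasets, svm

X, y = datasets.load_iris(return_X_y=True)

n_sv = {}
for C in [0.01, 1, 100]:
    clf = svm.SVC(kernel='rbf', C=C).fit(X, y)
    # n_support_ holds the support-vector count per class.
    n_sv[C] = int(clf.n_support_.sum())
    print(f'C={C}: support vectors = {n_sv[C]}, '
          f'training accuracy = {clf.score(X, y):.3f}')
```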
Gamma parameter: The gamma parameter (used with the RBF, polynomial, and sigmoid kernels) controls how far the influence of a single training example reaches, and thereby the shape of the decision boundary. A larger value of gamma makes each example's influence more local, producing a more complex decision boundary that can overfit, while a smaller value of gamma produces a smoother decision boundary that can underfit.
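The overfitting behaviour of a large gamma shows up as a gap between training and test accuracy. Here is a minimal sketch (scikit-learn assumed; the gamma values are deliberately extreme to make the effect visible):

```python
# An extreme gamma memorizes the training set but generalizes poorly.
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

train_acc, test_acc = {}, {}
for gamma in [0.001, 1, 1000]:
    clf = svm.SVC(kernel='rbf', gamma=gamma, C=1.0).fit(X_train, y_train)
    train_acc[gamma] = clf.score(X_train, y_train)
    test_acc[gamma] = clf.score(X_test, y_test)
    print(f'gamma={gamma}: train={train_acc[gamma]:.3f}, '
          f'test={test_acc[gamma]:.3f}')
```

With gamma=1000, each training point's influence is so local that the model essentially memorizes the training set while failing on unseen points.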
Epsilon parameter: The epsilon parameter (used in SVM regression, SVR) defines a tube around the regression function within which deviations are ignored. A larger value of epsilon tolerates larger errors without penalty, producing a simpler model with fewer support vectors, while a smaller value of epsilon forces the function to fit the data more closely, at the cost of a more complex model.
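Because points inside the epsilon-tube contribute no loss, widening the tube reduces the number of support vectors. A minimal sketch, assuming scikit-learn (the dataset and epsilon values are illustrative):

```python
# A wider epsilon-tube leaves more points loss-free -> fewer support vectors.
from sklearn import datasets, svm

X, y = datasets.load_diabetes(return_X_y=True)

sv_count = {}
for epsilon in [0.1, 10, 50]:
    reg = svm.SVR(kernel='linear', epsilon=epsilon).fit(X, y)
    # support_ holds the indices of the support vectors.
    sv_count[epsilon] = len(reg.support_)
    print(f'epsilon={epsilon}: support vectors = {sv_count[epsilon]}')
```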
To tune the hyperparameters in SVM, we can use a technique called grid search: we select a range of values for each hyperparameter, train and evaluate the model on every possible combination, and keep the combination that yields the best performance metric (such as accuracy for classification or mean squared error for regression).
Here's an example code to perform grid search in Python using scikit-learn:
# Import required libraries
from sklearn import svm, datasets
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split
# Load the diabetes dataset
diabetes = datasets.load_diabetes()
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(diabetes.data, diabetes.target, test_size=0.2, random_state=0)
# Create an SVM regression model
model = svm.SVR()
# Define the hyperparameter grid
param_grid = {'kernel': ['linear', 'poly', 'rbf', 'sigmoid'],
              'C': [0.1, 1, 10, 100],
              'gamma': ['scale', 'auto'],
              'epsilon': [0.1, 0.01, 0.001]}
# Perform grid search using 5-fold cross-validation
grid_search = GridSearchCV(model, param_grid, cv=5)
# Fit the grid search to the training data
grid_search.fit(X_train, y_train)
# Print the best hyperparameters and mean squared error on the testing data
print('Best hyperparameters:', grid_search.best_params_)
y_pred = grid_search.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print('Mean squared error: %.2f' % mse)
In this code, we first load the diabetes dataset and split the data into training and testing sets. We then create an SVM regression model using the SVR class and define a grid of hyperparameters in the param_grid dictionary, which contains a range of values for each hyperparameter. GridSearchCV evaluates every combination with 5-fold cross-validation, refits the model with the best hyperparameters on the full training set, and we finally evaluate that model on the held-out test set using mean squared error.