Classification Algorithm in Machine Learning

On this page we cover classification algorithms in machine learning: what a classification algorithm is, learners in classification problems, types of ML classification algorithms, evaluating a classification model (including log loss, or cross-entropy loss), and use cases of classification algorithms.

Regression and classification are the two main types of supervised machine learning algorithms. Regression techniques predict continuous values, whereas classification methods are needed to predict categorical values.

What is the Classification Algorithm in Machine Learning?

A classification algorithm is a supervised learning technique that uses training data to determine the category of new observations. In classification, a program learns from a dataset of observations and then assigns new observations to one of several classes or groups, such as Yes or No, 0 or 1, Spam or Not Spam, cat or dog, and so forth. Classes are also called targets, labels, or categories.

Unlike regression, classification outputs a category rather than a numeric value, such as "Green or Blue" or "fruit or animal". Because classification is a supervised learning technique, it requires labeled data: each input comes with its corresponding output.
In a classification algorithm, the input variable (x) is mapped to a discrete output (y) by a function f:

y = f(x), where y = categorical output

An email spam detector is a classic example of a machine learning classification method.
The fundamental purpose of a classification algorithm is to determine which category a given data point belongs to, so these algorithms are typically used to predict the output for categorical data.
The diagram below illustrates the idea. It contains two classes, class A and class B. Points within the same class have similar features, and they differ from the points in the other class.

[Figure: classification of data points into class A and class B]

A classifier is the algorithm that performs the classification on a dataset. There are two types of classifiers: binary classifiers and multi-class classifiers.

  • Binary Classifier: Used when a classification task has only two possible outcomes.
    Examples: YES or NO, MALE or FEMALE, SPAM or NOT SPAM, CAT or DOG.
  • Multi-class Classifier: Used when a classification task has more than two possible outcomes.
    Examples: classifying different types of crops or different types of music.
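
As a rough illustration of the difference, the sketch below shows one function with exactly two possible outputs and one with more than two. The function names, the 0.5 threshold, and the scores are hypothetical and made up for this example; a real classifier would learn its scores from training data.

```python
# Binary classifier: exactly two possible outputs.
def classify_email(spam_score, threshold=0.5):
    """Map a hypothetical spam score in [0, 1] to one of two labels."""
    return "SPAM" if spam_score >= threshold else "NOT SPAM"


# Multi-class classifier: more than two possible outputs.
def classify_genre(scores):
    """Return the label with the highest score from a dict of per-class scores."""
    return max(scores, key=scores.get)


binary_label = classify_email(0.83)
genre_label = classify_genre({"rock": 0.2, "jazz": 0.7, "classical": 0.1})
print(binary_label)
print(genre_label)
```

In both cases the output is a discrete label rather than a continuous value; only the number of possible labels differs.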

Learners in Classification Problems:

There are two types of learners in classification problems:

  • Lazy Learners: A lazy learner first stores the training dataset and then waits until it receives the test dataset. Classification is based on the most closely related data stored in the training dataset. Training takes less time, but prediction takes longer.
    Examples: the K-NN algorithm and case-based reasoning.
  • Eager Learners: An eager learner builds a classification model from the training dataset before it receives a test dataset. In contrast to lazy learners, eager learners spend more time on learning and less time on prediction. Examples: Decision Trees, Naïve Bayes, and ANN.
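
The contrast can be sketched in a few lines of Python. The lazy learner below is a small K-NN implementation. For the eager learner, a nearest-centroid classifier is used here only because it is short; it is not one of the examples listed above. The class names and the toy data are made up for illustration.

```python
import math
from collections import Counter, defaultdict


class LazyKNN:
    """Lazy learner: fit() only stores the data; the work happens in predict()."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)  # no model is built here
        return self

    def predict(self, x):
        # Distances to every stored training point are computed at prediction time.
        nearest = sorted(zip(self.X, self.y), key=lambda pair: math.dist(pair[0], x))
        votes = Counter(label for _, label in nearest[: self.k])
        return votes.most_common(1)[0][0]


class EagerCentroid:
    """Eager learner: fit() builds a compact model (one centroid per class)."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for point, label in zip(X, y):
            groups[label].append(point)
        # Per-class mean of each feature.
        self.centroids = {
            label: tuple(sum(col) / len(col) for col in zip(*points))
            for label, points in groups.items()
        }
        return self

    def predict(self, x):
        # Only the centroids are consulted; the training data is not needed anymore.
        return min(self.centroids, key=lambda label: math.dist(self.centroids[label], x))


# Toy data (made up): two well-separated clusters.
X_train = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]
y_train = ["A", "A", "A", "B", "B", "B"]

lazy = LazyKNN(k=3).fit(X_train, y_train)
eager = EagerCentroid().fit(X_train, y_train)
print(lazy.predict((2.0, 2.0)), eager.predict((2.0, 2.0)))
print(lazy.predict((8.0, 9.0)), eager.predict((8.0, 9.0)))
```

Note where the cost sits: LazyKNN.fit is trivial while its predict scans all training points, whereas EagerCentroid does its computation in fit and predicts by comparing against one point per class.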

Types of ML Classification Algorithms:

Classification algorithms can be broadly divided into two categories:

  • Linear Models
    1. Logistic Regression
    2. Support Vector Machines
  • Non-linear Models
    1. K-Nearest Neighbours
    2. Kernel SVM
    3. Naïve Bayes
    4. Decision Tree Classification
    5. Random Forest Classification

Note: We will learn the above algorithms in later chapters.

Evaluating a Classification model:

Once a model has been built, it is vital to evaluate its performance, whether it is a classification or a regression model. The following methods can be used to evaluate a classification model:

1. Log Loss or Cross-Entropy Loss:

  • It measures the performance of a classifier whose output is a probability value between 0 and 1.
  • For a good binary classification model, the log loss should be close to 0.
  • Log loss increases as the predicted probability diverges from the actual label.
  • The lower the log loss, the better the model.
  • For binary classification, the cross-entropy of a single prediction can be calculated as follows:
    -(y log(p) + (1 - y) log(1 - p))
    where y = actual output (0 or 1) and p = predicted probability of class 1.
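
A minimal sketch of this calculation, assuming the per-sample loss -(y log(p) + (1 - y) log(1 - p)) is averaged over all samples. The function name, the clipping constant eps, and the sample values are illustrative choices, not part of the original text.

```python
import math


def binary_log_loss(y_true, p_pred, eps=1e-15):
    """Mean of -(y*log(p) + (1-y)*log(1-p)) over all samples."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip so that log(0) is never evaluated
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


y_true = [1, 0, 1, 0]
# Predictions that lean toward the correct label give a low loss...
loss_good = binary_log_loss(y_true, [0.9, 0.1, 0.8, 0.2])
# ...while predictions that lean toward the wrong label give a high loss.
loss_bad = binary_log_loss(y_true, [0.1, 0.9, 0.2, 0.8])
print(loss_good, loss_bad)
```

The first loss is much smaller than the second, which matches the rule that a lower log loss indicates a better model.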

2. Confusion Matrix:

  • The confusion matrix is a table that describes the performance of the model.
  • It is also known as the error matrix.
  • The matrix summarizes the prediction results, giving the total number of correct and incorrect predictions. The matrix is shown in the table below:
                      Actual Positive    Actual Negative
Predicted Positive    True Positive      False Positive
Predicted Negative    False Negative     True Negative
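
The four cells of the matrix can be counted directly from the actual and predicted labels. The sketch below is one way to do it for a binary problem; the function name and the sample labels are made up, and the accuracy line is an added illustration of how the counts of correct and incorrect predictions can be summarized.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, and TN for a binary classification problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1  # predicted positive, actually positive
        elif predicted == positive:
            fp += 1  # predicted positive, actually negative
        elif actual == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
# Correct predictions are TP + TN; everything else is an error.
accuracy = (counts["TP"] + counts["TN"]) / len(y_true)
print(counts, accuracy)
```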

3. AUC-ROC curve:

  • AUC stands for Area Under the Curve, and ROC stands for Receiver Operating Characteristic.
  • The ROC curve is a graph that shows the performance of a classification model at various thresholds.
  • The AUC-ROC curve is used to visualize the performance of a binary classification model; it can be extended to multi-class models by treating each class against the rest.
  • TPR (True Positive Rate) is plotted on the Y-axis, while FPR (False Positive Rate) is plotted on the X-axis.
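
The points of the curve can be computed by sweeping the decision threshold over the predicted scores and recording the FPR and TPR at each step. The sketch below does this for a binary problem and estimates the area with the trapezoidal rule. The function names and the sample scores are illustrative, and the code assumes both classes appear in y_true.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs obtained by sweeping the decision threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives  # assumes both classes are present
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve, estimated with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
curve = roc_points(y_true, scores)
print(curve)
print(auc(curve))
```

An AUC of 1.0 would mean the scores separate the two classes perfectly, while 0.5 corresponds to random guessing.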

Use cases of Classification Algorithms

Algorithms for classification can be employed in a variety of situations. Here are a few examples of how Classification Algorithms are used:

  • Email Spam Detection
  • Speech Recognition
  • Identification of cancerous tumor cells
  • Drug Classification
  • Biometric Identification, etc.