Unsupervised Machine Learning
On this page we will learn what we learnt from supervised machine learning, what unsupervised learning is, why we use it, how it works, the types of unsupervised learning algorithms, and the advantages and disadvantages of unsupervised learning.
What We learnt from Supervised Machine Learning?
We learnt supervised machine learning in the previous topic, which involves training models on labeled data, so that the known outputs supervise the learning. However, there may be times when we don't have labeled data and need to uncover hidden patterns in a dataset. Unsupervised learning approaches are required to solve such problems in machine learning.
What is Unsupervised Learning?
Unsupervised learning is a machine learning technique in which
models are not supervised using a labeled training dataset, as
the name suggests. Instead, the models themselves find hidden
patterns and insights in the given data. It is loosely
comparable to the way humans learn new things without being
told the answers. It can be summed up as follows:
“Unsupervised learning is a type of machine learning in
which models are trained using an unlabeled dataset and are
allowed to act on that data without any supervision.”
Unlike supervised learning, we have the input data but no
corresponding output data, so unsupervised learning cannot be
directly applied to a regression or classification task.
Instead, unsupervised learning aims to uncover a dataset's
underlying structure, group data based on similarities, and
represent the dataset in a compact form.
Consider the following scenario: the unsupervised learning
system is given an input dataset containing photographs of
various cats and dogs. The algorithm is never given labels for
this dataset, so it has no prior knowledge of which images
show cats and which show dogs. Its job is to find the image
features on its own. It does this by clustering the image
dataset into groups based on the similarities between images.
Why use Unsupervised Learning?
The following are some of the most important arguments for the
relevance of unsupervised learning:
- Unsupervised learning is beneficial for extracting relevant information from data.
- Unsupervised learning is often compared to how a human learns from their own experiences, which is why it is sometimes described as closer to human-like AI.
- Unsupervised learning works with unlabeled and uncategorized data, which makes it valuable because much real-world data comes without labels.
- In the real world, we don't always have output data that corresponds to the input data, so we need unsupervised learning to handle these problems.
Working of Unsupervised Learning
The working of unsupervised learning can be understood through the following steps:
Here, the input data is unlabeled, which means it has not been
categorized and no corresponding outputs are provided. This
unlabeled input data is fed to the machine learning model in
order to train it. The model first analyzes the raw data to
uncover hidden patterns, using a suitable algorithm such as
k-means clustering, hierarchical clustering, and so on.
After applying the appropriate method, the algorithm splits
the data objects into groups based on their similarities and
differences.
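The workflow described above can be illustrated with a minimal k-means sketch in plain Python. The function name `kmeans`, the sample points, and the fixed random seed are assumptions made for this example; a real project would more likely use a library implementation.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Group 2-D points into k clusters with a basic k-means loop."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct input points
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append((sum(x for x, _ in cluster) / len(cluster),
                                      sum(y for _, y in cluster) / len(cluster)))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # assignments are stable, so the loop has converged
        centroids = new_centroids
    return centroids, clusters

# Unlabeled input: no point carries a category or a target value.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (8.0, 8.1), (8.2, 7.9), (7.9, 8.3)]
centroids, clusters = kmeans(points, k=2)
for cluster in clusters:
    print(sorted(cluster))
```

Note that the number of groups `k` is chosen by the user; the algorithm only decides which points belong together.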
Types of Unsupervised Learning Algorithms
Unsupervised learning algorithms can be further categorized by
the two types of problems they solve:
Clustering: Clustering is a way of organizing objects
into clusters so that those with the most similarities stay in
one group while those with few or no similarities fall into
another. Cluster analysis identifies commonalities among data
objects and groups them according to the presence or
absence of such commonalities.
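As a rough illustration of grouping by similarity, the sketch below merges the closest groups step by step, which is the basic idea behind hierarchical (agglomerative) clustering. The function name `single_linkage` and the height values are assumptions made for this example.

```python
def single_linkage(values, num_clusters):
    """Merge the two closest clusters repeatedly until num_clusters remain."""
    clusters = [[v] for v in values]  # every item starts as its own cluster
    while len(clusters) > num_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into cluster i
        del clusters[j]
    return clusters

heights_cm = [150, 152, 149, 181, 183, 179, 165]
groups = single_linkage(heights_cm, num_clusters=3)
print(sorted(sorted(g) for g in groups))
```

Here similarity is simply the absolute difference between two numbers; other data would need a distance measure suited to it.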
Association: Association rule learning is an unsupervised
learning method that is used to discover relationships between
variables in a large database. It identifies sets of items
that occur together in the dataset. Association rules can make
marketing strategies more effective: for example, people who
buy item X (say, bread) also tend to buy item Y (butter or
jam).
Note: We will learn these algorithms in later chapters.
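To make the bread and butter example concrete, the sketch below computes two common measures for an association rule: support (how often the items appear together) and confidence (how often Y appears when X does). The transactions and the helper names `count`, `support`, and `confidence` are made up for illustration.

```python
# Each transaction is the set of items in one (hypothetical) shopping basket.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"milk", "eggs"},
    {"bread", "butter", "milk"},
]

def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def support(itemset):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset) / len(transactions)

def confidence(antecedent, consequent):
    """Share of transactions with the antecedent that also hold the consequent."""
    return count(antecedent | consequent) / count(antecedent)

rule_support = support({"bread", "butter"})
rule_confidence = confidence({"bread"}, {"butter"})
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

In this toy data, bread and butter appear together in 3 of 5 baskets, and 3 of the 4 baskets containing bread also contain butter.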
Unsupervised Learning Algorithms
Some popular unsupervised learning algorithms are listed
below:
- K-means clustering
- Nearest-neighbor search (note that KNN, the k-nearest neighbors classifier, is itself a supervised method)
- Hierarchical clustering
- Anomaly detection
- Neural networks (for example, autoencoders)
- Principal Component Analysis
- Independent Component Analysis
- Apriori algorithm
- Singular value decomposition
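As one small illustration from the list above, anomaly detection can be sketched with a simple z-score rule: values that lie far from the mean, measured in standard deviations, are flagged. The function name `find_anomalies`, the threshold of 2.0, and the readings are assumptions made for this example, and this rule is only one of many approaches.

```python
from statistics import mean, pstdev

def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Unlabeled readings: nothing marks which value is unusual.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0, 10.1]
outliers = find_anomalies(readings)
print(outliers)
```

No labels are needed here: the rule relies only on how the data itself is distributed.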
Advantages of Unsupervised Learning
- Unsupervised learning can be applied to more open-ended problems than supervised learning, because it does not require labeled input data.
- Unsupervised learning is often preferred when labels are scarce, because unlabeled data is easier to obtain than labeled data.
Disadvantages of Unsupervised Learning
- Because there is no corresponding output to learn from, unsupervised learning is inherently more challenging than supervised learning.
- Because the input data is not labeled and the algorithms do not know the exact output in advance, the result of an unsupervised learning algorithm may be less accurate and is harder to evaluate.