Unsupervised Machine Learning

On this page we will learn what we covered in supervised machine learning, what unsupervised learning is, why we use it, how it works, the types of unsupervised learning algorithms, and the advantages and disadvantages of unsupervised learning.


What We Learnt from Supervised Machine Learning

In the previous topic we learnt supervised machine learning, in which models are trained on labeled data and the known outputs supervise the learning. However, there are times when we have no labeled data and still need to uncover hidden patterns in a dataset. Unsupervised learning approaches are used to solve such problems.

What is Unsupervised Learning?

As the name suggests, unsupervised learning is a machine learning technique in which models are not supervised by labeled training data. Instead, the models use the data itself to uncover hidden patterns and insights. This is comparable to how the human brain learns new things without being told the answers. It can be summed up as follows:

“Unsupervised learning is a type of machine learning in which models are trained using an unlabeled dataset and are allowed to act on that data without any supervision.”

Unlike supervised learning, we have the input data but no corresponding output data, so unsupervised learning cannot be applied directly to a regression or classification task. Its aim is to uncover a dataset's underlying structure, group data based on similarities, and represent the dataset in a compact form.

Consider the following scenario: an unsupervised learning algorithm is given an input dataset containing photographs of various cats and dogs. The algorithm is never given labels for this dataset, so it has no knowledge of what the images represent. Its job is to find the image features on its own. It does this by clustering the image dataset into groups based on the similarities between images.


Why use Unsupervised Learning?

The following are some of the most important arguments for the relevance of unsupervised learning:

  • Unsupervised learning is beneficial for extracting relevant information from data.
  • Unsupervised learning is analogous to how a human learns from their own experiences, which is often cited as bringing it closer to true AI.
  • Unsupervised learning works with unlabeled and uncategorized data, which makes it applicable to the large amount of data that has never been labeled.
  • In the real world, we don't always have input data with corresponding output data, so we need unsupervised learning to handle these cases.

Working of Unsupervised Learning

The working of unsupervised learning can be understood from the diagram below:

[Figure: working of unsupervised learning]

Here we start with unlabeled input data, which means it has not been categorized and no corresponding outputs are given. This unlabeled data is fed to the machine learning model in order to train it. The model analyzes the raw data to uncover hidden patterns by applying a suitable algorithm, such as k-means clustering or hierarchical clustering.

After applying the appropriate method, the algorithm splits the data objects into groups based on their similarities and differences.
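The steps above can be sketched with a small k-means implementation in plain Python. This is only a minimal illustration under stated assumptions: the function names `squared_distance` and `kmeans`, the toy 2-D points, and the parameters `iterations` and `seed` are all made up for this example and do not come from any particular library.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Group unlabeled points into k clusters (illustrative sketch only)."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random as centroids.
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                mean = tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                new_centroids.append(mean)
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break  # Nothing moved, so the grouping is stable.
        centroids = new_centroids
    # Final cluster index for each input point.
    labels = [min(range(k), key=lambda i: squared_distance(p, centroids[i])) for p in points]
    return labels, centroids


# Hypothetical unlabeled data: two visibly separate groups of 2-D points.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (8.0, 8.2), (8.1, 7.9), (7.8, 8.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

No labels are passed in at any point. The algorithm only sees the coordinates, and the first three points end up in one group and the last three in the other. Which group is numbered 0 and which is numbered 1 depends on the random starting centroids.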

Types of Unsupervised Learning Algorithms:

Unsupervised learning algorithms can be further categorized into two types of problems:

Clustering: Clustering is a way of organizing objects into clusters so that those with the most similarities stay in one group while those with few or no similarities stay in another. Cluster analysis identifies commonalities among data objects and groups them according to the presence or absence of such commonalities.
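One way to see "most similar objects end up together" is a bottom-up (agglomerative) approach: start with every object in its own cluster and repeatedly merge the two closest clusters. The sketch below assumes one-dimensional values and single-linkage distance (the gap between the two closest members). The function name `single_linkage` and the sample values are hypothetical and chosen only for illustration.

```python
def single_linkage(values, num_clusters):
    """Merge the closest clusters until num_clusters remain (sketch only)."""
    if num_clusters < 1:
        raise ValueError("num_clusters must be at least 1")
    # Every value starts as its own cluster.
    clusters = [[v] for v in values]
    while len(clusters) > num_clusters:
        best = None  # (distance, index_i, index_j) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge cluster j into cluster i and drop cluster j.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical unlabeled measurements with three natural groups.
groups = single_linkage([1.0, 1.2, 0.9, 5.0, 5.3, 9.8, 10.1], num_clusters=3)
print(groups)
```

The values near 1, near 5, and near 10 end up in three separate groups, even though the code was never told what those groups mean.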

Association: Association rule learning is an unsupervised learning method used to discover relationships between variables in a large database. It identifies sets of items that occur together in the dataset. Association rules can make marketing strategies more effective: for example, people who buy item X (say, bread) are also likely to buy item Y (butter or jam).
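The bread and butter example can be made concrete with the two measures commonly used to score an association rule: support (how often the items occur together) and confidence (how often Y appears in transactions that contain X). The sketch below is a minimal illustration. The transactions are invented sample data, and the helper names `count_containing`, `support`, and `confidence` are assumptions made for this example rather than part of any library.

```python
def count_containing(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def support(itemset, transactions):
    """Fraction of all transactions that contain the itemset."""
    return count_containing(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing antecedent, the fraction that also contain consequent."""
    base = count_containing(antecedent, transactions)
    if base == 0:
        raise ValueError("antecedent never occurs in the transactions")
    both = count_containing(set(antecedent) | set(consequent), transactions)
    return both / base


# Hypothetical shopping baskets (made-up data for illustration).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "jam"},
]

bread_support = support({"bread"}, transactions)
rule_confidence = confidence({"bread"}, {"butter"}, transactions)
print(bread_support, rule_confidence)
```

In this sample data, bread appears in 4 of the 5 baskets (support 0.8), and 3 of those 4 baskets also contain butter, so the rule "bread → butter" has a confidence of 0.75.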

Note: We will learn these algorithms in later chapters.

Unsupervised Learning algorithms:

Some popular unsupervised learning algorithms are listed below:

  • K-means clustering
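  • KNN (k-nearest neighbors), when used for unsupervised neighbor search rather than as a supervised classifier
  • Hierarchical clustering
  • Anomaly detection
  • Neural networks (for example, autoencoders)
  • Principal Component Analysis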
  • Independent Component Analysis
  • Apriori algorithm
  • Singular value decomposition

Advantages of Unsupervised Learning

  • Unsupervised learning can be applied to problems for which no labeled data exists, and these are often more complex than the problems handled by supervised learning.
  • Unsupervised learning is often preferred when labels are scarce, because unlabeled data is easier to obtain than labeled data.

Disadvantages of Unsupervised Learning

  • Because there is no known output to compare results against, unsupervised learning is inherently more challenging than supervised learning.
  • Because the input data is not labeled and the algorithm does not know the correct output in advance, the result of an unsupervised learning algorithm may be less accurate.