Clustering PPT 1233

Uploaded by

gugulothdevendarnaik

Clustering in Machine Learning
By
BODA SANTOSH NAIK (EC21B020)
BANOTH ROHITH (EC21B015)
DESAVATH SIVA NAIK (EC21B024)

Introduction to Clustering
Clustering is an unsupervised learning technique used to group similar data points.

It helps in discovering inherent patterns within datasets without prior labels.

Clustering is widely used in various applications such as image segmentation and customer segmentation.

Importance of Clustering

Clustering simplifies complex datasets by summarizing many individual data points as a small number of groups.

It facilitates better data analysis by grouping similar items together.

Clustering can improve decision-making processes in business and research.

Types of Clustering

Clustering can be categorized into several types, including centroid-based, density-based, and hierarchical clustering.

Each type has its own methodology and use cases suited for different data distributions.

Understanding the types of clustering is crucial for selecting the appropriate algorithm.

Centroid-Based Clustering
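K-Means is a widely used centroid-based clustering algorithm.

It partitions the data into K clusters by minimizing the within-cluster variance, that is, the sum of squared distances from each point to the centroid of its cluster.

The algorithm alternates between assigning each point to its nearest centroid and recomputing each centroid as the mean of its assigned points, until the centroids stop changing.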
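
The iterative centroid update described above can be sketched in NumPy roughly as follows. This is a minimal illustration, not a production implementation: the function name `kmeans`, the random initialisation from data points, the fixed seed, and the toy data are all assumptions made for the example.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Hypothetical helper: plain K-Means with random initialisation."""
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: distance from every point to every centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid becomes the mean of its points.
        # An empty cluster keeps its previous centroid.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Convergence: stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Toy data: two well-separated groups of three points each.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
labels, centroids = kmeans(X, k=2)
print(labels, centroids)
```

In practice the result depends on the initial centroids, so library implementations usually run several initialisations and keep the best one.
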
Density-Based Clustering

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) identifies clusters based
on the density of data points.

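DBSCAN requires two parameters: epsilon (the neighborhood radius) and minPts (the minimum number of points within that radius for a region to count as dense).

DBSCAN distinguishes three types of data points:

Core point: has at least minPts points within its epsilon-neighborhood.

Boundary (border) point: not a core point itself, but lies within the epsilon-neighborhood of a core point.

Noise point: neither a core point nor a boundary point.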
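
The three point types can be illustrated with a short NumPy sketch. It only labels points as core, border, or noise and does not perform the full cluster expansion of DBSCAN. The function name `classify_points`, the toy data, and the convention that a point counts as its own neighbour are assumptions for the example.

```python
import numpy as np

def classify_points(X, eps, min_pts):
    """Hypothetical helper: label each point as core, border, or noise."""
    # Pairwise Euclidean distances between all points.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Assumption: the eps-neighbourhood includes the point itself.
    neighbors = dists <= eps
    is_core = neighbors.sum(axis=1) >= min_pts
    # Border: not core, but at least one core point lies within eps.
    is_border = ~is_core & (neighbors & is_core[None, :]).any(axis=1)
    return np.where(is_core, "core", np.where(is_border, "border", "noise"))

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
              [2.2, 1.0],    # close to one dense point only
              [8.0, 8.0]])   # far from everything
kinds = classify_points(X, eps=1.5, min_pts=4)
print(kinds.tolist())
# → ['core', 'core', 'core', 'core', 'border', 'noise']
```
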

Hierarchical Clustering

Hierarchical clustering creates a tree-like structure to represent data relationships.

It can be agglomerative (bottom-up) or divisive (top-down) in its approach.

Dendrograms are commonly used to visualize the results of hierarchical clustering.
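
As a rough sketch of the agglomerative (bottom-up) approach, the example below uses SciPy's `linkage` and `fcluster` functions. The toy data, the choice of average linkage, and the cut into two clusters are assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy data: two small, well-separated groups.
X = np.array([[0.0, 0.0], [0.1, 0.1], [0.2, 0.0],
              [5.0, 5.0], [5.1, 5.1], [5.0, 5.2]])

# Build the merge tree bottom-up; each row of Z records one merge.
Z = linkage(X, method="average")

# Cut the tree so that at most two flat clusters remain.
labels = fcluster(Z, t=2, criterion="maxclust")
print(Z.shape, labels)
```

The same `Z` array can be passed to `scipy.cluster.hierarchy.dendrogram` to draw the tree, which requires a plotting backend such as Matplotlib.
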

Evaluation Metrics

Evaluation metrics are crucial tools in machine learning for assessing the performance of a model. They provide quantitative measures of how well a model is making predictions. Here are some commonly used evaluation metrics.

Classification metrics:
Accuracy: the fraction of all predictions that the model gets right.

Precision: the fraction of positive predictions that are actually positive, which measures the quality of the model's positive predictions.

Recall: the fraction of actual positive instances in a dataset that the model correctly identifies.
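
The three definitions can be checked with a small worked example in plain Python. The function name `classification_metrics` and the label lists are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Hypothetical helper: accuracy, precision and recall for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    # Guard against division by zero when nothing is predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
accuracy, precision, recall = classification_metrics(y_true, y_pred)
print(accuracy, precision, recall)
# → 0.75 0.75 0.75
```

Here 6 of 8 predictions are correct, 3 of the 4 positive predictions are right, and 3 of the 4 actual positives are found.
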

Challenges in Clustering

Many clustering algorithms, K-Means in particular, are sensitive to outliers, which can distort the results significantly.

The choice of the number of clusters (K) in algorithms like K-Means can be subjective.

High-dimensional data often leads to the “curse of dimensionality,” in which distances between points become less informative, complicating clustering.

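
One common way to make the choice of K less subjective is to compare a cluster-quality score across several candidate values. The sketch below uses scikit-learn's `KMeans` and `silhouette_score`. The synthetic data, the candidate range of 2 to 6, and the use of the silhouette score rather than another criterion are assumptions for the example, not something the slides prescribe.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic data: three well-separated blobs of 30 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

# Score each candidate K; a higher silhouette means tighter, better
# separated clusters.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(best_k)
```

On real data the scores are often much closer together, so the result is a guide rather than a definitive answer.
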
Practical Applications

Clustering is used in customer segmentation to tailor marketing strategies effectively.

It plays a critical role in image and video processing for object recognition.

In bioinformatics, clustering helps in gene expression analysis and protein classification.

Tools and Libraries

Popular libraries for clustering in Python include scikit-learn, SciPy, and HDBSCAN.

R also offers robust clustering packages such as 'cluster' and 'factoextra'.

These tools provide easy-to-use implementations of various clustering algorithms.

Pandas is useful for data manipulation and preprocessing before clustering.

NumPy is useful for numerical operations and is often used for implementing clustering algorithms from scratch.

Case Study: Customer Segmentation
A retail company used K-Means clustering to segment its customer base into distinct groups.

This segmentation enabled targeted marketing campaigns and improved customer engagement.

The results showed a significant increase in sales and customer satisfaction.

Case Study: Image Segmentation

Researchers applied DBSCAN to segment complex images in a computer vision project.

The algorithm effectively identified regions of interest while ignoring background noise.

This segmentation improved the accuracy of subsequent image classification tasks.

The segmentation approach was applied to real-world data, such as satellite images and medical scans, where DBSCAN successfully identified key regions like urban areas or tumor boundaries, further validating its effectiveness.

Future Directions

The integration of clustering with deep learning techniques is an emerging trend.

Research is focusing on developing algorithms that can handle dynamic and streaming data.

Further advancements in clustering will enhance its applicability across various domains.

Best Practices

Always preprocess your data to remove noise and handle missing values before clustering.

Experiment with multiple algorithms and parameters to find the most suitable method for your data.

Visualize the clusters formed to gain insights and validate the clustering results.
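
A minimal preprocessing step of the kind recommended above might look like the following NumPy sketch. The function name `preprocess`, the use of column-mean imputation for missing values, the additional standardisation step, and the sample values are all assumptions for illustration. Other strategies may suit a given dataset better.

```python
import numpy as np

def preprocess(X):
    """Hypothetical helper: fill missing values, then standardise columns."""
    X = np.asarray(X, dtype=float).copy()
    # Assumption: replace each missing value (NaN) with its column mean.
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    # Assumption: scale every column to zero mean and unit variance so that
    # no single feature dominates distance calculations.
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled
    return (X - X.mean(axis=0)) / std

# Made-up records (age, income) with two missing entries.
raw = [[25, 40000], [32, np.nan], [47, 82000], [np.nan, 61000]]
clean = preprocess(raw)
print(clean)
```
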

Conclusion
Clustering is a powerful tool for data analysis that uncovers hidden structures in data.

Understanding different clustering algorithms and their applications is essential for practitioners.

As data continues to grow, the importance and relevance of clustering in machine learning will only increase.
