CSE4014 - High Performance Computing (EPJ) : Submitted by Project Guide

This document describes a project to accelerate the Density Peak Clustering (DPC) algorithm for large datasets. The DPC algorithm spends most time calculating local density and separation distance for each point. The project aims to speed this up by scanning only a point's neighbors to calculate separation distance, and identifying non-peak points early. The objectives are to accelerate DPC calculation and yield the same clusters as the original algorithm. Preliminary details on DPC clustering and limitations like dimensionality are also provided.

Uploaded by

Ashish Paudel

CSE4014 - High Performance Computing (EPJ)

Submitted by
Raj Prakash Shrivastav - 18BCE2463
Ashutosh Devkota - 18BCE2465
Ashish Paudel - 18BCE2494

Project Guide
MANJULA V
Associate Professor
School of Information Technology and Engineering

ABSTRACT

• The Density Peak Clustering (DPC) algorithm is a new density-based clustering method.
• It spends most of its execution time on calculating the local density and the separation
distance for each data point in a dataset.
• The purpose of this study is to accelerate its computation.
• On average, the DPC algorithm scans half of the dataset to calculate the separation distance of
each data point.
MODIFICATIONS

• We propose an approach to calculate the separation distance of a data point by scanning only the neighbors of the data point.
• Additionally, the purpose of the separation distance is to assist in choosing the
density peaks, which are the data points with both high local density and high
separation distance.
• We propose an approach to identify non-peak data points at an early stage to
avoid calculating their separation distances.
• Our experimental results show that most of the data points in a dataset can
benefit from the proposed approaches to accelerate the DPC algorithm.
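
The neighbor-only scan can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the local densities `rho` and the d_c-neighbor lists `neighbors` were already produced by the density step, and it treats "higher density" as strictly greater. The function name and the fallback to a full scan are assumptions made for the example. The reasoning is that every non-neighbor lies farther away than d_c, so if any neighbor has higher density, the nearest higher-density point must be a neighbor.

```python
import math

def separation_distance(i, points, rho, neighbors):
    """Return (delta, sigma) for point i, scanning its d_c-neighbors first.

    points    : list of coordinate tuples
    rho       : local density of each point (assumed precomputed)
    neighbors : neighbors[i] lists the indices within d_c of point i (assumed precomputed)
    """
    def d(j):
        return math.dist(points[i], points[j])

    # Candidates with strictly higher density among the neighbors of i.
    higher = [j for j in neighbors[i] if rho[j] > rho[i]]
    if higher:
        # A neighbor is always closer than a non-neighbor, so no full scan is needed.
        sigma = min(higher, key=d)
        return d(sigma), sigma

    # Fallback (assumption): no denser neighbor, so scan the whole dataset.
    higher = [j for j in range(len(points)) if rho[j] > rho[i]]
    if not higher:
        # Globally densest point: use its largest distance to any point, no link.
        return max(d(j) for j in range(len(points))), -1
    sigma = min(higher, key=d)
    return d(sigma), sigma

points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
rho = [1, 2, 0]
neighbors = [[1], [0], []]          # d_c-neighborhoods for d_c = 1.5
print(separation_distance(0, points, rho, neighbors))  # (1.0, 1)
```

Only points without a denser neighbor reach the fallback, which is why most points in a dataset can benefit from the shortcut.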
PROBLEM STATEMENT AND OBJECTIVES

Accelerating DPC by Scanning Neighbors Only

The objectives of the project are:

• To speed up the density clustering algorithm for large data sets.
• To accelerate the calculation of separation distances while yielding the same clustering results as the original DPC algorithm.
• To accelerate the DPC algorithm by identifying a significant portion of the non-peak data points and skipping the calculation of their separation distances.
• Input: the set of data points X ∈ ℝ^(N×M), the parameter d_c for defining the neighborhood, and the parameter d_r for selecting density peaks.
• Output: the label vector of cluster indices y ∈ ℝ^(N×1).
• Algorithm:
  1. Calculate ρ(x_i) for each x_i ∈ X using either (1) or (3).
  2. Sort all data points in X by their local densities in descending order.
  3. Calculate δ(x_i) and σ(x_i) for each x_i ∈ X using (4) and (5), respectively.
  4. Select data points with ρ(x_i)·δ(x_i) > d_r as density peaks.
  5. For each density peak x_i, set y_i = i. // starting point of each cluster
  6. For each non-peak data point x_i, set y_i = y_{σ(x_i)}. // cluster assignment
  7. Return y.
Preliminary

– Consider a set of points in some space to be clustered. Let ε be a parameter specifying the radius of a neighborhood with respect to some point.
– For the purpose of DBSCAN clustering, the points are classified as core points, (density-)reachable points and outliers, as follows:
– A point p is a core point if at least minPts points are within distance ε of it (including p).
– A point q is directly reachable from p if q is within distance ε of core point p.
– Points are only said to be directly reachable from core points.
– A point q is reachable from p if there is a path p1, ..., pn with p1 = p and pn = q, where each pi+1 is directly reachable from pi.
– All points not reachable from any other point are outliers or noise points.

– Now if p is a core point, then it forms a cluster together with all points (core or non-core) that are reachable from it.
Each cluster contains at least one core point; non-core points can be part of a cluster, but they form its "edge", since
they cannot be used to reach more points.
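
As a small illustration of these definitions, the sketch below labels each point as core, border (reachable but not core) or noise. The function name `classify` and the label strings are assumptions for the example. It relies on the observation that a non-core point is reachable exactly when some core point lies within ε of it, because the last step of any path must start at a core point.

```python
import math

def classify(points, eps, min_pts):
    """Label each point 'core', 'border' or 'noise' following the definitions above."""
    n = len(points)
    # eps-neighborhood of every point; a point is its own neighbor (distance 0).
    nbrs = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
            for i in range(n)]
    core = {i for i in range(n) if len(nbrs[i]) >= min_pts}
    labels = []
    for i in range(n):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nbrs[i]):
            labels.append("border")   # directly reachable from a core point
        else:
            labels.append("noise")    # not reachable from any point
    return labels

points = [(0, 0), (0, 1), (1, 0), (2, 0), (10, 10)]
print(classify(points, eps=1.0, min_pts=3))
# ['core', 'border', 'core', 'border', 'noise']
```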
Contd..

■ Reachability is not a symmetric relation since, by definition, no point may be reachable from a non-core point, regardless of distance (so a non-core point may be reachable, but nothing can be reached from it).
■ Therefore, a further notion of connectedness is needed to formally define the extent of the clusters found by DBSCAN. Two points p and q are density-connected if there is a point o such that both p and q are reachable from o. Density-connectedness is symmetric.
■ A cluster then satisfies two properties:
■ All points within the cluster are mutually density-connected.
■ If a point is density-reachable from any point of the cluster, it is part of the cluster as well.
• The quality of DPC depends on the distance measure used in the neighborhood query (the regionQuery function).
• The most common distance metric is Euclidean distance. For high-dimensional data in particular, this metric can be rendered almost useless by the so-called "curse of dimensionality", which makes it difficult to find an appropriate value for the neighborhood radius.
• This effect, however, is also present in any other algorithm based on Euclidean distance.
• DPC cannot cluster data sets well when they contain large differences in density.
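
The effect of dimensionality on Euclidean distance can be illustrated with a small experiment. The sketch below is only a demonstration under assumed settings (uniform random data, 200 points, a fixed seed, and a helper named `relative_contrast` that is not part of the project): it measures how much the farthest point differs from the nearest one, relative to the nearest distance. When this contrast is small, almost every radius either captures nearly all points or nearly none.

```python
import math
import random

def relative_contrast(dim, n=200, seed=0):
    """(max - min) / min of the distances from one random point to n others."""
    rng = random.Random(seed)
    query = tuple(rng.random() for _ in range(dim))
    dists = [
        math.dist(query, tuple(rng.random() for _ in range(dim)))
        for _ in range(n)
    ]
    return (max(dists) - min(dists)) / min(dists)

low_dim = relative_contrast(2)     # nearest and farthest points differ widely
high_dim = relative_contrast(500)  # distances concentrate around a common value
print(f"contrast in 2-D: {low_dim:.2f}, in 500-D: {high_dim:.2f}")
```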
Conclusion

• The proposed methods focus on accelerating the calculation of the separation distance.
• However, it is also possible to improve the DPC algorithm by accelerating the calculation of the local density.
• Conceptually, the DPC algorithm builds a directed acyclic graph of
all data points with an out-degree ≤ 1. Then, it selects several
data points from the graph as the density peaks.
• Finally, it removes the outgoing links of the density peaks and
breaks the graph into several subgraphs, each of which represents
a cluster.
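
The graph view can be made concrete with a short sketch. It assumes each point stores the index of the point it links to in a list `sigma` (with -1 for the single point that has no outgoing link) and that the chosen peaks include that point. The names `clusters_from_links`, `sigma` and `peaks` are illustrative only.

```python
def clusters_from_links(sigma, peaks):
    """Cut the outgoing links of the peaks, then label each point by the root it reaches.

    sigma : sigma[i] is the index point i links to, or -1 if it has no link
    peaks : set of indices selected as density peaks
    """
    # Removing a peak's outgoing link turns it into the root of its own subgraph.
    parent = [-1 if i in peaks else s for i, s in enumerate(sigma)]

    def root(i):
        # The graph is acyclic with out-degree <= 1, so this walk always ends.
        while parent[i] != -1:
            i = parent[i]
        return i

    return [root(i) for i in range(len(sigma))]

sigma = [-1, 0, 0, 1, 3, 4, 4, 5]
print(clusters_from_links(sigma, peaks={0, 4}))  # [0, 0, 0, 0, 4, 4, 4, 4]
```

Each distinct root corresponds to one subgraph, and therefore to one cluster.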
