DMW Lab Manual
CO1 2 2 2 3 1 - - - - - - - 3 1 2 2
CO2 3 3 2 2 1 - - - - - - - 2 1 2 1
CO3 3 2 3 3 2 - - - - - - - 2 2 1 2
CO4 2 3 3 2 2 - - - - - - - 2 1 2 2
CO5 3 2 3 3 3 - - - - - - - 3 3 2 3
EXPERIMENT NO: 1
Aim:
Create an Employee Table with the help of Data Mining Tool WEKA.
Description:
We need to create an Employee Table as a training data set that includes attributes such as name,
id, salary, experience, gender, and phone number.
Procedure:
Steps:
@data
x,101,low,2,male,250311
y,102,high,3,female,251665
z,103,medium,1,male,240238
a,104,low,5,female,200200
b,105,high,2,male,240240
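The data rows above need an ARFF header before Weka can load them. The following is a minimal sketch of how the complete file could be assembled. The attribute types and the helper name build_arff are assumptions, since the manual lists only the attribute names.

```python
# Sketch: assemble ARFF text for the employee table.
# The attribute types below are assumptions; the manual gives only the names.

ATTRIBUTES = [
    ("name", "string"),
    ("id", "numeric"),
    ("salary", "{low,medium,high}"),
    ("experience", "numeric"),
    ("gender", "{male,female}"),
    ("phone", "numeric"),
]

# Data rows copied from the manual.
ROWS = [
    "x,101,low,2,male,250311",
    "y,102,high,3,female,251665",
    "z,103,medium,1,male,240238",
    "a,104,low,5,female,200200",
    "b,105,high,2,male,240240",
]


def build_arff(relation, attributes, rows):
    """Return ARFF text: the @relation line, one @attribute per line, then @data."""
    lines = ["@relation " + relation, ""]
    for attr_name, attr_type in attributes:
        lines.append("@attribute " + attr_name + " " + attr_type)
    lines.append("")
    lines.append("@data")
    lines.extend(rows)
    return "\n".join(lines) + "\n"


arff_text = build_arff("employee", ATTRIBUTES, ROWS)
print(arff_text)
```

Saving the printed text as employee.arff should give a file that the Explorer can open, provided the assumed types match the intended table.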
Result:
Aim:
Description:
Real-world databases are highly susceptible to noisy, missing, and inconsistent data because of their huge
size, so the data can be pre-processed to improve its quality. Better data quality improves the mining results
and the efficiency of the mining process.
1) Add
2) Remove
3) Normalization
@relation weather
@attribute outlook {sunny,rainy,overcast}
@attribute temparature numeric
@attribute humidity numeric
@attribute windy {true,false}
@attribute play {yes,no}
@data
sunny,85.0,85.0,false,no
overcast,80.0,90.0,true,no
sunny,83.0,86.0,false,yes
rainy,70.0,86.0,false,yes
rainy,68.0,80.0,false,yes
rainy,65.0,70.0,true,no
overcast,64.0,65.0,false,yes
sunny,72.0,95.0,true,no
sunny,69.0,70.0,false,yes
rainy,75.0,80.0,false,yes
3) After that, save the file in the .arff file format.
4) Minimize the ARFF file and then open Start -> Programs -> weka-3-4.
5) Click on weka-3-4; the Weka dialog box is displayed on the screen.
6) The dialog box offers four modes; click on Explorer.
7) Explorer shows many options. Click on ‘Open file’ and select the ARFF file.
8) Click on the Edit button, which shows the weather table in Weka.
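Before opening the file in Weka, it can help to check that the ARFF text is well formed. The following is a minimal sketch of a reader for the simple subset of ARFF used in this manual (no quoted values, no sparse data). The function name parse_arff is hypothetical, and this is not how Weka itself reads the file.

```python
# Sketch: a minimal ARFF reader for the weather file.
# Handles only simple files: one declaration per line, comma-separated data.

WEATHER_ARFF = """\
@relation weather
@attribute outlook {sunny,rainy,overcast}
@attribute temparature numeric
@attribute humidity numeric
@attribute windy {true,false}
@attribute play {yes,no}
@data
sunny,85.0,85.0,false,no
overcast,80.0,90.0,true,no
sunny,83.0,86.0,false,yes
rainy,70.0,86.0,false,yes
rainy,68.0,80.0,false,yes
rainy,65.0,70.0,true,no
overcast,64.0,65.0,false,yes
sunny,72.0,95.0,true,no
sunny,69.0,70.0,false,yes
rainy,75.0,80.0,false,yes
"""


def parse_arff(text):
    """Return (relation, attributes, rows) parsed from simple ARFF text."""
    relation = None
    attributes = []   # list of (name, type) pairs in declaration order
    rows = []         # one dict per data row, keyed by attribute name
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):   # skip blank lines and comments
            continue
        lower = line.lower()
        if in_data:
            values = [v.strip() for v in line.split(",")]
            row = {}
            for (name, kind), value in zip(attributes, values):
                # Numeric attributes become floats; nominal values stay strings.
                row[name] = float(value) if kind == "numeric" else value
            rows.append(row)
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, kind = line.split(None, 2)
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_arff(WEATHER_ARFF)
print(relation, len(attributes), len(rows))
```

A header in which two declarations share one line, or a data row with a broken value, would show up here as a wrong attribute count or an unexpected class value.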
Add Pre-Processing Technique:
Procedure:
Remove Pre-Processing Technique:
Procedure:
1) Click Start -> Programs -> Weka-3-4 -> Weka-3-4.
2) Click on Explorer.
3) Click on Open file.
4) Select the Weather.arff file and click on Open.
5) Click on the Choose button and select the Filters option.
6) Under Filters, there are Supervised and Unsupervised filters.
7) Click on Unsupervised.
8) Under Attribute, select the Remove filter.
9) Select the attributes windy and play to remove.
10) Click the Remove button and then Save.
11) Click on the Edit button; it shows a new Weather table in Weka.
Normalization Pre-Processing Technique:
Procedure:
1) Click Start -> Programs -> Weka-3-4 -> Weka-3-4.
2) Click on Explorer.
3) Click on Open file.
4) Select the Weather.arff file and click on Open.
5) Click on the Choose button and select the Filters option.
6) Under Filters, there are Supervised and Unsupervised filters.
7) Click on Unsupervised.
8) Under Attribute, select the Normalize filter.
9) Select the attributes temparature and humidity to normalize.
10) Click on the Apply button and then Save.
11) Click on the Edit button; it shows a new Weather table with normalized values in Weka.
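The effect of normalizing can be checked by hand. The following is a minimal sketch of min-max scaling to the range [0, 1] applied to the temperature and humidity columns of the weather table. The helper name min_max is hypothetical, and the sketch only illustrates the arithmetic; it does not call Weka.

```python
# Sketch: min-max normalization of two numeric attributes to the range [0, 1].

# Rows: (outlook, temperature, humidity, windy, play), from the weather table.
ROWS = [
    ("sunny", 85.0, 85.0, "false", "no"),
    ("overcast", 80.0, 90.0, "true", "no"),
    ("sunny", 83.0, 86.0, "false", "yes"),
    ("rainy", 70.0, 86.0, "false", "yes"),
    ("rainy", 68.0, 80.0, "false", "yes"),
    ("rainy", 65.0, 70.0, "true", "no"),
    ("overcast", 64.0, 65.0, "false", "yes"),
    ("sunny", 72.0, 95.0, "true", "no"),
    ("sunny", 69.0, 70.0, "false", "yes"),
    ("rainy", 75.0, 80.0, "false", "yes"),
]


def min_max(values):
    """Scale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


temperature = min_max([row[1] for row in ROWS])
humidity = min_max([row[2] for row in ROWS])

# Rebuild the table with the two numeric columns replaced.
normalized = [
    (row[0], t, h, row[3], row[4])
    for row, t, h in zip(ROWS, temperature, humidity)
]

for row in normalized:
    print(row)
```

The nominal attributes are left unchanged; only the numeric columns are rescaled.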
Aim:
Description:
The Knowledge Flow provides an alternative to the Explorer as a graphical front end to WEKA’s
algorithms. Knowledge Flow is a work in progress, so some of the functionality of the Explorer is not yet
available. On the other hand, there are things that can be done in Knowledge Flow but not in the Explorer.
Knowledge Flow presents a dataflow interface to WEKA. The user can select WEKA components from a
toolbar, place them on a layout canvas, and connect them together to form a knowledge flow for
processing and analyzing the data.
@relation weather
@attribute outlook {sunny,rainy,overcast}
@attribute temparature numeric
@attribute humidity numeric
@attribute windy {true,false}
@attribute play {yes,no}
@data
sunny,85.0,85.0,false,no
overcast,80.0,90.0,true,no
sunny,83.0,86.0,false,yes
rainy,70.0,86.0,false,yes
rainy,68.0,80.0,false,yes
rainy,65.0,70.0,true,no
overcast,64.0,65.0,false,yes
sunny,72.0,95.0,true,no
sunny,69.0,70.0,false,yes
rainy,75.0,80.0,false,yes
3) After that, save the file in the .arff file format.
4) Minimize the ARFF file and then open Start -> Programs -> weka-3-4.
5) Click on weka-3-4; the Weka dialog box is displayed on the screen.
6) The dialog box offers four modes; click on Explorer.
7) Explorer shows many options. Click on ‘Open file’ and select the ARFF file.
8) Click on the Edit button, which shows the Weather table in Weka.
Output:
Training Data Set Weather Table
Result:
Description:
In data mining, association rule learning is a popular and well-researched method for discovering
interesting relations between variables in large databases. It analyzes and presents
strong rules discovered in databases using different measures of interestingness. Association rules are used
in market basket analysis and are also employed in many other application areas, including Web usage mining,
intrusion detection, and bioinformatics.
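Two common measures of interestingness are support and confidence. The following is a minimal sketch of how they are computed for a single rule. The transactions and item names are hypothetical, because the Buying Table itself is not reproduced in this manual.

```python
# Sketch: support and confidence of one association rule.
# TRANSACTIONS is hypothetical example data, not the manual's Buying Table.

TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of the whole rule divided by support of the antecedent."""
    both = support(set(antecedent) | set(consequent), transactions)
    base = support(antecedent, transactions)
    return both / base if base > 0 else 0.0


# Rule under test: bread -> milk
rule_support = support({"bread", "milk"}, TRANSACTIONS)
rule_confidence = confidence({"bread"}, {"milk"}, TRANSACTIONS)
print("support:", rule_support)
print("confidence:", rule_confidence)
```

A rule is usually called strong when both values meet user-chosen minimum thresholds.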
Output:
Training Data Set Buying Table
Aim:
Description:
Classification & Prediction:
Classification is the process of finding a model that describes the data values and concepts
for the purpose of prediction.
Decision Tree:
A decision tree is a classification scheme that generates a tree consisting of a root node, internal
nodes, and external nodes.
The root node and the internal nodes represent attributes. The external nodes
are the classes, and each branch represents a value of an attribute.
A decision tree also encodes a set of rules for a given data set. Two data sets are involved:
a training data set, which is previously classified data, and a testing data set, which is newly
generated data.
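One common way to decide which attribute goes at a node is information gain. This criterion is an assumption here, since the manual does not state how the tree is built. The following is a minimal sketch that computes the gain of the outlook attribute, using the (outlook, play) pairs from the 14-row weather data used elsewhere in this manual.

```python
# Sketch: entropy and information gain for one nominal attribute.
import math
from collections import Counter

# (outlook, play) pairs from the 14-row weather table.
WEATHER = [
    ("sunny", "no"), ("sunny", "no"), ("overcast", "yes"), ("rainy", "yes"),
    ("rainy", "yes"), ("rainy", "no"), ("overcast", "yes"), ("sunny", "no"),
    ("sunny", "yes"), ("rainy", "yes"), ("sunny", "yes"), ("overcast", "yes"),
    ("overcast", "yes"), ("rainy", "no"),
]


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(pairs):
    """Entropy of the class minus the weighted entropy after splitting."""
    labels = [label for _, label in pairs]
    groups = {}
    for value, label in pairs:
        groups.setdefault(value, []).append(label)
    remainder = sum(
        (len(group) / len(pairs)) * entropy(group) for group in groups.values()
    )
    return entropy(labels) - remainder


class_entropy = entropy([label for _, label in WEATHER])
outlook_gain = information_gain(WEATHER)
print("entropy of play:", round(class_entropy, 3))
print("gain of outlook:", round(outlook_gain, 3))
```

The attribute with the largest gain would be placed at the root, and the same calculation would be repeated on each branch.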
Creation of Weather Table:
Procedure:
Output:
Decision Tree:
Aim:
Description:
This experiment compares the attributes of the data set after the attributes and the methods
of manipulation have been chosen. The visualization shows the information as a 2-D
representation.
@data
sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
rainy,65,70,TRUE,no
overcast,64,65,TRUE,yes
sunny,72,95,FALSE,no
sunny,69,70,FALSE,yes
rainy,75,80,FALSE,yes
sunny,75,70,TRUE,yes
overcast,72,90,TRUE,yes
overcast,81,75,FALSE,yes
rainy,71,91,TRUE,no
Output:
5) Click the Select Attribute button, select the Outlook attribute, and click OK.
6) Click on the Update button to display the output.
7) Click the Select Attribute button, select the Temperature attribute, and click OK.
8) Increase the Plot Size and Point Size.
9) Click on the Update button to display the output.
10) Click the Select Attribute button, select the Humidity attribute, and click OK.
11) Click on the Update button to display the output.
12) Click the Select Attribute button, select the Windy attribute, and click OK.
13) Increase the Jitter Size.
14) Click on the Update button to display the output.
15) Click the Select Attribute button, select the Play attribute, and click OK.
16) Click on the Update button to display the output.
Output:
Result:
This program has been successfully executed.
EXPERIMENT NO:7
Aim:
Write a procedure for cross-validation using the J48 algorithm on the weather table.
Description:
Cross-validation, sometimes called rotation estimation, is a technique for assessing how the results
of a statistical analysis will generalize to an independent data set. It is mainly used in settings where the goal
is prediction, and one wants to estimate how accurately a predictive model will perform in practice. One round
of cross-validation involves partitioning a sample of data into complementary subsets, performing the analysis
on one subset (called the training set), and validating the analysis on the other subset (called the validation set
or testing set).
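The partitioning described above can be sketched as follows. This is a minimal sketch of plain k-fold splitting for 14 rows and 10 folds. It does no shuffling or stratification, so it should not be read as what Weka does internally, and the function name k_fold_indices is hypothetical.

```python
# Sketch: split row indices into k folds for cross-validation.


def k_fold_indices(n_rows, k):
    """Yield (train_indices, test_indices) once for each of the k folds."""
    indices = list(range(n_rows))
    # Spread the remainder so fold sizes differ by at most one row.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# 14 rows (the size of the weather table) split into 10 folds.
folds = list(k_fold_indices(14, 10))
for number, (train, test) in enumerate(folds, start=1):
    print("fold", number, "test rows:", test, "train size:", len(train))
```

Each row is used for testing exactly once and for training in all the other rounds; the accuracy figures from the rounds are then averaged.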
Result:
The program has been successfully executed.
EXPERIMENT NO:8
Aim: Write a procedure for Clustering Customer data using Simple KMeans Algorithm.
Description:
Cluster analysis or clustering is the task of assigning a set of objects into groups (called clusters) so
that the objects in the same cluster are more similar (in some sense or another) to each other than to those in
other clusters. Clustering is a main task of explorative data mining, and a common technique for statistical
data analysis used in many fields, including machine learning, pattern recognition, image analysis, information
retrieval, and bioinformatics.
@data
x,youth,high,A
y,youth,low,B
z,middle,high,A
u,middle,low,B
v,senior,high,A
l,senior,low,B
w,youth,high,A
q,youth,low,B
r,middle,high,A
n,senior,high,A
Procedure:
1) Click Start -> Programs -> Weka 3.4
2) Click on Explorer.
3) Click on open file & then select Customer.arff file.
4) Click on the Cluster tab. It lists the available clustering algorithms.
5) Click on Choose button and then select SimpleKMeans algorithm.
6) Click on Start button and then output will be displayed on the screen.
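To see what the algorithm does with this table, the following is a minimal sketch of the k-means loop adapted to nominal values: distance is the number of mismatching attributes, and each centre is the most frequent value per attribute. The choice of k = 2, the seeding with the first distinct rows, and dropping the first column as an identifier are all assumptions. Weka's SimpleKMeans has its own seeding and options, which are not reproduced here.

```python
# Sketch: k-means style clustering for nominal attributes.
from collections import Counter

# Rows copied from the customer data; the first column is treated as an id.
CUSTOMERS = [
    ("x", "youth", "high", "A"),
    ("y", "youth", "low", "B"),
    ("z", "middle", "high", "A"),
    ("u", "middle", "low", "B"),
    ("v", "senior", "high", "A"),
    ("l", "senior", "low", "B"),
    ("w", "youth", "high", "A"),
    ("q", "youth", "low", "B"),
    ("r", "middle", "high", "A"),
    ("n", "senior", "high", "A"),
]


def mismatch(a, b):
    """Number of attribute positions where two rows differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def mode_center(points):
    """Most frequent value of each attribute across the given rows."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*points))


def k_means_nominal(points, k, max_iter=20):
    # Assumption: seed with the first k distinct rows so runs are repeatable.
    centers = []
    for p in points:
        if p not in centers:
            centers.append(p)
        if len(centers) == k:
            break
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each row goes to its nearest centre.
        new_assignment = [
            min(range(len(centers)), key=lambda c: mismatch(p, centers[c]))
            for p in points
        ]
        if new_assignment == assignment:   # nothing moved, so stop
            break
        assignment = new_assignment
        # Update step: recompute each centre from its members.
        for c in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = mode_center(members)
    return assignment, centers


features = [row[1:] for row in CUSTOMERS]   # drop the id column
assignment, centers = k_means_nominal(features, k=2)
for row, cluster in zip(CUSTOMERS, assignment):
    print(row[0], "-> cluster", cluster)
```

With these seeds the two clusters coincide with the last column of the table (A and B); a different seed or a different k can give a different grouping.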
Output:
Result:
The program has been successfully executed.
EXPERIMENT NO:9
Aim: Write a procedure for clustering Employee data using the Make Density Based Clusterer algorithm.
Description:
Cluster analysis or clustering is the task of assigning a set of objects into groups (called clusters) so
that the objects in the same cluster are more similar (in some sense or another) to each other than to those in
other clusters. Clustering is a main task of explorative data mining, and a common technique for statistical
data analysis used in many fields, including machine learning, pattern recognition, image analysis, information
retrieval, and bioinformatics.
@data
101,raj,10000,4,pdtr
102,ramu,15000,5,pdtr
103,anil,12000,3,kdp
104,sunil,13000,3,kdp
105,rajiv,16000,6,kdp
106,sunitha,15000,5,nlr
107,kavitha,12000,3,nlr
108,suresh,11000,5,gtr
109,ravi,12000,3,gtr
110,ramana,11000,5,gtr
111,ram,12000,3,kdp
112,kavya,13000,4,kdp
113,navya,14000,5,kdp
3) After that, save the file in the .arff file format.
4) Minimize the ARFF file and then open Start -> Programs -> weka-3-4.
5) Click on weka-3-4; the Weka dialog box is displayed on the screen.
6) The dialog box offers four modes; click on Explorer.
7) Explorer shows many options. Click on ‘Open file’ and select the ARFF file.
8) Click on the Edit button, which shows the employee table in Weka.
Result:
The program has been successfully executed.
EXPERIMENT NO:10
Description:
Cluster analysis or clustering is the task of assigning a set of objects into groups (called clusters) so
that the objects in the same cluster are more similar (in some sense or another) to each other than to those in
other clusters. Clustering is a main task of explorative data mining, and a common technique for statistical
data analysis used in many fields, including machine learning, pattern recognition, image analysis, information
retrieval, and bioinformatics.
@data
sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
rainy,65,70,TRUE,no
overcast,64,65,TRUE,yes
sunny,72,95,FALSE,no
sunny,69,70,FALSE,yes
rainy,75,80,FALSE,yes
sunny,75,70,TRUE,yes
overcast,72,90,TRUE,yes
overcast,81,75,FALSE,yes
rainy,71,91,TRUE,no
Procedure:
1) Click Start -> Programs -> Weka 3.4
2) Click on Explorer.
3) Click on open file & then select Weather.arff file.
4) Click on the Cluster tab. It lists the available clustering algorithms.
5) Click on Choose button and then select EM algorithm.
6) Click on Start button and then output will be displayed on the screen.
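The idea behind EM clustering can be illustrated on a single numeric attribute. The following is a minimal sketch that fits two Gaussian components to the humidity column of the weather table. The number of components, the starting values, the iteration count, and the variance floor are all assumptions, and the sketch does not reproduce Weka's EM implementation.

```python
# Sketch: EM for a two-component Gaussian mixture on one numeric attribute.
import math

# Humidity values from the 14-row weather table.
HUMIDITY = [85, 90, 86, 96, 80, 70, 65, 95, 70, 80, 70, 90, 75, 91]


def gaussian(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_two_gaussians(values, iterations=50, min_var=1.0):
    n = len(values)
    overall_mean = sum(values) / n
    overall_var = sum((v - overall_mean) ** 2 for v in values) / n
    # Assumed starting point: means at the extremes, equal weights.
    means = [float(min(values)), float(max(values))]
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            dens = [weights[j] * gaussian(v, means[j], variances[j]) for j in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(2):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, values)) / nj
            var = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, values)) / nj
            variances[j] = max(var, min_var)   # floor keeps a component from collapsing
    return weights, means, variances


weights, means, variances = em_two_gaussians(HUMIDITY)
print("weights:", [round(w, 3) for w in weights])
print("means:", [round(m, 2) for m in means])
print("variances:", [round(v, 2) for v in variances])
```

Unlike k-means, each value belongs to both clusters with some probability; a hard assignment can be obtained by picking the component with the larger responsibility.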
Output:
Result:
The program has been successfully executed.