Assignment - DWDM
Submitted by: Tanya Sikka (1719210284)
Ques 1. Give some examples of data preprocessing techniques.
Example: Before starting any project, we need to check its feasibility. In this case, a classifier is
required to predict class labels such as 'Safe' and 'Risky' for adopting the project and further
approving it. Classification is a two-step process:
1. Learning Step (Training Phase): Construction of the classification model.
Different algorithms are used to build a classifier by making the model learn from the available
training set. The model has to be trained so that it can predict accurate results.
2. Classification Step: The model is used to predict class labels for test data; the constructed
model is evaluated on this test data to estimate the accuracy of the classification rules.
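To make the two steps concrete, here is a minimal Python sketch, assuming scikit-learn is available; the feature values and the 'Safe'/'Risky' labels below are made up purely for illustration.

# A minimal sketch of the two steps, assuming scikit-learn is available.
# The feature values and 'Safe'/'Risky' labels are invented for illustration.
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Hypothetical project data: [budget_score, team_experience, duration_months]
X = [[8, 7, 6], [3, 2, 12], [9, 8, 4], [2, 3, 18], [7, 6, 8], [4, 2, 15]]
y = ['Safe', 'Risky', 'Safe', 'Risky', 'Safe', 'Risky']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)

# 1. Learning step: build the classifier from the training set
model = DecisionTreeClassifier()
model.fit(X_train, y_train)

# 2. Classification step: predict class labels for test data and estimate accuracy
predictions = model.predict(X_test)
print('Accuracy:', accuracy_score(y_test, predictions))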
A decision tree is a structure that includes a root node, branches, and leaf nodes. Each internal
node denotes a test on an attribute, each branch denotes the outcome of a test, and each leaf
node holds a class label. The topmost node in the tree is the root node.
A decision tree for the concept buy_computer, for example, indicates whether a customer at
a company is likely to buy a computer or not. Each internal node represents a test on an attribute,
and each leaf node represents a class.
Input:
Data partition D, which is a set of training tuples and their associated class labels.
attribute_list, the set of candidate attributes.
Attribute_selection_method, a procedure to determine the splitting criterion that best
partitions the data tuples into individual classes. This criterion includes a
splitting_attribute and either a split point or a splitting subset.
Output:
A decision tree.
Method:
create a node N;
if the tuples in D are all of the same class C then
    return N as a leaf node labeled with the class C;
if attribute_list is empty then
    return N as a leaf node labeled with the majority class in D;
apply Attribute_selection_method(D, attribute_list) to find the best splitting_criterion;
label node N with the splitting_criterion;
for each outcome j of the splitting_criterion
    let Dj be the set of data tuples in D satisfying outcome j;
    if Dj is empty then
        attach a leaf labeled with the majority class in D to node N;
    else
        attach the node returned by Generate_decision_tree(Dj, attribute_list) to node N;
end for
return N;
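Below is a rough Python rendering of the Generate_decision_tree procedure above, assuming categorical attributes and using information gain as the attribute selection method; the attribute names and class labels in the tiny example are invented for illustration.

# A sketch of Generate_decision_tree for categorical attributes, assuming
# information gain as the Attribute_selection_method.
from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def select_attribute(D, labels, attribute_list):
    # Attribute_selection_method: pick the attribute with the highest information gain
    base = entropy(labels)
    def gain(attr):
        remainder = 0.0
        for value in set(row[attr] for row in D):
            subset = [lab for row, lab in zip(D, labels) if row[attr] == value]
            remainder += len(subset) / len(labels) * entropy(subset)
        return base - remainder
    return max(attribute_list, key=gain)

def generate_decision_tree(D, labels, attribute_list):
    # all tuples in the same class: return a leaf labeled with that class
    if len(set(labels)) == 1:
        return labels[0]
    # attribute_list is empty: return a leaf with the majority class in D
    if not attribute_list:
        return Counter(labels).most_common(1)[0][0]
    split_attr = select_attribute(D, labels, attribute_list)
    node = {split_attr: {}}
    remaining = [a for a in attribute_list if a != split_attr]
    for value in set(row[split_attr] for row in D):
        Dj = [row for row in D if row[split_attr] == value]
        labj = [lab for row, lab in zip(D, labels) if row[split_attr] == value]
        if not Dj:  # empty partition: attach a leaf with the majority class in D
            node[split_attr][value] = Counter(labels).most_common(1)[0][0]
        else:       # otherwise recurse on the partition Dj
            node[split_attr][value] = generate_decision_tree(Dj, labj, remaining)
    return node

# Tiny made-up example in the spirit of buy_computer
D = [{'age': 'youth', 'income': 'high'}, {'age': 'youth', 'income': 'low'},
     {'age': 'senior', 'income': 'low'}, {'age': 'middle', 'income': 'high'}]
labels = ['no', 'no', 'yes', 'yes']
print(generate_decision_tree(D, labels, ['age', 'income']))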
Ques 6. What are neural networks?
The inspiration for neural networks came from the examination of central nervous
systems. In an artificial neural network, simple artificial nodes, called
“neurons”, “neurodes”, “processing elements” or “units”, are connected
together to form a network that mimics a biological neural network.
Neural networks are also similar to biological neural networks in that functions
are performed collectively and in parallel by the units, rather than there being
a clear delineation of subtasks to which various units are assigned. The term
“neural network” usually refers to models employed in statistics, cognitive
psychology and artificial intelligence. Neural network models which emulate
the central nervous system are part of theoretical neuroscience and
computational neuroscience.
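As a small sketch of the idea, assuming NumPy is available, the following code wires a few artificial "units" into a tiny feed-forward network; the weights are arbitrary illustrative values, not trained ones.

# A minimal sketch, assuming NumPy: each "unit" computes a weighted sum of its
# inputs and passes it through an activation function; the units in a layer
# operate collectively and in parallel.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([0.5, 0.1, 0.9])            # input signals
W_hidden = np.array([[0.2, -0.4, 0.7],   # weights from 3 inputs to 2 hidden units
                     [0.6, 0.1, -0.3]])
W_output = np.array([0.8, -0.5])         # weights from 2 hidden units to 1 output unit

hidden = sigmoid(W_hidden @ x)           # hidden layer computed in one parallel step
output = sigmoid(W_output @ hidden)
print('network output:', output)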
Real-life applications
The tasks artificial neural networks are applied to tend to fall within a few broad categories,
such as function approximation and regression, classification and pattern recognition, data
processing (for example, filtering and clustering), and robotics and control.
Grid-based Method
In this method, the objects together form a grid: the object space is quantized into a finite
number of cells that form a grid structure.
Advantages
The major advantage of this method is its fast processing time, which depends only on the number
of cells in each dimension of the quantized space and not on the number of data objects.
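A small sketch of the grid idea, assuming NumPy and data already scaled to the unit square: each object is mapped to the index of the cell it falls into, so subsequent processing works on cell counts rather than on individual objects.

# Quantize a 2-D object space into a fixed number of cells per dimension,
# assuming the points are already scaled to [0, 1). Points and grid size are
# made-up illustrative values.
import numpy as np

points = np.array([[0.12, 0.90], [0.15, 0.88], [0.80, 0.10], [0.82, 0.15]])
cells_per_dim = 4                                        # cells in each dimension

cell_ids = np.floor(points * cells_per_dim).astype(int)  # cell index of each object
counts = {}
for cid in map(tuple, cell_ids):
    counts[cid] = counts.get(cid, 0) + 1
print(counts)                                            # occupancy of each grid cell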
It is shown that the k-nearest neighbor algorithm (kNN) outperforms the first-nearest-neighbor
algorithm only under certain conditions: data sets must contain moderate amounts of noise, and
training examples from the different classes must belong to clusters that allow an increase in the
value of k without reaching into clusters of other classes. Methods for choosing the value of k for
kNN are investigated. It is shown that one-fold cross-validation on a restricted number of values
for k suffices for best performance. It is also shown that, for best performance, the votes of the
k nearest neighbors of a query should be weighted in inverse proportion to their distances from the query.
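The conclusion about distance-weighted voting can be sketched as follows, assuming NumPy; the training points, labels, and query are made-up values for illustration.

# Distance-weighted kNN: each of the k nearest neighbors votes with weight
# inversely proportional to its distance from the query. A small epsilon
# guards against division by zero when a neighbor coincides with the query.
import numpy as np

def weighted_knn_predict(X_train, y_train, query, k=3, eps=1e-9):
    distances = np.linalg.norm(X_train - query, axis=1)
    nearest = np.argsort(distances)[:k]          # indices of the k nearest neighbors
    votes = {}
    for i in nearest:
        votes[y_train[i]] = votes.get(y_train[i], 0.0) + 1.0 / (distances[i] + eps)
    return max(votes, key=votes.get)             # class with the largest weighted vote

X_train = np.array([[1.0, 1.0], [1.2, 0.9], [4.0, 4.2], [4.1, 3.9]])
y_train = ['A', 'A', 'B', 'B']
print(weighted_knn_predict(X_train, y_train, np.array([1.1, 1.0]), k=3))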