Machine Learning BCA QA Detailed
Data Mining is the process of discovering patterns, correlations, trends, and useful information from
large datasets using statistical and computational techniques. It focuses on analyzing data for
Machine Learning, on the other hand, is a subset of Artificial Intelligence that enables computers to
learn from data and make predictions or decisions without being explicitly programmed. While data
mining extracts information, machine learning uses that information to build predictive models.
In supervised learning, the algorithm is trained on a labeled dataset, which means the input data is
paired with the correct output. The standard approach involves feeding the training data to the
model so it can learn the mapping from inputs to outputs. Once the model is trained, it is tested on
data it has not seen before to check how well it generalizes.
The training set is a portion of the dataset used to train a machine learning model. It allows the
model to learn and adjust its parameters based on the input-output pairs.
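As a rough illustration of setting aside a training portion, the sketch below shuffles a small
labeled dataset and splits it into a training part and a held-out part. The function name
train_test_split, the 25% hold-out fraction, the fixed seed, and the toy dataset are all
assumptions made for this example and do not come from the notes.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and return (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical labeled dataset: (input, correct output) pairs.
dataset = [(x, 2 * x + 1) for x in range(20)]
train_set, test_set = train_test_split(dataset)
print(len(train_set), "training pairs,", len(test_set), "held-out pairs")
```

Shuffling before splitting is a common precaution so that neither portion inherits an ordering
that happens to exist in the raw data.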
The test set is a separate portion of the dataset that is not used during training. It is used to assess
how well the trained model performs on data it has not seen.
A classifier is a type of algorithm in machine learning used to assign a category label to input data. It
is part of supervised learning and is commonly used in applications such as spam detection, image
Unsupervised learning is a type of machine learning where the algorithm is provided with data that
has no labels. The goal is to explore the data and find hidden patterns or structures, such as
groups of similar data points.
Tree pruning is a technique used in decision tree algorithms to reduce the size of the tree by
removing parts that do not provide additional power in classifying instances. It helps to improve the
tree's ability to generalize by reducing overfitting.
Backpropagation trains a neural network by
calculating the error at the output and then propagating it backward through the network layers to
update the weights. This process is repeated iteratively to minimize the error and improve the
model's accuracy.
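The loop described above (compute the output error, propagate it backward through the layers,
update the weights, repeat) can be sketched in plain Python. This is only a minimal sketch: the
network size (two inputs, two hidden sigmoid units, one output), the starting weights, the
learning rate, the squared-error loss, and the OR truth table used as training data are all
assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Return (hidden activations, network output) for one input vector."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, output


def mean_squared_error(data, params):
    return sum((forward(x, *params)[1] - t) ** 2 for x, t in data) / len(data)


def train(data, epochs=2000, lr=0.5):
    # Small fixed starting weights (arbitrary illustrative values).
    w_hidden = [[0.1, -0.2], [0.3, 0.2]]
    b_hidden = [0.0, 0.0]
    w_out = [0.2, -0.1]
    b_out = 0.0
    loss_before = mean_squared_error(data, (w_hidden, b_hidden, w_out, b_out))
    for _ in range(epochs):
        for x, target in data:
            hidden, output = forward(x, w_hidden, b_hidden, w_out, b_out)
            # Error signal at the output for the loss 0.5 * (output - target)^2.
            delta_out = (output - target) * output * (1.0 - output)
            # Propagate the error backward to each hidden unit.
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                            for j in range(len(hidden))]
            # Update the weights and biases against the gradient.
            for j in range(len(hidden)):
                w_out[j] -= lr * delta_out * hidden[j]
                for i in range(len(x)):
                    w_hidden[j][i] -= lr * delta_hidden[j] * x[i]
                b_hidden[j] -= lr * delta_hidden[j]
            b_out -= lr * delta_out
    params = (w_hidden, b_hidden, w_out, b_out)
    return params, loss_before, mean_squared_error(data, params)


# Hypothetical training data: the logical OR of two inputs.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
        ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
params, loss_before, loss_after = train(data)
print("error before training:", loss_before)
print("error after training:", loss_after)
```

The hidden-layer error terms are computed before the output weights are changed, so each update
uses the gradient of the same forward pass.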
The target function in a learning program is the ideal function that the learning algorithm aims to
approximate. It represents the correct mapping from input to output, and the algorithm tries to find a
hypothesis that matches it as closely as possible.
A useful perspective of a learning program refers to its ability to generalize from the training data to
real-world data. This means that the model should not just memorize the training data but should be
able to perform well on new, unseen data.
two or more classes using a linear combination of features. It is used to find the decision boundary
Support vectors are the data points that are closest to the decision boundary (or hyperplane) in a
Support Vector Machine (SVM). These points are critical in defining the boundary and maximizing
the margin between the classes.
Clustering is an unsupervised learning technique that involves grouping a set of data points into
clusters such that points in the same cluster are more similar to each other than to those in other
clusters.
recognition.
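One way to make the idea of grouping similar points concrete is k-means, sketched below. The
notes do not name a specific clustering algorithm, so choosing k-means is an assumption, as are
the function names, the toy two-dimensional points, and the use of the first and fourth points
as starting centroids.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_index(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(point, centroids[k]))


def kmeans(points, initial_centroids, iterations=10):
    """Return (centroids, labels) after a fixed number of k-means iterations."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest_index(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for k, members in enumerate(clusters):
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[k])  # keep a centroid that lost all points
        centroids = updated
    labels = [nearest_index(p, centroids) for p in points]
    return centroids, labels


# Hypothetical unlabeled data with two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(points, [points[0], points[3]])
print(labels)
print(centroids)
```

No labels are supplied anywhere; the grouping comes only from the distances between points,
which is what makes this an unsupervised method.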
Linear regression is a supervised learning algorithm used to model the relationship between a
dependent variable and one or more independent variables. It assumes a linear relationship and
tries to fit a straight line (regression line) that best predicts the output based on the input features.
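For a single independent variable, the best-fitting straight line can be computed directly with
the ordinary least-squares formulas. The sketch below assumes that setting; the function name
fit_line and the sample values are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
slope, intercept = fit_line(xs, ys)
prediction = slope * 6.0 + intercept  # predicted output for a new input x = 6
print(slope, intercept, prediction)
```

With more than one independent variable the same idea applies, but the fit is usually computed
with matrix methods rather than these two scalar formulas.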