Artificial Intelligence - Machine Learning Fundamentals

The document provides an overview of machine learning (ML), defining it as a subfield of artificial intelligence that enables systems to learn from data without explicit programming. It discusses key components of ML, its importance, applications across various sectors, and differentiates between AI, ML, and deep learning. Additionally, it covers types of ML, data preprocessing, model evaluation, and performance metrics.

Machine Learning Fundamentals

Endang Wahyu Pamungkas, Ph.D.


What is Machine Learning?
● Definition: Machine Learning is a
subfield of artificial intelligence (AI)
that provides systems the ability to
automatically learn and improve
from experience without being
explicitly programmed.
● Core Idea: The goal of ML is to
develop algorithms that can receive
input data and use statistical
analysis to predict an output.
What is Machine Learning?
● Key Components:
○ Data: The raw information from
which the machine learns.
○ Model: A mathematical
representation of how the
machine will interpret data.
○ Learning Algorithm: The
method used to train the model
on the data.
○ Prediction/Inference: The
outcome or decision made by the
model after training.
Importance and Applications of
Machine Learning
● Why Machine Learning Matters:
○ Ability to handle vast amounts of
data and perform complex
computations efficiently.
○ Adaptability and improvement
over time with more data.
○ Enables predictive capabilities
and decision-making in real-time.
Importance and Applications of
Machine Learning
● Applications of ML:
○ Healthcare: Predictive analytics
for patient diagnosis, medical
imaging, and treatment
personalization.
○ Finance: Fraud detection, risk
management, and algorithmic
trading.
○ Technology: Search engines,
recommendation systems (like
those used by Netflix and
Amazon), and speech recognition.
AI vs. Machine Learning vs.
Deep Learning
● Artificial Intelligence (AI):
○ Broad concept of machines being
able to carry out tasks in a way
that we would consider “smart”.
○ AI includes anything from a
computer program playing chess,
to solving complex mathematical
problems, or understanding and
processing human language.
AI vs. Machine Learning vs.
Deep Learning
● Machine Learning (ML):
○ A subset of AI that involves the
creation of algorithms that can
modify themselves without
human intervention to produce
desired outputs by feeding on
data.
○ Focuses on the development of
programs that can access data
and learn for themselves.
AI vs. Machine Learning vs.
Deep Learning
● Deep Learning (DL):
○ A subset of ML that uses layered
neural networks to simulate
human decision-making.
○ Enables highly accurate and
efficient models capable of
handling large sets of
unstructured data like images,
sound, and text.
Types of Machine Learning
● Supervised Learning
● Unsupervised Learning
● Semi-Supervised Learning
● Reinforcement Learning
Supervised Learning
● Definition: Supervised learning is a
type of machine learning where the
model is trained on labeled data.
The training data includes both the
input and the desired output.
● How it Works:
○ The algorithm makes predictions
based on the training data.
○ It learns through the feedback
loop where the model’s
predictions are compared with
the actual outcomes to find errors.
Supervised Learning
● Examples:
○ Regression: Predicting continuous
values, e.g., housing prices,
temperature forecasts.
○ Classification: Categorizing data
into predefined groups, e.g., spam
detection in emails, image
recognition, and financial crisis
prediction (Maryam et al. 2022).
● Use Cases:
○ Financial forecasting for stock prices.

Reference: Maryam, M., Anggoro, D. A., Tika, M. F., & Kusumawati, F. C. (2022). An intelligent hybrid model using artificial neural networks and particle swarm optimization technique for financial crisis prediction. Pakistan Journal of Statistics and Operation Research, 1015-1025.
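To make the supervised-learning idea concrete, here is a minimal classification sketch. It assumes scikit-learn is available and uses its bundled Iris dataset; the dataset and model choice are illustrative and not part of the original slides.

```python
# Minimal supervised-learning sketch (illustrative; assumes scikit-learn is installed).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)          # features and labels (labeled data)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)  # hold out data the model never sees

model = LogisticRegression(max_iter=1000)  # a simple classification model
model.fit(X_train, y_train)                # learn from input-output pairs
print("Held-out accuracy:", model.score(X_test, y_test))
```

The held-out accuracy is the feedback signal described above: predictions are compared against the known labels to measure how well the learned mapping generalizes.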
Unsupervised Learning
● Definition: Unsupervised learning
involves training a model on data
that has not been labeled, allowing
the model to act on that information
without guidance.
● How it Works:
○ The algorithm tries to organize
the data into groups or clusters
based on similarities, patterns, or
differences without prior training
of data.
Unsupervised Learning
● Examples:
○ Clustering: Segmenting a heterogeneous
population into a number of more
homogeneous groups, e.g., customer
segmentation for marketing (Dwididanti
et al. 2022).
○ Association: Discovering rules that
describe large portions of your data, e.g.,
people who buy X also tend to buy Y.
● Use Cases:
○ Market basket analysis in retail to understand customer purchase patterns.
○ Anomaly detection for identifying unusual data points, e.g., fraudulent credit card transactions.

Reference: Dwididanti, S., Anggoro, D. A., & Sutanto, M. H. (2022). Analisis Perbandingan Algoritma Bisecting K-Means dan Fuzzy C-Means pada Data Pengguna Kartu Kredit. Emitor: Jurnal Teknik Elektro, 22(2), 110-117.
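For the anomaly-detection use case, here is a minimal sketch using an isolation forest, one common unsupervised choice (the slides do not prescribe a specific algorithm). It assumes scikit-learn and NumPy, and the synthetic data is invented for illustration.

```python
# Unsupervised anomaly detection sketch (illustrative; assumes scikit-learn and NumPy).
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=1.0, size=(200, 2))   # typical behaviour
outliers = rng.uniform(low=-6, high=6, size=(10, 2))     # a few unusual points
X = np.vstack([normal, outliers])                        # note: no labels are used

detector = IsolationForest(contamination=0.05, random_state=0)
labels = detector.fit_predict(X)   # -1 marks points flagged as anomalies
print("Flagged anomalies:", int((labels == -1).sum()))
```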
Semi-supervised Learning
● Combines a small amount of labeled
data with a large amount of unlabeled
data during training.
● Used when acquiring a fully labeled
dataset is expensive or laborious.
● Example: Enhancing the accuracy of a
model in image recognition where
only some images are labeled.
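A minimal sketch of this idea, assuming scikit-learn: unlabeled samples are marked with -1 and a self-training wrapper pseudo-labels them during training. Self-training is one possible technique, not necessarily the one the slides have in mind.

```python
# Semi-supervised sketch: few labels, many unlabeled points (illustrative).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X, y = load_iris(return_X_y=True)
y_partial = y.copy()
rng = np.random.default_rng(42)
unlabeled = rng.random(len(y)) < 0.8      # hide 80% of the labels
y_partial[unlabeled] = -1                 # -1 means "unlabeled" by convention

model = SelfTrainingClassifier(LogisticRegression(max_iter=1000))
model.fit(X, y_partial)                   # learns from labeled + unlabeled data
print("Accuracy on the full labeled set:", model.score(X, y))
```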
Reinforcement Learning
● Models learn to make decisions by
taking actions in an environment to
maximize some notion of cumulative
reward.
● Involves decision-making sequences
where the outcome of current
decisions impacts future results.
● Example: Video game AI, where the
model learns to make in-game
decisions to maximize the game
score.
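The toy sketch below shows the action → reward → update loop of tabular Q-learning on a hypothetical 5-state chain (the agent is rewarded for reaching the rightmost state). It is an illustrative assumption, far simpler than a real video-game agent.

```python
# Toy reinforcement-learning sketch: tabular Q-learning on a 5-state chain (illustrative).
import numpy as np

n_states, n_actions = 5, 2          # actions: 0 = move left, 1 = move right
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.2
rng = np.random.default_rng(0)

for episode in range(500):
    state = 0
    while state != n_states - 1:                     # episode ends at the last state
        if rng.random() < epsilon:                   # explore occasionally
            action = int(rng.integers(n_actions))
        else:                                        # otherwise exploit current estimates
            action = int(np.argmax(Q[state]))
        next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
        reward = 1.0 if next_state == n_states - 1 else 0.0
        # Q-learning update: move the estimate toward reward + discounted future value.
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print(np.argmax(Q[:-1], axis=1))   # learned policy for non-terminal states should prefer action 1
```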
Importance of Data in Machine
Learning
● Definition of Data Preprocessing:
The process of cleaning and organizing
raw data to make it suitable for a
machine learning model.
● Significance:
○ Enhances the quality of data,
leading to better machine learning
models.
● Key Points:
○ Garbage In, Garbage Out: Quality
of input data directly affects the
output.
Handling Missing Values
● Common Techniques:
○ Deletion: Removing records with
missing values, which is simple but
can lead to loss of valuable data.
○ Imputation: Filling in missing values
using various techniques like mean,
median, mode (for numerical data)
or most frequent category (for
categorical data).
○ Prediction Models: Using algorithms
to predict and fill missing values
based on other available data.
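A minimal sketch of deletion versus imputation, assuming pandas and scikit-learn; the small table is invented for illustration.

```python
# Handling missing values: deletion vs. imputation (illustrative).
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

df = pd.DataFrame({
    "age":  [25, np.nan, 40, 35],
    "city": ["Solo", "Jakarta", np.nan, "Solo"],
})

dropped = df.dropna()                                   # deletion: simple, but loses rows
df["age"] = SimpleImputer(strategy="median").fit_transform(df[["age"]]).ravel()
df["city"] = SimpleImputer(strategy="most_frequent").fit_transform(df[["city"]]).ravel()
print(dropped.shape)   # two rows survive deletion
print(df)              # all four rows survive imputation
```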
Data Normalization
● Normalization (Scaling):
○ Process of rescaling feature values to a fixed range such as [0, 1] or [-1, 1].
○ Methods include Min-Max scaling
and Proportional scaling.
○ Essential for models that are
sensitive to large numerical
values.
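A minimal Min-Max scaling sketch, assuming scikit-learn; the numbers are invented.

```python
# Min-Max scaling: rescale each feature to the [0, 1] range (illustrative).
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])       # two features on very different scales

scaler = MinMaxScaler()            # default feature_range is (0, 1)
X_scaled = scaler.fit_transform(X)
print(X_scaled)                    # each column now spans exactly [0, 1]
```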
Feature Selection and
Engineering
● Feature Selection:
○ The process of identifying the
most relevant features for use in
model construction.
○ Techniques include filter methods,
wrapper methods, and embedded
methods.
○ Reduces overfitting, improves
accuracy, and reduces training
time.
Feature Selection and
Engineering
● Feature Engineering:
○ Creating new features from
existing ones to increase the
predictive power of the model.
○ Involves domain knowledge to
add meaningful variables that
help the model to learn better
patterns.
○ Examples: Polynomial features,
interaction terms, and
aggregation of transactional data.
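A minimal sketch combining a filter-style selection method with a simple feature-engineering step, assuming scikit-learn; the particular choices (SelectKBest with an ANOVA F-test, polynomial features) are illustrative examples rather than recommendations from the slides.

```python
# Feature selection (filter method) and feature engineering (polynomial terms) - illustrative.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.preprocessing import PolynomialFeatures

X, y = load_iris(return_X_y=True)

selector = SelectKBest(score_func=f_classif, k=2)        # keep the 2 most relevant features
X_selected = selector.fit_transform(X, y)

poly = PolynomialFeatures(degree=2, include_bias=False)  # add squares and interaction terms
X_engineered = poly.fit_transform(X_selected)
print(X.shape, "->", X_selected.shape, "->", X_engineered.shape)
```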
Linear Regression (Supervised)
● Definition: Linear regression is a
statistical method used to model the
relationship between a dependent
variable and one or more
independent variables by fitting a
linear equation to observed data.
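A minimal linear-regression sketch, assuming scikit-learn and NumPy; the synthetic data (y ≈ 3x + 2 plus noise) is invented for illustration.

```python
# Linear regression: fit y ≈ w*x + b to observed data (illustrative).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))                 # one independent variable
y = 3.0 * X.ravel() + 2.0 + rng.normal(0, 1, 100)     # dependent variable with noise

model = LinearRegression().fit(X, y)
print("slope:", model.coef_[0], "intercept:", model.intercept_)
print("prediction at x=5:", model.predict([[5.0]])[0])
```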
Decision Tree (Supervised)
● Definition: A decision tree is a
flowchart-like tree structure where
each internal node represents a
"test" on an attribute, each branch
represents the outcome of the test,
and each leaf node represents a
class label.
● How it Works:
○ Splits the data into subsets based
on the most significant attribute
at each node.
○ The process continues recursively until a stopping criterion is met, e.g., a maximum depth or pure leaf nodes.
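A minimal decision-tree sketch, assuming scikit-learn; max_depth is capped only to keep the printed tree small.

```python
# Decision tree classifier: recursive splits on the most informative attributes (illustrative).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(data.data, data.target)

# export_text shows the learned tests (internal nodes) and class labels (leaves).
print(export_text(tree, feature_names=list(data.feature_names)))
```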
K-Means Clustering
(Unsupervised)
● K-Means Clustering: A popular
method in unsupervised learning used
to partition data into K distinct, non-
overlapping subgroups (clusters) where
each data point belongs to only one
group.
● Algorithm Steps:
○ Initialization: Select K cluster centers
randomly.
○ Assignment: Assign each data point
to the nearest cluster center.
○ Update: Recalculate each cluster center as the mean of its assigned points; repeat the assignment and update steps until the centers stop changing.
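A minimal K-Means sketch, assuming scikit-learn; the synthetic blobs stand in for real data such as customer records.

```python
# K-Means: partition data into K clusters by alternating assignment and update steps (illustrative).
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)   # unlabeled points

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)      # K must be chosen up front
labels = kmeans.fit_predict(X)          # cluster index assigned to each point
print(kmeans.cluster_centers_)          # final cluster centers after convergence
```

Using several random initializations (n_init) is a common way to reduce the sensitivity to the initial centers noted on the next slide.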
K-Means Clustering
(Unsupervised)
● Use Cases:
○ Market segmentation based on
customer behavior and purchasing
patterns.
○ Organizing computing clusters for
more efficient data processing.
● Challenges:
○ Determining the optimal number of
clusters (K).
○ Sensitivity to the initial selection of
cluster centers.
Importance of Model Evaluation
● Purpose of Model Evaluation: To
assess the performance of a machine
learning model in making accurate
predictions or decisions based on
new, unseen data.
● Key Aspects:
○ Generalization: The ability of a
model to perform well on new,
unseen data, not just the data on
which it was trained.
○ Bias-Variance Tradeoff: Balancing the error introduced by bias (an overly simple model) against the error introduced by variance (an overly complex model).
Overfitting vs. Underfitting
● Overfitting:
○ Occurs when a model learns the
training data too well, capturing
noise along with the underlying
pattern.
○ Leads to poor generalization to
new data.
○ Signs include much lower error on
training data compared to test
data.
○ Prevention strategies: Simplify the model, use more training data, or apply regularization.
Overfitting vs. Underfitting
● Underfitting:
○ Happens when a model is too
simple to capture the underlying
structure of the data adequately.
○ Results in poor performance on
both training and testing data.
○ Prevention strategies: Increase
model complexity, add more
features, or decrease
regularization.
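To illustrate the train/test gap described above, the sketch below compares an unconstrained decision tree with a depth-limited one; the dataset and depth values are arbitrary choices for demonstration.

```python
# Spotting overfitting: compare training vs. test accuracy (illustrative).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for depth in (None, 3):   # None = grow until leaves are pure (prone to overfitting)
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_tr, y_tr)
    print(f"max_depth={depth}: "
          f"train={tree.score(X_tr, y_tr):.3f}, test={tree.score(X_te, y_te):.3f}")
```

A large gap between training and test accuracy is the overfitting signal described on the previous slide.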
Cross-Validation
● Cross-Validation:
○ A method to evaluate the model’s
ability to generalize to an
independent dataset.
○ Common techniques include k-fold
cross-validation, where the data is
divided into k subsets, and the
model is trained and tested k
times, using each subset as the
test set once.
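A minimal k-fold cross-validation sketch with k = 5, assuming scikit-learn.

```python
# 5-fold cross-validation: train/test 5 times, each fold serving as the test set once (illustrative).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("fold accuracies:", scores)
print("mean accuracy:", scores.mean())
```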
Performance Metrics
● Classification Metrics:
○ Accuracy: The proportion of true
results (both true positives and
true negatives) among the total
number of cases examined.
○ Precision and Recall: Precision
measures the accuracy of positive
predictions, and recall measures
the ability of the model to find all
the positive samples.
○ F1 Score: The harmonic mean of precision and recall, providing a single score that balances the two.
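A minimal sketch of these classification metrics, assuming scikit-learn; the label vectors are invented.

```python
# Classification metrics on hand-made predictions (illustrative).
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # actual labels
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # model predictions

print("accuracy: ", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))   # correct among predicted positives
print("recall:   ", recall_score(y_true, y_pred))      # found among actual positives
print("F1 score: ", f1_score(y_true, y_pred))          # harmonic mean of precision and recall
```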
Performance Metrics
● Regression Metrics:
○ Mean Squared Error (MSE): The
average of the squares of the
errors between the actual and
predicted values.
○ Root Mean Squared Error (RMSE):
The square root of MSE, providing
error in the same units as the data.
○ R-squared (R²): Measures the
proportion of the variance in the
dependent variable that is
predictable from the independent variables.
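A minimal sketch of these regression metrics, assuming scikit-learn and NumPy; the values are invented.

```python
# Regression metrics on hand-made predictions (illustrative).
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

y_true = np.array([3.0, 5.0, 7.5, 10.0])
y_pred = np.array([2.8, 5.4, 7.0, 10.5])

mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)                         # same units as the target variable
print("MSE: ", mse)
print("RMSE:", rmse)
print("R²:  ", r2_score(y_true, y_pred))
```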
Video
