
1-Introduction

EECS 836 is a machine learning course taught by Assistant Professor Zijun Yao at the University of Kansas, covering fundamental concepts, deep neural networks, and real-world applications. The course includes lectures, assignments, a team project, and exams, with a grading policy based on attendance, assignments, and exams. Prerequisites include basic algorithms and data structures in Python, as well as linear algebra, probability, and statistics.


EECS 836: Machine Learning

Zijun Yao
Assistant Professor, EECS Department
The University of Kansas
Class and Office Hour

• Instructor: Zijun Yao, Eaton Hall 2048 (Office)

• Class: 4:00pm - 5:15pm Monday/Friday, LEEP2 2300

• Office hours: 12:00pm - 1:00pm Monday, or by appointment

• E-mail: zyao@ku.edu
• Recommended Subject: EECS836 <Your Last Name> <Brief headline>

• Course web: https://canvas.ku.edu/

Course Coverage
• Fundamental concepts in machine learning
• Deep neural networks (MLP, CNN, RNN, Transformer, GenAI)
• How to train deep learning models with popular deep learning frameworks such as PyTorch
• How machine learning is used in real-world applications
  • Hands-on (e.g., computer vision, natural language processing, recommender systems)
  • Analyzing business cases

Syllabus and Course Schedule
• The course schedule is available under the Syllabus section on Canvas
• Lecture slides and tutorials will be posted under the Lecture Slides section on Canvas
• Assignments, exams, and other coursework will be posted under the Assignments section on Canvas
• You should check Announcements frequently to stay updated.

Course Components
• Lectures on Monday and Friday
  • Show up and actively engage
• Slides
  • Slides will be posted before lecture
• Readings
  • Read material before the class
• Assignments and exams
  • Submit assignments on time.
• Project
  • Apply ML algorithms on real-world data and present the results

Grading Policy
1. Attendance 4%
2. Assignments 26%
3. Project 20%
4. Exam I 25%
5. Exam II 25%
Total 100%

• Attendance: Checked 4 times at random
• Assignments: Late submissions will receive less credit
• Exams: There will be no make-up exams
• Team project: a team with at least 3 students, up to 4; proposal and report will be required.
• The final grade is based on a curve
• Active class participation earns up to 3% credit
• Academic integrity: Do NOT cheat on any homework or exam. Highly identical answers will require explanation.

Project
1. Group. Up to 4 students.
2. Proposal. Maximum of 8 PPT pages covering the project proposal, dataset, problem definition, data processing, models, and (optionally) expected outcomes.
3. Report. Maximum 15-page report (single-spaced, 12-point font) consisting of introduction, motivation, method, results, and conclusion sections.
4. Project tutorial. Will be available in upcoming lectures.

Project
• Topic: anything you’re interested in (sentiment analysis, recommendation systems, healthcare, Covid-19, finance, security)
  • Search the Internet for ideas
  • Kaggle has rich resources including ideas, data, and code: https://www.kaggle.com/
• Define input-output (e.g., input: an image; output: “Dog”)
• Scope: not too broad, not too narrow (e.g., training a linear classifier on a standard dataset)
• Evaluation metric: quantitative measure or insights

Will discuss more in project tutorial
Course Prerequisites
• Algorithm skills - basic algorithms and data structures in Python (EECS 168 Programming I or equivalent)
• Math skills - linear algebra, probability, and statistics (MATH 526 Applied Mathematical Statistics I or equivalent)
• Programming with Python

Textbooks (Optional)
• Use the following books as primary references, available at:
  • https://www.microsoft.com/en-us/research/people/cmbishop/prml-book/
  • https://www.deeplearningbook.org/
  • https://d2l.ai/
  • https://github.com/maxim5/cs229-2018-autumn/tree/main/notes

Prepare Yourself for the Future
• AI will create 133 million new jobs and displace 75 million old jobs worldwide (a net creation of 58 million new jobs) within the next few years, according to a World Economic Forum report
• AI could contribute up to $15 trillion to global GDP by 2030, according to PwC.
• There is an acute AI skills shortage around the world: the demand for AI jobs is measured in millions

Course Expectation
• Develop a strong vocabulary and understanding of ML
techniques
• Make informed trade-offs on what ML approaches to use
• Communicate with confidence among developers and
consultants
• Hands-on experience via a staged progression of exercises
using application data
• Exposure to various AI/ML tools
• Approach business/research problems analytically by identifying
opportunities to derive actionable insights from data
What Do You Think of AI

Robotics

https://www.youtube.com/watch?v=fn3KWM1kuAw

Autonomous cars

https://www.youtube.com/watch?v=tlThdr3O5Qo
Alpha Go

AlphaGo - The Movie | Full Documentary


https://www.youtube.com/watch?v=WXuK6gekU1Y
ChatGPT

It’s Time to Pay Attention to A.I. (ChatGPT and Beyond)


https://www.youtube.com/watch?v=0uQqMxXoNVs
Stable Diffusion

What is AI?
The science of making machines (or computers) that:
• Think like people
• Think rationally
• Act like people
• Act rationally

AI - Turing test (1950)

Turing test: A computer can be said to be intelligent, or “think”, if a human judge cannot tell whether he/she is interacting with a human or a machine.

AI - Turing Test
The computer needs the following capabilities to pass the test:
• Natural language processing
• Knowledge representation
• Automated reasoning
• Machine learning
• Computer vision
• Robotics

Machine Learning: Since 1990s
• Machine learning
  • Support Vector Machine (1995)
• Graphical models
  • Bayesian Network
  • Topic Modeling (2002)
• Chess-playing case
  • Deep Blue beats Kasparov (1997)

[Figure: History of AI]

Deep Learning: Since 2010s
• Two conditions boost AI
  • Big data
  • Computing power
• Deep learning (computer vision, NLP, robotics)
  • 2012, AlexNet competed in the ImageNet Large Scale Visual Recognition Challenge
    • Demonstrated the power of deep learning
    • Accurate training of DNNs with GPUs
  • 2014, generative adversarial network (GAN)
  • 2016, AlphaGo defeated 18-time world champion Lee Sedol at the game of Go
  • 2020, AlphaFold, a breakthrough AI solution to a 50-year-old grand challenge in biology
  • 2022, ChatGPT, the fastest-growing consumer software application in history
  • There are more: Midjourney, DALL-E, Stable Diffusion …

AI vs. ML vs. Deep Learning
• Artificial Intelligence: Enable a machine to mimic human cognitive functions such as learning and problem-solving.
• Machine Learning: Allow a machine to use algorithms to learn automatically from past data without being explicitly programmed (a major application of AI).
• Deep Learning: Use a layered structure of algorithms called an artificial neural network (ANN) (a major technique of ML).

Machine Learning is Everywhere
• Speech technologies (e.g. Siri)
• Automatic speech recognition (ASR)
• Text-to-speech synthesis (TTS)
• Dialog systems

• Language processing technologies


• Question answering
• Machine translation

• Web search

• Text classification, spam filtering, etc…

Computer vision
• Robotic surgery and medical diagnosis
• Intelligent surveillance
• Image and video searching
• Self-driving cars

Tools for Predictions & Decisions

More pervasive than you think

Reasons for Tremendous Advances in ML?
• Big data
  • ImageNet has 14 million hand-annotated images
  • Text data available on the Internet, e.g., Wikipedia
  • ……
• Machine (deep) learning models
  • AlexNet, Residual Nets, GANs, Attention, BERT
• Computing power
  • GPUs
  • Deep learning frameworks, e.g., PyTorch, TensorFlow

AlexNet in ImageNet Challenge (2012)

• Data: ImageNet, 14 million hand-annotated images
• Deep learning models: AlexNet
• Computing power: GPUs

• Won the competition by a large margin: 15.3% vs. 26.2% (second place) error rates
• Demonstrated the power of deep learning
• Accurate training of deep neural networks with GPUs

Progress in Image Recognition

Machine Learning
• Machine learning aims to build a mathematical (statistical) model based on sample data, known as "training data", to make predictions or decisions
• Machine learning is the process that powers many of the services we use today
  • Recommendation systems like those on Netflix, YouTube, and Spotify;
  • Search engines like Google;
  • Social-media feeds like Facebook and Twitter;
  • Voice assistants like Siri and Alexa

Machine Learning: Speech Recognition Example

[Figure: You write the program for learning. It learns from a large amount of audio data (“Hi”, “How are you”, “Good bye”, ……) and then recognizes new audio: You said “Hello”.]

Machine Learning : Image Recognition Example

[Figure: You write the program for learning. It learns from a large amount of images (“monkey”, “cat”, “dog”, ……) and then recognizes a new image: This is “cat”.]

Machine Learning ≈ Look for Function
• Speech Recognition: f(audio) = “How are you”
• Image Recognition: f(image) = “Cat”
• Playing Go: f(board) = “5-5” (next move)
• Dialogue System: f(“How are you?”) = “I am fine.” (what the user said → system response)

Image Recognition:
Start with a Project
f(image) = “cat”

Model: a set of functions $f_1, f_2, \dots$ (different parameters give different functions)
• $f_1$(image 1) = “cat”, $f_2$(image 1) = “monkey”
• $f_1$(image 2) = “dog”, $f_2$(image 2) = “snake”

Image Recognition:
Framework
f(image) = “cat”

Training (supervised learning)
• Step 1: Model, a set of functions $f_1, f_2, \dots$
• Step 2: Goodness of function $f$, measured on the training data (for example, $f_1$ is better than $f_2$)
  • Training data: function input: images; function output: “monkey”, “cat”, “dog”
• Step 3: Find the “best” function $f^*$

Testing
• Using $f^*$: $f^*$(new image) = “cat”
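
The training steps and the testing step of this framework can be sketched in plain Python. Everything below is a hypothetical illustration: the hand-made feature dictionaries standing in for images, the two candidate functions `f1` and `f2`, and the use of accuracy as the "goodness" measure are assumptions, not part of the slides.

```python
# Toy "images" described by hand-made features (an assumption for illustration).
training_data = [
    ({"meows": False, "barks": False}, "monkey"),
    ({"meows": True, "barks": False}, "cat"),
    ({"meows": False, "barks": True}, "dog"),
]

# Step 1: define a set of candidate functions (the "model").
def f1(x):
    if x["meows"]:
        return "cat"
    if x["barks"]:
        return "dog"
    return "monkey"

def f2(x):
    return "monkey"  # a poor candidate that ignores its input

candidates = [f1, f2]

# Step 2: goodness of a function = fraction of training examples labeled correctly.
def goodness(f, data):
    correct = sum(1 for x, y in data if f(x) == y)
    return correct / len(data)

# Step 3: pick the best function f*.
f_star = max(candidates, key=lambda f: goodness(f, training_data))

# Testing: apply f* to a new input.
new_input = {"meows": True, "barks": False}
prediction = f_star(new_input)
print(f_star.__name__, goodness(f_star, training_data), prediction)
```

In practice the set of candidate functions is far too large to list by hand, which is why Step 3 relies on optimization rather than enumeration.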
Step 0 - Problem Formulation
Step 0: What kind of function do you want to find? (Application oriented)
• Different data
• Different tasks
Step 1: define a set of functions
Step 2: goodness of function
Step 3: pick the best function
Just like the three steps to put an elephant into the fridge……

(Big) Data is Everywhere…
• [Company logo] processed over 20 petabytes of data per day
• Twitter now sends and receives as many as 500 million “tweets” every day.
• As of January 2013, Facebook users had uploaded over 240 billion photos, with 350 million new photos every day.
• S3: 449B objects, peak 290k requests/second (7/2011); 1T objects (6/2012)
• [Company logo] transfers about 30 petabytes of data through its networks each day.
• [Company logo] 150 PB on 50k+ servers running 15k apps (6/2011)

What is Data?
• Collection of data objects and their attributes
• An attribute is a property or characteristic of an object
  • Examples: eye color of a person, temperature, etc.
  • Attribute is also known as variable, field, characteristic, or feature
• A collection of attributes describes an object
  • Object is also known as record, point, case, sample, entity, or instance

Example (rows are objects, columns are attributes):

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes
Attributes
• Attribute (or dimensions, features, variables):
• a data field, representing a characteristic or feature of a data object.
• E.g., customer_ID, name, address

• Types:
• Nominal
• Binary
• Ordinal
• Numeric

Attribute Values

• Attribute values are numbers or symbols assigned to an attribute for a particular object
• Distinction between attributes and attribute values
  • Same attribute can be mapped to different attribute values
    • Example: height can be measured in feet or meters
  • Different attributes can be mapped to the same set of values
    • Example: Attribute values for ID and age are integers
    • But properties of attribute values can be different

Attribute Types
• Nominal: used for naming or labelling variables, without any quantitative value
• Categories, states, or “names of things”
• Hair_color = {auburn, black, blond, brown, grey, red, white}
• marital status, occupation, ID numbers, zip codes
• Binary
• Nominal attribute with only 2 states (0 and 1)
• e.g., medical test (positive vs. negative)
• Ordinal
• Values have a meaningful order (ranking) but magnitude between successive values is not
known.
• Size = {small, medium, large}, grades = {A, B, C, D, F}
• Numeric
• Continuous: real numbers such as speed
• Discrete: can only take certain values
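
As a rough illustration of how the four attribute types differ in the operations they support, here is a minimal Python sketch. The variable names and sample values (`speeds_kmh`, `children_counts`, and so on) are made up for the example; only the hair colors, sizes, and the medical-test case echo the slide.

```python
# Nominal: names only; equality is the only meaningful comparison.
hair_color = "brown"
is_same_color = hair_color == "black"

# Binary: a nominal attribute with exactly two states.
medical_test_positive = True

# Ordinal: values have an order, but the gap between them is unknown.
SIZE_ORDER = ["small", "medium", "large"]

def size_rank(size):
    """Return the position of a size in the ordering (0 = smallest)."""
    return SIZE_ORDER.index(size)

medium_before_large = size_rank("medium") < size_rank("large")

# Numeric: arithmetic such as averaging is meaningful.
speeds_kmh = [42.5, 61.0, 88.3]   # continuous (hypothetical values)
children_counts = [0, 2, 1]       # discrete (hypothetical values)
mean_speed = sum(speeds_kmh) / len(speeds_kmh)

print(is_same_color, medium_before_large, round(mean_speed, 2))
```

Note that averaging the ranks of an ordinal attribute would silently assume equal spacing between levels, which is exactly what an ordinal scale does not guarantee.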
Types of Data Sets

• Record
  • Relational records
  • Data matrix, e.g., numerical matrix, crosstabs
  • Document data: text documents: term-frequency vector
  • Transaction data
• Graph and network
  • World Wide Web
  • Social or information networks
  • Molecular Structures
• Ordered
  • Video data: sequence of images
  • Temporal data: time-series
  • Sequential Data: transaction sequences
  • Genetic sequence data
• Spatial, image and multimedia:
  • Spatial data: maps
  • Image data:
  • Video data:

Example of document data (term-frequency vectors):

            team  coach  play  ball  score  game  win  lost  timeout  season
Document 1  3     0      5     0     2      6     0    2     0        2
Document 2  0     7      0     2     1      0     0    3     0        0
Document 3  0     1      0     0     1      2     2    0     3        0

Record Data
• Data that consists of a collection of records, each of which consists of a fixed set of attributes

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

Graph Data
• Examples: social networks, a generic graph, a molecule, and webpages

[Figures: a generic graph; social networks; part of the Internet; benzene molecule (C6H6)]


Ordered (Sequential) Data

[Figure: time series]

Human language:
“Natural language processing is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data.”

Spatial Data

map images

Data Matrix
• If data objects have the same fixed set of numeric attributes, then the data objects can be thought of as points in a multi-dimensional space, where each dimension represents a distinct attribute
• Such a data set can be represented by an $m \times n$ matrix, where there are $m$ rows, one for each object, and $n$ columns, one for each attribute

Projection of x load  Projection of y load  Distance  Load  Thickness
10.23                 5.27                  15.22     2.7   1.2
12.65                 6.25                  16.22     2.2   1.1

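
A data matrix like the one above maps directly onto a list of rows. The sketch below is only an illustration in plain Python; the snake_case attribute names and the `euclidean_distance` helper are assumptions added for the example.

```python
# Each row is one object, each column one numeric attribute.
attribute_names = ["projection_of_x_load", "projection_of_y_load",
                   "distance", "load", "thickness"]
data_matrix = [
    [10.23, 5.27, 15.22, 2.7, 1.2],
    [12.65, 6.25, 16.22, 2.2, 1.1],
]

m = len(data_matrix)      # number of objects (rows)
n = len(data_matrix[0])   # number of attributes (columns)

# Because every object is a point in n-dimensional space,
# the distance between two objects is well defined.
def euclidean_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

# Read one attribute (column) across all objects.
thickness_index = attribute_names.index("thickness")
thickness_column = [row[thickness_index] for row in data_matrix]

print((m, n), thickness_column, euclidean_distance(data_matrix[0], data_matrix[1]))
```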
Image Data

Image Data: Color Images

Document Data

• Each document becomes a `term' vector,
  • each term is a component (attribute) of the vector,
  • the value of each component is the number of times the corresponding term occurs in the document.

D1 = "I really, really, like math"
D2 = "I hate math"

    I  really  like  hate  math
D1  1  2       1     0     1
D2  1  0       0     1     1

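
The term vectors for D1 and D2 can be reproduced with a few lines of Python. This is a minimal sketch: the regular-expression tokenizer and the fixed `vocabulary` list are simplifying assumptions, and real pipelines usually also normalize case and handle much larger vocabularies.

```python
import re
from collections import Counter

documents = {
    "D1": "I really, really, like math",
    "D2": "I hate math",
}
vocabulary = ["I", "really", "like", "hate", "math"]

def term_vector(text, vocab):
    """Count how often each vocabulary term occurs in the text."""
    tokens = re.findall(r"[A-Za-z]+", text)   # drop punctuation, keep words
    counts = Counter(tokens)
    return [counts[term] for term in vocab]

vectors = {name: term_vector(text, vocabulary) for name, text in documents.items()}
print(vectors)
```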
Task: Regression

Regression: the output of the target function $f$ is a “scalar”.

Predict PM2.5:
f(PM2.5 today, PM2.5 yesterday, …….) = PM2.5 tomorrow (scalar)

Training Data:
• Input: 9/01 PM2.5 = 63, 9/02 PM2.5 = 65 → Output: 9/03 PM2.5 = 100
• Input: 9/12 PM2.5 = 30, 9/13 PM2.5 = 25 → Output: 9/14 PM2.5 = 20

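
One way to turn such daily readings into regression training pairs is a sliding window over the series. The sketch below uses the PM2.5 values from the slide; the helper name `make_examples` and the window length of 2 are assumptions chosen to match the two-input examples.

```python
# PM2.5 readings from the slide, in date order.
series_a = [63, 65, 100]   # 9/01, 9/02, 9/03
series_b = [30, 25, 20]    # 9/12, 9/13, 9/14

def make_examples(series, window=2):
    """Turn a series into (input window, next value) training pairs."""
    examples = []
    for i in range(len(series) - window):
        inputs = series[i:i + window]
        target = series[i + window]   # the scalar the function should output
        examples.append((inputs, target))
    return examples

training_data = make_examples(series_a) + make_examples(series_b)
print(training_data)
```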
Task: Classification
• Given a collection of records (training set)
• Each record is characterized by a tuple (x, y), where x is the attribute
set and y is the class label
• x: attribute, predictor, independent variable, input
• y: class, response, dependent variable, output

• Task:
– Learn a model that maps each attribute set x into one of the predefined
class labels y

Task: Classification

• Binary Classification: Input → Function f → Yes or No
• Multi-class Classification: Input → Function f → Class 1, Class 2, … Class N

Binary Classification

Spam filtering: input → Function → Yes/No
• Training Data: examples labeled Yes or No

(http://spam-filter-review.toptenreviews.com/)

Multi-Class Classification

Image Recognition: image → Function → “monkey”, “cat”, or “dog”
• Each possible object is a class
• Training Data: images labeled “monkey”, “cat”, “dog”

Multi-Class Classification

Playing Go: board → Function → next move (a position on the board)
• Each position is a class (19 x 19 classes)

Step 1 – Function with Unknown Parameters
The function we want to find …
$y = f(\cdot)$: return on Monday?

Function with Unknown Parameters
Model $y = b + w x_1$ (based on domain knowledge)
• $y$: close price on Jan 27
• $x_1$: close price on Jan 24 (feature)
• $w$ (weight) and $b$ (bias) are unknown parameters (learned from data)

Classification: Handwritten Digit Recognition

Neural Network
Neuron: a simple function
$z = a_1 w_1 + \cdots + a_k w_k + \cdots + a_K w_K + b$
output $= \sigma(z)$
• $a_1, \dots, a_K$: inputs
• $w_1, \dots, w_K$: weights
• $b$: bias
• $\sigma$: activation function

Weights and biases are called network parameters


Neural Network

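Neuron with the sigmoid function $\sigma(z)$ as activation function:
$\sigma(z) = \frac{1}{1 + e^{-z}}$

Example: inputs 2, −1, 1; weights 1, −2, −1; bias 1
$z = 2 \cdot 1 + (-1) \cdot (-2) + 1 \cdot (-1) + 1 = 4$, $\sigma(4) \approx 0.98$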
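
A single sigmoid neuron is short enough to write out directly. The sketch below is an illustration rather than the course's code; it uses the example values as best they can be read from the slide (inputs 2, −1, 1; weights 1, −2, −1; bias 1), and the function names are assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias, passed through the sigmoid."""
    z = sum(a * w for a, w in zip(inputs, weights)) + bias
    return z, sigmoid(z)

z, output = neuron(inputs=[2, -1, 1], weights=[1, -2, -1], bias=1)
print(z, round(output, 2))
```

A layer is many such neurons sharing the same inputs, and a network stacks layers so that the outputs of one layer become the inputs of the next.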
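Input → Model $f$ → Output
• Input: a 16 x 16 = 256-pixel image, $x_1, x_2, \dots, x_{256}$ (Ink → 1, No ink → 0)
• Output: $y_1, y_2, \dots, y_{10}$, one per digit ($y_1$: is 1, $y_2$: is 2, ……, $y_{10}$: is 0)
• Example: $y_1 = 0.1$, $y_2 = 0.7$, ……, $y_{10} = 0.2$, so the output is “2”

Network: Input ($x_1, \dots, x_{256}$) → Layer 1 → Layer 2 → …… → Layer L → Output ($y_1, \dots, y_{10}$)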
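
The input encoding and the final "pick the largest output" step can be sketched as follows. This covers only the two ends of the network, not the layers in between; the helper names are assumptions, and the output values other than those for $y_1$, $y_2$, and $y_{10}$ are made-up placeholders.

```python
DIGIT_LABELS = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]   # y1 "is 1", ..., y10 "is 0"

def flatten_image(image):
    """Turn a 16 x 16 grid of 0/1 pixels into a 256-dimensional input vector."""
    return [pixel for row in image for pixel in row]

def decode(outputs):
    """Return the digit whose output value is largest."""
    best_index = max(range(len(outputs)), key=lambda i: outputs[i])
    return DIGIT_LABELS[best_index]

blank_image = [[0] * 16 for _ in range(16)]      # "no ink" everywhere
x = flatten_image(blank_image)

# y1, y2 and y10 are taken from the slide; the zeros are placeholders.
y = [0.1, 0.7, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2]
print(len(x), decode(y))
```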
Step 2 - Measure Error
Supervised Learning
Speech Recognition: x (audio) → y: “How are you”
• Supervised training data: (x1, y1: Hello), (x2, y2: Good), (x3, y3: I am fine)

Image Recognition: x (image) → y: “cat”
• Supervised training data: (x1, y1: monkey), (x2, y2: cat), (x3, y3: dog)



Define Loss from Training Data
➢ Loss is a function of parameters: $L(b, w)$
➢ Loss: how good a set of values is.

Example: $L(-5, 1)$, that is, the model $y = b + w x_1$ with $y = -5 + 1 x_1$. How good is it?

Data in 2024:
01/02/2024  01/03   01/04   ……  12/30/2024  12/31
248.42      238.45  237.93  ……  417.41      403.84

• $x_1 = 248.42$: $\hat{y} = -5 + 1 x_1 = 243.42$, label $y = 238.45$, $e_1 = |y - \hat{y}| = 4.97$
• $x_1 = 238.45$: $\hat{y} = -5 + 1 x_1 = 233.45$, label $y = 237.93$, $e_2 = |y - \hat{y}| = 4.48$
• ……
• $x_1 = 417.41$: $\hat{y} = -5 + 1 x_1$, label $y = 403.84$, $e_N$

Loss: $L = \frac{1}{N} \sum_n e_n$
• $e = |y - \hat{y}|$: $L$ is mean absolute error (MAE)
• $e = (y - \hat{y})^2$: $L$ is mean square error (MSE)
• If $y$ and $\hat{y}$ are both probability distributions: cross-entropy

Model $y = b + w x_1$
[Figure: Error Surface, the loss $L$ plotted over $w$ (horizontal axis) and $b$ (vertical axis), showing regions of small $L$ and large $L$]
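
The loss computation can be checked numerically. The sketch below uses only the first three prices shown on the slide, treating each day's price as the feature and the next day's price as the label; the function names and the small grid of $(b, w)$ values used to mimic an error surface are assumptions for illustration.

```python
# Consecutive prices from the slide: each day's price is the feature x1,
# the next day's price is the label y.
prices = [248.42, 238.45, 237.93]
pairs = list(zip(prices[:-1], prices[1:]))   # (x1, y)

def predict(x1, b, w):
    return b + w * x1

def mean_absolute_error(b, w, data):
    return sum(abs(y - predict(x1, b, w)) for x1, y in data) / len(data)

def mean_squared_error(b, w, data):
    return sum((y - predict(x1, b, w)) ** 2 for x1, y in data) / len(data)

# Per-example errors and losses for the parameters (b, w) = (-5, 1).
errors = [abs(y - predict(x1, -5, 1)) for x1, y in pairs]
mae = mean_absolute_error(-5, 1, pairs)
mse = mean_squared_error(-5, 1, pairs)
print([round(e, 2) for e in errors], round(mae, 3), round(mse, 3))

# A coarse error surface: the loss evaluated on a small grid of (b, w) values.
b_values = [-10, -5, 0, 5, 10]
w_values = [0.9, 1.0, 1.1]
surface = {(b, w): mean_absolute_error(b, w, pairs)
           for b in b_values for w in w_values}
best_b, best_w = min(surface, key=surface.get)
print(best_b, best_w, round(surface[(best_b, best_w)], 3))
```

Evaluating a grid like this is feasible for two parameters but not for a network with millions of them, which motivates the optimization step.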
Step 3 - Optimization
Find the Best Function through Optimization

• Enumerate all possible values
• Network parameters $\theta = \{w_1, w_2, w_3, \cdots, b_1, b_2, b_3, \cdots\}$
• Layer $l$ (1000 neurons) connected to layer $l+1$ (1000 neurons): $10^6$ weights
• Millions of parameters
• Today a network can have more than 100B parameters.

Source of image: http://chico386.pixnet.net/album/photo/171572850
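
The weight count between two 1000-neuron layers, and the reason enumeration cannot work, can be verified with a little arithmetic. This sketch assumes the two layers are fully connected; the function name and the "two values per weight" thought experiment are illustrative assumptions.

```python
import math

def dense_layer_parameter_count(inputs, outputs):
    """Every input connects to every output; each output has one bias."""
    weights = inputs * outputs
    biases = outputs
    return weights, biases

weights, biases = dense_layer_parameter_count(1000, 1000)
print(weights, biases)

# Even if each weight could take only 2 values, there would be 2 ** weights
# settings to try. Count the decimal digits of that number instead of building it.
digits = int(weights * math.log10(2)) + 1
print(digits)
```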
Optimization
$w^*, b^* = \arg\min_{w,b} L$

Gradient Descent
➢ (Randomly) pick an initial value $w^0$
➢ Compute $\frac{\partial L}{\partial w}\big|_{w=w^0}$
  • Negative: increase $w$
  • Positive: decrease $w$
➢ Update $w^1 \leftarrow w^0 - \eta \frac{\partial L}{\partial w}\big|_{w=w^0}$
  • The step is $\eta \frac{\partial L}{\partial w}\big|_{w=w^0}$
  • $\eta$: learning rate (a hyperparameter)
➢ Update $w$ iteratively: $w^0, w^1, w^2, \dots, w^T$

Do local minima truly cause the problem?
[Figure: loss $L$ plotted over $w$, with the iterates $w^0, w^1, w^2, \dots, w^T$, a local minimum, and the global minimum]
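
The update rule can be tried on a one-parameter toy problem. The quadratic loss below is a hypothetical stand-in chosen so that the minimum ($w = 3$) is known in advance; the starting point, learning rate, and number of steps are likewise assumptions.

```python
def loss(w):
    return (w - 3.0) ** 2     # hypothetical loss with its minimum at w = 3

def gradient(w):
    return 2.0 * (w - 3.0)    # dL/dw for the loss above

def gradient_descent(w0, learning_rate, steps):
    """Repeat w <- w - eta * dL/dw and return every w visited."""
    w = w0
    history = [w]
    for _ in range(steps):
        w = w - learning_rate * gradient(w)
        history.append(w)
    return history

history = gradient_descent(w0=0.0, learning_rate=0.1, steps=100)
print(history[0], history[1], round(history[-1], 6))
```

At $w^0 = 0$ the gradient is negative, so the first update increases $w$, matching the rule on the slide. A learning rate that is too large would make the iterates overshoot instead of settling.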
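Optimization
$w^*, b^* = \arg\min_{w,b} L$

➢ (Randomly) pick initial values $w^0, b^0$
➢ Compute
  • $\frac{\partial L}{\partial w}\big|_{w=w^0, b=b^0}$, then $w^1 \leftarrow w^0 - \eta \frac{\partial L}{\partial w}\big|_{w=w^0, b=b^0}$
  • $\frac{\partial L}{\partial b}\big|_{w=w^0, b=b^0}$, then $b^1 \leftarrow b^0 - \eta \frac{\partial L}{\partial b}\big|_{w=w^0, b=b^0}$
  • Can be done in one line in most deep learning frameworks
➢ Update $w$ and $b$ iteratively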
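
For the linear model $y = b + w x_1$ with a mean-square-error loss, both partial derivatives can be written by hand. The sketch below is an illustration under stated assumptions: the synthetic data generated from $y = 1 + 2x$, the zero starting point (instead of a random one, for reproducibility), and the learning rate are not from the slides. Deep learning frameworks compute these gradients automatically.

```python
def gradients(w, b, data):
    """Partial derivatives of the mean squared error for y = b + w * x."""
    n = len(data)
    dL_dw = sum(2 * ((b + w * x) - y) * x for x, y in data) / n
    dL_db = sum(2 * ((b + w * x) - y) for x, y in data) / n
    return dL_dw, dL_db

def fit(data, learning_rate=0.05, steps=2000):
    w, b = 0.0, 0.0   # initial values w0, b0
    for _ in range(steps):
        # Both gradients are evaluated at the same (w, b) before either update.
        dL_dw, dL_db = gradients(w, b, data)
        w = w - learning_rate * dL_dw
        b = b - learning_rate * dL_db
    return w, b

# Synthetic data generated from y = 1 + 2x (an assumption, not course data).
data = [(0, 1), (1, 3), (2, 5), (3, 7), (4, 9)]
w_star, b_star = fit(data)
print(round(w_star, 4), round(b_star, 4))
```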
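Model $y = b + w x_1$
Optimization
$w^*, b^* = \arg\min_{w,b} L$

• At each step, compute $\partial L / \partial w$ and $\partial L / \partial b$, then move $(w, b)$ by $(-\eta\, \partial L / \partial w, -\eta\, \partial L / \partial b)$
• Result: $w^* = 0.97$, $b^* = 0.1k$, $L(w^*, b^*) = 0.48k$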
Optimization and Gradient Descent

Summary: Steps in Machine Learning

Step 0: What kind of function do you want to find?
• Know your data: image, text, social networks, reviews ……
• Know your task: Regression, Classification, Generation ……

Step 1: define a set of functions
• Deep Learning, SVM, Decision Tree, ……
Step 2: goodness of function
• Supervised, Reinforcement, ……
Step 3: find the best function
• Optimization, Gradient Descent, ……
