
Multivariable Calculus and its Applications in
Artificial Intelligence

Chong How, NUS High School
July 2020

1 Abstract
Mathematics plays an important role in modern science and technology. From reaction kinetics and thermodynamics to electric circuits and computer networks, mathematics is found almost everywhere. In particular, mathematics drives the progress of science and technology, making advancements possible; in turn, the new problems that arise as new scientific domains open up often spur the development of new mathematical tools to analyse them. We will look at how Multivariable Calculus relates to Artificial Intelligence, in particular to Neural Networks and Regression.

2 Introduction
“Mathematics is the queen of the sciences” - Carl Friedrich Gauss

Artificial Intelligence (AI) is closely linked to Mathematics. Why? Some keywords associated with AI are “machine learning”, “algorithms”, “neural networks” and “graph theory”. These domains are heavy in mathematics. Some of the related mathematical tools are Linear Algebra, Multivariable Calculus and Probability. We shall discuss Multivariable Calculus.

3 Multivariable Calculus
3.1 Coordinates and Functions
In secondary school, the 2-D Cartesian coordinate system (or xy coordinate system) is introduced. A function y = f(x) is a function in x. For example, the function y = x² − 1 is a quadratic function that cuts the x-axis at x = 1 and x = −1, and the y-axis at y = −1. In Euclidean 3-D space, there is another axis, the z-axis, which is orthogonal to the x- and y-axes, with its positive direction determined by the right-hand rule.

A function z = f(x, y) is a function in x and y: the value of z depends on both x and y. For example, the function z = f(x, y) = x² + y² is an elliptic paraboloid passing through (0, 0, 0).

3.2 Differentiation
Suppose y = f(x). The derivative dy/dx is defined to be

\[ \frac{dy}{dx} = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}. \]

For example, d/dx (2x³) = 6x².

In 3-D coordinates, suppose z = f(x, y). Then the partial derivative fx(x, y) is defined to be

\[ f_x(x, y) = \lim_{h \to 0} \frac{f(x+h, y) - f(x, y)}{h} = \frac{\partial z}{\partial x}, \]

and the partial derivative fy(x, y) is defined to be

\[ f_y(x, y) = \lim_{h \to 0} \frac{f(x, y+h) - f(x, y)}{h} = \frac{\partial z}{\partial y}. \]

What does this mean? Construct a plane y = a that cuts through the surface z = f(x, y). Then fx measures the rate of change of f on that plane in the direction of the positive x-axis. fy is interpreted the same way, with a plane x = a.

Example 1: Let f(x, y) = x² + y². Then fx(x, y) = 2x and fy(x, y) = 2y.

Example 2: Let f(x, y) = x²y². Then fx(x, y) = 2xy² and fy(x, y) = 2x²y.
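These examples can be sanity-checked numerically: a central finite difference approximates each partial derivative by nudging one variable while holding the other fixed. A short Python sketch (the helper names are ours, not from the text):

```python
def partial_x(f, x, y, h=1e-6):
    """Central-difference estimate of f_x(x, y): vary x, hold y fixed."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    """Central-difference estimate of f_y(x, y): vary y, hold x fixed."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

f = lambda x, y: x ** 2 * y ** 2   # Example 2

# Exact values: f_x(3, 2) = 2 * 3 * 2**2 = 24 and f_y(3, 2) = 2 * 2 * 3**2 = 36
print(partial_x(f, 3, 2))  # ≈ 24
print(partial_y(f, 3, 2))  # ≈ 36
```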

3.3 Gradient Vector and Directional Derivatives
Let f be a function of x and y. Then the gradient of f is the vector function

\[ \nabla f(x, y) = \langle f_x(x, y), f_y(x, y) \rangle, \]

and the directional derivative of f at (x₀, y₀) in the direction of a unit vector u = ⟨a, b⟩ is

\[ D_u f(x_0, y_0) = \lim_{h \to 0} \frac{f(x_0 + ha, y_0 + hb) - f(x_0, y_0)}{h}, \]

if this limit exists.

Example: Let f(x, y) = x + y. Then ∇f(x, y) = ⟨1, 1⟩.

A useful property of ∇f(x₀, y₀) is that it points in the direction of the greatest rate of change of f at the point (x₀, y₀), and that greatest rate of change is |∇f(x₀, y₀)|. This leads us to ask: What if Du f(x₀, y₀) = 0 at some point (x₀, y₀)? What happens if we travel in the direction of ∇f(x₀, y₀)?
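The steepest-ascent property can be illustrated numerically: estimate Du f at a point for many unit directions u and confirm that none beats the gradient direction. A Python sketch (the function names and the sample point are our illustrative choices):

```python
import math

def f(x, y):
    return x ** 2 + y ** 2

def directional_derivative(x0, y0, a, b, h=1e-6):
    """Forward-difference estimate of D_u f at (x0, y0) for unit vector u = (a, b)."""
    return (f(x0 + h * a, y0 + h * b) - f(x0, y0)) / h

# At (1, 2) the gradient is (2x, 2y) = (2, 4); normalise it to a unit vector.
gx, gy = 2.0, 4.0
norm = math.hypot(gx, gy)
best = directional_derivative(1, 2, gx / norm, gy / norm)
print(best)  # ≈ |∇f(1, 2)| = sqrt(20) ≈ 4.472

# No other unit direction gives a larger rate of change:
for deg in range(0, 360, 15):
    a, b = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    assert directional_derivative(1, 2, a, b) <= best + 1e-6
```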

4 Artificial Intelligence
Multivariable calculus has applications in AI development. These applications include:

Gradient Descent
Artificial Neural Networks
The expectation-maximization algorithm
Optimization problems
Finding the maximal margin in support vector machines, etc.

We will discuss the first two.

4.1 Linear Regression and Multiple Linear Regression
Regression is a method in Statistics to model the relationship between a dependent variable and one or more independent variables. In Machine Learning, a regression algorithm is a method to train a model. Simple Linear Regression predicts the dependent variable based on one independent variable. The aim of applying Linear Regression is to find a function Y = mX + c that fits the predicted values to the actual values as closely as possible. The difference between predicted and actual values is quantified using a cost (sometimes called loss) function, which could be based on the mean squared error.

Example: Suppose Y values (ȳ₁, ȳ₂, ȳ₃) = (4.4, 4.8, 4.9) are predicted from a set of X values (x₁, x₂, x₃) through the relationship Y = mX + c, while the actual Y values are (y₁, y₂, y₃) = (5.4, 5.8, 5.9). Then the mean squared error is

\[ \frac{1}{3} \sum_{i=1}^{3} (y_i - \bar{y}_i)^2 = 1. \]
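The computation above can be sketched in a few lines of Python (the function name is ours):

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / len(actual)

actual = [5.4, 5.8, 5.9]
predicted = [4.4, 4.8, 4.9]
print(mean_squared_error(actual, predicted))  # ≈ 1.0, since every residual is 1
```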

What about Multiple Linear Regression? Multiple Linear Regression predicts the dependent variable based on two or more independent variables. The relationship is often given by Y = m₁X₁ + m₂X₂ + ... + mₙXₙ + c.

4.2 Gradient Descent
Gradient descent is an iterative optimization algorithm for finding the minimum value of a function. In this context, the function is the cost function.

Imagine a man going down a mountain in thick haze contributed by neighbouring countries. He cannot see the way down, so he repeatedly moves to a nearby point that is lower. When the slope of the mountain is steep, he takes a huge step; when the slope is gentle, he takes smaller steps. Each point is determined by the previous one, and he stops this process when he can descend no further, at which point he has hopefully reached the bottom.

Now, suppose we have n data points. Define a cost function

\[ E = \frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y}_i)^2. \tag{1} \]

Since ȳᵢ = mxᵢ + c, we rewrite (1) as

\[ E = \frac{1}{n} \sum_{i=1}^{n} \bigl(y_i - (m x_i + c)\bigr)^2. \tag{2} \]

We wish to minimise E, i.e. we need to find suitable values of m and c such that E is minimised. To do so, we evaluate

\[ \frac{\partial E}{\partial m} = \frac{-2}{n} \sum_{i=1}^{n} x_i (y_i - \bar{y}_i) \tag{3} \]

and

\[ \frac{\partial E}{\partial c} = \frac{-2}{n} \sum_{i=1}^{n} (y_i - \bar{y}_i). \tag{4} \]

Can’t we just solve ∂E/∂m = 0 and ∂E/∂c = 0? This method does work if the function is not complicated. It is not feasible for complicated functions, and unfortunately most error functions are complicated; typically we also have a large data set, which makes solving ∂E/∂m = 0 and ∂E/∂c = 0 extremely difficult.

The learning rate, L, of the model controls how much we modify the model at each iteration. If the learning rate is too large, the updates may overshoot the minimum and fail to converge, while if the learning rate is too small, training may take a very long time.

To find the local minimum, we start by letting the learning rate L = 0.0001, which controls how much m and c change with each iteration. Let m₁ = c₁ = 0. We plug the values of our data points and the current m, c values into ∂E/∂m and ∂E/∂c. Then, we update our m and c values using the recurrence relations:

\[ m_{n+1} = m_n - L \times \frac{\partial E}{\partial m} \]

\[ c_{n+1} = c_n - L \times \frac{\partial E}{\partial c} \]

As more iterations are run, we eventually obtain m = lim_{n→∞} mₙ and c = lim_{n→∞} cₙ. Hopefully, we have reached the desired linear relation Y = mX + c that fits the predicted values to the actual values optimally. Why only “hopefully”? Because with this method we may have reached either a global minimum or merely a local minimum of the cost function, and we cannot tell which.
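The whole iteration can be sketched in Python; the data set, learning rate and iteration count below are illustrative choices, not taken from the essay:

```python
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]   # roughly Y = 2X + 1, with some noise

n = len(xs)
m, c = 0.0, 0.0   # initial guesses m1 = c1 = 0
L = 0.01          # learning rate

for _ in range(10000):
    preds = [m * x + c for x in xs]
    # Partial derivatives, as in equations (3) and (4)
    dE_dm = (-2 / n) * sum(x * (y - p) for x, y, p in zip(xs, ys, preds))
    dE_dc = (-2 / n) * sum(y - p for y, p in zip(ys, preds))
    # Recurrence relations: step against the gradient
    m -= L * dE_dm
    c -= L * dE_dc

print(m, c)  # converges to the least-squares fit, m ≈ 1.94 and c ≈ 1.15
```

For this cost function, E is a convex paraboloid in (m, c), so gradient descent reaches the unique global minimum, which matches the ordinary least-squares solution.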

Often, the dependent variable that we wish to predict has a relationship with more than one independent variable. The idea is the same as in Simple Linear Regression. For gradient descent in multiple variables, we treat each variable separately: hold all of the other variables constant and find the partial derivative of the function with respect to that variable.

Gradient Descent is also used in Neural Networks.

4.3 Artificial Neural Networks and Machine Learning
As the name suggests, Artificial Neural Networks are inspired by the brain. They are sometimes also called models.

Imagine we have many neurons (or nodes), organised in layers. Each neuron holds a number and a bias value. The neurons in each layer are connected to neurons in other layers, forming a neural network.

A neural network accepts one or more inputs, processes them and gives one or more outputs.

A neuron in one layer interacts with neurons in the next layer through weighted connections (each weight being a real-valued number) between pairs of neurons. A neuron in the next layer receives the values of the previous layer’s neurons, each multiplied by its connection weight. The sum of all those products, plus that neuron’s bias value, is then put into an activation function, which returns the value assigned to that neuron. In this way, information is passed through the entire network. The key is to decide on appropriate weights and bias values, which can be determined via machine learning. How?
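The forward pass just described can be sketched for a single neuron. We assume a sigmoid activation function, a common choice that the essay does not specify:

```python
import math

def sigmoid(z):
    """A common activation function, squashing any real number into (0, 1)."""
    return 1 / (1 + math.exp(-z))

def neuron_output(inputs, weights, bias):
    """Weighted sum of the previous layer's values plus this neuron's bias,
    passed through the activation function."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

# Three neurons in the previous layer feeding one neuron in the next layer:
print(neuron_output([0.5, -1.0, 2.0], [0.4, 0.3, 0.1], bias=0.2))  # ≈ 0.574
```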

Suppose we have an artificial neural network. We define a cost function

\[ C = C(W) \]

where W = (w₁, w₂, w₃, ..., wₙ) is a vector storing all of the weight and bias values: for each i, wᵢ is the value of one weight or bias in the network. C is our “error” and should be as low as possible. Clearly, we want to minimise the cost function, and this can be done by computing the partial derivative of C with respect to each of the weights and biases in the network, which initially hold arbitrary values that are subject to change. We then modify the weight and bias values accordingly using the recurrence relation

\[ W_{n+1} = W_n - L \, \nabla C(W_n), \]

where L is again the learning rate. By iterating this process, it is likely that we can reach a point C_min = lim_{n→∞} C(Wₙ) at which C is minimised, and the artificial neural network model is trained to give us our desired result.

This method is an old variant of modern Machine Learning. It builds the foundation for more advanced concepts in Machine Learning, which we will not be discussing.
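The weight-update rule can be sketched with a numerically estimated gradient. The toy cost function, learning rate and helper names below are illustrative stand-ins (a real network computes gradients analytically via backpropagation), and a learning-rate factor is included as in the earlier gradient-descent section:

```python
def grad(C, W, h=1e-6):
    """Forward-difference estimate of the gradient of C at the weight vector W."""
    g = []
    base = C(W)
    for i in range(len(W)):
        Wp = W[:]
        Wp[i] += h          # nudge one weight, hold the others fixed
        g.append((C(Wp) - base) / h)
    return g

# Toy cost whose minimum sits at W = (1, -2); a stand-in for a network's cost.
C = lambda W: (W[0] - 1) ** 2 + (W[1] + 2) ** 2

W = [0.0, 0.0]   # arbitrary initial weights
L = 0.1          # learning rate

for _ in range(200):
    # Recurrence relation: W ← W − L ∇C(W)
    W = [w - L * g for w, g in zip(W, grad(C, W))]

print(W)  # approaches the minimiser (1, -2)
```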

5 Concluding Remarks
It is truly remarkable how far technology has progressed over the last four decades. Artificial Intelligence especially holds great potential in areas such as research and new development. According to a report from the World Economic Forum (WEF), the growth of artificial intelligence could create 58 million net new jobs by the year 2022. Artificial Intelligence will define the next phase of the world’s landscape, transforming our economy and society. The prevalence of AI is likely to rise over the next few decades, and it will come to dominate our lives, bringing immense benefits but also uncertainties for the future.

Everything about AI is fascinating, yet without Mathematics, the study and development of AI would be impossible. This is yet another instance of Mathematics being inseparable from our modern world. I would like to end with the famous quote by Galileo Galilei: “In order to understand the universe you must know the language in which it is written and that language is mathematics.”
