



UNDERSTANDING GRADIENT DESCENT AND BACK-PROPAGATION
To define the relationship between gradient descent and back-propagation, we must first understand what gradient descent is. Consider a single-layered neural network: here x1 and x2 form the input layer (often called layer 0) and ŷ is the output. This is a network with a single layer of weights, which can also be described as a single-layer neural network or a single perceptron. Let us look at the expression for the predicted output ŷ: it is computed as the dot product of the weight matrix w with the input x, plus the bias b:

ŷ = w·x + b
Here w is the weight matrix and b is the bias of the network. This expression is a linear function: with just one perceptron it behaves exactly like linear regression. Since our neural network should not be limited to linear data, we apply a non-linear activation function so the network can work through the non-linearity of the problem and reach better accuracy. Let σ be the activation function; the expression for the network then becomes:

ŷ = σ(w·x + b)

Letting z = w·x + b, the expression changes to:
ŷ^(i) = σ(z^(i))

Here σ can be a ReLU, sigmoid, or tanh function. Let us now formulate a cost function (skipping the derivation): the cost averages a per-example loss L over the m training examples,

J(W) = (1/m) Σ_{i=1}^{m} L(ŷ^(i), y^(i))
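To make this concrete, here is a minimal sketch of the forward pass and cost for such a single-perceptron network. The data, weights, sigmoid activation and squared-error loss below are illustrative assumptions, not values from the document:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical training data: m = 3 examples with features x1, x2.
X = np.array([[0.2, 0.7],
              [0.9, 0.1],
              [0.4, 0.5]])
y = np.array([1.0, 0.0, 1.0])      # targets
w = np.array([0.3, -0.6])          # weight vector (assumed values)
b = 0.1                            # bias

z = X @ w + b                      # linear part: z = w.x + b
y_hat = sigmoid(z)                 # non-linear activation: y_hat = sigma(z)

# Cost J(W): average of a per-example loss L(y_hat, y); squared error assumed.
J = np.mean(0.5 * (y_hat - y) ** 2)
print(J)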
Gradient descent tries to minimise this loss by finding the minimum (ideally the global minimum) of the cost function. To do so it takes small steps away from the current weights in the direction of the negative gradient, which is why the gradient-descent expression subtracts a small quantity from the weight w:

w ← w − η · ∂J(W)/∂w

where η is the learning rate. Repeating this update moves the weights toward the minimum.
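As a toy illustration of this update rule (the quadratic loss and learning rate are assumptions chosen only to show the mechanics), a sketch of plain gradient descent:

# Toy loss J(w) = (w - 3)^2 with gradient dJ/dw = 2(w - 3);
# its minimiser is w = 3 (illustrative example, not from the document).
def grad_J(w):
    return 2.0 * (w - 3.0)

w = 0.0        # initial weight
eta = 0.1      # learning rate: scales the "small step"
for step in range(100):
    w = w - eta * grad_J(w)   # subtract a small multiple of the gradient

print(w)       # close to 3.0, the minimum of J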
Now let us understand back-propagation. Consider a small network with one input x, one hidden unit z1 and output ŷ, as in Fig. GD-02 below. The question is: if we move a weight by a very small amount, how does that affect our loss/cost function J(W)? That is exactly what the gradient measures here. Applying the chain rule:
∂J(W)/∂w2 = ∂J(W)/∂ŷ · ∂ŷ/∂w2

∂J(W)/∂w1 = ∂J(W)/∂ŷ · ∂ŷ/∂z1 · ∂z1/∂w1

Fig. GD-02: computation graph x → z1 → ŷ → J(W) (credit: MIT course 6.S191)
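The same chain rule can be checked numerically: nudge a weight by a tiny amount and see how the loss changes, which should agree with the analytic gradient. A minimal sketch for the network of Fig. GD-02, assuming z1 = w1·x, ŷ = σ(w2·z1) with a sigmoid σ and a squared-error loss (all values hypothetical):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w1, w2, x, y):
    z1 = w1 * x                    # hidden value
    y_hat = sigmoid(w2 * z1)       # output
    return 0.5 * (y_hat - y) ** 2  # squared-error loss (assumed)

x, y = 0.5, 1.0                    # hypothetical example
w1, w2 = 0.8, -0.4                 # hypothetical weights

# Analytic gradients via the chain rule (back-propagation by hand).
z1 = w1 * x
y_hat = sigmoid(w2 * z1)
dJ_dyhat = y_hat - y                       # dJ/d(y_hat)
dsig = y_hat * (1.0 - y_hat)               # sigmoid derivative at w2*z1
dJ_dw2 = dJ_dyhat * dsig * z1              # dJ/dw2 = dJ/dy_hat * dy_hat/dw2
dJ_dw1 = dJ_dyhat * dsig * w2 * x          # dJ/dw1 = dJ/dy_hat * dy_hat/dz1 * dz1/dw1

# Numerical check: move each weight a tiny amount and watch how J changes.
eps = 1e-6
num_dw1 = (loss(w1 + eps, w2, x, y) - loss(w1 - eps, w2, x, y)) / (2 * eps)
num_dw2 = (loss(w1, w2 + eps, x, y) - loss(w1, w2 - eps, x, y)) / (2 * eps)
print(dJ_dw1, num_dw1)             # should match closely
print(dJ_dw2, num_dw2)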
Since w1 sits one layer deeper, its chain also includes the partial derivative with respect to z1. We did this for the simple network above; for a large deep neural network the same chain-rule expansion is repeated recursively, layer by layer, and that recursive application of the chain rule is exactly what the expression of back-propagation is. Theoretically, then, the relationship between gradient descent and back-propagation is this: back-propagation computes the gradients ∂J(W)/∂w of the loss with respect to every weight, and gradient descent uses those gradients to update the weights and minimise the loss.
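A short training-loop sketch of this relationship, again under the assumptions of the previous snippet (sigmoid activation, squared-error loss, hypothetical data and learning rate): back-propagation produces the gradients and gradient descent consumes them.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, y = 0.5, 1.0                # hypothetical training example
w1, w2 = 0.8, -0.4             # hypothetical initial weights
eta = 0.5                      # learning rate (assumed)

for step in range(500):
    # Forward pass through the tiny network x -> z1 -> y_hat -> J(W).
    z1 = w1 * x
    y_hat = sigmoid(w2 * z1)

    # Back-propagation: chain rule gives the gradient of the loss for each weight.
    dJ_dyhat = y_hat - y
    dsig = y_hat * (1.0 - y_hat)
    dJ_dw2 = dJ_dyhat * dsig * z1
    dJ_dw1 = dJ_dyhat * dsig * w2 * x

    # Gradient descent: use those gradients to take a small step downhill.
    w2 -= eta * dJ_dw2
    w1 -= eta * dJ_dw1

# The loss shrinks as training proceeds.
print(w1, w2, 0.5 * (sigmoid(w2 * w1 * x) - y) ** 2)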