Figure: Computational graph in TensorBoard showing the components involved in a TF backprop update.

Neuron

Figure: backprop-neuron

Simple DNN 1

Figure: backprop-simple-dnn

Simple DNN 2

A network consists of a concatenation of the following layers (a framework-level sketch follows the list):
  1. Fully connected layer with input $x^{(1)}$, weights $W^{(1)}$ and output $z^{(1)}$.
  2. ReLU producing $a^{(1)}$.
  3. Fully connected layer with weights $W^{(2)}$ producing $z^{(2)}$.
  4. Softmax producing $\hat{y}$.
  5. Cross-entropy (CE) loss producing $L$.
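For orientation only, here is a hedged `tf.keras` sketch of the same stack (matching the TensorBoard/TF framing of the figure above). The layer widths `d`, `h`, `k` are illustrative assumptions, not given in the text, and the softmax and cross-entropy steps are folded into the loss via `from_logits=True`, which is how frameworks usually combine them for numerical stability.

```python
import tensorflow as tf

# Illustrative sizes (assumptions): d inputs, h hidden units, k classes.
d, h, k = 4, 5, 3

model = tf.keras.Sequential([
    tf.keras.layers.Dense(h, use_bias=False, activation="relu",
                          input_shape=(d,)),        # steps 1-2: W^(1), ReLU
    tf.keras.layers.Dense(k, use_bias=False),       # step 3: W^(2), logits z^(2)
])

# Steps 4-5 (softmax + cross-entropy) handled jointly by the loss.
model.compile(optimizer="sgd",
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True))
```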
The task of backprop consists of the following steps:
  1. Sketch the network and write down the equations for the forward path.
  2. Propagate the backward path, i.e. write down the expression for the gradient of the loss with respect to each of the network parameters.
NOTE: We have omitted the bias terms for simplicity.
Forward Pass

| Step | Symbolic Equation |
|------|-------------------|
| (1) | $z^{(1)} = W^{(1)} x^{(1)}$ |
| (2) | $a^{(1)} = \max(0, z^{(1)})$ |
| (3) | $z^{(2)} = W^{(2)} a^{(1)}$ |
| (4) | $\hat{y} = \mathtt{softmax}(z^{(2)})$ |
| (5) | $L = CE(y, \hat{y})$ |
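As a concrete illustration, here is a minimal NumPy sketch of the forward pass. The vector/matrix shapes are assumptions (the text does not fix layer sizes), and the max-subtraction and small epsilon are standard numerical-stability tricks rather than part of the symbolic equations.

```python
import numpy as np

def forward(W1, W2, x1, y):
    """Forward pass for the network above (bias terms omitted).

    Assumed shapes: x1 is (d,), W1 is (h, d), W2 is (k, h),
    y is a one-hot label vector of length k.
    """
    z1 = W1 @ x1                              # (1) fully connected layer
    a1 = np.maximum(0.0, z1)                  # (2) ReLU
    z2 = W2 @ a1                              # (3) fully connected layer
    z2 = z2 - z2.max()                        # stabilize the softmax numerically
    y_hat = np.exp(z2) / np.exp(z2).sum()     # (4) softmax
    L = -np.sum(y * np.log(y_hat + 1e-12))    # (5) cross-entropy loss
    return z1, a1, z2, y_hat, L
```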
Backward Pass

| Step | Symbolic Equation |
|------|-------------------|
| (5) | $\frac{\partial L}{\partial L} = 1.0$ |
| (4) | $\frac{\partial L}{\partial z^{(2)}} = \hat{y} - y$ |
| (3a) | $\frac{\partial L}{\partial W^{(2)}} = (\hat{y} - y)\,(a^{(1)})^\top$ |
| (3b) | $\frac{\partial L}{\partial a^{(1)}} = (W^{(2)})^\top (\hat{y} - y)$ |
| (2) | $\frac{\partial L}{\partial z^{(1)}} = \frac{\partial L}{\partial a^{(1)}} \odot \mathbb{1}[z^{(1)} > 0]$ |
| (1) | $\frac{\partial L}{\partial W^{(1)}} = \frac{\partial L}{\partial z^{(1)}}\,(x^{(1)})^\top$ |
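These expressions can be checked numerically. Below is a minimal NumPy sketch of the backward pass, reusing the quantities returned by the forward sketch above, followed by a finite-difference check on a single weight; the sizes and the random data are illustrative assumptions.

```python
def backward(W1, W2, x1, y, z1, a1, y_hat):
    """Gradients of the cross-entropy loss w.r.t. W1 and W2."""
    dz2 = y_hat - y                 # (4) softmax + CE shortcut
    dW2 = np.outer(dz2, a1)         # (3a) dL/dW2
    da1 = W2.T @ dz2                # (3b) dL/da1
    dz1 = da1 * (z1 > 0)            # (2) ReLU gate
    dW1 = np.outer(dz1, x1)         # (1) dL/dW1
    return dW1, dW2

# Finite-difference check on one entry of W1 (illustrative sizes).
rng = np.random.default_rng(0)
d, h, k = 4, 5, 3
W1, W2 = rng.normal(size=(h, d)), rng.normal(size=(k, h))
x1 = rng.normal(size=d)
y = np.eye(k)[1]                    # one-hot label

z1, a1, z2, y_hat, L = forward(W1, W2, x1, y)
dW1, dW2 = backward(W1, W2, x1, y, z1, a1, y_hat)

eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
numeric = (forward(W1p, W2, x1, y)[-1] - L) / eps
print(numeric, dW1[0, 0])           # the two values should be close
```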
Key references: (Choromanska et al., 2014; Romero et al., 2014; Bengio, 2012; Jaderberg et al., 2016; Bengio et al., 2015)

References

  • Bengio, Y. (2012). Practical recommendations for gradient-based training of deep architectures.
  • Bengio, E., Bacon, P., Pineau, J., Precup, D. (2015). Conditional Computation in Neural Networks for faster models.
  • Choromanska, A., Henaff, M., Mathieu, M., Ben Arous, G., LeCun, Y. (2014). The Loss Surfaces of Multilayer Networks.
  • Jaderberg, M., Czarnecki, W., Osindero, S., Vinyals, O., Graves, A., et al. (2016). Decoupled Neural Interfaces using Synthetic Gradients.
  • Romero, A., Ballas, N., Kahou, S., Chassang, A., Gatta, C., et al. (2014). FitNets: Hints for Thin Deep Nets.