Can you calculate this by hand? ✍️
Could there be a mistake? Shouldn't ∂L/∂z1 be [1, -2, 2, -1]?
It looks like ∂L/∂z2 = [1, 0] and ∂L/∂z1 = [1, 0, 2, 0] instead, because of ReLU, no?
It seems that ReLU is not applied to the gradients. Instead, ReLU only acts on the output activations.
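To make the distinction in the comments above concrete, here is a minimal sketch in plain Python. It assumes the usual backward rule for ReLU: the upstream gradient is passed through wherever the pre-activation was positive in the forward pass and zeroed elsewhere. The pre-activation values `z` and the helper names are hypothetical, chosen only for illustration; they are not the values from the exercise. The gradient vector [1, -2, 2, -1] is the one quoted in the thread.

```python
def relu(values):
    """Forward ReLU: clip negative entries to zero."""
    return [max(0.0, v) for v in values]


def relu_backward(upstream, pre_activation):
    """Backward ReLU: keep the upstream gradient where the
    forward-pass pre-activation was positive, zero it elsewhere."""
    return [g if z > 0 else 0.0 for g, z in zip(upstream, pre_activation)]


# Hypothetical pre-activations (NOT taken from the exercise).
z = [0.5, 1.5, -1.0, 2.0]
# Gradient vector quoted in the thread.
upstream = [1.0, -2.0, 2.0, -1.0]

# Masking by the sign of the forward-pass pre-activation.
masked = relu_backward(upstream, z)
# Applying ReLU to the gradient values themselves.
clipped = relu(upstream)

print(masked)   # [1.0, -2.0, 0.0, -1.0]
print(clipped)  # [1.0, 0.0, 2.0, 0.0]
```

With these made-up pre-activations the two results differ: negative gradient entries survive the mask, and [1, 0, 2, 0] appears only when ReLU is applied directly to the gradient. Whether any mask applies in the exercise depends on how z1 and z2 are defined there.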