I've only just scratched the surface of ML. I've implemented a NN which seems to work, but I don't completely understand why or how, and I've become obsessed with this… which is strange, as I never cared about math… which is a problem, I guess.
Anyway, I kind of got the "aha" moment watching this video https://www.youtube.com/watch?v=p1-FiWjThs8&spfreload=10 and I wanted to plot E against w, just to see for myself how the derivative tells me whether I need to increase or decrease w.
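For reference, the rule I have in mind is the standard gradient-descent update (my wording, not from the video):

    w := w - η · dE/dw

so a positive derivative means w should decrease, and a negative one means it should increase.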
My code is quite straightforward: I set a particular weight in the output layer in a loop and then assign:
    err = ((target[0] - output[0]) * (target[0] - output[0])) / 2;
    dtErr = (target[0] - output[0]) * NeuralNetwork::derivativeSigmoid(output[0], 1);
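In case the surrounding loop matters, here's a minimal, self-contained sketch of the sweep I'm doing. The single-neuron setup, the input and target values, and the inlined sigmoid derivative are stand-ins for my actual code, not the real thing:

    #include <cstdio>
    #include <cmath>

    int main() {
        // Assumed setup: one sigmoid output neuron with a single input,
        // so output = sigmoid(w * input).
        const double input  = 1.0;  // hypothetical input value
        const double target = 0.7;  // hypothetical target value

        // Sweep the weight over [-1, 1] and print w, error, and "derivative".
        for (int i = 0; i <= 200; ++i) {
            double w      = -1.0 + 0.01 * i;
            double net    = w * input;
            double output = 1.0 / (1.0 + std::exp(-net));  // sigmoid
            double sigDer = output * (1.0 - output);       // sigmoid'(net)
            double err    = (target - output) * (target - output) / 2.0;
            double dtErr  = (target - output) * sigDer;    // what my code plots
            std::printf("%f\t%f\t%f\n", w, err, dtErr);
        }
        return 0;
    }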
When plotted for w in [-1, 1], I should get the MSE and its derivative, right? Wrong.
I get the MSE, but instead of the derivative I get the inverse derivative: its sign doesn't match the slope of the MSE curve, it's positive where I'd expect negative and vice versa. Now what? The MSE is quadratic, so switching (t-y)² to (y-t)² gives the same error curve, but the derivative then becomes (y-t), which actually plots the correct graph. But of course, making that change in the neural net corrupts it.
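Working the chain rule through by hand (my own check, in the same notation):

    d/dy [ (t - y)² / 2 ] = (t - y) · d/dy (t - y) = (t - y) · (-1) = (y - t)

so the derivative of the error with respect to the output really is (y - t), not (t - y).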
What am I missing? Why is that derivative positive when it should be negative, and why is that OK??