Neural networks: trying to understand the gradient of the cost function with respect to the weights

import numpy as np

def sigmoid(x, deriv=False):
    # with deriv=True, x is expected to already be a sigmoid output,
    # so the derivative is x * (1 - x)
    if deriv:
        return x * (1 - x)

    return 1 / (1 + np.exp(-x))

X = np.array([[0.2]])
y = np.array([[1]])

np.random.seed(0)
w0 = np.random.normal(size=(1,1), scale=0.1)

for i in range(100):
    l0 = X
    l1 = sigmoid(l0.dot(w0)) # forward propagation

    error = y - l1                           # now backpropagation starts
    delta1 = sigmoid(l1, True)               # please read Nr. 1 below

    delta_error = error * sigmoid(l1, True)  # error scaled by the sigmoid derivative

    w0 += l0.T.dot(delta_error)              # weight update

print(l1)
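
For reference, here is a minimal gradient-check sketch, assuming the implicit cost function is the half squared error E = 0.5 * (y - l1)**2 (the code above never names a cost function, so that is an assumption on my part). It compares the analytic chain-rule gradient with a central-difference estimate:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def cost(w, X, y):
    # assumed cost: half squared error, E = 0.5 * (y - sigmoid(X.dot(w)))**2
    l1 = sigmoid(X.dot(w))
    return 0.5 * ((y - l1) ** 2).sum()

X = np.array([[0.2]])
y = np.array([[1.0]])
w = np.array([[0.05]])  # arbitrary weight value for the check

# analytic gradient from the chain rule: dE/dw = -(y - l1) * l1 * (1 - l1) * x
l1 = sigmoid(X.dot(w))
analytic = -(y - l1) * l1 * (1 - l1) * X

# central-difference estimate of dE/dw
eps = 1e-6
numeric = (cost(w + eps, X, y) - cost(w - eps, X, y)) / (2 * eps)

print(analytic, numeric)  # the two values should agree to several decimal places

Note that the loop above adds l0.T.dot(error * sigmoid(l1, True)) instead of subtracting; that still decreases the cost because error = y - l1 already carries the minus sign from dE/dl1 = -(y - l1).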

I understand everything regarding forward propagation. Please note this is a simple network with a single weight connecting one input to one output, and no bias is needed for this process.

What I do not understand is the following.

Nr. 1

OK, we get the error. That part makes complete sense. But I don't understand how the derivative of the sigmoid function relates to the error function. In my case the error is y - l1, but I do not understand what the sigmoid derivative really tells us. Here it is obvious that if l1 is higher, the error is lower, so why don't we just update the weights based on whether the error got bigger or smaller than in the previous epoch? I checked the derivative value and the error value for each epoch, and there seems to be a roughly linear relationship between them: when the error gets lower, so does the derivative. I know I am missing a piece of this puzzle, but I don't understand it yet.
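
For context, here is how I would break the weight update into chain-rule factors, again assuming the cost is E = 0.5 * (y - l1)**2 and writing z = l0.dot(w0) for the pre-activation (neither name appears in the code above, so both are assumptions):

import numpy as np

def sigmoid(x, deriv=False):
    if deriv:
        return x * (1 - x)  # expects x to already be a sigmoid output
    return 1 / (1 + np.exp(-x))

l0 = np.array([[0.2]])   # input
y = np.array([[1.0]])    # target
w0 = np.array([[0.05]])  # some current weight value

z = l0.dot(w0)   # pre-activation
l1 = sigmoid(z)  # network output

dE_dl1 = -(y - l1)          # derivative of 0.5 * (y - l1)**2 w.r.t. l1
dl1_dz = sigmoid(l1, True)  # derivative of the sigmoid, l1 * (1 - l1)
dz_dw0 = l0                 # derivative of l0.dot(w0) w.r.t. w0

grad = dE_dl1 * dl1_dz * dz_dw0                # full gradient dE/dw0
step = l0.T.dot((y - l1) * sigmoid(l1, True))  # the update applied in the loop

print(grad, step)  # step == -grad, i.e. the loop moves w0 against the gradient

If this reading is right, the sigmoid derivative is not redundant with the error: the error says how far the output is from the target, while l1 * (1 - l1) says how strongly the output reacts to a change in z, and the gradient needs both factors.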
