Gradient descent with random input implementation


I am trying to implement gradient descent on a dataset. Even though I have tried everything, I cannot make it work, so I created a test case: I run my code on random data and try to debug.

More specifically, I generate random vectors with components between 0 and 1, assign random labels to them, and try to overfit the training data.

However, my weight vector grows larger with each iteration until it overflows to infinity, so I never actually learn anything. Here is my code:

import numpy as np
import random

def getRandomVector(n):
   return np.random.uniform(0,1,n)

def getVectors(m, n):
   return [getRandomVector(n) for i in range(n)]

def getLabels(n):
   return [random.choice([-1,1]) for i in range(n)]

def GDLearn(vectors, labels):
   maxIterations = 100
   stepSize = 0.01

   w = np.zeros(len(vectors[0])+1)
   for i in range(maxIterations):
      deltaw = np.zeros(len(vectors[0])+1)
      for i in range(len(vectors)):
         temp = np.append(vectors[i], -1)   # append -1 as a bias input
         deltaw += ( labels[i] - np.dot(w, temp) ) * temp
      w = w + ( stepSize * (-1 * deltaw) )
   return w

vectors = getVectors(100, 30)
labels = getLabels(100)

w = GDLearn(vectors, labels)
print(w)

I am using LMS (least mean squares) as the loss function, so in every iteration my update is the following:

w^{i+1} = w^i - R * ∇E(w^i)

where w^i is the weight vector at iteration i, R is the stepSize, and E(w^i) is the loss function.

Here is my loss function (LMS):

E(w) = (1/2) * Σ_j ( y_j - w · x_j )^2

and here is the derivative of the loss function:

∇E(w) = -Σ_j ( y_j - w · x_j ) x_j
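
As a sanity check on this derivative, here is a quick finite-difference comparison on tiny random data (the helper names and sizes are just for illustration); it confirms the gradient formula itself is right:

import numpy as np

# compare the analytic gradient of E(w) with a central finite-difference
# estimate on tiny random data (sizes are arbitrary)
def loss(w, X, y):
   return 0.5 * np.sum((y - X.dot(w)) ** 2)

def grad(w, X, y):
   return -X.T.dot(y - X.dot(w))

np.random.seed(0)
X = np.random.uniform(0, 1, (5, 3))
y = np.random.choice([-1, 1], 5).astype(float)
w0 = np.random.normal(size=3)
eps = 1e-6
numeric = np.array([(loss(w0 + eps*e, X, y) - loss(w0 - eps*e, X, y)) / (2*eps)
                    for e in np.eye(3)])
print(np.allclose(numeric, grad(w0, X, y)))   # should print True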

Now, my questions are:

  1. Should I expect good results in this random scenario using gradient descent? (What are the theoretical bounds?)
  2. If yes, what is the bug in my implementation?

PS: I tried several other maxIterations and stepSize parameters; it still does not work.

PS2: This is the best way I could think of to ask this question. Sorry if it is too specific, but it has been driving me crazy and I really want to understand the problem.

1 Answer

Your code has a few faults:

  • In the getVectors() function, you do not actually use the input variable m.
  • In the GDLearn() function, you have nested loops but use the same variable i as the loop variable in both. (The logic still happens to work, but it is confusing.)
  • The prediction error (labels[i] - np.dot(w, temp)) has the wrong sign: deltaw becomes the negative of the gradient, so the update w - stepSize * deltaw moves w in the direction of increasing loss.
  • The step size matters. With 0.01 as the step size, the cost increases in each iteration; changing it to 0.001 solves the problem (see the sketch after this list).
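
A side note on the last point: for this quadratic loss there is a classical stability bound; batch gradient descent on E(w) = 1/2 * ||Xw - y||^2 converges only if the step size stays below 2 / lambda_max(X^T X), where X is the bias-augmented data matrix. Here is a quick sketch (variable names are mine) estimating that threshold for data shaped like yours:

import numpy as np

# estimate the largest stable step size: it must be below 2 / lambda_max(X^T X)
np.random.seed(0)
X = np.hstack([np.random.uniform(0, 1, (100, 30)),
               -np.ones((100, 1))])          # same -1 bias augmentation as temp
lam_max = np.linalg.eigvalsh(X.T.dot(X)).max()
print(2.0 / lam_max)   # on the order of 0.002 for data like this, so 0.01
                       # diverges while 0.001 converges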

Here is my revised code, based on your original:

import numpy as np
import random

def getRandomVector(n):
   return np.random.uniform(0, 1, n)

def getVectors(m, n):
   return [getRandomVector(n) for i in range(m)]   # now actually uses m

def getLabels(n):
   return [random.choice([-1, 1]) for i in range(n)]

def GDLearn(vectors, labels):
   maxIterations = 100
   stepSize = 0.001   # 0.01 makes the cost diverge on this data

   w = np.zeros(len(vectors[0]) + 1)
   for iteration in range(maxIterations):
      cost = 0
      deltaw = np.zeros(len(vectors[0]) + 1)
      for i in range(len(vectors)):
         temp = np.append(vectors[i], -1)                 # append -1 as a bias input
         prediction_error = np.dot(w, temp) - labels[i]   # sign fixed
         deltaw += prediction_error * temp
         cost += prediction_error**2
      w = w - stepSize * deltaw                           # step against the gradient
      print('cost at', iteration, '=', cost)
   return w

vectors = getVectors(100, 30)
labels = getLabels(100)

w = GDLearn(vectors, labels)
print(w)

Running result -- you can see the cost decreasing in each iteration, but with diminishing returns.

cost at 0 = 100.0
cost at 1 = 99.4114482617
cost at 2 = 98.8476022685
cost at 3 = 98.2977744556
cost at 4 = 97.7612851154
cost at 5 = 97.2377571222
cost at 6 = 96.7268325883
cost at 7 = 96.2281642899
cost at 8 = 95.7414151147
cost at 9 = 95.2662577529
cost at 10 = 94.8023744037
......
cost at 90 = 77.367904046
cost at 91 = 77.2744249433
cost at 92 = 77.1823702888
cost at 93 = 77.0917090883
cost at 94 = 77.0024111475
cost at 95 = 76.9144470493
cost at 96 = 76.8277881325
cost at 97 = 76.7424064707
cost at 98 = 76.6582748518
cost at 99 = 76.5753667579
[ 0.16232142 -0.2425511   0.35740632  0.22548442  0.03963853  0.19595213
  0.20080207 -0.3921798  -0.0238925   0.13097533 -0.1148932  -0.10077534
  0.00307595 -0.30111942 -0.17924479 -0.03838637 -0.23938181  0.1384443
  0.22929163 -0.0132466   0.03325976 -0.31489526  0.17468025  0.01351012
 -0.25926117  0.09444201  0.07637793 -0.05940019  0.20961315  0.08491858
  0.07438357]
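
Finally, the inner loop can be vectorized; here is a sketch of an equivalent version (the function name is mine, not from the post above) that computes exactly the same updates with matrix operations:

import numpy as np

def GDLearnVectorized(vectors, labels, maxIterations=100, stepSize=0.001):
   # same updates as GDLearn above, with the per-sample loop replaced by matrix ops
   X = np.hstack([np.asarray(vectors), -np.ones((len(vectors), 1))])   # bias column
   y = np.asarray(labels, dtype=float)
   w = np.zeros(X.shape[1])
   for iteration in range(maxIterations):
      errors = X.dot(w) - y                # prediction errors for all samples at once
      w = w - stepSize * X.T.dot(errors)   # batch gradient step
   return w

Called as w = GDLearnVectorized(vectors, labels), this produces the same weights as the loop version on the same data, and it scales much better as the dataset grows.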