
Artificial Neural Network - Effect of momentum on learning

All code used in this tutorial can be found in my GitHub repository.

Before proceeding with this tutorial, please go through my previous tutorial on Artificial Neural Networks.

Code compatibility: Python 2.7, tested on Ubuntu 16.04

 

As we saw in the previous post, the network took a pretty long time (~700 epochs) to converge. This is because it was a very raw network without any optimization. In the present tutorial we will use momentum while updating the weights. Momentum is known to make the network converge faster.

The momentum (β) term can be represented as:

    w(p+1) = β · w(p) + α · y(p) · δ(p)

Figure 1. Including momentum (β) in the Neural Network update step

where p indexes the training step and β is a positive number slightly greater than 1, called the momentum constant. The equation above is called the generalized delta rule.
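Worked out for a single weight, the update step is simple. Below is a minimal sketch; the helper function momentum_update and the numbers in it are illustrative, not part of the tutorial code:

def momentum_update(w, dw, beta=1.001):
    # new weight = beta * old weight + computed weight change
    return beta * w + dw

# example: updating w35 = -1.2 with a weight change dw35 = 0.05
# gives 1.001 * (-1.2) + 0.05 = -1.1512
w35 = momentum_update(-1.2, 0.05)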


To introduce the new momentum term we will change our base code; the momentum-related lines are marked with # NEW comments below. In the present implementation we are using β = 1.001.

import math

"""
defining XOR gate, [x1, x2, y]
"""
XOR = [[0, 1, 1], [1, 1, 0], [1, 0, 1], [0, 0, 0]]

# initializing weights
w13 = 0.5
w14 = 0.9
w23 = 0.4
w24 = 1.0
w35 = -1.2
w45 = 1.1
t3 = 0.8
t4 = -0.1
t5 = 0.3

# defining learning rate
alpha = 0.5

# initializing squaredError
squaredError = 0
# initializing error per case
error = 0

# defining epochs
Epochs = 2000
count = 0

# NEW: momentum constant
beta = 1.001

# run this repeatedly for number of Epochs
for j in range(Epochs):
    print "squaredError", squaredError
    # initializing squaredError per epoch
    squaredError = 0
    for i in range(4):  # iterating through each case for the given epoch
        # calculating output at each perceptron
        y3 = 1 / (1 + math.exp(-(XOR[i][0] * w13 + XOR[i][1] * w23 - t3)))
        y4 = 1 / (1 + math.exp(-(XOR[i][0] * w14 + XOR[i][1] * w24 - t4)))
        y5 = 1 / (1 + math.exp(-(y3 * w35 + y4 * w45 - t5)))
        # calculating error
        error = XOR[i][2] - y5
        # calculating partial error and change in weight for the output perceptron
        del5 = y5 * (1 - y5) * error
        dw35 = alpha * y3 * del5
        dw45 = alpha * y4 * del5
        dt5 = alpha * (-1) * del5
        # calculating partial error and change in weight for the hidden perceptrons
        del3 = y3 * (1 - y3) * del5 * w35
        del4 = y4 * (1 - y4) * del5 * w45
        dw13 = alpha * XOR[i][0] * del3
        dw23 = alpha * XOR[i][1] * del3
        dt3 = alpha * (-1) * del3
        dw14 = alpha * XOR[i][0] * del4
        dw24 = alpha * XOR[i][1] * del4
        dt4 = alpha * (-1) * del4
        # NEW: weight and bias update including the momentum term
        w13 = (beta * w13) + dw13
        w14 = (beta * w14) + dw14
        w23 = (beta * w23) + dw23
        w24 = (beta * w24) + dw24
        w35 = (beta * w35) + dw35
        w45 = (beta * w45) + dw45
        t3 = (beta * t3) + dt3
        t4 = (beta * t4) + dt4
        t5 = (beta * t5) + dt5
        # y5 is a float between 0 and 1;
        # using 0.5 as threshold: class is 1 if output is above 0.5, else 0
        if y5 < 0.5:
            class_ = 0
        else:
            class_ = 1
        # uncomment below line to see predicted and actual output
        # print "Predicted", class_, "actual", XOR[i][2]
        # accumulating squared error
        squaredError = squaredError + (error * error)
    if squaredError < 0.001:
        # if error is below 0.001, terminate training (premature termination)
        break
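For comparison, the generalized delta rule is often written with β applied to the previous weight change rather than to the current weight. A minimal sketch of that variant for a single weight (the variable names are illustrative, not part of the tutorial code):

dw35_prev = 0.0  # previous change of w35
# inside the training loop, after computing del5:
dw35 = (0.95 * dw35_prev) + (alpha * y3 * del5)  # momentum constant typically ~0.95
w35 = w35 + dw35
dw35_prev = dw35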

Plotting the squared error obtained with the simple XOR network against the XOR network with the momentum term gives the graph below.

Figure 2. Improvement in the rate of learning after applying the momentum term
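If you want to reproduce this plot, one simple approach is to append squaredError to a list at the end of every epoch in both scripts and plot the two lists with matplotlib. A minimal sketch, assuming errors_plain and errors_momentum hold the per-epoch squared errors of the two runs:

import matplotlib.pyplot as plt

# errors_plain and errors_momentum are lists of per-epoch squared errors
# collected from the plain network and the momentum network respectively
plt.plot(errors_plain, label="without momentum")
plt.plot(errors_momentum, label="with momentum (beta = 1.001)")
plt.xlabel("Epochs")
plt.ylabel("Squared error")
plt.legend()
plt.show()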

If you like this tutorial, please share it with your colleagues. Discuss doubts and request changes on GitHub. It's free, no charges for anything. Your responses inspire me to deliver even better content.
