Back Propagation Algorithm

Anshul Jain

October 06, 2020

Machine learning, the technology taking the world by storm, relies on a range of components and algorithms for classification and problem-solving. Artificial neural networks (ANNs) are at its core, as they help bring human-like intelligence to machines and systems. ANNs are currently used by organizations such as Google to improve search engine functionality, IBM to develop personalized treatment plans, Walmart to predict future product demand, and many other large and small companies for a variety of processes, including deep learning.


What made Machine Learning so dependent on Artificial Neural Networks?

Well! It was the development of the backpropagation algorithm that drastically increased the popularity of ANN in machine learning, and consequently in artificial intelligence, as it enabled ANNs to deliver better prediction accuracy. However, the role of the backpropagation algorithm wasn’t limited to this.

To help you understand why this algorithm made ANNs a game-changer in the field of artificial intelligence, here is a thorough discussion of the back propagation algorithm.

Let’s get started!

What is Back Propagation?

Back propagation, or backward propagation of errors, is an algorithm for supervised learning of artificial neural networks using gradient descent. It is most prominently used to train multi-layered feedforward neural networks. Its main objective is to adjust the weights of the neurons in the network, on the basis of a given error function, so that the actual output moves closer to the expected result. It does this by applying the chain rule to compute the partial derivatives of the error function with respect to each weight.
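
As a rough sketch of what applying the chain rule means here, consider a single weight connecting neuron i to neuron j. The symbols below are notation assumed for illustration and do not come from the original text: E is the error, o_j the output of neuron j, net_j its weighted input, and η the learning rate.

```latex
\frac{\partial E}{\partial w_{ij}}
  = \frac{\partial E}{\partial o_j}\,
    \frac{\partial o_j}{\partial \mathrm{net}_j}\,
    \frac{\partial \mathrm{net}_j}{\partial w_{ij}},
\qquad
w_{ij} \leftarrow w_{ij} - \eta\,\frac{\partial E}{\partial w_{ij}}
```

The first expression splits the gradient into three local derivatives, and the second is the gradient descent step that uses it.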

First introduced in the 1970s as a general optimization method for performing automatic differentiation of complex nested functions, the backpropagation algorithm found its importance in machine learning only after the publication of the paper "Learning Representations by Back-Propagating Errors" by Rumelhart, Hinton & Williams in 1986. Since then, researchers have continued to study and refine the algorithm to get the most out of it.

Today, common application areas of the back propagation algorithm include deep learning, machine learning, and natural language processing, all of which use it to improve the results delivered for a problem.

Now that we comprehend the basics of the backpropagation algorithm, let's move on to understanding how it works.

How Does the Back Propagation Algorithm Work?

As we know, training an artificial neural network occurs in several steps:

  • Initialization.
  • Forward propagation.
  • Error Function.
  • Backpropagation.
  • Weight Update.
  • Iteration.
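
To make these six stages concrete, here is a minimal sketch that runs them on the smallest possible case: one linear neuron fitted to one training pair. The function name, the squared-error function, and the values of `x`, `target`, and `learning_rate` are all assumptions chosen for illustration.

```python
import random


def train_single_neuron(x, target, learning_rate=0.1, epochs=200, seed=0):
    """Fit one linear neuron y = w * x + b to a single (x, target) pair."""
    rng = random.Random(seed)
    # 1. Initialization: start from small random parameters.
    w = rng.uniform(-0.5, 0.5)
    b = rng.uniform(-0.5, 0.5)
    history = []
    for _ in range(epochs):                # 6. Iteration
        y = w * x + b                      # 2. Forward propagation
        error = 0.5 * (y - target) ** 2    # 3. Error function (squared error, assumed)
        history.append(error)
        # 4. Backpropagation: the chain rule gives dE/dw and dE/db.
        d_error_d_y = y - target
        grad_w = d_error_d_y * x
        grad_b = d_error_d_y
        # 5. Weight update: step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, history


# Hypothetical training pair: input 2.0 should map to output 1.0.
w, b, history = train_single_neuron(x=2.0, target=1.0)
print(f"first error: {history[0]:.6f}, last error: {history[-1]:.6f}")
```

With a single neuron, the backpropagation stage is just one application of the chain rule. In a multi-layer network the same stage walks backward through every layer.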

Backpropagation is the fourth step of this process: it calculates the gradient of the loss function with respect to the weights in the neural network, so that the error function can be minimized. The backpropagation algorithm accomplishes this through a set of steps, which involve:

  • Selecting Input & Output: The first step of the backpropagation algorithm is to choose an input for the process and to set the desired output.
  • Setting Random Weights: Once the input and output are set, random weights are allocated, as they are needed to transform the input values into output values. After this, the output of each neuron is calculated through forward propagation, which passes through the:
    • Input Layer
    • Hidden Layer
    • Output Layer
  • Error Calculation: This important step calculates the total error by determining how far the actual output is from the required output. This is done by calculating the errors at the output neurons.
  • Error Minimization: Based on the observations made in the earlier step, here the focus is on minimizing the error rate to ensure accurate output is delivered.
  • Updating Weights & Other Parameters: If the error rate is high, the parameters (weights and biases) are updated to reduce it using the delta rule or gradient descent. This is accomplished by choosing a suitable learning rate and propagating the error backward from the output layer to the previous layers. As an example of dynamic programming, this reuses the errors already computed for later layers and so avoids redundant calculations.
  • Modeling Prediction Readiness: Finally, once the error has been minimized, the network is tested with some test inputs to check that it produces the desired result.
    This process is repeated until the error reduces to a minimum and the desired output is obtained.
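
The steps above can be sketched for a small network with an input layer, one hidden layer, and a single output neuron. This is an illustrative sketch in plain Python, not a reference implementation. The XOR data, the layer sizes, the sigmoid activation, the squared-error function, and names such as `train` and `forward` are all assumptions made for the example. Whether the network fully learns XOR depends on the seed and the number of epochs; the point is the mechanics of each step.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, b1, W2, b2):
    """Forward propagation: input layer -> hidden layer -> output neuron."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    output = sigmoid(sum(w * h for w, h in zip(W2, hidden)) + b2)
    return hidden, output


def train(samples, n_hidden=3, learning_rate=0.5, epochs=3000, seed=0):
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Setting random weights (biases start at zero in this sketch).
    W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b2 = 0.0
    losses = []
    for _ in range(epochs):
        total = 0.0
        gW1 = [[0.0] * n_in for _ in range(n_hidden)]
        gb1 = [0.0] * n_hidden
        gW2 = [0.0] * n_hidden
        gb2 = 0.0
        for x, target in samples:
            hidden, output = forward(x, W1, b1, W2, b2)
            # Error calculation at the output neuron.
            total += 0.5 * (output - target) ** 2
            # The output delta is computed once and reused for every
            # hidden neuron; this reuse is the dynamic-programming part.
            delta_out = (output - target) * output * (1.0 - output)
            gb2 += delta_out
            for j, h in enumerate(hidden):
                gW2[j] += delta_out * h
                delta_h = delta_out * W2[j] * h * (1.0 - h)
                gb1[j] += delta_h
                for i, xi in enumerate(x):
                    gW1[j][i] += delta_h * xi
        losses.append(total / len(samples))
        # Updating weights and biases with the averaged gradients.
        scale = learning_rate / len(samples)
        b2 -= scale * gb2
        for j in range(n_hidden):
            W2[j] -= scale * gW2[j]
            b1[j] -= scale * gb1[j]
            for i in range(n_in):
                W1[j][i] -= scale * gW1[j][i]
    return (W1, b1, W2, b2), losses


# Selecting input and output: the XOR truth table (an assumed example task).
xor_samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
               ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
params, losses = train(xor_samples)
print(f"mean error: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

After training, calling `forward(x, *params)` on a test input corresponds to the final prediction-readiness step.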

Important Derivatives: It is important to know that the backpropagation algorithm requires a differentiable activation function. Sigmoid functions (tan-sigmoid and log-sigmoid) are the traditionally used activation functions, while ReLU (the ramp function) is common today. ReLU is differentiable everywhere except at zero, where a fixed value is used for the derivative in practice.
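
For reference, the activation functions named above and their derivatives can be written out as follows. The function names are chosen for this sketch, and returning 0.0 for the ReLU derivative at zero is a common convention rather than something the original text specifies.

```python
import math


def log_sigmoid(z):
    """Logistic sigmoid, with outputs in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def log_sigmoid_prime(z):
    s = log_sigmoid(z)
    return s * (1.0 - s)


def tan_sigmoid(z):
    """Hyperbolic tangent, with outputs in (-1, 1)."""
    return math.tanh(z)


def tan_sigmoid_prime(z):
    return 1.0 - math.tanh(z) ** 2


def relu(z):
    """Ramp function: zero for negative inputs, identity otherwise."""
    return max(0.0, z)


def relu_prime(z):
    # ReLU has no derivative at z == 0; 0.0 is used there by convention.
    return 1.0 if z > 0 else 0.0


activations = [
    ("log-sigmoid", log_sigmoid, log_sigmoid_prime),
    ("tan-sigmoid", tan_sigmoid, tan_sigmoid_prime),
    ("ReLU", relu, relu_prime),
]
for name, f, df in activations:
    print(f"{name}: f(0.5) = {f(0.5):.4f}, f'(0.5) = {df(0.5):.4f}")
```

During backpropagation, each layer multiplies the incoming error by the derivative of its activation, which is why these derivatives need to be available.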

Types of Backpropagation:

There are two major types of back propagation algorithms, each of which is defined below:

  • Static Back Propagation: Static back propagation maps a static input to a static output and is mainly used to solve static classification problems such as optical character recognition. Mapping here is also faster than in the other type of back propagation.
  • Recurrent Back Propagation: This is the second type of back propagation, where the mapping is non-static. The network is fed forward until it settles at a fixed value, after which the error is computed and propagated backward.
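
The recurrent case can be illustrated with a deliberately tiny sketch: a single recurrent unit whose state is fed back into itself until it stops changing, after which the error is propagated backward through the settled state. The update rule `h = tanh(w * h + x)`, the squared error, and the values of `w`, `x`, and `target` are all assumptions for this example. Real recurrent networks use vectors and matrices where this sketch uses scalars.

```python
import math


def settle(w, x, tol=1e-12, max_iter=1000):
    """Feed forward repeatedly until the state h reaches a fixed value."""
    h = 0.0
    for _ in range(max_iter):
        h_next = math.tanh(w * h + x)
        if abs(h_next - h) < tol:
            return h_next
        h = h_next
    return h


def error(w, x, target):
    return 0.5 * (settle(w, x) - target) ** 2


def recurrent_backprop_grad(w, x, target, tol=1e-12, max_iter=1000):
    """Gradient of the error with respect to w at the settled state."""
    h = settle(w, x)
    slope = 1.0 - h * h          # derivative of tanh at the fixed point
    d_error_d_h = h - target     # error computed once the state has settled
    # The backward pass is itself iterated to a fixed value:
    #   z = dE/dh + (slope * w) * z
    z = 0.0
    for _ in range(max_iter):
        z_next = d_error_d_h + slope * w * z
        converged = abs(z_next - z) < tol
        z = z_next
        if converged:
            break
    return z * slope * h         # chain through the direct dependence on w


# Hypothetical values; |w| < 1 keeps both iterations convergent here.
w, x, target = 0.5, 0.3, 0.2
grad = recurrent_backprop_grad(w, x, target)
print(f"settled state: {settle(w, x):.6f}, dE/dw: {grad:.6f}")
```

Both loops converge in this example because the feedback weight is small. With stronger feedback, the forward pass may not settle at all, which is one reason the static variant is simpler to work with.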

Why Do We Need Back Propagation Algorithms?

By now we know that when designing an artificial neural network, random values are allocated to the weights, which play a major role in enabling the network to learn. However, there is no guarantee that these random weights are accurate. This is where the backpropagation algorithm comes into play, mitigating the errors caused by these inaccurate values. In short, it is needed for:

  • Calculating the errors in the model.
  • Minimizing the detected errors by updating the weights and biases.
  • Ensuring the model is ready for making predictions.

Advantages of Back Propagation Algorithm:

Apart from correcting trajectories in the weight and bias space through gradient descent, there is another reason for the resurgence in popularity of back propagation algorithms: the widespread adoption of deep neural networks for tasks like image recognition and speech recognition, in which this algorithm plays a major role.

But that's not all. This algorithm offers several more advantages, as listed below:

  • It simplifies the network structure by removing weighted links.
  • Fast and easy to program.
  • Does not require prior knowledge about the networks.
  • There is no need to specify the features of the function to be learned.
  • Allows efficient computation of the gradient at each layer.
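
The last point in the list can be illustrated by comparing backpropagation with the obvious alternative, numerical differentiation. The sketch below uses a chain of one-neuron sigmoid layers, an assumed toy architecture with hypothetical weights. Backpropagation obtains every gradient from one forward pass and one backward sweep, while finite differences need two extra forward passes for each weight.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(weights, x):
    """Return the activation of every layer in a chain of one-neuron layers."""
    activations = [x]
    for w in weights:
        activations.append(sigmoid(w * activations[-1]))
    return activations


def backprop_gradients(weights, x, target):
    """One forward pass plus one backward sweep yields every dE/dw."""
    a = forward(weights, x)
    grads = [0.0] * len(weights)
    upstream = a[-1] - target    # dE/d(output) for E = 0.5 * (output - target)^2
    for k in reversed(range(len(weights))):
        delta = upstream * a[k + 1] * (1.0 - a[k + 1])  # back through the sigmoid
        grads[k] = delta * a[k]                         # dE/dw_k
        upstream = delta * weights[k]                   # handed to the layer below
    return grads


def numerical_gradients(weights, x, target, eps=1e-6):
    """Finite differences: two extra forward passes per weight."""
    def error(ws):
        return 0.5 * (forward(ws, x)[-1] - target) ** 2

    grads = []
    for k in range(len(weights)):
        up = weights[:k] + [weights[k] + eps] + weights[k + 1:]
        down = weights[:k] + [weights[k] - eps] + weights[k + 1:]
        grads.append((error(up) - error(down)) / (2 * eps))
    return grads


# Hypothetical weights, input, and target for the comparison.
weights = [0.8, -1.2, 0.5, 1.5]
analytic = backprop_gradients(weights, x=0.6, target=0.3)
numeric = numerical_gradients(weights, x=0.6, target=0.3)
for k, (g_a, g_n) in enumerate(zip(analytic, numeric)):
    print(f"layer {k}: backprop {g_a:+.8f}, finite difference {g_n:+.8f}")
```

The two methods should agree closely. The difference is cost: the finite-difference loop repeats the whole forward pass for every weight, which becomes impractical as networks grow.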

Disadvantages of Back Propagation Algorithm:

Though the advantages of backpropagation outnumber its disadvantages, it is still imperative to highlight these limitations. Therefore, here are the limitations of back propagation algorithms.

  • Its performance on a specific problem depends on the input data.
  • Sensitive to complex/noisy data.
  • The derivatives of the activation functions must be known at network design time.


It is evident from the above discussion that backpropagation has become an integral part of neural networks, which rely on this algorithm to become self-sufficient and capable of handling complex problems. Moreover, it gives neural networks the ability to learn accurately while remaining flexible. Currently, the popularity of this algorithm is such that technologies like natural language processing, speech recognition, image recognition, and more use it to perform their designated tasks.

So, whether you are building a machine that can enunciate words and sentences accurately or creating artificial neural networks, the algorithm underlying each is the back propagation algorithm.