A neural network that cannot correct itself is just a calculator. Forward and backward propagation are the mechanism that allows machines to learn from their mistakes.
1. The Learning Loop
Neural networks don't simply 'know' the answers; they learn through a repeated cycle of guessing and correcting. This cycle is the core of how neural networks are trained, and it has two main phases: Forward Propagation (the guess) and Backpropagation (the correction), with a loss calculation in between to measure the error. Without this loop, a freshly initialized network has random weights, so its outputs are little better than random guesses.
"""
Step 1: Guess (Forward)
Step 2: Check Error (Loss)
Step 3: Correct (Backward)
Repeat 1,000,000 times.
"""
2. Forward Propagation
Forward Propagation is the process where data flows in one direction: from the input layer, through the hidden layers, to the output layer. During this phase, every neuron computes a weighted sum of its inputs and passes the result through an activation function. The network uses its *current* weights to make a prediction. Importantly, no weights change during the forward pass; learning happens later, once the error has been measured.
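To make the weighted sum and activation concrete, here is a minimal pure-Python sketch of a forward pass through a tiny network with two inputs, two hidden neurons, and one output. The sigmoid activation, the layer sizes, and all weight values are assumptions chosen for illustration; a real network would start from random weights and usually rely on a library such as PyTorch.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """One neuron: weighted sum of its inputs, then an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

def forward(x, hidden_layer, output_neuron):
    """Input -> hidden layer -> output. No weights are modified here."""
    hidden = [neuron(x, w, b) for w, b in hidden_layer]
    w_out, b_out = output_neuron
    return neuron(hidden, w_out, b_out)

# Hypothetical weights, picked by hand for illustration
hidden_layer = [([0.5, -0.2], 0.1), ([-0.3, 0.8], 0.0)]
output_neuron = ([1.0, -1.0], 0.0)

prediction = forward([1.0, 2.0], hidden_layer, output_neuron)
print(f"Prediction: {prediction:.4f}")
```

Calling `forward` twice with the same input gives the same result, because nothing in the forward pass changes the weights.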
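import torch
# Illustrative stand-ins (assumed, not part of the lesson): a one-feature linear model and toy inputs
model = torch.nn.Linear(1, 1)
X_train = torch.tensor([[1.0], [2.0], [3.0]])
# Input data X flows through the model.
# For this linear model: prediction = weight * X + bias
prediction = model(X_train)
print(f'Prediction: {prediction}')
3. Evaluating the Error (Loss)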
Once the network has made its guess, we need to know how wrong it is. We compare the network's prediction to the ground truth using a Loss Function, such as mean squared error for regression or cross-entropy for classification. The loss is a single number representing the 'grade' the model receives, where lower is better. A high loss means the predictions are far from the targets; a loss approaching zero means the predictions closely match the training data, which does not by itself guarantee good performance on unseen data.
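As a small worked example, the sketch below computes mean squared error by hand for a poor guess and a close guess. The function name `mse_loss` and the toy values are assumptions for illustration; mean squared error is only one of many possible loss functions.

```python
def mse_loss(predictions, targets):
    """Mean squared error: the average of the squared differences."""
    n = len(predictions)
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n

targets = [2.0, 4.0, 6.0]        # ground truth (toy values)
poor_guess = [0.0, 0.0, 0.0]     # far from the targets
close_guess = [2.1, 3.9, 6.0]    # nearly right

print(f"Poor guess loss:  {mse_loss(poor_guess, targets):.4f}")
print(f"Close guess loss: {mse_loss(close_guess, targets):.4f}")
```

The poor guess receives a much larger loss than the close guess, and a prediction identical to the targets would score exactly zero.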
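# Assumed for illustration: mean squared error and toy targets
criterion = torch.nn.MSELoss()
y_train = torch.tensor([[2.0], [4.0], [6.0]])
# Compare the prediction to the ground truth
loss = criterion(prediction, y_train)
# The loss is the 'grade' the model receives.
4. Backpropagation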
Now for the algorithm that makes training deep networks practical: Backpropagation. If Forward Propagation is the guess, Backpropagation is what makes the correction possible. It works backward from the output layer to the input layer. Using the Chain Rule from calculus, it calculates the Gradient: how much the loss would change in response to a small change in each weight. In effect, it distributes the 'blame' for the error across the entire network.
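The chain rule can be traced by hand on the smallest possible case. The sketch below assumes a single linear neuron (`pred = w * x + b`) with a squared-error loss, neither of which comes from the lesson itself, and checks the hand-derived gradient against a numerical estimate. Frameworks such as PyTorch automate the same bookkeeping across every layer.

```python
def loss_fn(w, b, x, y):
    """Forward pass plus squared-error loss for one example."""
    pred = w * x + b
    return (pred - y) ** 2

def gradients(w, b, x, y):
    """Chain rule, applied from the loss back to each parameter."""
    pred = w * x + b
    dloss_dpred = 2.0 * (pred - y)   # how the loss changes with the prediction
    dpred_dw = x                     # how the prediction changes with w
    dpred_db = 1.0                   # how the prediction changes with b
    return dloss_dpred * dpred_dw, dloss_dpred * dpred_db

w, b, x, y = 0.5, 0.0, 3.0, 6.0     # hypothetical values
grad_w, grad_b = gradients(w, b, x, y)

# Sanity check: nudge w slightly in both directions and measure the loss change
eps = 1e-6
numeric_grad_w = (loss_fn(w + eps, b, x, y) - loss_fn(w - eps, b, x, y)) / (2 * eps)

print(f"grad_w={grad_w}, grad_b={grad_b}, numeric grad_w={numeric_grad_w:.4f}")
```

Both gradients come out negative here, which says that increasing `w` or `b` would lower the loss for this example.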
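# Assumed for illustration: plain SGD with a small learning rate
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# Clear stale gradients; PyTorch accumulates them across backward() calls
optimizer.zero_grad()
# The backward pass
loss.backward()
# Stores a gradient in .grad for every trainable weight.
# The negative gradient is the direction that reduces the loss.
5. Applying the Gradients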
Finally, the network uses the gradients like a compass. The gradient points in the direction that will *increase* the error, so the network takes a step in the opposite direction. An Optimizer (like SGD or Adam) updates the weights, turning the 'knobs' slightly so that the next guess is, ideally, a little more accurate. One complete cycle of forward pass, loss, backward pass, and update is called an iteration (or training step); an Epoch is one full pass over the entire training dataset, which usually consists of many iterations.
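A single update step can be worked through by hand. The sketch below assumes a one-weight "network" whose loss is `(w - 3) ** 2`, so the best weight is 3; the loss function, the starting weight, and the learning rate are all made up for illustration.

```python
def loss_fn(w):
    """Toy loss with its minimum at w = 3."""
    return (w - 3.0) ** 2

def grad_fn(w):
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)

w = 0.0
learning_rate = 0.1

gradient = grad_fn(w)                   # negative: increasing w lowers the loss
new_w = w - learning_rate * gradient    # step against the gradient

print(f"gradient={gradient}, w: {w} -> {new_w:.2f}")
print(f"loss: {loss_fn(w):.2f} -> {loss_fn(new_w):.2f}")
```

The weight moves from 0 toward 3 and the loss falls from 9 to roughly 5.76. A learning rate that is too large could overshoot the minimum instead, which is why the steps are kept small.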
# Gradient descent step
optimizer.step()
# For plain SGD: new_weight = weight - (learning_rate * gradient)
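Putting the steps together, the following sketch runs the whole loop in pure Python on a made-up dataset that follows `y = 2x + 1`. The single-neuron model, the learning rate, and the epoch count are assumptions for illustration. The nested loops also show how the two counting terms differ: each weight update is one iteration, and each full pass over the dataset is one epoch.

```python
# Toy dataset following y = 2x + 1 (made up for illustration)
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

w, b = 0.0, 0.0
learning_rate = 0.05
iterations = 0

def mean_loss(w, b):
    """Average squared error over the whole dataset."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

initial_loss = mean_loss(w, b)

for epoch in range(200):                 # one epoch = one full pass over the data
    for x, y in data:                    # one iteration = one weight update
        pred = w * x + b                 # Step 1: guess (forward)
        error = pred - y                 # Step 2: check error; loss = error ** 2
        grad_w = 2.0 * error * x         # Step 3: correct (backward, chain rule)
        grad_b = 2.0 * error
        w -= learning_rate * grad_w      # Step 4: apply the gradients
        b -= learning_rate * grad_b
        iterations += 1

final_loss = mean_loss(w, b)
print(f"w={w:.3f}, b={b:.3f}, iterations={iterations}")
print(f"loss: {initial_loss:.3f} -> {final_loss:.6f}")
```

With four examples and 200 epochs the loop performs 800 iterations, and the learned `w` and `b` end up close to the true values of 2 and 1.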