A table can only hold so much. When the state space grows too large to enumerate, we use deep learning to generalize across states and estimate future returns.
1. The Approximator
In classical RL, a Q-table is a discrete map from each state-action pair to a value. But in a game like Atari, the number of possible states (pixel combinations) is greater than the number of atoms in the universe, so we can never visit every state. Instead, we use a deep neural network as a function approximator. The network learns the underlying patterns of the environment, which lets it estimate Q-values for states it has never encountered.
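To make the contrast with a table concrete, here is a minimal sketch in plain Python. It assumes a state is already a small feature vector and uses a linear model as a stand-in for the deep network; the names `N_FEATURES`, `N_ACTIONS`, `theta`, and `q_values` are hypothetical and chosen only for illustration.

```python
import random

N_FEATURES = 4   # hypothetical size of the state feature vector
N_ACTIONS = 2    # hypothetical number of actions

random.seed(0)
# One weight vector per action; together these weights play the role of theta.
theta = [[random.uniform(-0.1, 0.1) for _ in range(N_FEATURES)]
         for _ in range(N_ACTIONS)]

def q_values(state, theta):
    """Return one Q-value estimate per action for a feature-vector state."""
    return [sum(w * x for w, x in zip(weights, state)) for weights in theta]

# A tabular agent has no entry for a state it never visited...
q_table = {}
unseen_state = (0.2, 0.7, 0.0, 1.0)
print(q_table.get(unseen_state))      # None

# ...while the approximator produces an estimate for any input.
print(q_values(unseen_state, theta))
```

The estimates are meaningless until the weights are trained, but the point is structural: the approximator stores a fixed number of weights rather than one entry per state.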
2. Training the Brain
Training a DQN is essentially a regression task. We want the network's output to match the Bellman target: $Y = R + \gamma \cdot \max_{a'} Q(s', a'; \theta)$. We use Mean Squared Error (MSE) to measure the difference between the network's current prediction $Q(s, a; \theta)$ and this target. Through backpropagation, we update the weights ($\theta$) of the network to minimize this error, gradually moving the 'brain' toward the optimal action-value function of the environment.
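The regression step can be sketched in plain Python for a single transition. This is an illustration under stated assumptions, not a full DQN: the Q-function is linear so the gradient can be written by hand instead of using backpropagation, the target is treated as a constant when differentiating, and the `done` flag, `GAMMA`, and `LEARNING_RATE` values are assumptions that do not appear in the text above.

```python
GAMMA = 0.99          # discount factor (assumed value)
LEARNING_RATE = 0.1   # gradient step size (assumed value)

def q_values(state, theta):
    """Linear stand-in for the network: one Q-value per action."""
    return [sum(w * x for w, x in zip(weights, state)) for weights in theta]

def bellman_target(reward, next_state, done, theta):
    """Y = R + gamma * max_a' Q(s', a'; theta); just R when the episode ends."""
    if done:
        return reward
    return reward + GAMMA * max(q_values(next_state, theta))

def train_step(theta, state, action, reward, next_state, done):
    """One gradient step on the squared error (Q(s, a) - Y)^2."""
    target = bellman_target(reward, next_state, done, theta)  # held constant
    prediction = q_values(state, theta)[action]
    error = prediction - target
    # For a linear model, d(error^2)/dw = 2 * error * x.
    theta[action] = [w - LEARNING_RATE * 2 * error * x
                     for w, x in zip(theta[action], state)]
    return error ** 2

theta = [[0.0, 0.0], [0.0, 0.0]]
transition = ([1.0, 0.0], 0, 1.0, [0.0, 1.0], True)  # (s, a, R, s', done)
losses = [train_step(theta, *transition) for _ in range(20)]
print(losses[0], losses[-1])  # the squared error shrinks with repeated updates
```

With a real network, the hand-written gradient line would be replaced by an automatic-differentiation library, but the target and loss are computed the same way.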
3. Generalization
The real strength of DQN is generalization. Because the neural network identifies features (like 'there is a ball' or 'the wall is close'), it can make sensible decisions in new situations. If the agent learns to dodge an obstacle in the middle of the screen, it can often dodge a similar obstacle on the left as well, even if it has never seen a state with pixels in those exact coordinates.
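One way to see why position need not matter is a toy feature detector. The sketch below assumes the network uses convolution-style features, which the text above does not state explicitly: a small filter slides across a one-dimensional row of pixels, and only the strongest response is kept. The `kernel` values and the `detect_obstacle` helper are hypothetical.

```python
def detect_obstacle(row, kernel):
    """Slide a kernel over a 1-D row of pixels and keep the strongest response."""
    width = len(kernel)
    responses = [sum(k * p for k, p in zip(kernel, row[i:i + width]))
                 for i in range(len(row) - width + 1)]
    return max(responses)

# Hypothetical learned filter that fires on a 3-pixel-wide obstacle.
kernel = [1.0, 1.0, 1.0]

obstacle_middle = [0, 0, 0, 1, 1, 1, 0, 0, 0]
obstacle_left   = [1, 1, 1, 0, 0, 0, 0, 0, 0]
empty_row       = [0, 0, 0, 0, 0, 0, 0, 0, 0]

print(detect_obstacle(obstacle_middle, kernel))  # 3.0
print(detect_obstacle(obstacle_left, kernel))    # 3.0
print(detect_obstacle(empty_row, kernel))        # 0.0
```

The two obstacle rows are different states pixel by pixel, yet they produce the same feature value, so anything the agent learned about "obstacle present" carries over from one position to the other.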
