1. Early Stopping
Never guess how many epochs your model needs. A common approach is to set a generous upper bound (for example, epochs=10000) and let the EarlyStopping callback decide when to stop. By default it monitors val_loss. Once validation loss has gone a configurable number of epochs (the patience) without improving, which usually means the model has stopped learning real patterns and started memorizing the training data (overfitting), EarlyStopping ends training, saving you hours of wasted GPU compute.
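As a rough illustration of the pattern described above, here is a minimal sketch. The synthetic data, the tiny model, the patience of 5, and the upper bound of 500 epochs are all assumptions chosen so the example runs quickly; they are not recommendations.

```python
import numpy as np
import keras

# Synthetic, purely illustrative data: random features with random labels,
# so validation loss stops improving almost immediately.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 8)).astype("float32")
y = rng.integers(0, 2, size=(256, 1)).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(8,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss",  # quantity to watch (this is also the default)
    patience=5,          # epochs without improvement before training stops
)

history = model.fit(
    x, y,
    validation_split=0.25,   # EarlyStopping needs validation data to see val_loss
    epochs=500,              # generous upper bound; the callback usually stops sooner
    callbacks=[early_stop],
    verbose=0,
)

epochs_run = len(history.history["val_loss"])
print(f"Training stopped after {epochs_run} epochs")
```

The exact stopping epoch depends on weight initialization, so it will vary from run to run.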
2. Checkpoints
GPUs overheat. Cloud servers reboot. If your script crashes at epoch 499 out of 500 and nothing has been saved, you lose everything. By default, ModelCheckpoint saves a .keras or .h5 file to disk at the end of every epoch. With save_best_only=True and monitor='val_loss', it overwrites that file only when validation loss improves, so the file on disk always holds the best version of the model seen so far (as measured by validation loss) rather than a later, overfitted one.
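A minimal configuration sketch for the setup described above. The file name `best_model.keras` is a hypothetical choice, and the commented-out lines assume a compiled `model` and training/validation arrays that are not defined here.

```python
import keras

checkpoint = keras.callbacks.ModelCheckpoint(
    filepath="best_model.keras",  # hypothetical path; .keras is the native Keras format
    monitor="val_loss",           # metric used to decide what counts as "best"
    save_best_only=True,          # overwrite only when val_loss improves
)

# Assumed usage, given an existing compiled model and data:
# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=500, callbacks=[checkpoint])
#
# After a crash or a finished run, the best saved model can be reloaded:
# best_model = keras.models.load_model("best_model.keras")
```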
3. ReduceLROnPlateau
Sometimes a model isn't done learning; it's just taking steps that are too big to get into the final, deepest part of the mathematical valley. The ReduceLROnPlateau callback watches the validation loss. If it stops improving for a set number of epochs, instead of ending the training, the callback multiplies the learning rate by a configurable factor (0.1 by default; factor=0.5 cuts it in half), allowing the optimizer to take smaller, more precise steps.
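A short sketch of a ReduceLROnPlateau configuration that halves the learning rate. The patience of 3, the floor of 1e-6, and the starting rate of 1e-3 are illustrative assumptions. The loop afterwards is plain arithmetic showing what three successive reductions would do; it does not call into Keras training.

```python
import keras

reduce_lr = keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss",
    factor=0.5,    # new_lr = old_lr * factor, so 0.5 halves the rate
    patience=3,    # epochs without improvement before each reduction
    min_lr=1e-6,   # the rate is never reduced below this floor
)

# Illustrative arithmetic: effect of three reductions from a starting rate of 1e-3.
lr = 1e-3
for _ in range(3):
    lr = max(lr * reduce_lr.factor, reduce_lr.min_lr)
print(lr)  # 1e-3 -> 5e-4 -> 2.5e-4 -> 1.25e-4
```

The callback is passed to `model.fit(..., callbacks=[reduce_lr])` like any other, and is often combined with EarlyStopping using a larger patience so the lower rate gets a chance to help before training ends.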
Frequently Asked Questions
Can I write my own Callback?
Yes. You can subclass `keras.callbacks.Callback` and override methods like `on_epoch_end()`. For example, you could write a custom callback that sends you a Slack or Discord message with the accuracy score at the end of every epoch.
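A minimal sketch of such a callback. The class name `AccuracyNotifier` and its `send` parameter are hypothetical; `send` can be any function that accepts a message string, for example a wrapper you write around a Slack or Discord webhook. It defaults to `print` so the example runs without any external service.

```python
import keras

class AccuracyNotifier(keras.callbacks.Callback):
    """Hypothetical callback that reports accuracy at the end of every epoch."""

    def __init__(self, send=print):
        super().__init__()
        self.send = send  # any callable that takes a single message string

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        acc = logs.get("accuracy")  # present only if the model tracks "accuracy"
        if acc is not None:
            # Keras passes zero-based epoch indices, so add 1 for display.
            self.send(f"Epoch {epoch + 1}: accuracy={acc:.4f}")

# Assumed usage with a model compiled with metrics=["accuracy"]:
# model.fit(x, y, epochs=20, callbacks=[AccuracyNotifier()])
```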
What is `restore_best_weights`?
By default, when EarlyStopping ends the training (e.g., at epoch 55 with patience 5, after the last improvement at epoch 50), the model retains the weights of the final step (epoch 55). But by epoch 55 the model has likely started to overfit! Setting `restore_best_weights=True` makes Keras automatically rewind the model's weights to the best state it saw, here epoch 50.
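To make the epoch 50 versus epoch 55 example concrete, here is a simplified pure-Python model of the patience logic. It is a sketch, not Keras's actual implementation: the function `simulate_early_stopping` and the validation-loss curve are made up for illustration.

```python
def simulate_early_stopping(val_losses, patience):
    """Return (best_epoch, stop_epoch), both 1-based.

    stop_epoch is None if the losses run out before patience is exhausted.
    """
    best_loss = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # Improvement: remember this epoch and reset the counter.
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return best_epoch, epoch
    return best_epoch, None

# Made-up curve: loss falls through epoch 50, then creeps upward.
val_losses = [1.0 - 0.01 * e for e in range(1, 51)]
val_losses += [0.5 + 0.01 * k for k in range(1, 11)]

best_epoch, stop_epoch = simulate_early_stopping(val_losses, patience=5)
print(best_epoch, stop_epoch)  # → 50 55
```

With `restore_best_weights=True`, the weights you end up with correspond to `best_epoch`; without it, they correspond to `stop_epoch`.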
