1. The state_dict
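A neural network might have 50 million parameters (weights and biases). PyTorch exposes all of them through model.state_dict(), which returns an ordered Python dictionary (an OrderedDict). Each key is a parameter's full name, built from the layer's attribute name plus the parameter name (e.g., layer1.weight), and each value is the PyTorch Tensor holding the actual numbers. The state_dict also contains persistent buffers, such as the running mean and variance of BatchNorm layers.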
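To make the key naming concrete, here is a minimal sketch that prints the state_dict of a toy model. TinyNet and its layer sizes are hypothetical and chosen only for illustration.

```python
import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical two-layer network, used only to show the key names."""

    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(4, 8)
        self.layer2 = nn.Linear(8, 2)

    def forward(self, x):
        return self.layer2(torch.relu(self.layer1(x)))


model = TinyNet()
state = model.state_dict()

# Each key is "<attribute name>.<parameter name>"; each value is a Tensor.
for name, tensor in state.items():
    print(name, tuple(tensor.shape))

keys = list(state.keys())
n_params = sum(t.numel() for t in state.values())
print("total parameters:", n_params)
```

The keys follow the attribute names assigned in __init__, which is why renaming a layer in the class later changes the keys the model expects when loading.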
2. The Golden Rule of Saving
Technically, you can save the entire PyTorch model (architecture + weights) using torch.save(model, 'model.pt'). Avoid this in production. Pickling the whole model does not store the class code itself; it stores a reference to the class and to the module path where that class was defined. If you later rename, move, or refactor the source file, loading the saved model breaks. Always save ONLY the weights: torch.save(model.state_dict(), 'weights.pth').
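A minimal sketch of the weights-only save. The nn.Linear stand-in model and the temporary directory are assumptions made so the example is self-contained; in a real project you would pass your own model and path.

```python
import os
import tempfile

import torch
from torch import nn

# Stand-in for a trained model (hypothetical; any nn.Module works the same way).
model = nn.Linear(4, 2)

# Write to a temporary directory so the example leaves no stray files behind.
path = os.path.join(tempfile.mkdtemp(), "weights.pth")

# Save only the state_dict (names -> tensors), not the model object itself.
torch.save(model.state_dict(), path)

saved_ok = os.path.exists(path)
print("saved:", saved_ok)
```

The resulting file holds only parameter names and tensors, so it stays loadable as long as the class you load it into defines layers with matching names and shapes.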
3. Loading the Model
Because you saved only the weights, loading takes three steps. First, the model's Python class must be available so that you can instantiate it: model = CustomModel(). Second, load the weights into that instance: model.load_state_dict(torch.load('weights.pth')). Third, when loading for inference, ALWAYS call model.eval() immediately afterwards; it switches layers such as Dropout and BatchNorm into evaluation mode.
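The loading steps can be sketched as follows. The body of CustomModel is hypothetical (the article does not define it), and the first few lines only simulate a previously saved weights file. Passing map_location="cpu" is an optional addition that lets weights saved on a GPU load on a CPU-only machine.

```python
import os
import tempfile

import torch
from torch import nn


class CustomModel(nn.Module):
    """Hypothetical architecture; it must match the one the weights came from."""

    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(4, 8)
        self.dropout = nn.Dropout(p=0.5)
        self.layer2 = nn.Linear(8, 2)

    def forward(self, x):
        return self.layer2(self.dropout(torch.relu(self.layer1(x))))


# Simulate an earlier training run that saved its weights.
trained = CustomModel()
path = os.path.join(tempfile.mkdtemp(), "weights.pth")
torch.save(trained.state_dict(), path)

# Step 1: instantiate the architecture.
model = CustomModel()
# Step 2: load the saved tensors into it.
model.load_state_dict(torch.load(path, map_location="cpu"))
# Step 3: switch to evaluation mode so Dropout is disabled.
model.eval()

x = torch.randn(3, 4)
with torch.no_grad():
    out_a = model(x)
    out_b = model(x)

# In eval mode, repeated forward passes on the same input agree.
print(torch.equal(out_a, out_b))
```

Without the model.eval() call, Dropout would stay active and the two forward passes would generally differ.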
Frequently Asked Questions
Can I pause training and resume it tomorrow?
Yes, but to resume training exactly where you left off, you must save the model's `state_dict` AND the optimizer's `state_dict` (which holds internal state such as momentum buffers). Save them together in a Python dictionary as a 'checkpoint', usually along with the current epoch number.
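A minimal checkpoint sketch under stated assumptions: the nn.Linear model, the SGD settings, the epoch number, and the dictionary key names ("epoch", "model_state_dict", "optimizer_state_dict") are all illustrative choices, not a required format.

```python
import os
import tempfile

import torch
from torch import nn

model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# Run one training step so the optimizer actually holds momentum buffers.
loss = model(torch.randn(8, 4)).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()

# Bundle everything needed to resume into one dictionary.
checkpoint = {
    "epoch": 5,  # hypothetical: the last completed epoch
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
}
path = os.path.join(tempfile.mkdtemp(), "checkpoint.pth")
torch.save(checkpoint, path)

# Later: rebuild the same objects, then restore their state.
resumed_model = nn.Linear(4, 2)
resumed_optimizer = torch.optim.SGD(resumed_model.parameters(), lr=0.1, momentum=0.9)

restored = torch.load(path, map_location="cpu")
resumed_model.load_state_dict(restored["model_state_dict"])
resumed_optimizer.load_state_dict(restored["optimizer_state_dict"])
start_epoch = restored["epoch"] + 1

resumed_model.train()  # back to training mode, not eval()
```

When resuming training, call train() rather than eval(), since layers like Dropout should be active again.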
What is the difference between .pt, .pth, and .weights?
Nothing. They are just file extensions. `torch.save` uses Python's built-in `pickle` module under the hood. `.pth` is the most common convention in the PyTorch community.
