The ML Lifecycle: From Notebooks to Production
ML Infrastructure Architecture
"A model that only runs on your local machine brings zero value to the business. MLOps is the discipline of making Machine Learning predictable, scalable, and reliable."
Data Preparation & Versioning
Traditional software version control (Git) handles code perfectly, but it breaks down when dealing with 10GB datasets. In MLOps, we use tools like DVC (Data Version Control) to track datasets alongside code. If a model behaves strangely in production, you must be able to roll back to the exact code and the exact data used to train it.
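The trick DVC uses is to commit a small hash pointer to Git while the heavy data lives in remote storage. A minimal sketch of that idea in plain Python (this is the pattern, not DVC's actual API; `snapshot_dataset` and the registry format are illustrative):

```python
import hashlib
from pathlib import Path

def snapshot_dataset(path: str, registry: dict) -> str:
    """Hash a dataset file and record a lightweight pointer to it,
    similar in spirit to the .dvc files that get committed to Git."""
    data = Path(path).read_bytes()
    digest = hashlib.md5(data).hexdigest()
    registry[path] = {"md5": digest, "size": len(data)}
    return digest

registry = {}  # the small pointer file is what Git versions

Path("train.csv").write_text("id,label\n1,0\n2,1\n")
version_1 = snapshot_dataset("train.csv", registry)

Path("train.csv").write_text("id,label\n1,0\n2,1\n3,0\n")  # data changed
version_2 = snapshot_dataset("train.csv", registry)

print(version_1 != version_2)  # True: the data change is detected
```

Rolling back then means checking out an old pointer and fetching the matching blob from storage by its hash.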
Model Training & Registry
Data scientists experiment with dozens of algorithms and hyperparameters. Without tracking (e.g., MLflow, Weights & Biases), knowing which parameters produced the best accuracy is impossible. Once a model is finalized, it's stored in a Model Registry—a centralized repository treating ML models as deployable artifacts.
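What a tracking server like MLflow gives you is, at its core, a queryable log of runs. A toy in-memory version of that pattern (class and method names here are hypothetical, not MLflow's API):

```python
import uuid

class ExperimentTracker:
    """Toy stand-in for an MLflow-style tracking backend."""

    def __init__(self):
        self.runs = {}

    def log_run(self, params: dict, metrics: dict) -> str:
        """Record one training run: its hyperparameters and results."""
        run_id = uuid.uuid4().hex
        self.runs[run_id] = {"params": params, "metrics": metrics}
        return run_id

    def best_run(self, metric: str) -> dict:
        """Answer the question tracking exists for: which run won?"""
        return max(self.runs.values(), key=lambda r: r["metrics"][metric])

tracker = ExperimentTracker()
tracker.log_run({"lr": 0.1, "max_depth": 3}, {"accuracy": 0.87})
tracker.log_run({"lr": 0.01, "max_depth": 5}, {"accuracy": 0.91})

print(tracker.best_run("accuracy")["params"])  # {'lr': 0.01, 'max_depth': 5}
```

A Model Registry adds one more step on top of this: the winning run's artifact is promoted to a named, versioned entry (e.g. "fraud-model v3, stage: Production") that deployment tooling can pull by name.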
Serving & Monitoring
Deployment often involves wrapping the model in an API (FastAPI) and containerizing it with Docker. But the lifecycle doesn't end there. In the real world, data changes (Data Drift) and model performance degrades (Concept Drift). Continuous Monitoring using Prometheus and Grafana ensures we know exactly when a model needs retraining.
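Stripped of the framework, a serving endpoint does three things: parse the request, score it with the loaded model, and return a response. A framework-free sketch of that handler logic (the linear scorer and feature names are invented for illustration; a real service would deserialize a trained artifact from the registry):

```python
import json

# Assumption: a trivial linear model standing in for the real artifact.
WEIGHTS = {"amount": 0.8, "age_days": -0.2}

def predict_handler(request_body: str) -> str:
    """The work a FastAPI endpoint would do per request:
    deserialize input, apply the model, serialize the prediction."""
    features = json.loads(request_body)
    score = sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return json.dumps({"fraud_score": round(score, 3)})

print(predict_handler('{"amount": 1.0, "age_days": 2.0}'))  # {"fraud_score": 0.4}
```

In production this function body sits behind a FastAPI route inside a Docker container, and each request/response pair is also logged so the monitoring stack has data to detect drift from.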
⚙️ MLOps Frequently Asked Questions
What is the Machine Learning Lifecycle?
The Machine Learning Lifecycle is the iterative process of taking an ML project from conception to production. It consists of Scoping, Data Prep (Ingestion/Cleaning), Modeling (Training/Evaluation), Deployment (Serving), and Monitoring. It is highly cyclical; monitoring feeds back into data prep for retraining.
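The cyclical shape of those stages can be sketched as a chain of functions whose last step decides whether to loop back (all names and thresholds here are illustrative stubs, not a real pipeline framework):

```python
def prepare_data(raw):
    """Data Prep: ingestion and cleaning."""
    return [x for x in raw if x is not None]

def train_and_evaluate(clean):
    """Modeling: training and evaluation of candidates."""
    return {"name": "v1", "trained_on": len(clean)}

def deploy(model):
    """Deployment: expose the chosen artifact for serving."""
    return f"serving {model['name']}"

def monitoring_detects_drift(live_accuracy, threshold=0.8):
    """Monitoring: watch live quality against an acceptance bar."""
    return live_accuracy < threshold

# One pass through the cycle; drift sends us back to Data Prep.
model = train_and_evaluate(prepare_data([1, None, 2, 3]))
endpoint = deploy(model)
status = "retrain" if monitoring_detects_drift(0.72) else "stable"
print(endpoint, "->", status)  # serving v1 -> retrain
```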
Why is MLOps necessary compared to standard DevOps?
Standard DevOps focuses on versioning code. MLOps is far more complex because it must version three dimensions: Code, Data, and Models. Furthermore, code doesn't typically degrade on its own, whereas ML models naturally degrade over time as real-world data distributions drift away from the training data.
What is Data Drift vs Concept Drift?
Data Drift happens when the statistical properties of the input data (features) change (e.g., users get older).
Concept Drift happens when the relationship between input features and the target variable changes (e.g., the definition of a "fraudulent transaction" evolves because hackers change tactics). Both require model retraining.
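The simplest data-drift check compares a summary statistic of a live feature window against the training baseline. Production systems use proper statistical tests (e.g. Kolmogorov-Smirnov or the Population Stability Index); the relative-mean-shift rule and tolerance below are a toy illustration:

```python
def mean_shift_drift(baseline: list, live: list, tolerance: float = 0.25) -> bool:
    """Flag data drift when the live feature mean moves more than
    `tolerance` (as a fraction) away from the training-time mean."""
    base_mean = sum(baseline) / len(baseline)
    live_mean = sum(live) / len(live)
    return abs(live_mean - base_mean) / abs(base_mean) > tolerance

train_ages = [30, 32, 35, 31, 33]  # feature distribution at training time
live_ages = [44, 46, 45, 47, 43]   # the user base got older: inputs drifted

print(mean_shift_drift(train_ages, live_ages))  # True
```

Note that a check like this only sees the inputs, so it catches Data Drift; detecting Concept Drift additionally requires comparing predictions against ground-truth labels as they arrive.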