ML Pipeline Workflow
End-to-end MLOps pipeline orchestration, from data preparation through model deployment.
Overview
This skill provides guidance for building production ML pipelines that cover the full lifecycle: data ingestion → preparation → training → validation → deployment → monitoring.
When to Use This Skill
Building new ML pipelines from scratch
Designing workflow orchestration for ML systems
Implementing data → model → deployment automation
Setting up reproducible training workflows
Creating DAG-based ML orchestration
Integrating ML components into production systems
What This Skill Provides
Core Capabilities
Pipeline Architecture
End-to-end workflow design
DAG orchestration patterns (Airflow, Dagster, Kubeflow)
Component dependencies and data flow
Error handling and retry strategies
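To make the dependency and retry ideas above concrete, here is a minimal, framework-agnostic sketch using only the Python standard library (Python 3.9+ for graphlib). It is not the Airflow, Dagster, or Kubeflow API. The stage names, run_with_retry, and run_pipeline are hypothetical and chosen for illustration only.

```python
import time
from graphlib import TopologicalSorter


def run_with_retry(task, retries=3, base_delay=0.0):
    """Call task(); on failure, retry up to `retries` times with exponential backoff."""
    for attempt in range(retries + 1):
        try:
            return task()
        except Exception:
            if attempt == retries:
                raise  # retries exhausted: surface the failure to the caller
            time.sleep(base_delay * (2 ** attempt))


def run_pipeline(tasks, dependencies, retries=3, base_delay=0.0):
    """Run tasks in dependency order.

    `dependencies` maps a task name to the set of task names it depends on.
    """
    order = list(TopologicalSorter(dependencies).static_order())
    results = {}
    for name in order:
        results[name] = run_with_retry(tasks[name], retries, base_delay)
    return order, results


# Hypothetical stages; the training step fails twice before succeeding.
calls = {"train": 0}


def flaky_train():
    calls["train"] += 1
    if calls["train"] < 3:
        raise RuntimeError("transient failure")
    return "model-v1"


tasks = {
    "ingest": lambda: "raw",
    "prepare": lambda: "features",
    "train": flaky_train,
    "validate": lambda: "ok",
    "deploy": lambda: "live",
}
dependencies = {
    "prepare": {"ingest"},
    "train": {"prepare"},
    "validate": {"train"},
    "deploy": {"validate"},
}
order, results = run_pipeline(tasks, dependencies)
```

A real orchestrator adds scheduling, persistence, and parallel execution on top of this; the sketch only shows how a dependency graph fixes the execution order and how a transient failure can be absorbed by retries.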
Data Preparation
Data validation and quality checks
Feature engineering pipelines
Data versioning and lineage
Train/validation/test splitting strategies
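One possible splitting strategy that also supports reproducibility is to assign each record to a split by hashing a stable identifier, so the assignment does not change between runs or as new data arrives. The sketch below assumes every record has a string ID; assign_split, the fractions, and the user-N IDs are illustrative, not part of this skill's assets.

```python
import hashlib


def assign_split(record_id: str, val_frac: float = 0.1, test_frac: float = 0.1) -> str:
    """Deterministically map a record ID to 'train', 'validation', or 'test'."""
    digest = hashlib.sha256(record_id.encode("utf-8")).hexdigest()
    # Use the first 32 bits of the hash as a pseudo-uniform value in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    if bucket < test_frac:
        return "test"
    if bucket < test_frac + val_frac:
        return "validation"
    return "train"


# Group hypothetical record IDs by their assigned split.
splits = {}
for i in range(1000):
    splits.setdefault(assign_split(f"user-{i}"), []).append(i)
```

Because the split depends only on the ID, the same record always lands in the same set. Split sizes are approximate rather than exact, and this approach does not stratify by label or respect time ordering.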
Model Training
Training job orchestration
Hyperparameter management
Experiment tracking integration
Distributed training patterns
Model Validation
Validation frameworks and metrics
A/B testing infrastructure
Performance regression detection
Model comparison workflows
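Performance regression detection can be as simple as comparing a candidate model's metrics against a stored baseline with a tolerance. The following is a minimal sketch under that assumption; find_regressions, the metric names, and the values are made up for illustration.

```python
def find_regressions(baseline, candidate, tolerance=0.01, lower_is_better=()):
    """Return metrics where the candidate is worse than the baseline
    by more than `tolerance`, measured as a relative change."""
    regressions = {}
    for name, base_value in baseline.items():
        new_value = candidate[name]
        scale = abs(base_value) or 1.0  # avoid dividing by zero
        change = (new_value - base_value) / scale
        if name in lower_is_better:
            change = -change  # for latency-like metrics, an increase is a loss
        if change < -tolerance:
            regressions[name] = (base_value, new_value)
    return regressions


# Illustrative numbers only.
baseline = {"accuracy": 0.91, "f1": 0.88, "latency_ms": 40.0}
candidate = {"accuracy": 0.92, "f1": 0.85, "latency_ms": 40.2}
regressions = find_regressions(
    baseline, candidate, tolerance=0.01, lower_is_better={"latency_ms"}
)
```

Here only f1 is flagged: it drops by roughly 3.4%, while the latency increase of 0.5% stays within tolerance. A pipeline could fail the validation stage whenever the returned dictionary is non-empty.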
Deployment Automation
Model serving patterns
Canary deployments
Blue-green deployment strategies
Rollback mechanisms
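As a rough illustration of how a canary deployment and its rollback check fit together, the sketch below routes a fixed percentage of requests to the canary and then compares error rates. The function names, thresholds, and request counts are assumptions for this example; real serving platforms provide their own traffic-splitting and health-check mechanisms.

```python
import hashlib


def route(request_id: str, canary_percent: int) -> str:
    """Send roughly `canary_percent`% of requests to the canary, stably per request ID."""
    bucket = int(hashlib.sha256(request_id.encode("utf-8")).hexdigest(), 16) % 100
    return "canary" if bucket < canary_percent else "stable"


def canary_decision(stable_errors, stable_total, canary_errors, canary_total,
                    max_excess=0.01, min_requests=100):
    """Return 'wait', 'rollback', or 'promote' based on observed error rates."""
    if canary_total < min_requests:
        return "wait"  # not enough canary traffic to judge yet
    stable_rate = stable_errors / stable_total if stable_total else 0.0
    canary_rate = canary_errors / canary_total
    # Roll back when the canary's error rate exceeds the stable rate by more than max_excess.
    return "rollback" if canary_rate - stable_rate > max_excess else "promote"


target = route("request-123", canary_percent=10)
decision = canary_decision(stable_errors=10, stable_total=1000,
                           canary_errors=30, canary_total=500)
```

With the numbers above, the canary's 6% error rate exceeds the stable 1% by more than the allowed margin, so the decision is a rollback. A blue-green strategy differs in that traffic switches all at once between two full environments, with rollback being a switch back.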
Reference Documentation
See the references/ directory for detailed guides:
data-preparation.md - Data cleaning, validation, and feature engineering
model-training.md - Training workflows and best practices
model-validation.md - Validation strategies and metrics
model-deployment.md - Deployment patterns and serving architectures
Assets and Templates
The assets/ directory contains:
pipeline-dag.yaml.template - DAG template for workflow orchestration
training-config.yaml - Training configuration template