Learn machine learning engineering from regression and classification to deployment and deep learning.
Join #course-ml-zoomcamp Channel on Slack • Telegram Announcements • Course Playlist • FAQ • Tweet about the Course
- How to Join
- Quick Start Checklist
- What This Course Is About
- Prerequisites
- Syllabus
- Community & Getting Help
- Certificates
- Sponsors
- About DataTalks.Club
- Starts: September 15, 2025
- Duration: 4 months
- Time commitment: ~10 hours per week for coursework and projects
- What's included:
  - Structured learning path with deadlines
  - Peer interaction and community support
  - Opportunity to earn a certificate
- Register: Sign up here
- Calendar: Subscribe to updates
All materials are freely available on GitHub. You can:
- Follow along with the syllabus below
- Complete homework at your own pace (solutions included)
- Work on projects to practice what you learn
Note: Self-paced learning gives you access to all course materials and recordings, but you need to join a live cohort to earn a certificate.
- Browse this repository and star it (all course materials live here).
- Subscribe to DataTalks.Club on YouTube and review the course playlist.
- Read the frequently asked questions to save time later.
- Join the Slack course channel for discussions.
- Join the Telegram channel for announcements.
This is a practical course where you'll learn to build and deploy machine learning systems. We focus on the engineering side, from training models to running them in production.
You'll learn:
- Core ML algorithms and when to use them
- How to prepare data and engineer features
- Model evaluation and selection
- Deploying models with Flask, Docker, and cloud platforms
- Using Kubernetes for ML model serving
- MLOps practices
Technical setup: For machine learning modules, you only need a laptop with an internet connection. For deep learning sections, we'll use cloud resources (like Saturn Cloud) for more intensive computations.
You'll need:
- Prior programming experience (at least 1 year)
- Comfort with command line basics
You don't need any prior experience with machine learning. We'll start from the basics.
Learn the fundamentals: what ML is, when to use it, and how to approach ML problems using the CRISP-DM framework.
Topics:
- ML vs rule-based systems
- Supervised learning basics
- CRISP-DM methodology
- Model selection concepts
- Environment setup
Build a car price prediction model while learning linear regression, feature engineering, and regularization.
Topics:
- Linear regression (from scratch and with scikit-learn)
- Exploratory data analysis
- Feature engineering
- Regularization techniques
- Model validation
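To make the regression and regularization topics above concrete, here is a minimal sketch, not the course's implementation. It fits a single-feature linear model in closed form and applies an L2 penalty `r` to the slope only, which is a simplifying assumption. The function names and the toy car data are made up for illustration.

```python
def fit_ridge_1d(xs, ys, r=0.0):
    """Fit y ~ w0 + w * x by least squares with an L2 penalty r on the slope w."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums over centered data
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / (sxx + r)        # r > 0 shrinks the slope toward zero
    w0 = mean_y - w * mean_x   # intercept is left unpenalized
    return w0, w


def rmse(ys, preds):
    """Root mean squared error between targets and predictions."""
    return (sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)) ** 0.5


# Made-up toy data: car age in years -> price in thousands
ages = [1, 2, 3, 4, 5]
prices = [28.0, 26.0, 24.0, 22.0, 20.0]

w0, w = fit_ridge_1d(ages, prices)               # no regularization
w0_reg, w_reg = fit_ridge_1d(ages, prices, r=10.0)

preds = [w0 + w * a for a in ages]
print(w0, w, rmse(prices, preds))
print(w0_reg, w_reg)
```

With `r=0` the fit recovers the exact line behind the toy data. A positive `r` gives a flatter slope, which is the trade-off regularization makes to reduce overfitting.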
Create a customer churn prediction system using logistic regression and learn about feature selection.
Topics:
- Logistic regression
- Feature importance and selection
- Categorical variable encoding
- Model interpretation
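As a rough illustration of categorical encoding and logistic regression from the list above, the sketch below one-hot encodes a single categorical feature and turns a linear score into a probability with the sigmoid function. The category names, weights, and bias are hypothetical values chosen by hand, not the output of a trained model.

```python
import math


def one_hot(value, categories):
    """Encode a categorical value as a list of 0/1 indicator features."""
    return [1.0 if value == c else 0.0 for c in categories]


def sigmoid(z):
    """Map a real-valued score to the (0, 1) interval."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Logistic regression prediction: sigmoid of the linear score."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical categories and hand-picked weights (assumptions, not learned)
contract_types = ["monthly", "yearly"]
weights = [1.0, -1.0]
bias = 0.0

p_monthly = predict_proba(one_hot("monthly", contract_types), weights, bias)
p_yearly = predict_proba(one_hot("yearly", contract_types), weights, bias)
print(p_monthly, p_yearly)
```

The sign of each weight is what makes the model interpretable: in this toy setup a positive weight pushes the churn probability above 0.5 and a negative one pulls it below.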
Learn how to properly evaluate classification models and handle imbalanced datasets.
Topics:
- Accuracy, precision, recall, F1-score
- ROC curves and AUC
- Cross-validation
- Confusion matrices
- Class imbalance handling
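The metrics listed above can be computed directly from confusion-matrix counts. The following is a minimal sketch for binary labels (0 and 1). The function names and the toy labels are illustrative assumptions. The example is deliberately imbalanced to show why accuracy alone can mislead.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels encoded as 0 and 1."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1, guarding against division by zero."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up imbalanced example: 3 positives out of 10, only 1 of them detected
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Here accuracy is 0.8 even though the model misses two of the three positive cases, which recall (about 0.33) makes visible.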
Turn your models into web services and deploy them with Docker and cloud platforms.
Topics:
- Model serialization with Pickle
- Flask web services
- Docker containerization
- Cloud deployment (AWS)
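The serialization step above can be sketched with the standard-library `pickle` module. To keep the example self-contained, the "model" here is a plain dictionary of hand-picked weights, and `predict` is a hypothetical helper. In practice you would pickle a trained model object, and a web service would load it once at startup.

```python
import pickle

# Hypothetical "model": a bias plus per-feature weights (not a trained model)
model = {"bias": -1.0, "weights": {"tenure": -0.1, "monthly_charges": 0.05}}


def predict(model, customer):
    """Compute a linear score; features missing from the customer count as 0."""
    score = model["bias"]
    for name, weight in model["weights"].items():
        score += weight * customer.get(name, 0.0)
    return score


# Serialize to bytes (what would be written to a file such as a .bin artifact)
blob = pickle.dumps(model)

# Deserialize, as a separate serving process would do after reading the file
restored = pickle.loads(blob)

customer = {"tenure": 10, "monthly_charges": 60.0}
print(predict(model, customer), predict(restored, customer))
```

The restored object is an independent copy that produces the same predictions. Only unpickle data from sources you trust, because loading a pickle can execute arbitrary code.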
Learn tree-based models and ensemble methods for better predictions.
Topics:
- Decision trees
- Random Forest
- Gradient boosting (XGBoost)
- Hyperparameter tuning
- Feature importance
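A decision tree is built from simple threshold rules. As a minimal sketch of that building block, the code below fits a single-split "decision stump" on one numeric feature by trying every observed value as a threshold and keeping the one with the fewest misclassifications. The function name, the one-directional rule, and the toy data are simplifying assumptions. Real tree learners search over many features and often use other impurity measures.

```python
def fit_stump(xs, ys):
    """Find the threshold t with the fewest errors for: predict 1 if x > t else 0."""
    best_t, best_errors = None, len(ys) + 1
    for t in sorted(set(xs)):
        # Count examples where the rule's prediction disagrees with the label
        errors = sum((1 if x > t else 0) != y for x, y in zip(xs, ys))
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t, best_errors


def predict_stump(t, x):
    """Apply the learned rule to a single value."""
    return 1 if x > t else 0


# Made-up toy data: one numeric feature and a binary label
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]

threshold, errors = fit_stump(xs, ys)
print(threshold, errors)
```

Ensembles such as random forests and gradient boosting combine many trees built from splits like this one.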
Apply everything you've learned in a complete project: find a dataset, train models, and deploy a web service.
Introduction to neural networks using TensorFlow and Keras, including CNNs and transfer learning.
Topics:
- Neural network fundamentals
- TensorFlow & Keras
- Convolutional Neural Networks
- Transfer learning
- Model optimization
Deploy deep learning models using serverless technologies like AWS Lambda.
Topics:
- Serverless concepts
- AWS Lambda for ML
- TensorFlow Lite
- API Gateway
Learn to serve ML models at scale using Kubernetes and TensorFlow Serving.
Topics:
- Kubernetes basics
- TensorFlow Serving
- Model deployment and scaling
- Load balancing
Advanced model serving with KServe for production ML systems.
Choose a problem that interests you, find a suitable dataset, and develop your model. Deploy your model into a web service (local deployment or cloud deployment for bonus points).
- Slack: #course-ml-zoomcamp channel
- FAQ: Common questions and answers
- Study Groups: Connect with other learners
- Check the FAQ first
- Follow our question guidelines
- Be helpful and respectful
- Share your learning journey
We encourage sharing your progress! Write blog posts, create videos, or post on social media with #mlzoomcamp. Sharing helps you learn better and builds your professional network.
Bonus: You can earn extra points for sharing your learning experience publicly.
Learn more: Learning in Public
To receive a certificate, you'll need to:
- Join a live cohort (self-paced learners cannot earn certificates)
- Complete 2 out of 3 projects:
  - Midterm Project: Choose a problem that interests you, find a suitable dataset, and develop your model
  - Capstone Project: Complete either Capstone Project 1 or Capstone Project 2 (includes deploying a model as a web service)
- Review 3 peers' projects by the deadline
Important: Projects must be completed individually. You can still join after the course has started, even if you have missed some homework deadlines.
Ready to start? Join the 2025 cohort or start with Module 1
Thanks to our sponsors who make this course possible:
Interested in sponsoring? Contact [email protected].
DataTalks.Club is a global online community of data enthusiasts. It's a place to discuss data, learn, share knowledge, ask and answer questions, and support each other.
Website • Join Slack Community • Newsletter • Upcoming Events • Google Calendar • YouTube • GitHub • LinkedIn • Twitter
Most of the activity at DataTalks.Club happens on Slack. We post updates there and discuss different aspects of data, career questions, and more.