Learn the fundamentals of machine learning to help you correctly apply various classification and regression machine learning algorithms to real-life problems.
Machine learning classification and regression techniques have potential uses in various engineering disciplines. Given sensor data, these models predict either a category (classification) or a number (regression). For example, they can be used to predict properties of an object, such as its weight or shape.
You will get insight into:
Machine learning and its variants, such as supervised learning, semi-supervised learning, unsupervised learning and reinforcement learning.
Regression techniques such as linear regression and K-nearest neighbor regression, how to deal with outliers, and evaluation metrics such as the mean squared error (MSE) and mean absolute error (MAE).
Classification techniques such as the histogram method, the nearest mean (or nearest medoid) method and the nearest neighbor classifier. We cover the classification setting and important concepts such as the Bayes classifier, which is the optimal classifier in theory, and its error rate, the Bayes error.
Training models using (stochastic) gradient descent and its variants. We learn how to tune this optimizer and how to use it to construct a logistic regression classification model.
Overfitting means a model works well on the training set but not on unseen test data. We discuss how to build complex non-linear models, and we analyze how overfitting can be understood using the bias-variance decomposition and the curse of dimensionality. Finally, we discuss how to evaluate and tune machine learning models fairly, and how to estimate how much data they need to perform well.
Regularization methods can help to mitigate overfitting. We discuss two regularization techniques for estimating the linear regression coefficients: ridge regression and LASSO. The latter can also be used for variable selection.
Classifier evaluation tools such as the ROC curve and the confusion matrix can give more insight into the performance of classifiers. We also discuss what constitutes a “good” accuracy, which is judged against so-called dummy classifiers: naïve baselines that any useful model should beat.
Support Vector Machines (SVMs) are more advanced classification models that can provide good performance even in high-dimensional spaces and with little data. We discuss their different variants such as the soft-margin SVM, the hard-margin SVM and the nonlinear kernel SVM.
Decision Trees are simple models that lay people can easily understand. They are easy to use and visualize, and unlike black-box models they are interpretable white-box models, which makes them suitable for various applications.
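To illustrate two of the topics above, dummy baselines and decision trees, the following minimal sketch compares a shallow decision tree with a naive baseline and prints the learned tree as readable rules. It is not taken from the course material: the Iris dataset, the 70/30 split, the depth limit of 2 and all variable names are assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Assumption: Iris is used only as a small, built-in example dataset.
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0, stratify=iris.target
)

# Naive baseline: always predict the most frequent training class.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)

# A shallow tree keeps the model small enough to read as a set of rules.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_train, y_train)

baseline_acc = baseline.score(X_test, y_test)
tree_acc = tree.score(X_test, y_test)

# export_text renders the fitted tree as nested if/else style rules.
rules = export_text(tree, feature_names=iris.feature_names)

print(f"Baseline accuracy: {baseline_acc:.2f}")
print(f"Tree accuracy:     {tree_acc:.2f}")
print(rules)
```

An accuracy is only "good" relative to the baseline: a tree that barely beats the dummy classifier has learned little, however high the number looks.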
What You’ll Learn
- Apply common operations (pre-processing, plotting, etc.) to datasets using Python.
- Explain the concepts of supervised, semi-supervised and unsupervised machine learning, and of reinforcement learning.
- Explain how various supervised learning models work and recognize their limitations.
- Analyze which factors impact the performance of learning algorithms.
- Apply learning algorithms to datasets using Python and Scikit-learn and evaluate their performance.
- Optimize a machine learning pipeline using Python and Scikit-learn.
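As a rough idea of what optimizing a machine learning pipeline with Scikit-learn can look like, the sketch below tunes the regularization strength of a ridge regression with cross-validation and then reports the MSE and MAE on held-out data. The diabetes dataset, the grid of `alpha` values, the 5-fold setting and the step names `scale` and `model` are assumptions made for this example, not part of the course.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: a small built-in regression dataset stands in for real sensor data.
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Pre-processing and model are chained so that scaling is re-fit inside
# every cross-validation fold, using the training part of that fold only.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", Ridge()),
])

# Hypothetical grid of regularization strengths to try.
param_grid = {"model__alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}

search = GridSearchCV(
    pipeline, param_grid, cv=5, scoring="neg_mean_squared_error"
)
search.fit(X_train, y_train)

best_alpha = search.best_params_["model__alpha"]

# The test set is touched only once, after tuning is finished.
y_pred = search.predict(X_test)
test_mse = mean_squared_error(y_test, y_pred)
test_mae = mean_absolute_error(y_test, y_pred)

print(f"Best alpha: {best_alpha}")
print(f"Test MSE:   {test_mse:.1f}")
print(f"Test MAE:   {test_mae:.1f}")
```

Keeping the test set out of the grid search is what makes the final evaluation fair: the hyperparameter is chosen on cross-validation folds of the training data only.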
Subjects
- Module 00. Welcome to Supervised Machine Learning!
- Module 01. Introduction to Supervised Machine Learning
- Module 02. Regression
- Module 03. Classification
- Module 04. Training Models
- Module 05. Overfitting
- Module 06. Cross Validation & Regularization
- Module 07. Classifier Evaluation
- Module 08. Support Vector Machines (SVMs)
- Module 09. Decision Trees
- Module 10. Wrap-up
AI skills for Engineers: Supervised Machine Learning by TU Delft OpenCourseWare is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Based on a work at https://online-learning.tudelft.nl/courses/ai-skills-for-engineers-supervised-machine-learning/