Skip to 0 minutes and 14 seconds In addition to the three basic types of machine learning, there are many other types, including semi-supervised learning, self-supervised learning, one-shot learning, and zero-shot learning. Due to the time limit, we won't go into the details here. The data type is important for selecting algorithms.
Skip to 0 minutes and 33 seconds Data can be classified into four types: nominal, ordinal, interval, and ratio. Nominal and ordinal data are discrete, while interval and ratio data are continuous. Discrete data can't be measured, but they can be counted. Continuous data can't be counted, but they can be measured. Nominal data are labeling variables without any quantitative value, so they can simply be called labels. Nominal data are often encoded using one-hot encoding. For example, your gender or the language you speak can be represented by nominal data. Ordinal data are discrete like nominal data, but the order matters. One example is your educational background. Interval scales are continuous scales in which we know both the order and the exact differences between the values.
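The difference between encoding nominal and ordinal data can be sketched in a few lines of Python. This is only an illustration, not something shown in the video: the category lists and the helper names one_hot and ordinal_rank are assumptions made up for this example.

```python
# Hypothetical category lists, chosen only for illustration.
LANGUAGES = ["English", "Mandarin", "Spanish"]                   # nominal: no order
EDUCATION = ["high school", "bachelor", "master", "doctorate"]   # ordinal: ordered


def one_hot(value, categories):
    """Return a list with a 1 at the position of `value` and 0 elsewhere."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == category else 0 for category in categories]


def ordinal_rank(value, ordered_categories):
    """Return the position of `value` in the ordered list (0 = lowest)."""
    return ordered_categories.index(value)


language_vector = one_hot("Mandarin", LANGUAGES)
education_rank = ordinal_rank("master", EDUCATION)

print(language_vector)  # [0, 1, 0]
print(education_rank)   # 2
```

The one-hot vector carries no notion of "greater than", which suits nominal labels, while the integer rank preserves the order that ordinal data depend on.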
Skip to 1 minute and 28 seconds The classic example of an interval scale is Celsius temperature, because the difference between each degree is the same. But the problem with interval data is that they don't have a true zero. Ratio data are interval data with an absolute zero. Good examples of ratio variables include height, weight, and duration. Now we know the difference between the four types of data. Before starting to introduce algorithms, I want to talk about the metrics first. The metrics are used to evaluate the performance of our models. You can only really understand your model if you know the meanings behind your metrics. Here is a confusion matrix from Wikipedia. The confusion matrix is also known as an error matrix.
Skip to 2 minutes and 14 seconds The matrix is used to visualize whether the model is confusing two classes: the positive class and the negative class. The rows represent the predicted class, while the columns represent the labels, which are called the true condition here. The most important part of this matrix is that there are two types of errors. The first is called a false positive, also called a type I error, which means the model misclassifies a negative example as positive; the other is a false negative, or type II error, which means the model misclassifies a positive example as negative. Keep those two types of errors in mind. Based on those two types of errors, there are many different metrics.
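As a rough sketch of how the four cells of a binary confusion matrix are counted, the following Python snippet compares true labels with predictions. The function name confusion_counts and the sample labels are assumptions invented for this illustration; they do not come from the video.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count the four cells of a binary confusion matrix."""
    tp = fp = fn = tn = 0
    for truth, prediction in zip(y_true, y_pred):
        if prediction == positive and truth == positive:
            tp += 1   # true positive
        elif prediction == positive:
            fp += 1   # false positive: type I error
        elif truth == positive:
            fn += 1   # false negative: type II error
        else:
            tn += 1   # true negative
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Made-up labels: 1 is the positive class, 0 the negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
print(counts)  # {'TP': 3, 'FP': 1, 'FN': 1, 'TN': 3}
```

Here the second example is a type I error (a negative predicted as positive) and the third is a type II error (a positive predicted as negative).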
Skip to 3 minutes and 6 seconds Some metrics focus on minimizing type I errors, some focus on minimizing type II errors, and some aim to minimize both. Depending on your application, you will favor different types of metrics. For more details, please refer to Wikipedia.
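Three widely used metrics illustrate this trade-off; the video does not name specific metrics, so the choice here is an assumption. Precision drops as false positives (type I errors) increase, recall drops as false negatives (type II errors) increase, and the F1 score balances the two. The counts below are made up for the example.

```python
def precision(tp, fp):
    """Fraction of positive predictions that are correct (hurt by type I errors)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of actual positives that are found (hurt by type II errors)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, which penalizes both error types."""
    p = precision(tp, fp)
    r = recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts: 3 true positives, 1 false positive, 2 false negatives.
tp, fp, fn = 3, 1, 2

print(precision(tp, fp))   # 0.75
print(recall(tp, fn))      # 0.6
print(round(f1_score(tp, fp, fn), 3))
```

An application where false alarms are costly would watch precision more closely, while one where missed positives are costly would watch recall.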
Machine Learning Basics: Part 2
Continuing from the previous step, Prof. Lai goes on to explain the types of machine learning. There are many other types of machine learning, including semi-supervised learning, self-supervised learning, one-shot learning, and zero-shot learning.
We will first look at an introduction to data types (measurement scales). Data types play an important role in selecting algorithms. We talked about different algorithms in the first week.
Next, he will introduce the four types: nominal, ordinal, interval, and ratio data. In summary, nominal and ordinal data are discrete, while interval and ratio data are continuous. Prof. Lai will then explain the concepts of nominal data, ordinal data, interval data, ratio data, and metrics.
For the other types of machine learning, you can check these references if you are interested.
Understanding each term will take longer than the usual study material, so we recommend trying only if you are interested. If you can't finish, it will not interfere with your learning; do continue to the next topic. We hope you find them useful.