# From regression to classification

Classification is a supervised learning task with a categorical output. In this video, Prof. Hao Ni explains how to extend the regression framework to solve classification problems.

The classification framework is similar to the regression framework and includes the same main components: dataset, model, empirical loss function, optimization, and prediction. However, because the output type differs (categorical in classification, continuous in regression), the model choice, loss function, prediction method, and evaluation method must be adjusted to cope with classification tasks. The classification framework is summarized as follows:

• Dataset: \(\mathcal{D} = \{(x_{i}, y_{i})\}_{i=1}^{N}\), with \((x_i, y_i) \in \mathcal{X} \times \mathcal{Y}\), where \(\mathcal{Y}\) is a finite set of class labels.
• Model: The model \(f_{\theta}\), parameterized by \(\theta\), is used to approximate the conditional probability of the output being \(y\) given the input \(x\), i.e., \(f_{\theta}(x) \approx \mathbb{P}[y \mid x]\).
• Empirical loss: Cross entropy is a commonly used loss function in classification problems.
• Optimization: \(\theta^{*} = \arg\min_{\theta} L(\theta \mid \mathcal{D})\), where \(L\) is the empirical loss function.
• Prediction: \(\hat{y}_{*} = \arg\max_{i \in \mathcal{Y}} f^{i}_{\theta^{*}}(x_{*})\), where \(f^{i}_{\theta^{*}}(x_{*})\) denotes the estimated probability of class \(i\).
• Validation: Popular test metrics include accuracy, the confusion matrix, etc.
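The components above can be sketched end to end in code. Below is a minimal, self-contained illustration using softmax regression on a small synthetic dataset; the dataset, model, and hyperparameters are illustrative assumptions, not taken from the lecture, but the pipeline follows the same five steps: dataset, model, cross-entropy loss, optimization, and argmax prediction with accuracy as the validation metric.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dataset: N inputs in R^2 with labels y in the finite set {0, 1}.
# (Synthetic data for illustration only.)
N = 200
X = rng.normal(size=(N, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def model(theta, X):
    # f_theta(x) ~ P[y | x]: a linear map followed by softmax.
    W, b = theta
    return softmax(X @ W + b)

def cross_entropy(theta, X, y):
    # Empirical loss: average negative log-probability of the true class.
    p = model(theta, X)
    return -np.log(p[np.arange(len(y)), y] + 1e-12).mean()

# Optimization: plain gradient descent on the cross-entropy loss.
W, b = np.zeros((2, 2)), np.zeros(2)
lr = 0.5
for _ in range(300):
    p = model((W, b), X)
    onehot = np.eye(2)[y]
    grad_logits = (p - onehot) / N       # gradient of the loss w.r.t. the logits
    W -= lr * X.T @ grad_logits
    b -= lr * grad_logits.sum(axis=0)

# Prediction: pick the class with the highest estimated probability.
y_hat = model((W, b), X).argmax(axis=1)

# Validation: accuracy on the training data.
accuracy = (y_hat == y).mean()
print("train accuracy:", accuracy)
```

Note the `argmax` step: the model outputs a probability vector over \(\mathcal{Y}\), and the predicted label is the class with the largest estimated probability, exactly as in the prediction rule above.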