Cenk Yildiran

Writing about lots of unrelated topics…


Supervised Learning

Supervised learning is a type of machine learning where the algorithm is trained on labeled data. This means that for each input, there is a corresponding output, and the model learns a mapping between the two. It can be divided into two primary types: regression and classification.

Image by Brian Penny from Pixabay

Regression

Regression tasks predict a continuous numerical value based on input features. Examples include predicting house prices, stock values, or temperatures. The goal is to find the relationship between the input variables (features) and the continuous output.

  • Example: Predicting house prices based on features like the size of the house, number of bedrooms, and location.
  • Output: A continuous variable like the price of a house ($300,000, $450,000, etc.).
  • Common algorithms: Linear regression, polynomial regression, support vector regression, and decision trees.
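To make the house-price example concrete, here is a minimal sketch of one-feature linear regression fitted with ordinary least squares. The sizes and prices are made-up illustrative values, not real data:

```python
# One-feature linear regression via the closed-form least-squares solution.
# House sizes and prices below are hypothetical illustrative values.

sizes = [100.0, 150.0, 200.0, 250.0]   # square meters (made up)
prices = [300.0, 400.0, 500.0, 600.0]  # thousands of dollars (made up)

n = len(sizes)
mean_x = sum(sizes) / n
mean_y = sum(prices) / n

# Slope and intercept from the least-squares normal equations.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices)) \
        / sum((x - mean_x) ** 2 for x in sizes)
intercept = mean_y - slope * mean_x

def predict(size):
    """Predicted price (in thousands) for a house of the given size."""
    return intercept + slope * size

print(predict(175.0))  # → 450.0 for this perfectly linear toy data
```

Real workflows would use a library such as scikit-learn, but the fitted line is the same idea: choose the parameters that minimize the squared error on the training data.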

The key idea behind regression is minimizing the error between predicted and actual values. The error is often measured by metrics such as Mean Squared Error (MSE) or Root Mean Squared Error (RMSE).
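Both metrics can be computed in a few lines. A small sketch with hypothetical predicted and actual values:

```python
import math

# MSE averages the squared errors; RMSE is its square root,
# which puts the error back in the units of the target variable.
actual = [300.0, 450.0, 500.0]     # hypothetical true prices
predicted = [310.0, 440.0, 520.0]  # hypothetical model outputs

errors = [p - a for p, a in zip(predicted, actual)]
mse = sum(e ** 2 for e in errors) / len(errors)
rmse = math.sqrt(mse)

print(mse, rmse)  # → 200.0 and ~14.14
```

RMSE is often preferred for reporting because it is in the same units as the target (dollars here), while MSE is in squared units.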

Classification

Classification tasks predict a discrete label or category based on input features. This is used when the output variable is categorical, such as binary (yes/no) or multiclass (different categories).

  • Example: Predicting whether an email is spam or not spam based on features like word frequency, sender, and email content.
  • Output: A discrete label like “spam” or “not spam” (binary classification), or more than two labels (multiclass classification).
  • Common algorithms: Logistic regression, decision trees, random forests, support vector machines, and neural networks.

In classification, performance is typically evaluated with metrics such as accuracy, precision, recall, F1-score, and AUC-ROC.
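The first four of these metrics follow directly from the counts of true/false positives and negatives. A minimal sketch using hypothetical spam-classifier labels (1 = spam, 0 = not spam):

```python
# Evaluate binary predictions with accuracy, precision, recall, and F1.
# Labels are hypothetical: 1 = spam, 0 = not spam.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # ground-truth labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # model predictions

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)            # of predicted spam, how much was spam
recall = tp / (tp + fn)               # of actual spam, how much was caught
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f1)
```

(AUC-ROC needs predicted probabilities rather than hard labels, so it is omitted here.) Precision and recall matter when the classes are imbalanced: a model that labels every email "not spam" can still score high accuracy while catching no spam at all.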

Two Approaches in Supervised Learning: Interpretation or Prediction

Interpretation Approach

When the goal is interpretation, the primary objective is to understand the relationships between the input features (independent variables) and the target variable (dependent variable). In this case, the parameters of the model are crucial because they help explain how and why certain inputs affect the outcome. This approach is often used when decision-makers need insights or explanations for the patterns that the model discovers.

Prediction Approach

When the focus is on prediction, the primary objective is to make the most accurate predictions possible for the outcome variable on new, unseen data. The parameters of the model become less important, and instead, the emphasis is on the overall predictive power of the model. This approach is common in scenarios where we care more about the accuracy of the results than about understanding why the model made certain decisions.

If you enjoyed this post, you might also find this one interesting: https://cenk-yildiran.com/2024/09/27/deep-learning-machine-learning-and-ai-explained/




About Me

My name is Cenk, and I am an economist. On this site I write about economics, econometrics, finance, value investing, programming, calculus, basketball, history, food, books, self-improvement, well-being, and productivity. This site is a personal blog; the posts reflect my personal views and do not represent any organization I have worked for.
For my academic works, please visit this site: https://cenkufukyildiran.academia.edu/
Posts related to financial markets, trading, investing, and similar topics are not financial advice.
