Overview

In linear regression, we aim to model the relationship between input features and output targets using a linear function. The performance of this model is evaluated using a cost function, which quantifies the error between predicted and actual values.

Model Definition

Hypothesis (Model): $f_{w,b}(x) = wx + b$
Parameters: $w$ (weight), $b$ (bias)
Training Data: $\{(x^{(i)}, y^{(i)})\}_{i=1}^{m}$, where $m$ is the number of training examples

Cost Function

The cost function measures the mean squared error between the predicted values and the actual target values, scaled by a factor of $\frac{1}{2}$ (a common convention that simplifies the derivative):

$J(w, b) = \frac{1}{2m} \sum_{i=1}^{m} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right)^2$

Objective: Find parameters $w$ and $b$ that minimize $J(w, b)$.
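As a minimal sketch of the formula above, the cost can be computed with a plain Python loop. The function names `predict` and `compute_cost` and the toy data are illustrative assumptions, not code from the original post.

```python
def predict(x, w, b):
    """Model f_{w,b}(x) = w * x + b for a single input x."""
    return w * x + b


def compute_cost(xs, ys, w, b):
    """Return J(w, b) = (1 / (2m)) * sum((f(x_i) - y_i)^2) over m examples."""
    m = len(xs)
    if m == 0 or m != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    squared_error_sum = 0.0
    for x_i, y_i in zip(xs, ys):
        error = predict(x_i, w, b) - y_i
        squared_error_sum += error ** 2
    return squared_error_sum / (2 * m)


# Illustrative data (an assumption for this sketch): targets equal inputs.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]
print(compute_cost(xs, ys, w=1.0, b=0.0))  # perfect fit, so the cost is 0.0
```

A vectorized version (for example with NumPy) would be shorter, but the explicit loop mirrors the summation term by term.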

Simplified Case: No Bias Term

For simplicity, consider the model without the bias term (that is, with $b = 0$):

$f_w(x) = wx$

Then the cost function becomes:

$J(w) = \frac{1}{2m} \sum_{i=1}^{m} \left( f_w(x^{(i)}) - y^{(i)} \right)^2$

Goal: Minimize $J(w)$ with respect to $w$.

Visualizing the Model

Below is a simple plot of the linear function $f_w(x) = x$, representing the case where $w = 1$ (a straight line through the origin with slope 1):

Next Steps

To deepen our understanding of the cost function, we will now explore how $J(w)$ behaves as we vary the parameter $w$. This analysis is crucial for building intuition around optimization techniques such as gradient descent.

Interpreting the Functions

The model function $f_w(x) = wx$ describes how the input $x$ is transformed into a predicted output using the parameter $w$. For a fixed value of $w$, $f_w(x)$ is a function of the input $x$ only, so the prediction depends directly on the input.
In contrast, the cost function $J(w)$ is a function of the parameter $w$, with the training data held fixed. It quantifies the error between the predicted values $f_w(x^{(i)})$ and the actual target values $y^{(i)}$ across all training examples. The parameter $w$ determines the slope of the line defined by $f_w(x)$, and thus directly influences the prediction accuracy and the resulting cost.
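The distinction can be sketched in code, under the assumption that closures are an acceptable way to express "fix one argument, vary the other". The helper names `make_model` and `make_cost` and the sample data are hypothetical and chosen only for illustration.

```python
def make_model(w):
    """Fix w and return f_w, a function of the input x."""
    def f(x):
        return w * x
    return f


def make_cost(xs, ys):
    """Fix the training data and return J, a function of the parameter w."""
    m = len(xs)

    def J(w):
        f = make_model(w)
        return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)
    return J


f_half = make_model(0.5)             # w is fixed; the input x varies
J = make_cost([1, 2, 3], [1, 2, 3])  # data is fixed; the parameter w varies

print(f_half(2))  # prediction for x = 2 with w = 0.5
print(J(1.0))     # cost of the model with w = 1
```

Here `f_half` takes an input and returns a prediction, while `J` takes a parameter value and returns a single number summarizing the fit over the whole data set.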

Model vs Cost Function

We compare the behavior of the model function $f_w(x) = wx$ and the cost function $J(w)$ for different values of $w$. The model predicts outputs based on input $x$, while the cost function evaluates how well the model fits the data.

On the left-hand side, you see the graph of the model function $f_w(x) = wx$. The red “x” marks represent the observed data points.

When $w = 1$, the model perfectly fits the data, shown by the green line. On the right-hand side, the cost function $J(w)$ reaches its minimum value, $J(1) = 0$, represented by the green dot.
When $w = 0.5$, the model is shown by the blue line, and the cost increases to $J(0.5) \approx 0.58$, marked by the blue dot on the cost function graph.
When $w = 0$, the model becomes a flat purple line, and the cost is even higher, $J(0) \approx 2.33$. This corresponds to the purple dot on the cost function graph.
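As a worked check of these costs, assume the three data points $(1,1)$, $(2,2)$, $(3,3)$ that the post lists with its table, so that $m = 3$. Substituting into $J(w)$ gives:

```latex
\begin{aligned}
J(0.5) &= \frac{1}{2 \cdot 3}\left[(0.5 - 1)^2 + (1 - 2)^2 + (1.5 - 3)^2\right]
        = \frac{0.25 + 1 + 2.25}{6} = \frac{3.5}{6} \approx 0.58 \\
J(0)   &= \frac{1}{2 \cdot 3}\left[(0 - 1)^2 + (0 - 2)^2 + (0 - 3)^2\right]
        = \frac{1 + 4 + 9}{6} = \frac{14}{6} \approx 2.33
\end{aligned}
```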

Cost Function Table

Below is a table showing the predicted values $f_w(x)$, the input $x$, and the corresponding cost $J(w)$ for selected values of $w$, using the data points $(1,1)$, $(2,2)$, $(3,3)$. The rows are color-coded to match the lines and dots in the graphs:

$w$	Input $x$	Predictions $f_w(x)$	Cost $J(w)$
0.0	[1, 2, 3]	[0.0, 0.0, 0.0]	2.33
0.5	[1, 2, 3]	[0.5, 1.0, 1.5]	0.58
1.0	[1, 2, 3]	[1.0, 2.0, 3.0]	0.00
1.5	[1, 2, 3]	[1.5, 3.0, 4.5]	0.58
2.0	[1, 2, 3]	[2.0, 4.0, 6.0]	2.33
2.5	[1, 2, 3]	[2.5, 5.0, 7.5]	5.25
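The table can be recomputed directly from the definition of $J(w)$. The following is a small sketch, assuming the same three data points and the same values of $w$. The function name `cost` and the variable names are illustrative, and costs are rounded to two decimals as in the table.

```python
def cost(w, xs, ys):
    """J(w) = (1 / (2m)) * sum((w * x_i - y_i)^2) for the model f_w(x) = w * x."""
    m = len(xs)
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


xs = [1, 2, 3]
ys = [1, 2, 3]
ws = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]

# One row per value of w: (w, predictions, cost rounded to two decimals).
rows = []
for w in ws:
    predictions = [w * x for x in xs]
    rows.append((w, predictions, round(cost(w, xs, ys), 2)))

print("w\tPredictions\tJ(w)")
for w, predictions, j in rows:
    print(f"{w}\t{predictions}\t{j:.2f}")
```

Recomputing the rows this way is a quick guard against arithmetic slips when such a table is filled in by hand.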

Observation

As shown in the plot and table, the cost function $J(w)$ forms a U-shaped (parabolic) curve, with the minimum cost at $w = 1$, where the model perfectly fits the data. This illustrates the principle behind optimization: finding the parameter $w$ that minimizes the cost.
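For this particular data set the shape can be made explicit. Because every target equals its input ($y^{(i)} = x^{(i)}$) and $m = 3$, the cost reduces to a single quadratic in $w$:

```latex
\begin{aligned}
J(w) &= \frac{1}{2 \cdot 3} \sum_{i=1}^{3} \left( w x^{(i)} - x^{(i)} \right)^2
      = \frac{(w - 1)^2}{6} \left( 1^2 + 2^2 + 3^2 \right)
      = \frac{7}{3} (w - 1)^2
\end{aligned}
```

This is a parabola in $w$ with its vertex at $w = 1$, where $J(1) = 0$. It is symmetric about that point, which is why $J(0.5) = J(1.5)$ and $J(0) = J(2)$.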

Cenk Yildiran

Understanding the Cost Function in Linear Regression
