Logistic Regression
Videos:
- Binary Classification
- Logistic Regression
- Logsitic Regression Cost Function
What is logistic regression?
What is logistic regression? Simply put, logistic regression is an algorithm for binary classification, it uses a sigmoid activation function to output a probability score between 0 and 1.
In deep learning, logistic regression can be seen as a building block for neural networks, where it serves as the output layer for binary classification tasks. Logistic regression is also used in combination with other layers and activations to create more complex models, such as multiclass classification and object detection.
Relating back to notation and our cat identification example, logistic regression is used to output a 0 or 1 based on the input images.
The Math 🌝
First, a couple symbols:
- ∈ - means “belongs to”
- ℝ - means “real number”
- ŷ - (“y hat”) - an estimation of the relation between
x
andy
. - P - Probability
- 𝛔 - sigmoid function.
Building on the binary classification idea + notation, we have logistic regression.
In logistic regression, the basic concept looks like this:
Given X, we want ŷ = P(y=1 | x)
x ∈ ℝᴺˣ
So in binary classification, we want to find ŷ
, in which ŷ
is a probability.
To find ŷ
, we have parameters:
w ∈ ℝᴺˣ, b ∈ ℝ
I’m not sure what exactly this means, but w
and b
are our main paramaters. These represent weights
and biases
, basically the internal workings of our neural network.
w
is an nx
dimensional vector (like, one chunk of a cat image) and b
is a real number. Real numbers includes a vast range of numbers, including positive and negative numbers, integers, fractions, decimals, and many others.
What we actually want for our logistic regression algorithm is this:
ŷ = 𝛔(wᵀx + b)
That is to say: ŷ
is equal to the result of the sigmoid function of wᵀx
+ b
.
Loss Function (ℒ)
The loss function (sometimes referred to as just ℒ
) is also referred to as the error function.It’s a mathemetical function that calculates the amount of error between an expected output (y
) and an output given by a neural network. Basically an accuracy score for your neural network, for one piece of data in a training set.
Generally speaking, the goal is to minimize the loss function during training by adjusting the model’s parameters, so that the model can make better predictions on unseen data.
In logistic regression, the loss function we use looks like this:
ℒ(y, ŷ) = -(y * log(ŷ) + (1 - y) * log(1 - ŷ))
Let’s break that down:
y
is the true binary label (0 or 1) of the input sample, the ground truth.ŷ
is the estimated probability of the input sample belonging to1
(not 0), as output by the logistic regression model.log
denotes the natural logarithm, inverse of exponential.
Cost Function
Cost function (somtimes just J
) is similar in some ways to loss function, but not the same! Cost function is in regards to the training data set as a whole. It measures how the model is doing as a whole. It may be, for example, the average of the loss functions for a training set.