Logistic Regression: From Scratch

Akshar Rastogi
2 min read · Jun 29, 2021

Statutory Warning: Logistic Regression is a Classification Algorithm!

Just because the word Regression is in its name doesn't mean it is a regression algorithm.

Logistic Regression is a classification algorithm, developed to overcome a shortcoming of Linear Regression. Linear Regression predicts continuous values, but what if we have a problem where we need to classify observations as in favour (1) or not in favour (0)? A simple idea that comes to mind is to fit a line and set a threshold on its output. The problem is that the fitted line, and with it the point where the line crosses the threshold, shifts as more data points are added, so points that were classified correctly can end up on the wrong side. To avoid this we use Logistic Regression, a binary classification algorithm that sidesteps the practical problems that hold Linear Regression back for classification.
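To make the shifting threshold concrete, here is a minimal sketch in plain Python. The toy data, the helper names `fit_line` and `boundary`, and the 0.5 threshold are all assumptions made up for illustration. It fits an ordinary least squares line, finds where the line crosses the threshold, and then repeats the fit after one more (correctly labelled) point is added far to the right.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = b0 + b1*x for a single input."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def boundary(b0, b1, threshold=0.5):
    """x value at which the fitted line crosses the threshold."""
    return (threshold - b0) / b1


# Hypothetical toy data: class 0 for small x, class 1 for large x.
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
cut_before = boundary(*fit_line(xs, ys))

# Add one more class-1 point far away from the rest and refit.
xs_more = xs + [20]
ys_more = ys + [1]
cut_after = boundary(*fit_line(xs_more, ys_more))

print("cut-off before:", cut_before)  # about 3.5, between x=3 and x=4
print("cut-off after: ", cut_after)   # moves past x=4
```

With the extra point the cut-off moves to the right of x = 4, so the point at x = 4, which has label 1, would now be classified as 0 even though nothing about it changed.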

Logistic Regression produces an output between zero and one, but never exactly zero or one. This output is read as the probability of the positive class and is turned into a binary result by applying a cut-off (commonly 0.5). To get there, Logistic Regression applies a logistic transformation to the linearly established relationship. The figure below shows what Logistic Regression does behind the scenes.

Logistic Regression Behind the Scenes.

Math Behind Logistic Regression

The equation below shows the logistic function that the model uses:

y = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))

Where,

y = Output / Dependent Variable / Prediction

x = Input value / Predictor / Independent Variable

b0 = Intercept term

b1 = Coefficient of the input value

From the above equation, writing the prediction y as the probability P(X), we can say that,

P(X) = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))

Multiplying both the numerator and the denominator by e^-(b0 + b1*x),

P(X) = 1 / (1 + e^-(b0 + b1*x))
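As a quick sanity check, the two forms of P(X) can be written as code and compared. This is only a sketch: the function names are my own, and the coefficients b0 = -4 and b1 = 1 are hypothetical values picked so that the curve crosses 0.5 at x = 4.

```python
import math


def logistic_ratio_form(x, b0, b1):
    """P(X) = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))"""
    z = b0 + b1 * x
    return math.exp(z) / (1 + math.exp(z))


def logistic_sigmoid_form(x, b0, b1):
    """P(X) = 1 / (1 + e^-(b0 + b1*x))"""
    z = b0 + b1 * x
    return 1 / (1 + math.exp(-z))


# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -4.0, 1.0

for x in [0, 2, 4, 6, 8]:
    p_ratio = logistic_ratio_form(x, b0, b1)
    p_sigmoid = logistic_sigmoid_form(x, b0, b1)
    # Both columns should agree up to floating point rounding.
    print(x, round(p_ratio, 4), round(p_sigmoid, 4))
```

Both functions return the same probability for every x, and for these inputs the value stays strictly between 0 and 1.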

Going back to the first form of P(X) and multiplying both sides by its denominator we get,

(1 + e^(b0 + b1*x))*P(X) = e^(b0 + b1*x)

Expanding the left side and rearranging gives P(X) = e^(b0 + b1*x)*(1 - P(X)). Solving for the exponential we get,

e^(b0 + b1*x) = P(X)/(1-P(X))

Taking the natural log of both sides,

b0 + b1*x = ln(P(X)/(1-P(X)))

  • Logit = ln(P(X)/(1-P(X)))
  • Odds = P(X)/(1-P(X))
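The odds and the logit are easy to check numerically. The sketch below uses helper names of my own (`odds`, `logit`, `sigmoid`) and the same kind of hypothetical coefficients as before. It shows that taking the logit of a predicted probability gives back the linear part b0 + b1*x.

```python
import math


def sigmoid(z):
    """P(X) = 1 / (1 + e^-z)"""
    return 1 / (1 + math.exp(-z))


def odds(p):
    """Odds = P(X) / (1 - P(X))"""
    return p / (1 - p)


def logit(p):
    """Logit = ln(P(X) / (1 - P(X)))"""
    return math.log(odds(p))


# Hypothetical coefficients and input, chosen only for illustration.
b0, b1, x = -4.0, 1.0, 6.0
linear_part = b0 + b1 * x      # 2.0
p = sigmoid(linear_part)       # probability for this x

print("probability:", p)
print("odds:", odds(p))        # equals e^(b0 + b1*x)
print("logit:", logit(p))      # recovers b0 + b1*x up to rounding
```

In other words, the model is linear in the log of the odds, even though it is not linear in the probability itself.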

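Speaking in a broader sense, Logistic Regression is a sigmoid applied to a linear model whose coefficients are typically fitted with Gradient Descent,

Logistic Regression = Sigmoid(Linear Model fitted by Gradient Descent)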
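Since the title promises "from scratch", here is a minimal sketch of that idea for a single input variable. Everything specific in it is an assumption for illustration: the toy data, the function name `fit_logistic`, the learning rate and the number of epochs. It uses plain batch gradient descent on the average log loss, for which the gradient with respect to b0 is the mean of (P(X) - y) and the gradient with respect to b1 is the mean of (P(X) - y)*x.

```python
import math


def sigmoid(z):
    """P(X) = 1 / (1 + e^-z)"""
    return 1 / (1 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit b0 and b1 with batch gradient descent on the average log loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_b0 = 0.0
        grad_b1 = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(b0 + b1 * x) - y  # predicted probability minus label
            grad_b0 += error
            grad_b1 += error * x
        # Step against the average gradient.
        b0 -= learning_rate * grad_b0 / n
        b1 -= learning_rate * grad_b1 / n
    return b0, b1


# Hypothetical toy data with some overlap between the two classes.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
probabilities = [sigmoid(b0 + b1 * x) for x in xs]
predictions = [1 if p >= 0.5 else 0 for p in probabilities]

print("b0:", b0, "b1:", b1)
print("decision boundary at x =", -b0 / b1)
print("predictions:", predictions)
```

The decision boundary is the x at which b0 + b1*x = 0, that is where P(X) = 0.5. On real data you would normally scale the inputs and watch the loss to decide when to stop; this sketch just runs a fixed number of epochs.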

How to Maximize Logistic Regression Performance

  • The output variable must be binary, because Logistic Regression assumes a binary output.
  • Logistic Regression assumes there is no noise in the output variable, so remove outliers and possibly mislabelled examples from the training data.
  • Logistic Regression assumes a linear relationship between the input variables and the log-odds of the output. Transforming the inputs so that they look more Gaussian (for example with a log or Box-Cox transform) can expose this relationship better.
  • Remove highly correlated input variables. With strongly correlated inputs the coefficients become unstable and the model can overfit.
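For the last point, one simple way to spot correlated inputs is to compute the Pearson correlation for every pair of input variables and flag the pairs above some limit. The sketch below is only an illustration: the feature names, the values and the 0.9 limit are made up, and `pearson` and `correlated_pairs` are helper names of my own.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    spread_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (spread_a * spread_b)


def correlated_pairs(features, limit=0.9):
    """Return the pairs of feature names whose |correlation| exceeds the limit."""
    pairs = []
    for name_a, name_b in combinations(features, 2):
        if abs(pearson(features[name_a], features[name_b])) > limit:
            pairs.append((name_a, name_b))
    return pairs


# Hypothetical inputs: the same height in two units, plus an unrelated column.
features = {
    "height_cm": [150, 160, 170, 180, 190],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [34, 25, 41, 29, 38],
}

pairs = correlated_pairs(features)
print(pairs)  # the two height columns are flagged; keep only one of them
```

Which column of a flagged pair to drop, and what limit to use, is a judgment call that depends on the data.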
