• Sheikh Aman

Logistic Regression in Machine Learning | Easy explanation with code

Updated: Aug 10, 2020

What is Logistic Regression in Machine Learning?

Logistic Regression is a machine learning technique borrowed from statistics. It is mainly used for binary classification, that is, problems where the target has only two class values.

Despite its name, it is a classification algorithm, not a regression algorithm.

It predicts discrete class labels such as 0/1, True/False, or Yes/No.

It also estimates the probability of an event occurring by fitting the data to the logistic function, which is the inverse of the logit function. That is why it is also called Logit Regression. Because it predicts a probability, its raw output lies between 0 and 1.

Logistic Regression with example

Suppose there is a system that detects spam email. For each email there are two possible answers to the question "is this spam?": YES or NO.

Here the logistic regression algorithm estimates the probability that an email is spam. If the answer is YES the email is marked as spam, and if the answer is NO it is treated as not spam.

There are many other examples out there. This is just a simple one.

Important terms for Logistic Regression

  • The output is predicted using a non-linear function called the logistic function (also known as the sigmoid function).

  • This function looks like a big "S" and maps every input value into the range between 0 and 1.

  • Predictions made by this algorithm can also be read as the probability that a data point belongs to a class, because the raw output lies between 0 and 1.

  • It works better when attributes that are unrelated to the output, and attributes that are very similar to each other, are removed. (The Python code does this by using only two columns as input.)
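
The points above can be illustrated with a small sketch. The `sigmoid` helper, the sample scores, and the 0.5 threshold are assumptions chosen for illustration, not something taken from the dataset used later.

```python
import numpy as np

def sigmoid(z):
    """Map any real value (or array of values) into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical raw scores, e.g. a weighted sum of the input features
scores = np.array([-6.0, -1.0, 0.0, 1.0, 6.0])
probabilities = sigmoid(scores)

# Turn the probabilities into class labels with a 0.5 threshold (an assumed cut-off)
labels = (probabilities >= 0.5).astype(int)

print(probabilities)  # values close to 0 for very negative scores, close to 1 for very positive ones
print(labels)         # [0 0 1 1 1]
```

A score of 0 maps to exactly 0.5, which is why 0.5 is the usual place to split the two classes.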

Logistic Regression cost function
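
A commonly used cost function for logistic regression is the log loss (also called binary cross-entropy). As a sketch in standard textbook notation, where m is the number of training examples, y is the true label (0 or 1) and h is the predicted probability (these symbols are assumptions here, not taken from the original post):

```latex
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m}
  \Big[ y^{(i)} \log h_\theta\big(x^{(i)}\big)
      + \big(1 - y^{(i)}\big) \log\big(1 - h_\theta(x^{(i)})\big) \Big],
\qquad
h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}
```

The cost is small when the predicted probability is close to the true label and grows quickly when the model is confidently wrong.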

Logistic Function
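
In its standard form the logistic (sigmoid) function can be written as below. The symbols z, θ and x are the usual textbook notation and are assumed here for illustration.

```latex
\begin{aligned}
\sigma(z) &= \frac{1}{1 + e^{-z}}, \qquad z = \theta^{T} x \\
\operatorname{logit}(p) &= \log\frac{p}{1 - p} = z \quad \text{where } p = \sigma(z)
\end{aligned}
```

The second line shows why the method is also called Logit Regression: the logit of the predicted probability is a linear function of the inputs.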

Logistic Regression in Python

The dataset contains the columns UserID, Gender, Age, EstimatedSalary, and Purchased. We will use it to predict whether a user will buy the company's newly launched product or not.

Logistic Regression Data set

This is the dataset I have used here. It is available for free if you want to try it yourself.

Logistic Regression Python code

Libraries used

import pandas as pd 
import numpy as np
import matplotlib.pyplot as plt

Loading dataset – User_Data

dataset = pd.read_csv("../.../User_Data.csv")

Checking dataset
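
A quick way to check a dataset is to look at its first rows, its shape, and its missing values. The sketch below uses a small hypothetical stand-in DataFrame so that it runs on its own. The rows are made up, and the column order is an assumption that matches the column indexes used later.

```python
import pandas as pd

# Hypothetical stand-in rows; the real data comes from User_Data.csv
dataset = pd.DataFrame({
    "UserID": [1, 2, 3, 4, 5, 6],
    "Gender": ["Male", "Female", "Female", "Male", "Female", "Male"],
    "Age": [19, 35, 26, 27, 47, 52],
    "EstimatedSalary": [19000, 20000, 43000, 57000, 25000, 90000],
    "Purchased": [0, 0, 0, 0, 1, 1],
})

print(dataset.head())          # first five rows
print(dataset.shape)           # (number of rows, number of columns)
print(dataset.isnull().sum())  # missing values per column
```

With the real file, the same three calls can be run right after `pd.read_csv`.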




Extracting dependent and independent variables

# input features: columns 2 and 3 of the dataset
x = dataset.iloc[:, [2, 3]].values
# output (target): column 4 of the dataset
y = dataset.iloc[:, 4].values

Splitting Dataset into Train and Test

from sklearn.model_selection import train_test_split
# keep 25% of the rows for testing; random_state makes the split repeatable
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=0)
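
To see what `test_size` and `random_state` do, here is a minimal sketch on made-up data (20 hypothetical samples with 2 features, not the real dataset).

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 20 samples with 2 features each, and alternating labels
x = np.arange(40).reshape(20, 2)
y = np.array([0, 1] * 10)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0
)
print(x_train.shape, x_test.shape)  # (15, 2) (5, 2)

# Using the same random_state again gives exactly the same split
x_train2, x_test2, _, _ = train_test_split(x, y, test_size=0.25, random_state=0)
print(np.array_equal(x_test, x_test2))  # True
```

So 25% of the rows go to the test set, and fixing `random_state` keeps the results reproducible from run to run.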

Feature Scaling

from sklearn.preprocessing import StandardScaler
st_x = StandardScaler()
# fit the scaler on the training data only, then reuse it on the test data
x_train = st_x.fit_transform(x_train)
x_test = st_x.transform(x_test)
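
As a rough sketch of what the scaler does, the example below uses a few hypothetical [age, salary] rows. After scaling, each training column has mean 0 and standard deviation 1, and the test data is transformed with the statistics learned from the training data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical [age, salary] rows, made up for illustration
x_train = np.array([[20.0, 20000.0],
                    [30.0, 40000.0],
                    [40.0, 60000.0],
                    [50.0, 80000.0]])
x_test = np.array([[35.0, 50000.0]])

st_x = StandardScaler()
x_train_scaled = st_x.fit_transform(x_train)  # learns mean and std from the training data
x_test_scaled = st_x.transform(x_test)        # reuses the training mean and std

print(st_x.mean_)                   # per-column training means: 35 and 50000
print(x_train_scaled.mean(axis=0))  # approximately 0 for both columns
print(x_train_scaled.std(axis=0))   # approximately 1 for both columns
print(x_test_scaled)                # the test row equals the training means, so it maps to zeros
```

Without scaling, the salary column would dominate the age column simply because its numbers are much larger.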

Fitting Logistic Regression to the training set

from sklearn.linear_model import LogisticRegression  
classifier= LogisticRegression(random_state=0)  
classifier.fit(x_train, y_train)



Predicting the test set

y_pred= classifier.predict(x_test)  
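
Besides class labels, scikit-learn's `LogisticRegression` can also return the estimated probabilities through `predict_proba`. The sketch below uses a tiny hypothetical one-feature dataset, not the real one, just to show the relationship between the two calls.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical, already scaled single feature with well separated classes
x_train = np.array([[-2.0], [-1.5], [-1.0], [1.0], [1.5], [2.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

classifier = LogisticRegression(random_state=0)
classifier.fit(x_train, y_train)

x_new = np.array([[-1.8], [0.2], [1.7]])
proba = classifier.predict_proba(x_new)  # one row per sample: [P(class 0), P(class 1)]
y_pred = classifier.predict(x_new)       # the class with the higher probability

print(proba)
print(y_pred)
```

Each row of `proba` sums to 1, and `predict` simply picks the class with the larger probability.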

Creating Confusion Matrix

from sklearn.metrics import confusion_matrix 
cm = confusion_matrix(y_test, y_pred) 
print ("Confusion Matrix : \n", cm) 



Measuring accuracy

from sklearn.metrics import accuracy_score 
print ("Accuracy : ", accuracy_score(y_test, y_pred)) 
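
The confusion matrix and the accuracy are closely related. As a small sketch with made-up labels (hypothetical values, not the real test set), the accuracy is the number of correct predictions on the diagonal divided by the total number of predictions.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true and predicted labels, made up for illustration
y_test = np.array([0, 0, 0, 0, 1, 1, 1, 1, 0, 1])
y_pred = np.array([0, 0, 1, 0, 1, 1, 0, 1, 0, 1])

cm = confusion_matrix(y_test, y_pred)
# For binary labels 0/1 the layout is [[TN, FP], [FN, TP]]
tn, fp, fn, tp = cm.ravel()

accuracy_from_cm = (tn + tp) / cm.sum()

print(cm)                              # [[4 1]
                                       #  [1 4]]
print(accuracy_from_cm)                # 0.8
print(accuracy_score(y_test, y_pred))  # 0.8
```

Here 8 of the 10 predictions are correct, so both ways of computing the accuracy give 0.8.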



Visualizing the performance of the model.

from matplotlib.colors import ListedColormap
X_set, y_set = x_test, y_test
# build a fine grid that covers the range of both (scaled) features
X1, X2 = np.meshgrid(
    np.arange(start=X_set[:, 0].min() - 1, stop=X_set[:, 0].max() + 1, step=0.01),
    np.arange(start=X_set[:, 1].min() - 1, stop=X_set[:, 1].max() + 1, step=0.01))
# colour every grid point by the class the classifier predicts for it
plt.contourf(X1, X2,
             classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha=0.75, cmap=ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
# plot the actual test points on top, one colour per true class
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                color=ListedColormap(('red', 'green'))(i), label=j)
plt.title('Classifier (Test set)')
plt.xlabel('Age')  # assumes column 2 of the dataset is Age
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()




Thank you for spending your valuable time here. You now have a basic knowledge of logistic regression, along with an example, the equations, and the code in Python. Have a great time!

Copyright © 2020 MR. Machine. All Rights Reserved