- Sheikh Aman

# Logistic Regression in Machine Learning | Easy explanation with code

Updated: Aug 10, 2020

**What is Logistic Regression in Machine Learning?**

Logistic Regression is a technique that machine learning has borrowed from statistics. It is mainly used for binary classification, that is, problems with two class values.

Despite its name, it is a classification algorithm, not a regression algorithm.

It predicts discrete, two-class values such as 0/1, T/F, or Y/N.

It also estimates the probability that an event occurs by fitting the data to the logistic function, whose inverse is the logit function. For this reason it is also called **Logit Regression**. Because it predicts a probability, the output lies between 0 and 1.

**Logistic Regression with example**

Suppose there is a system that detects spam email. For each email there are two possible outcomes: YES (spam) or NO (not spam).

Here the logistic regression algorithm performs the task: it predicts YES when it judges the email to be spam and NO when it judges the email to be legitimate.

There are many other examples out there. This is just a simple one; if you want more, __click me__.

**Important terms for Logistic Regression**

The output is predicted using a non-linear function called the Logistic Function.

This function looks like a big "S" and maps every input value into the range between 0 and 1. You can see the figure above to understand this more clearly.

Predictions made by this algorithm can also be read as the probability that a given data point belongs to the positive class, because the output lies between 0 and 1.
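To make the step from a probability to a class label concrete, here is a minimal sketch. The probabilities are made-up values and the 0.5 threshold is a common convention assumed for illustration; neither comes from the dataset used later in this post.

```python
import numpy as np

# Hypothetical predicted probabilities of the positive class
probabilities = np.array([0.05, 0.40, 0.50, 0.93])

# Probabilities at or above the threshold become class 1, the rest class 0
threshold = 0.5
labels = (probabilities >= threshold).astype(int)
print(labels)  # [0 0 1 1]
```

Raising or lowering `threshold` trades one kind of mistake for the other, which is why the probability itself is often more useful than the final label.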

It works better when attributes that are unrelated to the output, as well as attributes that are very similar to each other, are removed. (This will be mentioned in the Python code.)

**Logistic Regression cost function**
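As a rough sketch, the cost usually paired with logistic regression is the log loss (binary cross-entropy): the average of -[y·log(p) + (1 - y)·log(1 - p)] over all samples, where y is the true label (0 or 1) and p is the predicted probability of class 1. The function name `log_loss_cost`, the clipping constant `eps`, and the sample values below are assumptions made for illustration.

```python
import numpy as np

def log_loss_cost(y_true, p_pred, eps=1e-15):
    """Average binary cross-entropy between 0/1 labels and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities so that log(0) never occurs
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

# Illustrative values (assumed, not from the dataset in this post)
good = log_loss_cost([1, 0, 1], [0.9, 0.1, 0.8])  # confident and correct
bad = log_loss_cost([1, 0, 1], [0.2, 0.7, 0.3])   # confident and wrong
print(good, bad)
```

Confident correct predictions give a small cost and confident wrong ones give a large cost, so minimizing this value pushes the model toward well-calibrated probabilities.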

**Logistic Function**
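A minimal sketch of the standard logistic (sigmoid) function, 1 / (1 + e^(-z)), which produces the "S" shape described earlier. The function name `sigmoid` and the sample inputs are chosen here for illustration only.

```python
import numpy as np

def sigmoid(z):
    """Standard logistic function: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative inputs (not from the dataset in this post)
z = np.array([-5.0, 0.0, 5.0])
probs = sigmoid(z)
print(probs)
```

Large negative inputs land close to 0, large positive inputs land close to 1, and an input of 0 maps to exactly 0.5.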

**Logistic Regression in Python**

The given dataset contains information such as EstimatedSalary, Purchased, UserID, Gender, and Age. We will use it to predict whether a user will buy the company's newly launched product.

**Logistic Regression Data set**

Here I have used this dataset. If you want to try it yourself, __click me__ to get the dataset for free.

**Logistic Regression Python code**

**Libraries used**

```
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
```

**Loading dataset – User_Data**

`dataset = pd.read_csv("../.../User_Data.csv")`

**Checking dataset**

`dataset.head()`

**Output**

**Extracting dependent and independent variables**

```
# input features: the columns at positions 2 and 3
x = dataset.iloc[:, [2, 3]].values
# output (target): the column at position 4
y = dataset.iloc[:, 4].values
```

**Splitting Dataset into Train and Test**

```
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test= train_test_split(x, y, test_size= 0.25, random_state=0)
```
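To show what `test_size=0.25` amounts to, here is a rough NumPy sketch of a shuffled 75/25 split. It is not scikit-learn's implementation; the row count and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the split is repeatable
n_rows = 8                      # hypothetical dataset size
indices = rng.permutation(n_rows)

n_test = int(np.ceil(0.25 * n_rows))  # 25% of the rows go to the test set
test_idx, train_idx = indices[:n_test], indices[n_test:]
print(len(train_idx), len(test_idx))  # 6 2
```

Fixing the seed plays the same role as `random_state=0` above: the same rows end up in the test set on every run.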

**Feature Scaling**

```
from sklearn.preprocessing import StandardScaler
st_x= StandardScaler()
x_train= st_x.fit_transform(x_train)
x_test= st_x.transform(x_test)
```
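As a sketch of what the scaling step does, the same standardization can be written by hand with NumPy: subtract the training mean and divide by the training standard deviation. The sample rows are made up for illustration and are not taken from User_Data.csv.

```python
import numpy as np

# Hypothetical Age and EstimatedSalary rows
train = np.array([[20.0, 20000.0], [30.0, 40000.0], [40.0, 60000.0]])
test = np.array([[25.0, 30000.0]])

# Learn the mean and standard deviation from the training rows only
mean = train.mean(axis=0)
std = train.std(axis=0)

# Apply the same statistics to both sets
train_scaled = (train - mean) / std
test_scaled = (test - mean) / std
print(train_scaled)
print(test_scaled)
```

The test rows are scaled with the training statistics, which mirrors calling `fit_transform` on the training set and only `transform` on the test set. After scaling, Age and EstimatedSalary are on comparable scales even though their raw ranges differ by orders of magnitude.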

**Fitting Logistic Regression to the training set**

```
from sklearn.linear_model import LogisticRegression
classifier= LogisticRegression(random_state=0)
classifier.fit(x_train, y_train)
```

**Output**

**Predicting the test set**

`y_pred= classifier.predict(x_test) `

**Creating Confusion Matrix**

```
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
print ("Confusion Matrix : \n", cm)
```

**Output**

**Measuring accuracy**

```
from sklearn.metrics import accuracy_score
print ("Accuracy : ", accuracy_score(y_test, y_pred))
```

**Output**
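Accuracy is the share of correct predictions, so it can also be read directly off a confusion matrix. The sketch below uses a hypothetical 2x2 matrix (not the result of the code above) in scikit-learn's layout, where rows are true classes and columns are predicted classes.

```python
import numpy as np

# Hypothetical confusion matrix: [[TN, FP], [FN, TP]]
cm_example = np.array([[50, 10],
                       [5, 35]])

tn, fp, fn, tp = cm_example.ravel()

accuracy = (tp + tn) / cm_example.sum()  # correct predictions / all predictions
precision = tp / (tp + fp)               # how many predicted positives were right
recall = tp / (tp + fn)                  # how many actual positives were found
print(accuracy, precision, recall)
```

Precision and recall are included because accuracy alone can look good on imbalanced data even when the positive class is predicted poorly.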

**Visualizing the performance of the model.**

```
from matplotlib.colors import ListedColormap

X_set, y_set = x_test, y_test

# Build a fine grid covering the range of both (scaled) features
X1, X2 = np.meshgrid(
    np.arange(start=X_set[:, 0].min() - 1, stop=X_set[:, 0].max() + 1, step=0.01),
    np.arange(start=X_set[:, 1].min() - 1, stop=X_set[:, 1].max() + 1, step=0.01))

# Colour each grid point by the class the classifier predicts there
plt.contourf(X1, X2,
             classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha=0.75, cmap=ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
```

```
# Overlay the actual test points, coloured by their true class
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                color=ListedColormap(('red', 'green'))(i), label=j)
plt.title('Classifier (Test set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()
```

**Output**

**Conclusion**

Thank you for spending your valuable time here. You now have a basic knowledge of logistic regression, along with examples, equations, and code in Python. Have a great time!