• Sheikh Aman

# Naive Bayes Classifier Tutorial. | Step by Step in Python.

## A Brief Introduction.

FBI !! THERE IS A SITUATION.

Suppose you are working on a classification problem with only one hour left on your project, and your model is not giving good accuracy. Or your boss wants to see a first version of the model within the hour, and you have been given a large dataset. What would you do?

Well, if I were in your position, I would choose Naive Bayes. It is extremely fast compared to other classification algorithms. The magic here is that it uses Bayes' theorem of probability to predict the class of unknown data.

So relax: the next time you face this kind of situation, you will figure out the solution easily. In this article, I will discuss all the basics as well as the important questions and topics related to Naive Bayes.

Without wasting any time, grab your cup of coffee, take a deep breath, and let's begin our journey.

## Topics you will learn now!!

• What is the Naive Bayes Algorithm?

• How does the Naive Bayes Algorithm work?

• What are the Pros and Cons?

• Application of Naive Bayes Algorithm.

• Naive Bayes code in Python using Scikit-Learn.

• Endnote.

## What is the Naive Bayes Algorithm?

First things first. Bayes' theorem is named after Thomas Bayes, an English minister and statistician from the 1700s. Naive Bayes works on the concept of conditional probability as expressed by Bayes' theorem.

Let us go through some of the simple concepts of probability that we will use. Consider the following example of tossing two coins. If we toss two coins and look at all the different possibilities, we have the sample space {HH, HT, TH, TT}.

When doing the maths on probability, we usually denote probability as P. Some of the probabilities in this event are as follows:

The probability of getting two heads = 1/4

The probability of at least one tail = 3/4

The probability of the second coin being head given the first coin is tail = 1/2

The probability of getting two heads given the first coin is head = 1/2
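These four values can be checked by counting outcomes in the sample space. The following is a minimal sketch; the helper name `prob` and the variable names are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space for two coin tosses: HH, HT, TH, TT (all equally likely).
sample_space = ["".join(p) for p in product("HT", repeat=2)]

def prob(event, given=lambda o: True):
    """P(event | given), computed by counting equally likely outcomes."""
    conditioned = [o for o in sample_space if given(o)]
    favourable = [o for o in conditioned if event(o)]
    return Fraction(len(favourable), len(conditioned))

p_two_heads = prob(lambda o: o == "HH")
p_at_least_one_tail = prob(lambda o: "T" in o)
p_second_head_given_first_tail = prob(
    lambda o: o[1] == "H", given=lambda o: o[0] == "T"
)
p_two_heads_given_first_head = prob(
    lambda o: o == "HH", given=lambda o: o[0] == "H"
)

print(p_two_heads)                     # 1/4
print(p_at_least_one_tail)             # 3/4
print(p_second_head_given_first_tail)  # 1/2
print(p_two_heads_given_first_head)    # 1/2
```

The conditional probabilities are obtained simply by shrinking the sample space to the outcomes where the condition holds, then counting again.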

Bayes' theorem gives us the conditional probability of event A, given that event B has occurred. In this case, the first coin toss is B and the second coin toss is A. This might be confusing because we have reversed the order and go from B to A rather than A to B.

According to Bayes' theorem:

P(A | B) = P(B | A) P(A) / P(B)
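As a quick check, the theorem can be applied to the coin example, taking A as "both coins are heads" and B as "the first coin is a head". This is only a sketch; the function name `bayes` is an assumption made for illustration.

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b):
    """Bayes' theorem: P(A | B) = P(B | A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# A = "both coins are heads", B = "the first coin is a head".
p_a = Fraction(1, 4)       # P(two heads)
p_b = Fraction(1, 2)       # P(first coin is a head)
p_b_given_a = Fraction(1)  # if both are heads, the first one certainly is

p_a_given_b = bayes(p_b_given_a, p_a, p_b)
print(p_a_given_b)  # 1/2
```

The result matches the probability of getting two heads given the first coin is a head, which was found above by direct counting.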

Now let me give you one more practical example of a Naive Bayes application in real life. Suppose you are out of money and need some for a personal expense, so you apply for a loan. In this case, the lending company will decide whether or not you are an eligible applicant depending on your previous loans, age, income, transaction history, qualifications, location, etc. Although these features may depend on one another, they are still treated as independent. This assumption simplifies the computation, and that is why the approach is called naive. The assumption is also called class conditional independence.
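Written out, class conditional independence says that the joint likelihood of the features factorises into one term per feature. The symbols below are assumptions for illustration: C is the class (for example, "eligible") and x_1, ..., x_n are the feature values.

```latex
P(x_1, x_2, \dots, x_n \mid C) = \prod_{i=1}^{n} P(x_i \mid C)
\qquad\Longrightarrow\qquad
P(C \mid x_1, \dots, x_n) \propto P(C) \prod_{i=1}^{n} P(x_i \mid C)
```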

## How does the Naive Bayes Algorithm work?

Now it's time to understand the Naive Bayes algorithm with an example. Below we will discuss playing outside based on the weather conditions. Here, we will calculate the probability of playing sports, i.e. whether players will play outside or not based on the weather conditions.

Our approach is split into two cases:

• In the case of a single feature.

• In case of multiple features.

### In the case of a single feature.

Steps for calculation of the probability of an event :

• Step 1: We have to calculate the prior probability of a given class.

• Step 2: Then find the likelihood probability of each attribute for each class.

• Step 3: Now put the values in the Bayes formula and then calculate posterior probability.

• Step 4: Now observe which class has a higher probability.

The frequency table and likelihood table help you to calculate the prior and posterior probabilities.

Now, we will calculate the probability of playing outside when the weather is overcast.

Probability of playing outside.

P(Yes | Overcast) = P(Overcast | Yes) P(Yes) / P(Overcast) .....................(1)

Calculation of prior probabilities: P(Overcast) = 4/14 = 0.29, P(Yes) = 9/14 = 0.64

Calculation of the likelihood: P(Overcast | Yes) = 4/9 = 0.44

Putting the prior and likelihood in equation (1): P(Yes | Overcast) = 0.44 * 0.64 / 0.29 = 0.97 (higher). With the unrounded fractions, (4/9)(9/14) / (4/14), the result is exactly 1.

### Similarly, we will calculate the probability of not playing outside in overcast.

P(No | Overcast) = P(Overcast | No) P(No) / P(Overcast) .....................(2)

Calculation of prior probabilities: P(Overcast) = 4/14 = 0.29, P(No) = 5/14 = 0.36

Calculation of the likelihood: P(Overcast | No) = 0/5 = 0

Putting the prior and likelihood in equation (2): P(No | Overcast) = 0 * 0.36 / 0.29 = 0

Here, you can see the probability of yes is higher. So we can conclude that if the weather is overcast, players will play.
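The single-feature calculation can be reproduced from the counts used in the worked example (14 days, 9 "Yes", 5 "No", 4 overcast days that are all "Yes"). This is a minimal sketch; the dictionary and variable names are assumptions for illustration.

```python
from fractions import Fraction

# Counts taken from the worked example: 14 days, 9 "Yes" and 5 "No".
class_counts = {"Yes": 9, "No": 5}
# Number of overcast days within each class.
overcast_counts = {"Yes": 4, "No": 0}

total_days = sum(class_counts.values())
# Evidence: P(Overcast) = 4/14.
p_overcast = Fraction(sum(overcast_counts.values()), total_days)

posteriors = {}
for label, n_class in class_counts.items():
    prior = Fraction(n_class, total_days)                   # P(class)
    likelihood = Fraction(overcast_counts[label], n_class)  # P(Overcast | class)
    posteriors[label] = likelihood * prior / p_overcast     # Bayes' theorem

prediction = max(posteriors, key=posteriors.get)
for label, value in posteriors.items():
    print(label, float(value))
print("Prediction:", prediction)
```

Using exact fractions, the posterior for "Yes" is exactly 1 and for "No" exactly 0. A hand calculation with decimals rounded to two places lands slightly below 1 for "Yes" only because of rounding.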

### In case of multiple features.

Now, let's make it more interesting. Here we will calculate the probability when the weather is overcast and the temperature is mild. Cool right!! Let's go for the calculation.

### Probability of playing outside:

P(Play=Yes | Weather=Overcast, Temp=Mild) ∝ P(Weather=Overcast, Temp=Mild | Play=Yes) P(Play=Yes) ..........(1)

The denominator P(Weather=Overcast, Temp=Mild) is the same for both classes, so it is left out and only the resulting scores are compared.

P(Weather=Overcast, Temp=Mild | Play=Yes) = P(Overcast | Yes) P(Mild | Yes) ………..(2)

Calculation of the prior probability: P(Yes) = 9/14 = 0.64

Calculation of the likelihoods: P(Overcast | Yes) = 4/9 = 0.44, P(Mild | Yes) = 4/9 = 0.44

Putting the likelihoods in equation (2): P(Weather=Overcast, Temp=Mild | Play=Yes) = 0.44 * 0.44 = 0.1936

Putting the prior and likelihood in equation (1): the score for Play=Yes is 0.1936 * 0.64 = 0.124 (higher)

### Similarly, for not playing:

P(Play=No | Weather=Overcast, Temp=Mild) ∝ P(Weather=Overcast, Temp=Mild | Play=No) P(Play=No) ..........(3)

P(Weather=Overcast, Temp=Mild | Play=No) = P(Weather=Overcast | Play=No) P(Temp=Mild | Play=No) ………..(4)

Calculation of the prior probability: P(No) = 5/14 = 0.36

Calculation of the likelihoods: P(Weather=Overcast | Play=No) = 0/5 = 0, P(Temp=Mild | Play=No) = 2/5 = 0.4

Putting the likelihoods in equation (4): P(Weather=Overcast, Temp=Mild | Play=No) = 0 * 0.4 = 0

Putting the prior and likelihood in equation (3): the score for Play=No is 0 * 0.36 = 0

Here also the probability of yes is higher. So, we can conclude that the probability of playing when the weather is overcast and the temperature is mild is high.
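The two-feature case works the same way: multiply the prior by one likelihood per feature, then compare the scores. The sketch below uses only the counts quoted in the calculation above; the names `feature_counts`, `scores` and `posteriors` are assumptions for illustration.

```python
from fractions import Fraction

# Counts from the worked example: 9 "Yes" days and 5 "No" days out of 14.
class_counts = {"Yes": 9, "No": 5}
# How often each observed feature value occurs within each class.
feature_counts = {
    "Weather=Overcast": {"Yes": 4, "No": 0},
    "Temp=Mild": {"Yes": 4, "No": 2},
}
total_days = sum(class_counts.values())

scores = {}
for label, n_class in class_counts.items():
    score = Fraction(n_class, total_days)  # prior P(class)
    for counts in feature_counts.values():
        # Naive assumption: multiply the per-feature likelihoods.
        score *= Fraction(counts[label], n_class)
    scores[label] = score

# Dividing by the sum of the scores turns them into posterior probabilities.
evidence = sum(scores.values())
posteriors = {label: s / evidence for label, s in scores.items()}
prediction = max(scores, key=scores.get)

for label in scores:
    print(label, float(scores[label]), float(posteriors[label]))
print("Prediction:", prediction)
```

The unnormalised score for "Yes" is 8/63 (about 0.127 with exact fractions), and the score for "No" is 0 because overcast never occurs with "No". After normalisation the posterior for "Yes" is 1.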

## Let's discuss the pros and cons.

### Pros:

• It is easy and fast to predict the class of a test data set. It also performs well in multi-class prediction.

• When the assumption of independence holds, a Naive Bayes classifier performs better compared to other models like logistic regression, and you need less training data.

• It performs well with categorical input variables compared to numerical variable(s). For a numerical variable, a normal distribution is assumed (bell curve, which is a strong assumption).

### Cons:

• If a categorical variable contains a category (in the test data set) which wasn't observed in the training data set, then the model will assign it a 0 (zero) probability and will be unable to make a prediction. This is often referred to as “Zero Frequency”. To solve this, we can use a smoothing technique. One of the simplest smoothing techniques is called Laplace estimation.

• On the other hand, Naive Bayes is also known to be a bad estimator, so the probability outputs from predict_proba are not to be taken too seriously.

• Another limitation of Naive Bayes is the assumption of independent predictors. In the real world, it is almost impossible to get a set of predictors which are completely independent.
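The zero-frequency problem from the first point above, and the Laplace fix, can be sketched with the overcast example, where overcast never occurs among the 5 "No" days. The function name `smoothed_likelihood` is made up for illustration, and the value `n_categories=3` is an assumption, since the number of distinct weather values is not stated in the text.

```python
from fractions import Fraction

def smoothed_likelihood(count, class_total, n_categories, alpha=1):
    """Laplace (add-alpha) estimate of P(feature value | class)."""
    return Fraction(count + alpha, class_total + alpha * n_categories)

# Overcast never occurs among the 5 "No" days in the worked example.
unsmoothed = Fraction(0, 5)

# n_categories=3 is an ASSUMPTION: the text does not say how many
# distinct weather values exist.
smoothed = smoothed_likelihood(count=0, class_total=5, n_categories=3)

print(unsmoothed)  # 0
print(smoothed)    # 1/8
```

With smoothing, the likelihood is small but no longer zero, so a single unseen combination cannot wipe out the whole product of probabilities.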

## Application and Implementation.

• Naive Bayes is an eager learner, and that's why it is fast. So we can use it for predictions in real time.

• It is also good for multiclass prediction. So, we can predict the probability of multiple classes of a given target variable.

• Used in text classification like spam filtering and sentiment analysis.

• Used to build a recommendation system.

## Naive Bayes code in Python using Scikit-Learn.

Again, scikit-learn (a Python library) will help here to build a Naive Bayes model in Python. Three commonly used types of Naive Bayes model in the scikit-learn library are:

Gaussian: It is used in classification and it assumes that features follow a normal distribution.

Multinomial: It is used for discrete counts. For example, let's say we have a text classification problem. Here we can take the Bernoulli trials one step further: instead of “word occurs in the document”, we have “count how often the word occurs in the document”. You can think of it as “number of times outcome number x_i is observed over the n trials”.

Bernoulli: The Bernoulli model is useful if your feature vectors are binary (i.e. zeros and ones). One application would be text classification with a ‘bag of words’ model where the 1s and 0s are “word occurs in the document” and “word does not occur in the document” respectively.
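To make the difference between the multinomial and Bernoulli variants concrete, here is a small sketch on made-up data. The word counts, the three-word vocabulary and the spam labels are all hypothetical and exist only for illustration.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, MultinomialNB

# Hypothetical word counts for four tiny documents.
# Each column is one word of a made-up three-word vocabulary.
X_counts = np.array([
    [2, 0, 1],
    [3, 0, 0],
    [0, 2, 1],
    [0, 3, 0],
])
y = np.array([1, 1, 0, 0])  # 1 = spam, 0 = not spam (assumed labels)

# MultinomialNB uses the counts themselves.
multinomial = MultinomialNB().fit(X_counts, y)

# BernoulliNB only looks at presence or absence; with its default
# binarize=0.0, any count above zero is treated as 1.
bernoulli = BernoulliNB().fit(X_counts, y)

new_docs = np.array([[1, 0, 0], [0, 1, 0]])
multinomial_pred = multinomial.predict(new_docs)
bernoulli_pred = bernoulli.predict(new_docs)

print(multinomial_pred)
print(bernoulli_pred)
```

On this toy data both models agree: a document containing only the first word is classed as spam, and one containing only the second word is not.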

Here we will use the Gaussian technique, but don't worry about the remaining techniques; they are coming soon. So, I suggest you subscribe to my mailing list in the footer section.

Now. BOOM!!! PYTHON CODE
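A minimal sketch of a Gaussian Naive Bayes model with scikit-learn might look like this. The Iris dataset, the 70/30 split and the `random_state` value are assumptions made for illustration, not necessarily the dataset or settings used for this article.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Load a small built-in dataset (an assumption for this sketch).
X, y = load_iris(return_X_y=True)

# Hold out 30% of the rows for testing; stratify keeps class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Fit the model: it estimates a mean and variance per feature and class.
model = GaussianNB()
model.fit(X_train, y_train)

# Predict the class of the unseen test rows and measure accuracy.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
```

No feature scaling or tuning is done here; the point is only to show the fit and predict workflow.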

Here I applied the Gaussian technique. I think you have observed that I haven't done much optimization with the dataset. That is also an important topic, and I will cover it in a different article.

## Let's end here

I think this is enough for Naive Bayes. Here you have got the basic concept of how it works, its applications, the maths behind it, Python code, etc. I hope you have loved it, so show your love by hitting the love button and sharing it. Remember, sharing knowledge is also a part of success.
