Naive Bayes Algorithm

- May 01, 2024

Naive Bayes is used to solve classification problems.

This algorithm is based on Bayes' theorem, but before understanding Bayes' theorem, let's revise some probability concepts:

Independent Event :

e.g., rolling a die

Prob. of 1 on top : 1/6

Prob. of 2 on top : 1/6

Prob. of 5 on top : 1/6

Here, each roll is independent of the others: the result of one roll does not change the probabilities for the next roll.
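As a small check of what independence means, here is a minimal sketch that lists every outcome of two rolls of a fair die and compares the probability of a 2 on the second roll with and without knowing the first roll. The variable names are only illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first roll, second roll) pairs for a fair die
outcomes = list(product(range(1, 7), repeat=2))

# P(second roll is 2), ignoring the first roll
p_second_is_2 = Fraction(sum(1 for _, s in outcomes if s == 2), len(outcomes))

# P(second roll is 2 | first roll is 1): look only at pairs where the first roll is 1
first_is_1 = [(f, s) for f, s in outcomes if f == 1]
p_second_is_2_given_first_is_1 = Fraction(
    sum(1 for _, s in first_is_1 if s == 2), len(first_is_1)
)

print(p_second_is_2, p_second_is_2_given_first_is_1)  # 1/6 1/6
```

Both values are 1/6, so knowing the first roll tells us nothing about the second one.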

Dependent Event :

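e.g., a box has 3 red balls and 2 blue balls, and we take out two balls one after the other without putting the first one back.

Probability of taking a Red ball first : 3/5

Probability of then taking a Blue ball (after the Red ball is removed) : 2/4

As one ball is removed, the total no. of balls is reduced by 1 for the next draw. The second probability depends on what happened in the first draw, so these are dependent events. For dependent events,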

P(A and B) = P(A)P(B|A)
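The product rule can be tried on the box of 3 red and 2 blue balls. This is only a sketch using exact fractions, with A = "first ball is red" and B = "second ball is blue"; the variable names are my own.

```python
from fractions import Fraction

red, blue = 3, 2
total = red + blue

# P(A): first ball is red
p_red_first = Fraction(red, total)              # 3/5

# P(B|A): second ball is blue, given a red ball was already removed
p_blue_given_red = Fraction(blue, total - 1)    # 2/4

# P(A and B) = P(A) * P(B|A)
p_red_then_blue = p_red_first * p_blue_given_red
print(p_red_then_blue)  # 3/10
```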

We know that,

P(A and B) = P(B and A)

P(A)P(B|A) = P(B)P(A|B)

Dividing both sides by P(B) gives Bayes' Theorem :

$P(A\mid B)=\frac {P(B\mid A) \cdot P(A)}{P(B)}$

A, B	=	events
P(A|B)	=	probability of A given B
P(B|A)	=	probability of B given A
P(A), P(B)	=	the independent probabilities of A and B
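To see the theorem with numbers, here is a minimal sketch that reuses the box of 3 red and 2 blue balls from the dependent event example. It asks: if the second ball drawn is blue (B), what is the probability that the first ball was red (A)? The helper name `bayes` is hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

p_a = Fraction(3, 5)            # P(A): first ball is red
p_b_given_a = Fraction(2, 4)    # P(B|A): second is blue, given first was red

# P(B): second ball is blue, summed over both cases for the first ball
# (first red, then blue) + (first blue, then blue)
p_b = Fraction(3, 5) * Fraction(2, 4) + Fraction(2, 5) * Fraction(1, 4)

p_a_given_b = bayes(p_b_given_a, p_a, p_b)
print(p_a_given_b)  # 3/4
```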

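For,

Independent Features : X1, X2, ..., Xn

Dependent Feature (target) : y

Bayes' theorem, together with the "naive" assumption that the features are independent of each other given y, gives

$P(y\mid X_1,X_2,\dots,X_n)=\frac {P(y) \cdot P(X_1\mid y) \cdot P(X_2\mid y) \cdots P(X_n\mid y)}{P(X_1,X_2,\dots,X_n)}$

Here, the denominator is the same for P(Yes | X) and P(No | X) and it is constant, so we can ignore it.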

What remains for each class is the class probability multiplied by the probabilities of the individual features. Because the common denominator was dropped, the values computed for $P(Yes \mid X)$ and $P(No \mid X)$ do not sum up to 1 directly.

So, we need to normalize them by dividing each value by the sum of both values.
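The normalization step can be sketched as below. The helper name `normalize` and the two input scores are made up purely for illustration; they are not taken from any dataset.

```python
def normalize(scores):
    """Divide each unnormalized class score by the sum of all scores."""
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}

# Arbitrary illustrative scores (assumed values, not real data)
unnormalized = {"Yes": 0.02, "No": 0.06}

posteriors = normalize(unnormalized)
print(posteriors)
```

After this step the values for the two classes add up to 1, so they can be read as probabilities.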

Example :-

Qs : If the weather is Sunny & Hot, will he play tennis or not?

Here, as our output column (PlayTennis) has only the values Yes and No, it is a binary classification problem.

Basically we need to find P(Yes | Sunny, Hot) and P(No | Sunny, Hot)

Ignoring the constant denominator, each one is proportional to :

P(Yes | Sunny, Hot) ∝ P(Yes) P(Sunny | Yes) P(Hot | Yes)

P(No | Sunny, Hot) ∝ P(No) P(Sunny | No) P(Hot | No)

Total No. of Yes in PlayTennis = 9

Total No. of No in PlayTennis = 5

Total : 9+5 = 14

P(Yes) = 9/14

P(No) = 5/14

P(Sunny | Yes) = 2/9

P(Sunny | No) = 3/5

P(Hot | Yes) = 2/9

P(Hot | No) = 2/5

So, now put these values in our equations :

P(Yes | Sunny, Hot) ∝ P(Yes) P(Sunny | Yes) P(Hot | Yes)

= (9/14)(2/9)(2/9)

= 0.0317

P(No | Sunny, Hot) ∝ P(No) P(Sunny | No) P(Hot | No)

= (5/14)(3/5)(2/5)

= 0.0857

Now, let's normalize these values so that they sum up to 1 :

P(Yes | Sunny, Hot) = 0.0317 / (0.0317 + 0.0857)

= 0.27

P(No | Sunny, Hot) = 0.0857 / (0.0317 + 0.0857)

= 0.73

Here,

P(No | Sunny, Hot) > P(Yes | Sunny, Hot)

So, he will not play tennis if it's Sunny & Hot.
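The whole calculation can be reproduced with a short sketch. It uses only the counts quoted in this example (9 Yes, 5 No, and the Sunny / Hot counts per class); the function name `naive_bayes_posteriors` and the dictionary layout are my own assumptions, not a library API.

```python
from fractions import Fraction

# Counts taken from the example above
class_counts = {"Yes": 9, "No": 5}
feature_counts = {
    "Yes": {"Sunny": 2, "Hot": 2},   # P(Sunny|Yes) = 2/9, P(Hot|Yes) = 2/9
    "No": {"Sunny": 3, "Hot": 2},    # P(Sunny|No) = 3/5, P(Hot|No) = 2/5
}

def naive_bayes_posteriors(class_counts, feature_counts, observed):
    """Return the normalized P(class | observed) under the naive assumption."""
    total = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        score = Fraction(n_label, total)  # prior P(class)
        for value in observed:
            # multiply by P(value | class)
            score *= Fraction(feature_counts[label][value], n_label)
        scores[label] = score
    norm = sum(scores.values())  # normalize so the results sum to 1
    return {label: score / norm for label, score in scores.items()}

posteriors = naive_bayes_posteriors(class_counts, feature_counts, ["Sunny", "Hot"])
prediction = max(posteriors, key=posteriors.get)

for label, p in posteriors.items():
    print(label, p, round(float(p), 2))
print("Prediction:", prediction)
```

This prints 10/37 (about 0.27) for Yes and 27/37 (about 0.73) for No, so the prediction is No, matching the hand calculation.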
