Showing posts from May, 2024

Decision Tree for Classification

Image
Now, two questions arise.

Q1: How do we know that we have reached a pure split?
There are 2 methods to find out:
  1) Entropy (Good)
  2) Gini Impurity (Better because it's faster to compute, since it needs no logarithm)

Q2: How are the features selected?
  Information Gain

1) Entropy:

$$H(S) = -\sum_{i=1}^{c} p_i \log_2(p_i)$$

Where: $c$ is the number of classes in the output, and $p_i$ is the proportion of samples belonging to class $i$.

Example: As you can see in the above diagram, for the component 'Overcast', we have 4 Yes and 0 No. Total = 4.

Now, let's calculate the entropy for this component.

$$H(S) = -p_{\text{yes}} \log_2(p_{\text{yes}}) - p_{\text{no}} \log_2(p_{\text{no}})$$

$$H(S) = -\frac{4}{4} \log_2\left(\frac{4}{4}\right) - \frac{0}{4} \log_2\left(\frac{0}{4}\right)$$

= -1(0) - 0 ...
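The entropy calculation for the 'Overcast' component can be checked with a few lines of Python. This is only a minimal sketch, not code from the post: the function name `entropy` and the list-of-counts input are assumptions made for illustration. It also assumes the usual convention that a class with zero samples contributes 0 to the sum, since log2(0) is undefined.

```python
import math


def entropy(counts):
    """Entropy (base 2) of a node, given the sample count of each class.

    `counts` is a hypothetical input format, e.g. [4, 0] for 4 Yes and 0 No.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must contain at least one sample")

    h = 0.0
    for n in counts:
        if n == 0:
            # Assumed convention: 0 * log2(0) is treated as 0.
            continue
        p = n / total
        h -= p * math.log2(p)
    return h


# 'Overcast' from the example: 4 Yes, 0 No, so the node is pure.
overcast = entropy([4, 0])

# For comparison, an evenly mixed node (2 Yes, 2 No) is maximally impure.
mixed = entropy([2, 2])

print("Overcast entropy:", overcast)
print("Evenly mixed entropy:", mixed)
```

An entropy of 0 means the split is pure, while for two classes the value grows to 1 as the classes become evenly mixed.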