Posts

Showing posts from June, 2024

Random Forest

Random Forest is a type of bagging that uses decision trees to improve prediction accuracy and robustness. The core idea is to create a "forest" of decision trees, each built on a different subset of the data and using a random subset of features. It's used for both classification (like spam detection) and regression (like predicting house prices).

How Does It Work?
1. Data Sampling: Randomly pick samples from your data (with replacement).
2. Tree Building: Build a decision tree on each sample.
3. Voting/Averaging: For classification, the trees vote for the most common class; for regression, their predictions are averaged.

Example: Predicting House Prices
Imagine we want to predict house prices based on features like size, location, and age of the house.

Step-by-Step
1. Collect Data: Gather data on house prices along with features like size, location, and age.
2. Create Subsets: Randomly create multiple subsets of this data.
3. Build Trees: For each subset, build a decision tree. Each tree might ...
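The three steps above can be sketched in plain Python. This is a minimal illustration, not a production implementation: the "trees" are one-split stumps, the size/price pairs are made-up toy values, and per-split feature subsampling is omitted since there is only one feature.

```python
import random

random.seed(0)  # reproducible sketch

# Toy (house size sq ft, price in $1000) pairs -- illustrative values only.
data = [(1000, 200), (1200, 230), (1500, 280), (1800, 330),
        (2000, 360), (2300, 410), (2600, 450), (3000, 520)]

def mean(ys):
    return sum(ys) / len(ys)

def fit_stump(sample):
    """A one-split 'tree': choose the size threshold whose two groups
    have the lowest combined squared error, then predict each group's
    mean price."""
    xs = sorted({x for x, _ in sample})
    if len(xs) < 2:  # degenerate bootstrap sample: just predict the mean
        m = mean([y for _, y in sample])
        return lambda x: m
    best = None
    for a, b in zip(xs, xs[1:]):
        thr = (a + b) / 2
        left = [y for x, y in sample if x <= thr]
        right = [y for x, y in sample if x > thr]
        err = sum((y - mean(left)) ** 2 for y in left) + \
              sum((y - mean(right)) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, thr, mean(left), mean(right))
    _, thr, lo, hi = best
    return lambda x: lo if x <= thr else hi

# 1. Data sampling: draw bootstrap samples (with replacement).
# 2. Tree building: fit one small tree per sample.
forest = [fit_stump([random.choice(data) for _ in data]) for _ in range(25)]

# 3. Averaging: for regression, the forest averages its trees' predictions.
def predict(x):
    return sum(tree(x) for tree in forest) / len(forest)
```

A real Random Forest (for example, scikit-learn's RandomForestRegressor) grows full trees rather than stumps and also samples a random subset of features at every split, but the sampling/building/averaging skeleton is the same.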

Ensembling Techniques

Ensembling is a technique in machine learning where multiple models are combined to make better predictions than any single model could achieve on its own. The idea is that by using multiple models, you can reduce errors and improve accuracy.

Suppose you are confused about choosing science or commerce for higher studies, so you seek advice. Possible ways:

A: You may ask one of your friends. Here, you are likely to make a decision based on one person's beliefs and point of view. Moreover, it might be the case that your friend has chosen science and wants you to choose the same, so that you both can be together.

B: Another way could be to ask 5 of your classmates. This should provide a better idea, and may give a more honest opinion since you are getting multiple perspectives. However, the problem still exists if most of your classmates share a similar background or bias.

C: How about asking 50 people? Now, if you ask 50 people, including students, teachers, and ...
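The "ask many people instead of one" idea maps directly onto majority voting, the simplest ensembling scheme. A minimal sketch, assuming three hypothetical models that each label the same three inputs:

```python
from collections import Counter

# Each row holds one hypothetical model's predictions for three inputs.
predictions = [
    ["science",  "commerce", "science"],   # model A (one friend's view)
    ["science",  "science",  "commerce"],  # model B
    ["commerce", "science",  "science"],   # model C
]

def majority_vote(per_model):
    """For each input, return the label most models agreed on."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*per_model)]

ensemble_labels = majority_vote(predictions)  # one label per input
```

Each individual model disagrees with the ensemble on at least one input, but the vote smooths out their individual biases, just as asking 50 people smooths out one friend's preference.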

Decision Tree For Regression

What is a Decision Tree for Regression? It is a decision tree used to predict continuous values.

How Does It Work?
1. Start at the Root: Begin with the entire dataset at the root node.
2. Split the Data: Choose the best feature and threshold to split the data into two groups, aiming to minimize the error within each group.
3. Repeat: Continue splitting each group until you meet a stopping criterion (like a maximum depth or minimum samples per leaf).
4. Make Predictions: The value at each leaf node is the predicted value for data points that fall into that leaf.

Example with Calculation
Let's take a simple example to illustrate how a decision tree works for regression.

Dataset
Consider a small dataset of house prices based on the size of the house. First, sort the data by house size; by sorting the data and evaluating potential split points, a decision tree for regression can accurately predict continuous values.

House Size (sq ft)    Price (in $1000)
1100                  199
1400                  245
1425                  319
1550                  219
1600                  312
1700                  279
1700                  255
1875                  308
2350                  405
2450                  ...
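The split-selection step above can be sketched in plain Python, using the house-price rows listed (the truncated last row is omitted). For every midpoint between consecutive sorted sizes, we compute the sum of squared errors (SSE) of the left and right groups around their means and keep the split with the lowest total:

```python
# (house size sq ft, price in $1000) pairs from the table above,
# excluding the row whose price is truncated.
data = [(1100, 199), (1400, 245), (1425, 319), (1550, 219), (1600, 312),
        (1700, 279), (1700, 255), (1875, 308), (2350, 405)]

def sse(ys):
    """Sum of squared errors of ys around their mean."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(points):
    """Sort by size, try the midpoint between each pair of consecutive
    distinct sizes, and return (threshold, total SSE) of the best split."""
    pts = sorted(points)
    best = None
    for (xa, _), (xb, _) in zip(pts, pts[1:]):
        if xa == xb:
            continue  # no threshold fits between equal sizes
        thr = (xa + xb) / 2
        left = [y for x, y in pts if x <= thr]
        right = [y for x, y in pts if x > thr]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (thr, total)
    return best

thr, err = best_split(data)
```

On this data the best first split separates the 2350 sq ft house from the rest (threshold 2112.5), because the 405 price is far from the other prices; each side of the split would then be predicted by its own mean, and the process repeats recursively on each side.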