Showing posts from July, 2025

ML Notes Personal2

Handling Mixed Variables | Feature Engineering

Mixed variable = numeric + categorical data in a single cell or column.

cell: solution:

column: solution:

Handling date and time related data:

Handling missing values in a dataset

Solution 1: CCA (Complete Case Analysis)

Remove the entire row if any column in that row has a null value.

When to use this?
 - Say we have 1000 rows, and for 50 of them the age column is missing. We remove these 50 rows only if they are scattered at random through the data, not clustered together (for example, all at the top or all at the bottom). This is called MCAR (Missing Completely At Random).
 - When less than 5% of the data is missing, combining all columns.
 - The same idea works for columns too: if one column has 95% missing data, just remove that column.

Example: remove the rows where these columns contain missing values.

Solution 2: mean/median imputation

When the data is spread symmetrically around its center, use the mean. If it is skewed to the left or right, go with the median. You can use the fillna() function....
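As a rough sketch of the two missing-value approaches in these notes, the pandas snippet below drops a nearly empty column, applies CCA to the rows, and then does mean/median imputation with fillna(). The DataFrame, its column names, the helper impute_center and the skewness cutoff of 1.0 are all made up for illustration; the notes do not say how to decide that a column is "skewed", so the cutoff is only an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "age":   [22.0, np.nan, 35.0, 41.0, 29.0, np.nan],
    "fare":  [7.25, 71.28, np.nan, 8.05, 512.33, 13.0],
    "cabin": [np.nan] * 6,
})

# Fraction of missing values per column.
missing_share = df.isnull().mean()

# Column-wise removal: drop columns with at least 95% missing values.
mostly_missing = missing_share[missing_share >= 0.95].index
df_reduced = df.drop(columns=mostly_missing)

# CCA (row-wise): drop rows where the chosen columns are missing.
# The notes suggest this only when under 5% of the data is missing and
# the gaps are MCAR; this toy frame is too small to respect that rule.
cca_cols = ["age"]
df_cca = df_reduced.dropna(subset=cca_cols)


def impute_center(series, skew_threshold=1.0):
    """Fill NaNs with the median if the column is strongly skewed, else the mean.

    skew_threshold is an assumed cutoff, not something the notes specify.
    """
    use_median = abs(series.skew()) > skew_threshold
    fill_value = series.median() if use_median else series.mean()
    return series.fillna(fill_value)


# Mean/median imputation on a copy, so the CCA result stays untouched.
df_imputed = df_reduced.copy()
for col in ["age", "fare"]:
    df_imputed[col] = impute_center(df_imputed[col])

print(df_cca)
print(df_imputed)
```

In this toy frame, "age" is roughly symmetric, so its gaps are filled with the mean, while "fare" has one very large value and is filled with the median. Filling from statistics computed on the whole frame is fine for a quick look; in a train/test setup the fill values would normally be computed on the training split only.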