I am trying to learn data analysis and machine learning by trying out some problems.
I found a competition "House prices" which is actually a playground competition. Since I am very new to this field, I got confused after exploring the data. The data has 81 columns out of which 1 is the target column which is the house value. This data contains multiple columns where majority of values are "NaN". When I ran:
nulls = data.isnull().sum() nulls[nulls > 0]
This shows the columns with missing values:
LotFrontage 259 Alley 1369 MasVnrType 8 MasVnrArea 8 BsmtQual 37 BsmtCond 37 BsmtExposure 38 BsmtFinType1 37 BsmtFinType2 38 Electrical 1 FireplaceQu 690 GarageType 81 GarageYrBlt 81 GarageFinish 81 GarageQual 81 GarageCond 81 PoolQC 1453 Fence 1179 MiscFeature 1406
At this point I am totally lost and I don't know how to get rid of these "NaN" values.
Any help would be appreciated.