I'm working on selecting the most effective features from a dataset with over 2000 features. I'm using different algorithms for that (SelectKBest with chi-square, Extra Trees, correlation, etc.). But when I look at the feature rankings, I see that SelectKBest with chi-square generates exactly the same results as correlation. Is that possible, or am I doing something wrong?
All my features are continuous 64-bit floats in the range [-8, 11], and my target column is binary, taking only the values 0 or 1.
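For reference, here is a hypothetical toy frame matching that description (the column names and sizes are made up, not my real data), built with NumPy and pandas:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Hypothetical stand-in: float64 features in [-8, 11] and a binary "Class" target
data = pd.DataFrame(rng.uniform(-8, 11, size=(100, 5)),
                    columns=[f"f{i}" for i in range(5)])
data["Class"] = rng.integers(0, 2, size=100)

print(data.dtypes.unique())  # float64 for the features, int64 for the target
```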
Updated on 05.09.19: I am STILL trying to figure out how this can be possible. I can guess that both methods are based on the same formula or were developed by the same person, but I need a proof to understand it clearly.
```python
cor = data.corr()
# "Class" is my target column; take the absolute correlation of every feature with it
cor_target = abs(cor["Class"])
# Keep correlation values for every feature, dropping the target column itself
relevant_features = cor_target[cor_target > 0].drop(labels=["Class"])
# Top 1000 features
relevant_features = relevant_features.nlargest(1000).index.values
```
```python
bestfeatures = SelectKBest(score_func=chi2, k="all")
fit = bestfeatures.fit(dataValues, dataTargetEncoded)
feat_importances_chi = pd.Series(fit.scores_, index=dataValues.columns).nlargest(1000).index.values
```
And the resulting `relevant_features` and `feat_importances_chi` contain exactly the same features in the same order.
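To make the comparison reproducible, here is a minimal sketch of both rankings side by side on synthetic data (the dataset, sizes, and names are invented for illustration; note that scikit-learn's `chi2` requires non-negative inputs, so the toy features are drawn from [0, 1) rather than my real [-8, 11] range):

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2

rng = np.random.default_rng(0)

# Toy dataset: 20 non-negative float features plus a binary "Class" target
data = pd.DataFrame(rng.random((200, 20)),
                    columns=[f"f{i}" for i in range(20)])
data["Class"] = rng.integers(0, 2, size=200)

# Ranking 1: absolute Pearson correlation with the target
cor_rank = data.corr()["Class"].abs().drop("Class").nlargest(10).index.values

# Ranking 2: chi-square scores from SelectKBest
X = data.drop(columns=["Class"])
fit = SelectKBest(score_func=chi2, k="all").fit(X, data["Class"])
chi_rank = pd.Series(fit.scores_, index=X.columns).nlargest(10).index.values

# Check whether the two orderings coincide
print(np.array_equal(cor_rank, chi_rank))
```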