I have a set of variables that I want to use for a regression or classification problem. After computing the correlation matrix of these variables, I discovered that some pairs of variables have a Pearson correlation as high as 1.
- Does this mean that these variables hold redundant information for the learner?
- Is it safe to remove one of them without risking information loss? If so, how do I choose which one to remove?
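
To make the setup concrete, here is a minimal sketch of the kind of check I ran. The toy data, the column names, the `highly_correlated_pairs` helper, and the 0.999 threshold are all made up for illustration; my real data and tooling differ.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def highly_correlated_pairs(columns, threshold=0.999):
    """Return (name_a, name_b, r) for every pair of columns with |r| >= threshold."""
    pairs = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            pairs.append((a, b, r))
    return pairs


# Hypothetical toy data: x2 is an exact linear function of x1 (x2 = 2 * x1 + 1),
# while x3 is only loosely related to the other two.
data = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x2": [3.0, 5.0, 7.0, 9.0, 11.0],
    "x3": [2.0, 1.0, 4.0, 3.0, 6.0],
}

pairs = highly_correlated_pairs(data)
for a, b, r in pairs:
    print(f"{a} and {b}: r = {r:.4f}")
```

In this toy case only the pair (`x1`, `x2`) is flagged, which mirrors the situation in my data where some pairs reach a correlation of 1.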