Let me start in reverse order: what feature extraction is, and why feature selection and dimensionality reduction are needed.
Starting with the main use of feature extraction, which is classification. Classification is the process of deciding which category a particular object belongs to. It has two phases: i) a training phase, where the properties of the given data or objects are learned by some process (feature extraction); ii) a testing phase, where an unknown object is classified using the features learned in the previous (training) phase.
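The two phases can be sketched in a few lines; this is a minimal illustration assuming scikit-learn and its bundled iris toy dataset, not a prescribed recipe:

```python
# Two-phase classification sketch, assuming scikit-learn is available.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)          # objects and their known categories
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = SVC()
clf.fit(X_train, y_train)                  # i) training phase: learn from labelled data
predictions = clf.predict(X_test)          # ii) testing phase: classify unseen objects
accuracy = (predictions == y_test).mean()  # fraction classified correctly
print(accuracy)
```

Here the SVM does the learning; in many pipelines an explicit feature-extraction step would transform `X` before the classifier sees it.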
Feature extraction, as the name suggests, aims to find the underlying pattern in the given data; this underlying pattern is what we term a feature of that data. Various methods exist for extracting features, and the resulting features are then typically fed to a classifier such as a Support Vector Machine (SVM).
Now, feature extraction should generate features that form
- an optimal set of features, i.e. discriminative between classes and free of redundancy
Feature selection: A given set of data can be represented either by a single feature or by a set of features. In the classification process, a system is trained on at least two classes, so training will produce a single feature or a set of features per class. These features should possess the properties stated above.
The problem arises when there is a feature set for each class and some of the features are correlated. Among those correlated features, one or a few are sufficient for representation, and that is where feature selection comes into the picture. These features also need to be stored, and the memory requirement grows with the size of the feature set.
Finally there is dimensionality reduction, which can be viewed as part of the feature selection process: choosing the optimal set of features that best describes the data. There are many techniques for this, such as principal component analysis (PCA), independent component analysis (ICA), and matrix factorization.
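For example, PCA projects the data onto a few directions of maximum variance; here is a minimal sketch assuming scikit-learn and the iris dataset, with the choice of two components purely illustrative:

```python
# PCA sketch: reduce the 4 iris features to 2 components.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X.shape, "->", X_reduced.shape)        # (150, 4) -> (150, 2)
print(pca.explained_variance_ratio_.sum())   # fraction of variance retained
```

The explained-variance ratio shows how much information survives the reduction, which is the practical criterion for picking the number of components.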