Automatically assess training data quality for a land cover classification system


I am working on a land cover classification system that uses a time series of multispectral Sentinel-hub imagery to categorize land cover. The training data is captured manually by drawing polygons around areas of a particular land cover type in QGIS.

Since the polygons are captured manually, there is a small chance that a polygon has been labeled incorrectly or that it contains more than one land cover class. I need a way to filter out the polygons that have a high probability of being mislabelled, so that the model receives only the most reliable samples for training. In effect, this would act as an automatic quality check. Are there any metrics or approaches (preferably unsupervised) that can be used for this use case?
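
To make the goal concrete, here is a minimal sketch of the kind of check I have in mind. It is only an illustration under assumptions, not an established method: each polygon is assumed to be summarized as one feature vector (for example, the mean of every band at every date), and a polygon is flagged when it lies unusually far from the median vector of its own class, using a modified z-score based on the median absolute deviation. The function name `flag_suspect_polygons`, its parameters, and the 3.5 cut-off are all hypothetical choices.

```python
import math
from collections import defaultdict
from statistics import median


def flag_suspect_polygons(features, labels, threshold=3.5):
    """Return one boolean per polygon: True if it looks atypical for its label.

    features:  list of equal-length numeric vectors, one per polygon
               (assumed: per-polygon mean of each band at each date).
    labels:    list of class labels, one per polygon.
    threshold: cut-off on the modified z-score (3.5 is a common convention,
               not a value tuned for land cover data).
    """
    # Group polygon indices by their manually assigned class.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    flags = [False] * len(labels)
    for indices in by_class.values():
        # Robust class centre: per-feature median over the class's polygons.
        centre = [median(col) for col in zip(*(features[i] for i in indices))]
        # Euclidean distance of each polygon to that centre.
        dists = [math.dist(features[i], centre) for i in indices]
        med = median(dists)
        mad = median(abs(d - med) for d in dists)
        if mad == 0:
            continue  # degenerate spread: skip this class rather than divide by zero
        for i, d in zip(indices, dists):
            # Modified z-score; only unusually *large* distances are flagged.
            flags[i] = 0.6745 * (d - med) / mad > threshold
    return flags
```

It would be called as `flag_suspect_polygons(features, labels)` and the flagged polygons dropped or reviewed by hand. A check like this only targets polygons whose overall signature disagrees with their label; polygons that mix several classes would probably need an additional per-pixel measure, such as the spread of pixel values inside each polygon.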

Faiz Kidwai

Posted 2019-06-18T11:47:49.450

No answers