Unsupervised Learning::Satellite Images::Single Bands


Has anyone had success building models using KMeans for classification? I have images that only have one band, and it continues to fail. My guess is that the issue is with both the size of the image and the single band.

For example:

from osgeo import gdal,gdal_array
import numpy as np

src = '/Path/ImgA.TIF'
img_A = gdal.Open(src)

#Getting bands (count)
bands_n = img_A.RasterCount  #returns 1

band = img_A.GetRasterBand(1)

#read as array
band_arr = band.ReadAsArray()

band_sh = band_arr.shape  #shape of the NumPy array, (rows, cols)

#eg. output#

This is where I am getting stuck. If I pass only two inputs (rows/cols) to KMeans, it fails as it requires a 2D array not 1D. It also fails when I manually set the band:

#Attempt 1
rows, cols = band_sh

#Attempt 2
rows, cols, n_bands = *band_sh, 1

X = band_arr.reshape(rows*cols, n_bands)  # i.e. X = band_arr.reshape(rows*cols, 1)

from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=2, random_state=2).fit(X)

Any ideas? This works just fine with RGB but fails each time when dealing with rasters that are single band.


Posted 2019-12-11T08:45:23.387

Reputation: 133



"it fails as it requires a 2D array not 1D"

The X in .fit(X) is 2D, where each row is one training point and the columns are the features of that training point.

It should be a list of lists:

X = [
  [feature_1, feature_2,....feature_n], # training point 1
  [feature_1, feature_2,....feature_n], # training point 2
  [feature_1, feature_2,....feature_n]  # training point 123
]

The K-Means will train with all the examples in X.
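To make the expected layout concrete, here is a small sketch using NumPy. The values and variable names are made up purely for illustration; the point is only the difference between a 2D array of samples and a flat 1D array.

```python
import numpy as np

# Hypothetical data: 3 training points, each with 2 features.
X_2d = np.array([
    [0.1, 0.2],   # training point 1
    [0.4, 0.3],   # training point 2
    [0.9, 0.8],   # training point 3
])
n_samples, n_features = X_2d.shape  # rows are samples, columns are features

# A flat array has a single axis, so there is no separate feature dimension.
X_1d = np.array([0.1, 0.4, 0.9])
print(X_2d.ndim, X_1d.ndim)
```

An estimator's `.fit()` expects the first layout, with one row per sample.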

However, you are only giving 1 training example, namely img_A. So your X is like this at the moment:

X = [feature_1, feature_2,....feature_n] # img_A

So, to .fit() K-Means, you need more image examples.

Bruno Lubascher

Posted 2019-12-11T08:45:23.387

Reputation: 2 833

Negative-O, Bruno, but you are on to something. The issue is that X needed to be adjusted to account for 1 band. This was achieved by flattening the array from `.ReadAsArray()` into a single column, i.e. reshaping it to (-1, 1). Now I just need to figure out how to create a label dataset to run and compare a supervised model against...hmmm – OctoCatKnows – 2019-12-12T10:00:02.567
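
For later readers, a minimal sketch of the fix described in the comment above, assuming NumPy and scikit-learn are installed. A small synthetic array stands in for the output of `band.ReadAsArray()`, and the names `band_arr` and `labels_img` are only illustrative. Each pixel is treated as one sample with a single feature (its band value).

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for band.ReadAsArray(): a 2D (rows, cols) single-band image
# made of a darker region stacked on top of a brighter region.
rng = np.random.default_rng(0)
band_arr = np.concatenate([
    rng.normal(10.0, 1.0, size=(20, 30)),   # darker region
    rng.normal(50.0, 1.0, size=(20, 30)),   # brighter region
])
rows, cols = band_arr.shape

# Each pixel becomes one row with one column: shape (rows*cols, 1).
X = band_arr.reshape(-1, 1)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=2).fit(X)

# Map the per-pixel cluster labels back onto the image grid.
labels_img = kmeans.labels_.reshape(rows, cols)
```

With a real raster, `band_arr` would come from `ReadAsArray()` instead. If image size is a concern, fitting on a random subset of pixels or trying scikit-learn's `MiniBatchKMeans` may be worth a look, though neither was tested in this thread.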