In graph clustering, we want to cluster the nodes of a given graph, such that nodes in the same cluster are highly connected (by edges) and nodes in different clusters are poorly or not connected at all.

A simple (hierarchical and divisive) algorithm to cluster a graph is based on first computing a minimum spanning tree $T$ of the graph (using e.g. Kruskal's algorithm). It then proceeds in iterations. At each iteration, we remove from $T$ the edge with the highest weight. Given that $T$ is a tree, removing an edge from it splits it into a forest: after removing the edge of highest weight, we obtain two connected components, which represent two clusters. At the next iteration, we remove the edge with the second highest weight, creating further connected components, and so on, until, possibly, every node is in its own cluster (that is, all edges have been removed from $T$).
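A minimal sketch of this idea in Python (the function and variable names are mine, not from any particular library): build the MST with Kruskal's algorithm and a union-find structure, then obtain $k$ clusters by discarding the $k-1$ heaviest MST edges and taking the resulting connected components.

```python
# Sketch of MST-based divisive clustering.
# `edges` is a list of (weight, u, v) tuples over nodes 0..n-1.

def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def kruskal_mst(n, edges):
    """Return the edges of a minimum spanning tree (Kruskal's algorithm)."""
    parent = list(range(n))
    mst = []
    for w, u, v in sorted(edges):
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:            # adding this edge does not create a cycle
            parent[ru] = rv
            mst.append((w, u, v))
    return mst

def mst_clusters(n, edges, k):
    """Cluster into k groups by dropping the k-1 heaviest MST edges."""
    mst = kruskal_mst(n, edges)
    kept = sorted(mst)[:len(mst) - (k - 1)]  # drop the k-1 heaviest edges
    parent = list(range(n))
    for _, u, v in kept:
        parent[find(parent, u)] = find(parent, v)
    groups = {}
    for x in range(n):
        groups.setdefault(find(parent, x), set()).add(x)
    return list(groups.values())
```

For example, on a graph made of two triangles of weight-1 edges joined by a single weight-10 bridge, `mst_clusters(6, ..., 2)` cuts the bridge and recovers the two triangles as clusters.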

There are several limitations of this algorithm. For example, it only considers the edges of the initial graph that are shared with $T$ (that is, it only considers the edges of the minimum spanning tree). It also requires the edges of the graph to be weighted. It does not require the number of clusters to be known in advance (like other hierarchical clustering algorithms), but we still need to choose the optimal number of clusters (after the algorithm has terminated). We can do that in several ways. One way is to use a threshold value $t$ to decide when to stop removing edges from $T$: more specifically, we keep removing edges from $T$ as long as the highest remaining weight exceeds $t$, and stop once no remaining edge is heavier than $t$.

There are numerous applications of this type of clustering. For example, we might want to discover certain groups of people in social networks.

The paper Graph clustering (by Satu Elisa Schaeffer, 2007) provides a readable and quite detailed overview of this field.

There are algorithms based on k-means that can also work on graphs. See e.g. Graph-based k-means Clustering: A Comparison of the Set Median versus the Generalized Median Graph (by Ferrer et al.).


I could have asked this question on https://cs.stackexchange.com/ and it would have likely been a more appropriate action (given the generality of this question), but, given that this is also related to unsupervised learning and thus ML and AI, I decided to ask it here, in order to enrich our website.

– nbro – 2019-03-20T17:02:50.553

Not to be confused with clustering multiple different graphs based on e.g. their edit distance or geometric properties, as in http://www.eurocg2019.uu.nl/papers/45.pdf

– Discrete lizard – 2019-03-21T11:57:29.080

@Discretelizard Yes, this is also noted in the paper I mentioned in my answer (maybe I should have pointed this out in my answer). – nbro – 2019-03-21T14:11:22.150

@Discretelizard Nice! (We'd welcome a second answer, hint hint:) – DukeZhou – 2019-03-21T18:54:44.330

@DukeZhou Well, I could certainly say something about clustering multiple graphs instead of the nodes within them, but that does not seem to be what this question is about. – Discrete lizard – 2019-03-21T18:57:50.180

In addition to the methods mentioned by nbro for clustering the nodes of a graph, "spectral clustering" for non-graph data conceptually proceeds by assembling a graph whose nodes are the data points. It then uses the highest-eigenvalue eigenvectors of a diffusion operator on this graph to assign nodes to clusters. For data that is already in graph form, a procedure that skips the first part might be of interest. – tsbertalan – 2019-06-23T19:38:22.333
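A minimal sketch of the spectral idea from the comment above, applied directly to a graph. The comment phrases it via the top eigenvectors of a diffusion operator; an equivalent, common formulation (used here) takes the eigenvector of the second-smallest eigenvalue of the graph Laplacian $L = D - W$ (the Fiedler vector) and splits the nodes by its sign. The function name is mine, not from any library.

```python
import numpy as np

def spectral_bisection(n, edges):
    """Split a graph in two by the sign of the Fiedler vector.

    `edges` is a list of (u, v, weight) tuples over nodes 0..n-1.
    """
    W = np.zeros((n, n))
    for u, v, w in edges:
        W[u, v] = W[v, u] = w               # symmetric weighted adjacency
    L = np.diag(W.sum(axis=1)) - W          # unnormalized graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)    # eigh sorts eigenvalues ascending
    fiedler = eigvecs[:, 1]                 # second-smallest eigenvalue's vector
    return ({i for i in range(n) if fiedler[i] < 0},
            {i for i in range(n) if fiedler[i] >= 0})
```

For more than two clusters, one would instead cluster (e.g. with k-means) the rows of the matrix formed by the first few such eigenvectors.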