Finding groups of friends in social network data


I have some social network data in which I'd like to identify people who belong to a group of 5 friends. These are users who are friends with 4 other people who are all also friends with each other.

In the graph sense, this task involves identifying components of the graph where every node has an edge to every other node in the component and where the component contains at least 5 nodes.

Is there a well-known algorithm to do this? I was looking at measures like centrality, clustering coefficients, and strongly connected components, but none of those are quite what I want.


Eric Conner

Posted 2016-04-27T18:19:33.597

Reputation: 141


This is the community detection problem. Here's a solid survey: Community detection in graphs [PDF].

– Emre – 2016-04-27T23:00:20.823



Community Detection and Clique Percolation:

This is a community detection problem. Here is a very detailed review article surveying the state of the art.

The Clique Percolation Method is also worth exploring, as it covers most of what you need here. You could also read up on matching and link the concept to the Blossom algorithm. Although that algorithm finds a maximum matching, you may be able to modify it slightly to get matchings of at least 5.
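To make the clique percolation idea concrete, here is a minimal pure-Python sketch, not a tuned implementation. It assumes the friendships are undirected and stored as a dict mapping each user to a set of friends; the helper names (`build_adjacency`, `maximal_cliques`, `clique_communities`) are made up for illustration. It first enumerates maximal cliques with Bron-Kerbosch, keeps those with at least k members, and then merges cliques that share at least k - 1 members.

```python
from itertools import combinations


def build_adjacency(edges):
    """Turn (user, user) friendship pairs into an undirected adjacency dict."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def maximal_cliques(adj):
    """Yield every maximal clique as a frozenset (Bron-Kerbosch with pivoting)."""
    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)
            return
        # Pivot on the node with the most neighbours in p to prune branches.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


def clique_communities(adj, k=5):
    """Clique percolation: merge cliques of size >= k sharing >= k - 1 nodes."""
    cliques = [c for c in maximal_cliques(adj) if len(c) >= k]
    parent = list(range(len(cliques)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) >= k - 1:
            parent[find(i)] = find(j)

    groups = {}
    for i, c in enumerate(cliques):
        groups.setdefault(find(i), set()).update(c)
    return list(groups.values())


# Toy data (made up): A-E are all mutual friends, F only knows A and B.
edges = list(combinations("ABCDE", 2)) + [("A", "F"), ("B", "F")]
adj = build_adjacency(edges)
groups_of_five = [c for c in maximal_cliques(adj) if len(c) >= 5]
communities = clique_communities(adj, k=5)
```

If you only need the "5 mutual friends" groups from the question, filtering the maximal cliques by size is enough; the percolation step matters when you want overlapping cliques merged into larger communities. Enumerating maximal cliques is exponential in the worst case, so on a large network a graph library or database is likely a better fit than this sketch.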

Syed Ali Hamza

Posted 2016-04-27T18:19:33.597

Reputation: 346

How exactly is matching relevant here? – Valentas – 2016-05-09T11:11:05.967


Not so much an 'algorithm' but Neo4j allows you to query a graph network, and this type of question features quite heavily in their demos. Here is an example of the Cypher language syntax you might have to use in your social network case:

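// Assumes each Person node has a `name` property; the property name is missing from the original snippet.
MATCH (a:Person)-[:FRIENDS_WITH]->(b:Person), (b)-[:FRIENDS_WITH]->(c:Person)
// Filter on a and c before WITH, since only the projected values stay in scope afterwards.
WHERE a.name = "Ollie" AND NOT (a)-[:FRIENDS_WITH]->(c)
WITH a.name AS Friends, count(c) AS MutualFriends
WHERE MutualFriends > 5
RETURN Friends, MutualFriends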

The code might not be perfect, but it's for show. This kind of query returns results very quickly with medium to large graphs, and especially so if you can load your data and use indexes to search by name. Depends if you want to start from scratch!

Oliver Frost

Posted 2016-04-27T18:19:33.597

Reputation: 111