How can I collect a dataset from social networks like Instagram?


My name is Reza. I am a Master's student at Golestan University of Technical and Engineering, Iran, and my M.Sc. thesis is about detecting fake accounts on social networks, so I need a dataset. How can I collect a dataset from social networks like Instagram?


Posted 2017-10-03T18:51:51.500

Reputation: 11

What have you done so far? What have you searched? How do you define a "fake account"? Have you even thought about whether you could get training data before jumping into the research for your thesis? – tagoma – 2017-10-03T19:06:40.387

I am at the data preparation step and I have read several articles about this. A fake account is a non-human account (it is a robot). – Reza – 2017-10-04T07:52:01.577



I am not familiar with Instagram, and I do not know whether there is a ready-to-use dataset for your purposes, but in general, to get data from (almost) any online source you should use a method called "web scraping" (this term might help you while you google).
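To make the idea of web scraping concrete, here is a minimal Python sketch of its parsing step. It is only an illustration under stated assumptions: the HTML in `SAMPLE_PAGE`, the class names `username`, `followers` and `following`, and the helper `parse_profile` are all hypothetical, and real sites (Instagram included) use different markup that changes often.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration only; real pages look different.
SAMPLE_PAGE = """
<html><body>
  <div class="profile">
    <span class="username">example_user</span>
    <span class="followers">120</span>
    <span class="following">4500</span>
  </div>
</body></html>
"""


class ProfileParser(HTMLParser):
    """Collect the text of <span> elements whose class is a wanted field."""

    FIELDS = {"username", "followers", "following"}

    def __init__(self):
        super().__init__()
        self.profile = {}
        self._current = None  # field name of the <span> being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in self.FIELDS:
            self._current = css_class

    def handle_data(self, data):
        # Only keep text that sits inside one of the wanted <span> elements.
        if self._current is not None:
            self.profile[self._current] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._current = None


def parse_profile(html):
    """Return a dict of the profile fields found in the given HTML string."""
    parser = ProfileParser()
    parser.feed(html)
    parser.close()
    return parser.profile


profile = parse_profile(SAMPLE_PAGE)
print(profile)
```

The download step is left out on purpose; in practice the HTML string would come from an HTTP request (for example with `urllib.request`) rather than from a constant.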

It might also help to decide which tools you are going to use for your analysis.

I have tried something simple, although it was for Twitter.

Here is a rather simple piece of R code that gets 500 tweets concerning Barbie:



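library(twitteR)  # provides setup_twitter_oauth() and searchTwitter()

# Credentials of the Twitter application
ck <- "4bKJU1Z4B60SzbPPIva7OknnQ"
cs <- "cCxAPYDN3d0APvlurEBoXMNCKl0DKgrneV0w6KQCQenNpmM7fa"
at <- "715872051559575552-qMYaugTPHJeRg9fh70ABNWCjdl0Lp6T"
as <- "zxn6lOyCw2dcPyllkiHJq4zk0wHwGQ4tuKa9IpR96TNvO"

# Authenticate against the Twitter API
setup_twitter_oauth(ck, cs, access_token = at, access_secret = as)

# Fetch the 500 most recent tweets that mention "barbie"
t_stream <- searchTwitter('barbie', resultType="recent", n=500)

# Convert the list of status objects into a single data frame
df <- do.call("rbind", lapply(t_stream, as.data.frame))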

The values of ck, cs, at and as can be obtained as explained here in step 2.

That site also has a guide on how to scrape data from Twitter using Python.

You might also find this (Python), this (Python) and this (R) useful.


Posted 2017-10-03T18:51:51.500

Reputation: 303


I'm no expert, but not only Instagram but other social networks too have an API that allows you to obtain users' activity. I would suggest collecting the activity data for a set of actual users and some bots (which, as of now, you have to find yourself). This would be enough to make a dataset. Then proceed with an algorithm of your choice. Hope this helps.
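As a rough illustration of what such a dataset could look like, here is a small Python sketch that turns hand-labelled account records into a CSV table. Everything in it is an assumption made for the example: the records in `accounts`, the field names, the helpers `to_row` and `write_dataset`, and the `follower_ratio` feature are hypothetical and are not taken from any real API response.

```python
import csv
import io

# Hypothetical, hand-labelled records: label 1 = bot, 0 = human.
accounts = [
    {"username": "alice", "followers": 350, "following": 300, "posts": 120, "label": 0},
    {"username": "promo_bot_77", "followers": 12, "following": 4800, "posts": 3, "label": 1},
]


def to_row(account):
    """Turn one raw record into a feature row; the ratio is one illustrative feature."""
    ratio = account["followers"] / max(account["following"], 1)  # avoid division by zero
    return {
        "username": account["username"],
        "followers": account["followers"],
        "following": account["following"],
        "posts": account["posts"],
        "follower_ratio": round(ratio, 4),
        "label": account["label"],
    }


def write_dataset(accounts, out):
    """Write all accounts as CSV to the file-like object `out` and return the rows."""
    rows = [to_row(a) for a in accounts]
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return rows


buffer = io.StringIO()  # stands in for a real file opened with open(..., "w", newline="")
rows = write_dataset(accounts, buffer)
print(buffer.getvalue())
```

A table in this shape, with one row per account and a label column, can be loaded directly by most classification tools, whichever algorithm is chosen afterwards.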

Santhosh Kumar M

Posted 2017-10-03T18:51:51.500

Reputation: 111