How can I detect patterns in a time series whose events arrive at random times?



My app receives messages with a random number of bits at random times. Two weeks ago, however, I started to notice some almost-regular patterns in my app's metrics. I suspect that bots are sending artificially generated data to my app. Specifically, I'm looking for consecutive runs of messages in the time series where the messages have almost the same number of bits.

I have read about some methods, but they all use data where time is not a random variable. I would appreciate any help you can provide, including books, web pages, tutorials (in Python if possible), etc.


Posted 2016-05-23T20:33:17.557

Reputation: 143

I was looking for a solution and found an example in the book Bayesian Methods for Hackers, "Inferring Behavior from Text-Message Data". Maybe what I need is to find the switchpoint in the time series, like in this question on Stack Overflow. What do you think? Is there another method?

– jocerfranquiz – 2016-05-24T21:42:05.487

Welcome to Datascience.SE! It's not so much a change detection problem as an anomaly detection problem. Here is a presentation.

– Emre – 2016-05-26T07:05:53.970



As a first step, to separate out the messages that appear to come from a bot, you could try binning by message size. For example, if messages sent by bots are likely to be around 128 to 140 bytes, assign these to their own bin.
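A minimal sketch of the binning step, assuming each message is available as a `(timestamp_seconds, size_bytes)` pair. The function name `bin_by_size`, the 16-byte bin width, and the sample data are all hypothetical and only serve to illustrate the idea:

```python
def bin_by_size(messages, bin_width=16):
    """Group message timestamps by size bin.

    messages  : iterable of (timestamp_seconds, size_bytes) pairs
    bin_width : width of each size bin in bytes (an assumed value)
    Returns a dict mapping the lower edge of each bin to the list of
    timestamps of the messages whose size falls in that bin.
    """
    bins = {}
    for ts, size in messages:
        lower_edge = (size // bin_width) * bin_width
        bins.setdefault(lower_edge, []).append(ts)
    return bins


# Made-up sample: four messages of about 128-140 bytes mixed with others.
messages = [
    (0.0, 130), (3.2, 512), (10.0, 135), (14.7, 47),
    (20.0, 129), (30.0, 138), (31.5, 900),
]

bins = bin_by_size(messages, bin_width=16)

# The most populated bin is the first candidate for bot traffic.
suspect = max(bins, key=lambda edge: len(bins[edge]))
suspect_timestamps = bins[suspect]
```

With a 16-byte width, the bin starting at 128 covers sizes 128 to 143, which matches the range in the example above. In practice the width would have to be tuned to how tightly the suspicious message sizes cluster.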

Next, build a time series from the messages in this bin, for example the number of messages per fixed time interval. Try to decompose the time series using an additive or multiplicative method such as Holt-Winters. A strong seasonal component would help you identify regular, repetitive messages that are being generated automatically.
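A full Holt-Winters fit would normally come from a library (for instance the `ExponentialSmoothing` class in statsmodels). As a simpler stand-in, and not the Holt-Winters method itself, the sketch below counts messages per fixed interval and uses the autocorrelation of those counts to check for a repeating period. The helper names, the 2-second interval, and the synthetic timestamps are assumptions made for illustration:

```python
def counts_per_interval(timestamps, interval, n_intervals):
    """Turn irregular event times into a regular series of counts."""
    counts = [0] * n_intervals
    for ts in timestamps:
        idx = int(ts // interval)
        if 0 <= idx < n_intervals:
            counts[idx] += 1
    return counts


def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return 0.0  # constant series: no structure to detect
    cov = sum((series[i] - mean) * (series[i + lag] - mean)
              for i in range(n - lag))
    return cov / var


def dominant_period(series, max_lag):
    """Return (lag, score) for the lag with the highest autocorrelation."""
    scores = {lag: autocorrelation(series, lag)
              for lag in range(1, max_lag + 1)}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Synthetic example: a hypothetical bot sending one message every 10 s.
timestamps = [10.0 * k for k in range(12)]
interval = 2.0  # seconds per count bucket (assumed)

series = counts_per_interval(timestamps, interval, n_intervals=60)
period, score = dominant_period(series, max_lag=20)
period_seconds = period * interval
```

For this synthetic series the strongest lag is 5 intervals, i.e. a 10-second period. Real traffic would be noisier, so a high autocorrelation score at some lag is a hint of automation worth inspecting rather than proof.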

Sandeep S. Sandhu

Posted 2016-05-23T20:33:17.557

Reputation: 2 087