Time series data: How do I measure the influence of new product sales on existing product sales (statistically)?



My goal is to:

  • Find out whether Product 5 (the new product) is really influencing the sales of the other products (Products 1 to 4).
  • If it is influencing the sales of the other products, find out by how much.

I am new to R and have read several related posts, but did not find an exact answer to my question. I love R and learn something new every day, which helps us make data-driven decisions.

My sample dataset is shown below (the week number and the weekly sales of Product 1 to Product 5). The new product is Product 5, launched in Week 5.

Week   Product-1   Product-2   Product-3   Product-4   Product-5
1      2           4           5           5
2      4           4           6           4
3      4           4           6           5
4      4           4           6           6           4
5      4           6           5           3           5
6      2           7           6           4           3
7      3           8           7           5           6
8      2           9           9           3           6

My questions are:

  • What is the best process or model to show the influence of Product 5 (statistically)?
  • Do I need to run cointegration tests before I run correlations? For example, some of these products may never be genuinely correlated with Product 5 (compare: cockle growth vs. growth in electricity demand).
  • How do I know correlation vs. causation in this mix?
  • Since my new product launched in Week 5, where should I start my correlations? From Week 5 or from earlier weeks?
  • Do I need to test for stationarity first, and then make the data stationary?


Posted 2016-01-22T05:10:26.040




You could build an ARIMAX model. This lets you include autoregressive (AR) terms as well as the sales of Product 5 as an exogenous input (X). If the sales of product $i$ at time $t$ are denoted $s^i_t$, a potential model for Product 1 is

$s^1_t=\alpha_1 s^1_{t-1} + \alpha_2 s^1_{t-2} + \ldots + \beta_0 s^5_t + \beta_1 s^5_{t-1} + \ldots $

Note that you may need to make the series stationary first (more on that below). You could estimate this model with the seasonal R package, which relies on the X-13ARIMA-SEATS software developed by the US Census Bureau.
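To make the structure of the equation above concrete, here is a minimal sketch in Python (standard library only). It is not the seasonal/X-13 workflow: it assumes a stripped-down ARX model with an intercept, one AR lag and the contemporaneous Product 5 term, fitted by ordinary least squares. The helper names `solve_linear` and `ols` are made up for this illustration. With only five usable weeks, the coefficients are far too noisy to interpret; the point is only how the regressors are lined up.

```python
# Toy ARX fit: s1_t = c + alpha1 * s1_{t-1} + beta0 * s5_t, by ordinary least squares.
# Assumption: plain least squares stands in for a full ARIMAX estimator.

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve_linear(xtx, xty)

# Weekly sales copied from the question's table; None marks a blank Product-5 cell.
product1 = [2, 4, 4, 4, 4, 2, 3, 2]
product5 = [None, None, None, 4, 5, 3, 6, 6]

rows, target = [], []
for t in range(1, len(product1)):
    if product5[t] is None:
        continue  # model only the weeks in which Product 5 has sales
    # Regressors: intercept, Product 1 sales one week earlier, Product 5 sales this week.
    rows.append([1.0, product1[t - 1], product5[t]])
    target.append(product1[t])

coefficients = ols(rows, target)
intercept, alpha1, beta0 = coefficients
fitted = [sum(c * v for c, v in zip(coefficients, r)) for r in rows]
residuals = [y - f for y, f in zip(target, fitted)]
print(f"intercept={intercept:.3f} alpha1={alpha1:.3f} beta0={beta0:.3f}")
```

In this sketch `beta0` plays the role of $\beta_0$ in the equation above: the estimated change in Product 1 sales per unit of Product 5 sales in the same week, holding last week's Product 1 sales fixed.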

I would recommend ensuring that all your time series are stationary before you use X-13 (see, for example, this post). I would also run cointegration tests; this excellent post gives more explanation.
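A common way to move a trending series toward stationarity is first differencing. The sketch below (Python, standard library only) shows just the transformation, applied to the Product-2 column from the question. The function name `difference` is an assumption for this example, and whether the differenced series is actually stationary would still need a formal test such as the augmented Dickey-Fuller test, which is not shown here.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: x_t - x_{t-lag}."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Product-2 weekly sales from the question's table (weeks 1 to 8).
product2 = [4, 4, 4, 4, 6, 7, 8, 9]

diff2 = difference(product2)
print(diff2)  # [0, 0, 0, 2, 1, 1, 1]
```

Differencing costs one observation per lag, which matters with a series this short.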

Since you only have Product 5 data from week 5 onward, I would start modeling in week 5, but you could include autoregressive (AR) terms based on the sales of Product 1 prior to week 5.


Posted 2016-01-22T05:10:26.040



  • How do I know correlation vs. causation in this mix?

Finding the causal effect of one variable on another is difficult, because there are probably a few hidden variables driving the sales of both Product 5 and all the others. For example, the weather could have improved, increasing sales of Product 5 and of the other products. That would make them correlated without any causal relationship between them.

One way of trying to remove this bias when estimating a causal effect is to use an instrumental variable: https://en.wikipedia.org/wiki/Instrumental_variable.
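As a rough illustration of the idea, the sketch below (Python, standard library only) simulates data in which a hidden confounder drives both `x` and `y`, then compares the ordinary regression slope with the simplest instrumental-variable estimate, cov(z, y) / cov(z, x). Everything here is an assumption made for the demonstration: the data are simulated, not sales figures, and the variable names and the true effect of 2.0 are chosen arbitrarily.

```python
import random

def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000
true_effect = 2.0       # assumed causal effect of x on y in this simulation

z, x, y = [], [], []
for _ in range(n):
    instrument = rng.gauss(0, 1)  # moves x, but affects y only through x
    confounder = rng.gauss(0, 1)  # hidden driver of both x and y (e.g. weather)
    xi = instrument + confounder + rng.gauss(0, 1)
    yi = true_effect * xi + 3.0 * confounder + rng.gauss(0, 1)
    z.append(instrument)
    x.append(xi)
    y.append(yi)

# The naive slope absorbs the confounder; the IV ratio uses only the
# variation in x that comes from the instrument.
ols_slope = covariance(x, y) / covariance(x, x)
iv_slope = covariance(z, y) / covariance(z, x)
print(f"naive slope={ols_slope:.2f}  IV slope={iv_slope:.2f}  true effect={true_effect}")
```

In the sales setting, the hard part is finding a real instrument: something that shifts Product 5 sales but has no other route to the sales of Products 1 to 4.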


Posted 2016-01-22T05:10:26.040
