Generate timeseries data



Model training suffers when there is not enough training data. Techniques like SMOTE or ADASYN can be used for oversampling. For image data, we can blur or rotate an image to generate more samples from the same source.

My question is: how do you generate fake time series data?

vipin bansal

Posted 2019-05-26T01:58:15.280

Reputation: 1 322



At the moment I know of TSimulus and TimeSynth for generating data programmatically in a controlled manner (instead of generating random data).

TSimulus lets you generate data via various generators.

TimeSynth is capable of generating the following signal types

Harmonic functions (sin, cos or custom functions)
Gaussian processes with different kernels
    Squared exponential
    Rational quadratic
Pseudoperiodic signals
Autoregressive(p) process
Continuous autoregressive process (CAR)
Nonlinear Autoregressive Moving Average model (NARMA)

and noise types

White noise
Red noise
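
To make a couple of the signal and noise types listed above concrete, here is a minimal sketch in plain NumPy. It does not use the TSimulus or TimeSynth APIs; the function names, coefficients and noise levels are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the run is repeatable

def harmonic_signal(t, amplitude=1.0, frequency=0.25):
    """Sinusoid evaluated at the time stamps in t (hypothetical helper)."""
    return amplitude * np.sin(2.0 * np.pi * frequency * t)

def ar_process(n, coeffs, sigma=1.0):
    """AR(p) series: x[t] = sum_k coeffs[k] * x[t-1-k] + white noise."""
    p = len(coeffs)
    x = np.zeros(n + p)  # p leading zeros serve as start-up values
    noise = rng.normal(0.0, sigma, size=n + p)
    for t in range(p, n + p):
        # x[t-p:t][::-1] is (x[t-1], x[t-2], ..., x[t-p])
        x[t] = np.dot(coeffs, x[t - p:t][::-1]) + noise[t]
    return x[p:]

t = np.linspace(0.0, 20.0, 500)
# Harmonic signal plus white (Gaussian) noise
noisy_sine = harmonic_signal(t) + rng.normal(0.0, 0.3, size=t.size)
# AR(2) process; these coefficients satisfy the stationarity conditions
ar2 = ar_process(500, coeffs=[0.6, -0.3], sigma=0.5)
```

Because every parameter is explicit, the generated series are controlled rather than purely random: you decide the frequency, the noise level and the autoregressive structure.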

If you are looking for a graphical way to generate data, TimeSeriesMaker is the only tool I know of that can do this.


Posted 2019-05-26T01:58:15.280

Reputation: 193


If you know Python, use Faker.


Posted 2019-05-26T01:58:15.280

Reputation: 1 285

Thanks for the reply. I explored the documentation, but could not find any info about time series data. Could you please provide me more info on that? – vipin bansal – 2019-05-27T12:56:40.737

For time series, you can do this easily with numpy and its random module. – fuwiak – 2019-05-30T19:56:10.073

My bad. By fake data I mean oversampling, that is, data derived from the actual data. For example, by tweaking the "Euclidean Distance" of various features, we can generate fake data that is close to the actual data. Similarly, for images, blurring, rotating and scaling generate data that is again based on the actual data. In the same way, I want to generate time series data. Using a random method will generate data unrelated to the actual data, which I don't want. – vipin bansal – 2019-05-31T06:04:39.323
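
One way to read the request in the comment above is as time series augmentation: creating new series that stay close to an observed one, much like blurring or rotating an image. The sketch below shows three common ideas (jittering, scaling and a moving block bootstrap). Nobody in this thread proposed these; the helper names, the parameter values and the stand-in `real` series are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def jitter(series, sigma=0.05):
    """Add small Gaussian noise, scaled by the series' own standard deviation."""
    return series + rng.normal(0.0, sigma * series.std(), size=series.shape)

def scale(series, sigma=0.1):
    """Multiply the whole series by one random factor close to 1."""
    return series * rng.normal(1.0, sigma)

def moving_block_bootstrap(series, block_len=20):
    """Resample contiguous blocks so short-range dependence is kept."""
    n = len(series)
    n_blocks = -(-n // block_len)  # ceiling division
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    blocks = [series[s:s + block_len] for s in starts]
    return np.concatenate(blocks)[:n]

# Stand-in for the "actual data": a noisy seasonal pattern (hypothetical).
t = np.arange(200)
real = np.sin(2 * np.pi * t / 50) + rng.normal(0.0, 0.1, size=t.size)

augmented = [jitter(real), scale(real), moving_block_bootstrap(real)]
```

Jittering and scaling keep the shape of the original series almost intact. The block bootstrap preserves dependence only within each block, so it can break seasonal phase at the block boundaries; whether that is acceptable depends on the data.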


You may apply Wolfram Language to your project. There is a free Wolfram Engine for developers and if you are developing in Python then with the Wolfram Client Library for Python you can use these functions in Python.

A good place to start is the Time Series Processing guide or the Random Processes guide, both of which contain a link to the Time Series Processes guide.

Use RandomFunction with any of the processes you wish to simulate. For example, with an MAProcess

res1 = RandomFunction[MAProcess[2, {.3, -.5}, 1], {0, 100}]


The returned TemporalData object is understood throughout the language, so further processing or analysis can continue from here. For example, visualise the result with ListPlot

ListPlot[res1, Filling -> Axis]


The values can be viewed directly with the "Values" property, using Short to limit the output.

res1["Values"] // Short
{2.00335, 1.15942, <<98>>, 2.7685}
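
For readers who do not have the Wolfram Engine at hand, the same kind of moving-average process can be sketched in NumPy. This assumes the arguments of MAProcess[2, {.3, -.5}, 1] mean a constant of 2, MA coefficients 0.3 and -0.5, and a noise variance of 1; `simulate_ma` is a hypothetical helper, and the numbers it produces will not match the output above because the random draws differ.

```python
import numpy as np

def simulate_ma(const, ma_coeffs, variance, n, rng):
    """Simulate x[t] = const + e[t] + b1*e[t-1] + ... + bq*e[t-q]."""
    q = len(ma_coeffs)
    # q extra shocks so that the first output already has q past shocks
    e = rng.normal(0.0, np.sqrt(variance), size=n + q)
    weights = np.concatenate(([1.0], ma_coeffs))  # (1, b1, ..., bq)
    # 'valid' convolution yields exactly n values
    return const + np.convolve(e, weights, mode="valid")

rng = np.random.default_rng(seed=2)
# 101 points, matching the time range 0..100 used above
values = simulate_ma(2.0, [0.3, -0.5], 1.0, n=101, rng=rng)
```

The series fluctuates around the constant 2, and its theoretical variance is 1 + 0.3^2 + 0.5^2 = 1.34 under the assumptions stated above.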

Multiple paths can be generated as well. For example, generate two paths with ARMAProcess

res2 = RandomFunction[ARMAProcess[0.1, {0.8, 0.1}, {-0.4}, 0.1], {0, 100}, 2]
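
A rough Python counterpart for generating several ARMA paths at once is sketched below. It assumes the ARMAProcess arguments are, in order, a constant, the AR coefficients, the MA coefficients and the noise variance. The `simulate_arma` helper and its burn-in period are assumptions of this sketch; the Wolfram Language may initialise the process differently.

```python
import numpy as np

def simulate_arma(const, ar, ma, variance, n, n_paths, rng, burn_in=200):
    """Simulate n_paths independent ARMA(p, q) paths of length n."""
    ar, ma = np.asarray(ar, dtype=float), np.asarray(ma, dtype=float)
    p, q = len(ar), len(ma)
    pad = max(p, q)                  # zeros used as start-up values
    total = n + burn_in + pad
    e = rng.normal(0.0, np.sqrt(variance), size=(n_paths, total))
    x = np.zeros((n_paths, total))
    for t in range(pad, total):
        ar_part = x[:, t - p:t][:, ::-1] @ ar  # a1*x[t-1] + ... + ap*x[t-p]
        ma_part = e[:, t - q:t][:, ::-1] @ ma  # b1*e[t-1] + ... + bq*e[t-q]
        x[:, t] = const + ar_part + ma_part + e[:, t]
    return x[:, -n:]                 # drop start-up values and burn-in

rng = np.random.default_rng(seed=3)
paths = simulate_arma(0.1, [0.8, 0.1], [-0.4], 0.1, n=101, n_paths=2, rng=rng)
```

Each row of `paths` is one simulated path. The burn-in discards the early values, which still depend on the arbitrary zero start.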


Visualise with ListLinePlot


The "Paths" property returns the paths as a list.

res2["Paths"] // Short
{{{0,0.275976}, <<99>>, {100, 0.766177}}, {<<1>>}}

The "Components" property returns a TemporalData object for each path.



ListLinePlot /@ res2["Components"]


Hope this helps.


Posted 2019-05-26T01:58:15.280

Reputation: 625


If your data is financial time series (or similar to it) and your language of choice is Octave/Matlab, then you might be interested in a recent blog post of mine. In this post there are short descriptions of 4 different ways of creating synthetic time series data with links to other posts which contain both code and more detailed descriptions of the methods.

The blog post is


Posted 2019-05-26T01:58:15.280

Reputation: 214


If you want to train a model with simulated time series data, you first need to determine the characteristics of your underlying data (your "sample"). That means checking, or making assumptions about, its mean and variance. Remember to use the returns (relative changes), NOT the actual LEVELS of the time series!

Why? Because a time series in levels has a "memory" effect: you can get sizeable correlations even when comparing two series that come from independent random processes, for example two independent random walks. In statistics this mistake is known as "spurious regression".
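
The effect is easy to reproduce. The sketch below builds pairs of independent random walks and compares the correlation of the levels with the correlation of the step-to-step changes. The number of trials and the series length are arbitrary choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=3)
n_trials, n_steps = 500, 250

level_corr = np.empty(n_trials)
return_corr = np.empty(n_trials)
for i in range(n_trials):
    steps_a = rng.normal(size=n_steps)
    steps_b = rng.normal(size=n_steps)  # drawn independently of steps_a
    walk_a, walk_b = np.cumsum(steps_a), np.cumsum(steps_b)  # the "levels"
    level_corr[i] = np.corrcoef(walk_a, walk_b)[0, 1]
    return_corr[i] = np.corrcoef(steps_a, steps_b)[0, 1]

mean_abs_level = np.abs(level_corr).mean()
mean_abs_return = np.abs(return_corr).mean()
print(f"mean |corr| of levels:  {mean_abs_level:.2f}")
print(f"mean |corr| of changes: {mean_abs_return:.2f}")
```

The average absolute correlation of the levels should come out far larger than that of the changes, even though the two series in each pair are unrelated by construction.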

Check for more details/examples.

However as mentioned before you can use:

  • (G)ARCH / ARMA -- Python implementation here
    • Maybe check first if your data follows such a process


  • Simulate data using Monte Carlo Simulation -- Python implementation here
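
Since the linked implementations are not reproduced here, the following is only a minimal sketch of the two options, under stated assumptions. The Monte Carlo part treats log returns as independent normal draws whose mean and standard deviation are estimated from the sample. The GARCH(1,1) part uses placeholder parameters (`omega`, `alpha`, `beta`) that are not fitted to anything, and the `observed` price series is made up.

```python
import numpy as np

rng = np.random.default_rng(seed=4)

def monte_carlo_paths(prices, n_steps, n_paths):
    """Simulate price paths from i.i.d. normal log returns fitted to `prices`."""
    log_returns = np.diff(np.log(prices))  # work on returns, not levels
    mu, sigma = log_returns.mean(), log_returns.std(ddof=1)
    sim_returns = rng.normal(mu, sigma, size=(n_paths, n_steps))
    # Cumulate the returns and start every path from the last observed price.
    return prices[-1] * np.exp(np.cumsum(sim_returns, axis=1))

def garch11_returns(n, omega=1e-5, alpha=0.1, beta=0.85):
    """Simulate returns whose conditional variance follows GARCH(1,1)."""
    returns = np.empty(n)
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    for t in range(n):
        returns[t] = np.sqrt(var) * rng.normal()
        var = omega + alpha * returns[t] ** 2 + beta * var
    return returns

# Hypothetical observed prices, used only to have something to fit.
observed = 100.0 * np.exp(np.cumsum(rng.normal(0.0005, 0.01, size=500)))
paths = monte_carlo_paths(observed, n_steps=250, n_paths=1000)
garch = garch11_returns(1000)
```

The i.i.d. normal assumption ignores volatility clustering, which is what the GARCH variant adds. In practice you would first check whether your data plausibly follows such a process and fit the parameters with a dedicated library.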


Posted 2019-05-26T01:58:15.280

Reputation: 508


@vipin bansal You can also try out TimeSynth. It allows you to create both signals and noise, e.g. Gaussian noise and sinusoidal signals.


Posted 2019-05-26T01:58:15.280

Reputation: 1