## Trying to use TensorFlow to predict financial time series data

I'm new to ML and TensorFlow (I started a few hours ago), and I'm trying to use it to predict the next few data points in a time series. I'm taking my input and doing this with it:

/------------ x ------------\
.-------------------------------.
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
'-------------------------------'
    \------------ y ------------/


What I thought I was doing is using x as the input data and y as the desired output for that input, so that given 0-6 I could get 1-7 (the 7 in particular). However, when I run my graph with x as the input, what I get is a prediction that looks more like x than y.
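In other words, the pairing I intend is a one-step shift, something like this NumPy sketch using the eight points from the diagram:

```python
import numpy as np

series = np.arange(8)   # the points 0..7 from the diagram above
x = series[:-1]         # points 0-6: the input window
y = series[1:]          # points 1-7: the target, shifted one step ahead
print(x)  # [0 1 2 3 4 5 6]
print(y)  # [1 2 3 4 5 6 7]
```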

Here's the code (based on this post and this post):

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plot
import pandas as pd
import csv

print("Opening CSV file")
with open(filename) as csvfile:
    reader = csv.reader(csvfile)
    data_points = [[[float(p)] for p in row] for row in reader]

flatten = lambda l: [item for sublist in l for item in sublist]

prediction_size = 10
num_test_rows = 1
num_data_rows = len(data_points) - num_test_rows
row_size = len(data_points[0]) - prediction_size

# Training data
data_rows = data_points[:-num_test_rows]
x_data_points = np.array([row[:-prediction_size] for row in data_rows]).reshape([-1, row_size, 1])
y_data_points = np.array([row[prediction_size:] for row in data_rows]).reshape([-1, row_size, 1])

# Test data
test_rows = data_points[-num_test_rows:]
x_test_points = np.array([test_rows[0][:-prediction_size]]).reshape([-1, row_size, 1])
y_test_points = np.array([test_rows[0][prediction_size:]]).reshape([-1, row_size, 1])

tf.reset_default_graph()

num_hidden = 100

x = tf.placeholder(tf.float32, [None, row_size, 1])
y = tf.placeholder(tf.float32, [None, row_size, 1])

basic_cell = tf.contrib.rnn.BasicRNNCell(num_units=num_hidden, activation=tf.nn.relu)
rnn_outputs, _ = tf.nn.dynamic_rnn(basic_cell, x, dtype=tf.float32)

learning_rate = 0.001

stacked_rnn_outputs = tf.reshape(rnn_outputs, [-1, num_hidden])
stacked_outputs = tf.layers.dense(stacked_rnn_outputs, 1)
outputs = tf.reshape(stacked_outputs, [-1, row_size, 1])

loss = tf.reduce_sum(tf.square(outputs - y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(loss)

init = tf.global_variables_initializer()

iterations = 1000

with tf.Session() as sess:
    init.run()
    for ep in range(iterations):
        sess.run(training_op, feed_dict={x: x_data_points, y: y_data_points})
        if ep % 100 == 0:
            mse = loss.eval(feed_dict={x: x_data_points, y: y_data_points})
            print(ep, "\tMSE:", mse)

    y_pred = sess.run(stacked_outputs, feed_dict={x: x_test_points})

plot.rcParams["figure.figsize"] = (20, 10)

plot.title("Actual vs Predicted")
plot.plot(pd.Series(np.ravel(x_test_points)), 'g:', markersize=2, label="X")
plot.plot(pd.Series(np.ravel(y_test_points)), 'b--', markersize=2, label="Y")
plot.plot(pd.Series(np.ravel(y_pred)), 'r-', markersize=2, label="Predicted")
plot.legend(loc='upper left')
plot.xlabel("Time periods")
plot.tick_params(
    axis='y',
    which='both',
    left='off',
    right='off',
    labelleft='off')
plot.show()


The result shown in the graph below is a prediction that follows x, rather than being shifted to the left (and including the predicted points on the right) as it should be to resemble y. Obviously the desire is for the red line to be as close to the blue one as possible.

I have no idea what I'm doing with all this, so please ELI5.

Oh, also, my data points are fairly small numbers (on the order of 0.0001). If I don't multiply them by, say, 1000000, the results are so small that the red line is almost flat at the bottom of the chart. Why? I'm guessing it's because of the squaring in the loss function. Should data be normalized before use, and if so, to what? 0-1? If I use:

normalized_points = [(p - min_point) / (max_point - min_point) for p in data_points]


my prediction fluctuates more wildly as it progresses:
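For reference, here is the full min-max scaling I mean, with the inverse transform to map predictions back to the original scale (a sketch with made-up numbers):

```python
import numpy as np

points = np.array([0.0001, 0.0003, 0.0002, 0.0005])  # made-up small values
lo, hi = points.min(), points.max()
normalized = (points - lo) / (hi - lo)    # scaled into [0, 1]
restored = normalized * (hi - lo) + lo    # inverse transform for predictions
print(normalized)  # [0.   0.5  0.25 1.  ]
```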

Edit: I'm being dumb and only giving it one example to learn from, not 500, aren't I? So I should be giving it multiple 500-point samples, right?
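If that's the issue, something like this sliding-window slicer (a hypothetical helper, not from my code above) would turn one long series into many training pairs:

```python
import numpy as np

def make_windows(series, window, horizon):
    """Slice one long series into many (input, target) training pairs."""
    xs, ys = [], []
    for i in range(len(series) - window - horizon + 1):
        xs.append(series[i:i + window])                      # input window
        ys.append(series[i + horizon:i + window + horizon])  # shifted target
    return np.array(xs), np.array(ys)

series = np.arange(20, dtype=float)
x, y = make_windows(series, window=5, horizon=1)
print(x.shape)  # (15, 5)
```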

I have the same problem - namely that the output of the RNN follows the input (X) and not the target (Y). Strangely when the input to the same RNN is a simple sine series it learns correctly, i.e. predicts the Y. – Ryszard Cetnarski – 2018-06-15T13:38:46.157

OK, let's go through this part by part. There are quite a few places here where you are not taking the bias of your network into consideration.

# Choosing your inputs and output

If the input vector contains points 0-6, there is really no need to also output points 1-7: points 1-6 are already known, and the additional outputs only add complexity to your model. Unless you have substantial amounts of data, you want to keep your model as simple as possible in order to get good performance. Thus, I would output a single neuron with a continuous value. You can use RMSE as your loss function with a regression output from your neural network.
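As a sketch of the idea (NumPy, with hypothetical values): predict one continuous value per sample and score it with RMSE:

```python
import numpy as np

# one regression output per sample instead of a whole shifted window
predictions = np.array([0.5, 1.0, 1.5])  # hypothetical single-value outputs
targets     = np.array([0.6, 0.9, 1.5])  # the true "point 7" for each sample
rmse = np.sqrt(np.mean((predictions - targets) ** 2))
print(round(rmse, 4))  # 0.0816
```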

Additionally, you should supplement the samples you put into your input space with some additional information that you might think would contain information about the trend line. For example, if I had 2 different products, bitcoin and gold, and their input vector was the same, I might expect the gold to have very little fluctuation but the bitcoin to have very high fluctuation.

Your input features to your network contain all the information from which your network will learn. Thus, you want to make sure that you are supplying sufficient information to have a meaningful prediction.

# Deep learning is data hungry

You will need approximately 100,000+ instances. Each instance is a set of features. These should be drawn independently and be identically distributed. In other words, you want to gather multiple trend lines from a varied source of data that resembles what you will use your network on, and then randomly draw points 0-6 (your features) and point 7 (your label).
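A sketch of what that sampling might look like, with hypothetical random-walk series standing in for real trend lines:

```python
import numpy as np

rng = np.random.default_rng(0)
# several independent series standing in for varied data sources
pool = [np.cumsum(rng.normal(size=200)) for _ in range(10)]

def draw_instance(pool, window=8):
    """Draw one (features, label) instance: points 0-6 and point 7."""
    series = pool[rng.integers(len(pool))]    # pick a series at random
    start = rng.integers(len(series) - window + 1)
    chunk = series[start:start + window]
    return chunk[:-1], chunk[-1]              # features, label

features, label = draw_instance(pool)
print(features.shape)  # (7,)
```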

Consider the data distribution you are trying to learn. If you want your network to classify cats/dogs, you need to give a wide range of different looking cats and dogs such that the network can identify the variance which exists in both of these classes. If you restrict the data source too much it will have high bias and will not generalize to novel data that you will later feed into it.

Try these things and let us know what happens.

Perhaps the prediction being the same as the input reflects that your network is under-trained. The so-called persistence model for time-series prediction is often used as a baseline for other models. The persistence model uses the last observation as the prediction. It is simple and often yields reasonable accuracy. My guess is that your network starts by learning the persistence model, and only if you train it more (and a better model is actually possible) will it learn one - but this requires a lot of training.
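As a quick baseline check (a sketch with made-up numbers), the persistence model is just a one-step shift of the series:

```python
import numpy as np

series = np.array([1.0, 2.0, 4.0, 3.0, 5.0])  # made-up observations
predictions = series[:-1]   # persistence: predict the previous observation
targets     = series[1:]
mse = np.mean((predictions - targets) ** 2)
print(mse)  # 2.5
```

Any trained model should beat this score before its predictions are worth anything.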