RNN time-series prediction with both numeric and non-numeric features?



The question "RNN's with multiple features" is ambiguous and does not explicitly differentiate between kinds of features. I want to understand how to use an RNN to predict a time series with multiple features, some of which are non-numeric. Since an RNN is a deep learning model, I assume I don't need to quantify the non-numeric elements.

Suppose I have sales data such as:

Sales | Weather | Holiday | Temperature
100   | Windy   | Yes     | 3
2000  | Sunny   | Yes     | 20
200   | Sunny   | No      | 30
-5    | Stormy  | No      | 3
-50   | Cold    | No      | -50
500   | Cold    | Yes     | -20

where I want to predict the Sales column from the other columns. I have found demos of using an RNN with numeric data, but they do not enrich the data with non-numeric features.


How can I predict a time series from multiple features that include both numeric and non-numeric data?

Other interesting questions related to RNN

  1. Multivariate time-series RNN (a question on SO with numeric data)

  2. Time series prediction using ARIMA vs LSTM


Posted 2017-08-17T13:23:01.120




The categorical variables do need a numeric representation before they can be fed to an RNN. Possible ways to expose them as part of a time-step include producing one-hot tensors, hash-table representations, or embedding-layer outputs from the categorical fields, then concatenating all of the feature tensors at each time-step and feeding the result to your RNN.
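
As a rough sketch of the one-hot option, assuming the toy sales table from the question and plain NumPy rather than any particular framework, the per-time-step feature vectors could be built as below. The helper name `one_hot` and the standardisation of the temperature column are illustrative choices, not something prescribed by this answer.

```python
import numpy as np

# Columns copied from the sales table in the question (one row per time-step).
weather = ["Windy", "Sunny", "Sunny", "Stormy", "Cold", "Cold"]
holiday = ["Yes", "Yes", "No", "No", "No", "Yes"]
temperature = np.array([3, 20, 30, 3, -50, -20], dtype=np.float32)


def one_hot(values):
    """Hypothetical helper: map each distinct string to a one-hot row."""
    vocab = sorted(set(values))
    index = {v: i for i, v in enumerate(vocab)}
    ids = np.array([index[v] for v in values])
    return np.eye(len(vocab), dtype=np.float32)[ids], vocab


weather_oh, weather_vocab = one_hot(weather)   # shape (6, 4)
holiday_oh, holiday_vocab = one_hot(holiday)   # shape (6, 2)

# Numeric features are usually rescaled; standardisation is an assumption here.
temp_scaled = (temperature - temperature.mean()) / temperature.std()

# Concatenate all feature tensors at each time-step: shape (6, 4 + 2 + 1).
features = np.concatenate(
    [weather_oh, holiday_oh, temp_scaled[:, None]], axis=1
)
```

Adding a leading batch axis (`features[np.newaxis]`, shape `(1, 6, 7)`) gives the batch, time, features layout that most RNN layers expect. An embedding layer would replace the one-hot rows with learned dense vectors looked up by the same integer ids.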

If, instead of treating the fields as discrete categories, you want to interpret the strings in them as sequences themselves, you could encode them numerically at the word or character level and pass them through a subsidiary RNN to produce an internal representation at each time-step. This would be appropriate only if the strings in those fields vary a great deal.
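
The subsidiary-RNN idea can be sketched as follows, again in plain NumPy and under loud assumptions: the example strings, the hidden size, and the randomly initialised (untrained) weights are all hypothetical, and in practice the weights would be learned jointly with the main RNN.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical free-text field values, one per time-step.
texts = ["Windy", "Sunny", "Stormy", "Cold"]

# Character-level vocabulary built from the strings themselves.
chars = sorted(set("".join(texts)))
char_to_id = {c: i for i, c in enumerate(chars)}
hidden_size = 8

# Untrained weights of the subsidiary vanilla RNN (illustration only).
W_xh = rng.normal(scale=0.3, size=(hidden_size, len(chars)))
W_hh = rng.normal(scale=0.3, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)


def encode_text(text):
    """Run the RNN over the characters and return the final hidden state."""
    h = np.zeros(hidden_size)
    for ch in text:
        x = np.zeros(len(chars))
        x[char_to_id[ch]] = 1.0          # one-hot encode the character
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
    return h


# One fixed-size vector per time-step, shape (4, 8), ready to be
# concatenated with the other features of that time-step.
text_vectors = np.stack([encode_text(t) for t in texts])
```

Each string, whatever its length, ends up as a vector of size `hidden_size`, which is what allows it to sit alongside the numeric features in the main model's input.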

Apologies for the TensorFlow links if you're not using that framework.

Thomas Cleberg




This thread might interest you: Adding Features To Time Series Model LSTM.

You basically have three options. Take, for example, weather data from two different cities, Paris and San Francisco. You want to predict the next temperature from historical data, but you also expect the weather to depend on the city. You can either:

  • Combine the auxiliary features with the time series data, at the beginning or at the end (ugly!).
  • Concatenate the auxiliary features with the output of the RNN layer. It's some kind of post-RNN adjustment since the RNN layer won't see this auxiliary info.
  • Or initialize the RNN states with a learned representation of the condition (e.g. Paris or San Francisco).
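
The three options can be sketched side by side with a tiny NumPy RNN. This is only an illustration under stated assumptions: the weights are random and untrained, all names (`run_rnn`, `W_cond`, and so on) are hypothetical, and the first option is read here as concatenating the city vector onto every time-step, which is one possible interpretation.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_feat, n_cities, hidden = 5, 1, 2, 4

series = rng.normal(size=(T, n_feat))   # stand-in for past temperatures
city = np.eye(n_cities)[0]              # one-hot condition, index 0 = "Paris"


def run_rnn(inputs, h0, W_xh, W_hh):
    """Vanilla RNN: return the hidden state after the last time-step."""
    h = h0
    for x in inputs:
        h = np.tanh(W_xh @ x + W_hh @ h)
    return h


W_hh = rng.normal(scale=0.5, size=(hidden, hidden))
W_xh = rng.normal(scale=0.5, size=(hidden, n_feat))

# Option 1: combine the auxiliary features with the time-series input itself.
tiled = np.concatenate([series, np.tile(city, (T, 1))], axis=1)
W_xh_aux = rng.normal(scale=0.5, size=(hidden, n_feat + n_cities))
h_opt1 = run_rnn(tiled, np.zeros(hidden), W_xh_aux, W_hh)

# Option 2: the RNN never sees the city; concatenate it with the RNN output
# and let a later dense layer make the adjustment.
h_plain = run_rnn(series, np.zeros(hidden), W_xh, W_hh)
head_input_opt2 = np.concatenate([h_plain, city])

# Option 3: initialise the RNN state with a (here untrained) projection of
# the condition instead of zeros.
W_cond = rng.normal(scale=0.5, size=(hidden, n_cities))
h0 = np.tanh(W_cond @ city)
h_opt3 = run_rnn(series, h0, W_xh, W_hh)
```

In the third option the time-series input keeps its original width, and the city only influences the starting state, which is what distinguishes it from the first two.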

I wrote a library to condition on auxiliary inputs. It abstracts all the complexity and has been designed to be as user-friendly as possible. The implementation is in TensorFlow (>=1.13.1).

Hope it helps!

Philippe Remy

