I have a time-series dataset that records some participants' daily features from wearable sensors and their daily mood status.

The goal is to use one day's features to predict each participant's mood status on the next day, using machine learning models such as linear regression.
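To make the setup concrete, here is a minimal sketch of how I pair each day's features with the next day's mood. The record layout (participant ID, date, feature list, mood score), the function name `make_next_day_pairs`, and the sample values are all made up for illustration; my real data has more features.

```python
from datetime import date, timedelta


def make_next_day_pairs(records):
    """Pair each day's features with the same participant's mood on the following day.

    `records` is a hypothetical list of (participant_id, day, features, mood) tuples.
    Days whose following day has no record are skipped rather than paired
    with a later, non-adjacent day.
    """
    mood_by_key = {(pid, day): mood for pid, day, _, mood in records}
    pairs = []
    for pid, day, features, _ in sorted(records, key=lambda r: (r[0], r[1])):
        next_day = day + timedelta(days=1)
        if (pid, next_day) in mood_by_key:
            pairs.append((pid, day, features, mood_by_key[(pid, next_day)]))
    return pairs


# Made-up sample: features are [step count, hours of sleep], mood is a 1-5 score.
records = [
    ("p1", date(2019, 6, 1), [7200, 6.5], 3),
    ("p1", date(2019, 6, 2), [8100, 7.0], 4),
    ("p1", date(2019, 6, 4), [5000, 5.5], 2),  # gap: no record for June 3
    ("p2", date(2019, 6, 1), [3000, 8.0], 5),
    ("p2", date(2019, 6, 2), [4000, 7.5], 4),
]
pairs = make_next_day_pairs(records)
```

Each resulting pair is one training example: a single feature vector and a single label, which is why I think of the model as non-sequential.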

I think cross-validation would be a good way to evaluate performance. However, is it fine to shuffle the data randomly before splitting?

Someone told me that because I am using a time-series dataset for a prediction task, shuffling the data randomly will mix up future and past, which makes my model meaningless. However, I think I can still shuffle the dataset randomly, because the model is not a time-series model: each training example consists of one day's features and exactly one label value, rather than a series of labels.
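The two evaluation strategies I am weighing can be sketched with scikit-learn as follows. This is only an illustration under assumptions: the data is synthetic, the rows are assumed to be one participant's days in chronological order, and `trains_on_future` is a helper name I made up. `KFold`, `TimeSeriesSplit`, and `cross_val_score` are existing scikit-learn utilities.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, TimeSeriesSplit, cross_val_score

# Synthetic stand-in for one participant: row i is day i, in chronological order.
rng = np.random.default_rng(0)
n_days = 60
X = rng.normal(size=(n_days, 3))
y = X @ np.array([0.5, -0.2, 0.1]) + rng.normal(scale=0.1, size=n_days)


def trains_on_future(cv, X):
    """For each fold, report whether any training row comes after the earliest test row."""
    return [bool(train.max() > test.min()) for train, test in cv.split(X)]


shuffled_cv = KFold(n_splits=5, shuffle=True, random_state=0)  # random shuffling
ordered_cv = TimeSeriesSplit(n_splits=5)  # always trains on the past, tests on the future

shuffled_leaks = trains_on_future(shuffled_cv, X)
ordered_leaks = trains_on_future(ordered_cv, X)

model = LinearRegression()
shuffled_scores = cross_val_score(
    model, X, y, cv=shuffled_cv, scoring="neg_mean_absolute_error"
)
ordered_scores = cross_val_score(
    model, X, y, cv=ordered_cv, scoring="neg_mean_absolute_error"
)
```

With the shuffled splitter, folds are trained on days that come after some of their test days; with `TimeSeriesSplit` they never are. The synthetic rows here are independent, so the two sets of scores should not be expected to differ meaningfully; the question is whether they would on real, autocorrelated mood data.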

What method will you use exactly? – Peter – 2019-06-21T18:15:05.103

Thanks! I am currently using a linear regression model and a multitask linear regression model. – Han – 2019-06-22T17:12:42.760