I am preparing data for a machine learning model. I want to treat time series data as an ordinary supervised learning prediction problem. Let's say I have car speed data for several car models, such as:

```
+-----+---------+------------+
| day | Model   | Speed      |
+-----+---------+------------+
| 1   | Bentley | 20.47 km/h |
| 2   | Bentley | 32.22 km/h |
| 3   | Bentley | 23.11 km/h |
| 1   | BMW     | 37.60 km/h |
| 2   | BMW     | 27.90 km/h |
| 3   | BMW     | 40.47 km/h |
+-----+---------+------------+
```

So I want to train on several car models together, so that the trained model can predict the speed for both Bentley and BMW.

I have converted the data for training like this:

```
+---------+------------+------------+----------------+
| Model   | day_1      | day_2      | label == day_3 |
+---------+------------+------------+----------------+
| Bentley | 20.47 km/h | 32.22 km/h | 23.11 km/h     |
| BMW     | 37.60 km/h | 27.90 km/h | 40.47 km/h     |
+---------+------------+------------+----------------+
```

Is it a correct approach?
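For reference, the reshaping I describe above can be sketched in plain Python roughly as follows. This is only a minimal sketch, not my actual pipeline: the function name `to_supervised` and the `label_day` parameter are made up for illustration, and it assumes every car model has a reading for every day, with the speeds already parsed to floats in km/h.

```python
from collections import defaultdict

# Long-format rows (day, model, speed in km/h), copied from the first table.
rows = [
    (1, "Bentley", 20.47),
    (2, "Bentley", 32.22),
    (3, "Bentley", 23.11),
    (1, "BMW", 37.60),
    (2, "BMW", 27.90),
    (3, "BMW", 40.47),
]


def to_supervised(rows, label_day=3):
    """Pivot long rows into one (model, features, label) sample per car model.

    Hypothetical helper: days before `label_day` become the features,
    and the speed on `label_day` becomes the label.
    """
    # Group speeds by model, keyed by day.
    by_model = defaultdict(dict)
    for day, model, speed in rows:
        by_model[model][day] = speed

    samples = []
    for model, speeds in by_model.items():
        # Keep features in chronological order so column positions are stable.
        features = [speeds[d] for d in sorted(speeds) if d < label_day]
        samples.append((model, features, speeds[label_day]))
    return samples


samples = to_supervised(rows)
for model, features, label in samples:
    print(model, features, label)
```

Each resulting sample corresponds to one row of the second table: the `day_1` and `day_2` speeds as features and the `day_3` speed as the label.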

Do you always have the same number of days, like 3 in your example? And I assume that your training set would have several instances with the same car model, right? – Erwan – 2019-12-03T01:28:14.887

@Erwan Yes, all cars always have the same days, and yes, I have several other instances, like mode_year and model_type. But I'm not sure whether my approach above is correct. – angela – 2019-12-03T06:26:58.253

Do you have any duplication, such as data for 2 different BMWs? Also, do you have access to other possible features, such as engine size, driver age, etc.? – Donald S – 2020-06-14T05:10:10.643