A common rule in machine learning is to **try simple things first**. For predicting continuous variables there's nothing more basic than **simple linear regression**. "Simple" in the name means that only one predictor variable is used (plus an intercept, of course):

```
y = b0 + x*b1
```
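For a single predictor, the least-squares estimates of `b0` and `b1` even have a closed form. A minimal sketch in Python (the data here is made up purely for illustration):

```
import numpy as np

# toy data, invented for illustration
x = np.array([20.0, 22.0, 25.0, 27.0, 30.0, 33.0])
y = np.array([10.0, 12.0, 15.0, 18.0, 22.0, 26.0])

# closed-form least-squares estimates for y = b0 + x*b1
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
print(b0, b1)
```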

where `b0` is an intercept and `b1` is a slope. For example, you may want to predict lemonade consumption in a park based on temperature:

```
cons = b0 + temp * b1
```
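In practice you'd let a library estimate the coefficients; e.g. with scikit-learn in Python (the temperature and consumption numbers below are invented):

```
import numpy as np
from sklearn.linear_model import LinearRegression

# invented sample: temperature in degrees vs. cups of lemonade sold
temp = np.array([[20.0], [22.0], [25.0], [27.0], [30.0], [33.0]])
cons = np.array([10, 12, 15, 18, 22, 26])

model = LinearRegression().fit(temp, cons)
print(model.intercept_, model.coef_[0])  # estimated b0 and b1
print(model.predict([[28.0]]))           # predicted consumption at 28 degrees
```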

Temperature is a well-defined **continuous** variable. But if we talk about something more abstract like "weather", then it's harder to understand how to measure and encode it. It's fine if we say that the weather takes values `{terrible, bad, normal, good, excellent}` and assign them numbers from -2 to +2 (implying that "excellent" weather is twice as good as "good"). But what if the weather is described by words like `{shiny, rainy, cool, ...}`? We can't impose an order on these values. We call such variables **categorical**. Since there's no natural order between the categories, we can't encode them as a single numerical variable (and linear regression expects numbers only), but we can use so-called **dummy encoding**: instead of a single variable `weather` we use 3 variables - `[weather_shiny, weather_rainy, weather_cool]` - only one of which can take value 1, while the others take value 0. In fact, we have to drop one of them because of collinearity: if two dummies are 0, the third is necessarily 1, so it adds no information. So a model for predicting traffic from weather may look like this:

```
traffic = b0 + weather_shiny * b1 + weather_rainy * b2 # weather_cool dropped
```
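As a sketch of how dummy encoding looks in code, pandas' `get_dummies` with `drop_first=True` drops one category (here `weather_cool`, the first alphabetically) to avoid collinearity; the weather labels and traffic counts below are invented:

```
import pandas as pd
from sklearn.linear_model import LinearRegression

# invented sample data
df = pd.DataFrame({
    "weather": ["shiny", "rainy", "cool", "shiny", "cool", "rainy"],
    "traffic": [130, 80, 100, 120, 95, 70],
})

# dummy encoding; drop_first=True removes one column to avoid collinearity
X = pd.get_dummies(df["weather"], prefix="weather", drop_first=True)
model = LinearRegression().fit(X, df["traffic"])
print(list(X.columns), model.intercept_, model.coef_)
```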

where either `weather_shiny` or `weather_rainy` is 1, or both are 0 (when the weather is cool).

Note that you may also encounter a non-linear dependency between the predictor and the predicted variable (you can easily check for it by plotting `(x, y)` pairs). The simplest way to deal with it without abandoning the linear model is to use polynomial features - simply add powers of your feature as new features. E.g. for the temperature example (for dummy variables it doesn't make sense, because `1^n` and `0^n` are still 1 and 0 for any `n`):

```
traffic = b0 + temp * b1 + temp^2 * b2 [+ temp^3 * b3 + ...]
```
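With scikit-learn this can be done via `PolynomialFeatures`; a short sketch, with the degree and data chosen arbitrarily for illustration:

```
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# invented sample: traffic peaks at mild temperatures, so a line won't fit well
temp = np.array([[0.0], [10.0], [20.0], [25.0], [30.0], [35.0]])
traffic = np.array([40, 80, 120, 125, 110, 90])

# degree-2 polynomial regression: traffic = b0 + temp*b1 + temp^2*b2
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(temp, traffic)
print(model.predict([[22.0]]))
```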

Hi ffriend, thanks for your detailed answer. I did not go into much detail about the independent variable on purpose, because the focus of my question is the fact that I am using a single variable to predict another, and I wanted to know the most suitable data mining techniques for this case. You confirmed my feeling that simple statistics (or not so simple) could be appropriate in this case, but the best is to try things out. Regarding the "weather" variable, I was actually planning to use one metric and continuous variable such as visibility or rain, just to keep things simple. – doublebyte – 2014-10-15T08:42:21.740

In fact, using 2 or more variables for linear regression is almost as simple as using only one, but it can lead to much more accurate predictions. Introduction to Statistical Learning (a free PDF is available) is a great introduction to linear regression, its use cases and estimation metrics. If you are using specialized software like R, modelling different dependencies is dead simple, boiling down to only a few lines of code. – ffriend – 2014-10-15T09:05:25.753