Parameter estimation: reducing tuning time

I have a two-class prediction model with n configurable (numeric) parameters. The model can work quite well if those parameters are tuned properly, but good values for them are hard to find. I used grid search for this (providing, say, m values for each parameter). This yields m ^ n training runs, which is very time-consuming even when run in parallel on a machine with 24 cores.
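To make the blow-up concrete, here is a minimal sketch using scikit-learn's ParameterGrid; the parameter names p0–p5 and the candidate values are made up purely for illustration:

```python
from sklearn.model_selection import ParameterGrid

# m = 4 candidate values for each of n = 6 hypothetical parameters
grid = {"p%d" % i: [0.1, 0.3, 1.0, 3.0] for i in range(6)}

# m^n = 4**6 = 4096 combinations, each requiring a full training
# run -- this is what makes exhaustive grid search so expensive
print(len(ParameterGrid(grid)))  # 4096
```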

I also tried fixing all parameters but one and varying that single parameter (which yields only m × n runs), but it's not obvious to me what to do with the results I got. This is a sample plot of precision (triangles) and recall (dots) for negative (red) and positive (blue) samples:

[figure: precision and recall as a function of a single parameter's value]

Simply taking the "winner" value for each parameter from these one-dimensional sweeps and combining them does not lead to the best (or even a good) prediction result, presumably because the parameters interact. I thought about fitting a regression with the parameter values as independent variables and precision/recall as the dependent variable, but I doubt that a regression with more than 5 independent variables would be much faster than the grid-search scenario.

What would you propose for finding good parameter values in a reasonable amount of time?

oopcode

Posted 2015-07-17T14:17:23.207

Reputation: 183

Answers

If an exhaustive nonlinear scan is too expensive and a linear (one-parameter-at-a-time) scan doesn't yield the best results, then I suggest you try a stochastic nonlinear search, i.e., random search for hyperparameter optimization.

Scikit-learn has a user-friendly description in its user guide.

Here is a paper on random search for hyperparameter optimization.
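For example, here is a minimal sketch with scikit-learn's RandomizedSearchCV. The random forest and its parameters (n_estimators, max_depth, max_features) are stand-ins, since the asker's model isn't specified; the key point is that n_iter caps the number of training runs regardless of how many parameters there are:

```python
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Toy two-class data standing in for the asker's problem.
X, y = make_classification(n_samples=1000, n_classes=2, random_state=0)

# Sample from distributions instead of enumerating m values per
# parameter; the names below are RandomForest parameters, used here
# only as placeholders for the model's own n parameters.
param_dist = {
    "n_estimators": randint(50, 500),   # integers in [50, 500)
    "max_depth": randint(2, 20),        # integers in [2, 20)
    "max_features": uniform(0.1, 0.9),  # floats in [0.1, 1.0]
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_dist,
    n_iter=60,      # fixed budget: 60 runs instead of m^n
    scoring="f1",   # single score combining precision and recall
    n_jobs=24,      # use all 24 cores in parallel
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
print(search.best_score_)
```

A fixed budget like n_iter=60 parallelizes cleanly over 24 cores, and, as the linked paper argues, random sampling covers each individual parameter's range more densely than a grid of the same total size.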

AN6U5

Posted 2015-07-17T14:17:23.207

Reputation: 6,358

I totally agree with this answer. I just wanted to add that another advantage of random hyperparameter optimization is the possibility of exploring interesting areas of the hyperparameter space that may be missed by grid search (take a look at Fig. 1 in the paper linked by @AN6U5). – Pablo Suau – 2015-07-24T07:43:49.900