138 When should I use lasso vs ridge? 2010-07-28T01:10:18.423

76 When to use regularization methods for regression? 2010-11-06T17:53:05.250

62 Why L1 norm for sparse models 2012-12-11T07:25:01.253

57 What problem do shrinkage methods solve? 2011-12-27T22:35:02.860

52 Unified view on shrinkage: what is the relation (if any) between Stein's paradox, ridge regression, and random effects in mixed models? 2014-10-30T15:08:26.507

52 Why is ridge regression called "ridge", why is it needed, and what happens when $\lambda$ goes to infinity? 2015-05-07T18:54:59.627

51 Why does ridge estimate become better than OLS by adding a constant to the diagonal? 2014-10-11T18:52:15.717

44 Why does shrinkage work? 2015-11-02T20:29:19.020

42 Is ridge regression useless in high dimensions ($n \ll p$)? How can OLS fail to overfit? 2018-02-14T16:31:47.080

33 How to estimate shrinkage parameter in Lasso or ridge regression with >50K variables? 2012-04-16T12:02:03.690

29 Why do we only see $L_1$ and $L_2$ regularization but not other norms? 2017-03-23T09:28:50.730

29 If only prediction is of interest, why use lasso over ridge? 2018-03-05T10:19:20.977

26 When is nested cross-validation really needed and can make a practical difference? 2015-10-22T14:11:46.143

25 Is Tikhonov regularization the same as Ridge Regression? 2016-09-10T04:44:58.657

24 Ridge, lasso and elastic net 2014-04-09T14:40:52.000

23 Interpretation of ridge regularization in regression 2014-12-22T14:20:21.327

23 Is regression with L1 regularization the same as Lasso, and with L2 regularization the same as ridge regression? And how to write "Lasso"? 2016-03-07T19:24:48.317

21 How to derive the ridge regression solution? 2013-09-04T15:49:12.783

21 Why is glmnet ridge regression giving me a different answer than manual calculation? 2014-12-15T14:22:15.930

20 Confidence intervals' coverage with regularized estimates 2015-06-20T00:44:26.127

19 What is ridge regression? 2013-03-19T01:31:07.070

19 When will L1 regularization work better than L2 and vice versa? 2015-11-28T16:57:27.460

19 What is elastic net regularization, and how does it solve the drawbacks of Ridge (L2) and Lasso (L1)? 2015-11-28T17:38:15.797

18 Estimating R-squared and statistical significance from penalized regression model 2011-02-15T00:38:52.977

17 Is there a clear set of conditions under which lasso, ridge, or elastic net solution paths are monotone? 2015-05-31T18:06:40.257

16 Implementing ridge regression: Selecting an intelligent grid for $\lambda$? 2012-07-13T17:39:52.077

16 Reason for not shrinking the bias (intercept) term in regression 2014-02-18T09:50:45.643

16 Bridge penalty vs. Elastic Net regularization 2016-07-19T14:33:37.317

15 Why does ridge regression classifier work quite well for text classification? 2011-10-29T18:14:54.547

15 Confusion with Vowpal Wabbit's multiple-pass behavior when performing ridge-regression 2014-01-07T23:43:53.400

15 Regression in $p>n$ setting: how to choose regularization method (Lasso, PLS, PCR, ridge)? 2014-07-20T18:54:30.103

15 What are the assumptions of ridge regression and how to test them? 2015-09-01T14:07:10.467

15 The proof of shrinking coefficients using ridge regression through "spectral decomposition" 2016-06-23T04:08:15.943

15 Reversing ridge regression: given response matrix and regression coefficients, find suitable predictors 2016-09-21T15:17:10.123

14 How can I estimate coefficient standard errors when using ridge regression? 2010-08-25T22:34:45.943

14 Ridge regression coefficients that are larger than OLS coefficients or that change sign depending on $\lambda$ 2013-03-11T06:05:53.713

14 Under exactly what conditions is ridge regression able to provide an improvement over ordinary least squares regression? 2014-11-06T16:00:10.543

13 L1 regression estimates median whereas L2 regression estimates mean? 2012-08-19T06:16:50.577

12 Relationship between ridge regression and PCA regression 2014-01-06T16:02:00.070

12 Difference between Primal, Dual and Kernel Ridge Regression 2014-04-05T22:25:50.630

12 AIC, BIC and GCV: what is best for making decision in penalized regression methods? 2014-07-20T03:44:52.977

12 Why will ridge regression not shrink some coefficients to zero like lasso? 2015-10-12T16:04:53.830

11 Lagrangian relaxation in the context of ridge regression 2010-11-09T22:45:45.627

11 When does LASSO select correlated predictors? 2012-06-14T23:12:50.983

11 Ridge regression results different in using lm.ridge and glmnet 2013-10-31T04:07:29.983

11 Ridge regression – Bayesian interpretation 2014-04-27T14:16:12.110

11 What's the typical range of possible values for the shrinkage parameter in penalized regression? 2014-08-15T18:20:53.870

11 Regularization for ARIMA models 2015-05-13T20:01:17.503

11 AIC of ridge regression: degrees of freedom vs. number of parameters 2016-12-02T11:34:16.093

11 If shrinkage is applied in a clever way, does it always work better for more efficient estimators? 2017-01-17T16:43:42.613

11 How to interpret the results when both ridge and lasso separately perform well but produce different coefficients 2017-03-14T09:46:30.640

10 Ridge and LASSO given a covariance structure? 2012-07-20T08:18:24.037

10 Why do Lasso or ElasticNet perform better than Ridge when the features are correlated? 2017-02-25T17:23:25.393

10 What are some of the most important "early papers" on Regularization methods? 2017-06-09T08:00:20.570

10 Using regularization when doing statistical inference 2017-07-14T20:34:58.963

9 Difference between ridge regression implementation in R and SAS 2011-01-31T03:14:14.757

9 How to calculate regularization parameter in ridge regression given degrees of freedom and input matrix? 2011-03-15T10:34:15.403

9 Understanding ridge regression results 2011-07-27T18:11:07.420

9 Kernel Ridge Regression Efficiency 2013-02-08T02:14:23.257

9 Phoney data and ridge regression are the same? 2015-02-10T10:30:05.450

9 Lucid explanation for "numerical stability of matrix inversion" in ridge regression and its role in reducing overfit 2016-02-28T20:32:53.793

8 Calculate prediction interval for ridge regression? 2011-07-21T08:39:45.820

8 Selection of k knots in regression smoothing spline equivalent to k categorical variables? 2014-04-14T17:27:37.447

8 K-fold or hold-out cross validation for ridge regression using R 2014-06-26T16:43:35.107

8 Ridge & LASSO norms 2014-10-15T13:46:12.620

8 Why can't ridge regression provide better interpretability than LASSO? 2015-11-01T02:08:52.967

8 How to find regression coefficients $\beta$ in ridge regression? 2016-01-16T18:08:40.900

8 Ridge regression: regularizing towards a value 2016-10-23T06:47:35.113

8 In Ridge regression and LASSO, why would smaller $\beta$ be better? 2017-03-16T01:33:39.043

7 Scalable multinomial regression implementation 2011-02-28T21:24:23.323

7 Regularized fit from summarized data: choosing the parameter 2011-04-12T08:21:07.817

7 Confused by MATLAB's implementation of ridge 2012-02-17T20:58:48.453

7 PRESS statistic for ridge regression 2012-07-18T13:12:23.503

7 How do you interpret the results from ridge regression? 2012-10-17T14:10:41.860

7 Applying ridge regression for an underdetermined system of equations? 2014-01-21T01:59:26.030

7 Why does Ridge Regression work well in the presence of multicollinearity? 2014-06-25T20:26:37.587

7 Grid fineness and overfitting when tuning $\lambda$ in LASSO, ridge, elastic net 2015-09-22T13:27:44.687

7 Bias and variance properties of $L^1$ vs $L^2$ normalization 2017-03-08T20:09:25.113

7 Is multicollinearity really a problem? 2017-03-21T19:59:14.260

6 How to apply a soft coefficient constraint to an OLS regression? 2011-09-23T20:43:14.490

6 Why does regularization of coefficient magnitude improve the generalization of linear regression? 2013-07-11T14:27:06.053

6 Selecting optimal set of eigenvectors for Principal Components Regression 2013-10-11T11:56:09.380

6 Gradient of multivariate Gaussian log-likelihood 2014-03-15T18:54:12.590

6 Comparing OLS, ridge and lasso 2014-07-19T16:24:29.807

6 Ridge regression in multivariate Gaussian distribution 2015-03-17T10:59:28.620

6 Estimating the prediction variance in kernel ridge regression 2015-07-20T15:07:34.667

6 Can someone explain what the foldid argument in glmnet does? 2015-07-23T18:49:01.823

6 Computing cross-validated $R^2$ from mean cross-validation error 2015-12-17T17:28:32.920

6 Ridge regression formulation as constrained versus penalized: How are they equivalent? 2016-01-20T16:34:42.697

6 How to perform non-negative ridge regression? 2016-03-25T15:55:55.340

6 Lasso and Ridge tuning parameter scope 2016-04-29T16:19:02.390

6 Maximum penalty for ridge regression 2016-08-03T13:28:32.697

6 The Regularization Path for Smoothing Splines 2016-10-20T19:58:27.383

6 Standardization vs. Normalization for Lasso/Ridge Regression 2017-06-26T13:40:05.367

5 Regularized fit from summarized data 2011-04-08T07:18:45.413

5 Bayesian priors in ridge regression with scikit-learn's linear model 2012-03-19T14:18:31.593

5 Large scale ridge regression 2013-10-28T20:23:57.770

5 Demonstration of benefits of ridge regression over ordinary regression 2014-07-17T22:46:01.560

5 How to decide which penalty measure to use? Any general guidelines or rules of thumb out of textbook 2014-07-21T19:56:44.653