211 Is $R^2$ useful or dangerous? 2011-07-20T20:32:55.510

199 What happens if the explanatory and response variables are sorted independently before regression? 2015-12-07T17:22:05.580

197 Interpretation of R's lm() output 2010-12-04T11:28:14.300

138 When should I use lasso vs ridge? 2010-07-28T01:10:18.423

133 In linear regression, when is it appropriate to use the log of an independent variable instead of the actual values? 2010-07-20T13:11:50.297

107 How exactly does one “control for other variables”? 2011-10-20T20:52:12.243

101 Numerical example to understand Expectation-Maximization 2013-10-14T22:37:36.997

100 What skills are required to perform large scale statistical analyses? 2011-03-02T19:05:46.350

97 What if residuals are normally distributed, but y is not? 2011-06-23T06:00:00.923

97 What is the difference between linear regression and logistic regression? 2012-05-28T18:17:36.760

87 When is it ok to remove the intercept in a linear regression model? 2011-03-07T09:14:00.487

87 How are the standard errors of coefficients calculated in a regression? 2012-12-01T10:16:05.267

86 Why is ANOVA taught / used as if it is a different research methodology compared to linear regression? 2010-07-23T15:17:56.770

82 What's the difference between correlation and simple linear regression? 2010-08-25T23:53:00.417

81 How can a regression be significant yet all predictors be non-significant? 2011-08-19T04:50:55.947

79 Interpreting plot.lm() 2013-05-04T21:34:13.147

76 When to use regularization methods for regression? 2010-11-06T17:53:05.250

76 Including the interaction but not the main effects in a model 2011-05-20T01:19:45.107

74 Is there an intuitive explanation why multicollinearity is a problem in linear regression? 2010-08-02T22:42:32.947

74 What is the difference between linear regression on y with x and x with y? 2012-02-13T05:15:55.637

73 What is the lasso in regression analysis? 2011-10-19T04:24:43.957

70 What are modern, easily used alternatives to stepwise regression? 2011-07-31T23:45:58.013

66 When should linear regression be called "machine learning"? 2017-03-20T22:10:20.387

65 PCA and proportion of variance explained 2012-02-10T05:36:10.673

64 Rules of thumb for minimum sample size for multiple regression 2011-04-28T06:40:32.977

63 How should outliers be dealt with in linear regression analysis? 2010-07-19T23:39:49.730

63 Diagnostics for logistic regression? 2012-12-03T23:15:51.790

62 Why L1 norm for sparse models 2012-12-11T07:25:01.253

62 What are some of the most common misconceptions about linear regression? 2016-06-09T19:10:43.170

58 Do all interactions terms need their individual terms in regression model? 2012-05-04T02:10:29.487

58 What is the benefit of breaking up a continuous predictor variable? 2013-08-31T05:32:29.880

58 What is wrong with extrapolation? 2016-06-19T05:56:17.563

57 How can adding a 2nd IV make the 1st IV significant? 2012-05-14T18:02:13.987

56 What does a "closed-form solution" mean? 2013-09-23T23:31:26.477

55 What is a complete list of the usual assumptions for linear regression? 2011-10-03T04:19:19.057

55 Solving for regression parameters in closed-form vs gradient descent 2012-02-20T01:47:19.123

55 How does the correlation coefficient differ from regression slope? 2012-07-17T14:43:59.217

55 Why would parametric statistics ever be preferred over nonparametric? 2015-07-30T11:48:44.030

54 Difference between confidence intervals and prediction intervals 2011-10-04T18:35:49.743

53 Alternatives to logistic regression in R 2010-08-31T10:02:07.947

53 Why is it possible to get significant F statistic (p<.001) but non-significant regressor t-tests? 2010-10-13T09:40:17.420

53 What are disadvantages of using the lasso for variable selection for regression? 2011-03-06T23:21:24.703

53 Why does a time series have to be stationary? 2011-12-12T21:11:54.887

52 Why does the Lasso provide Variable Selection? 2013-11-04T14:39:19.147

52 Unified view on shrinkage: what is the relation (if any) between Stein's paradox, ridge regression, and random effects in mixed models? 2014-10-30T15:08:26.507

51 How to visualize what canonical correlation analysis does (in comparison to what principal component analysis does)? 2013-07-26T20:28:15.647

51 Why does ridge estimate become better than OLS by adding a constant to the diagonal? 2014-10-11T18:52:15.717

51 How to calculate Area Under the Curve (AUC), or the c-statistic, by hand 2015-04-09T17:53:46.377

50 When is R squared negative? 2011-07-11T17:07:34.553

50 Logistic regression in R resulted in perfect separation (Hauck-Donner phenomenon). Now what? 2012-12-12T23:59:50.220

49 Does it make sense to add a quadratic term but not the linear term to a model? 2012-05-18T13:34:43.363

48 What correlation makes a matrix singular and what are implications of singularity or near-singularity? 2013-09-24T10:55:04.087

48 Why isn't Logistic Regression called Logistic Classification? 2014-12-07T18:44:41.497

47 What is a "saturated" model? 2010-07-20T12:09:08.457

47 Regression with multiple dependent variables? 2010-11-14T02:50:03.993

47 Box-Cox like transformation for independent variables? 2012-09-05T10:37:28.063

46 Explain the difference between multiple regression and multivariate regression, with minimal use of symbols/math 2010-09-03T18:54:17.230

46 Why do we care so much about normally distributed error terms (and homoskedasticity) in linear regression when we don't have to? 2014-12-30T22:22:00.057

46 Can simple linear regression be done without using plots and linear algebra? 2016-04-01T12:48:16.650

44 Fast linear regression robust to outliers 2012-12-19T10:47:00.917

44 Where to start with statistics for an experienced developer 2015-10-13T01:57:02.817

43 Understanding regressions - the role of the model 2011-01-04T09:29:01.153

43 Efficient online linear regression 2011-02-05T18:25:52.210

43 Multivariate linear regression vs neural network? 2012-10-27T08:06:23.977

43 A more definitive discussion of variable selection 2016-07-14T16:30:08.713

42 Is adjusting p-values in a multiple regression for multiple comparisons a good idea? 2010-09-30T14:07:56.490

42 Are splines overfitting the data? 2013-02-01T09:36:38.343

40 Random forest assumptions 2013-05-15T14:13:52.850

39 What algorithm is used in linear regression? 2010-08-18T13:30:31.750

39 What does having "constant variance" in a linear regression model mean? 2013-03-13T12:51:16.463

39 Shape of confidence interval for predicted values in linear regression 2014-02-06T00:15:17.603

38 Intuitive explanation of the bias-variance tradeoff? 2010-11-07T10:57:29.053

38 What are good RMSE values? 2013-04-16T21:03:02.497

38 What is a contrast matrix (a term, pertaining to an analysis with categorical predictors)? 2013-12-02T21:19:40.847

38 Bayes regression: how is it done in comparison to standard regression? 2016-12-20T17:35:54.453

37 How are regression, the t-test, and the ANOVA all versions of the general linear model? 2013-05-15T00:46:04.307

36 If the t-test and the ANOVA for two groups are equivalent, why aren't their assumptions equivalent? 2010-08-13T09:41:13.160

36 Interpretation of log transformed predictor 2011-11-16T10:03:24.363

36 Why not approach classification through regression? 2012-02-05T05:43:32.493

36 How to simulate artificial data for logistic regression? 2012-12-25T14:59:47.420

36 Is there a difference between 'controlling for' and 'ignoring' other variables in multiple regression? 2013-12-07T02:14:23.860

35 Regression: Transforming Variables 2010-11-23T17:41:19.050

34 What is difference-in-differences? 2010-07-23T16:57:50.063

34 Is it valid to include a baseline measure as control variable when testing the effect of an independent variable on change scores? 2011-09-18T04:22:55.947

34 Effect of switching response and explanatory variable in simple linear regression 2012-01-03T19:24:29.060

34 How to read Cook's distance plots? 2012-02-02T12:02:58.120

34 When do Poisson and negative binomial regressions fit the same coefficients? 2013-09-30T21:39:20.263

34 How do I know which method of cross validation is best? 2014-06-15T15:25:14.840

34 Why is ANOVA equivalent to linear regression? 2015-10-02T18:40:16.733

33 Least-angle regression vs. lasso 2010-11-18T07:28:22.207

33 When and how to use standardized explanatory variables in linear regression 2011-02-11T23:09:54.510

33 Significance of coefficients in linear regression: significant t-test vs non-significant F-statistic 2012-03-15T19:56:37.380

33 Principled way of collapsing categorical variables with many categories 2015-04-17T13:31:28.447

33 Why use gradient descent for linear regression, when a closed-form math solution is available? 2017-05-10T16:52:30.517

32 Data mining: How should I go about finding the functional form? 2011-05-05T16:26:00.037

32 Different ways to write interaction terms in lm? 2011-12-02T20:23:26.297

32 Regression for an outcome (ratio or fraction) between 0 and 1 2012-05-23T22:13:50.053

32 What do "endogeneity" and "exogeneity" mean substantively? 2013-05-21T06:22:22.660

32 Can a random forest be used for feature selection in multiple linear regression? 2015-07-30T21:52:22.147