98 How to get correlation between two categorical variables, and between a categorical variable and a continuous variable? 2014-08-03T13:07:24.143

58 Neural networks: which cost function to use? 2016-01-19T11:48:29.337

45 Data Science in C (or C++) 2015-03-20T14:56:23.420

37 Calculation and Visualization of Correlation Matrix with Pandas 2016-03-01T05:56:37.497

27 Books about the "Science" in Data Science? 2014-06-11T13:28:35.980

27 Any Online R console? 2014-10-13T21:13:48.447

25 Is Python a viable language to do statistical analysis in? 2020-06-29T03:59:04.197

21 What statistical model should I use to analyze the likelihood that a single event influenced longitudinal data 2014-06-20T03:18:59.477

16 High-dimensional data: What are useful techniques to know? 2015-01-25T22:52:23.437

16 Beginner math books for Machine Learning 2018-01-09T18:28:08.657

15 How to specify important attributes? 2014-05-19T15:55:24.983

15 How many features to sample using Random Forests 2017-10-10T10:50:22.720

14 When are p-values deceptive? 2014-05-14T22:12:37.203

14 Analyzing A/B test results which are not normally distributed, using independent t-test 2014-08-04T22:27:10.837

13 Datasets understanding best practices 2014-06-24T07:29:57.787

12 ngram and RNN prediction rate wrt word index 2015-10-27T09:55:31.540

12 Overfitting in Linear Regression 2020-08-27T08:52:05.400

11 Data Science oriented dataset/research question for Statistics MSc thesis 2014-06-14T19:54:53.193

11 Best languages for scientific computing 2014-06-16T19:14:38.553

11 Is GLM a statistical or machine learning model? 2014-06-19T18:02:24.650

11 Statistics + Computer Science = Data Science? 2014-07-22T08:39:33.810

11 Relationship between KS, AUROC, and Gini 2014-11-23T01:05:06.473

10 How do various statistical techniques (regression, PCA, etc) scale with sample size and dimension? 2014-08-05T18:36:12.753

10 How to group identical values and count their frequency in Python? 2016-04-21T18:49:50.497

9 Ways to reconstruct shuffled pixels of a video file? 2017-09-30T15:57:53.597

8 Linearly increasing data with manual reset 2014-07-04T05:12:44.707

8 Evaluating Recommendation engines 2014-11-26T04:40:17.840

8 Why use bootstrapping? 2016-02-26T06:04:11.280

8 Am I doing a log transformation of data correctly? 2017-09-11T18:03:47.500

8 A/B testing: How to calculate p-value on post test segments? 2017-11-14T02:14:42.247

7 How to numerically estimate MLE estimators in Python when gradients are very small far from the optimal solution? 2014-08-25T00:28:09.003

7 Sensitivity to scaling of features in multivariate Gaussians 2015-06-18T16:41:43.320

7 Reducing the effect of down voters with rating system 2015-11-27T17:44:20.980

7 IID violation in machine learning 2016-03-07T23:03:06.100

7 Time Series Machine Learning Feature Selection Problem 2016-08-05T00:41:17.617

7 Computational aspects are typically ignored by statisticians 2018-07-19T08:46:10.267

7 Why 100% accuracy on test data is not good? 2018-12-30T17:40:02.613

7 When to use mean vs median 2019-03-06T03:30:10.710

7 How to find out if two datasets are close to each other? 2019-06-09T05:10:23.613

7 bias and variance trade off related question 2019-08-21T03:35:07.247

7 Which statistical test tells which classifier performs better than the other? 2019-12-20T04:51:32.500

6 Is Data Science just a trend or is a long term concept? 2014-05-18T19:46:44.653

6 Data Science as a Social Scientist? 2014-06-13T07:28:37.763

6 How to normalize results of Singular Value Decomposition (SVD) between 0 and 1? 2014-06-26T19:23:46.043

6 Statistical Commute Analysis in Java 2014-09-17T13:13:59.147

6 Looking for a strong PhD Topic in Predictive Analytics in the context of Big Data 2014-09-25T20:18:46.880

6 How to detect overfitting of a stock screener 2015-03-02T23:02:45.583

6 How to find a confidence level given the z value 2016-02-05T13:52:08.660

6 What is the rationale for discretization of continuous features and when should it be done? 2017-06-17T04:23:17.953

6 A clear visualization of a two-way ANOVA 2017-10-02T17:34:08.457

6 Testing independence of random variables in Python 2018-07-20T12:30:15.843

6 What is the Time Complexity of Linear Regression? 2018-07-20T15:47:26.760

6 Least Squares optimization 2019-02-10T06:33:37.720

6 Covariance as inner product 2019-04-08T15:40:16.273

6 Question about sklearn's StratifiedShuffleSplit 2019-04-30T05:46:53.663

6 When is the sum of models the model of the sum? 2019-05-15T16:53:46.510

6 Is there any difference between a weak learner and a weak classifier? 2020-01-05T22:15:41.233

6 Why ML model produces different results despite random_state defined? And how to set global random seed for sklearn 2020-01-12T11:34:54.883

6 Are Undergraduate Statistics Concepts Used in Practice? 2020-02-07T07:18:51.633

6 How do I test a difference between two proportions representing fatality rate for Covid 19 in Philippines and World (except Philippines)? 2020-03-14T02:49:54.487

6 How does the distribution of data affect model performance? 2020-05-25T07:16:58.357

5 What are good sources to learn about Bootstrap? 2014-06-17T18:13:46.230

5 Standardize numbers for ranking ratios 2014-06-25T02:33:22.673

5 Methods for standardizing / normalizing different rank scales 2014-10-10T00:59:46.703

5 Relation mining of multivariate categorical time series without excluding the temporal nature 2014-11-21T15:23:37.360

5 Mahout Similarity algorithm comparison 2014-12-02T11:12:06.103

5 Method for solving problem with variable number of predictors 2014-12-03T21:47:34.907

5 Best Python library for statistical inference 2015-02-12T10:46:41.280

5 How do you compare term counts between two different periods, with different underlying corpus sizes, without bias? 2015-05-11T15:07:47.143

5 Testing bimodality of data 2015-07-31T17:20:10.873

5 Implementation of Tree Kernels in Python 2015-10-30T01:03:04.533

5 Completing MDS manually in R 2015-11-16T01:53:50.673

5 What kind of statistical analyses can I do with my data? 2015-12-22T09:36:33.997

5 Change detection 2016-04-20T08:06:09.363

5 How to decide which test of normality to use 2016-07-18T15:49:11.970

5 A/B testing: When A/A testing doesn't work 2016-08-08T18:17:42.123

5 How to find the relationship between categorical variables? 2018-03-19T16:00:09.523

5 Anomaly detection using RNN LSTM 2018-04-24T12:51:14.393

5 Why is sampling useful in machine learning? 2018-07-31T19:22:24.890

5 How to interpret two continuous variables output using GAM? 2019-07-25T09:24:23.017

5 Why is 10 considered the default value for k-fold cross-validation? 2020-06-10T15:16:23.277

5 Stacking and Ensembling methods in Data Science 2020-06-29T11:00:23.687

5 Understand the equations of quantile regression forest (Meinshausen)? 2020-09-09T13:39:43.187

4 Is there a replacement for small p-values in big data? 2014-05-15T00:26:11.387

4 When is there enough data for generalization? 2014-08-04T19:10:57.187

4 Anomaly detection in multiple parameters 2014-11-02T07:20:32.603

4 Research in high-dimensional statistics vs. machine learning? 2015-08-20T17:42:23.987

4 Measure of correlation for term frequency 2015-10-06T21:55:44.410

4 How to find the probability of one or more events happening from an incomplete data set 2015-11-16T15:40:51.043

4 Impact of unlabelled documents for label prediction via SVM 2015-11-19T15:19:35.033

4 How to deal with analyzing optional survey data 2016-01-04T04:49:51.893

4 Assessing significance / confidence of a crossvalidated performance measure 2016-01-28T13:13:52.680

4 What are the methods to ensure that the population split for A/B test is random? 2016-02-26T13:27:48.553

4 A/B testing randomization step 2016-05-30T04:48:24.397

4 Computing confidence interval for average from individual predictions 2016-12-25T19:35:16.417

4 Statistics - Train and test data split 2017-03-03T09:02:37.583

4 Where to find statistically relevant documentation of common Python packages? 2017-05-17T22:12:24.887

4 Maths of Xavier initialization 2018-04-16T23:08:19.493

4 Geometric and harmonic means in ensembling methods 2018-04-24T13:21:25.807