Tag: data-mining

81 K-Means clustering for mixed numeric and categorical data 2014-05-14T05:58:21.927

48 Are Support Vector Machines still considered "state of the art" in their niche? 2014-07-09T12:22:22.400

41 Open source Anomaly Detection in Python 2015-07-22T14:26:58.660

24 What are some standard ways of computing the distance between documents? 2014-07-05T16:10:21.580

21 Why are NLP and Machine Learning communities interested in deep learning? 2014-10-11T10:24:01.393

20 How to do SVD and PCA with big data? 2014-09-25T08:40:59.467

19 Is Data Science the Same as Data Mining? 2014-05-14T01:25:59.677

17 What statistical model should I use to analyze the likelihood that a single event influenced longitudinal data 2014-06-20T03:18:59.477

17 Gini coefficient vs Gini impurity - decision trees 2014-09-09T12:44:16.967

15 Meaning of latent features? 2014-07-16T09:24:51.780

14 K-means: What are some good ways to choose an efficient set of initial centroids? 2015-04-30T13:42:05.343

13 Big data case study or use case example 2014-06-11T06:07:45.767

13 How to deal with time series which change in seasonality or other patterns? 2014-12-22T03:30:45.673

13 What is Hellinger Distance and when to use it? 2017-08-31T02:11:38.127

12 Item based and user based recommendation difference in Mahout 2014-12-04T05:18:03.720

12 Neo4j vs OrientDB vs Titan 2014-12-18T04:36:06.107

12 Recognize a grammar in a sequence of fuzzy tokens 2016-08-08T13:01:19.127

11 One-Class discriminatory classification with imbalanced, heterogenous Negative background? 2014-06-11T10:11:59.397

10 Airline Fares - What analysis should be used to detect competitive price-setting behavior and price correlations? 2015-05-17T20:12:48.760

10 Why are ensembles so unreasonably effective 2016-05-25T13:08:06.693

10 Why do we need XGBoost and Random Forest? 2017-10-14T12:33:00.527

9 Working with HPC clusters 2014-07-08T13:45:07.583

9 Scalable Outlier/Anomaly Detection 2014-10-17T10:47:13.197

9 How to scrape IMDb webpage? 2015-04-15T23:53:13.957

9 LinkedIn web scraping 2015-05-13T21:01:03.070

9 Which is faster: PostgreSQL vs MongoDB on large JSON datasets? 2015-06-03T20:29:40.490

8 Clustering customer data stored in ElasticSearch 2014-05-14T08:38:07.007

8 Are there any APIs for crawling abstracts of papers? 2014-05-17T08:45:08.420

8 How to debug data analysis? 2014-06-15T12:26:50.060

8 Relational Data Mining without ILP 2014-06-17T13:46:06.367

8 Learning signal encoding 2014-06-18T03:19:07.557

8 Is FPGrowth still considered "state of the art" in frequent pattern mining? 2014-07-12T17:25:52.907

8 Why might several types of models give almost identical results? 2014-08-18T14:56:13.800

8 What initial steps should I use to make sense of large data sets, and what tools should I use? 2014-08-19T17:50:52.583

8 Relationship between KS, AUROC, and Gini 2014-11-23T01:05:06.473

8 How do I calculate the delta term of a Convolutional Layer, given the delta terms and weights of the previous Convolutional Layer? 2015-06-02T20:16:43.627

8 How to model user's buying behavior on Amazon? 2015-11-05T17:06:27.647

7 Human activity recognition using smartphone data set problem 2014-05-27T10:41:33.220

7 NASDAQ Trade Data 2014-07-19T20:46:52.740

7 What is the use of user data collection besides serving ads? 2014-07-31T18:52:56.307

7 How to build a textual search engine? 2014-09-12T11:48:21.617

7 Using NLP to automate the categorization of user description 2014-12-09T20:49:37.093

7 How to avoid overfitting in random forest? 2015-07-07T18:05:23.903

7 User-product positive (click data) available. How to generate negative (no-click data)? 2015-11-17T16:10:20.000

7 What is the difference between one-hot encoding and leave-one-out encoding? 2016-03-23T03:25:53.170

7 Python: Handling imbalanced classes in Python machine learning 2016-04-25T07:26:53.743

6 Computing Image Similarity based on Color Distribution 2014-07-27T21:54:05.003

6 Is it possible to identify different queries/questions in sentence? 2014-10-16T05:44:40.183

6 Matrix properties and machine learning/data mining 2014-10-30T18:22:18.907

6 How to connect data-mining with machine learner process 2014-12-03T15:56:50.687

6 What would be a good way to use clustering for outlier detection? 2014-12-06T15:04:03.823

6 How does SQL Server Analysis Services compare to R? 2015-03-27T08:41:13.680

6 How to create a good list of stopwords 2015-05-24T21:45:02.207

6 Working with inaccurate (incorrect) dataset 2015-06-24T16:36:32.730

6 Is it advisable to rerun LASSO multiple (2) times? 2015-12-16T21:20:08.390

6 Visualizing items frequently purchased together 2016-10-06T21:27:28.460

6 How to extract paragraphs from text document? 2016-11-11T06:06:35.760

6 Is it possible to train a neural network to solve polynomial equations? 2017-02-09T16:01:38.533

6 How much data are sufficient to train my machine learning model? 2017-06-26T21:26:04.680

6 Word2Vec vs. Sentence2Vec vs. Doc2Vec 2017-06-30T07:05:33.707

5 What are good sources to learn about Bootstrap? 2014-06-17T18:13:46.230

5 Efficient solution of fmincg without providing gradient? 2014-06-21T04:59:06.620

5 Stochastic gradient descent in logistic regression 2014-07-07T11:43:48.430

5 Looking for a strong PhD Topic in Predictive Analytics in the context of Big Data 2014-09-25T20:18:46.880

5 How to do this complicated data extrapolation, prediction modeling? 2014-10-12T05:27:17.687

5 Relation mining of multivariant categorical timeseries without excluding the temporal nature 2014-11-21T15:23:37.360

5 Evaluating Recommendation engines 2014-11-26T04:40:17.840

5 How does the naive Bayes classifier handle missing data in training? 2014-12-16T13:07:55.063

5 What kind of research can be done with an email data set? 2015-05-10T09:58:56.670

5 I am trying to classify/cluster users profile but don't know how with my attributes 2015-05-19T23:34:25.213

5 Different methods for clustering skills in text 2015-05-24T11:54:43.347

5 What is the best way to propose an item from a set based on previous choices? 2015-06-23T16:29:23.320

5 Anonymizing Datasets 2015-07-30T07:16:07.637

5 Limitations on the number of items to use in apriori algorithm? 2015-08-05T05:18:55.957

5 Connection between Regularization and Gradient Descent 2015-09-02T15:21:23.047

5 What kind of statistical analyses can I do with my data? 2015-12-22T09:36:33.997

5 Any case studies using Bayesian Networks for system design trades? 2016-01-18T19:48:29.680

5 Amount of data needed and hypothesis for SVD 2016-01-31T18:10:33.100

5 General way to reduce features 2016-02-24T06:07:13.507

5 Training Dataset for Sentiment Analysis of Movie Reviews 2016-04-15T03:33:55.863

5 A few ideas to parse "events" from a text document 2016-06-03T21:57:04.387

5 Building a Tic Tac Toe game which learns by itself 2017-02-24T18:29:38.860

5 Pandas v. SFrame in learning data science 2017-03-09T12:33:59.773

5 TensorFlow in production 2017-03-09T23:38:07.887

5 Understanding how distributed PCA works 2017-04-19T08:58:18.707

5 Logic in sentence : tree representation 2017-08-27T16:38:59.617

5 Missing Values in Data 2017-08-31T10:08:51.103

5 Decision Tree used for Calculating Precision, Accuracy, and Recall, class breakdown question 2018-01-28T05:12:10.113

4 When is there enough data for generalization? 2014-08-04T19:10:57.187

4 Understanding output stepAIC 2014-08-12T17:11:45.447

4 Machine Learning - What is the difference between one-class, binary-class and multinomial-class classification? 2014-10-20T06:38:16.490

4 Rough vs Fuzzy vs Granular Computing 2014-10-26T13:12:23.597

4 Mahout Similarity algorithm comparison 2014-12-02T11:12:06.103

4 Method for solving problem with variable number of predictors 2014-12-03T21:47:34.907

4 Dataset to give same eigenvectors? 2015-03-06T04:09:56.030

4 Could one algorithm fetch keywords from texts of different natural languages? 2015-06-03T12:19:59.147

4 How to preprocess different kinds of data (continuous, discrete, categorical) before Decision Tree learning 2015-08-07T10:43:50.747

4 One Hot encoding for large number of values 2015-10-03T18:37:16.597

4 filling missing data with other than mean values 2015-10-06T10:51:52.883