Tag: text

23 Sentence similarity prediction 2017-10-22T07:36:15.920

19 How do you apply SMOTE on text classification? 2018-02-10T11:18:25.340

8 How to use TFIDF vectors with multinomial naive bayes? 2017-04-05T17:10:51.403

8 Which type of autoencoder gives the best results for text? 2018-03-25T20:43:33.720

6 How to implement hierarchical labeling classification? 2019-02-25T12:17:32.290

5 Text similarity using RNN 2017-01-24T11:13:22.463

5 Data transformations in hierarchical classification 2019-12-05T00:03:56.430

4 Automatic annotation of medical text data 2016-04-10T13:00:00.167

4 How can I group texts with similar content together? 2016-05-03T14:33:40.117

4 Extract 2 pieces of information from a string - what to use? 2016-09-26T09:52:12.410

4 Doc2vec to calculate cosine similarity - absolutely inaccurate 2017-11-06T11:03:51.883

3 Word analysis in Python 2016-04-03T08:11:14.220

3 How to classify support call texts? 2016-04-20T21:05:09.387

3 Choosing the right parameters to train a Tf-Idf vectoriser 2016-07-23T22:33:04.860

3 How does the Multinomial Bayes's alpha parameter affect the text classification task? 2018-04-18T10:28:15.793

3 What is the minimum number of times a word needs to appear in word2vec training corpus for quality results? 2018-05-07T20:01:01.310

3 Is there any clear tutorial for how to use AutoEncoders with text as input 2018-05-16T11:25:47.463

3 Changing multiple models into 1 model 2018-06-07T11:45:19.017

3 Bidirectional Encoder Representations from Transformers in R 2019-05-13T11:59:26.847

3 Multiclass classification of textual data 2019-06-18T08:11:10.813

3 Classifying one particular class of documents from the rest 2020-01-31T11:49:12.633

3 Extract date/duration from text 2021-01-14T21:55:42.297

2 What are some functions/packages in R to find similarity of individual words not in the context of sentences? 2017-04-09T16:33:00.020

2 Discovering string "motifs" in python 2017-10-04T14:19:26.723

2 One hot encoding at character level with Keras 2018-05-02T23:29:23.750

2 Why is spam detection a classification problem and not a class modelling problem 2018-05-05T04:19:52.633

2 Accuracy reduces drastically when using TruncatedSVD with hashingvector 2018-05-30T11:38:00.393

2 CNN to many outputs 2019-01-15T13:45:11.243

2 How to extract and classify data from a column in excel? 2019-02-11T11:44:54.367

2 Match two items from two different receipts 2019-02-28T06:55:48.700

2 Implementing back translation as a data augmentation for text classification 2019-03-29T07:25:32.243

2 How to extract keywords from a list of URLs? 2019-11-25T23:50:31.813

2 What is the best approach for classifying non-English text 2019-12-17T09:51:55.237

2 Why do probabilities sum to one and how can I set optimal threshold level? 2020-01-17T10:32:17.510

2 How to separate words that are together in a large data set 2020-01-19T17:43:10.803

2 Needed: Java library to calculate text readability/complexity 2020-02-04T15:28:09.283

2 Text vectorizer that captures feature offset in the text? 2020-03-19T14:39:39.517

2 Are there any open-source text annotation tools for multi-label classification? 2020-05-08T09:07:38.737

2 Clustering mixed data types - numeric, categorical, arrays, and text 2020-06-14T13:52:54.303

1 Dense text representations 2015-12-21T06:50:07.703

1 Generating a text training dataset from a grammar 2016-06-12T14:34:28.933

1 Text similarities 2016-09-21T19:31:50.930

1 Text processing 2017-02-03T07:58:43.357

1 Section/Topic segmentation in HTML and plaintext documents 2017-03-19T13:48:16.153

1 Methods for string classifications 2017-05-08T13:29:19.747

1 Where can I find datasets with labeled duplicate text documents? 2017-09-10T17:20:28.910

1 How to incorporate metadata into text classification? 2017-11-30T14:08:58.723

1 Grouping of similar looking text 2017-12-18T08:34:29.863

1 Text classification problem using Python or R 2017-12-20T04:44:23.207

1 Unsupervised clustering of unstructured text by document type 2018-01-09T18:14:26.963

1 Multiclass classification with many classes and wide range of sample sizes 2018-01-29T00:19:52.100

1 Categorize text as Body, Heading in a loosely formatted document 2018-02-28T07:52:29.153

1 Grouping company information 2018-05-18T16:00:07.773

1 Where can I find a dataset for long sequence text chunking? 2018-05-29T21:00:57.370

1 Ordering quotes in a list based on user input and text analysis 2018-07-30T15:41:47.570

1 Can I treat text review analysis as a regression problem? 2018-09-08T21:09:31.850

1 What options are out there to extract text from a group of PDFs where each PDF is formatted differently but contains the general same content 2019-01-31T19:58:54.387

1 Word classification in the context 2019-02-11T11:39:35.920

1 python - Identify variable in similar sentences 2019-03-03T02:44:51.323

1 Input data of variable length - two scenarios 2019-03-06T20:22:49.313

1 Fuzzy matching of author names 2019-04-24T12:07:45.890

1 Complete IPv4 Address Space 2019-05-28T15:33:36.830

1 Is it OK to train a binary classifier using all the extremely imbalanced data if the majority class is negative? 2019-05-31T13:50:08.240

1 Is there any library available for balancing imbalanced text dataset? 2019-06-28T09:14:42.737

1 Technical term for using regular expressions to classify text? 2019-07-08T05:31:28.993

1 How to match a word from column and compare with other column in pandas dataframe 2019-07-18T12:07:09.200

1 How do I identify specific parts of a PDF document? 2019-08-21T13:28:43.923

1 Text classification for data with multiple labels per observation 2019-11-18T16:38:00.463

1 How to resize an image without changing DPI in OpenCV for detecting text and feeding into OCR? 2020-01-03T12:57:38.233

1 Need some info regarding string matching algorithms? 2020-01-10T09:33:47.917

1 Extract editing history from Microsoft Word documents? 2020-01-12T20:31:17.620

1 Multimodal end-to-end deep learning 2020-02-18T02:48:21.767

1 Computer science corpus for training a language model 2020-02-20T13:06:52.740

1 How to implement LSTM using Doc2Vec vectors to get representation? 2020-04-02T21:21:09.387

1 Paragraph extraction from text 2020-04-10T18:47:34.660

1 How to classify text based on more than one column 2020-06-16T13:16:57.887

1 Chinese word segmentation using neural networks 2020-08-12T17:08:18.540

1 How can I get a value of context vector in GPT? 2020-09-02T13:35:20.123

1 NLP Classification labels have many similarities, converge to and replace to only have one 2020-11-05T17:13:21.640

1 What's the difference between sequence preprocessing and text preprocessing in Keras? 2020-11-17T12:20:04.920

1 Distribution difference between image and text 2020-11-23T19:39:48.023

1 Model to detect specific semantic content without labeled data 2020-12-03T08:52:56.067

1 How to Identify Repeating Data Entries when the Repeated Entries are Spelled or Constructed Differently 2020-12-16T03:07:24.543

1 Which phrase should be returned in case of multiple matches when comparing text? 2020-12-23T10:36:08.770

0 How to down-weight non-words in text classification? 2016-06-11T18:24:31.133

0 Which approach for user classification on chat text (classifier, representation, features)? 2016-10-24T05:34:37.707

0 Text Mining of Research Paper Abstracts 2017-01-04T15:14:30.100

0 How to use binary text classifier(built using SVM with TF-IDF) to classify new text document? 2017-03-31T11:16:22.653

0 Using training data generated with pure regular expressions - Can machine learning surpass the accuracy of your regular expression? 2017-04-15T01:15:53.130

0 What methods to create singular content classification from inconsistent inbound info? 2017-07-27T11:50:45.587

0 Classify text labels in to a similar category 2017-12-30T11:47:11.090

0 Street address clustering? 2018-07-10T07:30:45.213

0 Use text similarity (cosine) instead of machine learning to classify companies into industries 2018-11-17T16:33:35.587

0 Multi-Class Text Classification: Doc2Vec performing very badly compared to Hashing Vector 2018-11-28T14:35:57.197

0 How to use correlation matrix when the dataset contains multiple columns with text data? 2019-01-15T17:59:51.877

0 Size of Output vector from AvgW2V Vectorizer is less than Size of Input data 2019-01-19T17:02:25.817

0 Extracting structure and content from invoices 2019-05-22T06:05:44.237

0 Complete IPv4 Address List 2019-05-26T01:01:50.020

0 nltk's stopwords returns "TypeError: argument of type 'LazyCorpusLoader' is not iterable" 2019-06-08T14:25:42.203