Tag: encoding

44 What is the positional encoding in the transformer model? 2019-04-28T14:43:17.090

41 Sparse_categorical_crossentropy vs categorical_crossentropy (keras, accuracy) 2018-12-01T06:28:06.650

38 Encoding features like month and hour as categorical or numeric? 2017-03-22T07:43:57.223

28 Difference between OrdinalEncoder and LabelEncoder 2018-10-07T18:55:40.833

25 How to deal with string labels in multi-class classification with keras? 2017-03-11T13:42:10.793

14 One hot encoding alternatives for large categorical values 2017-11-14T17:20:58.253

13 What is the difference between global and universal compression methods? 2014-06-18T15:27:23.313

12 Why does frequency encoding work? 2019-11-25T15:36:36.253

11 One Hot Encoding vs Word Embedding - When to choose one or another? 2018-04-03T14:13:28.643

10 In a Transformer model, why does one sum positional encoding to the embedding rather than concatenate it? 2019-07-18T08:34:46.710

7 Multi-hot encoding vs Label-Encoding 2018-08-21T12:03:12.603

6 Always drop the first column after performing One Hot Encoding? 2018-02-27T12:28:35.403

6 Mapping of categorical features into binary indicator features 2019-06-24T11:43:35.437

6 How do I encode the categorical columns if there are more than 15 unique values? 2020-12-24T20:11:58.290

5 How to handle columns with categorical data and many unique values 2019-04-08T11:04:22.033

4 Words to numbers faster lookup 2017-01-16T17:04:05.850

4 Is it effective to use one-hot encoding when its dimension is as large as thousands 2018-03-20T08:52:11.477

4 One hot encoding large dataset 2018-06-10T22:19:02.213

4 How to use one hot encoding of string categorical features in keras? 2019-01-07T20:11:28.000

4 In which cases shouldn't we drop the first level of categorical variables? 2019-03-19T19:55:36.497

4 How to automate the encoding process? 2019-06-12T08:57:30.523

3 Pandas categorical variables encoding for regression (one-hot encoding vs dummy encoding) 2017-03-20T19:26:11.217

3 For a multi-class classification problem, how to transform the target variable to a form that is usable by sklearn algorithms? 2019-02-25T19:07:04.787

3 One-hot encode multi-class multi-label sequences 2019-02-26T12:32:50.310

3 Target Encoding: missing value imputation before or after encoding 2019-03-16T10:57:11.730

3 Aggregating target-encoded array-like categorical features? 2019-04-09T18:41:03.810

3 Why can One-Hot Encoder keep the model from misreading the data as ordered, as it might after Label Encoding? 2019-04-25T11:55:23.743

3 Applying mean encoding before or after splitting into train and test set 2019-05-19T14:37:20.370

3 One hot encoding with too many features (~ 10,000) 2019-07-21T10:15:05.903

3 Target mean encoding worse than ordinal encoding with GBDT ( XGBoost, CatBoost ) 2019-08-22T17:10:33.457

3 How to work with different Encoding for Foreign Languages 2020-07-04T07:30:05.010

3 Is Label Encoding with arbitrary numbers ever useful at all? 2020-07-17T15:23:27.123

3 On gradient boosting and types of encodings 2020-07-21T16:22:43.643

2 Encoding features in sklearn 2016-08-29T09:21:29.577

2 Do categorical features always need to be encoded? 2016-09-13T13:15:01.727

2 why leave-one-out encoding? 2017-03-24T20:44:07.477

2 OneHotEncoder vs. LabelEncoder vs. LabelBinarizer 2017-09-10T14:26:10.097

2 What approach for creating a multi-classification model based on all categorical features (1 with 5,000 levels)? 2018-01-09T20:25:37.097

2 One hot encoding vs Word embedding 2018-03-20T13:37:26.070

2 One hot encoding at character level with Keras 2018-05-02T23:29:23.750

2 How to encode data with a feature having multidimensional features (colors)? 2018-07-02T10:37:35.117

2 "Binary Encoding" in "Decision Tree" / "Random Forest" Algorithms 2018-10-03T08:14:24.893

2 Equivalent of numeric encoding when rows can contain multiple values 2018-10-03T10:08:30.637

2 Transformation of categorical variables (binary vs numerical) 2018-11-04T17:15:20.940

2 how to extract the Top contributing labels/words in universal-sentence-encoder-large - TransformerModel? 2019-01-23T15:34:39.660

2 Target encoding with cross validation 2019-02-20T13:57:35.143

2 Why don't Target/LeaveOneOut Encoders work well for Regression Tasks? 2019-07-03T21:39:09.340

2 How can I count the number of occurrences of a category in dataset as part of an Sklearn Pipeline 2019-07-18T14:42:28.507

2 Could I add a one hot encoding to each feature representing "has data" versus "has no data" 2019-11-18T20:41:35.280

2 Different encoders applied to a dataset 2020-01-03T19:09:22.397

2 Magnifying or reducing the size of input groups into a neural network 2020-01-22T20:30:15.880

2 Memory efficient encoding logic for group categories 2020-02-14T13:18:54.363

2 Explanation about i//2 in positional encoding in tensorflow tutorial about transformers 2020-08-08T22:29:26.477

2 Choose-many categorical features: alternatives to one-hot encoding? 2020-11-21T01:32:51.847

1 The larger an encoding dimension in NLP the better? 2017-03-08T13:53:21.987

1 Faced problem while applying OneHotEncoder 2018-01-14T09:10:39.627

1 Differences of get_dummies and labelbinarizer? 2018-01-28T02:19:24.000

1 How to pre-process frequency of a series of signals? 2018-05-28T22:15:29.990

1 What happens if I do not encode the labels or classifiers in the data? 2018-09-14T18:00:21.077

1 One Hot Encoding of Age 2018-12-03T14:59:36.187

1 Encoding multiple observations from the same feature space 2019-01-10T12:22:19.750

1 Scaling label encoded values for Linear Algorithms 2019-01-15T05:56:45.067

1 How to encode H3 geohash in regression model 2019-02-13T18:05:45.160

1 Is it right to impute Train and Test set? 2019-05-20T17:53:24.693

1 String handling by OneHotEncoder 2019-08-02T10:31:40.897

1 Encode missing data and unseen data 2019-09-12T11:05:10.000

1 Does the predict function in machine learning understand categorical data 2019-11-06T21:11:46.150

1 What is the advantage of positional encoding over one hot encoding in a transformer model? 2019-11-12T05:49:57.037

1 Is it possible to know the output vectors of MLP Classifier of scikit learn? 2019-11-15T08:55:46.140

1 Validity of PU learning while using character-level encoding using CNNs for classifying text data 2019-11-29T06:34:24.243

1 Unsupervised encoding of categorical features 2019-12-11T10:26:06.230

1 How to encode a column containing both string and numbers 2020-02-05T04:43:08.843

1 LabelEncoder with a Multi-Layer Perceptron? 2020-02-10T23:11:23.883

1 Aggregating multiple encoded categorical values 2020-03-26T06:42:06.723

1 Encoding with OrdinalEncoder : how to give levels as user input? 2020-04-15T00:25:41.760

1 File path encoding to feature 2020-06-28T20:04:41.143

1 How to encode ordinal data before applying linear regression in STATA? 2020-07-29T05:21:55.010

1 is it better to correlate and encode or encode and correlate? 2020-09-12T17:58:52.357

1 How does the R implementation of RandomForest split nodes on categorical data? 2020-10-22T19:13:15.813

1 Encoding Tags for Random Forest 2020-11-16T10:24:29.077

1 Encoding Data for ML Modeling with Key Value Pair 2021-01-25T16:26:49.783

1 Multi-Feature One-Hot-Encoder with varying amount of feature instances 2021-01-29T11:54:16.557

0 Failure tolerant factor coding 2016-12-08T21:41:52.877

0 One hot encoding error "sort.list(y)..." 2018-01-09T21:33:24.960

0 Is it good practice to always remove highly correlated variables? 2018-03-01T12:03:44.437

0 How to use multiple encoders(one-hot and numerical) together for PCA 2018-07-02T09:29:37.013

0 How to transform dictionary data into a string vector? 2018-07-15T13:34:57.340

0 Should I use pandas get_dummies and create additional columns or use my own encoding code that keeps 1 column? 2018-10-05T15:11:27.707

0 How to deal with name strings in large data sets for ML? 2019-03-06T10:06:45.863

0 Binary Encoding of Ordinal Categories 2019-05-23T21:29:12.670

0 How does Byte Pair Encoding work? 2019-06-17T13:42:33.090

0 Pre-processing data to make predictions on deployed Sklearn model 2019-06-18T13:06:39.947

0 String indices must be integers 2019-06-23T11:20:34.433

0 Is random forest a kind of spatial feature encoding? 2019-07-15T10:18:22.770

0 Target encoding before split on skewed data 2019-11-07T10:55:00.903

0 Frequency/Count encoding 2019-11-07T13:45:48.827

0 Encoding Categorical Data Without Increasing the Dimension 2019-11-24T00:47:29.360

0 Binary Classification - One Hot Encoding preventing me using Test Set 2019-11-24T10:38:21.277

0 splitting into train test by train_test_split of float values? 2020-01-06T05:03:09.237