Tag: bert

26 What is GELU activation? 2019-04-18T08:06:24.200

21 How to get sentence embedding using BERT? 2019-11-04T15:22:32.240

19 BERT vs Word2VEC: Is bert disambiguating the meaning of the word vector? 2019-06-21T16:25:16.710

15 Can BERT do the next-word-predict task? 2019-02-28T08:37:42.190

11 What is the purpose of the [CLS] token and why is its encoding output important? 2020-01-09T17:20:10.963

8 Bert Fine Tuning with additional features 2019-03-05T02:57:48.780

8 Preprocessing for Text Classification in Transformer Models (BERT variants) 2019-11-08T06:28:48.750

8 What are the good parameter ranges for BERT hyperparameters while finetuning it on a very small dataset? 2019-12-10T18:31:43.253

7 what is the first input to the decoder in a transformer model? 2019-05-11T08:36:07.907

7 BERT: is it possible to use it for topic modeling? 2019-06-05T17:07:41.717

7 Why is word prediction an obsession in Natural Language Processing? 2019-10-16T14:52:38.063

6 How is WordPiece tokenization helpful to effectively deal with rare words problem in NLP? 2019-03-27T16:54:59.150

6 What is whole word masking in the recent BERT model? 2019-06-15T23:13:57.290

6 Is BERT a language model? 2020-05-13T12:22:22.470

5 meaning of fine-tuning in nlp task 2019-05-27T15:48:21.827

5 BertPunc (punctuation restoration with BERT) 2020-02-19T23:06:57.957

5 Does BERT have any advantage over GPT3? 2020-09-12T04:37:50.197

4 Calculating cosine similarity between 3D arrays using Python 2019-06-18T10:36:27.590

4 How to add a CNN layer on top of BERT? 2019-06-24T21:26:21.947

4 How to load the pre-trained BERT model from local/colab directory? 2019-12-06T10:59:04.410

4 Bert for QuestionAnswering input exceeds 512 2020-09-14T12:59:36.467

4 how to run bert's pretrained model word embeddings faster? 2020-12-28T10:31:24.170

4 Is a BiLSTM layer required if we use BERT? 2021-01-06T07:05:20.717

3 How Transformer is Bidirectional - Machine Learning 2019-03-16T07:12:36.477

3 Bert: fine-tuning the entire pre-trained model end-to-end vs using contextual token vector 2019-05-27T17:01:51.937

3 Structuring a LSTM Layer 2019-08-19T17:53:22.890

3 Similarity of words using BERTMODEL 2019-11-15T17:06:44.160

3 NLP Transformers: How to get a fixed sentences embedding vectors size? 2019-11-25T15:13:00.457

3 where to store embeddings for similarity search? 2019-11-26T10:30:11.090

3 Why is the decoder not a part of BERT architecture? 2019-12-21T17:09:07.040

3 Multilingual BERT sentence vector captures language used more than meaning - working as intended? 2020-01-07T07:29:00.080

3 BERT word embeddings for finding word definition 2020-02-05T22:30:01.350

3 What are the elements in a BERT word embedding? 2020-02-11T19:10:10.593

3 BERT in production 2020-02-27T17:16:12.383

3 How to generate a sentence with exactly N words? 2020-03-03T09:30:27.143

3 What should be the labels for subword tokens in BERT for NER task? 2020-03-13T13:32:57.290

3 German Chatbot or conversational AI 2020-05-30T10:55:41.627

3 Are there any objections to using the same (unlabelled) data for pre-training of a BERT-Based model and the downstream task? 2020-08-11T04:16:09.293

3 Trained BERT models perform unpredictably on test set 2020-12-11T10:23:51.530

3 Detecting grammatical errors with BERT 2021-01-06T09:48:55.037

2 Incrementally Train BERT with minimum QnA records - to get improved results 2019-03-16T10:30:21.220

2 Paragraph Generator using BERT or GPT 2019-03-20T16:22:00.647

2 What is the use of [SEP] in the BERT paper? 2019-05-07T04:53:18.680

2 bert-as-service maximum sequence length 2019-05-09T18:06:36.927

2 How to classify neutral sentiments using BERT 2019-06-11T05:27:34.530

2 How can I add custom numerical features for training to BERT fine tuning? 2019-07-02T06:41:19.763

2 Predicting word from a set of words 2019-07-02T11:27:51.977

2 User profiling based on multiple posts 2019-12-30T10:08:29.337

2 Semantic text similarity using BERT 2020-02-14T12:01:07.367

2 Can I fine-tune BERT, ELMO or XLnet for Seq2Seq neural machine translation? 2020-02-24T08:40:38.953

2 Remove subwords from BERT output 2020-03-15T19:44:58.923

2 Does BERT use GLoVE? 2020-04-28T21:23:47.850

2 Can we use BERT for only word embedding and then use SVM/RNN to do intent classification? 2020-08-04T13:08:47.303

2 NLP SBert (Bert) for answer comparison STS 2020-08-20T00:19:19.017

2 How should I use BERT embeddings for clustering (as opposed to fine-tuning BERT model for a supervised task) 2020-08-21T02:00:07.510

2 What GPU size do I need to fine tune BERT base cased? 2020-08-26T13:48:40.433

2 Loss first decreases and then increases 2020-08-29T09:30:52.917

2 How does a pre-trained BERT model generate word embeddings for out-of-vocabulary words? 2020-11-17T19:34:13.260

2 What is the difference between BERT architecture and vanilla Transformer architecture 2020-11-30T03:34:44.230

2 Medical NER for French language 2021-02-11T03:10:55.093

1 BERT has a non deterministic behaviour 2019-04-17T07:57:15.443

1 Get long answers from BERT 2019-06-14T08:27:56.880

1 BERT: text classification and feature extraction 2019-07-31T02:38:31.660

1 Can BERT embeddings be used to reproduce the original content of the text? 2019-10-21T15:38:14.290

1 BERT Model Evaluation Measure in terms of Syntax Correctness and Semantic Coherence 2019-11-14T03:29:42.570

1 BERT for non-textual sequence data 2019-11-14T08:55:22.413

1 Weight matrices in transformers 2019-12-05T10:34:50.910

1 How can I feed BERT to neural machine translation? 2019-12-06T11:50:33.677

1 Measuring quality of answers from QnA systems 2019-12-21T15:21:19.663

1 fine tune BERT in a small GPU 2019-12-29T21:39:53.723

1 Predicting Missing Word in Text 2020-01-09T17:07:08.743

1 How do BERT and GPT-2 encodings deal with tokens such as <|startoftext|>, <s>? 2020-01-13T10:07:32.993

1 What is a 'hidden state' in BERT output? 2020-01-20T22:00:44.913

1 Multimodal end-to-end deep learning 2020-02-18T02:48:21.767

1 Generating synonyms or similar words from multiples word embeddings 2020-03-05T12:14:52.527

1 Fastest way for 1 vs all lookup on embeddings 2020-03-15T15:29:51.433

1 Difference between using BERT as a 'feature extractor' and fine tuning BERT with its layers fixed 2020-03-26T10:11:35.070

1 How can I tokenize a text file with BERT or something similar? 2020-04-05T14:48:44.393

1 Can BERT be used for predicting words? 2020-04-16T09:03:52.267

1 Training PCA on BERT word embedding: entire training dataset or each document? 2020-04-18T10:30:40.883

1 System Requirement to train BERT model 2020-04-19T05:51:39.783

1 How to convert subword PPL to word level PPL? 2020-04-29T20:09:43.183

1 Implementation of BERT using Tensorflow vs PyTorch 2020-05-07T18:57:06.730

1 Using BERT for input embeddings in a seq2seq model 2020-05-21T14:27:11.570

1 TensorFlow1.15, multi-GPU-1-machine, how to set batch_size? 2020-06-01T05:23:51.590

1 Next sentence prediction in RoBERTa 2020-06-29T20:55:34.947

1 What are the merges and vocab files used for in BERT-based models? 2020-06-30T20:40:39.060

1 Imbalanced Dataset (Transformers): How to Decide on Class Weights? 2020-07-21T11:38:25.123

1 Problem of continuous training - Supervised learning 2020-08-06T10:07:55.177

1 Splitting into multiple heads -- multihead self attention 2020-08-22T16:19:20.303

1 Can we use sentence transformers to embed sentences without labels? 2020-08-25T14:39:45.697

1 Does finetuning BERT involve updating all of the parameters or just the final classification layer? 2020-09-04T20:54:25.300

1 Question about BERT embeddings with high cosine similarity 2020-09-10T15:13:03.027

1 If I use BERT embeddings and cosine(sent1, sent2) > 0.9, is it fair to assume sent1 and sent2 are similar? 2020-10-12T13:16:08.563

1 How to apply pruning on a BERT model? 2020-10-20T12:20:10.860

1 NLP BERT model to calculate text similarity, same sentence but not close similarity 2020-10-23T09:23:07.763

1 Can I fine-tune the BERT on a dissimilar/unrelated task? 2020-10-30T07:20:30.487

1 From where does BERT get the tokens it predicts? 2020-11-16T19:00:50.743

1 How do I handle class imbalance for text data when using pretrained models like BERT? 2020-12-31T14:09:09.513