Tag: language-model

38 What is the difference between model hyperparameters and model parameters? 2016-09-24T11:24:50.800

14 Are there any good out-of-the-box language models for python? 2018-09-20T13:34:22.520

11 Word2Vec embeddings with TF-IDF 2018-03-04T12:07:33.313

11 What is the purpose of the [CLS] token and why is its encoding output important? 2020-01-09T17:20:10.963

10 What are generative and discriminative models? How are they used in Natural Language Processing? 2014-05-18T06:17:37.587

10 How to create a good list of stopwords 2015-05-24T21:45:02.207

6 What is whole word masking in the recent BERT model? 2019-06-15T23:13:57.290

6 Is BERT a language model? 2020-05-13T12:22:22.470

5 Can finite state machines be encoded as input/output for a neural network? 2016-07-21T09:48:54.850

4 Improve CoreNLP POS tagger and NER tagger? 2014-09-11T17:09:52.313

4 Words to numbers faster lookup 2017-01-16T17:04:05.850

4 How do we pass data to a RNN? 2018-02-18T01:26:48.557

4 How should I treat these non-English documents in the NLP task? 2019-04-29T21:43:59.923

3 How does Alexa utterance parsing work? 2016-03-06T18:33:58.750

3 How much text data is required for a meaningful use of word2vec 2016-07-27T15:38:08.720

3 What tools are available for programming language parsing for ML? 2017-01-29T11:34:11.150

3 LSTM training/prediction with no starting sequence 2018-08-04T20:06:46.823

3 Fine tune gpt2 via huggingface API for domain specific LM 2019-12-28T11:10:10.223

3 What are the elements in a BERT word embedding? 2020-02-11T19:10:10.593

3 In smoothing of n-gram model in NLP, why don't we consider start and end of sentence tokens? 2020-10-05T10:47:45.950

3 How to programmatically introduce grammatical errors in sentences 2021-01-14T09:45:41.063

2 Neural Networks for Predictive typing 2015-12-30T16:11:20.963

2 Stanford NER Training - Assign weight to each word 2016-03-28T13:43:29.393

2 In plain English, how to describe i/o of the TensorFlow for language modelling? 2016-05-13T18:58:48.893

2 Fasttext exception error 2018-04-12T06:43:21.537

2 Build an Autocomplete model for document titles 2018-12-29T07:50:34.140

2 What does 'Linear regularities among words' mean? 2019-03-04T15:34:12.543

2 How to prepare the data for text generation task 2019-03-23T00:43:54.160

2 Any good implementations of Bi-LSTM Bahdanau attention in Keras? 2019-12-02T21:22:22.810

2 Evaluating Language Model on specific topic 2020-10-02T16:56:25.680

1 Given one language ngram model, how do I compare likelihoods of two texts of different length? 2016-10-30T22:28:13.740

1 Hidden Markov Models: Linking states to labels after EM training 2017-04-02T10:29:23.190

1 Word2Vec, softmax function 2018-03-02T20:33:17.210

1 How is maximizing L(lambda1, lambda2, lambda3) equivalent to minimizing perplexity? 2019-02-07T04:49:52.050

1 The principle of LM deep model 2019-03-22T09:35:50.487

1 Why is MLP working similar to RNN for text generation 2019-03-28T17:46:13.700

1 Why Heaps' Law Equation looks so different in this NLP course? 2019-04-09T15:00:00.133

1 Skip-gram trained on The Hobbit: no improvement in the similarity of the word representation 2019-09-04T14:52:51.850

1 Language modelling for Spell Checker 2019-10-17T05:32:52.107

1 BERT Model Evaluation Measure in terms of Syntax Correctness and Semantic Coherence 2019-11-14T03:29:42.570

1 Transfer learning between Language Model and classification 2019-11-25T11:29:15.457

1 The differences between BNF and JSGF in NLP? 2019-12-28T15:45:24.550

1 How to convert subword PPL to word level PPL? 2020-04-29T20:09:43.183

1 Comparing Language Model of two corpora 2020-04-30T20:13:40.567

1 Generate text using user-supplied keywords 2020-05-23T17:35:17.547

1 Effect of discounting parameter on Language Model Perplexity 2020-07-10T17:45:15.427

1 When using padding in sequence models, is Keras validation accuracy valid/reliable? 2020-07-19T19:47:48.377

1 State-of-the-art Python packages that can evaluate language similarity 2020-08-22T03:34:01.423

1 For an n-Gram model with n>2, do we need more context at end of each sentence? 2020-10-05T15:06:24.877

1 Can I fine-tune the BERT on a dissimilar/unrelated task? 2020-10-30T07:20:30.487

1 From where does BERT get the tokens it predicts? 2020-11-16T19:00:50.743

1 Inference order in BERT masking task 2020-12-31T20:33:17.627

1 What's the motivation behind BERT masking 2 words in a sentence? 2021-01-24T20:03:49.967

1 Would there be any reason to pretrain BERT on specific texts? 2021-02-07T20:21:47.693

0 Diminishing returns in language identification data set size? 2016-11-12T23:53:47.347

0 NLP - extract sentence parts related to people 2018-06-12T11:57:36.197

0 How to feed data for ngram model? 2019-08-02T14:39:40.737

0 How to calculate perplexity in PyTorch? 2019-12-22T10:30:12.353

0 Feeding XLM-R embeddings to neural machine translation? 2020-03-18T08:19:33.093

0 How to find out what each of the layers in NN does? 2020-03-19T13:45:15.897

0 How can I make a whitespace tokenizer and use it to build a language model from scratch using transformers 2020-04-14T03:40:29.233

0 Dirichlet smoothing as an IDF component 2020-05-26T11:43:20.400

0 Why does English ELMo model give embeddings for non-English words? 2020-06-25T10:21:52.013

0 How to: Plot global mean precipitation data from the NDAA onto map regions? 2020-09-25T01:35:41.100

0 What is the difference between GPT blocks and Transformer Decoder blocks? 2020-11-16T09:54:05.917

0 Help understanding input to biaxial network for generating music 2020-11-18T01:55:07.773

0 Optimal input setup for character-level text classification RNN 2020-11-29T10:01:00.423

0 Where does the evaluation speed advantage of Transformer-XL come from? 2020-12-02T22:56:58.850

0 Where does BERT fit in the Machine Learning Hierarchy? 2020-12-04T06:36:56.273

0 BERT uses WordPiece, RoBERTa uses BPE 2020-12-11T19:10:22.927

0 Multilingual alternatives for med7 2021-02-23T10:00:14.840