How can I add weights in a bag-of-words model in text analysis?

1

2

I have built a Twitter sentiment analysis model using a bag-of-words approach on the training set. Now I want to add weights to certain words so that they are considered more important than others.

karan sindwani

Posted 2016-05-03T06:50:14.997

Reputation: 45

The answer to this question depends on what model you are using for your sentiment analysis algorithm. So what model are you using? Naive Bayes, LogReg, Recurrent Net? – Armen Aghajanyan – 2016-05-03T20:20:44.763

I am using Naive Bayes – karan sindwani – 2016-05-04T05:30:15.083

Answers

2

One possible solution is to introduce prior counts for words (higher counts for more important words) and add them to the term-document matrix.
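As a rough illustration of the prior-count idea with Naive Bayes, here is a minimal standard-library sketch. It is one possible reading of the suggestion, not a definitive implementation: the function names, the `extra_counts` dictionary, the whitespace tokenizer, and the toy tweets are all assumptions made for the example. Extra counts are added to a document's row only when the word actually occurs in that document, and the classifier is then trained on the boosted counts with Laplace smoothing.

```python
import math
from collections import Counter

def term_counts(text, extra_counts):
    """Count tokens in one document, then add extra counts for important words it contains."""
    counts = Counter(text.lower().split())
    for word, extra in extra_counts.items():
        if word in counts:
            counts[word] += extra
    return counts

def train_naive_bayes(docs, labels, extra_counts, alpha=1.0):
    """Fit a multinomial Naive Bayes model on the boosted term-document counts."""
    class_word_counts = {}
    class_doc_counts = Counter(labels)
    vocab = set()
    for text, label in zip(docs, labels):
        counts = term_counts(text, extra_counts)
        class_word_counts.setdefault(label, Counter()).update(counts)
        vocab.update(counts)
    model = {}
    for label, word_counts in class_word_counts.items():
        # Laplace-smoothed denominator: all counts in the class plus alpha per vocabulary word.
        total = sum(word_counts.values()) + alpha * len(vocab)
        model[label] = {
            "log_prior": math.log(class_doc_counts[label] / len(docs)),
            "log_lik": {w: math.log((word_counts[w] + alpha) / total) for w in vocab},
        }
    return model

def predict(model, text):
    """Return the class with the highest log posterior; out-of-vocabulary words are ignored."""
    tokens = text.lower().split()
    scores = {
        label: params["log_prior"] + sum(params["log_lik"].get(t, 0.0) for t in tokens)
        for label, params in model.items()
    }
    return max(scores, key=scores.get)

# Hypothetical toy data: "love" and "hate" are treated as the important words.
docs = ["i love this phone", "what a great day", "i hate this phone", "what an awful day"]
labels = ["pos", "pos", "neg", "neg"]
model = train_naive_bayes(docs, labels, extra_counts={"love": 3, "hate": 3})
print(predict(model, "love this day"))
```

Because the boost is applied only where the word appears, it raises that word's likelihood in the class it is associated with in the training data, which is what makes it more influential at prediction time.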

An alternative solution is to compute tf-idf features (weights that rescale word counts based on how frequently each word occurs across documents) and then apply an additional weighting on top of tf-idf, with higher weights for important words.
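The tf-idf variant could look something like the sketch below. It assumes whitespace tokenization and the common `log(N / df) + 1` form of idf (other variants exist). The `word_weights` dictionary and the function name are hypothetical names introduced for the example. Each tf-idf value is multiplied by the weight of its word, and words without an entry keep a weight of 1.0.

```python
import math
from collections import Counter

def weighted_tfidf(docs, word_weights=None):
    """Return (vocab, rows) where rows[i][j] = tf * idf * user weight for word j in doc i."""
    word_weights = word_weights or {}
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for tokens in tokenized for t in tokens})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word at least once.
    df = Counter(t for tokens in tokenized for t in set(tokens))
    idf = {t: math.log(n_docs / df[t]) + 1.0 for t in vocab}
    rows = []
    for tokens in tokenized:
        tf = Counter(tokens)
        rows.append([tf[t] * idf[t] * word_weights.get(t, 1.0) for t in vocab])
    return vocab, rows

# Hypothetical example: triple the influence of "love" and "hate".
docs = ["i love this phone", "i hate this phone", "what a great day"]
vocab, rows = weighted_tfidf(docs, word_weights={"love": 3.0, "hate": 3.0})
for word, value in zip(vocab, rows[0]):
    print(f"{word}: {value:.3f}")
```

The resulting rows can be used as features in place of raw counts. Note that they are real-valued rather than integer counts, so whether they suit a given Naive Bayes implementation should be checked.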

Vadim Smolyakov

Posted 2016-05-03T06:50:14.997

Reputation: 586

1

If you are trying to give more weight to rare or infrequent terms, which appear in only a few texts, you should definitely use the tf-idf technique. It computes how frequently each word occurs across the whole data set and then derives a weight for each word in each text.
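To see why this favors rare terms, here is a small sketch of the inverse document frequency part alone. It assumes the plain `log(N / df)` definition and whitespace tokenization, and the function name and sample texts are made up for the example.

```python
import math

def inverse_document_frequency(docs):
    """Map each word to log(N / df), where df is the number of documents containing it."""
    token_sets = [set(d.lower().split()) for d in docs]
    vocab = set().union(*token_sets)
    n_docs = len(docs)
    return {t: math.log(n_docs / sum(t in tokens for tokens in token_sets)) for t in vocab}

# Hypothetical texts: "the" is in every text, "great" in two, "awful" in one.
docs = ["the phone is great", "the day is awful", "the battery is great"]
idf = inverse_document_frequency(docs)
for word in ("the", "great", "awful"):
    print(word, round(idf[word], 3))
```

A word that appears in every text gets an idf of zero, and the fewer texts a word appears in, the larger its weight becomes.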

If instead you want to add weights to specific words, you can simply modify the tf-idf weights for those words.

Federico Caccia

Posted 2016-05-03T06:50:14.997

Reputation: 660