NLP: Getting the top 5 or top 10 predictions


I am working on a social networking application and I have to make its news feed better. For example: If someone searches for 'suggest me some good books', it should yield some names.

As a starting point, I have used InferSent so that my model can answer such questions.

Right now I get only the single best output my model predicts, e.g. 'Alchemist'.

I want at least 4 or 5 other outputs; in other words, the top five predictions.

I know that XGBoost can do something along these lines, but I am not sure how to apply it to my problem.

Any pointers?

My apologies, I cannot share any code but I would really appreciate ideas and suggestions.

Thank you,

Viphawee Wannarungsee

Posted 2019-12-10T10:12:05.680

Reputation: 1

Welcome to data science. Is it about news feed or search engine? – Piotr Rarus – 2019-12-10T11:42:02.937



XGBoost's classifier has a predict_proba method. It returns a probability for each class; sort them in descending order and take the top 5.
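A minimal sketch of the sort-and-take-top-5 step, using NumPy only. The probability matrix below is a made-up stand-in for what a fitted classifier's predict_proba would return (shape: n_samples x n_classes), and the helper name top_k_classes and the book titles are hypothetical, chosen for illustration.

```python
import numpy as np


def top_k_classes(proba, class_labels, k=5):
    """Return the k most probable labels (and their scores) for each row."""
    proba = np.asarray(proba, dtype=float)
    labels = np.asarray(class_labels)
    k = min(k, proba.shape[1])  # cannot return more classes than exist
    # Sort each row by descending probability and keep the first k column indices.
    top_idx = np.argsort(-proba, axis=1)[:, :k]
    top_scores = np.take_along_axis(proba, top_idx, axis=1)
    return labels[top_idx], top_scores


# Assumption: this array plays the role of clf.predict_proba(X) for one query,
# and class_labels plays the role of the classifier's class ordering.
proba = np.array([[0.04, 0.40, 0.10, 0.25, 0.15, 0.06]])
class_labels = ["Dune", "Alchemist", "Emma", "Sapiens", "Ulysses", "Walden"]

top_labels, top_scores = top_k_classes(proba, class_labels, k=5)
print(top_labels[0].tolist())
print(top_scores[0].tolist())
```

With a real model, the same helper could be fed the output of predict_proba directly, with the labels taken in the same column order the classifier uses.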

Piotr Rarus

Posted 2019-12-10T10:12:05.680

Reputation: 721

predict_proba is available in pretty much every classifier. It gives the best probability for an input. I want about 5 probabilities for the same input. – Viphawee Wannarungsee – 2019-12-11T03:43:11.033

Doesn’t predict_proba return probabilities for all output categories in multinomial problems? – Dave – 2020-10-11T14:36:15.050


Welcome to Data Science Stack Exchange. What you are probably looking for is the top 'k' items most similar to the search query.

I am going to assume that your input query is vectorized using the same vectorizer that was used to vectorize your catalog. Given these vectors, you could compute, for example, the cosine similarity between the query vector and every vector in the catalog, and return the top 'k' most similar items.
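A rough sketch of that top-'k' cosine similarity lookup, assuming the query and the catalog items have already been embedded into vectors of the same dimension. The toy 3-dimensional vectors, the titles, and the helper name top_k_similar are hypothetical; real sentence embeddings would be much higher-dimensional.

```python
import numpy as np


def top_k_similar(query_vec, catalog_vecs, k=5):
    """Return indices and cosine similarities of the k catalog rows closest to the query."""
    q = np.asarray(query_vec, dtype=float)
    catalog = np.asarray(catalog_vecs, dtype=float)
    # Normalize to unit length so a dot product equals cosine similarity.
    # Assumes no vector is all zeros.
    q_unit = q / np.linalg.norm(q)
    catalog_unit = catalog / np.linalg.norm(catalog, axis=1, keepdims=True)
    sims = catalog_unit @ q_unit
    k = min(k, len(sims))
    idx = np.argsort(-sims)[:k]  # highest similarity first
    return idx, sims[idx]


# Hypothetical catalog: one embedding per item, in the same order as the titles.
titles = ["Alchemist", "Sapiens", "Dune", "Emma", "Walden"]
catalog_vecs = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
]
query_vec = [1.0, 0.0, 0.0]  # stand-in for the embedded search query

idx, scores = top_k_similar(query_vec, catalog_vecs, k=3)
print([titles[i] for i in idx])
```

For a large catalog, the brute-force comparison against every item may become slow, in which case an approximate nearest-neighbour index is a common alternative.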

Let me know if this helps. I could elaborate more in case you have further questions.

Nischal Hp

Posted 2019-12-10T10:12:05.680

Reputation: 755