Advanced frequent pattern identification


I have some dictionary data like this:

Tag | Definition
Noun | a man who works in a restaurant, serving people food and drink
Noun | a person who does a specified type of work or who works in a specified way
Noun | a person who habitually seeks to harm or intimidate those whom they perceive as vulnerable.
Adj | relating to meaning in language or logic.
Adj | of or like snow, especially in being pure white
Adj | of or according to syntax.

My task is to predict the appropriate 'Tag' for a given 'Definition'.

There are obviously some patterns that we use when defining a noun and different patterns when defining an adjective. I would like an algorithm or method for finding these patterns. I tried machine learning (e.g., Naive Bayes, SVM, Logistic Regression, MLP), but because most of my samples are 'Noun', I could not reach good accuracy. The class counts are 5000 Nouns, 2000 Adjectives, 900 Verbs, and 900 Other.
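To put the imbalance in numbers, here is a minimal sketch (an illustration, not one of the experiments I ran). It computes the accuracy of a classifier that always answers 'Noun', plus inverse-frequency class weights. The weighting formula, n_samples / (n_classes * class_count), is the one scikit-learn applies for class_weight="balanced"; the variable names are my own.

```python
# Class counts as given in the question.
counts = {"Noun": 5000, "Adj": 2000, "Verb": 900, "Other": 900}

n_samples = sum(counts.values())
n_classes = len(counts)

# Accuracy of a trivial model that always predicts the largest class.
majority_label = max(counts, key=counts.get)
baseline_accuracy = counts[majority_label] / n_samples

# Inverse-frequency ("balanced") weights: rare classes get larger weights.
class_weights = {
    label: n_samples / (n_classes * count) for label, count in counts.items()
}

print(f"majority baseline ({majority_label}): {baseline_accuracy:.3f}")
for label, weight in class_weights.items():
    print(f"{label}: weight {weight:.2f}")
```

Any accuracy figure should be read against that baseline, and a per-class metric such as macro-averaged F1 is probably a fairer measure here than plain accuracy.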

I am thinking of using another algorithm that does not require a lot of data or that can handle imbalanced classes. I have looked at the FP-Growth and Apriori algorithms, but I am not sure whether they can handle advanced patterns like this:

A person who does X to Y.

In this pattern, some words are fixed keywords ('A', 'person', 'who', 'does', 'to'), while the others (X and Y) are placeholders that should be ignored. Is there an algorithm for handling this situation?

Note: My objective is to find frequent ordered sets, i.e., word sequences in which the order matters and gaps between the words are allowed.
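To make the goal concrete, here is a minimal sketch of the kind of mining I mean. Apriori and FP-Growth mine unordered itemsets, so this uses a simplified PrefixSpan-style sequential pattern search instead. It finds ordered token patterns that may skip words in between. The function name, the tokenizer, and the min_support and max_len parameters are all hypothetical choices for illustration, and it is run only on the three noun definitions shown above.

```python
import re
from collections import defaultdict


def tokenize(text):
    # Assumption: lowercase word tokens are enough for this illustration.
    return re.findall(r"[a-z']+", text.lower())


def frequent_ordered_patterns(sequences, min_support, max_len=5):
    """Return {pattern tuple: support}; order matters, gaps are allowed."""
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence index, start offset) pairs: the part of
        # each sequence that remains after matching the current prefix.
        seen_in = defaultdict(set)
        for seq_id, start in projected:
            for token in sequences[seq_id][start:]:
                seen_in[token].add(seq_id)
        for token, seq_ids in seen_in.items():
            if len(seq_ids) < min_support:
                continue
            pattern = prefix + (token,)
            results[pattern] = len(seq_ids)
            if len(pattern) < max_len:
                # Project each sequence past the first match of the token.
                next_projected = [
                    (seq_id, sequences[seq_id].index(token, start) + 1)
                    for seq_id, start in projected
                    if token in sequences[seq_id][start:]
                ]
                mine(pattern, next_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


noun_definitions = [
    "a man who works in a restaurant, serving people food and drink",
    "a person who does a specified type of work or who works in a specified way",
    "a person who habitually seeks to harm or intimidate those whom they "
    "perceive as vulnerable.",
]
sequences = [tokenize(d) for d in noun_definitions]
patterns = frequent_ordered_patterns(sequences, min_support=2)

for pattern, support in sorted(patterns.items(), key=lambda kv: -len(kv[0]))[:5]:
    print(support, " ".join(pattern))
```

On these three definitions the pattern ('a', 'person', 'who') is found with support 2 and ('a', 'who') with support 3, which is the "keywords with ignored gaps" behaviour described above. One possible next step, which I have not tested, would be to mine such patterns per tag and use them as features for a classifier.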

Arman Malekzadeh

Posted 2020-03-22T11:01:45.877

No answers