## Similar algorithm to apriori to find unpopular sequential patterns

I am working with a dataset that looks similar to this one, but is much larger (approx. 30,000 arrays):

sequences = [["car", "house", "bike", "train"],
             ["car", "house", "bike"],
             ["apartment", "train", "car"],
             ["building", "flower", "bike", "train"], ...]


It is a bit difficult to explain, but some items in the data occur very often and some don't. Nevertheless, the items that occur less often appear in sequences that I want to find as well. So in this case I would like to find all transportation options that occur together, but I also want to find all housing types that occur together, or all plants, etc. I know that plants and buildings don't occur very often in general, so the support values of these itemsets in apriori are penalized. All my results end up containing transportation options, because their support values are higher, as they occur often in the dataset.
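To make the effect concrete, here is a minimal sketch in plain Python (not MLxtend) that computes the support of every item pair on the four example sequences above. The helper name `pair_supports` is made up for illustration, and order within a sequence is ignored, as it is in apriori.

```python
from collections import Counter
from itertools import combinations

sequences = [
    ["car", "house", "bike", "train"],
    ["car", "house", "bike"],
    ["apartment", "train", "car"],
    ["building", "flower", "bike", "train"],
]


def pair_supports(transactions):
    """Return {pair: fraction of transactions that contain both items}."""
    counts = Counter()
    for t in transactions:
        # Sort so each pair has one canonical key, e.g. ("bike", "train").
        counts.update(combinations(sorted(set(t)), 2))
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items()}


supports = pair_supports(sequences)
print(supports[("bike", "train")])       # 0.5
print(supports[("building", "flower")])  # 0.25
```

Even in this tiny sample, the transportation pair has twice the support of the building/plant pair, so any single global threshold favours it.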

Currently I am using MLxtend's apriori, but I would be glad to hear of algorithms that do not penalize items that occur less often in the dataset.

Is there some library/algorithm I could use to solve this problem?

Have you tried lowering the min_support parameter? Apriori prunes the result tree based on the minimum support threshold. Any similar method will put high-frequency results first, so you are bound to run into the same problem with your data. – Vlad_Z – 2020-06-27T06:47:31.460

Hi @Vlad_Z, yes, I have lowered it to approx. 0.007, which resulted in extraordinarily high memory usage and basically the same results, but with larger filtered sequences. So that didn't help either. Thanks for the answer anyway! So maybe the way I am trying to solve the problem is not correct? – Eve Edomenko – 2020-06-29T09:35:00.100

Yes, at support rates this low you should be using a different algorithm, such as Apriori-Inverse (DOI: 10.1007/11430919_13). I will try to provide a more useful answer when/if possible. – Vlad_Z – 2020-06-29T10:42:57.603
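For readers who want a feel for the idea, below is a rough sketch of an inverse-support search as I understand it: keep only items whose support is below a maximum threshold, then look for itemsets made up solely of those items. This is an assumption-laden illustration, not the published Apriori-Inverse algorithm and not part of MLxtend. The function name `rare_itemsets` and the parameters `max_support` and `min_count` are hypothetical, and the sketch counts combinations directly instead of doing proper candidate generation.

```python
from collections import Counter
from itertools import combinations

sequences = [
    ["car", "house", "bike", "train"],
    ["car", "house", "bike"],
    ["apartment", "train", "car"],
    ["building", "flower", "bike", "train"],
]


def rare_itemsets(transactions, max_support, min_count=1):
    """Find itemsets built only from items with support below max_support.

    Hypothetical helper. Returns {frozenset(items): support}.
    """
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in set(t))
    # Keep items that are rare, but still seen at least min_count times.
    rare = {
        item
        for item, c in item_counts.items()
        if c >= min_count and c / n < max_support
    }
    # Drop the frequent items from every transaction.
    projected = [sorted(set(t) & rare) for t in transactions]

    result = {}
    k = 1
    while True:
        counts = Counter()
        for t in projected:
            counts.update(combinations(t, k))
        level = {frozenset(c): v / n for c, v in counts.items() if v >= min_count}
        if not level:
            # Support is anti-monotone, so larger itemsets cannot qualify.
            break
        result.update(level)
        k += 1
    return result


found = rare_itemsets(sequences, max_support=0.5)
for itemset, support in found.items():
    print(sorted(itemset), support)
```

On the four example sequences with `max_support=0.5`, the frequent transportation items are filtered out before any counting happens, so the building/flower pair is no longer crowded out. On the real data, both thresholds would need tuning.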

Thank you very much @Vlad_Z, this is a great help for me – Eve Edomenko – 2020-06-29T12:47:02.533