I need to apply around 1.6k regular expressions, such as the pair shown below.
I also have around 7k documents (about half a page long each on average) that need to be matched against these regular expressions.
Right now I am using
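library(rebus)
library(stringr)

# Combine the individual patterns into a single alternation.
# or1() takes one character vector of patterns.
regex_exp <- rebus::or1(c(
  "(?i-mx:\\b(?:actroid\\b))",
  "(?i-mx:\\b(?:robot\\w*\\b))"
))

# Wrap the alternation in word boundaries
regex_exp <- BOUNDARY %R% regex_exp %R% BOUNDARY

# Extract every match from one document
stringr::str_extract_all(
  "This is my text talking about technology, but also about the actroid",
  regex_exp
)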
to find matches, but it takes approx. 3.5 minutes per file, which of course does not scale.
Is there a more efficient library/method for matching regular expressions in R? I also do not know whether using reticulate to do the matching in Python and return the results to R would be faster.
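For concreteness, here is a minimal sketch of the kind of batched approach the question is about: all patterns joined into one alternation and applied to a whole vector of documents in a single call. The names `patterns`, `docs`, `combined`, and `matches` are hypothetical, the two patterns are simplified stand-ins for the real list, and this has not been benchmarked against the 1.6k patterns or 7k documents.

```r
library(stringr)

# Hypothetical stand-ins for the ~1.6k patterns and ~7k documents.
patterns <- c("\\bactroid\\b", "\\brobot\\w*\\b")
docs <- c(
  "This is my text talking about technology, but also about the actroid",
  "Robots and robotics are discussed here"
)

# Join the patterns into one alternation so each document is scanned once;
# regex(ignore_case = TRUE) replaces the inline (?i-mx:...) flags.
combined <- regex(
  str_c("(?:", patterns, ")", collapse = "|"),
  ignore_case = TRUE
)

# str_extract_all() is vectorised over the input strings:
# one call returns a list with one character vector per document.
matches <- str_extract_all(docs, combined)
```

Whether this is actually faster than the per-file call depends on the real patterns, so it would need timing on a sample of the documents.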