I have a text dataset for multiclass classification. The training set contains 120,000 samples. Extracting features with scikit-learn's TfidfVectorizer gives about 80,000 words as features, so when I train a Multinomial Naive Bayes classifier on the Kaggle platform it exceeds the RAM limit. I now want to train the model on small batches of 20,000 samples each and combine the results. How can I do this? Instead of training on the whole dataset at once, I want to use only a small chunk of the dataset at a time. Also, I want to use a classical ML model, not a deep learning or neural network model, and specifically Multinomial Naive Bayes as the classification algorithm.
Note: I have already preprocessed the data (stopword removal, punctuation removal, and lemmatization) before applying the feature extraction method.
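To show what I have in mind, here is a minimal sketch of the batch-wise training I'm describing, on toy data. It uses `MultinomialNB.partial_fit` together with `HashingVectorizer` instead of `TfidfVectorizer` (my assumption: a stateless vectorizer is needed so each batch can be transformed independently without holding the full vocabulary in memory; `alternate_sign=False` keeps features non-negative, which MultinomialNB requires). The variable names and batch size here are placeholders, not my real data.

```python
# Sketch: incremental training of MultinomialNB in batches via partial_fit.
# Assumes lists `texts` and `labels`; toy stand-ins for the real
# 120,000-document corpus are used below.
import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["good movie", "bad film", "great plot", "terrible acting"] * 100
labels = [1, 0, 1, 0] * 100

# HashingVectorizer is stateless (no fit step, no stored vocabulary),
# so batches can be transformed one at a time.
vectorizer = HashingVectorizer(n_features=2**18, alternate_sign=False)
clf = MultinomialNB()
classes = np.unique(labels)  # partial_fit needs all classes up front

batch_size = 50  # would be 20,000 on the full dataset
for start in range(0, len(texts), batch_size):
    batch_texts = texts[start:start + batch_size]
    batch_labels = labels[start:start + batch_size]
    X = vectorizer.transform(batch_texts)  # only one batch in RAM
    clf.partial_fit(X, batch_labels, classes=classes)

print(clf.predict(vectorizer.transform(["good plot"])))
```

Is this the right approach, or is there a way to keep TF-IDF weighting while still training in chunks?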