What to do when feature engineering and parameter tuning don't improve on base model performance



I've been working on the Titanic Kaggle competition using LogisticRegression from scikit-learn.

I've found something interesting: no amount of feature engineering or parameter tuning changes my base model's score by more than a percentage point in either direction.

At this point I'm completely self-taught, so I figure one of two things is happening (maybe both):

1. I'm doing logistic regression all wrong.

2. Logistic regression isn't the right choice for the problem.

Is either of these true? My notebooks are below. They're a bit long, but if you're bored during quarantine, I'd appreciate feedback:

Notebook 1: cleaning, feature engineering, and comparing base models against one another.

Notebook 2: hyper-parameter tuning with GridSearchCV.
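A minimal sketch of the two checks the question implies (the actual notebooks aren't shown here, so synthetic data stands in for the cleaned Titanic features): whether GridSearchCV tuning moves LogisticRegression's cross-validated score at all, and whether a more flexible model beats it on the same splits.

```python
# Hedged sketch: synthetic data is a hypothetical stand-in for the real
# cleaned/engineered Titanic features from notebook 1.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=800, n_features=8, random_state=0)

# Baseline: untuned logistic regression, scored with 5-fold CV.
base = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
base_score = cross_val_score(base, X, y, cv=5).mean()

# Check 1: tune the regularization strength C on the same pipeline and splits.
search = GridSearchCV(base, {"logisticregression__C": [0.01, 0.1, 1, 10, 100]}, cv=5)
search.fit(X, y)

# Check 2: compare against a nonlinear model on the same data.
forest_score = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5).mean()

print(f"baseline LR CV accuracy: {base_score:.3f}")
print(f"tuned LR    CV accuracy: {search.best_score_:.3f}")
print(f"forest      CV accuracy: {forest_score:.3f}")
```

If the tuned score barely moves and a forest on the same splits isn't clearly better either, the model choice probably isn't the bottleneck; the features (or the problem's ceiling) are.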


Posted 2020-04-05T23:22:42.943

Reputation: 279

This is not a question (no question mark at all is usually a good hint that you should reformat the question). Good questions could be:

  • "I have done multiplication and division of features; how could I feature-engineer further to improve performance?"
  • "Is logistic regression a good choice when I have [describe your data]?"

– Rusoiba – 2020-04-06T22:53:35.067

@Rusoiba, despite it being obvious that I am asking for feedback on my process (as you clearly showed in your comment), I have added a question mark to serve the ever pedantic gods of StackExchange, may they live forever. – rocksNwaves – 2020-04-07T00:16:22.353

No answers