Too many inputs = overfitting?



First question: can I mix different types of inputs, for example height and age (my inputs are, of course, normalized)? In general, can we mix different types of inputs in a neural network?

Second question: can too many different inputs cause overfitting?

I am using 120 input neurons and 20,000 training examples, and I am overfitting at 53% accuracy (bad)...

Thank you.

Fang 1Gao

Posted 2018-06-24T23:09:22.277

Reputation: 31

What do you mean by "overfitting at 53%"? Do you mean your test accuracy is just 53% and your training accuracy is good? – ab123 – 2018-06-25T07:11:38.950

@ab123 I mean that at 53% training accuracy, my validation accuracy is also ~53%; but when training goes above 53%, my validation drops to 40%, etc... – Fang 1Gao – 2018-06-25T15:37:50.050



Yes, you can mix any different sorts of inputs as long as the scales of the features are similar, which is achieved by normalizing the feature vectors.
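As a minimal sketch of what that normalization looks like (the height/age numbers below are made up for illustration), per-column z-scoring brings features with very different units onto the same scale:

```python
import numpy as np

# Hypothetical mixed-type feature matrix: height (cm) and age (years).
X = np.array([[170.0, 25.0],
              [160.0, 40.0],
              [182.0, 31.0]])

# Z-score each column so both features share a similar scale.
mean = X.mean(axis=0)
std = X.std(axis=0)
X_norm = (X - mean) / std

print(X_norm.mean(axis=0))  # each column now has mean ~0
print(X_norm.std(axis=0))   # and standard deviation ~1
```

After this step, the network sees both features on comparable scales, so neither dominates the weight updates simply because of its units.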

I assume you mean too many features when you say "too much input".

If you mean the size of the input data (the number of training examples), that is not directly related to overfitting. Overfitting depends on model complexity: it happens when the model fits the noise in the training data and becomes so specific that it cannot generalize well to new data.

Any model that is "sufficiently" complex (e.g. one with many hidden layers, a large number of neurons in each layer, and unregularized weights) can easily converge to very little loss on the training data (unless it converges to a suboptimal local minimum), but will give poor accuracy on test data. Conversely, a lack of data often leads to overfitting, because the model tries to learn from very few, less diverse specimens. It's like showing a child a sample of balls containing only white and orange table-tennis balls, and then asking him/her to identify a blue ball.
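The point about unregularized weights can be sketched numerically. A minimal example (toy random data, squared-error loss) of how an L2 penalty makes the loss punish large weights, pushing the optimizer toward a simpler model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 50 samples, 10 features, random target.
X = rng.normal(size=(50, 10))
y = rng.normal(size=50)
w = rng.normal(size=10)

def loss(w, lam):
    # Mean squared error plus an L2 penalty; lam controls how strongly
    # large weights (i.e. model complexity) are punished.
    residual = X @ w - y
    return np.mean(residual ** 2) + lam * np.sum(w ** 2)

# With lam > 0 the same nonzero weights incur a strictly higher loss,
# so minimizing it favors smaller weights.
print(loss(w, 0.0) < loss(w, 0.1))  # True whenever w is nonzero
```

The same idea appears in practice as weight decay or an L2 kernel regularizer in neural-network libraries.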

Too many features can also lead to overfitting, because they increase model complexity. There is a greater chance of redundancy among the features, and of features that are not related to the prediction at all.

For example, if you're predicting the quality of a tennis ball, the colour of the ball is an irrelevant feature, but the network will still learn from the training examples. Since people like playing with yellow balls, they play with them more often and those balls don't last long, so there is a chance the network learns a spurious link between colour and quality.


Posted 2018-06-24T23:09:22.277

Reputation: 207


Based on my experience so far, having too many features as inputs to your NN tends to degrade performance (full disclaimer: I'm no expert, but people smarter than me have coined the term "the curse of dimensionality"). Here is a paragraph I took from a Medium post, Curse of dimensionality and feature reduction:

"The curse of dimensionality occurs because the sample density decreases exponentially with the increase of the dimensionality."

Good, now we know that having too many features (or a feature-to-sample ratio that is too high) is bad for our model's performance. What can we do to solve it?

Right now I can think of three ways:

  1. Feature Selection

  2. Feature Extraction

  3. Ensemble learning over different subsets of those features (yummmyyy :) )
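The first two options above can be sketched with plain NumPy. A minimal example, assuming synthetic data where only the first three of 120 features carry signal: selection ranks features by correlation with the target, while extraction (here PCA via SVD) projects all features onto a few components:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 200 samples, 120 features, but only the first
# three actually carry signal; the rest are noise.
X = rng.normal(size=(200, 120))
y = X[:, 0] * 2.0 + X[:, 1] - X[:, 2] + rng.normal(scale=0.1, size=200)

# 1. Feature selection: rank features by absolute correlation with the
#    target and keep the top 10.
corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
top_k = np.argsort(corr)[::-1][:10]
X_selected = X[:, top_k]

# 2. Feature extraction: project onto the top 10 principal components
#    (PCA via SVD on centered data).
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X_pca = Xc @ Vt[:10].T

print(X_selected.shape, X_pca.shape)  # both (200, 10)
```

Either way, the network now trains on 10 inputs instead of 120, which directly improves the feature-to-sample ratio the quoted paragraph warns about.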


Posted 2018-06-24T23:09:22.277

Reputation: 21