
I'm training a neural network for pattern recognition. I have a matrix of examples of size $N \times 4$, with $N$ examples and $4$ variables.

When I train the network and plot the number of examples used for training vs. the cross-entropy, the performance of the network does not improve as I add new examples.

I suspect that the examples are highly correlated with each other, so adding new examples gives no new information to the network, but I can't figure out how to measure that correlation.

Each row of the matrix can be labeled $\mathbf{X}_{i}$. Assuming zero mean, the covariance between examples $i$ and $j$ would be $R_{ij} = E[\mathbf{X}_{i} \mathbf{X}_{j}^T]$. I could also collect all the $R_{ij}$ into a covariance matrix over the examples, but that matrix would be of size $\sim 300000 \times 300000$.

Is there a way to calculate some average correlation, or another measure of correlation between examples?
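One thing I tried: if each row is centered and scaled to unit norm, the dot product of two rows is their Pearson correlation, and the mean of all pairwise correlations can be computed from the column sums alone, so the $N \times N$ matrix never has to be formed. A minimal sketch (the data here is random placeholder data, not my real matrix):

```python
import numpy as np

# Placeholder data standing in for the real N x 4 example matrix.
rng = np.random.default_rng(0)
N = 300_000
X = rng.standard_normal((N, 4))

# Center and normalize each row, so x_i . x_j is the Pearson
# correlation between examples i and j.
Xc = X - X.mean(axis=1, keepdims=True)
Xn = Xc / np.linalg.norm(Xc, axis=1, keepdims=True)

# sum_ij x_i . x_j = ||sum_i x_i||^2, so the average off-diagonal
# correlation needs only the column sums, not the N x N matrix.
s = Xn.sum(axis=0)
avg_corr = (s @ s - N) / (N * (N - 1))  # subtract the N self-correlations
print(avg_corr)
```

For uncorrelated examples this average should be close to zero; a value near one would support the suspicion that the rows are largely redundant.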

Thanks.

EDIT: I don't want the correlation matrix of the variables; I want the correlation between examples.

EDIT2: This is the learning curve:

What is the topology of your network? You have very few features and many rows, maybe increasing the complexity will help, although the (big) difference in CV and train scores is weird – Jan van der Vegt – 2016-11-22T08:26:18.613