Is it possible to detect which field rotated "Kaggle" contest data comes from?


Imagine I set up a Kaggle competition using normalized stock data (e.g. price, volume, etc.) with a random rotation matrix applied to the features (so that it is less obvious what the features are). Is it possible for contestants to figure out that the data is stock-related? If yes, how?


The contest will be presented as a general machine learning contest with no specific background info. The data will not be time-series - just a set of samples, each containing a list of unnamed features plus a binary label.

When I asked the question, I was mainly wondering whether contestants might figure this out by examining the data with statistical analysis (e.g. PCA) or unsupervised machine learning methods (e.g. clustering).
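To make the setup concrete, here is a minimal sketch of what I mean by rotating the features. Everything in it is an assumption for illustration: the data is a synthetic heavy-tailed stand-in rather than real stock data, and `random_rotation` and `covariance_spectrum` are hypothetical helper names. It only shows one known property of the setup: a rotation leaves the covariance eigenvalues (what PCA looks at) and the pairwise distances (what most clustering methods look at) unchanged.

```python
import numpy as np


def random_rotation(dim, rng):
    """Draw a random rotation matrix via QR of a Gaussian matrix (hypothetical helper)."""
    q, r = np.linalg.qr(rng.standard_normal((dim, dim)))
    # Fix column signs so the orthogonal matrix is drawn uniformly.
    q = q * np.sign(np.diag(r))
    # Flip one column if needed so the determinant is +1 (a proper rotation).
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q


def covariance_spectrum(x):
    """Eigenvalues of the sample covariance, largest first (the PCA variances)."""
    return np.sort(np.linalg.eigvalsh(np.cov(x, rowvar=False)))[::-1]


rng = np.random.default_rng(0)

# Synthetic stand-in for the real features: heavy-tailed columns with
# different scales. This is NOT stock data, just an illustration.
features = rng.standard_t(df=3, size=(5000, 4)) * np.array([3.0, 2.0, 1.0, 0.5])

rotation = random_rotation(4, rng)
rotated = features @ rotation.T  # what contestants would receive

spectrum_before = covariance_spectrum(features)
spectrum_after = covariance_spectrum(rotated)

# Distance between two samples, before and after the rotation.
dist_before = np.linalg.norm(features[0] - features[1])
dist_after = np.linalg.norm(rotated[0] - rotated[1])

print("PCA variances before:", spectrum_before)
print("PCA variances after: ", spectrum_after)
print("distance before/after:", dist_before, dist_after)
```

So the rotation changes which axis carries which feature, but not the geometry of the point cloud, which is why I am unsure how much it actually hides.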


How will the competition be presented? "Here is some undescribed artificial data, please predict the label"? Will the data be presented as a time-series? – Neil Slater – 2017-07-03T12:35:18.783

Yeah some vague description like what you said. No, the data will not be presented as time-series. – Roy – 2017-07-04T20:49:43.730

