Is it possible to detect which field rotated "Kaggle" contest data comes from?


Imagine I set up a Kaggle competition with normalized stock data (e.g. price, volume, etc.) to which a random rotation matrix has been applied (so that it's less obvious what the features are). Is it possible for a contestant to figure out that the data is stock-related? If yes, how?
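To make the setup concrete, here is a minimal sketch of the kind of preprocessing I have in mind. The data is a made-up stand-in (log-normal random numbers, not real stock data), and `random_rotation` is a hypothetical helper that draws the rotation from a QR decomposition; the actual pipeline could differ.

```python
import numpy as np

def random_rotation(d, rng):
    """Draw a random d x d rotation (orthogonal, det = +1) via QR decomposition."""
    a = rng.standard_normal((d, d))
    q, r = np.linalg.qr(a)
    # Fix the column signs so the result is uniformly distributed over orthogonal matrices.
    q = q * np.sign(np.diag(r))
    # Flip one column if needed so that det(q) = +1 (a proper rotation).
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q

rng = np.random.default_rng(0)
# Hypothetical stand-in for the real features (e.g. price, volume): 1000 samples, 5 columns.
raw = rng.lognormal(size=(1000, 5))
# Normalize each column to zero mean and unit variance.
normalized = (raw - raw.mean(axis=0)) / raw.std(axis=0)
rotation = random_rotation(normalized.shape[1], rng)
# Each published sample is a rotated copy of the normalized sample.
published = normalized @ rotation
```

After this step every published column is a linear mix of all the original features, which is why the individual features are no longer directly recognizable.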


The contest will be presented as a general machine learning contest with no specific background info. The data will not be time-series - just a set of samples, each containing a list of unnamed features plus a binary label.

When I asked the question, I was mostly wondering whether contestants might figure it out by examining the data with statistical analysis (e.g. PCA) or unsupervised machine learning methods (e.g. clustering).
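As a rough illustration of what such an analysis would have to work with, the sketch below checks two quantities that a rotation leaves untouched: the PCA variances (eigenvalues of the covariance matrix) and the pairwise Euclidean distances that distance-based clustering relies on. The data here is synthetic and purely an assumption for the demo; it says nothing about whether these quantities are distinctive enough to reveal the field.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical stand-in data: 500 samples, 4 correlated features, centered.
mixing = rng.standard_normal((4, 4))
x = rng.standard_normal((500, 4)) @ mixing
x = x - x.mean(axis=0)

# Random orthogonal matrix obtained from a QR decomposition.
q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
x_rot = x @ q

# PCA variances are the eigenvalues of the covariance matrix.
eig_before = np.linalg.eigvalsh(np.cov(x, rowvar=False))
eig_after = np.linalg.eigvalsh(np.cov(x_rot, rowvar=False))

def pairwise_distances(a):
    """Euclidean distance between every pair of rows of a."""
    diff = a[:, None, :] - a[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

spectrum_unchanged = np.allclose(eig_before, eig_after)
distances_unchanged = np.allclose(pairwise_distances(x), pairwise_distances(x_rot))
print(spectrum_unchanged, distances_unchanged)
```

So the rotation hides the individual features but not the overall geometry of the sample cloud, which is the part PCA and clustering would be looking at.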


Posted 2017-07-03T12:12:43.060

How will the competition be presented? "Here is some undescribed artificial data, please predict the label"? Will the data be presented as a time-series? – Neil Slater – 2017-07-03T12:35:18.783

Yeah, some vague description like what you said. No, the data will not be presented as time-series. – Roy – 2017-07-04T20:49:43.730

No answers