Is there any Kaggle competition for finding the feature that most affects revenue?

0

I have a question about finding the feature in a data set that most significantly affects company revenue. I have a data set containing car-share company data, with columns (car model, completed trips, lead_time, trip length, trip revenue, delivery_fee, net_revenue ...).

Is there any Kaggle competition out there that focuses on EDA (exploratory data analysis), rather than prediction, to find the most significant feature affecting net_revenue or sales? I could not find an Uber or Lyft data competition that addresses this question!

I would appreciate it if you know of such a data set and could share the link with me!

Thanks in advance!

Alexander

Posted 2019-08-24T05:00:00.030

Reputation: 103

Question was closed 2019-08-27T02:52:08.093

Answers

3

Is there any Kaggle competition out there that focuses on EDA (exploratory data analysis), rather than prediction, to find the most significant feature affecting net_revenue or sales?

Although it is hard to prove a negative, I would say "no" to this.

Kaggle competitions are based on continuous metrics that can be ranked, such as the best log loss or mean average precision.
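To make the point about rankable metrics concrete, here is a minimal sketch of how a leaderboard could be ordered by mean binary log loss. The labels, team names, and predicted probabilities are all hypothetical, and this is only an illustration of the idea, not how Kaggle actually scores entries.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary log loss; lower is better."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical hidden test labels and two hypothetical submissions.
y_true = [1, 0, 1, 1, 0]
submissions = {
    "team_a": [0.9, 0.2, 0.8, 0.7, 0.1],
    "team_b": [0.6, 0.4, 0.5, 0.6, 0.5],
}

# A single number per submission means entries can be sorted automatically.
leaderboard = sorted(
    (log_loss(y_true, probs), team) for team, probs in submissions.items()
)
for score, team in leaderboard:
    print(f"{team}: {score:.4f}")
```

Each submission reduces to one number, so the ordering needs no human judgement. A claim such as "feature X matters most" has no comparable score.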

Identifying the "most significant feature", although a potentially useful question to answer from a data set, is not something that can be ranked automatically and objectively in a competition.

It is possible you will find competitors performing that kind of analysis within a prediction challenge. It is fairly common to rank features by relevance, especially when using ML techniques that support doing so, such as random forests or XGBoost. I am pretty sure I have seen forum posts on Kaggle discussing these things, as well as searches for a "golden feature", which is a similar concept - usually some processed combination of one or more original features.
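As a rough illustration of ranking features by relevance, the sketch below fits a random forest with scikit-learn and sorts its impurity-based importances. Everything here is an assumption for demonstration: the data is synthetic, the column names are borrowed from the question, and the target is constructed so that trip_length dominates.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 500
feature_names = ["lead_time", "trip_length", "delivery_fee"]

# Synthetic features standing in for the real car-share columns.
X = rng.normal(size=(n, len(feature_names)))

# Synthetic net_revenue: by construction, trip_length has the largest effect.
y = 0.5 * X[:, 0] + 3.0 * X[:, 1] + 0.1 * X[:, 2] + rng.normal(scale=0.1, size=n)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, y)

# Pair each importance with its feature name and sort, most relevant first.
ranked = sorted(zip(model.feature_importances_, feature_names), reverse=True)
for importance, name in ranked:
    print(f"{name}: {importance:.3f}")
```

Importances like these describe how useful a feature is to the fitted model for prediction. They do not by themselves show that the feature causes changes in revenue, and they can be misleading when features are strongly correlated.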

In addition, there have been a few competitions where the entrants were expected to produce supporting documentation. Discussion of feature relevance could be an important part of such entries. Such competitions are much rarer on Kaggle than the more direct kind, where the goal is simply to find the best predictive model.

So you may find competitions where identifying a "best feature" is part of the winning strategy, but the goal of the competition will likely be something like getting the best mean log loss on the test set.

Neil Slater

Posted 2019-08-24T05:00:00.030

Reputation: 24 613