Can you recommend a machine learning challenge that is suitable for novices?

4

3

I am looking for a challenge that is suitable for a group of novices who want to learn the basics of data science and machine learning. The challenge should match the following criteria:

  • is based on a real application or is at least realistic
  • has a clearly defined goal and partial progress is measurable
  • includes a machine learning component, but also other aspects of data science
  • is doable within 3 to 6 weeks
  • is suitable for novices
  • is an actual challenge, in the sense that you cannot just look up near-optimal solutions on the internet

clstaudt

Posted 2017-01-17T09:50:27.347

Reputation: 129

Why is this tagged "Kaggle"? – Neil Slater – 2017-01-17T10:04:58.377

Because Kaggle is a potential source, although I didn't really find anything fitting among the current competitions. – clstaudt – 2017-01-17T15:00:51.010

Answers

1

Let me share my list of ML training resources:

  1. http://www.crowdanalytix.com/listContests
  2. http://datahack.analyticsvidhya.com/contest/all/
  3. http://www.chalearn.org/challenges.html (some links may be dead)

And I am sure you already read this subreddit: http://www.reddit.com/r/MachineLearning/

Denis Rasulev

Posted 2017-01-17T09:50:27.347

Reputation: 101

2

You've already answered your own question by tagging Kaggle.

Let me share the two competitions that I have every new hire on my team go through, and then have them keep improving their solution for the next six months. These are a very well curated set of problems for a budding data scientist!

  1. Titanic: Machine Learning from Disaster https://www.kaggle.com/c/titanic This one helps bridge the gap between an analyst and a data scientist, and eases you from Excel into the world of code.

  2. Digit Recognizer https://www.kaggle.com/c/digit-recognizer The MNIST dataset has prepared almost every data scientist for the real world of dirty data, in a sweet, cushy manner. It makes very real the fact that we will not always have structured and organized data, but that this should not deter us from gaining insight from it!

Happy Coding!

jackStinger

Posted 2017-01-17T09:50:27.347

Reputation: 141

Thank you, but I knew these and was hoping for recommendations for some less traveled paths... There must be so many good solutions out there that you can just look up. That would exclude them (see the edit to the question). – clstaudt – 2017-01-17T15:00:12.057

1

I would highly recommend this platform: https://datahack.analyticsvidhya.com/contest/all/

AnalyticsVidhya is an amazing community for data science. Besides contests, they also publish technical articles on their blog (https://www.analyticsvidhya.com). Although I used to blog for them, my opinion is not biased by that fact. I had followed them for a couple of years before that, and I felt lucky to get a chance to contribute to the community.

In terms of interesting problems, I would recommend the following:

  1. The Smart Recruits

  2. The Creative Analyst

  3. Date Your Data

Sorry, I'm new here and not allowed to post more than 2 links. You'll find these competitions on their website. The datasets are available and you can make submissions to benchmark yourself. Also, I would suggest signing up for their emails. They conduct short 2-3 day hackathons, which offer great learning experiences.

Hope this helps!

Aarshay Jain

Posted 2017-01-17T09:50:27.347

Reputation: 11

The interesting problems you recommended are closed, so our submissions will no longer be evaluated, right? – clstaudt – 2017-01-18T08:27:19.880

Some of these are open and others are closed. But I think they run these competitions regularly (about once a month, I guess), so you can always participate in one. – Aarshay Jain – 2017-03-02T21:04:32.113