What do you think of Data Science certifications?

30

9

I've now seen two data science certification programs: the Johns Hopkins one available on Coursera and the Cloudera one.

I'm sure there are others out there.

The Johns Hopkins set of classes uses R as its toolset, but covers a range of topics:

  • R Programming
  • Getting and Cleaning Data
  • Data Analysis
  • Reproducible Research
  • Statistical Inference
  • Regression Models
  • Machine Learning
  • Developing Data Products
  • What looks to be a project-based completion task similar to Cloudera's Data Science Challenge

The Cloudera program looks thin on the surface, but it seems designed to answer the two important questions: "Do you know the tools?" and "Can you apply the tools in the real world?" The program consists of:

  • Introduction to Data Science
  • Data Science Essentials Exam
  • Data Science Challenge (a real-world data science project scenario)

I am not looking for a recommendation on a program or a quality comparison.

I am curious about other certifications out there, the topics they cover, and how seriously DS certifications are viewed at this point by the community.

EDIT: These are all great answers. I'm choosing the correct answer by votes.

Steve Kallestad

Posted 2014-06-12T10:52:03.410

Reputation: 2 446

3

This is too broad and primarily opinion based. Please take a look at http://datascience.stackexchange.com/help/dont-ask

asheeshr 2014-06-13T01:44:05.283

3

@AsheeshR - We're averaging 2 questions a day and 2 answers per question. At this point the focus needs to be on encouraging participation and increasing interest.

Steve Kallestad 2014-06-13T04:29:41.007

10

Engagement at the expense of site quality is not the solution. Engagement is transient. Quality is much harder to alter later on.

asheeshr 2014-06-13T05:05:00.603

@AsheeshR I'll agree with you if you can point me to a single instance of a QA site that is considered authoritative on fewer than 10 questions a day.

Steve Kallestad 2014-06-13T05:11:38.660

4

[bicycles.se], [workplace.se], [money.se], [skeptics.se], [gamedev.se] all launched with fewer than 10 questions per day. Bicycles was launched with 4 per day because it was considered to be a high-quality site.

asheeshr 2014-06-13T05:18:36.150

3

Well... I guess I have to declare you the winner at this point. :)

Steve Kallestad 2014-06-13T05:37:04.533

This discussion was useful to me. I have worked on the "edges" of data science with BI and analytics, and I took data mining and statistics courses for a Master's degree several years ago. Right now I primarily work on the enterprise information management side of corporate data, but I would like to switch to a data science job. I am therefore looking for the best way to "present myself" to potential future employers, both on a resume and through training and experience. – None – 2014-12-04T18:01:58.977

Answers

11

I did the first 2 courses and I'm planning to do all the others too. If you don't know R, it's a really good program. There are assignments and quizzes every week. Many people find some courses very difficult. You are going to have a hard time if you don't have any programming experience (even if they say it's not required).

Just remember: being able to drive a car doesn't make you an F1 driver ;)

Patlaf

Posted 2014-06-12T10:52:03.410

Reputation: 146

29

As a former analytics manager and a current lead data scientist, I am very leery of the need for data science certificates. The term "data scientist" is pretty vague, and the field of data science is in its infancy. A certificate implies some sort of uniform standard, which is simply lacking in data science; it is still very much the wild west.

While a certificate is probably not going to hurt you, I think your time would be better spent developing the experience to know when to use a certain approach, and the depth of understanding to be able to explain that approach to a non-technical audience.

neone4373

Posted 2014-06-12T10:52:03.410

Reputation: 781

2

Sometimes experience is hard to gain if your current job is not focused on data science but on some related field (in my case statistics). I use the courses to gain some knowledge and stay on topic, which I cannot do in my day job.

Christian Sauer 2014-06-13T07:16:19.973

1

I agree fully. The courses are very valuable for giving you a starting point and some structure to gain that experience. To get the most out of the MOOC, I suggest taking a very specific example, let's say logistic regression, and really working through it with a different data set. Double bonus if you do it in a language other than the one the course is taught in.

neone4373 2014-06-13T13:36:51.167

That's a good idea. What's missing for statistics in general is a training website, e.g. a set of databases along with goals and possible results at the end. Something like Khan Academy, but more powerful ;)

Christian Sauer 2014-06-13T14:07:52.810

10

The certification programs you mentioned are really entry-level courses. Personally, I think these certificates show only a person's persistence, and they are useful only to those applying for internships, not real data science jobs.

Stanpol

Posted 2014-06-12T10:52:03.410

Reputation: 562

I agree. The course material is good to get you started but it is mostly entry level.

Shagun Sodhani 2015-05-25T07:27:09.663

8

I lead data science teams for a major Internet company, and I have screened hundreds of profiles and interviewed dozens of candidates for our teams around the world. Many candidates have passed the aforementioned courses and programs or bring similar credentials. I have also taken the courses myself. Some are good and others are disappointing, but none of them makes you a "data scientist".

In general, I agree with the others here. A certificate from Coursera or Cloudera just signals an interest; it does not move the needle. There is a lot more to consider, and you can have a bigger impact by providing a comprehensive repository of your work (a GitHub profile, for example) and by networking with other data scientists. Anyone hiring for a data science profile will always prefer to see your previous work and your coding style and abilities.

Rodrigo Rivera

Posted 2014-06-12T10:52:03.410

Reputation: 171

7

There are multiple certifications available, but they have different focus areas and teaching styles.

I much prefer The Analytics Edge on edX over the Johns Hopkins specialization, as it is more intensive and hands-on. The Johns Hopkins specialization expects you to put in 3 - 4 hours a week, vs. 11 - 12 hours a week for The Analytics Edge.

From an industry perspective, I take these certifications as a sign of interest and not of the level of knowledge a person possesses. There are too many dropouts in these MOOCs. I value other experience (like participating in Kaggle competitions) a lot more than completing XYZ certification on a MOOC.

Kunal

Posted 2014-06-12T10:52:03.410

Reputation: 286

2

And what about stats.SE and datascience.SE profiles? Do you think they can say much about a person's level of knowledge?

IharS 2014-06-13T08:26:14.040

What do dropouts have to do with it? Presumably, certification is contingent upon completing the course, not merely registering…

Gala 2014-06-13T10:49:41.657

There are many people who mention that they are undergoing certification by doing a course on these MOOCs. You need to be careful with that.

Kunal 2014-06-13T17:02:53.243

@Kunal It makes sense, but your answer jumps from “certification” to “dropouts” (who presumably don't have a certification). The key word here is "undergoing". It's a bit like being registered as a student or having a Kaggle account. None of this tells us whether you should value someone who did actually get a degree, complete a course or participate in a competition to the end.

Gala 2014-06-16T11:02:19.743

5

Not sure about the Cloudera one, but one of my friends joined the Johns Hopkins one and in his words it's "brilliant to get you started". It has also been recommended by a lot of people. I am planning to join it in a few weeks. As far as seriousness is concerned, I don't think these certifications are gonna help you land a job, but they sure will help you learn.

Pensu

Posted 2014-06-12T10:52:03.410

Reputation: 361

3

@OP: Choosing answers by votes is the WORST idea.

Your question becomes a popularity contest. You should seek the right answer. I doubt you know what you are asking or what you are looking for.

To answer your question:
Q: how seriously DS certifications are viewed at this point by the community.

A: What is your goal in taking these courses: work, school, self-improvement, etc.? Coursera classes are very applied. You will not learn much theory, which is intentionally reserved for the classroom setting.

Nonetheless, Coursera classes are very useful. I'd say they are equivalent to one year of graduate-level statistics classes out of a two-year Master's program.

I am not sure of their industry recognition yet, because of the question of how you actually took the course: how much time did you spend? It's a lot easier to get A's in these courses than in a classroom paper-and-pencil exam. So there is huge variation in quality from person to person.

user13985

Posted 2014-06-12T10:52:03.410

Reputation: 227

Part of the question is meant to gauge whether or not the community placed value on certification. In some areas, certification is an absolute necessity. In others, certification doesn't matter at all. In still others, certifications by a particular company are held in high regard and competitive certifications are not. The other part was meant to understand the difference in topical focus of the certifications that are out there. Data Science is a broad term. Certifications are normally more focused. This is a bad question for QA format - it's more of a discussion, subject to opinion.

Steve Kallestad 2014-06-14T03:26:47.133

My purpose in noting that I chose the answer by votes was to make it plain that all of the answers deserved reading. Everybody makes good points, including you way down here at the bottom. Somebody who is wondering about these things shouldn't limit themselves to the top one or two answers.

Steve Kallestad 2014-06-14T03:29:39.203

Voting to find the right answer is a horrible idea. It is the wrong way to approach math. You clearly missed my point.

user13985 2014-06-14T04:04:26.623

1

I am almost done with the Johns Hopkins Data Science Specialization on Coursera (one course and the capstone left to finish). I will just give you its pros and cons, trying to keep it as objective as possible:

Pros:

  • Structure around the learning process
  • You'll build a portfolio over time

Cons:

  • Different backgrounds needed for different courses. The first few courses don't assume previous knowledge, but the conceptual courses (Statistical Inference, Regression Models) suddenly become hard to follow.
  • Taught by 3 professors. I think they are not on the same page about their potential audience and their abilities/needs/interests.

pbahr

Posted 2014-06-12T10:52:03.410

Reputation: 121

1

It really depends on the credibility of the institution granting the certificate. For example, a Data Science certification from a Harvard-based company is recognized by many industry partners and may be a good choice. You did not say what kind of certificate you are looking for.

Sumeet Nijjar

Posted 2014-06-12T10:52:03.410

Reputation: 1

0

I think the effect of a Coursera certification depends on the individual as well as the classes. The stated requirement is a minimum of 3-5 hours a week. If you put in more, and the material does open up far more than 3-5 hours of work, then these classes and certifications can be equivalent to a strong knowledge base and experience in the field. Science comes to those who request it.

Neveen

Posted 2014-06-12T10:52:03.410

Reputation: 1

0

The best way to get the job that you want is to show that you can do it.

The MOOCs that you mention will give you a good grounding in the basics and should be enough to get you started solving your own machine learning/data science problems. Try a Kaggle competition or two. That is a great way to improve your skills, and a decent ranking there will be of interest to a potential employer. Publish your results on GitHub using something like an IPython Notebook, which will allow your work to be easily seen and judged.

Try an analysis on other public data sets, like the UCI Bike Sharing Dataset or the UCI Diabetes Treatment Dataset. Those are lots of fun to try, and they show that you are keen and willing to develop your skills.

DrMcCleod

Posted 2014-06-12T10:52:03.410

Reputation: 136