I get asked this question all the time, so earlier this year I wrote an article (**What is Data Science?**) based on a presentation I've given a few times. Here's the gist...

First, a few definitions of data science offered by others:

**Josh Wills** from **Cloudera** says a data scientist is someone "who is better at statistics than any software engineer and better at software engineering than any statistician."

A frequently heard **joke** is that a "Data Scientist" is a Data Analyst who lives in California.

According to **Big Data Borat**, Data Science is statistics on a Mac.

In **Drew Conway's** famous Data Science Venn Diagram, it's the intersection of Hacking Skills, Math & Statistics Knowledge, and Substantive Expertise.

Here's another good definition I found on the **ITProPortal** blog:

"A data scientist is someone who understands the domains of programming, machine learning, data mining, statistics, and hacking"

Here's how we define Data Science at **Altamira** (my current employer):

The bottom four rows are the **table stakes** -- the cost of admission just to play the game. These are foundational skills that all aspiring data scientists must obtain. Every data scientist must be a **competent programmer**. He or she must also have a solid grasp of math, statistics, and **analytic methodology**. Data science and "**big data**" go hand-in-hand, so all data scientists need to be familiar with frameworks for distributed computing. Finally, data scientists must have a basic understanding of the domains in which they operate, as well as excellent communication skills and the ability to **tell a good story with data**.

With these basics covered, the next step is to develop **deep expertise** in one or more of the vertical areas. "Data Science" is really an umbrella term for a collection of interrelated techniques and approaches taken from a variety of disciplines, including mathematics, statistics, computer science, and software engineering. The goal of these diverse methods is to **extract actionable intelligence** from data of all kinds, enabling clients to make better **data-driven decisions**. No one person can ever possibly master all aspects of data science; doing so would require multiple lifetimes of training and experience. The best data scientists are therefore "**T-shaped**" individuals -- that is, they possess a breadth of knowledge across all areas of data science, along with deep expertise in at least one. Accordingly, the best **data science teams** bring together a set of individuals with complementary skillsets spanning the **entire spectrum**.


It means different things to different people. Give it time and people may come to an agreement. Until then: Six categories of Data Scientists, 16 analytic disciplines compared to data science.

– Emre – 2014-12-06T08:12:06.730

This is quite open-ended and opinion based, which is viewed as off topic for StackExchange.

– Sean Owen – 2014-12-06T13:29:27.300

Here's my opinion: http://thegrimmscientist.com/2014/05/05/what-is-data-science/

– TheGrimmScientist – 2014-12-28T05:30:05.677