## What is your favorite "data analysis" cartoon?

This is one of my favorites:

One entry per answer.

These cartoons are useful too; they can be included in a lecture on a particular topic where you are trying to explain a concept (e.g. correlation/causation above). A little humor can help to keep an audience engaged.

Data Science analogy to cartoon in OP.

Data Scientist: I went to data science bootcamp and learned how to find correlations in big data. Those insights can be converted into big money.

Statistician: But many of those correlations are spurious. Correlation does not imply causation.

Data Scientist: Don't give me none of that century old statistics mumbo-jumbo. This is big data. That means the data has everything. So by definition, all relationships in the data are correct. I ring the cash register while you snooze and lose, grandpa.

This reminded me of a cartoon I am going to add to this thread.

This question is awesome! It's basically a best-of list of xkcd and Dilbert.

Was XKCD, so time for Dilbert:

On RANDU: "We guarantee that each number is random individually, but we don't guarantee that more than one of them is random."

Link not working, was it this one http://dilbert.com/strip/2001-10-25 ?

Absolutely love this one.

Did anyone else notice that the tour guide changes colors between the second and third frames?

Another from XKCD:

Mentioned here and here.

You can't read this one without the alt text. It said something like "But because of that we're totally breaking up".

My favourite Dilbert cartoon:

Definitely my favorite cartoon about Data Mining

One more Dilbert cartoon:

...

This one reminds me of the recent bailout in the States, where they just made up the 700 billion number - they said they just wanted a really large number. :)

One of my favorites from xkcd:

## Random Number

RFC 1149.5 specifies 4 as the standard IEEE-vetted random number.

Now this is hilarious.

But that isn't even prime!

From: A visual comparison of normal and paranormal distributions Matthew Freeman J Epidemiol Community Health 2006;60:6. Lower caption says 'Paranormal Distribution' - no idea why the graphical artifact is occurring.

I think this version of the joke works better (from http://www.oneweirdkerneltrick.com), though apparently this version was seven years earlier.

– Dougal – 2015-02-03T01:06:13.543

This isn't really funny. It's more of a twist on English terms.

@zero "A twist on English terms" describes a great many jokes

Yeah - I think all those jokes suck. There is no underlying statistical humour. This joke should be put on the English stackexchange instead.

@Phil: In the 2-dimensional version linked by Dougal, the "paranormal distribution" is indeed a (bizarrely truncated) distribution; so the joke has some statistical content. It doesn't work in one dimension, where your comment certainly applies.

I just came across this and loved it:

'So, uh, we did the green study again and got no link. It was probably a--' 'RESEARCH CONFLICTED ON GREEN JELLY BEAN/ACNE LINK; MORE STUDY RECOMMENDED!'

xkcd: significant

This is by far my favorite cartoon of all time. It's super educational. It really gets to the heart of the definition of a p-value. In fact, I bet that less than 10% of the students who pass a college freshman "intro to stats" class get this joke, and this makes me sad.

Maybe so! Fortunately for freshmen, @Glen_b has offered an excellent breakdown here.

– Nick Stauner – 2014-02-27T01:03:28.453

Great! But yellow appears twice :P

This is a pretty good joke, as it clearly demonstrates why repeated multiple testing is dangerous. For anyone interested, check out the Bonferroni correction to deal with this.
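To put rough numbers on the multiple-testing point, here is a minimal sketch. It assumes 20 independent tests at a significance level of 0.05 (the count of jelly bean colors is an assumption, as are the function names), with every null hypothesis true.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive among n independent tests
    when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level under the Bonferroni correction."""
    return alpha / n_tests


alpha = 0.05
n_colors = 20  # assumption: number of jelly bean colors tested

# Uncorrected: roughly a 64% chance that at least one color looks "significant".
print(familywise_error_rate(alpha, n_colors))

# Corrected: each test must clear alpha / n_colors instead of alpha,
# which keeps the familywise rate just under alpha.
corrected = bonferroni_threshold(alpha, n_colors)
print(corrected, familywise_error_rate(corrected, n_colors))
```

The correction is conservative: it guards against any false positive at the cost of power, which is one reason other procedures exist.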

Who's the artist?

This cartoon was drawn by Ben Shabad

– ff524 – 2016-01-07T20:20:27.547

That's great. The standard way of dealing with outliers.

Another from xkcd #833:

And if you labeled your axes, I could tell you exactly how MUCH better.

By the third trimester, there will be hundreds of babies inside you.

Also from XKCD

This isn't technically a cartoon, but close enough:

That's definitely my favorite. I always have to stop on this and laugh when scrolling over this page. It's just so bad!!

There is this one on Bayesian learning:

What's the source?

It was taken from Mike West's website: http://www.stat.duke.edu/~mw/fineart.html

– ebony1 – 2010-08-31T16:23:32.740

this too:

To be honest, those are the bad physicists. The good physicists stick around and make a name for themselves.

Yet it's amazing how often it works...

This cartoon...speaks to me.

Nice. The importance of variance when thinking about a population.

Saturday Morning Breakfast Cereal

This has to be my favorite.

This cartoon makes me sad.

And another one from xkcd.

Title: Self-Description

The mouseover text:

The contents of any one panel are dependent on the contents of every panel including itself. The graph of panel dependencies is complete and bidirectional, and each node has a loop. The mouseover text has two hundred and forty-two characters.

Another one from xkcd:

Alt-text:

Hell, my eighth grade science class managed to conclusively reject it just based on a classroom experiment. It's pretty sad to hear about million-dollar research teams who can't even manage that.

this one is just great :-)

Looks like this one needs an updated image.

More about design and power than analysis, but I like this one

I liked this one:

This is probably fun to show in class as well...

A classic...

"Because medical research findings can be difficult to reconcile, are not always pre-digested, and can seem overwhelming to us casual observers, let us make fun of those who dedicate their lives to obtaining them."

@rolando2 As a medical researcher, I find the sensationalist incompetence of mainstream science reporters hilarious.

There's a listserv from HealthNewsReview devoted to evaluating media handling of health research findings.

@rolando2 As a statistician, I find the sensationalist incompetence of mainstream "data analysts" hilarious.

I found this from a NoSQL presentation, but the cartoon can be found directly at

http://browsertoolkit.com/fault-tolerance.png

Can you please explain this cartoon?

Found this one in the comments on Andrew Gelman's blog.

This chart is wrong, correct percentage is about fifty-fifty.

yeah but.... this one isn't true... it mostly depends on how you parameterize the time variable $t$... i guess if you go back far enough, but come on...

Source: unknown. Posted on flowingdata.com.

All right, I think this one is hilarious, but let's see if it passes the Statistical Analysis Miller test.

## Fermirotica

I love how Google handles dimensional analysis. Stats are ballpark and vary wildly by time of day and whether your mom is in town.

Statistical voyeurism? And there we were wondering what to call the site...

From xkcd:

This is data analysis in the form of a cartoon, and I find it particularly poignant.

The universe is probably littered with the one-planet graves of cultures which made the sensible economic decision that there's no good reason to go into space--each discovered, studied, and remembered by the ones who made the irrational decision.

Very funny! (__)

Another one from xkcd:

Where are persimmons?

@EngrStudent: they're simultaneously off both ends of the tasty scale.

Needs replication.

I'm surprised Durian isn't on here...

@AnonymousType and easy!

Unsure what to make of the tomatoes ranking low on tasty: perhaps Randall does not know about heirloom varieties?

Bananas are always tasty.

From xkcd:

If some people who really believe that everything should be scientifically tested would actually walk their talk, then this comic might even show an event that actually happens.

Here's a somewhat more technical one.

I don't think this one was posted yet...

I like the percentages given in the third panel.

Another one from xkcd:

Hover Text:

Knuth Paper-Stack Notation: Write down the number on pages. Stack them. If the stack is too tall to fit in the room, write down the number of pages it would take to write down the number. THAT number won't fit in the room? Repeat. When a stack fits, write the number of iterations on a card. Pin it to the stack.

Was just about to add this. Thanks :)

Here is a very meaningful chart..

Saturday Morning Breakfast Cereal

And the votey (a sort of black-and-white epilogue unique to SMBC):

I like that it took me a second, and then another second to get the second joke. :)

This is not a cartoon, but a joke worth mentioning:

A statistics professor travels to a conference by plane. When he passes the security check, they discover a bomb in his carry-on baggage. Of course, he is hauled off immediately for interrogation.

"I don't understand it!" the interrogating officer exclaims. "You're an accomplished professional, a caring family man, a pillar of your parish - and now you want to destroy that all by blowing up an airplane!"

"Sorry", the professor interrupts him. "I had never intended to blow up the plane."

"So for what other reason did you try to bring a bomb on board?!"

"Let me explain. Statistics shows that the probability of a bomb being on an airplane is 1/1000. That's quite high if you think about it - so high that I wouldn't have any peace of mind on a flight."

"And what does this have to do with you bringing a bomb on board a plane?"

"You see, since the probability of one bomb being on my plane is 1/1000, the chance that there are two bombs is 1/1000000. This way I am much safer..."

Reminds me of Baldrick's Bullet: http://www.youtube.com/watch?v=pKRxX3s3JlM

– Bitwise – 2013-07-14T01:00:41.907

It's kind of hard to enjoy a stat joke when it's screaming "bad stat" at the same time!

This isn't a cartoon. It belongs in a different question.

Independence!!!

@KHKim, we all know that, don't ruin a joke :)

But if you know that "there is a bomb" (yours) in the plane, which we may call event $A$, and you are willing to accept that the "existence of a second bomb" (event $B$) is independent of $A$, then $P(B\mid A)=P(B)=1/1000$. Always condition on what you know. And yeah, I deserve a $-1$ for screwing a good joke.
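The conditioning argument can be checked with exact arithmetic. This is only a sketch under the joke's own assumptions: each bomb appears independently with probability 1/1000, and the variable names are made up for illustration.

```python
from fractions import Fraction

# Probability quoted in the joke for a bomb being on the plane.
p_bomb = Fraction(1, 1000)

# Under independence, P(A and B) = P(A) * P(B): two bombs before anyone packs.
p_two_bombs = p_bomb * p_bomb

# Once the professor's own bomb (event A) is known to be on board,
# P(B | A) = P(A and B) / P(A), which collapses back to P(B).
p_second_given_own = p_two_bombs / p_bomb

print(p_two_bombs)         # 1/1000000
print(p_second_given_own)  # 1/1000
```

Bringing a bomb does not change the chance that someone else brought one; it only guarantees that event $A$ happened.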

@Zen again, why do you explain this? Even the security check guy in the story understands this, intuitively... don't analyze a joke :-)

By the way, as far as I know, the person who first described this anecdote was none other than Hugo Steinhaus (of Banach-Steinhaus theorem fame) in his "Mathematical Kaleidoscope".

– January – 2012-11-09T12:50:13.707

Thanks @January! Finally some interesting information :-) – Curious – 2013-01-17T10:38:35.700

Explaining Away

Since these are a rather sampling theoretic set of cartoons so far, here's one for the Bayesians. (Actually I set it as a class question last year.)

Link to the original: http://www.smbc-comics.com/?id=2740

Thanks. Changed the URL to the one recommended for embedding (not the one you suggest).

From SMBC:

Not Petulant. Literate. ;)

Not churlish. Petulant.

I wince more at the idea that all perspectives are subjectively situated; that is sad.

The only time I hear the word "datum" used by scientists, it is in the geodetic sense. Perhaps I have a biased sample?

– GeoMatt22 – 2016-10-07T05:11:27.610

Is it churlish to wince when I read "data is"?

"The bridge of life"

I took this image from here. This is a "Painting commissioned by Karl Pearson". It is considered a predecessor of the hazard function.

The 'Death' attempts to kill you at different ages using different sorts of weapons which are related to the "failure probability" at the corresponding age.

It would help dense people like me to see some brief explanation of how this is specifically related to data analysis. Also, please acknowledge (or at least link to) the source: give credit where credit is due.

@whuber Thanks for your comment. I added a bit of details in order to clarify its meaning and relationship with statistics.

Is this Avignon bridge?

@Tomas It looks similar indeed.

This is just one flag for torture never being funny, even indirectly or allusively.

@NickCox, sorry, my English is not good enough to understand your sentence.

Personal opinion: Torture isn't funny.

@NickCox, come on, it's just numbers! :)

Your cartoon alludes to people suffering. In turn, I say: come on, you should not find that funny. I won't expand on the point.

@NickCox, how does my cartoon allude to people anyhow? It's about data, not people!

I can only infer that you don't know the meaning of the word "torture".

This is also why torture is no good for anyone, including the torturer.

@NickStauner I don't get this discussion. Is it that Nick Cox was rather sensitive, transferring the joke about suffering data to suffering people? How does your comment relate to this?

Oh, I wasn't commenting on the conversation, just the cartoon itself...

Ironically, this cartoon uses "torture" in a common, among statisticians, metaphorical sense of "analyze incorrectly". The humor comes from the fact that it just happens to recall a word meaning something else... But the cartoon isn't about torture, it's about abuse of statistical methods. Though, perhaps one might contend that abuse is also a serious and weighty matter, and for that matter, so is weight...

A Frequentists vs. Bayesians cartoon from XKCD!

Mouse-hover transcript:

'Detector! What would the Bayesian statistician say if I asked whether the--' [roll] 'I AM A NEUTRINO DETECTOR, NOT A LABYRINTH GUARD. SERIOUSLY, DID YOUR BRAIN FALL OUT?' [roll] '... Yes.'

I am not sure about this one ... the Frequentist Reasoning seems wrong to me but I cannot explain why :(.

Because it is obviously trying to score a cheap point through deliberate misrepresentation, this is one of the few xkcd comics I don't like.

I'm biased towards the Bayesian interpretation, but the frequentist appears to me to be consistent with the standard frequentist interpretation. I hate null hypothesis significance testing, but if you want to go that route, it seems the null hypothesis is that the sun did not explode. If the null hypothesis is true, the chances of observing a "Yes" would indeed be $\frac{1}{36}$ (negligibly higher if you want to be pedantic and include the chance of a machine malfunction). So we've either seen a rare event and the sun is still there or the sun is gone. Many frequentists default to the latter.

Of course, using a threshold of $p < 0.05$ is ridiculous in this case, but unfortunately many frequentists don't think about other thresholds.
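For contrast, here is a minimal sketch of the Bayesian side of the bet. It assumes the detector lies with probability 1/36 (both dice coming up six) and plugs in a purely hypothetical prior of one in a million for the sun having exploded; the function name is also made up.

```python
from fractions import Fraction


def posterior_exploded(prior: Fraction,
                       lie_prob: Fraction = Fraction(1, 36)) -> Fraction:
    """P(sun exploded | detector says yes), by Bayes' rule."""
    p_yes_given_exploded = 1 - lie_prob  # detector tells the truth
    p_yes_given_intact = lie_prob        # detector lies
    numerator = prior * p_yes_given_exploded
    evidence = numerator + (1 - prior) * p_yes_given_intact
    return numerator / evidence


# Frequentist side: probability of a "yes" if the sun is intact.
p_value = Fraction(1, 36)
print(float(p_value) < 0.05)  # True, so the null is rejected at 0.05

# Bayesian side with a hypothetical prior of one in a million.
prior = Fraction(1, 10**6)
print(float(posterior_exploded(prior)))  # on the order of 3.5e-05
```

With any prior that small, the "yes" barely moves the posterior, which is why the Bayesian is happy to take the bet.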

But if we follow the principle of "$H_0$ is what we want to disprove", then $H_0$=Sun did explode => prob is 1-1/36 => cannot reject $H_0$, which does not mean that $H_1$ is true. I guess this was an attempt to cover a physics joke, because the sun cannot become nova.

@steffen I would contend that using "$H_0$ = Sun did explode" is not really appropriate in this case. The null hypothesis is supposed to be the default position, so unless I have an incredibly strong reason to believe otherwise, my default will be that the sun did not explode.

@MichaelMcGowan good

– boscovich – 2012-11-11T09:27:13.190

And Larry Wasserman: http://normaldeviate.wordpress.com/2012/11/09/anti-xkcd/

– Momo – 2012-11-11T16:56:47.517

This one is a hit :)! I saw it a few days ago.

An 'easy to digest' pie chart example for Rick Astley fans that my students seem to enjoy:

Are those mutually exclusive events? :-) – cardinal – 2013-10-21T01:32:23.763

Reminds me of the "Hey Jude" flowchart: https://flowingdata.com/2011/01/21/hey-jude-flowchart/

– Gaurav – 2017-01-25T22:00:17.567

No one has put up a cartoon from The Cartoon Guide to Statistics. I like many of them, and I used a number of them in one of my books. The one that seems to get the most laughs when I use it in a lecture is the one with the statistician going out on a first date. Their comments and thoughts while deciding what to order, with the statistician assessing probabilities and the woman just choosing what she likes, make it really hilarious.

Is it this one, Michael? – whuber – 2012-05-04T22:06:22.660

Yes. Thanks a lot, Bill. I didn't have any idea how I could paste it in. I suppose I could have scanned it into a file and then tried pasting it. That would have been a lot of trouble. Is that what you did? There are a few more scenes in that one that are also pretty funny. But this gets the idea across. – Michael Chernick – 2012-05-07T21:06:25.360

Being familiar with this book, it was easy to look it up on the Web. I quickly found one of those versions (on a reseller page) that give you access to selected pages. This image is from such a page, so no scanning (on my part) was necessary. – whuber – 2012-05-07T21:11:24.387

Okay, sometimes that is possible but I think more often scanning would be necessary. – Michael Chernick – 2012-05-07T21:50:09.750

This one may be a little too real for anyone involved in academic research...

See the original here.

I like this XKCD, but Randall Munroe's implication about the value of p = exactly .050 is incorrect. http://stats.stackexchange.com/questions/60825/is-a-p-value-of-0-04993-enough-to-reject-null-hypothesis explains why this is the case.

– user1205901 – 2015-04-14T13:49:32.550

From xkcd:

Almost a Chi square...

As the CoKF approaches 0, productivity goes negative as you pull OTHER people into chair-spinning contests.

?? This seems to have nothing to do with stats. (The curve is modeled after energy potentials in physics, not after anything in stats.) – whuber – 2012-05-04T22:11:08.550

Not a cartoon but the best way of not being confused about type I and II errors. And very funny IMHO

I wonder if it's OK to use %-points as an abbreviation of percentage points.

http://xkcd.com/985/

Sorry it is in Dutch! Translation

• Greetings new recruits! Welcome to the training camp for gladiators
• Be warned! One in three of you does not survive the training
• 33.3% ... That's not so bad

As a pointer to why we should think about conditional probabilities (and from then on, to why tools like logistic regression are useful for predicting risk) this is an excellent cartoon. – Silverfish – 2015-12-19T22:39:57.697

Not exactly data analysis but I had a chuckle.

From XKCD:

Though 100 years is longer than a lot of our resources.

Quite a trend (I am the one on the left with the laptop)

John Deering, Strange Brew

Let epsilon be less than 0 – Dason – 2014-01-14T03:00:48.397

I admit, I don't get it. – steffen – 2011-06-30T07:27:01.070

I think it's that for any other type of presentation, you'd have started by telling a joke. But since mathematicians (or statisticians, here) only think and speak in terms of formulas, this was their (still lame) joke-analogue for opening a presentation. – AdamO – 2011-12-17T22:15:59.583

Yes, let's not jump to random forest before we work on some simpler branches!

My favorite was created by Emanuel Parzen, appearing in IMA preprint 663, but this illustrates my degenerate sense of humor.

Gorbachev says to Bush: "that's a very nice golfcart, Mr. President. Can it change how statistics is practiced?" etc. hahahah.

My favorite is Sidney Harris; he has many great cartoons.

This one might be useful when introducing the concept of experimental and control groups.

This one makes you think about the importance of thinking about conditional probabilities. Now I don't know what to make of the twist at the end.

Note: this is from SMBC (Saturday Morning Breakfast Cereal) by Zach Weiner.

Shouldn't it be x-variable? – gung – 2012-03-25T14:34:44.957

Correlation does not imply causation!

Life...

"Hearing something a hundred times isn't better than seeing it once"

More of a math cartoon than a data analysis cartoon, but also one that makes you think a bit.

Source: http://www.gocomics.com/andertoons/2014/06/15#.U54J7iigS8A by Mark Anderson, June 15, 2014.

Overfitting: explanation in a picture (original cartoon)

Giving a source would be good practice. Pity about the typo (split infinitives are acceptable to me). – Nick Cox – 2015-11-18T12:31:36.577

I created the cartoon. Got that result, found it amazing and added the text. What is the typo? How would you phrase the titles? – DaL – 2015-11-18T12:39:46.723

Fine; so you are fully entitled to claim "(original cartoon)". You fixed the typo I saw (allways for always). – Nick Cox – 2015-11-18T12:54:30.407

Yes. Thanks for the help. – DaL – 2015-11-18T13:13:04.220

[New Year] : http://robertgrantstats.co.uk/drawmydata.html

True if $P=NP$

True if $P \ne NP$

This is a great one about solving NP-complete problems. They come up a lot on the job, like efficient scheduling or how to select the optimal configuration among a number of various options for which you have to search through them all to find the best one.

Think about it anytime you need to cop out of something difficult at work!

Statisticians aren't easily cowed.

Loose Parts by Dave Blazek 1/10/2018