The Don’ts of Storytelling with Numbers

Gone are the times when information was the privilege of a few insiders. In today’s digital world, data is a resource for the masses. It has pervaded nearly every field, so much so that basic statistical techniques are now a core course in almost all university programs. So how does this affect journalism?

Navigating the data flood

Open public data, crowd-sourced data, data maintained in private databases: there are stories to be had everywhere! The skill lies in the crafting: the cleaning, combining and shaping of those vast arrays of digits and characters into a tale worth telling (see Paul Bradshaw’s inverted pyramid description of the data journalism process).

[Cartoon: big data, by David Fletcher (CloudTweaks)]

This, in essence, is the core of the emerging field of data journalism. A data journalist, much like a film-maker or animator, brings numbers to life and makes the stories they tell relatable to everyone. However, storytelling with numbers is a skill that needs to be honed. Inexperience in dealing with numbers often exposes journalists to two kinds of danger:

  • Not knowing how to find a story
  • Presenting a faulty story

The former issue can be addressed with some simple searches, such as looking for outliers, correlations and other patterns. The latter problem, presenting a faulty story, is the more serious one, because it threatens the very building blocks of journalism: reliability and veracity. A faulty data analysis can lead to wrong interpretations and a serious misrepresentation of the issue at hand.
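As a rough illustration of those simple searches, here is a minimal Python sketch that flags possible outliers and measures how strongly two columns move together. The district figures are invented for this example, and the helper names `iqr_outliers` and `pearson` are hypothetical; the 1.5 × IQR rule is only one common convention for spotting outliers, not the only one.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR beyond the quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    covariance = sum(a * b for a, b in zip(dx, dy))
    scale = (sum(a * a for a in dx) * sum(b * b for b in dy)) ** 0.5
    return covariance / scale


# Hypothetical figures, invented for illustration only.
accidents = [12, 15, 14, 13, 16, 11, 14, 58]          # per district
cyclists = [3.0, 3.9, 3.6, 3.2, 4.1, 2.8, 3.5, 4.0]   # thousands per district

print("Possible outliers:", iqr_outliers(accidents))
print("Correlation:", round(pearson(accidents, cyclists), 2))
```

With these made-up numbers the district reporting 58 accidents is the only value flagged. A flagged value or a strong correlation is a lead to investigate, not a finding: it may point to a real story or to a data-entry error.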

The quintessential example of a story gone wrong

While searching for an interesting example to explain what can go wrong in the process of data-powered storytelling, I hit the jackpot!

Here’s a story that makes a perfect laboratory specimen for budding data journalists to dissect: “Stop forcing people to wear bike helmets”. The report abounds with examples of faulty data analysis. I’ll focus on just three of its many blunders to set the scene.

Claim 1 (sweeping statement based on inconclusive evidence): “If you don’t feel like wearing a helmet while biking, that’s fine”

Setting aside the off-handedness of the remark itself (“that’s fine”?), none of the arguments in the text actually supports this claim. Take for example: “(…) study after study has shown, you’re better off with a helmet if you’re in an accident” (which contradicts the claim); or “While they do protect your head during accidents, there’s some evidence that helmets make it more likely you’ll get in an accident in the first place” (faulty reasoning).

Claim 2 (mistaking correlation for causality): “The data on whether helmets reduce total accidents is ambiguous”

Why should wearing helmets reduce the number of accidents? A helmet might reduce the chance of a fatality or severe injury when a cyclist hits their head in an accident. But there is no reason to expect it to cause a reduction in the total number of accidents.

Claim 3 (unrepresentative sampling): “Drivers seem to be less cautious around helmeted bikers.”

This might have made an interesting argument, had it not been based on the observations of just one researcher on his single 200-mile bike ride.
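To see why a sample of one is so fragile, the following Python sketch draws repeated samples from a made-up population and measures how much the sample average jumps around at different sample sizes. Everything here is an assumption for illustration: the population values are simulated, not taken from the study in question, and `spread_of_sample_means` is a hypothetical helper name.

```python
import random
import statistics


def spread_of_sample_means(population, sample_size, trials=2000, seed=0):
    """Standard deviation of the sample mean across many repeated samples."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(trials)
    ]
    return statistics.pstdev(means)


# Simulated population: an invented measurement for 10,000 riders.
rng = random.Random(42)
population = [rng.gauss(1.3, 0.4) for _ in range(10_000)]

for n in (1, 10, 100):
    spread = spread_of_sample_means(population, n)
    print(f"sample size {n:>3}: sample means vary by about {spread:.3f}")
```

The variation shrinks as the sample grows, which is why an estimate built on one rider says far more about that rider than about cyclists in general.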

For more on this particular case, or other similar cases of bad data journalism, see Alberto Cairo’s blog.

The Don’ts of storytelling with numbers

Although a deeper knowledge of statistics would have gone a long way towards making this particular helmet-opposing journalist aware of the errors in his reporting, a few simple rules of thumb can help journalists avoid common yet serious fallacies when presenting data-based stories. Here they are, in the form of 3 crucial Don’ts:

  1. Don’t make sweeping claims when confronted with ambiguous information and well-researched counter-arguments.
  2. Don’t mistake correlation for causality. Just because two variables are related does not mean that one influences the other.
  3. Don’t base a scientific argument on results observed in an unrepresentative sample, particularly if that sample consists of only one person!
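The second Don’t can be made concrete with a small simulation. In this Python sketch a hidden third factor (temperature) drives two quantities that have no influence on each other, yet the two still end up strongly correlated. All numbers and variable names are invented for illustration, and `pearson` is a hypothetical helper written out by hand.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    covariance = sum(a * b for a, b in zip(dx, dy))
    scale = (sum(a * a for a in dx) * sum(b * b for b in dy)) ** 0.5
    return covariance / scale


rng = random.Random(7)
days = 5000

# Hidden common cause: daily temperature (simulated, not real data).
temperature = [rng.gauss(20, 5) for _ in range(days)]

# Both quantities depend on temperature only, never on each other.
ice_cream_sales = [10 * t + rng.gauss(0, 20) for t in temperature]
swimming_accidents = [0.5 * t + rng.gauss(0, 1) for t in temperature]

raw = pearson(ice_cream_sales, swimming_accidents)

# Remove the temperature effect. We can do this exactly only because we
# built the data ourselves and know the coefficients (10 and 0.5).
sales_left = [s - 10 * t for s, t in zip(ice_cream_sales, temperature)]
accidents_left = [a - 0.5 * t for a, t in zip(swimming_accidents, temperature)]
adjusted = pearson(sales_left, accidents_left)

print(f"correlation before accounting for temperature: {raw:.2f}")
print(f"correlation after accounting for temperature:  {adjusted:.2f}")
```

The strong correlation all but vanishes once the common cause is removed. With real data the hidden factor is rarely known in advance, which is exactly why a correlation alone should not be reported as cause and effect.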

Finally, as Uncle Ben famously told good old Spidey, “With great power comes great responsibility!”

Numbers exert a certain power: they lend reliability and reveal implicit stories beyond the limits of immediate vision and comprehension. However, storytelling with numbers comes with its own set of responsibilities. We should be aware that a story based on a faulty data analysis can lead to numerous misinterpretations and hasty conclusions. Worse still, with phrasing that’s catchy enough, the story might be shared widely, spreading the misinformation still further. Interestingly, the article cited above has so far been shared 15,000 times on Facebook and 1,915 times on Twitter, and is possibly doing its fair share of brain damage around the world.

6 thoughts on “The Don’ts of Storytelling with Numbers”

  1. The big data project cartoon and “…. particularly if that sample accounts for only one person!” made me smile 🙂 Thanks for that.
    I tried to find out whether there are any statistics on the percentage of young people who wear a helmet and those who don’t. That could also show how accurate the data or the assumption was. I found another article like this, according to which teenage drivers become victims of car accidents less often than 20-24-year-olds. The trick there was that teenagers in general drive less, which the authors hadn’t taken into consideration.

  2. Haha, liked the cartoons! Where did you find them? I think this blog post is very different from the one that you wrote before, which I like. I think you make a very clear point with your example, and the 3 Don’ts are very useful to journalists who read your blog. And as a result… I found it very hard to come up with a question… 😉 How about you, Joli and Tatevik?

    • Thanks Esther! It did help me a lot to get feedback from you all with that, made me plan my article better. Hopefully it will get still better with time.

  3. I really like your way of writing! You make your point very clear and it’s all easy to follow. I think it’s interesting to read your analysis of the helmet article! I think it’s great that you keep referring to this to explain other things in the rest of your blog. The only thing I’m missing a little is your own view on all of this. I’m really curious to hear what you think is the biggest don’t in presenting data-based stories and if you’ve ever had any trouble with this topic yourself?

  4. Hi Joli, Once again, your question is really valuable in expressing my point further. I honestly do not believe that there is one “don’t” that is the most important. I feel that all the issues I pointed out in the example above can lead to equally wrong and serious misinterpretations. In fact, there were a lot more “don’ts” that I conveniently left out of this article.
    But that’s the difficult part for journalists as well: there isn’t really a panacea, one cure after which you don’t have to worry. We have to beware of all of these possible sources of error in our stories.
