Graphs, Deception and the Dream of Objectivity

“Statistics are like bikinis. What they reveal is suggestive, but what they conceal is vital.” Aaron Levenstein

“Do people lie with graphs”: The phrase fetched me 32,900,000 results on Google in 0.43 seconds, today. Yes, I am aware, as you should be, that these numbers do not prove a thing. We can, however, safely guess that ‘lying’ is a word that comes up frequently in connection with data visualizations such as graphs. But then again, this would be based on my assumption that 32,900,000 results is a lot.

See where this is leading?

All statistics have a context, one that is dynamic, ever-changing, and oftentimes subjective. Creating a data visualization is like taking a snapshot that captures one instant of a moving image in a bigger world. It is no surprise, then, that data visualizations can often lead to faulty claims or misinterpretations, whether intentionally or unintentionally. Even simple choices in a visualization, such as the axes or the time period, can affect the particular story that is drawn from it.

Subjectivity in visualizations

“The information the visualization reveals shapes the perception of the reader.” Alberto Cairo

Following the much-talked-about gang rape incident in Delhi on 16 December 2012, there was a spate of articles on the high rate of rape cases in India. One such story was carried by the New York Times, which made the severe but, in my opinion, relevant claim that “India must work on changing a culture where women are routinely devalued.” (Rape in the World’s Largest Democracy – The New York Times Editorial, 28 December 2012)

When viewed in isolation, it is indeed the case that the number of reported rape cases in India has been increasing every year. But then place India in the context of the rest of the world. In his blog post Lies, Damned Lies, Rape and Statistics, Sharad Goel compares the number of police-recorded rape cases across various countries, and shows that India features at the lower end of the figure, particularly in comparison to the United States, a subject that the NY Times did not bring into the picture. Notice also that Goel reports the number of rape cases as a rate per 100,000 people. This is yet another vital choice: had the figures been absolute, the interpretation might have been different, for the population of India far exceeds that of most of the other countries that score above it on the graph.

Number of police-recorded rape offenses per 100,000 people (2005-2009)

Does this tell a different story than the one presented by the NY Times? Yes, it does. But so does this fact: compared to earlier times, when rape incidents in India were kept secret to keep the family reputation intact, more and more Indian women are now officially and openly reporting rape. Sharad Goel’s graph illustrates the number of police-recorded cases. However, many incidents of rape in India go unreported, and many are dismissed by the police themselves amidst the prevalent bureaucracy and corruption. This would mean that the actual number of rape cases in India might be severely underrepresented in Goel’s graph, as he himself responsibly mentions.

Avoiding the trap of deceptive perspectives

Well, the question that follows then is: can we avoid the trap of possible deceptions that data and visualizations can pose when viewed from just a single perspective? I believe we can. However, I also know that it takes practice, experience, and conscious processing and questioning of what we see in the visual in front of us. Below is a list of some important elements to look at when presented with, or presenting, a data visualization. To make them simpler to remember, I’ve collated them as the ABCDE of data visualizations:

Axes: Check what measures the graph represents and whether they make sense in the given context. It is also important to check at what point the numbers start and in which direction they run. In the example above, axes that represent only police-recorded rape cases, instead of the actual total of rape cases in a country, could lead to misinterpretations.

Baseline: Baselines determine the point of comparison. They can be a zero point, a previous time, an ideal state or a predicted state. In other words, they vary across story perspectives. In the above example, if the rape cases were measured for India alone, the comparison would be made between successive years. In the multi-country graph, however, comparisons are made across countries.

Change rates: Check whether the measures are represented as absolute values or as ratios and percentages. This can influence the way you interpret the data. As explained above, reporting the total number of rape cases, as opposed to the number of rape cases per 100,000 people, would have led to different interpretations because population sizes differ between countries.

Dates: The time period over which the data is gathered forms an important part of the story. Some contexts, like stock markets, may show interesting trends across single days and weeks, whereas other contexts, like population growth, operate over longer time periods. The graph above stops at 2009, which is already 5 years out of date.

Excluded information: Look out for any related information that the graph doesn’t cover but that could influence the interpretation. Once again, in the case above, the fact that a lot of rape cases go unreported in India might be a vital element to keep in mind while drawing inferences.
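
The “Change rates” point can be made concrete with a little arithmetic. Below is a minimal sketch in Python, assuming two hypothetical places with made-up case counts and populations (these are not real statistics, and the names rate_per_100k and places are my own, chosen only for illustration). It shows how the ranking can flip depending on whether you sort by absolute counts or by cases per 100,000 people.

```python
def rate_per_100k(count: int, population: int) -> float:
    """Return the number of recorded cases per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


# Hypothetical, made-up figures purely for illustration (NOT real statistics).
places = {
    "Country A": {"cases": 20_000, "population": 1_000_000_000},
    "Country B": {"cases": 5_000, "population": 10_000_000},
}

# Ranking by absolute number of recorded cases, highest first.
by_absolute = sorted(places, key=lambda name: places[name]["cases"], reverse=True)

# Ranking by recorded cases per 100,000 people, highest first.
by_rate = sorted(
    places,
    key=lambda name: rate_per_100k(places[name]["cases"], places[name]["population"]),
    reverse=True,
)

print("Ranked by absolute cases:", by_absolute)
print("Ranked by rate per 100,000:", by_rate)
```

With these invented numbers, Country A leads on absolute counts while Country B leads on the per-100,000 rate. Neither ranking is wrong; they simply answer different questions, which is why it pays to check which one a graph is showing.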

In pursuit of objectivity

Insofar as journalistic stories stem from individual perspectives, I’m of the opinion that an inarguable objectivity remains unachievable. It has been eluding us for decades and will continue to do so. Data, however, being irrefutable, are often considered uniquely representative of the truth. The above example illustrates that even when working with data and visualizations, the story that journalists decide to tell and visualize will be coloured to some extent by their individual perspectives, the data that they have access to, and the elements they might have overlooked.

Although the ABCDE above can be a vital guideline in avoiding deceptive interpretations, it is by no means fool-proof, or even comprehensive for that matter. Each story comes with its own context, and data and visualizations should be interpreted on a case-by-case basis. For the rest, I think that all that we as journalists can do is:

(a) Openly present our limitations and admit to our errors should they occur (i.e., practice transparency)

(b) Follow the norms of responsible reporting, and present arguments with good reason and evidence

5 thoughts on “Graphs, Deception and the Dream of Objectivity”

  1. Hi Kriti! Again, nice blog to read! I really liked that you picked one example and that you returned to that example a couple of times (for example in the ABCDE section). In the end you pose that journalists should: openly present our limitations and admit to our errors should they occur (i.e., practice transparency). I found this example on the Internet of a company that admitted their mistake and apologized for it: http://www.informationisbeautiful.net/2010/correction-apology-planes-or-volcano/
    Do you think that they did it in a good way?

  2. Hi Kriti! Loved to read your blog! I liked that you used some relevant quotes. I think you are really clear in what you would like to say and you openly express your opinion.
    Replying to the question from Esther, I think they did well by admitting their mistake and openly telling why it happened. I agree that’s better than just changing the text or deleting the article. I don’t really like the way they’ve done it, though. I think the graph still isn’t that clear and it’s taking me a long time to find out what exactly they mean. Esther, what do you think?

  3. That’s a really nice example Esther. I agree with Joli that one really had to be aware of what particular infographic they were talking about and what it was conveying to appreciate their apology. But I really do appreciate the frank honesty that they were “badly wrong” and that they should have been more careful. I guess as the reader I am willing to forgive them then, because I know that making such errors is human.

    It got me thinking, though, that maybe there should be continuous second-check procedures in place. So at every step, maybe they could have two different people take a look at the progress. Then again, that takes up so many more resources and also time. Nevertheless, there should at least be some last verification procedure in place before a graphic like this is published and goes viral.

  4. Honestly, for me this was your best blog post! 🙂 The quotes, examples and cartoon were very relevant and interesting. And I agree that absolute objectivity is unachievable not only in visualisation but also in text, because at least three elements of the article development process are subjective. Every article, even one based on extremely accurate data with accurate visualisation, is still 1. produced by a journalist who cannot be absolutely objective, simply because he has certain knowledge, opinions and attitudes, 2. in a certain context, 3. for a certain group of readers.

    • That’s a big compliment! Thank you!
      I really like the way you summarized the unavoidable subjectivity in those 3 points. It captures the whole debate quite nicely.
