In fact, the misrepresentation by the data visualization gets even worse. Sometimes you’ll see three-dimensional pictures (USA Today used these a lot), where each data point is represented as a three-dimensional object. I’ve got a picture in front of me here now where they’re showing the price of oil, and it’s represented by a barrel of oil. And the barrel is three-dimensional.
You can see the top of it, and the barrels have not only height and width but also depth. So the lie factor is increased by an order of magnitude. If you look at this, you would think, “Oh, the price of oil does seem to be steadily going upward.”
But if you go by the amount of oil that you could actually fit in this virtual barrel, that volume is going up at a much, much greater rate than the actual numbers themselves. And people will justify this in a lot of ways. Often they’ll actually print the numbers in the graph.
So somebody could look at those numbers and decide for themselves, even though the picture is misleading. The people who make these graphs will say, “At least I have these numbers right,” as if not lying in one part of the graph justifies lying in another part, which I don’t really buy. And they’ll also say, “Well, at least we have the general direction of growth right. It is going upward. So what if we’re off a little bit in terms of the actual volume?”
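To put a number on how far “off a little bit” can be, here is a minimal Python sketch of the lie factor as Edward Tufte defines it: the size of the effect shown in the graphic divided by the size of the effect in the data. It assumes the barrel's height, width, and depth are all scaled in proportion to the price, so the drawn volume grows with the cube of the price ratio. The prices are made up for illustration and are not the figures from the graph discussed here, and the function names are hypothetical.

```python
def effect_size(first: float, last: float) -> float:
    """Relative change from the first value to the last value."""
    return (last - first) / first


def lie_factor(data_first: float, data_last: float, dimensions: int = 3) -> float:
    """Lie factor = (effect shown in graphic) / (effect in data).

    Assumption: every drawn dimension (height, width, depth) is scaled by
    the same ratio as the data, so the perceived size grows as
    ratio ** dimensions.
    """
    if data_first <= 0 or data_last == data_first:
        raise ValueError("need a positive first value and a non-zero change")
    linear_scale = data_last / data_first      # how much each dimension grows
    drawn_ratio = linear_scale ** dimensions   # how much the drawn size grows
    shown = effect_size(1.0, drawn_ratio)      # effect the picture suggests
    actual = effect_size(data_first, data_last)  # effect in the numbers
    return shown / actual


if __name__ == "__main__":
    # Hypothetical prices: oil goes from 10 to 30 (a threefold increase).
    for dims, label in [(1, "plain bar"), (2, "flat picture"), (3, "3-D barrel")]:
        print(f"{label}: lie factor = {lie_factor(10.0, 30.0, dims):.1f}")
```

With these made-up prices, a threefold increase drawn as a scaled 3-D barrel gives a lie factor of 13, while the same data drawn as a plain bar gives 1, meaning no distortion at all.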
But these aren’t off by a little bit. This picture of oil lies by a factor of about 48. That’s a significant difference between what the picture represents and what the data represents.

Do I think they do it on purpose, meaning that they’re trying to get a certain reaction out of people? Or do they just want to exaggerate, to show exaggerated growth based on their own bias or perspective?
Well, it depends on who’s actually presenting the data. Sometimes you will see people who deliberately lie and try to mislead the audience, which is kind of self-defeating. If I see a lie in a graph in a document, it causes me to call into question all of the data in the document. So once you’re caught in a lie, it can really destroy your credibility.
So some people will do that, but I don’t think it serves them the way they want it to. This particular graph we’re looking at here is from Time magazine, and I don’t think Time is in the business of trying to mislead people.
I think what happens at some of these larger media outlets is that they don’t really trust their readers. There’s this myth that data is boring, that if we just present the data itself, it won’t be enough to hold people’s interest. And so we need to dress it up. We need to add a little pizzazz to it.
So they move the job of representing the data away from the people who actually understand the data and over to the Art Department. In the Art Department, they know art. They don’t know data. They don’t know anything about statistics, so they’re just drawing pictures. And so I think what is happening now is that the people responsible for putting these out really don’t understand how this art misleads readers.
A natural question now is: how does this affect me as a software developer who works for a company, or as a consultant for clients?
That’s a great question. When I first read this book, I’d been writing reports and doing graphs for a long time. And it caused me to step back and say, “You know what? I need to think about this stuff. I need to think about how I am representing this data.” The most common way of making charts is through Excel. Excel is one of the most popular pieces of software in the world.
You pull your data into Excel, and you’ve got these nice wizards that generate charts. Sometimes those charts will be generated in three dimensions, and three dimensions can be misleading. You just have to be aware that the way you represent that data could be misleading your users.
And the other thing is, if you’re having trouble using your data to communicate effectively to your audience, maybe you need different data. Maybe you’ve selected the wrong data source. Trust that the data can tell a story, and then represent that story accurately with your graphs. These are some of the lessons that I took away from this, and I’ll get to more later on about things you can do right to improve your graphs, things that I can also apply to SQL Server Reporting Services, for example.