During numerous appearances on CNBC last year, producers there gave me a pet name.
“Big Data Hater.”
Whenever they wanted to do a segment on the wonders of Big Data, they trotted out experts talking about how much money was to be made by dumping everything into a spreadsheet. Then they trotted me out to take the “other side” of the story.
This is absurd, of course. As I would tell anyone who would listen, hating data is like hating atoms. I don’t hate data. In fact, I was one of the first computer-assisted journalists in the country. Back in the 1990s, I was learning how to join tables that compared school bus driver lists with DWI conviction lists. And I was playing baseball when Michael Lewis wrote Moneyball. I hated fat, old scouts who only trusted their considerable guts when picking which teenagers would get a shot at The Show. Moneyball broke the good old boys’ network and gave a whole new generation of not-so-obvious baseball talents a chance.
We now live in an age where many folks think, “If you can’t count it, it’s not real.” And that’s doing almost as much damage as those old scouts. There’s only one thing worse than having no information in a situation: having a tiny bit of information, and using it poorly. Corporate America is teeming with such situations now. Consulting firms wielding spreadsheets rampage through executive offices selling digital snake oil and run away long before the consequences flare up.
Data is hardly infallible, just as pictures do lie. This is not an excuse to ignore data, mind you. After all, I predicted Duke’s early exit from this year’s NCAA tournament based on my data analysis of previous tournaments. But data always needs to be placed in proper context. It needs to be interviewed with hard questions. Otherwise, it simply creates a hall of mirrors. It makes you run around and around in circles, all the while convinced you are heading the right way.
Counting the wrong things: Why do airlines mess up one planeload’s entire day when a single flight has a mechanical problem, rather than simply shift everyone one flight later? Because the FAA rewards that behavior. It’s all about the percentage of on-time flights. That’s why it’s OK to screw over hundreds of passengers, delaying them 24 hours or more. When you create a metric, people often simply conform to that metric. That’s why so many metrics are ineffective.
Risk miscalculation: When shark attacks rose by 25 percent in 2010, it meant there were 16 more attacks, almost all blamed on one very angry shark. You should be scared of driving in the cab to the airport, not flying. And you should be a lot more afraid of the food you eat than sharks. But people misuse numbers all the time to worry about the wrong things.
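The arithmetic behind that shark statistic is worth spelling out. Here is a minimal Python sketch, assuming only the two figures quoted above (a 25 percent rise that amounted to 16 more attacks). The function name is my own invention, and no other attack data is implied.

```python
def baseline_from_increase(pct_increase, extra_events):
    """Recover the before and after counts from a percent rise and the raw rise."""
    # If extra_events is pct_increase percent of the baseline,
    # then baseline = extra_events / (pct_increase / 100).
    baseline = extra_events / (pct_increase / 100)
    return baseline, baseline + extra_events


# Figures quoted in the text: a 25 percent rise that amounted to 16 more attacks.
before, after = baseline_from_increase(25, 16)
print(f"{before:.0f} attacks before, {after:.0f} after")
```

On those figures, “up 25 percent” and “up from 64 to 80” describe the same year. Only one of them sounds like a reason to stay out of the water.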
Opportunity cost: Give most people a choice, and they will pick a known unpleasant experience over an unknown every time. Many folks are very bad at measuring what they are giving up whenever they make choices. Go on vacation here, and you can’t go there. Take this job, and you can’t take that one. Stay in place at work and…you’ll never have an adventure. The way to avoid asking yourself “what might have been” is to carefully think about “what might be.”
Magical thinking: Remember that day you hit every light just right and got to work in 16 minutes? Well, that day is not today. In fact, it may never happen again. Yet you still leave with only 16 minutes to get to work. Why? You are a victim of magical thinking. Many folks get stuck, or they are perpetually late, because they can’t help being overly optimistic in all their endeavors.
Overweighting grandma: We used to joke that news is what happens to your editor on the way to work. No matter what reporters say, if an editor’s block has potholes, the town has a major pothole problem. Everyone is like this. If your grandma says at dinner that there’s too much of the color pink on TV these days, you’ll suddenly shy away from using pink in your next design. It’s human nature. A similar quality is called “recency.” You might read 27 reports going into a big meeting, but if someone says something to you as you walk into the room, you will give that person’s thought equal weight to all those reports.
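To see how lopsided that is, here is a rough Python sketch of the weighting. It assumes, as a simplification of my own, that the hallway remark gets exactly half the say and the reports split the other half evenly. The function name is hypothetical, and the only number taken from the text is the 27 reports.

```python
def weight_shares(num_reports):
    """Share of the decision carried by each source when one late remark
    is given the same total weight as all the reports combined."""
    remark_share = 0.5                    # assumption: the remark gets half the say
    per_report_share = 0.5 / num_reports  # the reports split the other half evenly
    return remark_share, per_report_share


remark, per_report = weight_shares(27)
print(f"hallway remark: {remark:.0%} of the decision")
print(f"each report:    {per_report:.1%} of the decision")
print(f"the remark counts {remark / per_report:.0f} times as much as any one report")
```

Under that assumption, one offhand comment carries as much weight as 27 documents someone actually researched.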
Accidental reinforcement: Imagine you’ve put a rat in a trap maze. If the rat solves the maze, she gets a piece of food dispensed from above. The idea here is to teach the rat to solve the maze as quickly as possible. But let’s say you work in a lab that’s particularly hard-hit by funding cutbacks, and your rat maze hasn’t been updated since the 1960s. It’s been acting up lately, but you have a test to run, so you drop the critter into the maze. She’s skittish, and so the first thing she does is panic and bang into a maze wall. Your aging trap shakes, and your food dispenser erroneously delivers a food pellet right where the rat is now seeing stars.
What happens next? The rat jealously scoops up the crumb, devours it with satisfaction, and then starts banging her head into the wall, expecting more food. You try explaining, cajoling, begging, even picking the rat up and putting her at the end of the maze, where more food awaits. No matter. She learned quickly that banging walls=food, and she’s going to keep on trying that, again and again. This is the torture called accidental reinforcement.
Often, an early success is the worst thing that can happen to a person.
If you, like me, enjoy data, then you probably realize that many of these maladies are SOLVED by GOOD data. Understanding that grandma’s observation represents a small sample size, you can burn out your observational bias with good data. However, good data is hard to find, and I hope you also recognize that in nearly all these flaws, dirty data is to blame. Given that we rarely have access to great data when making life choices, maintaining a healthy skepticism toward the data you do get is essential to avoid falling into the data idolatry trap.