In God We Trust, All Others Bring Data. Not So Fast!
In science, both hard and social (sociology, economics, etc.), there is a growing and expensive problem with research.
Simply put, findings from early trials, statistical studies, and so forth often fail when broadly applied - whether that be in public policy for the social scientist, or large-scale drug testing for the hard scientist.
Drug companies, in particular, are concerned. They begin with a theory that their scientists have about what might cure a particular disease or condition. Then the company spends a bit of money on tests and so forth and "it" looks encouraging.
Given those early results, they spend heavily (sometimes hundreds of millions of dollars) on R&D and more testing. Here as well, the results look good. So now they do a bigger, more comprehensive test and... they come up with a dud. A very expensive dud.
Why? Two things: path forking and small data sets. What are those? (Be patient, this applies to business too.)
You may vaguely remember this from a long-ago statistics class: "It" is statistically significant if there is a greater than 95% chance that a finding is not random or a result of just noise.
Well, not quite. What the test really says is that,
given the data sample examined, there is less than a 5% chance you would see a finding this strong if it were just random noise.
"Given the data sample." That is the key. And there are often two problems with the data sample.
First, the sample is small. This means that for any result to cross that 95% threshold and be statistically significant, the measured effect itself must be large. A 60+% difference may be required, and numbers like that grab our attention. But it may not be real - we mistakenly think this new finding is really important and powerful when, in fact, the big number is just a mathematical necessity given the small size of the sample used.
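To make that concrete, here is a minimal sketch of the arithmetic - my own illustration, not from any particular study. The 30% baseline cure rate, the sample sizes, and the use of a simple two-proportion z-test at the usual 5% level are all assumptions chosen for illustration; the point is only how the effect required for "significance" shrinks as the sample grows.

```python
# Sketch: how big an improvement must look before a small sample calls it
# "statistically significant"? All numbers below are illustrative assumptions.
from math import sqrt
from statistics import NormalDist

ALPHA = 0.05
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)   # about 1.96 for a two-sided test
BASE_RATE = 0.30                               # assumed cure rate in the control group

def smallest_significant_lift(n_per_group: int) -> float:
    """Smallest observed improvement over BASE_RATE that a two-proportion
    z-test would call 'significant' with n_per_group subjects per arm."""
    lift = 0.0
    while lift < 1 - BASE_RATE:
        p1, p2 = BASE_RATE, BASE_RATE + lift
        pooled = (p1 + p2) / 2
        se = sqrt(2 * pooled * (1 - pooled) / n_per_group)
        if se > 0 and (p2 - p1) / se >= Z_CRIT:
            return lift
        lift += 0.005
    return float("nan")

for n in (20, 50, 200, 1000):
    print(f"{n:>4} subjects per group -> need roughly a "
          f"{smallest_significant_lift(n):.0%} absolute improvement to hit p < 0.05")
```

With 20 subjects per group, only a huge jump clears the bar; with 1,000, a few percentage points will do. The big effect in the early, small trial is often an artifact of the small sample, not a preview of what the large trial will find.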
The second problem is that the finding only holds true for this data set. And to get to this data set, lots of (sometimes arbitrary) decisions were made. That's called "path forking."
Often, the data was picked because it was easy to get. Or, since we are trying to test a particular theory, we look for data that we think will apply. The point is,
the data set may not be sufficiently representative of the world at large. Said differently, we look at and take action based on a given study or analysis that was statistically significant, but we ignore similar studies that found nothing.
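One way to feel the consequence of all those forks: if the data are pure noise, any single arbitrary slice has only a 5% chance of looking "significant." Try twenty slices - or read twenty studies and remember only the one that "worked" - and a spurious hit becomes more likely than not. The sketch below is my own illustration; the subgroup size, the number of forks, and the coin-flip data are all assumptions.

```python
# Sketch: pure-noise data, many arbitrary analysis "forks" - how often does
# at least one fork look statistically significant anyway?
import random
from math import sqrt
from statistics import NormalDist

random.seed(1)
Z_CRIT = NormalDist().inv_cdf(0.975)   # two-sided 5% threshold
N_PER_GROUP = 100                      # assumed subgroup size
N_FORKS = 20                           # assumed number of arbitrary analysis choices
N_TRIALS = 1000                        # simulated "research projects"

def one_fork_is_significant() -> bool:
    # Both groups come from the SAME coin-flip process: any difference is noise.
    a = sum(random.random() < 0.5 for _ in range(N_PER_GROUP))
    b = sum(random.random() < 0.5 for _ in range(N_PER_GROUP))
    p1, p2 = a / N_PER_GROUP, b / N_PER_GROUP
    pooled = (a + b) / (2 * N_PER_GROUP)
    se = sqrt(2 * pooled * (1 - pooled) / N_PER_GROUP)
    return se > 0 and abs(p1 - p2) / se >= Z_CRIT

hits = sum(
    any(one_fork_is_significant() for _ in range(N_FORKS))
    for _ in range(N_TRIALS)
)
print(f"Chance of at least one spurious 'significant' finding "
      f"across {N_FORKS} forks: {hits / N_TRIALS:.0%}")   # roughly 1 - 0.95**20, about 64%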
Ok, so much for drug companies and scientists, even the "social" ones.
What does this mean for your business world? Lots of business decisions are now made on so-called "business analytics." You can get an MBA in that stuff these days. But it is bigger than that - think market research, manufacturing defect data and so forth. It can be simpler and smaller too. Here, as with drug companies,
the presence of statistical significance does not necessarily tell the entire story or reliably predict future outcomes. So here is what to look for when conducting or reviewing data-based analysis and recommendations.
- Is the data being analyzed representative of the entire pool of data you need to make your decision? What biases came into play regarding data selection? For example, looking at your own data about existing customers is not the same as looking at data for all potential customers in your market.
- What are the limits inherent in looking at readily usable data? I run into this problem frequently. For example, I am currently working with a commercial building contractor for whom I forecast cash flow weekly. While we use the average DSO (days sales outstanding) for a particular customer/project to forecast payments, we recognize that the data set is small. Projects take 12 to 16 weeks from start to finish and invoices are monthly, so we don't have a lot of payment history available. And, it seems, even for different projects with the same general contractor, payment timing varies. Because of the small sample size, we extrapolate this finding across future cash flows with some caution (a rough sketch of just how wide that uncertainty can be follows this list).
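Here is that rough sketch. The payment delays are hypothetical numbers made up for illustration, not the contractor's actual data; the point is how wide a 95% interval around the average DSO is when it rests on only a handful of invoices.

```python
# Sketch: the "average DSO" from a handful of invoices carries a lot of
# uncertainty. The payment delays below are hypothetical, for illustration only.
from math import sqrt
from statistics import mean, stdev

days_to_pay = [38, 52, 45, 61, 41, 70]   # hypothetical days to payment, one per invoice
n = len(days_to_pay)
avg = mean(days_to_pay)
se = stdev(days_to_pay) / sqrt(n)

# Rough 95% interval using the t critical value (about 2.57 for n - 1 = 5 df).
t_crit = 2.57
low, high = avg - t_crit * se, avg + t_crit * se
print(f"Average DSO: {avg:.0f} days, but with only {n} invoices the 95% interval "
      f"runs roughly from {low:.0f} to {high:.0f} days.")
```

An average of about 51 days sounds precise; an interval of roughly 38 to 64 days is what the data actually supports, which is why we forecast with caution.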
Here's what to do:
- When reviewing the analysis, always probe the source of the underlying data with an eye toward the limitations on generalization it may impose. Do that in the context of the decision you are making. Is the data representative? For example, we may use what works for existing customers to sell to the ones we don't have. The analysis gets done, but how good is the assumption that the customers you don't have will behave the same way as the ones you do?
- Test whenever possible. For example, years ago at Kraft Foods, the packaging group developed a squeezable container for Miracle Whip. They did lots of lab tests. When we made product samples, rather than rolling straight out to a test market, we handed the product out to employees to use for 60 days. On the first day, a bottle fell from the refrigerator onto the VP of Marketing and Miracle Whip splattered everywhere. It seems real people drop bottles differently than the machine in the lab does. The lab test data was irrelevant.
- Consider the nature of the decision. When I was in the car rental business, we had a huge problem with customers wrecking our cars. We were stuck with big bills to fix them, not to mention covering hospital and legal bills for our customers and others (our customers were usually at fault). In New York City, Hertz led the industry by announcing surcharges by zip code. We analyzed our data, and found similar trends. But we knew that because of our tiny market share, we had a data set too small to mean anything. So we relied on Hertz's much bigger data set and followed along, with a better marketing spin, because our losses were so big.
We then took the lead at Baltimore Washington International (BWI) airport, where we had horrendous accident losses (worse than in New York). While our data sets were small and the room for error high, we raised rates for all locals from anywhere in Maryland. We didn't have the data to identify zip codes with statistically significant bad accident rates - the sample sizes were too small to mean anything at the zip code level. However, all zip codes looked bad, just not in a statistically significant way. Since the cost of a wreck was sky high, we punted on trying to find the good local drivers: no return flight out of BWI, no rental car. It worked. Don't be afraid to act on thin data when the cost of not acting is too high.
- Buy outside data when available. If you can purchase data covering a wider universe than just your own, you'll get a broader, more reliable picture. In my rent-a-car example above, we had the option of buying accident data reported by many insurers.
Research and the collection of data are important in making sound decisions; there's no doubt about that.
That said, remember that when making these decisions, you must be careful to understand and identify the limitations baked into your analysis. Absent that insight, you may spend lots of money on the "new cure" and come up with bupkis, just like big pharma.
One last thought. We make lots of decisions based on mental models - of markets, organization, competitors, etc. Often, these models are based on prior experience.
Talk about limited data sets!!!
P.S. For a short and easy read on this topic from the perspective of a career statistician, click here. Or, listen to the podcast interview of the author, above.