Is it possible to have too much of a good thing? The answer is a resounding “yes,” whether the good thing is double-chocolate, triple-layer fudge cake, or data. Knowledge management professionals in business, government and academia are all coming to realize in this Information Age that there is such a thing as too much information, and it does not solely apply to your friend’s vivid description of his recent blind date.
An insightful article in Wired, “Biology’s Big Problem: There’s Too Much Data to Handle,” discusses this phenomenon and its impact on scientists involved in genetics research. There is some irony in the cause of this informational crisis in biology, which has largely resulted from a dramatic drop in the costs—in money and time—of gene sequencing.
Techniques for determining the sequence of base pairs in the genome of an organism were developed over 30 years ago. But these approaches were very expensive and time-consuming. In 2006, the X Prize Foundation offered a $10 million prize to spur researchers to improve genome sequencing techniques. The prize was never awarded, but scientists innovated nevertheless, so much so that the costs of gene sequencing have plummeted. They have fallen even faster than the cost of the computer technology needed simply to store and transmit, let alone begin to analyze, the ever-growing mountains of data.
There’s a certain degree of academic inertia that hinders efforts to change direction and focus in scientific research. Sequencing is much cheaper than it once was, but it’s still not free, and research funding is limited. When funding agencies like the National Institutes of Health are seeing ever-growing bang for their buck in terms of sheer volume of data generated, the people making grant decisions have strong incentives to keep throwing money at data generation, while overlooking scientists who need funding to analyze that data with techniques whose costs have not dropped so dramatically.
The Spy Who Loved Me Too Much
While the problem of too much information in scientific research has gotten some attention and discussion, revelations of the enormous volume of data collected by the National Security Agency (NSA) have been far more prominent recently. This collection obviously raises many important questions about civil liberties and government invasiveness, but it also brings more pragmatic problems. It’s not just that what the NSA is doing is objectionable for its violation of personal privacy. The flood of data is also preventing the Agency from doing its job.
A Wall Street Journal article quotes William Binney, a retired NSA computer scientist who helped develop some of the Internet-snooping techniques used at the Agency. “What they are doing is making themselves dysfunctional by taking all this data,” he said at a 2013 privacy conference in Switzerland.
Some of the documents released by Edward Snowden included internal memos in which the Agency itself acknowledged these concerns. One of the leaked reports describing the NSA’s efforts to track foreign cell phones noted that this massive data collection endeavor was “outpacing our ability to ingest, process and store” the collected information.
Less Is More (Money)
Governments and academia are still adjusting to the new paradigm in information technology. Some businesses have been quicker, recognizing that big numbers aren’t very impressive if the bottom line isn’t one of them.
Companies that have successfully addressed the problem of too much information include Boeing, Nike, Macy’s, and Land Rover, as detailed in a 2013 Slashdot article. The informational overhaul these companies initiated was time-consuming and in most cases involved bringing in outside consultants to objectively evaluate the inefficiencies inherent in their knowledge management approaches. But in the end these efforts paid off handsomely.
One key approach was to centralize their informational processing, reducing both redundancies and discrepancies. Another was to develop and implement software to better organize the information they had. And of course they were able to streamline their analysis of business data by recognizing that much of it was not relevant and was simply making the appraisal far more complicated. When the pertinent data was separated from this informational dead weight, analysis became much more efficient. If you must search for needles in haystacks, try to make the haystack as small as possible.
Efficient Knowledge Management
A&E’s popular documentary program Hoarders delighted and horrified millions of viewers with its vivid depictions of people with a psychological disorder compelling them to hold on to just about everything they ever owned, whether they needed it or not, until their lives were made unmanageable by the sheer volume of useless garbage they could not bring themselves to part with.
But while we view these hoarders with a blend of contempt and pity, governments, scientific institutions and multibillion-dollar companies often find themselves in the same predicament, just with data instead of a living room piled to the ceiling with cardboard boxes full of junk. If you are drowning in data, consider partnering with knowledge management experts who have experience successfully trimming down haystacks. You might find a lot of great needles in there.
Andrew Breslin is the author of two novels, Mother’s Milk, published in 2005 by ENC Press, and Practical Applications of Game Theory, currently being published in serial form at Imaginaire, the Journal of Mathematical Fiction. He blogs and reviews books at Goodreads. Some of his short fiction can be found on his website. When he isn’t writing, he enjoys playing the banjo and chess, idolizing his cat, and thinking about math.