In a prior post I alluded to the fact that the buzzword “Big Data” is just a new term for “data mining.” The potential for big data analytics to discover new things about us is frightening from a privacy perspective, as I discussed. But it can also be, let’s be honest, very cool.
For example, one analysis, based on the texts of millions of books, charts the shifting usages of the phrase “the United States is” versus “the United States are.” The scientist Stephen Wolfram created a lot of buzz by describing how he crunched over 20 years of his own e-mail traffic and other data to reveal patterns in his daily activities.
Such examples and others like them are definitely nifty. Yet they are also a bit... empty. They don’t quite seem revolutionary. And in fact a number of commentators have asked whether big data is anything more than a trendy topic du jour. The question is whether big data is just a compelling idea that will turn out to have few truly transformative applications, an intellectual trend that temporarily grips the imagination. (The fleeting fascination with “chaos theory” in the 1990s comes to mind.)
Or, like the Aztecs using wheels but only for children’s toys, have we stumbled upon a great tool that we are only starting to figure out how to fully exploit?
Ultimately we can’t know yet. But I recently came across a passage that is very relevant to this question. It’s the great anthropologist Clifford Geertz paraphrasing the philosopher Susanne Langer; Geertz is discussing a concept in anthropology, but I suspect this pretty much captures the status of Big Data as well:
In her book, Philosophy in a New Key, Susanne Langer remarks that certain ideas burst upon the intellectual landscape with a tremendous force. They resolve so many fundamental problems at once that they seem also to promise that they will resolve all fundamental problems, clarify all obscure issues. Everyone snaps them up as the open sesame of some new positive science, the conceptual center-point around which a comprehensive system of analysis can be built. The sudden vogue of such a grande idée, crowding out almost everything else for a while, is due, she says, “to the fact that all sensitive and active minds turn at once to exploiting it. We try it in every connection, for every purpose, experiment with possible stretches of its strict meaning, with generalizations and derivatives.”
After we have become familiar with the new idea, however, after it has become part of our general stock of theoretical concepts, our expectations are brought more into balance with its actual uses, and its excessive popularity is ended. A few zealots persist in the old key-to-the-universe view of it; but less driven thinkers settle down after a while to the problems the idea has really generated. They try to apply it and extend it where it applies and where it is capable of extension; and they desist where it does not apply or cannot be extended. It becomes, if it was, in truth, a seminal idea in the first place, a permanent and enduring part of our intellectual armory. But it no longer has the grandiose, all-promising scope, the infinite versatility of apparent application, it once had. The second law of thermodynamics, or the principle of natural selection, or the notion of unconscious motivation, or the organization of the means of production does not explain everything, not even everything human, but it still explains something; and our attention shifts to isolating just what that something is, to disentangling ourselves from a lot of pseudoscience to which, in the first flush of its celebrity, it has also given rise.
We’ve certainly seen attempts to over-use the neat-o concepts of data mining in recent years, such as the embrace by parts of our security establishment of Total Information Awareness and related notions that pattern-based data mining can be used to identify terrorists. Here’s looking forward to the day when “less driven thinkers” within those agencies “settle down” to a realistic view of what data mining can do.
Which is not to say that it can’t actually do a lot of things, good and bad, and that we don’t need better privacy protections.