The pile of digital data is growing, doubling every 18 months or less. That pile is the new gold, drawing data miners hoping to strike it rich by finding patterns and uncovering insights that can lead to more efficient markets, higher productivity, safer streets, and the much-loved increased profits.
Stephen Baker's new book, The Numerati (Houghton Mifflin), introduces some of the data miners, or numerati, who are leading efforts to probe the depths of the global data dump.
He profiles several numerati, focusing more on the personalities and potential use cases than the arcane details of the computer science and mathematics. Baker, who has written for BusinessWeek for more than 20 years, paints a rich portrait of how the flood of data and the efforts of the numerati will transform shopping, marketing, politics, health care, matchmaking, work, medicine, and other disciplines.
"Just as they've helped medical researchers find genetic markers pointing to certain types of breast cancer and Huntington's disease, they might tell grocers what type of fruit to promote to buyers of canned food or what kinds of magazines dog-food buyers tend to read," he wrote.
IBM researcher Samer Takriti, one of the featured numerati, is building detailed mathematical models of 50,000 of his colleagues. Baker describes Takriti's ultimate goal as follows:
"The goal here is to build entire models, complete with each person's quirks, daily commute, and allies and enemies. These models might one day include whether they eat beef or pork, how seriously they take the Sabbath, whether a bee sting or a peanut sauce could lay them low. No doubt, some of them thrive even in the filthy air in Beijing or Mexico City, while others wheeze. If so, the models would eventually include this detail, among countless others. Takriti's job is to depict flesh-and-blood humans as math."
In practical application, data processed by numerati from calendars, instant messaging, e-mail, cell phones, social networks, project records, resumes, and other sources could render a digital portrait of each worker. Machines could handily determine the optimal group for a specific project, taking into account budgetary, geographical, and other constraints.
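To make the team-picking idea concrete, here is a toy sketch in Python. It is not Takriti's model or anything described in the book; the worker profiles, skill labels, costs, and the `best_team` function are all hypothetical, and a brute-force search like this would only work for a handful of people, not 50,000. It simply tries every possible team of a given size and keeps the cheapest one that sits in the right city, fits the budget, and covers the required skills.

```python
from itertools import combinations

# Hypothetical worker profiles: (name, skills, daily cost in dollars, location).
# All values are invented for illustration.
WORKERS = [
    ("Ana", {"statistics", "python"}, 900, "New York"),
    ("Ben", {"marketing"}, 600, "New York"),
    ("Chen", {"python", "databases"}, 800, "Beijing"),
    ("Dara", {"statistics", "marketing"}, 1000, "New York"),
    ("Eli", {"databases"}, 500, "New York"),
]


def best_team(workers, required_skills, budget, location, size):
    """Return (cost, names) for the cheapest team of `size` workers in
    `location` that covers `required_skills` within `budget`, or None."""
    # Geographical constraint: only consider workers in the project's city.
    local = [w for w in workers if w[3] == location]
    best = None
    # Brute force: examine every possible team of the requested size.
    for team in combinations(local, size):
        cost = sum(w[2] for w in team)
        skills = set().union(*(w[1] for w in team))
        # Budget constraint and skill coverage must both hold.
        if cost <= budget and required_skills <= skills:
            if best is None or cost < best[0]:
                best = (cost, [w[0] for w in team])
    return best


needed = {"statistics", "marketing", "databases"}
print(best_team(WORKERS, needed, budget=2500, location="New York", size=2))
```

With these made-up profiles, the only two-person New York team that covers all three skills is Dara and Eli, at a combined cost of 1,500. A real system would need far richer data and a smarter optimization method than exhaustive search.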
The data could also be used to ferret out employees who aren't meeting their productivity quotas or who are bypassing the chain of command. Companies have technology installed to monitor e-mail for spam, porn, and other abuses, so they might as well use it to see what people in the company are thinking, Baker told me in a conversation last week. He acknowledged the significant privacy issues that come with unleashing numerati on the world of data, and he addresses them in his book:
"At work, perhaps more than anywhere else, we are in danger of becoming data serfs--slaves to the information we produce," he wrote.
"Part of what needs to be calculated is how much this freaks out workers. It impacts productivity and the morale of employees. If a big technology company gets a reputation for monitoring every keystroke, the smart people will choose to work elsewhere. Companies have to figure out what works and what is overkill or freaks people out," Baker told me.
He states in his book that the "mathematical modeling of humanity ... promises to be one of the great undertakings of the twenty-first century." That description could apply to Google and other companies that are extracting and analyzing billions of digital signals generated by individuals and groups.
Just because computer science and applied math make data divination possible doesn't mean the ends justify the means. The same technology used to build a mathematical model of a terrorist or a poor performer in the workplace can be used to violate the privacy and rights of unsuspecting, innocent people.
Baker told me in our conversation that we need tools to decide what information to share and with whom. Some of the social networks and major Web sites are working on that problem, but the solutions so far are inadequate. We'll need a generally accepted Bill of Rights for personal data to give the numerati and their overseers guidance on how to avoid "evil" in the evolving digital world. Of course, that is wishful, optimistic, and, perhaps, naive thinking.