IBM wants to scramble your data

It's all in the name of security--and commerce. The company says its new software could benefit both merchants and customers.

June 17, 2002 1:08 a.m. PT

IBM has developed new software that it hopes will make you feel safer about your privacy online.

The software takes personal information and scrambles it before forwarding it to merchants. On the merchant end, the software can unscramble the data enough for a company to mine for a marketing campaign--without revealing any individual's personal information.

If adopted by merchants and consumers, the new software could benefit both groups, said Rakesh Agrawal, a researcher at IBM's Privacy Research Institute. Consumers could get marketing messages targeted to them without worrying about sacrificing their private information. And merchants and marketers could get useful data without worrying about whether consumers were giving false information.

"Consumers are not willing to give up their good data because of privacy concerns, and in the process, they're basically lying," Agrawal said. "We can institute lying by making it happen scientifically.

"This software will do a better job of coming up with a random value for age than (a customer) would."

Many top companies online and off have been investing in customer relationship management (CRM) software to try to market their goods and services more effectively. Such software can find the customers most likely to buy a particular book or to want to consolidate their debt on a single credit card.

But such software is only as good as the data it works with, and many companies are plagued with "dirty data." Some of the faulty data are the result of hardware failures or mistakes in data entry. Other false data are the result of consumers entering the wrong data in the first place.

Online merchants have long been struggling with how to balance their desire for more information from consumers with people's privacy concerns. Many have posted privacy statements saying they won't share customers' personal data, and some have allowed consumers to opt in or out of having their information shared.

But privacy remains a top concern among many consumer advocates, many of whom have called for federal laws to protect privacy rights. A bill introduced last month by Sen. Ernest "Fritz" Hollings, D-S.C., would do just that and includes liability provisions for companies that violate online privacy laws. Business groups oppose the bill, saying it would hinder online commerce.

IBM's new software, written in Java, would attempt to navigate a course between the two sides while trying to meet the needs of both.

The software, which could be built into a Web browser or offered as a browser plug-in, would randomize information before it is transmitted to a merchant. The program might scramble a person's salary by adding $15,000 to it, for instance, or a person's age by subtracting five years.
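As a rough illustration of that client-side step, here is a minimal Python sketch. It assumes the offset is drawn uniformly from a fixed range on either side of the true value; the article does not say how IBM's software picks its offsets, and the `randomize` function, the ranges and the sample values are hypothetical.

```python
import random

def randomize(value, spread, rng=random):
    """Return value plus an offset drawn uniformly from [-spread, +spread].

    Assumption: uniform noise within a known range. The actual scheme used
    by IBM's software is not described in the article.
    """
    return value + rng.uniform(-spread, spread)

# Hypothetical customer data, scrambled before it leaves the browser.
rng = random.Random(2002)          # seeded only to make the demo repeatable
true_age = 34
true_salary = 52_000

sent_age = randomize(true_age, 10, rng)            # age shifted by up to 10 years
sent_salary = randomize(true_salary, 20_000, rng)  # salary shifted by up to $20,000

print(round(sent_age), round(sent_salary))
```

The merchant receives only the shifted values, but it can be told the range (here 10 years and $20,000) so that aggregate statistics can still be estimated.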

On the merchant side, the software aggregates all the data into a distribution of values. Because it knows the range within which each entry was randomized, the program can take the raw, scrambled distribution and reconstruct a closer estimate of the true aggregate distribution. Merchants can then mine that data to find target customer groups.
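One way such a reconstruction could work is an iterative Bayesian estimate over bins of possible true values. The Python sketch below is an assumption for illustration, not IBM's confirmed method: it supposes each value was shifted by uniform noise within a known range, and the `reconstruct` function, the age bins and the simulated customers are all hypothetical.

```python
import random

def reconstruct(observed, spread, edges, iterations=200):
    """Estimate the share of true values in each bin from scrambled values.

    observed: values after uniform noise in [-spread, +spread] was added
    edges:    ascending boundaries of equal-width bins for the true values
    """
    bins = list(zip(edges[:-1], edges[1:]))
    # likelihood[i][j]: how much of bin j lies within `spread` of observation
    # i, i.e. how plausibly a true value in bin j produced that observation.
    likelihood = [
        [max(0.0, min(w + spread, hi) - max(w - spread, lo)) for lo, hi in bins]
        for w in observed
    ]
    estimate = [1.0 / len(bins)] * len(bins)   # start from a flat guess
    for _ in range(iterations):
        totals = [0.0] * len(bins)
        used = 0
        for row in likelihood:
            weights = [l * p for l, p in zip(row, estimate)]
            norm = sum(weights)
            if norm == 0.0:
                continue                       # observation fits no bin; skip it
            used += 1
            for j, weight in enumerate(weights):
                totals[j] += weight / norm     # split this record across bins
        estimate = [t / used for t in totals]
    return estimate

# Simulated customers: most are in their 20s, the rest in their 50s.
rng = random.Random(17)
true_ages = [
    rng.uniform(20, 30) if rng.random() < 0.7 else rng.uniform(50, 60)
    for _ in range(2000)
]
observed = [age + rng.uniform(-15, 15) for age in true_ages]

shares = reconstruct(observed, spread=15, edges=[20, 30, 40, 50, 60])
print([round(s, 2) for s in shares])
```

No individual age is recovered here. The output is only an estimate of how the customer base is spread across age brackets, which is the kind of aggregate a marketer would mine.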

By communicating between the merchant and the client, the software also can allow merchants to send targeted advertisements and marketing messages to customers without ever holding their personal information, Agrawal said.

Although IBM doesn't have any customers for the software yet, it hopes to start testing it soon, said company spokeswoman Kendra R. Collins. The software could be used not only by online merchants but also by medical researchers dealing with sensitive health information, Agrawal said.

The software will not directly address the issue of consumers entering false data, Agrawal acknowledged. But if customers have confidence that their private information will be protected, they will be more likely to enter their real data, he said.

"If I can have this trust, then I have an incentive when I fill in a form to put in the true values, because I know this value cannot be reconstructed," he said.

IBM launched its privacy institute earlier this year. The data-scrambling software is the first project to be announced by the institute.