Smoking out photo hoaxes with software

Did Brad Pitt really meet up with aliens at the World Economic Forum? Software for detecting photo fraud may have the answer.

Michael Kanellos Staff Writer, CNET News.com

Michael Kanellos is editor at large at CNET News.com, where he covers hardware, research and development, start-ups and the tech industry overseas.

Feb. 1, 2006 6:03 a.m. PT

Dartmouth College professor Hany Farid is no fan of Josef Stalin, but he acknowledges that the photo retouching done during the Soviet era was top notch.

"That was impressive work. I've seen some of the originals," Farid said. The Soviets didn't just airbrush their victims out, he added. They painted in new backgrounds on the negatives.

Farid's interest in photo retouching isn't just historical. The professor of computer science and applied mathematics runs the university's Image Science Group, which has emerged as one of the chief research centers in the U.S. for developing software to detect manipulation in digital photographs.

Some of the group's software is already used by the FBI and by large media organizations such as Reuters. A forthcoming version written in Java is meant to be easier to use, which should let more police and media organizations sniff out fraud. The current software is written in Matlab, a numerical computing environment.

"I hope to have a beta out in the next six months," Farid said. "Right now, you need someone who is reasonably well-trained to use it."

Photo manipulation is a lot more common than you might think, according to L. Frank Kenney, an analyst at Gartner. That Newsweek cover of Martha Stewart on her release from prison? It's Martha's head, but a model's body. Some people believe hip hop artist Tupac Shakur remains alive, in part because of the images that have cropped up since his reported death in 1996.

Although it's difficult to estimate the size of the market for fraud detection tools, the demand is substantial, according to Kenney.

"How much is the presidency of a country worth, or control of a company? People tend not to read the retractions," he said. "Once the stuff is indelibly embedded in your memory, it is tough to get out."

The Journal of Cell Biology, a premier academic journal, estimates that around 25 percent of manuscripts accepted for publication contain at least one image that has been "inappropriately manipulated" and must be resubmitted. That means the image has been touched up, although in the vast majority of cases the author is only trying to clean up the background and the changes do not affect the scientific validity of the results. Still, around 1 percent of accepted articles contain manipulated images that do significantly affect the results, said executive editor Mike Rossner. Those papers get rejected.

"Our goal is to have as accurate an interpretation of data as possible," Rossner said. "These (images) are (of) things like radioactivity detected on a piece of X-ray film."

Law enforcement officials have also had to turn to the software to prosecute child pornographers. In 2002, the Supreme Court in Ashcroft v. Free Speech Coalition overturned parts of the Child Pornography Prevention Act for being overly broad, ruling that only images of actual minors, and not computer-generated simulations, are illegal.

Since that decision, it has become a common defense to claim that the images found on a hard drive were artificially created.

"The burden is now on the prosecution. These cases used to be slam dunks," Farid said.

How it works
Fraud detection software for images essentially searches for photographic anomalies that the human brain ignores or can't detect.

Humans, for instance, ignore lighting irregularities in two-dimensional images. While the direction of light can be re-adjusted in 3D images from video games, it is difficult to harmonize in 2D photographs. The light in the famous doctored photo that puts Sen. John Kerry next to actress Jane Fonda at a protest rally actually comes from two different directions.
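To make the idea of measuring a lighting mismatch concrete, here is a minimal Python sketch. It is not the Image Science Group's software. It assumes a simple Lambertian shading model, in which brightness along an object's outline equals the dot product of the surface normal and the light direction plus a constant ambient term, and it fits the light direction by least squares. The function names, the synthetic "person A" and "person B" outlines, and the angles used are all hypothetical.

```python
import math

def solve3(m, v):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]

def estimate_light_angle(normal_angles_deg, intensities):
    """Fit I = Lx*nx + Ly*ny + A by least squares; return the light angle in degrees."""
    rows = [(math.cos(math.radians(t)), math.sin(math.radians(t)), 1.0)
            for t in normal_angles_deg]
    # Normal equations: (M^T M) x = M^T b, with unknowns x = (Lx, Ly, A).
    mtm = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    mtb = [sum(r[i] * b for r, b in zip(rows, intensities)) for i in range(3)]
    lx, ly, _ambient = solve3(mtm, mtb)
    return math.degrees(math.atan2(ly, lx))

def synthetic_outline(light_deg, normal_angles_deg, ambient=0.2):
    """Hypothetical test data: noise-free brightness along a lit outline."""
    lx, ly = math.cos(math.radians(light_deg)), math.sin(math.radians(light_deg))
    return [lx * math.cos(math.radians(t)) + ly * math.sin(math.radians(t)) + ambient
            for t in normal_angles_deg]

def angle_difference(a_deg, b_deg):
    """Smallest absolute difference between two directions, in degrees."""
    d = abs(a_deg - b_deg) % 360
    return min(d, 360 - d)

# Two figures in one composite, each lit from a different (made-up) direction.
normals_a = list(range(-40, 101, 10))
normals_b = list(range(0, 141, 10))
light_a = estimate_light_angle(normals_a, synthetic_outline(30, normals_a))
light_b = estimate_light_angle(normals_b, synthetic_outline(70, normals_b))
print(round(angle_difference(light_a, light_b), 3))  # mismatch between the two figures
```

On real photographs the outline normals and brightness values would have to be extracted from the image, and noise would make the estimate approximate. The sketch only shows why a computer can put a number on an inconsistency that the eye passes over.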

"The lighting is off by 40 degrees," Farid said. "We are insensitive to it, but computers detect it."

Modern researchers have documented in clinical studies the human ability to filter out lighting incongruities, but 15th-century painters were already aware of the way humans process images. They exploited that knowledge to create seemingly realistic lighting effects that would have been nearly impossible to replicate in real life.

"The lighting is totally bizarre in some Renaissance paintings," Farid said.

The software also seeks out areas in photographs where applications like Adobe Photoshop have filled in pixels. Whenever photos are spliced together, one or both of the images must be modified. One person might be enlarged, for instance, while a second is rotated slightly. These changes leave empty pixels in the frame.

Photo-retouching applications use probability algorithms to fill in those pixels with colors and imagery and thus make them look realistic. Conversely, Farid's software employs probability to ferret out which of these fringe pixels are fill-ins.
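The statistical trace left by filled-in pixels can be illustrated with a toy example. The Python sketch below is a deliberately simplified stand-in, not Farid's method. It assumes a single row of pixel values that was enlarged to twice its width with plain linear interpolation, then checks how many pixels at a regular spacing are exactly predicted by the average of their two neighbors. The function names and the synthetic pixel row are hypothetical.

```python
def upsample_2x_linear(row):
    """Double the width by inserting the average of each neighboring pair."""
    out = []
    for left, right in zip(row, row[1:]):
        out.append(float(left))
        out.append((left + right) / 2)  # the filled-in pixel
    out.append(float(row[-1]))
    return out

def neighbor_residuals(row):
    """Distance of each interior pixel from the average of its two neighbors."""
    return [abs(row[i] - (row[i - 1] + row[i + 1]) / 2)
            for i in range(1, len(row) - 1)]

def periodic_fill_score(row, period=2, tol=1e-9):
    """Largest fraction, over all phases, of pixels their neighbors predict exactly."""
    residuals = neighbor_residuals(row)
    best = 0.0
    for phase in range(period):
        picked = residuals[phase::period]
        if picked:
            hits = sum(1 for r in picked if r <= tol)
            best = max(best, hits / len(picked))
    return best

# Arbitrary synthetic pixel row standing in for untouched image data.
original = [(37 * k * k + 11) % 251 for k in range(40)]
resampled = upsample_2x_linear(original)

print(periodic_fill_score(original))   # low: no regular pattern of predictable pixels
print(periodic_fill_score(resampled))  # high: every second pixel is a neighbor average
```

Real retouching tools use more elaborate interpolation, and real images contain noise and compression artifacts, so a practical detector has to estimate these correlations statistically instead of testing for exact matches.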

"We're asking, from a mathematical and statistical perspective, can you quantify the manipulation," he said. "There are statistical correlations that don't occur naturally."

The quality of forgeries and touch-up jobs varies widely, but it continually improves. Farid gets consulting requests all the time. Some people call him to see if a photo of an item on eBay has been retouched. Others want advice on the genuineness of photos from online dating services. The Image Science Group has also collaborated with the Metropolitan Museum of Art in New York to determine if certain drawings were actually made by Flemish painter Bruegel or were forgeries.

One of the most celebrated recent cases of fraud--South Korean scientist Hwang Woo-suk's claim that he cloned stem cells--didn't actually need specialized software. Spots and artifacts in the background visible to the naked eye showed that the images of cells that came from the supposedly cloned dog were duplicated, Farid noted.

Farid came to fraud detection somewhat by chance. As a postdoctoral researcher at MIT seven years ago, he was meandering through the library looking for something to read. He grabbed the Federal Rules of Evidence, a compendium of laws governing the admission of evidence in trials in federal court.

The rules, at the time, allowed digital images of original photographs to be admitted in court as long as they accurately reflected the original. The footnotes that accompany the rules, however, acknowledged that manipulation was a problem and that the government did not yet have a way to deal with it.

When Farid started researching the scientific literature, he found little on fraud detection in digital imagery.

Will you be able to get a copy of the Java-based version of the Image Science Group's applications? Probably not. One of the dilemmas of this type of software is that the wider the distribution, the greater the chance that forgers will exploit it to their advantage. Police organizations and news media outlets will likely get access to the application, but Farid is still unsure how far he will extend distribution beyond that.

And although Farid charges a fee when asked to serve as a consultant, the software will be made freely available under an open-source license. He doesn't even have plans to form a company around his work. A significant amount of the research, after all, was funded by federal grants.

"Taxpayers," he said, "are paying me to do this research and it needs to go back out."