Facebook: New AI tech spots hate speech faster

"Self-supervised learning" lets AI adapt faster so Facebook can spot problems sooner in text, video and photos, the company says.

CTO Mike Schroepfer discusses Facebook's use of "self-supervised" AI technology at the company's F8 developer conference.
Screenshot by Stephen Shankland/CNET

Facebook's AI engineers have embraced a technology called self-supervised learning so the social network's technology can adapt faster to challenges like spotting new forms of hate speech.

Artificial intelligence is sweeping the tech industry, and beyond, as the new method for getting computers to recognize patterns and make decisions catches on. With today's AI technology called deep learning, you can get a computer to recognize a cat by training it with lots of pictures of cats, instead of figuring out how to define cat characteristics like two eyes, pointy ears and whiskers.

Self-supervised learning, though, needs vastly less labeled training data than conventional AI training, which cuts the time needed to assemble training data and train a system. For example, self-supervised learning methods have cut the amount of training data needed by a factor of 10, Manohar Paluri, an AI research leader at Facebook, said Wednesday at the company's F8 developer conference.

And that speed is critical to making Facebook fun and safe, not a cesspool of toxic comments, misinformation, abuse and scams.

"It's really easy to lose hope, to pack up and go home," Facebook Chief Technology Officer Mike Schroepfer said in a keynote speech. "But we can't do that. We're here to bring a better future to people with technology."

Fixing Facebook with AI

Paluri boasted that Facebook's AI is making headway against many problems on the world's largest social network: bullying, hate speech, violence, terrorist propaganda, child nudity, spam, adult content and fake accounts.

But it's got a long way to go, as speakers acknowledged at the conference, especially in recognizing problematic videos like those of the New Zealand mosque shootings in March. And that doesn't even touch on the privacy problems Facebook Chief Executive Mark Zuckerberg said he's trying to fix. Facebook executives blended some contrition with their usual brashness at the conference, an indication that they know they're not yet out of the woods.

Facebook increasingly relies on AI to fix problems like spam and hate speech, CTO Mike Schroepfer said at the company's F8 developer conference.

Screenshot by Stephen Shankland/CNET

Using AI to help fix some of its problems is a natural idea for engineering-focused Facebook. It's an AI giant, applying the technology to tasks as difficult as debugging its own software, and employing AI pioneer Yann LeCun, one of three researchers who won the prestigious Turing Award this year for their AI work.

Facebook isn't alone in pursuing AI, which is spreading well beyond the tech world. A survey from consulting firm Deloitte publicized Wednesday found that 57 percent of businesses around the world adopting the technology early expect AI to transform their business -- and are often investing now to try to get ahead of an expected broader transformation.

But though AI can fix computer science problems, it also adds new ones, such as the difficulty of eradicating AI bias, which can reinforce the disadvantages or the advantages some groups of people have in society.

How does self-supervised learning work?

Self-supervised learning is a new twist on the crucial training phase of AI.

Today's AI training is typically "supervised," which means it relies on carefully labeled training data. That data is hard to amass -- especially in the vast quantities needed to best train AI systems. Labeled cat photos are abundant, but companies using AI have to spot everything from fraudulent credit card transactions to computer bugs.

Facebook uses self-supervised AI training technology to process speech, text, video and photos, said Manohar Paluri, an AI research leader, at Facebook's F8 conference.

Screenshot by Stephen Shankland/CNET

With self-supervised learning, AI uses training data that's unlabeled, Schroepfer said. But it's not totally raw data. Instead, some bits are removed, like words from text or rectangles of pixels from photos.

That lets AI systems learn patterns by figuring out how to reconstruct what's missing, and it's easier to supply the "massive volumes of data" that are so useful for tasks like natural language processing (NLP), the understanding of human speech and text. Facebook is also using self-supervised learning to handle photos, videos and text, Schroepfer said.

"You generate the training set and the answers all at once," Schroepfer said. "Because you're using so much data, these NLP systems are starting to catch deeper and more nuanced understanding of language."
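To make the "training set and the answers all at once" idea concrete, here is a minimal Python sketch of how one unlabeled sentence can be turned into fill-in-the-blank training examples. It is only an illustration of the general masking approach Schroepfer described, not Facebook's actual system: the function name make_masked_examples and the "[MASK]" placeholder are hypothetical choices made for this example.

```python
from typing import List, Tuple

# Hypothetical placeholder token, chosen for this sketch only.
MASK = "[MASK]"


def make_masked_examples(sentence: str) -> List[Tuple[str, str]]:
    """Turn one unlabeled sentence into (masked input, answer) pairs.

    Each word is hidden in turn. The hidden word becomes the "answer",
    so no human labeling is needed.
    """
    words = sentence.split()
    examples = []
    for i, answer in enumerate(words):
        # Copy the sentence with the i-th word replaced by the mask token.
        masked = words[:i] + [MASK] + words[i + 1:]
        examples.append((" ".join(masked), answer))
    return examples


if __name__ == "__main__":
    for masked, answer in make_masked_examples("the cat sat on the mat"):
        print(f"{masked!r} -> {answer!r}")
```

A model trained on pairs like these would be asked to predict the hidden word from its context. Real systems typically mask random subsets of words across enormous text collections rather than every word of a single sentence, and image versions blank out rectangles of pixels instead of words.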