
Facebook's AI is flagging more hate speech before you report it

The social network says its automated tools are improving, but it's still under pressure to do more to tackle harmful content.

Queenie Wong

Facebook's chief technology officer, Mike Schroepfer, oversees the social network's efforts to develop automated tools to detect harmful content.

James Martin/CNET

Facebook said Thursday that it's catching more hate speech before users report it, thanks to improvements in its artificial intelligence technology.

From July to September, Facebook's AI tools proactively detected 94.7% of the hate speech the company removed, up from 80.5% in the same period last year, Facebook said. The social network attributed the uptick to improvements in its automated tools, including better training of its machine learning systems. In the third quarter, Facebook took action against 22.1 million pieces of content for hate speech. Instagram, the company's photo service, took action against 6.5 million pieces of hate speech content.
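
For context, the proactive rate is the share of actioned content that Facebook's systems flagged before any user reported it. Here's a minimal sketch of the arithmetic using only the figures quoted above; the split between proactively and reactively flagged posts is derived from those figures, not separately reported:

```python
# Back-of-the-envelope arithmetic using the figures in the article.
# The proactive rate is the share of actioned content that Facebook's
# systems flagged before any user reported it.

actioned_total = 22_100_000   # pieces of hate speech actioned on Facebook in Q3
proactive_rate = 0.947        # reported proactive detection rate

flagged_proactively = actioned_total * proactive_rate
flagged_after_report = actioned_total - flagged_proactively

print(f"Flagged before a user report: {flagged_proactively:,.0f}")   # ~20.9 million
print(f"Flagged only after a report:  {flagged_after_report:,.0f}")  # ~1.2 million
```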

"My goal is to continue to push this technology forward so that as few -- hopefully at some point zero -- people in the world have to encounter any of this content," Mike Schroepfer, Facebook's chief technology officer, said about posts that violate the social network's community standards. He made the comments during a press call.

For the first time, Facebook also shared a prevalence metric that indicates how much hate speech slips through the cracks: out of every 10,000 views of content on Facebook, 10 to 11 were views of hate speech, the company said.
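
That figure is a rate, typically estimated from a sample of content views rather than counted exhaustively. A minimal sketch of the calculation, with illustrative sample counts that are not Facebook's actual data:

```python
# Prevalence as described above: views of hate speech per 10,000 content
# views. The sample counts below are illustrative only.

def prevalence_per_10k(violating_views: int, total_views: int) -> float:
    """Estimated views of violating content per 10,000 total views."""
    return violating_views / total_views * 10_000

# Numbers chosen to land in the 10-11 range the company reported.
print(prevalence_per_10k(violating_views=1_050, total_views=1_000_000))  # 10.5
```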

The social media giant, which uses a mix of human reviewers and technology to remove harmful content, has been under fire from civil rights activists and politicians who say Facebook isn't enforcing its rules against speech that directly attacks a person based on race, gender or other protected characteristics. Major brands this year paused spending on Facebook ads to pressure the company to do more to tackle hate speech, which they say is still slipping through on the social network.

At the same time, content moderators who contract with Facebook are demanding better working conditions. On Wednesday, more than 200 content moderators sent a letter to Facebook calling for better pay and mental health benefits as some workers are forced back to the office amid the coronavirus pandemic. The moderators said Facebook's AI systems were still missing risky content such as posts about self-harm. "Facebook's algorithms are years away from achieving the necessary level of sophistication to moderate content automatically," the letter said. Some moderators have also sued Facebook, alleging that the job of reviewing offensive content took a toll on their mental health.

Guy Rosen, Facebook's vice president of integrity, said during a press call that the majority of content moderators still work from home. Some offensive content, though, might be too graphic to be reviewed around family members, so people have to return to the office. The company has safety measures such as social distancing, hand sanitizer and mandatory temperature checks for workers who have to return, he said.

Facebook didn't share data about the accuracy of its AI systems, but Schroepfer said it depends on the type of content that's being reviewed. Machines have a higher bar for removing hate speech than ad content because "accidentally taking down someone's post can be devastating," he said.

Schroepfer also acknowledged that the company still has work to do. "I'm not naive about this," he said. "I'm not at all saying that technology is the solution to all these problems." The company also has to refine its policy definitions, and some content is so nuanced that it still requires human analysis. Hateful memes, for example, can be difficult for a machine to detect because doing so requires understanding how the words interact with the image. The phrase "You belong here" paired with an image of a playground wouldn't violate Facebook's rules. The same phrase over a photo of a graveyard, however, could be used to target a group of people and would therefore be considered hate speech. Schroepfer said he doesn't anticipate that Facebook will reduce its human reviewers in the short or long term, but said AI can help speed up content moderation.
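
To make the meme example concrete, here is a deliberately simplified toy sketch of multimodal fusion. The scores and the hand-written fusion rule are invented for illustration; real systems like the ones Facebook describes learn a joint representation of text and image rather than applying rules like this:

```python
# Toy illustration of why memes defeat unimodal classifiers: each signal
# looks benign alone, and only the combination is violating. All scores
# and the fusion rule here are invented for illustration.

def text_score(text: str) -> float:
    # Pretend text-only classifier: "You belong here" is benign by itself.
    return 0.05 if text == "You belong here" else 0.50

def image_score(label: str) -> float:
    # Pretend image-only classifier: both scenes are benign by themselves.
    return {"playground": 0.02, "graveyard": 0.10}.get(label, 0.50)

def fused_score(text: str, label: str) -> float:
    # Hypothetical multimodal rule: welcoming text over a graveyard reads
    # as a threat, which neither unimodal model catches on its own.
    if text == "You belong here" and label == "graveyard":
        return 0.95
    return max(text_score(text), image_score(label))

print(fused_score("You belong here", "playground"))  # 0.05 -> allowed
print(fused_score("You belong here", "graveyard"))   # 0.95 -> flagged
```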

Using AI to detect misinformation comes with its own challenges. A user could add a border to an image containing misinformation, or blur the words in it, to evade detection.
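
Here's a sketch of why such trivial edits work against exact-match detection, assuming a blocklist keyed on file hashes. Production systems use more robust perceptual hashes or learned embeddings, but those too can be evaded by edits like blurred text:

```python
# Demonstrates how a trivial border defeats exact-match detection.
# Requires Pillow (pip install Pillow). The blank image stands in for a
# known misinformation image whose hash is on a blocklist.

import hashlib
from PIL import Image, ImageOps

original = Image.new("RGB", (200, 200), "white")
bordered = ImageOps.expand(original, border=10, fill="black")  # the evasion

def digest(img: Image.Image) -> str:
    return hashlib.sha256(img.tobytes()).hexdigest()[:16]

print(digest(original))  # hash on the blocklist
print(digest(bordered))  # entirely different hash; the match fails
```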

Social networks have grappled with an onslaught of misinformation about the US election and the COVID-19 pandemic. From March through Nov. 3, Facebook removed more than 265,000 pieces of content from Facebook and Instagram for voter interference. During the same period, it displayed warnings on more than 180 million pieces of content debunked by third-party fact-checkers. The company also added new labels under election-related content directing users to its voting information center, but it's unclear how effective those labels were in reducing the spread of misinformation.

From March to October, Facebook took down more than 12 million pieces of content on Facebook and Instagram that had the potential to lead to physical harm. The company said it displayed warnings on 167 million pieces of content about the novel coronavirus that had been debunked by fact-checkers.

Facebook also said Thursday that it'll update its online rules, known as community standards, monthly and that it's providing more details about existing rules.